Closed
Bug 807637
Opened 10 years ago
Closed 8 years ago
Use xz (lzma) for compressing release tarballs instead of bzip2 : smaller archive and faster decompression
Categories
(Release Engineering :: General, defect, P4)
Tracking
(firefox41 fixed)
RESOLVED
FIXED
Tracking | Status | |
---|---|---|
firefox41 | --- | fixed |
People
(Reporter: jerome.bouat, Unassigned)
References
(Depends on 1 open bug)
Details
(Whiteboard: [release])
Attachments
(1 file, 1 obsolete file)
2.50 KB, patch | mshal: review+ | catlee: checked-in+
User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:16.0) Gecko/20100101 Firefox/16.0
Build ID: 20121025205329

Steps to reproduce:
I downloaded the French Linux tar.bz2 archive from https://www.mozilla.org/fr/thunderbird/

Actual results:
The bzip2 archive is about 15% larger than the same tarball compressed with xz (lzma):
-----
j@dt:~$ bunzip2 -c thunderbird-16.0.2.tar.bz2 | xz -9e > thunderbird-16.0.2.tar.xz
j@dt:~$ ls -lk thunderbird-16.0.2.tar.*
-rw-r--r-- 1 j j 20455 2012-11-01 12:23 thunderbird-16.0.2.tar.bz2
-rw-r--r-- 1 j j 17820 2012-11-01 12:26 thunderbird-16.0.2.tar.xz
j@dt:~$
-----
Decompression is also faster with xz than with bzip2 (nine runs each):
-----
j@dt:~/tmp$ time { for i in $(seq 1 9) ; do bunzip2 -c thunderbird-16.0.2.tar.bz2 > /dev/null ; done ; }
real 0m48.445s
user 0m48.160s
sys 0m0.320s
j@dt:~/tmp$ time { for i in $(seq 1 9) ; do unxz -c thunderbird-16.0.2.tar.xz > /dev/null ; done ; }
real 0m19.643s
user 0m18.750s
sys 0m0.710s
-----

Expected results:
I think all tar archives should be compressed with xz to save bandwidth for servers, users and all intermediate networks, as well as storage (on servers or in users' temporary directories). Moreover, when decompressing the archive on a laptop, unxz (unlzma) will use less energy than bunzip2. On a laptop, battery life could be extended with lzma, especially if you perform a lot of software installations and updates.
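The measurement above can be repeated on any payload without the command-line tools. The following is a minimal sketch, assuming Python's standard `bz2` and `lzma` modules as stand-ins for `bzip2` and `xz`; the function name `compare_codecs` and the default preset of 6 are illustrative choices, not part of the report, and results will vary with the input.

```python
import bz2
import lzma
import time


def compare_codecs(data: bytes, xz_preset: int = 6) -> dict:
    """Compress `data` with bzip2 and xz, then time one decompression of each.

    `xz_preset` is an assumption for illustration; the report used `xz -9e`.
    """
    codecs = {
        "bzip2": (lambda d: bz2.compress(d, 9), bz2.decompress),
        "xz": (lambda d: lzma.compress(d, preset=xz_preset), lzma.decompress),
    }
    results = {}
    for name, (compress, decompress) in codecs.items():
        blob = compress(data)
        start = time.perf_counter()
        restored = decompress(blob)
        elapsed = time.perf_counter() - start
        if restored != data:
            raise ValueError(f"{name} round trip did not restore the input")
        results[name] = {"size": len(blob), "decompress_seconds": elapsed}
    return results
```

Calling `compare_codecs(open("thunderbird-16.0.2.tar", "rb").read())` on an uncompressed tarball would give figures comparable in spirit to the transcript, though not identical, since the library defaults differ from `xz -9e`.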
Comment 1•10 years ago
is unxz installed by default on linux distros ?
Comment 2•10 years ago
This is more a release engineering or core change rather than being specific to Thunderbird.
Component: Installer → Release Engineering: Releases
Product: Thunderbird → mozilla.org
QA Contact: bhearsum
Version: 16 → other
Comment 3•10 years ago
(In reply to Ludovic Hirlimann [:Usul] from comment #1)
> is unxz installed by default on linux distros ?

I think it usually is. In openSUSE, at least, it has been for several distribution releases. Our RPM payload is lzma/xz compressed as well.
GNU tar seems to have had xz compression since 2009-03-05 (http://www.gnu.org/software/tar/)
Reporter | ||
Comment 4•10 years ago
On the current stable release of Debian:
- the xz-utils package has "required" priority
- xz-utils is a strong dependency of dpkg (which handles the Debian archives to be installed)

See the details on the stable release of Debian:
-----
j@d64:~$ dpkg-query -S /usr/bin/unxz
xz-utils: /usr/bin/unxz
j@d64:~$ dpkg-query -s xz-utils | head -5
Package: xz-utils
Status: install ok installed
Priority: required
Section: utils
Installed-Size: 460
j@d64:~$ dpkg-query -s dpkg | grep -i depends
Pre-Depends: libbz2-1.0, libc6 (>= 2.6), libselinux1 (>= 1.32), zlib1g (>= 1:1.1.4), coreutils (>= 5.93-1), xz-utils
j@d64:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description: Debian GNU/Linux 6.0.6 (squeeze)
Release: 6.0.6
Codename: squeeze
-----
Comment 5•10 years ago
In Ubuntu 10.04, it's not installed by default, but I believe it is in Ubuntu 12.04
Reporter | ||
Comment 6•10 years ago
The only drawback of lzma compression is the memory required for decompression (the man page says 65 MiB for archives compressed with the -9 option). However, Thunderbird already requires this amount of memory at runtime, so requiring the same memory at installation time shouldn't be an issue.

The man page says xz requires 674 MiB of memory for compression (and a lot of time). However, compression is done only once, whereas decompression will be performed thousands of times.
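To illustrate why decompression memory depends on how the archive was compressed, here is a small sketch using Python's standard `lzma` module. It assumes that the decoder's memory need is driven mainly by the LZMA2 dictionary size recorded in the stream, and it uses deliberately small dictionaries (1 MiB and 8 MiB) rather than the 64 MiB of `-9` so the example stays cheap to run. The helper names are hypothetical.

```python
import lzma


def compress_with_dict(data: bytes, dict_size: int) -> bytes:
    """Compress to .xz with an explicit LZMA2 dictionary size (in bytes)."""
    filters = [{"id": lzma.FILTER_LZMA2, "preset": 0, "dict_size": dict_size}]
    return lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters)


def fits_in_memory(blob: bytes, memlimit: int) -> bool:
    """Return True if `blob` decompresses under `memlimit` bytes.

    Note: any LZMAError (including corrupt input) is reported as False here;
    a real tool would distinguish those cases.
    """
    try:
        lzma.LZMADecompressor(memlimit=memlimit).decompress(blob)
    except lzma.LZMAError:
        return False
    return True
```

With a 4 MiB limit, a stream written with a 1 MiB dictionary should decode while one written with an 8 MiB dictionary should be refused, which mirrors the trade-off described in the comment at a smaller scale.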
Reporter | ||
Comment 7•10 years ago
In reply to comment #5: the current long-term support release of Ubuntu is now 12.04, although 10.04 is supported free of charge until April 2013.

For older distributions, a note on the download page could explain how to install xz if it is missing:
-----
Execute the command below in a terminal:
For Ubuntu-like distributions: "sudo apt-get install xz-utils"
For Debian-like distributions: (as root) "apt-get install xz-utils"
For Red Hat-like distributions: (as root) "yum install xz"
-----
Reporter | ||
Comment 8•10 years ago
Maybe the development releases of Thunderbird could start using xz compression now, and the stable releases could follow once support for Ubuntu 10.04 has ended?
Updated•10 years ago
Component: Release Engineering: Releases → Release Engineering: Automation (General)
QA Contact: bhearsum → catlee
Updated•10 years ago
Priority: -- → P4
Summary: xz (lzma) versus bzip2 : smaller archive and faster decompression → Use xz (lzma) for compressing release tarballs instead of bzip2 : smaller archive and faster decompression
Whiteboard: [release]
Assignee | ||
Updated•10 years ago
Product: mozilla.org → Release Engineering
Comment 9•9 years ago
In general I'm in favour of doing this, but I do worry about what other impact this change will have on downstream consumers of these files. Please start a newsgroup thread advocating for this change so that we can have the discussion there instead of here in the bug.
Comment 10•9 years ago
Please also consider using lzip. lzip is another user of the lzma algorithm but has some advantages over xz:
- plzip supports multithreaded compression/decompression. xz doesn't support this option yet.
- the lzip project has different compressors and decompressors written in C and C++, so Mozilla would always have an alternative plan if something goes wrong.
- lunzip is really tiny and the next version will have an option to limit the memory use. This is perfect for memory-constrained environments. BTW, the standard version uses only 33MB of RAM for files compressed with -9.
- the lzip project uses the GPLv3 license for every project but also has an alternative compressor/decompressor with a public domain license (no complaints from i-dont-like-the-gpl environments).
- clzip and lunzip support any OS with a C compiler, with no external dependencies. There is also a Windows version.
- lzip uses the same interface and the same exit status numbers (https://lists.gnu.org/archive/html/bug-tar/2013-05/msg00001.html) as bzip2.
- the format is focused on long term archiving and high compression ratios.

622M firefox-25.0.source.tar
122M firefox-25.0.source.tar.bz2
98.0M firefox-25.0.source.tar.lz
97.9M firefox-25.0.source.tar.xz

http://lzip.nongnu.org/
Comment 11•9 years ago
(In reply to Juan Francisco Cantero Hurtado from comment #10)
> Please consider also to use lzip. lzip is another user of the lzma algorithm
> but has some advantages over xz:
> - plzip supports multithreaded compression/decompression. xz doesn't support
> this option yet.

There is pxz.

> - the lzip project uses the GPLv3 license for every project but also has an
> alternative compressor/decompressor with a public domain license (no
> complains from i-dont-like-the-gpl environments).

The bug only concerns GNU/Linux packages.

> - the format is focused on long term archiving and high compression ratios.

This bug is about *release* tarballs. Nightly builds would benefit from compression more, but release builds need recognizability first.

Numbers of packages installed and run recently on Debian popcon users' computers according to http://popcon.debian.org/:
          Inst    Vote
xz-utils  145439  39044
lzip      1272    168
xzdec     169     24
plzip     258     42
lunzip    122     24
pxz       86      10
clzip     60      20
Comment 12•9 years ago
bzip2     138838  66950
pbzip2    3926    499

Weird, bzip2 has fewer installs than xz-utils does?
Comment 13•9 years ago
(In reply to Aleksej [:Aleksej] from comment #11)
> (In reply to Juan Francisco Cantero Hurtado from comment #10)
> > Please consider also to use lzip. lzip is another user of the lzma algorithm
> > but has some advantages over xz:
> > - plzip supports multithreaded compression/decompression. xz doesn't support
> > this option yet.
>
> There is pxz.

Good to know.

> > - the lzip project uses the GPLv3 license for every project but also has an
> > alternative compressor/decompressor with a public domain license (no
> > complains from i-dont-like-the-gpl environments).
>
> The bug only concerns GNU/Linux packages.

I was thinking of the source tarballs, my bad :) . Yes, xz makes more sense in this context. I'll open another bug report.

> > - the format is focused on long term archiving and high compression ratios.
>
> This bug is about *release* tarballs. Nightly builds would benefit from
> compression more, but release builds need recognizability first.
>
> Numbers of packages installed and run recently on Debian popcon users'
> computers according to http://popcon.debian.org/:
> Inst Vote
> xz-utils 145439 39044
> lzip 1272 168
> xzdec 169 24
> plzip 258 42
> lunzip 122 24
> pxz 86 10
> clzip 60 20
Reporter | ||
Comment 14•9 years ago
(In reply to Juan Francisco Cantero Hurtado from comment #10)
> - plzip supports multithreaded compression/decompression. xz doesn't support
> this option yet.

Maybe you should first think about using pipes, as in the command below:
---
tar cf - [file 1] [file 2] | xz -9e > archive.tar.xz
---
Basically, this runs tar and xz as two separate processes at the same time. If there are several cores, you will get actual parallelism.

> - lunzip is really tiny and the next version will have an option to limit
> the memory use. This is perfect for memory-constrained enviroments. BTW, the
> standard version uses only 33MB of RAM for files compressed with -9.

The memory used for extracting Thunderbird isn't an issue, since at least 256 MB of physical memory (Windows and Mac) is expected in order to use this software:
http://www.mozilla.org/en-US/thunderbird/system-requirements/
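For readers who want to produce a `.tar.xz` without a shell pipeline, the same result can be sketched with Python's standard `tarfile` module, which writes the tar stream through an xz compressor in one process. This is an illustration only, not the approach used in the build system; the function names and the default preset of 6 are assumptions, and unlike the pipe it does not run archiving and compression as separate processes.

```python
import io
import tarfile


def make_tar_xz(files: dict, preset: int = 6) -> bytes:
    """Build an in-memory .tar.xz from a mapping of member name -> bytes."""
    buf = io.BytesIO()
    # "w:xz" wraps the tar writer in an xz compressor; `preset` mirrors xz -0..-9.
    with tarfile.open(fileobj=buf, mode="w:xz", preset=preset) as tar:
        for name, payload in sorted(files.items()):
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            info.uid = info.gid = 0  # same intent as tar --owner=0 --group=0
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def list_tar_xz(blob: bytes) -> list:
    """Return the member names stored in a .tar.xz blob."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:xz") as tar:
        return tar.getnames()
```

For example, `make_tar_xz({"README": b"hello"})` returns bytes that `unxz` and `tar` should be able to read once written to a file.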
Reporter | ||
Comment 15•9 years ago
I just sent a message to the dev-builds mailing list.
Comment 17•9 years ago
Hi mshal,

first patch on enabling the xz archives; this switches source-packages from bz2 to xz compression.

time make -f client.mk source-package

* firefox-35.0a1.source.tar.bz2
real 2m30.527s
user 1m52.656s
sys 0m9.690s
size: 169M

* firefox-35.0a1.source.tar.xz
real 10m28.532s
user 10m32.638s
sys 0m7.321s
size: 137M

To have it working on my local machine (OSX 10.9) I had to change the --exclude option to: --exclude='*/dist' (it was: --exclude='$(MOZILLA_DIR)/dist'); without this, the final archive is bigger than 1GB.
Our release process runs source-package on linux boxes and we don't have the huge tarball issue there.

I don't know if this is a problem of my local setup and/or macosx in general.
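As a quick reading of the figures in this comment, the size saving and the build-time cost can be worked out directly. The sketch below only restates the reported numbers (169M vs 137M, 2m30.527s vs 10m28.532s); the function name `summarize` is illustrative.

```python
def summarize(bz2_size: float, xz_size: float,
              bz2_seconds: float, xz_seconds: float) -> tuple:
    """Return (size saving in percent, slowdown factor) of xz relative to bzip2."""
    saving_pct = 100 * (bz2_size - xz_size) / bz2_size
    slowdown = xz_seconds / bz2_seconds
    return saving_pct, slowdown


# Wall-clock ("real") times from the comment, converted to seconds.
saving, slowdown = summarize(169, 137, 2 * 60 + 30.527, 10 * 60 + 28.532)
print(f"xz archive is {saving:.1f}% smaller, packaging takes {slowdown:.2f}x as long")
```

So on this machine the xz source package is roughly a fifth smaller at the cost of about four times the packaging time.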
Attachment #8497467 -
Flags: feedback?(mshal)
Comment 18•9 years ago
(In reply to Massimo Gervasini [:mgerva] from comment #17)
> To have it working on my local machine (OSX 10.9) I had to change the
> --exclude option to: --exclude='*/dist'
> (it was: --exclude='$(MOZILLA_DIR)/dist'), without this, the final archive
> is bigger than 1GB.
> Our release process runs source-package on linux boxes and we don't have the
> huge tarball issue there.
>
> I don't know if this is a problem of my local setup and/or macosx in general.

I mentioned this in an email (re-pasting below). I think the problem is this block:

  --exclude='$(MOZILLA_DIR)/dist'
ifdef MOZ_OBJDIR
SRC_TAR_EXCLUDE_PATHS += --exclude='$(MOZ_OBJDIR)'
endif

In release builds, we have MOZ_OBJDIR set, so it gets an '--exclude=/path/to/release/obj-firefox'. Since dist is under obj-firefox, dist is also implicitly ignored. This line:

  --exclude='$(MOZILLA_DIR)/dist'

points to a non-existent directory, since it's really $(MOZILLA_DIR)/objdir/dist that we want to ignore. I think it's just a bug from the original implementation that was missed because the MOZ_OBJDIR exclude effectively hides it. However, when you build locally, you probably don't set MOZ_OBJDIR explicitly, so that exclude doesn't show up. We should probably fix this instead to be:

-  --exclude='$(MOZILLA_DIR)/dist'
-ifdef MOZ_OBJDIR
-SRC_TAR_EXCLUDE_PATHS += --exclude='$(MOZ_OBJDIR)'
-endif
+  --exclude='$(notdir $(MOZ_BUILD_ROOT))'

Can you try this instead and see if it helps? If so, you should probably make this a separate bug to fix the exclude paths rather than include it as part of the xz changes.
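The point that excluding the object directory implicitly excludes dist can be shown with a toy filter. This is a Python analogue for illustration, not the Makefile fix and not GNU tar's pattern matcher: it drops any archive member that has an excluded directory name anywhere in its path. The name `make_exclude_filter` and the example paths are hypothetical.

```python
import tarfile
from pathlib import PurePosixPath


def make_exclude_filter(excluded_dir_names):
    """Return a filter suitable for tarfile.TarFile.add(..., filter=...).

    A member is dropped when any component of its path equals one of the
    excluded names. This also drops a plain file with that exact name,
    which is acceptable for this sketch.
    """
    excluded = frozenset(excluded_dir_names)

    def _filter(member: tarfile.TarInfo):
        parts = PurePosixPath(member.name).parts
        if excluded.intersection(parts):
            return None  # returning None tells tarfile to skip the member
        return member

    return _filter
```

With `make_exclude_filter({"obj-firefox"})`, a member such as `mozilla/obj-firefox/dist/bin/firefox` is skipped without any separate rule for `dist`, while `mozilla/browser/app/Makefile.in` is kept.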
Comment 19•9 years ago
Comment on attachment 8497467 [details] [diff] [review]
xz compression for souce-package

>diff --git a/toolkit/mozapps/installer/packager.mk b/toolkit/mozapps/installer/packager.mk
>--- a/toolkit/mozapps/installer/packager.mk
>+++ b/toolkit/mozapps/installer/packager.mk
>@@ -957,14 +957,14 @@ SRC_TAR_EXCLUDE_PATHS += \
> --exclude='.mozconfig*' \
> --exclude='*.pyc' \
> --exclude='$(MOZILLA_DIR)/Makefile' \
>- --exclude='$(MOZILLA_DIR)/dist'
>+ --exclude='*/dist'

See #c18

> ifdef MOZ_OBJDIR
> SRC_TAR_EXCLUDE_PATHS += --exclude='$(MOZ_OBJDIR)'
> endif
> CREATE_SOURCE_TAR = $(TAR) -c --owner=0 --group=0 --numeric-owner \
> --mode=go-w $(SRC_TAR_EXCLUDE_PATHS) -f
>
>-SOURCE_TAR = $(DIST)/$(PKG_SRCPACK_PATH)$(PKG_SRCPACK_BASENAME).tar.bz2
>+SOURCE_TAR = $(DIST)/$(PKG_SRCPACK_PATH)$(PKG_SRCPACK_BASENAME).tar.xz
> HG_BUNDLE_FILE = $(DIST)/$(PKG_SRCPACK_PATH)$(PKG_BUNDLE_BASENAME).bundle
> SOURCE_CHECKSUM_FILE = $(DIST)/$(PKG_SRCPACK_PATH)$(PKG_SRCPACK_BASENAME).checksums
> SOURCE_UPLOAD_FILES = $(SOURCE_TAR)
>@@ -993,7 +993,7 @@ endif
> source-package:
> @echo 'Packaging source tarball...'
> $(MKDIR) -p $(DIST)/$(PKG_SRCPACK_PATH)
>- (cd $(MOZ_PKG_SRCDIR) && $(CREATE_SOURCE_TAR) - $(DIR_TO_BE_PACKAGED)) | bzip2 -vf > $(SOURCE_TAR)
>+ (cd $(MOZ_PKG_SRCDIR) && $(CREATE_SOURCE_TAR) - $(DIR_TO_BE_PACKAGED)) | xz -9e > $(SOURCE_TAR)
> $(SIGN_SOURCE_TAR_CMD)
>
> hg-bundle:

Do we have any interest/need to maintain both bz2 and xz? If not, this looks fine to me!
Attachment #8497467 -
Flags: feedback?(mshal) → feedback+
Comment 20•9 years ago
(In reply to Michael Shal [:mshal] from comment #18)
> + --exclude='$(notdir $(MOZ_BUILD_ROOT))'

This assumes MOZ_BUILD_ROOT is a subdirectory of the current directory. It's preferable not to make such assumptions.
Comment 21•9 years ago
(In reply to Mike Hommey [:glandium] from comment #20)
> (In reply to Michael Shal [:mshal] from comment #18)
> > + --exclude='$(notdir $(MOZ_BUILD_ROOT))'
>
> This assumes MOZ_BUILD_ROOT is a subdirectory of the current directory.
>
> It's preferable not to do such assumptions.

Ahh, ok. What should we use here then?
Reporter | ||
Comment 22•9 years ago
I would like to reply to comments #10 to #14 for later reference: parallel compression makes the compression ratio worse than single-threaded compression. Since compression speed isn't an issue when building the release tarball, I think the "tar ... | xz -9e" command is a good tradeoff. If you want to speed up the compression, maybe you could try the "tar ... | xz -9" command (without the "-e" extreme option), which still provides a good compression ratio.
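The effect of the extreme option can be checked on sample data with Python's standard `lzma` module, where `lzma.PRESET_EXTREME` corresponds to the `-e` flag. This is a rough sketch under stated assumptions: the helper name `xz_sizes` is illustrative, and the default level is 6 rather than 9 so the example does not need the large amount of compression memory discussed earlier in this bug.

```python
import lzma


def xz_sizes(data: bytes, level: int = 6) -> dict:
    """Compress `data` at `level` with and without the extreme flag.

    Returns the compressed sizes in bytes; timing is left to the caller.
    """
    plain = lzma.compress(data, preset=level)
    extreme = lzma.compress(data, preset=level | lzma.PRESET_EXTREME)
    # Both variants must decode back to the original input.
    if lzma.decompress(plain) != data or lzma.decompress(extreme) != data:
        raise ValueError("round trip failed")
    return {"plain": len(plain), "extreme": len(extreme)}
```

Running this on a real tarball and timing both calls would show how much extra time the extreme mode costs for whatever size reduction it gives on that input.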
Updated•8 years ago
OS: Linux → All
Hardware: x86_64 → All
Comment 23•8 years ago
Do we need to block on the local OSX case here?
Updated•8 years ago
Attachment #8497467 -
Attachment is patch: true
Attachment #8497467 -
Attachment mime type: text/x-patch → text/plain
Comment 24•8 years ago
Attachment #8497467 -
Attachment is obsolete: true
Attachment #8623190 -
Flags: review?(mshal)
Comment 25•8 years ago
Comment on attachment 8623190 [details] [diff] [review]
use xz for source archives

Looks fine, as long as try is happy.
Attachment #8623190 -
Flags: review?(mshal) → review+
Comment 26•8 years ago
Comment on attachment 8623190 [details] [diff] [review]
use xz for source archives

there is no try
Attachment #8623190 -
Flags: checked-in+
Comment 28•8 years ago
https://hg.mozilla.org/mozilla-central/rev/834ad47007f2
Status: UNCONFIRMED → RESOLVED
Closed: 8 years ago
status-firefox41:
--- → fixed
Resolution: --- → FIXED
Assignee | ||
Updated•5 years ago
Component: General Automation → General