Open Bug 1710599 Opened 3 years ago Updated 2 days ago

Switch from bz2 to zstd for Firefox releases on https://ftp.mozilla.org/

Categories

(Release Engineering :: General, enhancement)

Tracking

(Not tracked)

UNCONFIRMED

People

(Reporter: aros, Unassigned)

References

(Blocks 1 open bug)

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:88.0) Gecko/20100101 Firefox/88.0

Steps to reproduce:

Consider these:

-rw-r--r--.  1 user user 75181163 May 11 11:01 firefox-88.0.1.tar.bz2
-rw-r--r--.  1 user user 64302774 May 11 11:03 firefox-88.0.1.tar.zst

I.e. about 14.5% smaller.

Time to unpack:

time bzip2 -t firefox-88.0.1.tar.bz2 

real	0m7.095s
user	0m7.072s
sys	0m0.018s

time zstd -t firefox-88.0.1.tar.zst 
firefox-88.0.1.tar.zst     : 230144000 bytes                                          

real	0m0.318s
user	0m0.311s
sys	0m0.007s

I.e. 22 times faster.
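As a quick sanity check, the two ratios can be recomputed from the byte counts and wall-clock times in the listings above. This is only a small arithmetic sketch; the variable names are made up for illustration.

```python
# Byte counts and wall-clock ("real") times copied from the listings above.
bz2_size = 75181163
zst_size = 64302774
bz2_seconds = 7.095
zst_seconds = 0.318

# Size reduction relative to the bz2 archive, in percent.
size_reduction = (bz2_size - zst_size) / bz2_size * 100

# How many times faster the zstd integrity test ran.
speedup = bz2_seconds / zst_seconds

print(f"{size_reduction:.2f}% smaller, {speedup:.1f}x faster")
```

With these inputs the reduction comes out at roughly 14.5% and the speedup at roughly 22x.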

Compressed using --ultra -22 --long

There are compatibility reasons to keep bz2, but last time I think we discussed this, zstd was still young. Maybe time to reconsider?

Flags: needinfo?(bhearsum)
Product: Firefox Build System → Release Engineering
QA Contact: mtabara
Version: Trunk → unspecified

(In reply to Mike Hommey [:glandium] from comment #1)

There are compatibility reasons to keep bz2, but last time I think we discussed this, zstd was still young. Maybe time to reconsider?

All the main distros support zstd out of the box now as far as I know, so I don't see a compatibility reason why we shouldn't do this. The only old bug I can find on this is https://bugzilla.mozilla.org/show_bug.cgi?id=1303190.

Our builds are much less used than distro builds, though, so the benefits are more marginal than one might think. Given that we're very, very understaffed at the moment, I doubt we'd be able to prioritize this right now.

Flags: needinfo?(bhearsum)
  1. ZSTD nowadays is supported out of the box by all major distros
  2. Most distro users use the Firefox build bundled by their distro rather than the official packages, so the format you distribute Firefox in doesn't affect them
  3. Those who actually download Firefox from your website I'm sure can install ZSTD on their system even if their distro is really old.

You'll save a lot of space by converting to ZSTD.

Severity: -- → S3

This is still relevant.

Type: defect → enhancement

(In reply to Artem S. Tashkinov from comment #3)

You'll save a lot of space by converting to ZSTD.

Thank you for bringing this up, Artem! I know cloud storage is usually cheap. That said, I don't know how much bandwidth we could save if we publish Linux archives that are 15% smaller. Tom, is this number easy to estimate for the CloudOps team? If not, what team should we reach out to?

Flags: needinfo?(thealy)

It's not just about saving space. As I've shown earlier, zstd is 22 times faster at decompression than bzip2.

(In reply to Johan Lorenzo [:jlorenzo] from comment #5)

(In reply to Artem S. Tashkinov from comment #3)

You'll save a lot of space by converting to ZSTD.

Thank you for bringing this up, Artem! I know cloud storage is usually cheap. That said, I don't know how much bandwidth we could save if we publish Linux archives that are 15% smaller. Tom, is this number easy to estimate for the CloudOps team? If not, what team should we reach out to?

Yes, cloud storage is on the cheaper side, but the bandwidth and CDN costs to distribute it come into play as well. It's not a quick estimate, as we'd have to segment it out from the other items sharing the bandwidth and CDN. Honestly, if the level of development effort is not that big, I don't see a downside to saving costs and speeding up decompression.

Flags: needinfo?(thealy)
See Also: → 1303190

downloading a recent nightly:

$ wget https://archive.mozilla.org/pub/firefox/nightly/2024/09/2024-09-08-21-18-00-mozilla-central-l10n/firefox-132.0a1.fr.linux-x86_64.tar.bz2
# size = 97805542
$ tar jxvf ../firefox-132.0a1.fr.linux-x86_64.tar.bz2

I just tried with:

$ ZSTD_CLEVEL=19 tar -I zstd -cvpf firefox-132.0a1.fr.linux-x86_64.tar.zst firefox  
# size = 84456956 

xz is doing better:

$ XZ_OPT=-9 tar -cJf firefox-132.0a1.fr.linux-x86_64.tar.xz firefox  
# size = 76992276

XZ is very slow to unpack, in fact it's more than ten times slower and offers only a marginal ~5% compression ratio improvement.

In fact a year ago or so I convinced NVIDIA to switch from XZ to ZSTD and they actually liked it.

Also, you did not use the stronger compression options for ZSTD. Please use --ultra -22 --long.

The default options are meant for serving web content and are not so good for offline compression.

And with proper options:

zstd --ultra -22 --long *.tar
xz -9e *.tar
bz2 (the archive as downloaded, not recompressed)

-rw-r--r--. 1 birdie birdie 97805542 Sep  8 23:49 firefox-132.0a1.fr.linux-x86_64.tar.bz2
-rw-r--r--. 1 birdie birdie 77070748 Sep  8 23:49 firefox-132.0a1.fr.linux-x86_64.tar.xz
-rw-r--r--. 1 birdie birdie 82182583 Sep  8 23:49 firefox-132.0a1.fr.linux-x86_64.tar.zst

And time to decompress:

time zstd -t *.zst 
firefox-132.0a1.fr.linux-x86_64.tar.zst: 316395520 bytes                       

real	0m0.303s
user	0m0.296s
sys	0m0.029s

time xz -t *xz 

real	0m2.601s
user	0m2.579s
sys	0m0.017s

time bzip2 -t *bz2

real	0m5.486s
user	0m5.454s
sys	0m0.020s

bzip2 is the absolute worst.

ZSTD is 8.7 times faster than XZ and 18 times faster than bzip2.

Ran a few tests to see what the average gain would be across different locales:

ach   linux-x86_64: 83.38MB -> 69.84MB (16.23% smaller)
ach   linux-i686  : 84.37MB -> 72.75MB (13.77% smaller)
en-CA linux-i686  : 84.63MB -> 72.88MB (13.89% smaller)
en-US linux-x86_64: 83.18MB -> 69.94MB (15.91% smaller)
fr    linux-i686  : 84.68MB -> 73.21MB (13.55% smaller)
en-US linux-i686  : 84.50MB -> 72.85MB (13.78% smaller)
en-CA linux-x86_64: 83.27MB -> 69.96MB (15.98% smaller)
fr    linux-x86_64: 83.67MB -> 70.28MB (16.01% smaller)
fi    linux-i686  : 84.38MB -> 72.82MB (13.69% smaller)
fi    linux-x86_64: 83.19MB -> 69.90MB (15.97% smaller)
es-ES linux-x86_64: 83.68MB -> 70.11MB (16.22% smaller)
bs    linux-x86_64: 83.16MB -> 69.86MB (15.99% smaller)
he    linux-x86_64: 83.27MB -> 69.90MB (16.07% smaller)
pt-BR linux-x86_64: 83.59MB -> 70.06MB (16.19% smaller)
bs    linux-i686  : 84.34MB -> 72.79MB (13.69% smaller)
es-ES linux-i686  : 84.63MB -> 73.03MB (13.70% smaller)
he    linux-i686  : 84.32MB -> 72.83MB (13.62% smaller)
pt-BR linux-i686  : 84.53MB -> 72.99MB (13.65% smaller)
Average reduction: 14.88%
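The table mixes two architectures that behave quite differently, so it may help to split the average. The sketch below just re-averages the percentages listed above; the grouping and names are my own, not part of the original script.

```python
from statistics import mean

# Percent reductions copied from the table above, grouped by architecture.
reductions = {
    "linux-x86_64": [16.23, 15.91, 15.98, 16.01, 15.97, 16.22, 15.99, 16.07, 16.19],
    "linux-i686": [13.77, 13.89, 13.55, 13.78, 13.69, 13.69, 13.70, 13.62, 13.65],
}

# Mean reduction per architecture, and across all 18 builds.
per_arch = {arch: mean(values) for arch, values in reductions.items()}
overall = mean(v for values in reductions.values() for v in values)

for arch, value in per_arch.items():
    print(f"{arch}: {value:.2f}%")
print(f"overall: {overall:.2f}%")
```

The x86_64 builds land near 16% and the i686 builds near 13.7%, which together give the 14.88% overall figure.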

I used what we'd use in CI for zstd:

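import tarfile
from contextlib import chdir  # Python 3.11+ (assumed source of chdir)
import zstandard as zstd  # python-zstandard package
# output_file and input_dir are defined elsewhere in the script.
cctx = zstd.ZstdCompressor(level=22)
with open(output_file, "wb") as f, cctx.stream_writer(f) as z:
    with tarfile.open(mode="w|", fileobj=z) as tf:
        with chdir(input_dir):
            tf.add("firefox")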

What's the impact on memory usage for decompression, when using level=22 vs lower levels (and vs bzip2, I guess)?

Flags: needinfo?(hneiva)

Also, what's the impact on the time spent compressing the archive (vs xz, I guess)?

Was curious about xz, so switched my script to use xz/lzma compression:

en-US linux-x86_64: 83.18MB -> 65.46MB (21.30% smaller)
fr    linux-i686  : 84.68MB -> 68.52MB (19.09% smaller)
fr    linux-x86_64: 83.67MB -> 65.79MB (21.37% smaller)
pt-BR linux-i686  : 84.53MB -> 68.32MB (19.17% smaller)
ach   linux-x86_64: 83.38MB -> 65.36MB (21.60% smaller)
en-US linux-i686  : 84.50MB -> 68.18MB (19.31% smaller)
ach   linux-i686  : 84.37MB -> 68.09MB (19.30% smaller)
en-CA linux-i686  : 84.63MB -> 68.21MB (19.41% smaller)
fi    linux-i686  : 84.38MB -> 68.15MB (19.23% smaller)
fi    linux-x86_64: 83.19MB -> 65.43MB (21.35% smaller)
pt-BR linux-x86_64: 83.59MB -> 65.59MB (21.53% smaller)
en-CA linux-x86_64: 83.27MB -> 65.49MB (21.35% smaller)
es-ES linux-x86_64: 83.68MB -> 65.63MB (21.58% smaller)
bs    linux-x86_64: 83.16MB -> 65.40MB (21.36% smaller)
he    linux-x86_64: 83.27MB -> 65.43MB (21.43% smaller)
es-ES linux-i686  : 84.63MB -> 68.36MB (19.22% smaller)
bs    linux-i686  : 84.34MB -> 68.11MB (19.24% smaller)
he    linux-i686  : 84.32MB -> 68.16MB (19.16% smaller)
Average reduction: 20.33%
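For anyone who wants to reproduce a bz2 vs xz comparison like this locally, here is a minimal standard-library sketch. It is not the script used above: the function names, the `xz_preset` parameter and the temporary output path are assumptions for illustration.

```python
import os
import tarfile
import tempfile


def archive_size(src_dir, mode, **options):
    """Pack src_dir with the given tarfile mode and return the archive size in bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        out_path = os.path.join(tmp, "out.tar")
        with tarfile.open(out_path, mode, **options) as tf:
            tf.add(src_dir, arcname=os.path.basename(os.path.abspath(src_dir)))
        return os.path.getsize(out_path)


def percent_smaller(old_size, new_size):
    """Reduction of new_size relative to old_size, in percent."""
    return (old_size - new_size) / old_size * 100


def compare_bz2_xz(src_dir, xz_preset=9):
    """Return (bz2 bytes, xz bytes, percent saved by xz) for one directory."""
    bz2_size = archive_size(src_dir, "w:bz2", compresslevel=9)
    xz_size = archive_size(src_dir, "w:xz", preset=xz_preset)
    return bz2_size, xz_size, percent_smaller(bz2_size, xz_size)
```

Calling `compare_bz2_xz("firefox")` on an unpacked build would give one row of the table. Note that preset 9 needs several hundred MB of memory while compressing.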

Also ran some tests to find memory usage + time spent.
(keep in mind this is running on my local computer, YMMV)

Using build: en-US linux-x86_64 firefox 130.0 (84MB in tar.bz2 format)


Compressing

zstd with -22: ~1GB of memory in 171 seconds -> file size: 71MB
xz with -9: ~730MB of memory in 143 seconds -> file size: 65MB
bzip2 with -9: ~20MB of memory in 20 seconds -> file size: 84MB (ran via cli and not python)


Decompressing

(all via cli)
zstd: 140MB of memory in 5 seconds
xz: 73MB of memory in 6 seconds
bzip2: 12MB of memory in 10 seconds


By looking at these numbers, it seems xz is a more sensible option than zstd?
It does use more memory than bzip2 to decompress, but I don't think <100MB of ram usage is a huge concern?
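A rough way to repeat the timing side of this in-process, using only the standard library, is sketched below. It is an assumption-laden stand-in for the CLI runs above: it covers bzip2 and xz only (zstd is left out to stay within standard-library modules that exist in most Python versions), it does not measure memory, and the helper names are hypothetical.

```python
import bz2
import lzma
import time


def time_decompress(decompress, blob, runs=3):
    """Return the best wall-clock time in seconds over several runs."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        decompress(blob)
        best = min(best, time.perf_counter() - start)
    return best


def compare_decompression(payload):
    """Compress payload with bzip2 and xz, then time decompressing each."""
    blobs = {
        "bzip2": (bz2.decompress, bz2.compress(payload, 9)),
        "xz": (lzma.decompress, lzma.compress(payload, preset=6)),
    }
    return {name: time_decompress(fn, blob) for name, (fn, blob) in blobs.items()}
```

Feeding it the bytes of an uncompressed firefox tar would give numbers comparable in spirit to the ones above, though absolute values will differ from the CLI tools.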

Flags: needinfo?(hneiva)

By looking at these numbers, it seems xz is a more sensible option than zstd?

I think we should test on other hardware, like an old SSD, before making a decision.

You should also use hyperfine for benchmarking.

Ran some decompression benchmarks with hyperfine as suggested by :Sylvestre (thanks for the tool suggestion BTW!)

Note: both VMs run the same version of Debian on GCP.


2 Core - 2GB ram - VM with balanced SSD disk

$ hyperfine --runs 10 --prepare 'rm -rf firefox/; sync; echo 3 | sudo tee /proc/sys/vm/drop_caches' \
"tar xf firefox-130.0.tar.xz" \
"tar xf firefox-130.0.tar.zst" \
"tar xf firefox-130.0.tar.bz2"
Benchmark 1: tar xf firefox-130.0.tar.xz
  Time (mean ± σ):      6.273 s ±  0.251 s    [User: 6.057 s, System: 1.118 s]
  Range (min … max):    6.055 s …  6.690 s    10 runs
 
Benchmark 2: tar xf firefox-130.0.tar.zst
  Time (mean ± σ):     970.2 ms ±  59.9 ms    [User: 876.4 ms, System: 697.9 ms]
  Range (min … max):   867.2 ms … 1090.6 ms    10 runs
 
Benchmark 3: tar xf firefox-130.0.tar.bz2
  Time (mean ± σ):     20.612 s ±  1.315 s    [User: 20.216 s, System: 2.002 s]
  Range (min … max):   19.011 s … 22.862 s    10 runs
 
Summary
  'tar xf firefox-130.0.tar.zst' ran
    6.47 ± 0.48 times faster than 'tar xf firefox-130.0.tar.xz'
   21.25 ± 1.89 times faster than 'tar xf firefox-130.0.tar.bz2'

2 Core - 2GB ram - VM with basic HDD disk

$ hyperfine --runs 10 --prepare 'rm -rf firefox/; sync; echo 3 | sudo tee /proc/sys/vm/drop_caches' \
"tar xf firefox-130.0.tar.xz" \
"tar xf firefox-130.0.tar.zst" \
"tar xf firefox-130.0.tar.bz2"
Benchmark 1: tar xf firefox-130.0.tar.xz
  Time (mean ± σ):      7.420 s ±  0.058 s    [User: 6.950 s, System: 1.107 s]
  Range (min … max):    7.340 s …  7.515 s    10 runs
 
Benchmark 2: tar xf firefox-130.0.tar.zst
  Time (mean ± σ):      1.942 s ±  0.119 s    [User: 1.176 s, System: 0.758 s]
  Range (min … max):    1.813 s …  2.202 s    10 runs
 
Benchmark 3: tar xf firefox-130.0.tar.bz2
  Time (mean ± σ):     20.234 s ±  0.500 s    [User: 19.183 s, System: 1.370 s]
  Range (min … max):   19.793 s … 21.289 s    10 runs
 
Summary
  'tar xf firefox-130.0.tar.zst' ran
    3.82 ± 0.24 times faster than 'tar xf firefox-130.0.tar.xz'
   10.42 ± 0.69 times faster than 'tar xf firefox-130.0.tar.bz2'
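The "times faster" lines in the summaries can be reproduced from the means and standard deviations. The sketch below assumes the uncertainty is the usual first-order error propagation for a ratio; that is my reading of the numbers, not something taken from the hyperfine docs, and the function name is made up.

```python
import math


def relative_speed(mean_slow, sigma_slow, mean_fast, sigma_fast):
    """Ratio of two means with first-order propagated uncertainty."""
    ratio = mean_slow / mean_fast
    error = ratio * math.sqrt(
        (sigma_slow / mean_slow) ** 2 + (sigma_fast / mean_fast) ** 2
    )
    return ratio, error


# Means and standard deviations in seconds, from the SSD run above.
zst_run = (0.9702, 0.0599)
xz_run = (6.273, 0.251)
bz2_run = (20.612, 1.315)

xz_ratio, xz_err = relative_speed(*xz_run, *zst_run)
bz2_ratio, bz2_err = relative_speed(*bz2_run, *zst_run)
print(f"xz:  {xz_ratio:.2f} ± {xz_err:.2f}")
print(f"bz2: {bz2_ratio:.2f} ± {bz2_err:.2f}")
```

Plugging in the HDD means and deviations the same way gives the second summary.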

Thoughts

zstd is the fastest option, but uses a bit more RAM than xz (~140MB vs ~75MB)
xz has the best compression ratio (~20% reduction vs bzip2; zstd has ~15%)

Either of those options is a great improvement over bzip2.

I'd be happy to run other scenarios if you need.

We can tweak the compression effort of those, but xz will always beat zstd in compression ratio, zstd will always beat everything in speed. bz2 shouldn't be measured at this point.

In general, it's important to take into account the number of downloads and decompression of a file when deciding on a compression format, and the projected cost of bandwidth.

For CI, zstd can be nice: we don't pay for transfer, we pay per second. zstd compresses faster, decompresses faster, and bandwidth is extremely high.

For releases, which I believe to be the focus here, xz shaves 6MB off the size. We should do further tests with the --x86 flag for binaries (see the man page). That can improve compression by a few percentage points (10-ish?). We don't really care about decode speed (not our problem, and we can do it in the background), because what we care about is storage cost and egress cost. I believe compression duration isn't important, given how few compressions there are relative to decompressions.

I tried adding the x86 filter like the docs suggested:

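    import lzma
    import tarfile
    from contextlib import chdir  # Python 3.11+ (assumed source of chdir)

    # x86 BCJ filter first, then LZMA2 at the highest preset.
    # output_file and input_dir are defined elsewhere in the script.
    filters = [
        {"id": lzma.FILTER_X86},
        {"id": lzma.FILTER_LZMA2, "preset": 9 | lzma.PRESET_EXTREME},
    ]
    with lzma.open(output_file, "wb", filters=filters) as f:
        with tarfile.open(mode="w|", fileobj=f) as tf:
            with chdir(input_dir):
                tf.add("firefox")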

This shaves another 0.55MB off the compressed file:

Original XZ -9: 65.46MB
XZ with filters: 64.91MB
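One thing worth confirming is that the extra filter costs nothing on the consumer side: the .xz container records its filter chain, so a plain decompress call should work without being told about it. A small sketch of that check follows; `compress_xz` and its parameters are hypothetical names, and how much the filter helps depends on how much of the payload is x86 machine code.

```python
import lzma


def compress_xz(data, preset=9 | lzma.PRESET_EXTREME, x86_filter=True):
    """Compress data into the .xz format, optionally with the x86 BCJ filter."""
    chain = [{"id": lzma.FILTER_LZMA2, "preset": preset}]
    if x86_filter:
        # The BCJ filter must come before LZMA2 in the chain.
        chain.insert(0, {"id": lzma.FILTER_X86})
    return lzma.compress(data, format=lzma.FORMAT_XZ, filters=chain)


def roundtrips(data, **options):
    """True if data survives compress_xz followed by a plain decompress."""
    # No filter arguments on the way back: the chain is read from the stream.
    return lzma.decompress(compress_xz(data, **options)) == data
```

Comparing `len(compress_xz(data))` with `len(compress_xz(data, x86_filter=False))` on a real build would show the gain for that payload.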

I think there are two factors, somewhat touched on by Paul, and what I'm going to talk about is broader than what this single bug is about.

  • We care about the size of what we ship to users because it impacts how much external bandwidth we use.
  • We care how long our CI pipeline takes, and the longer compressing the archive/installer takes, the worse it is for us. That is further compounded by the fact that we do that twice(!), and I think we should change that.

What happens now is that we build Firefox and, in the same job, create an archive/installer. Then, in a separate task, we unpack that archive/installer, sign things, and recreate an archive/installer. There is, in fact, no reason for us to keep creating an efficient archive/installer in the first step. There isn't even a reason to keep creating a .dmg or a .exe in the first step. The output of the build tasks could be a tar.zst with not even the best compression level. What matters is that the repack jobs do their best.
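To make the split concrete, here is a rough sketch of what the second step could look like: read whatever cheap archive the build produced and rewrite it as a well-compressed tar.xz. It is only an illustration under assumptions. The function name is made up, signing is omitted, and the input is whatever `tarfile` can read transparently (gzip, bzip2, xz), standing in for a low-level tar.zst.

```python
import tarfile


def repack_as_xz(src_path, dst_path, preset=9):
    """Copy every member of an existing tar archive into a new .tar.xz."""
    # "r:*" lets tarfile detect the input compression on its own.
    with tarfile.open(src_path, "r:*") as src:
        with tarfile.open(dst_path, "w:xz", preset=preset) as dst:
            for member in src:
                # Only regular files carry data; dirs and links are metadata only.
                fileobj = src.extractfile(member) if member.isfile() else None
                dst.addfile(member, fileobj)
```

In a real repack task the signing would happen between reading and writing; the point here is only that the expensive compression runs once, at the end.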

Now, as for whether to choose xz or zstd specifically for what we ship to users on Linux, all things considered, I would pick xz.

While I see the appeal of using zstd in CI for its speed, I also think that sticking with xz across the board makes the most sense for simplicity and consistency. Juggling between zstd for CI and xz for shipping would introduce more complexity, especially when ensuring that xz works correctly for final delivery. Keeping things straightforward by using only xz would reduce potential issues, while still achieving our goals for efficient shipping and pipeline optimization. Ideally, we can revisit zstd once things are more aligned, but for now, focusing on one method seems like the most efficient approach.

We should do a back-of-the-envelope calculation of the number of seconds we'd win by speeding up compression and decompression by 5-ish, and translate this to $, especially in light of what glandium says. It probably isn't negligible, and we already seem to use zst for various artifacts, so that would be a win for consistency, not a regression.

Here are some rough estimates based on a release graph, which I estimated at ~2300 linux tasks

Speed gain zst vs xz (range is with* and without SSD VMs)

  • Decompression: 5.2* ~ 5.5 seconds
  • Compression: 49 ~ 79* seconds

Total: from 124660 to 194350 seconds
Converted estimate: 35 to 54 hours

Based on c2-standard-8 machine @ $0.07 hourly (before discounts)

ZST over XZ potential savings: $2.45 to $3.78 per release.
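For reference, the arithmetic behind these figures can be written out as below. The constants are the ones quoted above; the function and variable names are mine, and the hours are rounded before pricing, which is what reproduces the dollar amounts.

```python
LINUX_TASKS = 2300   # estimated Linux tasks in a release graph
HOURLY_RATE = 0.07   # USD per hour for c2-standard-8, before discounts


def estimate(decompress_gain_s, compress_gain_s):
    """Return (total seconds, rounded hours, USD) saved per release."""
    total_seconds = LINUX_TASKS * (decompress_gain_s + compress_gain_s)
    hours = round(total_seconds / 3600)
    return round(total_seconds), hours, round(hours * HOURLY_RATE, 2)


# Per-task gains in seconds, taken from the two ends of the ranges above.
low = estimate(5.2, 49)
high = estimate(5.5, 79)
print(low)
print(high)
```

The low end works out to 124660 seconds, 35 hours and $2.45; the high end to 194350 seconds, 54 hours and $3.78.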


All that being said, supporting ZST for in-between tasks and XZ for the final product would require us to add steps and intentionally switch the format (likely in a repackage job, which Linux builds currently don't use).
I'll leave that idea as a potential future improvement, unless anyone objects.


Compression benchmarks:

SSD:

hneiva@hneiva-compression-study-ssd:~/study2$ hyperfine --runs 10 --prepare 'rm -rf ./firefox.tar.*; sync; echo 3 | sudo tee /proc/sys/vm/drop_caches' \
"tar -I 'zstd -22 -T0' -cf firefox-130.0.tar.zst firefox/" \
"tar -I 'xz -9 -T0' -cf firefox-130.0.tar.xz firefox"
Benchmark 1: tar -I 'zstd -22 -T0' -cf firefox-130.0.tar.zst firefox/
  Time (mean ± σ):     90.025 s ±  1.301 s    [User: 176.509 s, System: 0.719 s]
  Range (min … max):   88.407 s … 92.032 s    10 runs
 
Benchmark 2: tar -I 'xz -9 -T0' -cf firefox-130.0.tar.xz firefox
  Time (mean ± σ):     169.736 s ±  6.836 s    [User: 169.435 s, System: 0.988 s]
  Range (min … max):   165.144 s … 183.177 s    10 runs
 
Summary
  'tar -I 'zstd -22 -T0' -cf firefox-130.0.tar.zst firefox/' ran
    1.89 ± 0.08 times faster than 'tar -I 'xz -9 -T0' -cf firefox-130.0.tar.xz firefox'

Non-SSD:

hneiva@hneiva-crompression-study:~/study2$ hyperfine --runs 10 --prepare 'rm -rf ./firefox.tar.*; sync; echo 3 | sudo tee /proc/sys/vm/drop_caches' \
"tar -I 'zstd -22 -T0' -cf firefox-130.0.tar.zst firefox/" \
"tar -I 'xz -9 -T0' -cf firefox-130.0.tar.xz firefox"
Benchmark 1: tar -I 'zstd -22 -T0' -cf firefox-130.0.tar.zst firefox/
  Time (mean ± σ):     159.919 s ±  0.892 s    [User: 312.455 s, System: 1.173 s]
  Range (min … max):   158.797 s … 161.456 s    10 runs
 
Benchmark 2: tar -I 'xz -9 -T0' -cf firefox-130.0.tar.xz firefox
  Time (mean ± σ):     209.398 s ±  3.043 s    [User: 206.060 s, System: 1.209 s]
  Range (min … max):   202.577 s … 213.635 s    10 runs
 
Summary
  'tar -I 'zstd -22 -T0' -cf firefox-130.0.tar.zst firefox/' ran
    1.31 ± 0.02 times faster than 'tar -I 'xz -9 -T0' -cf firefox-130.0.tar.xz firefox'