
Recompress APK before zipalign/jarsigner (potential ~13% space reduction)

RESOLVED WONTFIX

Status

()

Firefox for Android
Build Config & IDE Support
RESOLVED WONTFIX
2 years ago
11 months ago

People

(Reporter: rnewman, Unassigned)

Tracking

(Blocks: 1 bug)

Trunk
Points:
---

Firefox Tracking Flags

(fennec+)

Details

Attachments

(1 attachment)

(Reporter)

Description

2 years ago
http://advancemame.sourceforge.net/doc-advzip.html offers better zip compression, including Zopfli.

It also allows recompression of existing files, which is handy 'cos we're dealing with jar files.

50 iterations takes a while on my machine, but produces the following:

36750514 Jun 11 11:12 gecko-unsigned-unaligned.apk

(6 minutes later)

32252636 Jun 11 11:12 gecko-unsigned-unaligned.apk

Yeah, that's a 13% saving.
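For reference, the saving implied by the two listings above can be worked out directly. This is only a small arithmetic sketch in Python, with illustrative variable names; on these exact byte counts the figure comes out closer to 12% than to the rounded 13%.

```python
# Sizes in bytes, copied from the two directory listings above.
before = 36750514
after = 32252636

saved = before - after
percent = 100.0 * saved / before

print("saved %d bytes (%.1f%% of the original)" % (saved, percent))
```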
(Reporter)

Comment 1

2 years ago
Invocation:

advzip --recompress -k -4 --iter 50 gecko-unsigned-unaligned.apk

`brew install AdvanceCOMP` if you want to give it a try.

I get a CRC error on the final output of packaging, though:

advzip --recompress -k -4 --iter 50 original.apk
Invalid crc on data descriptor 2903009080/134695760 on original.apk
(Reporter)

Comment 2

2 years ago
(Props to hsivonen for suggesting this in Bug 864843.)
(Reporter)

Comment 3

2 years ago
Looks like this might break stuff, though:

E/GeckoJarReader(19073): java.lang.IllegalArgumentException: Got class java.util.zip.InflaterInputStream, but expected ByteBufferInputStream!
E/GeckoJarReader(19073): 	at org.mozilla.gecko.mozglue.NativeZip.<init>(NativeZip.java:28)
E/GeckoJarReader(19073): 	at org.mozilla.gecko.util.GeckoJarReader.getStream(GeckoJarReader.java:131)
E/GeckoJarReader(19073): 	at org.mozilla.gecko.util.GeckoJarReader.getBitmapDrawable(GeckoJarReader.java:48)
E/GeckoJarReader(19073): 	at org.mozilla.gecko.util.GeckoJarReader.getBitmap(GeckoJarReader.java:35)
(Reporter)

Updated

2 years ago
tracking-fennec: --- → ?
status-firefox41: affected → ---
Summary: Recompress APK before zipalign/jarsigner → Recompress APK before zipalign/jarsigner (potential ~13% space reduction)
(Jim Chen [:jchen] [:darchons])

Comment 4

We don't compress some stuff on purpose, like lib*.so, omni.ja, and classes.dex. Maybe you can try again with those excluded.
Comment 5

(In reply to Jim Chen [:jchen] [:darchons] from comment #4)
> We don't compress some stuff on purpose like lib*.so, omni.ja, and
> classes.dex. Maybe you can try again with those excluded.

In fact, I think the *.so files are szipped already and shouldn't be touched, otherwise the reading of the files is broken? Maybe?
(Mike Hommey [:glandium])

Comment 6

Why use advzip (never heard of it until now) instead of 7z (that we likely have on the build slaves)?

> In fact, I think the *.so files are szipped already and shouldn't be touched, otherwise the reading of the files is broken?

That is true.
(Mike Hommey [:glandium])

Comment 7

Oh, -N I guess... adding 5 minutes to the builds is not very nice, though.
(Reporter)

Comment 8

2 years ago
(In reply to Mike Hommey [:glandium] from comment #7)
> Oh, -N I guess... adding 5 minutes to the builds is not very nice, though.

Yeah, I wouldn't expect this to run on every push.


(In reply to Mike Hommey [:glandium] from comment #6)
> Why use advzip (never heard of it until now) instead of 7z (that we likely
> have on the build slaves)?

It was the obvious search result that had an implementation of Zopfli (which is very very slow but typically beats everything else in output size) and could do in-place recompression.

I'd be very happy for someone to repeat this experiment with another suitable compression tool.



(In reply to Jim Chen [:jchen] [:darchons] from comment #4)
> We don't compress some stuff on purpose like lib*.so, omni.ja, and
> classes.dex. Maybe you can try again with those excluded.

Presumably that's to allow our own seeking magic?

Those files are the bulk of the APK (15MB for libxul, 6MB for omni.ja, 6MB for classes.dex), so perhaps this is an indication that we're leaving a lot of savings on the table there…
(Mike Hommey [:glandium])

Comment 9

So, I took it for a spin on last nightly's gecko-unsigned-unaligned.apk.

Original size: 43136507
Compressed with advzip --recompress -k -4 --iter 50 gecko-unsigned-unaligned.apk: 38434380 (and that took 6 minutes on my machine)
Compressed with advzip --recompress -k -3 gecko-unsigned-unaligned.apk (presumably equivalent to using 7z): 38501878.
Spending 6 minutes for 67498 bytes seems overkill.

Now, let's see how the difference is distributed:
./mach python
>>> from mozpack.mozjar import JarReader
>>> orig = dict(JarReader('gecko-unsigned-unaligned-orig.apk').entries)
>>> new = dict(JarReader('gecko-unsigned-unaligned.apk').entries)
>>> assert orig.keys() == new.keys()
>>> diff = { f: new[f]['compressed_size'] - orig[f]['compressed_size'] for f in orig }
>>> cumm = 0
>>> for f, n in sorted(diff.items(), key=lambda x: x[1], reverse=True):
...     if n != 0:
...         cumm += n
...         print f, n, cumm
...
(snip)
AndroidManifest.xml -420 -4915
assets/armeabi-v7a/libfreebl3.so -675 -5590
assets/armeabi-v7a/libnss3.so -3639 -9229
assets/armeabi-v7a/libnssckbi.so -5570 -14799
lib/armeabi-v7a/libmozglue.so -5632 -20431
resources.arsc -24199 -44630
assets/armeabi-v7a/libxul.so -424700 -469330
assets/omni.ja -442196 -911526
classes.dex -3635431 -4546957

So the bulk of the difference is libxul.so, omni.ja and classes.dex, classes.dex being the biggest.
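For anyone without a mozilla-central checkout, roughly the same per-entry comparison can be done with only the Python standard library. This is a sketch, not the mozpack code used above: it reads sizes through `zipfile` rather than `mozpack.mozjar.JarReader`, and the function name `compressed_size_deltas` is made up for illustration.

```python
import zipfile


def compressed_size_deltas(orig_path, new_path):
    """Return (name, delta, running_total) rows for entries whose compressed size changed."""
    with zipfile.ZipFile(orig_path) as orig_zip, zipfile.ZipFile(new_path) as new_zip:
        orig = {i.filename: i.compress_size for i in orig_zip.infolist()}
        new = {i.filename: i.compress_size for i in new_zip.infolist()}
    if orig.keys() != new.keys():
        raise ValueError("archives do not contain the same entries")

    deltas = ((name, new[name] - orig[name]) for name in orig)
    rows = []
    running = 0
    # Smallest savings first, biggest last, matching the order of the listing above.
    for name, delta in sorted(deltas, key=lambda item: item[1], reverse=True):
        if delta != 0:
            running += delta
            rows.append((name, delta, running))
    return rows


# Example use (paths are placeholders):
#   for row in compressed_size_deltas("orig.apk", "recompressed.apk"):
#       print(*row)
```

Both arguments can be file names or seekable file objects, since they are passed straight to `zipfile.ZipFile`.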

libxul.so can't be compressed in the APK: it's already compressed in chunks, using some tricks. But yes, it's kind of expected that it could be compressed further, given that it's compressed in chunks. It /might/ be possible to get some amount off libxul.so by using 7z's deflate instead of zlib's, but IIRC 7z's deflate doesn't take a dictionary, and the dictionary *does* help a little: without one, libxul.so would be 22665452 bytes instead of 22444612 in that apk. Note that increasing the chunk size to 32k gets the size down to 21908326, but the impact at runtime (in terms of RSS and startup time) would need to be evaluated.
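The trade-off between chunk size, preset dictionaries, and output size can be seen on toy data with plain zlib. This is only an illustration under stated assumptions: it is not the actual libxul.so packing code, the sample input and the dictionary are synthetic, and the chunk sizes other than 32k are arbitrary picks.

```python
import zlib


def chunked_deflate(data, chunk_size, zdict=None):
    """Compress each chunk as an independent zlib stream, optionally seeded with a dictionary."""
    chunks = []
    for offset in range(0, len(data), chunk_size):
        if zdict is None:
            comp = zlib.compressobj(level=9)
        else:
            comp = zlib.compressobj(level=9, zdict=zdict)
        chunks.append(comp.compress(data[offset:offset + chunk_size]) + comp.flush())
    return chunks


def inflate_chunk(chunk, zdict=None):
    """Decompress one chunk on its own; independent chunks are what allow seeking."""
    if zdict is None:
        decomp = zlib.decompressobj()
    else:
        decomp = zlib.decompressobj(zdict=zdict)
    return decomp.decompress(chunk) + decomp.flush()


# Synthetic, repetitive input standing in for real library data.
sample = b"".join(
    b"function handler_%d(event) { return dispatch(event, %d); }\n" % (i, i * 7)
    for i in range(5000)
)
# Hypothetical dictionary of byte strings expected to be common in the input.
dictionary = b"function handler_(event) { return dispatch(event, ); }\n"

whole = len(zlib.compress(sample, 9))
sizes = {n: sum(map(len, chunked_deflate(sample, n))) for n in (4096, 16384, 32768)}
with_dict = sum(map(len, chunked_deflate(sample, 16384, dictionary)))

print("single stream:", whole)
for n, size in sizes.items():
    print("chunks of %d bytes: %d" % (n, size))
print("chunks of 16384 bytes with dictionary:", with_dict)
```

Every chunk starts with an empty history window, so smaller chunks pay that cold-start cost more often. A preset dictionary gives each chunk some history to refer back to, at the price of the reader needing the same dictionary.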

omni.ja can't be compressed either: it's already a compressed zip. You might ask why we have a zip in a zip (the apk). That's because omni.ja has several thousand files, and Android likes to waste time touching them all or something, so the more files you have in the apk, the slower your app is to start. BTW, when that was done, there were way fewer files in the apk itself, and I'm worried that the now 1600+ files in the apk *are* impacting startup times. But maybe Android itself improved on that, who knows (though that wouldn't solve the problem on older Android).
That being said, recompressing omni.ja itself can be a win:
- original omni.ja: 10858746
- recompressed with -3: 10410919

Here, looking at the detail with the same python script shows the difference is more spread out:
(snip)
res/fonts/ClearSans-Thin.ttf -1877 -156776
hyphenation/hyph_pl.dic -1882 -158658
res/fonts/ClearSans-Regular.ttf -1883 -160541
res/fonts/ClearSans-Bold.ttf -1906 -162447
res/fonts/ClearSans-Light.ttf -1921 -164368
res/fonts/ClearSans-Medium.ttf -1976 -166344
res/fonts/ClearSans-Italic.ttf -1993 -168337
modules/addons/XPIProvider.jsm -2515 -170852
chrome/chrome/content/browser.js -2635 -173487
hyphenation/hyph_cy.dic -2907 -176394
hyphenation/hyph_en_US.dic -3434 -179828
hyphenation/hyph_uk.dic -3465 -183293
hyphenation/hyph_sh.dic -3602 -186895
chrome/shumway/content/shumway.gfx.js -3927 -190822
hyphenation/hyph_ru.dic -4293 -195115
hyphenation/hyph_nl.dic -4809 -199924
hyphenation/hyph_de-CH.dic -5307 -205231
hyphenation/hyph_de-1996.dic -5326 -210557
hyphenation/hyph_de-1901.dic -5501 -216058
hyphenation/hyph_af.dic -5786 -221844
components/interfaces.xpt -7709 -229553
hyphenation/hyph_nb.dic -10566 -240119
hyphenation/hyph_nn.dic -10567 -250686
chrome/shumway/content/shumway.player.js -14606 -265292
hyphenation/hyph_hu.dic -26995 -292287
res/fonts/CharisSILCompact-BI.ttf -37313 -329600
res/fonts/CharisSILCompact-B.ttf -37353 -366953
res/fonts/CharisSILCompact-I.ttf -39826 -406779
res/fonts/CharisSILCompact-R.ttf -41048 -447827

So the win here would be to use 7z's deflate in mozjar. That's something I've been tempted to do for a while.

As for classes.dex, AIUI, Android mmap()s the data there directly, so it can*not* be compressed.
(Mike Hommey [:glandium])

Comment 10

Created attachment 8621508 [details] [diff] [review]
PoC

For the interested, this kind of implements using 7z's deflate in mozjar, which is used by various things in the tree, including the packager that creates omni.ja. The subprocess handling is fragile and actually freezes when the input is too large or something, but it's enough to know that doing this brings the omni.ja size on my local fennec build from 9581684 to 9071771.
As this is rather slow (but less slow than using zopfli), it should be optional, which the patch doesn't do. I'm however not planning to work on this further in the near future.
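The comment above does not say why the subprocess handling freezes. One common cause of that symptom, assumed here purely for illustration, is writing all input to the child's stdin before reading its stdout, so that both sides block on full pipe buffers. The sketch below shows the usual way around it in Python, letting `subprocess.run` feed input and drain output together. The child process is a stand-in Python one-liner, not 7z, and `compress_via_subprocess` is a made-up name.

```python
import subprocess
import sys
import zlib

# Stand-in for an external compressor: reads stdin, writes a zlib stream to stdout.
CHILD = [
    sys.executable,
    "-c",
    "import sys, zlib; "
    "sys.stdout.buffer.write(zlib.compress(sys.stdin.buffer.read(), 9))",
]


def compress_via_subprocess(data, cmd=CHILD, timeout=60):
    """Pipe data through cmd without deadlocking on large inputs."""
    # run() uses communicate() internally, which writes stdin and reads
    # stdout/stderr concurrently instead of one after the other.
    proc = subprocess.run(
        cmd,
        input=data,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        timeout=timeout,
        check=True,
    )
    return proc.stdout


payload = bytes(range(256)) * 4096  # 1 MiB, larger than a typical pipe buffer
compressed = compress_via_subprocess(payload)
print(len(payload), "->", len(compressed))
```

The timeout is there so that a misbehaving child fails the build step instead of hanging it.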
Comment 11

(In reply to Mike Hommey [:glandium] from comment #9)

The only value I can add here: the files here are res/fonts and hyphenation.  We have tickets to not ship these in the APK at all, and the font tickets are likely to fall pretty soon.  That leaves two major wins: browser.js and shumway.player.js.  I presume glandium did not minify locally, so we'd need to see how this helps compress minified source.
(Mike Hommey [:glandium])

Comment 12

> I presume glandium did not minify locally, so we'd need to see how this helps compress minified source.

I didn't for the numbers in comment 10, but the ones you quote are with minification, since they are for a "simply" recompressed omni.ja from a nightly build.
tracking-fennec: ? → +

Updated

2 years ago
Duplicate of this bug: 1234009
Comment 14

FWIW, you can use advzip to add all the files you want extra compression on, and then use regular 'zip -0' to add files like omni.ja, classes.dex, etc. that you explicitly don't want compressed.
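The same split between stored and compressed entries can be sketched with the Python standard library. Note the assumptions: this uses zlib's deflate through `zipfile`, not advzip or Zopfli, so it shows the selection logic rather than the extra compression. The function name `repack` is hypothetical, and the patterns are the ones named earlier in this bug.

```python
import fnmatch
import posixpath
import zipfile

# Basename patterns for entries the thread says must stay uncompressed.
STORE_PATTERNS = ("lib*.so", "omni.ja", "classes.dex")


def repack(src, dst, store_patterns=STORE_PATTERNS, level=9):
    """Copy every entry from src to dst, storing matches and deflating the rest."""
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
        for old in zin.infolist():
            base = posixpath.basename(old.filename)
            keep_stored = any(fnmatch.fnmatchcase(base, p) for p in store_patterns)
            # Build a fresh header rather than reusing the one read from src.
            new = zipfile.ZipInfo(old.filename, date_time=old.date_time)
            new.external_attr = old.external_attr
            new.compress_type = zipfile.ZIP_STORED if keep_stored else zipfile.ZIP_DEFLATED
            # compresslevel needs Python 3.7+; it has no effect on stored entries.
            zout.writestr(new, zin.read(old.filename), compresslevel=level)


# Example use (paths are placeholders):
#   repack("gecko-unsigned-unaligned.apk", "gecko-repacked.apk")
```

Matching on the basename means `assets/armeabi-v7a/libxul.so` and `assets/omni.ja` are caught regardless of where they sit in the archive.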
(Kevin Brosnan [:kbrosnan])

Comment 15

On a current nightly without fonts, Zopfli saves ~74 KB, or 0.18%. Is this still something we want to do?
Comment 16

(In reply to Kevin Brosnan [:kbrosnan] from comment #15)
> On a current nightly without fonts zopfli saves ~74 kb or 0.18%. Is this
> still something we want to do?

I think not, given the complexity.  Thanks for investigating!
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → WONTFIX