Closed Bug 1291424 Opened 8 years ago Closed 8 years ago

Consider removing on-demand decompression linker

Categories

(Firefox for Android Graveyard :: General, defect)

50 Branch
defect
Not set
normal

Tracking

(firefox51 fixed, firefox52 fixed)

RESOLVED FIXED
Firefox 52
Tracking Status
firefox51 --- fixed
firefox52 --- fixed

People

(Reporter: snorp, Assigned: esawin)

References

Details

Attachments

(1 file, 3 obsolete files)

I wonder if this is worth doing anymore. Modern devices have a lot more storage, so allowing Android to unpack libxul.so is not a big deal. I'd like to see startup performance measurements with normal linkage.
Assignee: nobody → nchen
This was done for speed, not storage limitations originally. It would be interesting to confirm that we still get a start up speed win from it.
(In reply to Brad Lassey [:blassey] (use needinfo?) from comment #1)
> This was done for speed, not storage limitations originally. It would be
> interesting to confirm that we still get a start up speed win from it.

On-demand decompression was done for speed, but dlopen directly from the apk was done for storage limitations originally. I don't expect on-demand decompression to be faster than dlopening the libraries from decompressed-in-advance files.

That said, IIRC, we can't totally switch off the custom linker without modifying other stuff. IIRC, we're relying on how it resolves symbols to interpose/wrap some system symbols, which simply dlopen()ing with the system linker wouldn't do. That said, the custom linker can also load normal libraries, but it would need some minor modifications to actually do it (because IIRC, it delegates to the system linker in that case, except in the case where it decompressed to storage from the apk first).

Relatedly, the custom linker also doesn't support multi-process without wasting memory yet, so depending on the expected state of e10s for mobile, it would require either implementing some items from my TODO list, or disabling.
nchen: perhaps this replaces https://bugzilla.mozilla.org/show_bug.cgi?id=1291377?  I'll let you keep track.
Assignee: nchen → esawin
I wonder if this would allow us to revisit Bug 1173894 and get better compression of libxul.so?
On-demand decompression needs a seekable zlib, which means it's less compressed. So yes, turning it off and letting Android install the libs would decrease the APK size. Something that I've seen opera do a long time ago is compress their native code with iirc LZMA, which compresses even more, and decompress on first run instead of on install (although I guess there's an intent that could be abused to make it happen before startup), but that's food for followup.
Can we quickly have a decision on whether we want to go forward with this bug? There are at least two bugs I'm aware of that would be simplified if they didn't have to deal with on-demand decompression. Not that it's particularly hard, but if I can tell people they don't need to do some work now before they do it and it becomes useless because in x weeks it is decided that, in the end, we'll stop using on-demand decompression...
(In reply to Mike Hommey [:glandium] from comment #6)
> Can we quickly have a decision on whether we want to go forward with this
> bug? There are at least two bugs I'm aware of that would be simplified if
> they didn't have to deal with on-demand decompression. Not that it's
> particularly hard, but if I can tell people they don't need to do some work
> now before they do it and it becomes useless because in x weeks it is
> decided that, in the end, we'll stop using on-demand decompression...

IMO, if we can get better startup perf by not doing on-demand decompression AND lower the APK size, then I think the post-install disk usage increase is an acceptable trade-off.

The other reason I wanted to look into this is for GeckoView. Doing this on-demand insanity is one thing when it's our own app, but distributing GeckoView with this seems....unwise.

If those first two assertions are correct then I think this is worth doing.
(In reply to Mike Hommey [:glandium] from comment #2)
> That said, IIRC, we can't totally switch off the custom linker without
> modifying other stuff. IIRC, we're relying on how it resolves symbols to
> interpose/wrap some system symbols, which simply dlopen()ing with the system
> linker wouldn't do. That said, the custom linker can also load normal
> libraries, but it would need some minor modifications to actually do it
> (because IIRC, it delegates to the system linker in that case, except in the
> case where it decompressed to storage from the apk first).

Do you know which symbols? Is this the BionicGlue stuff?
That, and malloc.
We could go back to building with --Wl,--wrap, but that wouldn't attach libraries we dlopen but that aren't ours to jemalloc (e.g. flash).
So, one quick way we can do (or at least test) this is:
- from java, set MOZ_LINKER_CACHE to the /data/something/lib/ directory where Android extract stuff
- from java, set MOZ_LINKER_EXTRACT to 1
- make the build system put the libraries in lib/ANDROID_CPU_ARCH instead of assets/ANDROID_CPU_ARCH
- make APKOpen.cpp use lib/ANDROID_CPU_ARCH instead of assets/ANDROID_CPU_ARCH

... and I think that's all.
The patch based on comment 11 doesn't seem to affect startup performance, here are some results:

Patched:
ActivityManager: Displayed org.mozilla.fennec_esawin/org.mozilla.gecko.BrowserApp: +416ms (total +498ms)
ActivityManager: Displayed org.mozilla.fennec_esawin/org.mozilla.gecko.BrowserApp: +448ms (total +533ms)
ActivityManager: Displayed org.mozilla.fennec_esawin/org.mozilla.gecko.BrowserApp: +439ms (total +519ms)
ActivityManager: Displayed org.mozilla.fennec_esawin/org.mozilla.gecko.BrowserApp: +472ms (total +551ms)
ActivityManager: Displayed org.mozilla.fennec_esawin/org.mozilla.gecko.BrowserApp: +482ms (total +565ms)
ActivityManager: Displayed org.mozilla.fennec_esawin/org.mozilla.gecko.BrowserApp: +439ms (total +521ms)
ActivityManager: Displayed org.mozilla.fennec_esawin/org.mozilla.gecko.BrowserApp: +433ms (total +527ms)
ActivityManager: Displayed org.mozilla.fennec_esawin/org.mozilla.gecko.BrowserApp: +452ms (total +530ms)
ActivityManager: Displayed org.mozilla.fennec_esawin/org.mozilla.gecko.BrowserApp: +451ms (total +538ms)
ActivityManager: Displayed org.mozilla.fennec_esawin/org.mozilla.gecko.BrowserApp: +448ms (total +538ms)

Original:
ActivityManager: Displayed org.mozilla.fennec_esawin/org.mozilla.gecko.BrowserApp: +550ms (total +576ms)
ActivityManager: Displayed org.mozilla.fennec_esawin/org.mozilla.gecko.BrowserApp: +450ms (total +547ms)
ActivityManager: Displayed org.mozilla.fennec_esawin/org.mozilla.gecko.BrowserApp: +440ms (total +530ms)
ActivityManager: Displayed org.mozilla.fennec_esawin/org.mozilla.gecko.BrowserApp: +462ms (total +546ms)
ActivityManager: Displayed org.mozilla.fennec_esawin/org.mozilla.gecko.BrowserApp: +428ms (total +511ms)
ActivityManager: Displayed org.mozilla.fennec_esawin/org.mozilla.gecko.BrowserApp: +434ms (total +517ms)
ActivityManager: Displayed org.mozilla.fennec_esawin/org.mozilla.gecko.BrowserApp: +414ms (total +501ms)
ActivityManager: Displayed org.mozilla.fennec_esawin/org.mozilla.gecko.BrowserApp: +442ms (total +525ms)
ActivityManager: Displayed org.mozilla.fennec_esawin/org.mozilla.gecko.BrowserApp: +456ms (total +538ms)
ActivityManager: Displayed org.mozilla.fennec_esawin/org.mozilla.gecko.BrowserApp: +439ms (total +532ms)

However, I've left MOZ_LINKER_CACHE untouched, so in my case that would be /data/user/0/org.mozilla.fennec_esawin/cache since /lib already exists so we can't set the writable permission on it.

Based on the MappableExtractFile::Create logs it looks like we're doing the right thing, is there a better way to verify it?
The point of using /lib and not /cache is to pick the files that Android uncompressed at installation time. Without this, the linker will still be uncompressing on its own.

I also forgot a step: disable szip.
(In reply to Mike Hommey [:glandium] from comment #13)
> The point of using /lib and not /cache is to pick the files that Android
> uncompressed at installation time. Without this, the linker will still be
> uncompressing on its own.
> 
> I also forgot a step: disable szip.

Right, but since it will only decompress them once (I've left out the first result), this should not affect the startup performance.
Eugen, the ActivityManager time is probably not relevant here. You should time how long it takes to get gecko up and running. We print some of these things already, grep for 'zerdatime'.
App start to runGecko durations:
patched:   99 108 114 111 116 105 ... ~ avg. 110ms
original: 280 275 270 275 258 284 ... ~ avg. 270ms
GeckoLibLoad [1] reports ~20ms patched, ~150ms original.

[1] https://dxr.mozilla.org/mozilla-central/source/mozglue/android/APKOpen.cpp#237
(In reply to Eugen Sawin [:esawin] from comment #15)
> Right, but since it will only decompress them once (I've left out the first
> result), this should not affect the startup performance.

Of course it should, and you've demonstrated it does. Checking things like firstpaint etc. would be useful too.
(In reply to Mike Hommey [:glandium] from comment #10)
> We could go back to building with --Wl,--wrap, but that wouldn't attach
> libraries we dlopen but that aren't ours to jemalloc (e.g. flash).

Do we care about this? There could be issues if we delete something that was not allocated by us, but I don't know if that's a common problem.

What about the symbols we need to inject in order to load Flash? I seem to remember doing that was ~impossible with the system linker.

Maybe the easiest thing is to just continue using the custom linker, but don't do the on-demand decompression, as Eugen has experimented with.
(In reply to James Willcox (:snorp) (jwillcox@mozilla.com) from comment #20)
> (In reply to Mike Hommey [:glandium] from comment #10)
> > We could go back to building with --Wl,--wrap, but that wouldn't attach
> > libraries we dlopen but that aren't ours to jemalloc (e.g. flash).
> 
> Do we care about this? There could be issues if we delete something that was
> not allocated by us, but I don't know if that's a common problem.
> 
> What about the symbols we need to inject in order to load Flash? I seem to
> remember doing that was ~impossible with the system linker.

Ah, true, I forgot about those. We can't load Flash with the system linker.

> Maybe the easiest thing is to just continue using the custom linker, but
> don't do the on-demand decompression, as Eugen has experimented with.

Indeed. Eugen, can you attach your patch?
This is the patch I've used for the startup tests.

It simply moves the libraries out of /assets into /lib, extracts on first run (~800ms) and reuses the cached libs from then on (~20ms).
Ok, here are startup messages for Nightly going to about:home. You can see from the timestamps on the left side that it took 4.1 seconds to get to browser chrome finished:

08-09 12:04:31.661  4026  4026 I GeckoApplication: zerdatime 247011323 - Fennec application start
08-09 12:04:32.368  4026  4043 W GeckoThread: zerdatime 247012030 - runGecko
08-09 12:04:33.082  4026  4026 I GeckoToolbarDisplayLayout: zerdatime 247012744 - Throbber stop
08-09 12:04:35.763  4026  4043 D GeckoBrowser: zerdatime 1470758675758 - browser chrome startup finished.

Here are the same messages for my local build with Eugen's patch:

08-09 12:06:20.873  4374  4374 I GeckoApplication: zerdatime 247120535 - Fennec application start
08-09 12:06:21.353  4374  4392 W GeckoThread: zerdatime 247121015 - runGecko
08-09 12:06:22.282  4374  4374 I GeckoToolbarDisplayLayout: zerdatime 247121944 - Throbber stop
08-09 12:06:24.324  4374  4392 D GeckoBrowser: zerdatime 1470758784315 - browser chrome startup finished.

That's 3.45s. I did a few different runs and both seemed to hover around that. So I think in the best case removing the on-demand decompression could *improve* startup, and in the worst case it'll be about the same. Couple that with the LZMA compression from bug 1291913 and I think we're onto something here.

One thing I was thinking about. If we have the APK extract the (LZMA-compressed) library to lib/, we could use the presence of that to validate the cached (uncompressed) libxul. The procedure on startup would be:

if exists(lib/libxul-compressed.so)
    decompress(lib/libxul-compressed.so, lib/libxul.so)
    unlink(lib/libxul-compressed.so)

load(libxul.so)

Glandium, Eugen, what do you guys think? For version 1.0 without LZMA we can just allow the libxul to be compressed normally in the APK and let the package installer stick it in the lib directory.
Flags: needinfo?(mh+mozilla)
Flags: needinfo?(esawin)
(In reply to James Willcox (:snorp) (jwillcox@mozilla.com) from comment #23)
> Glandium, Eugen, what do you guys think? For version 1.0 without LZMA we can
> just allow the libxul to be compressed normally in the APK and let the
> package installer stick it in the lib directory.

Which it should be doing with the patch, considering the file is in lib/. But since the patch also sets MOZ_LINKER_EXTRACT without changing the cache directory, what currently happens is that every startup is extracting the file to some other place. Which means a) it's slower than it should be b) it uses double the storage it should.

It's interesting that's still faster than on-demand decompression. Device performance characteristics have very much changed in the past 3-4 years.
Flags: needinfo?(mh+mozilla)
As discussed on IRC, we don't have write access to /data/app/<fennec>/lib and /data/data/<fennec>/lib (which is a symlink to the former's /arm sub-directory). If we want to allow for custom compression schemes, we'll need to cache the decompressed libs in, e.g., /data/data/<fennec>/cache, as we already do with my WIP patch.

We might want to change the version checking mechanics away from timestamps to checksums, given that system timestamps may not always be reliable.

@glandium: the cache directory is "static", so we only extract the libs when the apk timestamp is greater than the lib timestamp.
Flags: needinfo?(esawin)
To reduce apk size and improve startup performance, we could do the following:
1. Extract and cache the (szip-compressed) libs from /assets (this patch).
2. Switch szip compression/decompression of /assets with lzma.
3. Change version verification from timestamps to checksums.

There is no point in moving the libs from /assets to /lib if we can't reuse (write to) /lib, the same goes for disabling szip in this case. We'll have to make a copy of the libs if we want a custom compression scheme.
Attachment #8779828 - Flags: review?(snorp)
Attachment #8779828 - Flags: review?(mh+mozilla)
Comment on attachment 8779828 [details] [diff] [review]
0001-Bug-1291424-1.2-Extract-and-cache-libs-on-first-run..patch

Review of attachment 8779828 [details] [diff] [review]:
-----------------------------------------------------------------

I fear putting this into Nightly is a little risky because of the timestamp-based invalidation. I think we should go ahead and do the crc32 stuff now.
Attachment #8779828 - Flags: review?(snorp) → review-
Comment on attachment 8779828 [details] [diff] [review]
0001-Bug-1291424-1.2-Extract-and-cache-libs-on-first-run..patch

Review of attachment 8779828 [details] [diff] [review]:
-----------------------------------------------------------------

::: old-configure.in
@@ +185,4 @@
>      case "$target" in
>      *-android*|*-linuxandroid*)
>          ZLIB_DIR=yes
> +        MOZ_LINKER_EXTRACT=1

This removes the "cached" libraries on every shutdown. Which means you'd be decompressing on every startup. It's also desirable to disable szip, because it's more involved to decompress szip than pure deflate (there's a shared dictionary, and a filter).
Attachment #8779828 - Flags: review?(mh+mozilla) → review-
(In reply to Mike Hommey [:glandium] from comment #29)
> Comment on attachment 8779828 [details] [diff] [review]
> 0001-Bug-1291424-1.2-Extract-and-cache-libs-on-first-run..patch
> 
> Review of attachment 8779828 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: old-configure.in
> @@ +185,4 @@
> >      case "$target" in
> >      *-android*|*-linuxandroid*)
> >          ZLIB_DIR=yes
> > +        MOZ_LINKER_EXTRACT=1
> 
> This removes the "cached" libraries on every shutdown. Which means you'd be
> decompressing on every startup.

Yeah, I see there is code for this in MappableExtractFile via AutoUnlinkFile. In practice, though, this will never be executed because we rarely "shutdown". We just get SIGKILLed from Android. Still, should probably fix this.

>  It's also desirable to disable szip, because
> it's more involved to decompress szip than pure deflate (there's a shared
> dictionary, and a filter).

Sure, but we're going to do lzma immediately after anyway, so....
Maybe we want to use a MOZ_LINKER_EXTRACT_PERSISTENT, and then create a new MappableExtractFilePersistent to use in that case which just omits the unlink? Or maybe just another value for MOZ_EXTRACT_FILE (2, perhaps)?
Depends on: 1294731
Blocks: 1294736
Blocks: 1298090
Comment on attachment 8779828 [details] [diff] [review]
0001-Bug-1291424-1.2-Extract-and-cache-libs-on-first-run..patch

I think bug 1294731 has addressed all the issues we had with this patch, can we revisit it?
Attachment #8779828 - Flags: review?(snorp)
Attachment #8779828 - Flags: review?(mh+mozilla)
Attachment #8779828 - Flags: review-
Attachment #8779323 - Attachment is obsolete: true
Attachment #8779828 - Flags: review?(snorp) → review+
Comment on attachment 8779828 [details] [diff] [review]
0001-Bug-1291424-1.2-Extract-and-cache-libs-on-first-run..patch

Review of attachment 8779828 [details] [diff] [review]:
-----------------------------------------------------------------

::: old-configure.in
@@ +185,4 @@
>      case "$target" in
>      *-android*|*-linuxandroid*)
>          ZLIB_DIR=yes
> +        MOZ_LINKER_EXTRACT=1

Please disable szip.
Attachment #8779828 - Flags: review?(mh+mozilla) → review-
Disabling szip increases the APK size by 180kB but reduces initial extraction time from ~800ms to ~550ms.

I don't think either trade-off is relevant given that we have bug 1298090 to move the extraction away from startup and bug 1294736 to replace szip with xz.
Flags: needinfo?(snorp)
There are /other/ bugs waiting for szip going away.
Disable szip.
Attachment #8779828 - Attachment is obsolete: true
Flags: needinfo?(snorp)
Attachment #8785298 - Flags: review?(snorp)
Attachment #8785298 - Flags: review?(mh+mozilla)
Attachment #8785298 - Flags: review?(snorp) → review+
Comment on attachment 8785298 [details] [diff] [review]
0001-Bug-1291424-1.3-Extract-and-cache-libs-on-first-run..patch

Review of attachment 8785298 [details] [diff] [review]:
-----------------------------------------------------------------

::: old-configure.in
@@ +185,4 @@
>      case "$target" in
>      *-android*|*-linuxandroid*)
>          ZLIB_DIR=yes
> +        MOZ_LINKER_EXTRACT=1

Please set this in the following block:
https://dxr.mozilla.org/mozilla-central/rev/26e22af660e543ebb69930f082188b69ec756185/old-configure.in#1306-1308
Attachment #8785298 - Flags: review?(mh+mozilla) → review+
Surprisingly, we get libc crashes with this patch on a Java thread: https://treeherder.mozilla.org/#/jobs?repo=try&revision=72a1e2235b95

This isn't touching any of the new code, disabling the new checksum validation doesn't help.
Attachment #8785298 - Attachment is obsolete: true
Attachment #8787001 - Flags: review+
We have some race condition at startup: https://treeherder.mozilla.org/#/jobs?repo=try&revision=3713ab002e88

Simply delaying GetMappableFromPath causes the libc crashes.
Any ideas, glandium? The crash stacks aren't very insightful.
Flags: needinfo?(mh+mozilla)
In one of the crashes:
 17:47:42     INFO -  Crash reason:  SIGSEGV
 17:47:42 INFO - Crash address: 0x57157ec6 

Looking at the minidump, there's nothing mapped at that address.

Now, in the logcat:
 09-02 10:47:24.741   750   759 I GeckoLinker: Unmapped /data/app/org.mozilla.fennec-1.apk @0x55200000

Coincidentally, the minidump tells there's a mapping for the apk:
 579ac000-5a158000 r--p 00000000 1f:01 269        /data/app/org.mozilla.fennec-1.apk

The size of that mapping is 0x27ac000, and 0x55200000 + 0x27ac000 is ... 0x579ac000.

So it's probably the crash occurs trying to read from that mapping.
Flags: needinfo?(mh+mozilla)
Depends on: 1301665
Depends on: 1302189
It looks good now, with all the linker issues fixed: https://treeherder.mozilla.org/#/jobs?repo=try&revision=8d0d2e50eff7
Pushed by esawin@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/2026d893a981
[1.4] Extract and cache libs on first run. r=glandium,snorp
https://hg.mozilla.org/mozilla-central/rev/2026d893a981
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 52
lots of perf improvements from autophone:
== Change summary for alert #3436 (as of September 27 2016 05:02 UTC) ==

Improvements:

 18%  remote-twitter summary android-6-0-armv8-api15 opt     1248.55 -> 1020.75
 17%  remote-nytimes summary android-4-2-armv7-api15 opt     3858.79 -> 3203.36
 17%  local-twitter summary android-4-2-armv7-api15 opt      3003.02 -> 2496.75
 17%  remote-blank summary android-4-4-armv7-api15 opt       1537.01 -> 1278.19
 16%  local-nytimes summary android-5-1-armv7-api15 opt      2520 -> 2104.73
 16%  remote-nytimes summary android-5-1-armv7-api15 opt     2659.07 -> 2222.49
 16%  local-twitter summary android-6-0-armv8-api15 opt      1200.19 -> 1005.16
 16%  local-nytimes summary android-4-2-armv7-api15 opt      3702.24 -> 3102.28
 16%  remote-twitter summary android-4-2-armv7-api15 opt     3103.33 -> 2602.77
 16%  local-blank summary android-4-4-armv7-api15 opt        1501.83 -> 1262.65
 14%  remote-twitter summary android-4-4-armv7-api15 opt     1789.71 -> 1536.32
 13%  local-twitter summary android-4-4-armv7-api15 opt      1740.98 -> 1506.83
 12%  local-nytimes summary android-4-4-armv7-api15 opt      2103.71 -> 1848.92
 11%  remote-nytimes summary android-4-4-armv7-api15 opt     2216.64 -> 1973.46

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=3436
We should consider uplifting this too, IMHO. Opinions?
Flags: needinfo?(mh+mozilla)
Flags: needinfo?(esawin)
It does provide a nice startup performance boost, but it's not a crash fix and may expose more startup race conditions currently unknown.

Given that our user base on Aurora is rapidly growing, how about uplifting it just there?
Flags: needinfo?(esawin)
Likewise.
Flags: needinfo?(mh+mozilla)
Comment on attachment 8787001 [details] [diff] [review]
0001-Bug-1291424-1.4-Extract-and-cache-libs-on-first-run..patch

Approval Request Comment
[Feature/regressing bug #]: N/A
[User impact if declined]: This patch reduces startup time considerably
[Describe test coverage new/current, TreeHerder]: Nightly
[Risks and why]: Low, affects library loading during startup only, all known issues have been fixed in 50+
[String/UUID change made/needed]: None
Attachment #8787001 - Flags: approval-mozilla-aurora?
Comment on attachment 8787001 [details] [diff] [review]
0001-Bug-1291424-1.4-Extract-and-cache-libs-on-first-run..patch

Enhance startup performance. I would like to take it in 51 aurora as early as possible.
Attachment #8787001 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Depends on: 1312545
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: