Consider increasing JavaScript source compression to reduce memory usage

Status: NEW
(Reporter: gsvelto, Unassigned)

(Whiteboard: [MemShrink:P2])

Attachments

(1 attachment)

(Reporter)

Description

4 years ago
Every function's JS source is kept in memory since it might be needed when invoking the toString() method on the function object. The JavaScript JIT compiler uses zlib to compress those JS sources, and it uses a helper thread for the task, so - provided that enough cores are available - it shouldn't impact application startup speed.

However, all our early FxOS devices had only one core, and code compression was getting in the way of application startup, especially for apps with a large source footprint like e-mail. Thus in bug 837715 we switched the compressor to the fastest compression mode available (i.e. the one yielding the largest output). This was a straightforward size/speed trade-off.

Since all of our next-gen devices are at least dual-core, it might be time to reverse that decision. Low-end ones will still ship with relatively little RAM, so trading a little extra CPU time for some memory savings might be a good idea, especially if it doesn't impact startup time.

We should do two things:

- Test with various compression levels to see how much they impact overall memory consumption.

- Measure app startup time for each compression level we test to ensure we're not regressing it (email, calendar and settings are all good candidates since they have the largest code footprints).
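As an illustration of the first step, here is a minimal sketch of the size/speed trade-off being discussed (Python's zlib standing in for the engine's use of the library; the input is synthetic, so real numbers depend on the actual app sources):

```python
import time
import zlib

def measure(source: bytes, level: int):
    """Compress `source` at the given zlib level and report
    (compressed size in bytes, compression time in seconds)."""
    start = time.perf_counter()
    compressed = zlib.compress(source, level)
    return len(compressed), time.perf_counter() - start

# Synthetic stand-in for a large, repetitive JS source file.
source = b"function f(x) { return x * 2; } // filler code\n" * 20000

for level in (1, 6, 9):  # Z_BEST_SPEED, zlib default, Z_BEST_COMPRESSION
    size, secs = measure(source, level)
    print("level %d: %d bytes, %.1f ms" % (level, size, secs * 1000))
```

The knob being tweaked in the engine is exactly this zlib compression level: level 1 minimizes CPU time, level 9 minimizes output size.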
(Reporter)

Comment 1

4 years ago
Created attachment 8539249 [details] [diff] [review]
[PATCH] Use the maximum compression mode when compressing JS code

Here's a quick patch that switches the compressor to the best compression mode. I've put it together to see how much memory we can save just by tweaking this parameter.
(Reporter)

Comment 2

4 years ago
I did a quick comparison between the current mode (best speed) and the best compression mode to ascertain the impact of this parameter on memory consumption. This is what I found by launching a set of test apps and looking at their USS value:

App name               |    speed    | compression |  difference
-----------------------+-------------+-------------+-------------
(Nuwa process)         |     2.9 MiB |     2.8 MiB |    -0.1 MiB
(Preallocated process) |     3.2 MiB |     3.1 MiB |    -0.1 MiB
keyboard               |     7.2 MiB |     6.7 MiB |    -0.5 MiB
homescreen             |    15.2 MiB |    15.1 MiB |    -0.1 MiB
email                  |    12.0 MiB |    11.5 MiB |    -0.5 MiB
settings               |    13.2 MiB |    13.0 MiB |    -0.2 MiB
calendar               |    11.3 MiB |    11.2 MiB |    -0.1 MiB
communications         |    11.6 MiB |    11.4 MiB |    -0.2 MiB

My tests had some variation, so these are rough averages of a few runs. More investigation is needed, but from these early results this seems to be a road worth pursuing if we want to put those extra cores to good use and save some memory.
Comment 3

4 years ago

Did you verify that we are indeed keeping the source? The last time I heard discussion about the sources was when we were discussing removing them completely and parsing everything ahead of time.

I think the ultimate goal would be to avoid having to decompress the full sources. Luke mentioned multiple times that we should compress per-chunk; even if this leads to a smaller compression ratio, it would at least avoid a memory peak when we have to de-lazify a function, and chunk compression might even be easier to parallelize across multiple cores.
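The per-chunk idea mentioned above could be sketched as follows (a toy illustration, not SpiderMonkey's actual scheme; the chunk size is hypothetical): each chunk is compressed independently, so de-lazifying one function only inflates the chunks covering its source range.

```python
import zlib

CHUNK_SIZE = 64 * 1024  # hypothetical chunk size

def compress_chunked(source: bytes, chunk_size: int = CHUNK_SIZE):
    """Compress `source` as a list of independently compressed,
    fixed-size chunks."""
    return [zlib.compress(source[i:i + chunk_size], 9)
            for i in range(0, len(source), chunk_size)]

def read_range(chunks, begin: int, end: int, chunk_size: int = CHUNK_SIZE):
    """Recover source[begin:end] by inflating only the chunks that
    overlap the requested range."""
    first, last = begin // chunk_size, (end - 1) // chunk_size
    data = b"".join(zlib.decompress(chunks[i]) for i in range(first, last + 1))
    offset = first * chunk_size
    return data[begin - offset:end - offset]
```

The ratio is somewhat worse because zlib's dictionary resets at every chunk boundary, but the peak allocation when de-lazifying a single function drops from the whole script to at most a couple of chunks.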
(Reporter)

Comment 4

4 years ago
(In reply to Nicolas B. Pierron [:nbp] from comment #3)
> Did you verify that we are indeed keeping the source?  Last time I heard
> discussion about the sources, was when we were discussing about removing the
> sources completely and parsing everything ahead.

No, I haven't; in fact I haven't double-checked whether we're really using a separate thread either. That's knowledge from FxOS 1.0, off the top of my head :-)

> I think the ultimate goal would be to avoid having to decompress the full
> sources.  Luke mentioned multiple times that we should compress per-chunk,
> even if this leads to a smaller compression ratio, at least this would avoid
> memory-peak when we have to de-lazify a function, and chunk compression
> might be even easier to parallelize to multiple core*s*.

This is interesting. My thinking here was about reducing the overall memory footprint, so that would go in the opposite direction. Peak consumption is also important, though I don't think we ever use the JS sources in FxOS/gaia. I seem to remember that we were dropping them entirely on FxOS v1.3t, where we were really tight on memory.
Comment 5

4 years ago

Don't we use them at least when we remote-debug using WebIDE?

Although I'd agree that keeping them on production phones could be useless. We should have a pref for this.
Comment 6

4 years ago

(In reply to Gabriele Svelto [:gsvelto] from comment #0)
> The JavaScript JIT
> compiler uses zlib to compress those JS sources and it uses a helper thread
> for the task so - provided that enough cores are available - it shouldn't
> impact application startup speed.

Not exactly: the end of JS script compilation/evaluation currently blocks on the completion of the async script compression job.  Thus, if the compression helper thread takes longer than it takes to parse/evaluate the script, you increase total script execution time.  In the FxOS case you mentioned:

> However all our early FxOS devices had only one core, and code compression
> was getting in the way of application startup; especially for apps with a
> large source footprint like e-mail.

we hit this all the time since the devices only had one core.

One idea to avoid this hazard is to avoid blocking on compression at the end of parsing/evaluation.  The only reason (I know of) we currently block is because compression operates on the char array passed to the parse/evaluate APIs which is only guaranteed to be valid for the duration of the call.  However, bug 987556 added the ability for the browser to hand the JS engine ownership of the chars which avoids this limitation: we can keep the uncompressed chars alive until compression completes.  If we did this, we could compress even on single-core devices w/o hurting load time.  We might even want to consider delaying the compression until after some quiescent period to avoid load-time resource contention.
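A toy model of this ownership-transfer idea (Python threads standing in for the engine's helper thread; the names are illustrative, not SpiderMonkey's API): the source object keeps the uncompressed chars alive while compression runs in the background, so nothing has to block at the end of parsing.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

_pool = ThreadPoolExecutor(max_workers=1)  # the "helper thread"

class ScriptSource:
    def __init__(self, chars: bytes):
        # The engine owns the chars, so they can outlive the parse call.
        self._chars = chars
        self._compressed = None
        self._task = _pool.submit(zlib.compress, chars, 9)

    def finish_compression(self):
        """Run whenever convenient (e.g. after a quiescent period);
        the end of parsing/evaluation never waits on this."""
        self._compressed = self._task.result()
        self._chars = None  # raw chars can finally be dropped

    def chars(self) -> bytes:
        # Before compression finishes, serve the live chars for free;
        # afterwards, inflate on demand.
        if self._chars is not None:
            return self._chars
        return zlib.decompress(self._compressed)
```

The memory cost is that both the raw and (eventually) compressed copies coexist until `finish_compression` runs, which is exactly the trade made to avoid blocking single-core load.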

The other question is what decompression performance looks like.  The important thing to consider here is that, with the current design, we have "lazy" functions which aren't fully parsed and thus we must parse when executed.  To parse these functions, we have to decompress their whole script (if it is compressed), and thus whole-script decompression time can cause main-thread janks (bug 938385 is an extreme example).

To get an idea of the impact of these two hazards, you can add logging statements to log the time spent in the functions SourceCompressionTask::complete (time spent blocking on the compression task), and ScriptSource::chars (time spent blocking on decompression of scripts).  The JS engine would greatly benefit from all these fixes and better source compression ratios, so I hope you pursue this.
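The suggested instrumentation amounts to wrapping the two functions with wall-clock logging; a sketch of the same idea follows (illustrative only - the real change would be C++ logging inside SourceCompressionTask::complete and ScriptSource::chars):

```python
import functools
import time
import zlib

def log_time(label):
    """Log wall-clock time spent in the wrapped function."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                elapsed_ms = (time.perf_counter() - start) * 1000
                print("%s: %.2f ms" % (label, elapsed_ms))
        return inner
    return wrap

@log_time("ScriptSource::chars (decompress)")
def decompress(blob):
    # Stand-in for the main-thread decompression that can cause jank.
    return zlib.decompress(blob)
```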
(Reporter)

Comment 7

4 years ago
(In reply to Julien Wajsberg [:julienw] from comment #5)
> Don't we use them at least when we remote debug using WebIDE?

Indeed, when I mentioned we don't need the sources I intended that the applications themselves are not using a function toString() method for their own functionality.
(Reporter)

Comment 8

4 years ago
(In reply to Luke Wagner [:luke] from comment #6)
> Not exactly: the end of JS script compilation/evaluation currently blocks on
> the completion of the async script compression job.  Thus, if the
> compression helper thread takes longer than it takes to parse/evaluate the
> script, you increase total script execution time.

Thanks for clearing that up; with my patch applied I had the feeling that startup was slower, but I haven't had time to measure it yet.

> One idea to avoid this hazard is to avoid blocking on compression at the end
> of parsing/evaluation.  The only reason (I know of) we currently block is
> because compression operates on the char array passed to the parse/evaluate
> APIs which is only guaranteed to be valid for the duration of the call. 
> However, bug 987556 added the ability for the browser to hand the JS engine
> ownership of the chars which avoids this limitation: we can keep the
> uncompressed chars alive until compression completes.  If we did this, we
> could compress even on single-core devices w/o hurting load time.  We might
> even want to consider delaying the compression until after some quiescent
> period to avoid load-time resource contention.

Just being able to avoid blocking on the compression would already give us a lot more freedom. As for scheduling the task, I think that's better left to the OS; if the thread is properly prioritized we shouldn't need to worry about it.

> The other question is what decompression performance looks like.  The
> important thing to consider here is that, with the current design, we have
> "lazy" functions which aren't fully parsed and thus we must parse when
> executed.  To parse these functions, we have to decompress their whole
> script (if it is compressed), and thus whole-script decompression time can
> cause main-thread janks (bug 938385 is an extreme example).

This is a very important piece of information I was missing. I should definitely measure this, besides getting more accurate data about the speed/size trade-off. My test in comment 2 is way too simple.

> To get an idea of the impact of these two hazards, you can add logging
> statements to log the time spent in the functions
> SourceCompressionTask::complete (time spent blocking on the compression
> task), and ScriptSource::chars (time spent blocking on decompression of
> scripts).  The JS engine would greatly benefit from all these fixes and
> better source compression ratios, so I hope you pursue this.

Thanks for the tip.
(Reporter)

Updated

4 years ago
Blocks: 1115631
Whiteboard: [MemShrink] → [MemShrink:P2]

Comment 9

3 years ago
Currently on FF Nightly on desktop, I am seeing the script sources being stored uncompressed(?), and the size in memory is almost exactly 2x the size on disk, which suggests that the script source gets expanded from UTF-8 to UTF-16(?): a 24 MB .js file on disk takes 48 MB in memory. Emscripten/asm.js modules can have very large compiled codebases, so this accounts for a large portion of consumed memory, in particular when attempting to run native ported applications as packaged apps on Firefox OS.

Would it be possible to avoid storing the sources in memory if the source was loaded from file:// or if the source is a Firefox OS packaged app? Instead, the function could store a file reference and a begin/end range from which the contents could be synchronously loaded in the rare case they're needed.

In general, would it be possible to retain the sources as UTF-8/ASCII/original bytes to save half of the space and avoid the 2x blowup?
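The 2x factor observed above matches storing each code unit as two bytes; for ASCII-only source (typical of JS) the blowup is exactly 2, as a quick check shows (illustrative; UTF-16-LE standing in for the engine's two-byte storage):

```python
source_text = "function add(a, b) { return a + b; }\n" * 1000

utf8 = source_text.encode("utf-8")
utf16 = source_text.encode("utf-16-le")  # two-byte code units

print(len(utf16) / len(utf8))  # 2.0 for pure-ASCII source
```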
Comment 10

3 years ago

We do have a mechanism for not saving the source (it ends up making func.toSource() fail and disabling lazy parsing):
  http://mxr.mozilla.org/mozilla-central/source/js/src/jsapi.h#2266
I know we do it for "system" apps on FxOS, but I don't know if it's exposed to normal user apps as a manifest option. Do you know, Bobby?

Also, you're right that, past a certain size threshold, we don't compress sources.  The reason is that, with lazy parsing (and also relazification, which discards bytecode), we end up needing the source all the time (to parse), so we end up decompressing the huge blob multiple times, costing a few hundred ms each time.  A solution would be to chunk compressed sources (past a certain threshold) or use some other scheme that lets us decompress "just a little bit".
Flags: needinfo?(bobbyholley)
Comment 11

3 years ago

From bug 1197231 comment 30 we do have uncompressed sources in a Firefox OS certified app (the Messages app). Is that expected?
Flags: needinfo?(bobbyholley) → needinfo?(fabrice)
Comment 12

3 years ago

(In reply to Luke Wagner [:luke] from comment #10)
> We do have a mechanism for not saving the source (it ends up making
> func.toSource() fail and disabling lazy parsing):
>   http://mxr.mozilla.org/mozilla-central/source/js/src/jsapi.h#2266
> I know we do it for "system" apps on FFOS, but I don't know if it's exposed
> to normal user apps as a manifest option.  Do you know Bobby?

It's not exposed as a manifest flag. We discard source for all certified (gaia) and privileged (served from the marketplace) apps: http://mxr.mozilla.org/mozilla-central/source/js/xpconnect/src/nsXPConnect.cpp#406

(In reply to Julien Wajsberg [:julienw] from comment #11)
> From bug 1197231 comment 30 we do have uncompressed sources in a Firefox OS
> certified app (the Messages app). Is it expected ?

I don't know... Luke?
Flags: needinfo?(fabrice) → needinfo?(luke)
Comment 13

3 years ago

If I'm reading bug 1197231 comment 30 correctly, "uncompressed-source-cache" means the memory spent on compressed sources that were temporarily decompressed (for the lazy compilation mentioned above) and will be freed at the next GC.  I don't know why it has sources at all, though; with discardSource() set it seems like it shouldn't, but it's hard to follow this code to see all the cases.

Could we perhaps add a manifest option?  Technically it breaks JS semantics...
Flags: needinfo?(luke)

Comment 14

3 years ago
A manifest option would help at least for Firefox OS packaged apps; however, I'd love to see a solution that could be used even on desktop pages, since we see this being a major issue on desktop as well. If I do a development build (non-minified, for debugging purposes) of a medium-sized Unity3D game, I get a 364 MB .js file, and about:memory for that page says

769.27 MB (30.15%) -- script-sources

which is quite close to a 2x blowup as well (the UTF8->UTF16?). If I tap the Free memory (GC/CC/Minimize memory usage) buttons, the allocation persists, so it is not a temporary allocation. This is the single biggest memory allocation on the whole page.

I wonder if it would make sense to spec an attribute like <script src='page.js' dontpersistscriptsourcesinmemory></script> or something similar (with a better name)?
Comment 15

3 years ago

(In reply to Jukka Jylänki from comment #14)
> A manifest option would be something at least for Firefox OS packaged apps,
> however, I'd love to see a solution that could be used even on desktop
> pages, since we do see this being a major issue on desktop as well. If I do
> a development build (nonminified for debugging purposes) of a medium-sized
> Unity3D game, I get a 364 MB .js file, and about:memory of that page says
> 
> 769.27 MB (30.15%) -- script-sources
> 
> which is quite close to a 2x blowup as well (the UTF8->UTF16?). If I tap the
> Free memory (GC/CC/Minimize memory usage) buttons, the allocation persists,
> so it is not a temporary allocation. This is the single biggest memory
> allocation on the whole page.
> 
> I wonder if it would make sense to spec to have an attribute like <script
> src='page.js' dontpersistscriptsourcesinmemory></script> or something
> similar (with a good name)?

We had a similar request in bug 1134933 where we didn't want to keep the source around for generated code (or really any of the code).
Comment 16

3 years ago

Jukka: correct, the in-memory source is stored as two-byte code units.  Also yes, we need to compress large sources; I filed bug 1219098 for that since it is different from what this bug is about.
Comment 17

3 years ago

(In reply to Julien Wajsberg [:julienw] from comment #11)
> From bug 1197231 comment 30 we do have uncompressed sources in a Firefox OS
> certified app (the Messages app). Is it expected ?

Hmm, interesting. Is the "javascript.options.discardSystemSource" pref responsible for that? I see that it's enabled by default on b2g PVT builds (both user and eng), but in bug 1197231 I was running Raptor and then taking memory reports on the same device. I see that "make raptor" (which is run before measurements) is essentially "RAPTOR=1 PERF_LOGGING=1 DEVICE_DEBUG=1 GAIA_OPTIMIZE=1 NOFTU=1 SCREEN_TIMEOUT=0 make reset-gaia", which disables this pref ([1] → [2]).

Hey Eli, is this what we want? I suspect it may skew both visuallyLoaded and memory results.

[1] https://github.com/mozilla-b2g/gaia/blob/1fe6efac59451204b9132b4c460f7b96dafd64da/build/preferences.js#L174

[2] https://github.com/mozilla-b2g/gaia/blob/1fe6efac59451204b9132b4c460f7b96dafd64da/build/preferences.js#L265
Flags: needinfo?(eperelman)

Comment 18

3 years ago
@azasypkin: in Raptor we typically used DEVICE_DEBUG, which seemed to counteract some issues we were having with no longer being able to connect to a device over adb after a reboot. I'm not sure the logic was sound, and I'm happy to revisit whether we need this flag.

Regarding the javascript.options.discardSystemSource preference, I am not familiar with it. If you believe it has a detrimental effect on our performance measurements and should be excluded, I'm open to the arguments.
Flags: needinfo?(eperelman)
Comment 19

3 years ago

(In reply to :Eli Perelman from comment #18)
> @azasypkin in Raptor we typically used DEVICE_DEBUG which seemed to
> counteract some issues we were having with no longer being able to connect
> to a device over adb after a reboot. Not sure if the logic was sound, and am
> happy to review changing our necessity of this flag.
> 
> In regards to the javascript.options.discardSystemSource preference, I am
> not familiar with it. If you believe it has a detrimental effect in our
> performance measurements and should probably be excluded, I'm open to the
> arguments.

Sure, I've filed bug 1219301, so let's discuss there.

Thanks!