Closed Bug 1219098 Opened 4 years ago Closed 4 years ago

re-enable compression on large sources

Categories: Core :: JavaScript Engine, defect

Status: RESOLVED FIXED
Target milestone: mozilla49
Tracking Status: firefox49 --- fixed

People: (Reporter: luke, Assigned: luke)

Attachments: (3 files)

Currently, large ScriptSources aren't compressed since decompression can take hundreds of ms and, due to lazy parsing and relazification, the decompression happens periodically.  However, this results in a ton of wasted memory (esp for large asm.js apps): in-memory source is often 2x bigger than the .js file (ascii -> jschar) and compression can yield 4x size reductions of ascii, so this is an ~8x waste.
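A quick back-of-the-envelope check of that ~8x figure (a sketch using Python's zlib; the synthetic input below is far more repetitive than real JS, so it compresses much better than the 4x cited -- real sources land nearer the ~8x overall waste):

```python
import zlib

# Synthetic ASCII "source": repetitive, JS-like text.
src = ("function f(i) { return i + 1; } // comment\n" * 20000).encode("ascii")

utf16 = src.decode("ascii").encode("utf-16-le")   # in-memory jschar form
packed = zlib.compress(src, 9)                    # compressed ASCII

assert len(utf16) == 2 * len(src)                 # ascii -> jschar doubles size
waste = len(utf16) / len(packed)                  # uncompressed-in-memory vs compressed
print(f"utf16/compressed ratio: {waste:.1f}x")    # well over 8x on this repetitive input
```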

One simplish mitigation would be to compress in reasonable-sized chunks (enough for compression to be effective) so the ScriptSource::chars can only decompress the chunks it needs.
Actually, scanning http://zlib.net/manual.html quickly, it looks like zlib supports saving intermediate states while encoding to allow random access while decoding...
Needinfo-ing Yury because he's doing some work that might be related, and could potentially help here.
Flags: needinfo?(ydelendik)
(In reply to Luke Wagner [:luke] from comment #1)
> Actually, scanning http://zlib.net/manual.html quickly, it looks like zlib
> supports saving intermediate states while encoding to allow random access
> while decoding...

That's correct: Z_FULL_FLUSH would do the trick, but you have to save an index mapping flush positions in the original source code to offsets in the compressed one (so you can pick the right range of compressed data to decode). DEFLATE uses previously decompressed data as a dictionary when decompressing the following bits -- that's arguably a more important part of the algorithm than the Huffman coding. Z_FULL_FLUSH means later blocks no longer rely on data compressed in previous blocks -- and that will cost some compression ratio.
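The flush-point index described here can be sketched with raw deflate (negative window bits, so there is no zlib header and decoding can start at any recorded flush offset). This is an illustrative sketch using Python's zlib bindings, not the SpiderMonkey implementation; the chunk size is an arbitrary choice:

```python
import zlib

CHUNK = 64 * 1024  # illustrative chunk size

def compress_chunked(data: bytes):
    """Compress with Z_FULL_FLUSH at chunk boundaries; return (blob, offsets).

    offsets[i] is the byte position in blob where chunk i starts; because
    Z_FULL_FLUSH resets the window, it is a valid point to begin inflating.
    """
    comp = zlib.compressobj(zlib.Z_BEST_SPEED, zlib.DEFLATED, -15)  # raw deflate
    blob, offsets = b"", []
    for start in range(0, len(data), CHUNK):
        offsets.append(len(blob))
        blob += comp.compress(data[start:start + CHUNK])
        blob += comp.flush(zlib.Z_FULL_FLUSH)
    blob += comp.flush(zlib.Z_FINISH)
    return blob, offsets

def decompress_chunk(blob: bytes, offsets, i: int) -> bytes:
    """Decompress only chunk i, starting mid-stream at its flush offset."""
    decomp = zlib.decompressobj(-15)                 # raw inflate
    return decomp.decompress(blob[offsets[i]:], CHUNK)

src = bytes(range(256)) * 2048                       # 512 KiB of sample "source"
blob, offsets = compress_chunked(src)
assert decompress_chunk(blob, offsets, 3) == src[3 * CHUNK:4 * CHUNK]
```

As the comment thread notes, each full flush cuts the back-reference window, so the compression ratio degrades as chunks get smaller.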

As an extension of the original proposal, for zlib I would recommend using a so-called "dictionary" containing e.g. JS keywords, common whitespace sequences, and frequently used DOM API identifiers. That should compensate for the inefficiency Z_FULL_FLUSH introduces. (But for that you need to start a separate deflate session per chunk -- though starting a new deflate session is presumably about as costly, memory-wise, as using Z_FULL_FLUSH.)
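The preset-dictionary idea corresponds to zlib's deflateSetDictionary/inflateSetDictionary, exposed in Python as the `zdict` argument. The mini-dictionary below is hypothetical, just to show the mechanics; a real one would be tuned on representative sources:

```python
import zlib

# Hypothetical preset dictionary: JS keywords and a common DOM identifier.
zdict = b"function var return if else for while document.getElementById "

sample = b"function check(x) { if (x) return document.getElementById('a'); }"

with_dict = zlib.compressobj(9, zlib.DEFLATED, 15, 8, zlib.Z_DEFAULT_STRATEGY, zdict)
plain = zlib.compressobj(9)

small = with_dict.compress(sample) + with_dict.flush()
big = plain.compress(sample) + plain.flush()
assert len(small) < len(big)  # dictionary pays off on short, keyword-heavy input

# Decompression must supply the same dictionary.
assert zlib.decompressobj(zdict=zdict).decompress(small) == sample
```

The win is largest exactly where full flushes hurt most: short chunks whose back-reference window starts empty.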


(In reply to Till Schneidereit [:till] from comment #3)
> Needinfo-ing Yury because he's doing some work that might be related, and
> could potentially help here.

If we want to be eager about compression, we can start compressing the source code as soon as it comes in from the channel, and keep the compressed source alongside the uncompressed one (preferably with the optimizations mentioned above -- in chunks).
Flags: needinfo?(ydelendik)
(In reply to Yury Delendik (:yury) from comment #4)
> If we want to be eager on compression, we can start compression of the
> source code as soon is it's coming from the channel and hand compressed
> source along with uncompressed (preferably with optimizations mentioned
> above -- in chunks).

We talked about something like this in the past. Ideally, we would store decoded and compressed data in the http cache so we wouldn't have to do any conversions at all. Perhaps adding a JS::Compile overload that takes data in that format would work?
(In reply to Yury Delendik (:yury) from comment #4)
> That's correct: Z_FULL_FLUSH would do the trick, but you have to save an
> index mapping flush positions in the original source code to offsets in the
> compressed one (so you can pick the right range of compressed data to decode).

That's the starting point, but the link above (http://lh3.github.io/2014/07/05/random-access-to-zlib-compressed-files/) points to an even better solution that seems to capitalize on blocking that already occurs within the stream.
This is popping up in quite a lot of places for Emscripten.

Even though we would not compress yet (since compression sounds a bit more tricky due to chunks etc.), could we store the input file in memory in original ASCII/UTF-8 form (and expanded to UTF-16) at least, whenever we don't compress, which would be an immediate help and save 50% of the memory here without any drawbacks(?). If that sounds doable, should we open a specific bug just for that change?
err, in the above comment, s/(and expanded to UTF-16)/(and not expanded to UTF-16).
A related discussion for Blink, with some additional links: https://groups.google.com/a/chromium.org/forum/#!topic/blink-dev/aEDcSvGo5Qs.
I have some big patches waiting for review, so I may have time to look into this in the meantime.

There's a lot of interesting data we can collect: compression ratios on different workloads, with different chunk sizes, etc.
Flags: needinfo?(jdemooij)
Clearing needinfo; I'm still interested in fixing this at some point, but I have some other things on my plate atm and keeping the needinfo is just distracting.
Flags: needinfo?(jdemooij)
Unreal Engine 4 exported content generates about 128MB of .js code (10MB compressed), which expands to UTF-16 in memory. Here is a test case

https://s3.amazonaws.com/mozilla-games/tmp/2016-04-23-PlatformerGame/PlatformerGame.html?cpuprofiler&playback

which gives

├────273.11 MB (18.37%) -- js-non-window
│    ├──261.76 MB (17.61%) -- runtime
│    │  ├──256.04 MB (17.22%) -- script-sources
│    │  │  ├──255.73 MB (17.20%) -- source(scripts=1, blob:https://s3.amazonaws.com/b5d508c2-2f8b-4bb5-9ac8-abe624eb7cbf)
│    │  │  │  ├──255.73 MB (17.20%) ── uncompressed
│    │  │  │  └────0.00 MB (00.00%) ── misc
│    │  │  └────0.31 MB (00.02%) ++ (6 tiny)
│    │  └────5.72 MB (00.38%) ++ (11 tiny)
│    └───11.36 MB (00.76%) ++ (2 tiny)

We have talked about this a lot and know that wasm will naturally nuke this away (can't wait for that!), but I wonder if there might be a band-aid we could apply to reduce this for current asm.js code? This might help with 32-bit OOMing due to address space fragmentation (bug 1266389, bug 1266393). Some developers might not migrate to wasm until all major browser vendors support it in stable browsers, so if we had something we could do here sooner, that would be super helpful. Recently a developer reported to us that about 8% of their game launches fail due to address space fragmentation OOMs, and they are running their game with a measly 112MB asm.js heap! (with 40MB of script-sources)

Could we detect if code is asm.js (or if code is too large), and if so, always store the script-sources compressed in memory? And/or even evict that compressed data to disk?

In comment 0, Luke mentions compression being disabled "due to lazy parsing and relazification, the decompression happens periodically". Does this occur even in asm.js code that is beyond our (==Emscripten's codegen) control? If I understood correctly, the lazy parsing is something that wouldn't happen for aot compilation? If this was to support someFunction.toString(), we don't mind in Emscripten if that would take up dozens of seconds, since in Emscripten apps that is never in normal runtime execution paths and we can discourage any developers from doing that.

In recent Emscripten code, we have a feature called --separate-asm, which separates out the asm.js module to its own file. If we wanted to do some kind of heuristic to choose when to compress/evict script-sources, we could do it based on if the .js file only contains an asm.js module and nothing else. Does that sound feasible?

There was the concern that since other browsers would not be doing this kind of optimization, it would not improve things overall; but to clarify, the grand goal here is to mitigate OOMing due to 32-bit address space fragmentation, i.e. to bring 32-bit browsers more up to the 64-bit level. We could wait for wasm, but I think it is probably still 6+ months away before anyone could consider shipping a product with it. Any thoughts?
Lazy parsing/relazification isn't for the asm.js code; it's for the other JS code in the same file.  We could attempt to observe that the asm.js function is the only function in the file, as you suggested, and re-enable compression in that case.  That's probably not too hard, but it means we need to get everyone building with this new flag, and it's one more pattern.  It seems better just to do the more general (and probably not too hard, though I don't have time to do this atm) solution of chunked source compression, since that helps all JS.
In general we'd be ok advocating that users rebuild their pages with the --separate-asm flag to take advantage of this, if it would simplify things on the implementation side here. (We did that kind of developer advocacy for adding <script async> as well; kind of the same situation here.) The chunked compression sounds fine as well, if it's not so complex to implement that it ends up becoming a bigger project(?)
Actually, it might be pretty easy to detect that a script contains only one function which is asm.js at the point where we make the compression decision, so I'll see if I can do that quickly.  Perhaps then you could have Emscripten issue a warning/suggestion when generating really big scripts when --separate-asm isn't specified.
That would sound perfect to me. Alon, what are your thoughts here?

Btw, any chance it might be possible to just reuse the .gz compressed data from the network, if the asm.js content came in as gzip-compressed in the first place? (with the "Content-Encoding: gzip" header). Those will likely be compressed with very aggressive compression factor already.
(In reply to Jukka Jylänki from comment #16)
> Btw, any chance it might be possible to just reuse the .gz compressed data
> from the network, if the asm.js content came in as gzip-compressed in the
> first place? (with the "Content-Encoding: gzip" header). Those will likely
> be compressed with very aggressive compression factor already.

The source code data arrives in the DOM incrementally, already decompressed and UTF-16 encoded (http://mxr.mozilla.org/mozilla-central/source/dom/base/nsScriptLoader.cpp#1731). I think the source can be re-compressed as soon as you detect it's an asm.js one.

You could certainly keep the compressed data from the network; however, once you decide you need the source again, after gzip/deflate decompression you will need to properly re-encode it to UTF-16 (the original can be in some non-UTF-16 encoding).
(In reply to Jukka Jylänki from comment #16)
> That would sound perfect to me. Alon, what are your thoughts here?

Sure, warning sounds good. We can even consider switching --separate-asm on by default.
Attached patch share-functionSplinter Review
Unrelated, but I happened to notice this simple optimization opportunity which shaves off ~50ms from compile time (~2%).
Assignee: nobody → luke
Status: NEW → ASSIGNED
Attachment #8745826 - Flags: review?(bbouvier)
Attached patch unique-ptrSplinter Review
While I'm in the area, UniquePtrs makes things much nicer.
Attachment #8745827 - Flags: review?(jdemooij)
Attached patch compress-hugeSplinter Review
Initially I tried to make the change in ScriptSource::setSourceCopy (as I said above), but that can't work b/c we haven't even begun parsing the chars yet.

I poked around with cancelling compression during parsing (like we do for the huge-string-literals hack), but that's complicated.

I realized a pretty simple scheme is to observe when we do an expensive decompression, in ScriptSource::chars, and then just revert the ScriptSource back to the uncompressed state.  This way we make a purely dynamic judgement which will be a little bit more forgiving.
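The dynamic judgement described above (compress eagerly, then permanently revert a source to uncompressed the first time a decompression proves expensive) can be sketched like this. The class shape is a toy model, not the actual ScriptSource code; only the HUGE_SCRIPT constant mirrors the patch:

```python
import zlib

HUGE_SCRIPT = 5 * 1024 * 1024  # threshold from the patch

class ScriptSource:
    """Toy model: stores compressed chars; reverts to uncompressed on a huge decompress."""

    def __init__(self, source: bytes):
        self._compressed = zlib.compress(source, zlib.Z_BEST_SPEED)
        self._uncompressed = None

    def chars(self) -> bytes:
        if self._uncompressed is not None:
            return self._uncompressed
        data = zlib.decompress(self._compressed)
        if len(data) > HUGE_SCRIPT:
            # Expensive decompression observed: stay uncompressed from now on,
            # trading memory back for time (the patch frees the compressed copy).
            self._uncompressed = data
            self._compressed = None
        return data

huge = ScriptSource(b"x" * (6 * 1024 * 1024))
huge.chars()
assert huge._compressed is None        # huge source reverted after first access
small = ScriptSource(b"y" * 1024)
small.chars()
assert small._compressed is not None   # small source stays compressed
```

The appeal of this scheme is that the cost is paid at most once per source, and sources that are never decompressed keep their memory savings.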

Jukka, perhaps you could try out this patch (applied on top of the UniqueChars one) to confirm?
Attachment #8745833 - Flags: review?(jdemooij)
Comment on attachment 8745826 [details] [diff] [review]
share-function

Review of attachment 8745826 [details] [diff] [review]:
-----------------------------------------------------------------

Ha, nice!

::: js/src/asmjs/AsmJS.cpp
@@ +6675,5 @@
>      ParseNode* fn = m.parser().handler.newFunctionDefinition();
>      if (!fn)
>          return false;
>  
> +    RootedFunction fun(m.cx(), m.dummyFunction());

Could dummyFunction() just return a RootedFunction&? This could avoid re-rooting the function (with the same context).
Attachment #8745826 - Flags: review?(bbouvier) → review+
See also bug 1211723, which landed and then got backed out and should re-land soon-ish.
See Also: → 1211723
Comment on attachment 8745833 [details] [diff] [review]
compress-huge

Review of attachment 8745833 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsscript.cpp
@@ +1969,5 @@
> +            // time decompressing.
> +            const size_t HUGE_SCRIPT = 5 * 1024 * 1024;
> +            if (lengthWithNull > HUGE_SCRIPT) {
> +                js_free(ss.compressedData());
> +                ss.data = SourceType(Uncompressed(decompressed.release(), true));

So once we decompress once, it stays decompressed? Nice balance, since presumably it rarely ever happens (and _never_ happens for asm.js?)
(In reply to Benjamin Bouvier [:bbouvier] from comment #22)
> Could dummyFunction() just return a RootedFunction&? This could avoid
> re-rooting the function (with the same context).

Yes, good idea.

(In reply to Nick Fitzgerald [:fitzgen] [⏰PDT; UTC-7] from comment #24)
> So once we decompress once, it stays decompressed? Nice balance, since
> presumably it rarely ever happens (and _never_ happens for asm.js?)

So this is only for huge scripts (which previously just never got compressed in the first place).  In the case of asm.js, while the asm.js module function never needs source/recompilation, the surrounding glue code often does.  So to take advantage of this for HUGE_SCRIPTs, the asm.js needs to be by itself in the file (which Emscripten already has a flag for and might enable by default).
Comment on attachment 8745827 [details] [diff] [review]
unique-ptr

Review of attachment 8745827 [details] [diff] [review]:
-----------------------------------------------------------------

Nice cleanup!
Attachment #8745827 - Flags: review?(jdemooij) → review+
Comment on attachment 8745833 [details] [diff] [review]
compress-huge

Review of attachment 8745833 [details] [diff] [review]:
-----------------------------------------------------------------

Makes sense.
Attachment #8745833 - Flags: review?(jdemooij) → review+
I tested on synthetic test cases to confirm but without a --separate-asm build on hand I wasn't able to test on a big workload.  Jukka, could you confirm?
Flags: needinfo?(jujjyl)
> So once we decompress once, it stays decompressed? Nice balance, since
> presumably it rarely ever happens (and _never_ happens for asm.js?)

Mmm, yes, nice compromise :)
Working on creating some Unity and UE4 test cases with --separate-asm. Here is a UE4 one:

https://dl.dropboxusercontent.com/u/40949268/emcc/2016-04-29-StrategyGame_separate_asm_shipping/StrategyGame-HTML5-Shipping.html

Interestingly, on that page the script-sources memory is not correctly reported. Marked that down as https://bugzilla.mozilla.org/show_bug.cgi?id=1268863.

Keeping the needinfo; I'll do a Unity test case as well.
Actually, I decided not to do a Unity sample due to running into problems with the latest Unity update.

Instead, here is another test case from UE4: https://s3.amazonaws.com/mozilla-games/tmp/2016-05-02-SunTemple/SunTemple-HTML5-Shipping.html?playback&cpuprofiler

Nightly from 4-28 has

601.49 MB (100.0%) -- explicit
├──390.32 MB (64.89%) -- window-objects/top(https://s3.amazonaws.com/mozilla-games/tmp/2016-05-02-SunTemple/SunTemple-HTML5-Shipping.html, id=4294967301)
│  ├──389.82 MB (64.81%) -- active/window(https://s3.amazonaws.com/mozilla-games/tmp/2016-05-02-SunTemple/SunTemple-HTML5-Shipping.html)
│  │  ├──388.04 MB (64.51%) -- js-compartment(https://s3.amazonaws.com/mozilla-games/tmp/2016-05-02-SunTemple/SunTemple-HTML5-Shipping.html)
│  │  │  ├──387.66 MB (64.45%) -- classes
│  │  │  │  ├──294.01 MB (48.88%) -- class(ArrayBuffer)/objects
│  │  │  │  │  ├──288.00 MB (47.88%) ── non-heap/elements/asm.js
│  │  │  │  │  └────6.01 MB (01.00%) ++ (2 tiny)
│  │  │  │  ├───91.59 MB (15.23%) -- class(WasmModuleObject)/objects
│  │  │  │  │   ├──54.88 MB (09.12%) ── non-heap/code/asm.js
│  │  │  │  │   ├──36.71 MB (06.10%) ── malloc-heap/misc
│  │  │  │  │   └───0.00 MB (00.00%) ── gc-heap
│  │  │  │  └────2.06 MB (00.34%) ++ (10 tiny)
│  │  │  └────0.39 MB (00.06%) ++ (6 tiny)
│  │  └────1.78 MB (00.30%) ++ (4 tiny)
│  └────0.50 MB (00.08%) ++ js-zone(0x24fb110f800)
├──175.28 MB (29.14%) ── heap-unclassified
├───16.73 MB (02.78%) -- js-non-window
│   ├──10.13 MB (01.68%) -- zones
│   │  ├───8.67 MB (01.44%) ++ zone(0x24fa2e8c000)
│   │  └───1.46 MB (00.24%) ++ (3 tiny)
│   └───6.60 MB (01.10%) ++ (2 tiny)
├───11.83 MB (01.97%) ++ (18 tiny)
└────7.34 MB (01.22%) ++ webgl

and with today's Nightly, it reads

524.67 MB (100.0%) -- explicit
├──390.08 MB (74.35%) -- window-objects/top(https://s3.amazonaws.com/mozilla-games/tmp/2016-05-02-SunTemple/SunTemple-HTML5-Shipping.html, id=2147483655)
│  ├──389.58 MB (74.25%) -- active/window(https://s3.amazonaws.com/mozilla-games/tmp/2016-05-02-SunTemple/SunTemple-HTML5-Shipping.html)
│  │  ├──387.78 MB (73.91%) -- js-compartment(https://s3.amazonaws.com/mozilla-games/tmp/2016-05-02-SunTemple/SunTemple-HTML5-Shipping.html)
│  │  │  ├──387.16 MB (73.79%) -- classes
│  │  │  │  ├──294.01 MB (56.04%) -- class(ArrayBuffer)/objects
│  │  │  │  │  ├──288.00 MB (54.89%) ── non-heap/elements/asm.js
│  │  │  │  │  ├────6.01 MB (01.15%) ── malloc-heap/elements/normal
│  │  │  │  │  └────0.00 MB (00.00%) ── gc-heap
│  │  │  │  ├───91.11 MB (17.36%) -- class(WasmModuleObject)/objects
│  │  │  │  │   ├──54.40 MB (10.37%) ── non-heap/code/asm.js
│  │  │  │  │   ├──36.71 MB (07.00%) ── malloc-heap/misc
│  │  │  │  │   └───0.00 MB (00.00%) ── gc-heap
│  │  │  │  └────2.04 MB (00.39%) ++ (10 tiny)
│  │  │  └────0.62 MB (00.12%) ++ (6 tiny)
│  │  └────1.80 MB (00.34%) ++ (4 tiny)
│  └────0.50 MB (00.10%) ++ js-zone(0x2218b16d000)
├──100.08 MB (19.08%) ── heap-unclassified
├───15.77 MB (03.01%) -- js-non-window
│   ├──10.02 MB (01.91%) -- zones
│   │  ├───8.57 MB (01.63%) ++ zone(0x22184fc5000)
│   │  └───1.45 MB (00.28%) ++ (3 tiny)
│   └───5.75 MB (01.10%) ++ (2 tiny)
├────7.35 MB (01.40%) ++ webgl
├────6.10 MB (01.16%) ++ (17 tiny)
└────5.29 MB (01.01%) ++ heap-overhead

Unfortunately due to bug 1268863 these reports don't directly say the exact numbers for script-sources, but it's part of heap-unclassified. There is a 75.2MB reduction in heap-unclassified. The uncompressed JS file for Sun Temple is 45.7MB (91.4MB doubled up to UTF-16), so assuming that all of the change in heap-unclassified is due to the improvement in this patch, gzip compression took the original file down from 91.4MB to 16.2MB. This is just fantastic!

When gzipping the asm.js file locally, it goes from 45.7MB down to 8.92MB. It looks like we decompress the gzipped content we receive from the network, and then gzip-compress it again ourselves. I wonder if that might pose a burden on mobile. If the machinery here were improved to reuse the compressed bits we get directly from the network, it would further save -45% of the size (-7.28MB). Not sure about the feasibility of that, though?
Flags: needinfo?(jujjyl)
(In reply to Jukka Jylänki from comment #33)
> When gzipping the asm.js file locally, it goes from 45.7MB down to 8.92MB.
> It looks like we do decompress the gzipped content we receive from network,
> and then gzip compress it again ourselves. I wonder if that might pose a
> burden on mobile. If the machinery here was improved to reuse the compressed
> bits we get directly from the network, it would further save -45% of the
> size (-7.28MB). Not sure about the feasibility of that though?

Yes, the difference here is that SM is doing a fast (Z_BEST_SPEED) compression of the utf8-decoded (char16_t) chars, so double size loss.  It's a good idea to try to use what came over the network and one we've talked about before but unfortunately the points we'd need to connect are very far apart in terms of code architecture: the network does a streaming decoding of what comes over the network before it hands the decompressed utf8 bytes to DOM-land which uses a number of heuristics to decide how to decode into char16_t before handing over to JS.  Maybe one day.
Thanks for landing bug 1268863, for good measure, ran the page again with that included, which now shows

│   ├──35.10 MB (05.14%) -- runtime
│   │  ├──18.25 MB (02.68%) -- script-sources
│   │  │  ├──16.32 MB (02.39%) -- source(scripts=1, blob:https://s3.amazonaws.com/8ce8d2e9-efe2-4706-88bd-f9ef930ab8c4)
│   │  │  │  ├──16.32 MB (02.39%) ── compressed

so the above 16.2MB estimation was accurate. I'm gonna go and start rallying Emscripten users to migrate to --separate-asm now!
Thanks for the confirmation and thanks again for pushing on this issue.  Any chance you can get Emscripten to do this by default (or, if not, warn if files are large)?