Closed Bug 938385 · Opened 11 years ago · Closed 10 years ago

lazy-parsing and very-large scripts still a problem

Categories

(Core :: JavaScript Engine, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking


RESOLVED FIXED
mozilla29

People

(Reporter: luke, Assigned: luke)

Details

Attachments

(1 file)

Bug 885908 broke the cycle where decompressing a large script caused a GC, which flushed the decompressed source, so we had to decompress again, and so on.  Even with that fixed, though, we can still run into problems with large scripts:
 - the decompression time is significant, even if it only happens once (1s for a 270MB script, which is about the size of an un-optimized, un-minified Emscripten build of the Unreal engine)
 - the decompressed chars are large and can contribute to OOM on 32-bit

This tends to show up even with asm.js AOT compilation, since there are still a bunch of random glue functions in non-asm.js code.
This seems to account for 600ms of main-thread startup time in the Citadel demo, with another 250ms happening within the first few seconds of interaction (presumably some input-handler code running for the first time after a GC).  On another large compiled mobile game, it also accounts for 600ms (measured on a PC, so probably >2x that on a device), with several later scattered 150ms pauses.

One targeted fix: since "use asm" is the "I don't want any pauses after I start running" pragma, we could just reparse the entire script with FullParseHandler when we see "use asm" (that's almost what happens now: everything after the "use asm" is full-parsed).  Even simpler, we could have every abortIfSyntaxParser() call cause this behavior.  (IIRC, this shouldn't happen much outside of asm.js and, if it did, we'd want to improve the syntax-only parser to handle that case.)
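To illustrate the decision (a minimal sketch only; none of these names are the real SpiderMonkey API):

  #include <string>

  // Sketch: a directive that implies "no pauses once running" (like
  // "use asm"), or any abortIfSyntaxParser() call, makes us discard the
  // syntax-only parse and eagerly re-parse the whole script with the full
  // parser, so no lazily-parsed functions (and no later decompression) remain.
  enum class ParseMode { SyntaxOnly, Full };

  ParseMode chooseParseMode(const std::string& directive, bool syntaxParserAborted)
  {
      if (directive == "use asm" || syntaxParserAborted)
          return ParseMode::Full;    // eagerly full-parse the entire script
      return ParseMode::SyntaxOnly;  // normal lazy parsing
  }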

Another thought is that lazy parsing is kinda dangerous for mega-huge scripts in general, and perhaps we should disable source compression when the script length is >10MB (the point where decompression times exceed 100ms).  asm.js or not, if such a script runs for any duration of time, it's *bound* to repeatedly trigger lazy compilation after a GC.  If it doesn't run long, then compression was a waste anyway.  Re-lazification seems likely to exacerbate this.  Furthermore, for huge scripts, the compression time itself can make loading slower (especially if there isn't a second core).  While this seems counter-intuitive (huge scripts stand to save the most memory from compression): (1) huge scripts are very rare, so this shouldn't impact ordinary memory use, and (2) running a huge script is probably a resource-hungry operation anyway; 10MB one way or the other is probably not much compared to all the other resources being used.
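A minimal sketch of the threshold (the 10MB constant is just the number from this comment, not a tuned value, and the helper name is made up):

  #include <cstddef>

  // Past this size, decompression pauses exceed ~100ms, so source
  // compression is skipped for such scripts.
  static const size_t HUGE_SOURCE_THRESHOLD = 10 * 1024 * 1024;  // chars

  bool shouldCompressSource(size_t sourceLength)
  {
      return sourceLength < HUGE_SOURCE_THRESHOLD;
  }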

A third option, discussed before, and a lot more work, is that we implement some form of chunked compression/decompression.  This seems like the "right" fix, but it's a non-trivial amount of complexity to handle a corner case.

Overall, my inclination is to do the second option and wait until we have a use case for the third.  cc'ing people for other opinions.
Attached patch fix-compression
While writing this patch I realized that we already disable compression in other bad cases (huge string literal, only one core), which supports the view that compression is a heuristic memory optimization that we attempt to apply when it doesn't hurt.  So I think this is the right path.
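Roughly, the combined "compress only when it can't hurt" heuristic looks like this (a sketch with made-up helper names; the real check is in js/src/jsscript.cpp, quoted in the review below):

  #include <cstddef>

  struct CompressionInputs {
      unsigned cpuCount;
      unsigned workerThreadCount;
      bool hasHugeStringLiteral;
      size_t sourceLength;
  };

  // Require a spare core and a spare worker thread, no huge string literal
  // that keeps the chars alive uncompressed anyway, and a source below the
  // size threshold where decompression pauses become noticeable.
  bool shouldAttemptCompression(const CompressionInputs& in)
  {
      static const size_t HUGE_SOURCE_THRESHOLD = 10 * 1024 * 1024;
      return in.cpuCount > 1 &&
             in.workerThreadCount >= 2 &&
             !in.hasHugeStringLiteral &&
             in.sourceLength < HUGE_SOURCE_THRESHOLD;
  }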
Assignee: nobody → luke
Status: NEW → ASSIGNED
Attachment #8357359 - Flags: review?(wmccloskey)
Oh, also: this seems to improve warm Citadel startup time by a full second.
Comment on attachment 8357359 [details] [diff] [review]
fix-compression

Review of attachment 8357359 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsscript.cpp
@@ +1217,5 @@
>      argumentsNotIncluded_ = argumentsNotIncluded;
>  
> +    // There are several cases where source compression is not a good idea:
> +    //  - If the script is enormous, then decompression can take seconds which,
> +    //    when combined with lazy parsing, which frequently requires

Not sure which which to keep, but it's confusing to have two of them.

@@ -1226,5 @@
>      //    threads since at most one worker thread can be blocking on the main
>      //    thread (see WorkerThreadState::canStartParseTask) which would cause a
>      //    deadlock if there wasn't a second worker thread that could make
>      //    progress on our compression task.
> -    if (task && cx->cpuCount() > 1 && cx->workerThreadCount() >= 2) {

I think you still need to nullcheck task here. See this call site:
http://mxr.mozilla.org/mozilla-central/source/js/src/frontend/BytecodeCompiler.cpp#211
The default value for extraSct is null.
Attachment #8357359 - Flags: review?(wmccloskey) → review+
(In reply to Bill McCloskey (:billm) from comment #4)
> I think you still need to nullcheck task here. See this call site:
> http://mxr.mozilla.org/mozilla-central/source/js/src/frontend/
> BytecodeCompiler.cpp#211
> The default value for extraSct is null.

But the
  SourceCompressionTask *sct = extraSct ? extraSct : &mysct;
line should ensure 'sct' is non-null even if extraSct is null.
Oops, I guess you're right.
I think this makes a lot of sense. Luke, you're completely right that relazification will exacerbate the problem here, so it's good to have it out of the way for large scripts already.

> A third option, discussed before, and a lot more work, is that we implement
> some form of chunked compression/decompression.  This seems like the "right"
> fix, but it's a non-trivial amount of complexity to handle a corner case.
> 
> Overall, my inclination is to do the second option and wait until we have a
> use case for the third.  cc'ing people for other opinions.

I agree with this, too.

Medium- to long-term, we should still switch to lz4 and chunked compression and then remove these exceptions again. I'm watching lz4's development and will update our import as soon as I think it makes sense for us to switch to it in SpiderMonkey. (We have it in MFBT, and Yoric is experimenting with it for storing session data and other things, IIUC.) Right now, it still lacks the ability to be interrupted. It has, OTOH, recently gained support for using custom allocators for its internal (de-)compression state, so that's interesting.

We could also switch to a slightly more complicated format where we just manually chunk the data into 64KB blocks and check for interruption between them. That would have the added advantage of making chunked decompression fairly trivial to implement.
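Something like this, roughly (a sketch only; nothing here is the real SpiderMonkey or lz4 API):

  #include <atomic>
  #include <cstddef>
  #include <functional>

  // Feed the source to the compressor in fixed 64KB blocks and check an
  // interrupt flag between blocks, so a background compression task can
  // bail out without waiting for the whole script.
  static const size_t CHUNK_SIZE = 64 * 1024;

  bool compressInChunks(const char* src, size_t len,
                        const std::function<bool(const char*, size_t)>& compressChunk,
                        const std::atomic<bool>& interrupted)
  {
      for (size_t off = 0; off < len; off += CHUNK_SIZE) {
          size_t n = (len - off < CHUNK_SIZE) ? len - off : CHUNK_SIZE;
          if (!compressChunk(src + off, n))
              return false;             // compression failed: keep the raw chars
          if (interrupted.load())
              return false;             // interrupted: keep the raw chars
      }
      return true;
  }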

Under this scheme, I could imagine the ScriptSource having an array of bools indicating which source chunks are available decompressed. If the requested function is fully available, it can just get returned. If not, we have to request decompression of the missing chunk(s). A function that crosses chunk boundaries would require the two (or more) chunks to be decompressed at the same time and in a contiguous area of memory, I guess, but I'm hopeful that that case doesn't happen too frequently and that this downside wouldn't outweigh the substantial benefits.
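A sketch of that bookkeeping (all names made up): one "is this chunk decompressed?" flag per 64KB chunk, and a lookup that tells us whether a function's source range can be handed out directly or the missing chunks need decompressing first.

  #include <cstddef>
  #include <vector>

  struct ChunkedScriptSource {
      static constexpr size_t CHUNK_SIZE = 64 * 1024;
      std::vector<bool> chunkDecompressed;  // one flag per 64KB chunk

      // True if every chunk touched by [begin, end) is already decompressed.
      // Assumes begin < end and end <= source length.
      bool rangeAvailable(size_t begin, size_t end) const {
          for (size_t c = begin / CHUNK_SIZE; c <= (end - 1) / CHUNK_SIZE; c++) {
              if (!chunkDecompressed[c])
                  return false;  // this chunk must be decompressed first
          }
          return true;
      }
  };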

Oh, and lz4's compression ratio doesn't meaningfully suffer from having chunks this small. See http://fastcompression.blogspot.de/2013/08/inter-block-compression.html for a table.
(In reply to Till Schneidereit [:till] from comment #8)
Thanks for following up.  Yes, chunked decompression is probably what we want long term.

On the "functions spanning chunks" problem: independently of this problem, I've been wondering if we could change SM's parser so that it could parse in chunks, calling hook when it reaches the end of one chunk to get the next one.  If you combine this with off-main-thread parsing (so that it is ok to block on the network), this would allow us to parse a script as it arrive from the network.  Assuming our parsing throughput is greater than network throughput, this would make parsing almost free.  It would also avoid what I assume must be a bunch of wasted effort that the HTML5 parser has to do first scanning the chars for the </script> tag and then copying them into a big contiguous buffer.  Anyhow, if we had the above functionality, then we could use it to easily solve the function-spanning-chunks problem.
https://hg.mozilla.org/mozilla-central/rev/aeef43ce8bdb
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla29