Open Bug 988353 Opened 10 years ago Updated 2 years ago

Octane Code Load spends most time in full parsing

Categories

(Core :: JavaScript Engine, defect)

defect

Tracking

()

People

(Reporter: till, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: perf)

Attachments

(2 files)

Looking at https://tl-beta.paas.allizom.org/, we're not doing an awful lot of syntax parsing in code load. Full parsing takes 56% of the entire run time, with lazy parsing at 19%. From a very cursory glance at how the test works, it seems to me like we should be able to spend much more time in syntax parsing, in theory.

Script execution is totalling about 23% of the entire run time, with 12% in the interpreter, 7.3% in baseline, 2% in ion and 1.4% in Yarr JIT. That seems fair to me, given what the test does.
As h4writer pointed out, here's the full profile for code load: https://tl-beta.paas.allizom.org/tracelogger.html?data=data-Octane-x86-code-load.js
related to bug 912561 ?
(In reply to mayankleoboy1 from comment #2)
> related to bug 912561 ?

No, this is about an issue that affects both GGC and non-GGC builds and isn't related to GC at all: it's about (potentially) doing too much full parsing instead of just syntax parsing most code in this particular benchmark.
I looked into this some more, with somewhat interesting results.

The TL;DR version is that we seem to hit our lazyScriptCache far less than we ought to.


Coad Load consists of two parts: one loading and executing jQuery, and one doing the same for a part of Google's Closure library.

Both are embedded in the source file as strings which are salted before evaluating them.

For the Closure part, the salting is thorough enough that there's pretty much nothing short of outright cheating one could do to improve the amount of lazy parsing (or cloning of parsed results).

For jQuery, however, we should get pretty good hit rates form the lazyCodeCache: the salting only affects a very few functions in the source. The thing is, we don't.

Attached is a reduced version of Code Load, with Closure removed, and only two runs of evaluating and running jQuery. Running it with `-e 'unsalted=true'` will prevent the salting. In theory, this should give us perfect cache hit rates, as the exact same source code is parsed and executed twice.

Running the salted and unsalted versions with the tracelogger enabled[1] gives these results:
http://people.mozilla.org/~tschneidereit/tracelogging/tracelogger.html?log=logs/octane-code-load-salted
http://people.mozilla.org/~tschneidereit/tracelogging/tracelogger.html?log=logs/octane-code-load-unsalted

As can be seen, the amount of parsing that's done is virtually identical, so we don't seem to get any improvements from the lazyScriptCache at all. I wonder if this is true for all (eval-using, perhaps) code, or if there's something specific about this script.

I haven't yet looked too much into *why* we're not hitting the cache.
Brian, maybe you know a reason off-hand that I just don't see?


[1]: After compiling with --enable-trace-logging, I ran the shell with this command line:
TLOPTIONS=DisableMainThread TLLOG=Default js -e 'unsalted={true,false}' code-load.js
Flags: needinfo?(bhackett1024)
Hmm, it looks like the lazy script cache has never actually done anything, due to a local-variable-shadowing-argument bug.
Attachment #8417539 - Flags: review?(till)
Flags: needinfo?(bhackett1024)
Comment on attachment 8417539 [details] [diff] [review]
fix lazy script cache

Review of attachment 8417539 [details] [diff] [review]:
-----------------------------------------------------------------

Oh wow :/
Attachment #8417539 - Flags: review?(till) → review+
I just tested this, and it does help some: Octane Code Load goes up about 800-1000 points on my machine, or roughly 4.5% to 5%.

Looking at the new tracelogging results, there is still a lot of time spent in parsing, though:
http://people.mozilla.org/~tschneidereit/tracelogging/tracelogger.html?data=../tl-logs/octane-code-load-unsalted-lazy/
http://people.mozilla.org/~tschneidereit/tracelogging/tracelogger.html?data=../tl-logs/octane-code-load-salted-lazy/

I guess there's not too much that can be done about the syntax parsing taking a while. One idea might be to see if a script has been loaded from the same URI previously and, if so, compare the source chars and take the parsing results wholesale instead of doing even the syntax parsing. Presumably, that'd help with multiple sites loading the same scripts, like jQuery, from CDNs. That seems to be quite involved, though.

I don't, however, understand why we're still spending so much time in delazification. At least for the unsalted version, shouldn't we straight out copy all scripts from the cache if, as in this case, no GC has happened between two iterations of parsing and running a set of functions?
> Oh wow :/

I've learned to view "stop doing something stupid" as just as good as "start doing something clever". It helps with cases like this :)
(In reply to Till Schneidereit [:till] from comment #8)
> I guess there's not too much that can be done about the syntax parsing
> taking a while. One idea might be to see if a script has been loaded from
> the same URI previously and, if so, compare the source chars and take the
> parsing results wholesale instead of doing even the syntax parsing.
> Presumably, that'd help with multiple sites loading the same scripts, like
> jQuery, from CDNs. That seems to be quite involved, though.

Well, while testing the above patch I've found several different ways the lazy script cache is broken, some of them due to engine changes since the cache was (well, wasn't) added.  The most serious of these is that the memcmp we do to test correspondence between the cached and new code isn't valid, as the code we emit for a lazy script depends on the environment it exists in (e.g. for ALIASEDVAR and GNAME ops).  To fix this I think the cache needs to be rewritten so that we find identical script sources instead of individual scripts.  That would permit caching and reusing both the initial results of lazy parsing and later delazification, though, and in non-contrived cases (i.e. not octane-codeload) should still hit the same amount of time as the existing approach.

> I don't, however, understand why we're still spending so much time in
> delazification. At least for the unsalted version, shouldn't we straight out
> copy all scripts from the cache if, as in this case, no GC has happened
> between two iterations of parsing and running a set of functions?

The lazy script cache as is doesn't deal with functions with inner functions, maybe that is hitting a lot in this case.
Depends on: 1008032
Depends on: 1020012
With the blockers fixed, can we do more to improve our script cache, will code load just be as slow as it is because it's not a good test?
Flags: needinfo?(bhackett1024)
Blocks: 1101386
(In reply to Till Schneidereit [:till] from comment #11)
> With the blockers fixed, can we do more to improve our script cache, will
> code load just be as slow as it is because it's not a good test?

octane-code-load is designed to defeat caching attempts, by modifying the executed code each time it runs.  It's just trying to isolate the behavior of the parser, which I think makes it a good test, actually.  I'm still not sure what to do about the lazy script cache; it leaves a lot on the table wrt potential gains from caching compilation results, but on the other hand it doesn't use much memory and doesn't keep things alive longer than they otherwise would be.  In any case, I think bug 1101386 would be a better place to pursue improvements to the cache, since that is a case that can actually be helped by parse caching.
Flags: needinfo?(bhackett1024)
Since it's unclear what the blockers are anymore, here's a try catch to see if exercising the lazy script cache fails:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=5edb536095b9
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: