Closed Bug 1779940 Opened 5 months ago Closed 3 months ago

Large time spent in JS:frontend on https://lambdahack.github.io/ with "concurrent delazification

Categories

(Core :: JavaScript Engine, enhancement, P1)

enhancement

Tracking

()

RESOLVED FIXED
106 Branch
Tracking Status
firefox106 --- fixed

People

(Reporter: mayankleoboy1, Assigned: nbp)

References

(Depends on 1 open bug, Blocks 2 open bugs)

Details

Attachments

(6 files)

Set dom.script_loader.delazification.strategy = 2
Restart nightly
Go to https://lambdahack.github.io/
Wait for the game to load..

ER: The load time increases to 20+ seconds, with all the time spent in JS:frontend.
AR: Instant loading of the page (which happens with "255" strategy)

Profile with "2" strategy : https://share.firefox.dev/3PgBWGE
Profile with "255" strategy: https://share.firefox.dev/3yMb4au

Attached file about:support
Summary: Large time spent in JS:frontend on https://lambdahack.github.io/ → Large time spent in JS:frontend on https://lambdahack.github.io/ with "2" delazification strategy
See Also: → 1779292
Summary: Large time spent in JS:frontend on https://lambdahack.github.io/ with "2" delazification strategy → Large time spent in JS:frontend on https://lambdahack.github.io/ with "concurrent delazification

From the profile I wonder if there are just many aliased bindings in an outer function and we get quadratic behavior in searchInEnclosingScope?

Component: JavaScript Engine: JIT → JavaScript Engine

Thanks a lot for this bug report! Definitely useful.

Assignee: nobody → nicolas.b.pierron
Status: NEW → ASSIGNED
Severity: -- → N/A
Priority: -- → P1

Ok, the problem is not that the cache does not work.

So … this is likely the code I wrote which is generic over the input which is costing us.
I will revisit and try to split the searchInEnclosingScope function based on the kind of scope we are expecting.

(In reply to Nicolas B. Pierron [:nbp] from comment #4)

I will revisit and try to split the searchInEnclosingScope function based on the kind of scope we are expecting.

I've done so, unfortunately without success. Checking Firefox 94, which predates Bug 1730881 changes, shows a profile which also takes a huge amount of time under script delazify.

Thus this is a pre-existing issue, and not something introduced with the addition of InputScope variants.
I will investigate whether we can provide a better approach than searchInEnclosingScope.

Ok, I just tested a patch which is caching each scope as we iterate over the chain of enclosing scopes, and this caused everything to be even slower than before.

The problem might be the number of bindings which are listed. Without the patch the profiler highlight a huge cost under the InputName::isEqualTo function, and when using a cache, the InputName::internInto function, to use the same namespace as the compiled function is taking a huge hit as well.

When using on-demand, the cost is apparent when comparing against the JSAtom*, and when using off-thread delazification, the cost appear as we have TaggedParserAtomIndex which are not comparable as coming from different ParserAtomTable.

Arai already implemented something similar where the cache was pre-populated ahead, in Bug 1690277.

It sounds to me that the concurrent delazification cost could be avoided if we were to pre-compute a ParserAtomTable of all names which are in the source, and reuse the same ParserAtomTable for all inner functions. Unfortunately, this would not resolve the case where we attempt to do on-demand delazification, where JSAtom* would have to be interned.

Depends on: 1782493

This patch moves the hops offseting out of the searchInEnclosingScope
computation such that searchInEnclosingScope computation can provide a result
which is independent from where the named is looked for. Making the NameLocation
independent enables us to later cache the NameLocation.

This patch adds 2 short-circuits to improve the lookup of bindings in the
enclosing scope chain, during on-demand delazification.

The first short-circuit is in Function scope, by comparing bindings against each
others, we can compare either JSAtom directly against JSAtom or
TaggedParserAtomIndex directly with TaggedParserAtomIndex from the
CompilationStencil context.

The second short-circuit is in InputName::isEqualTo, in the case where we are
comparing a JSAtom against the TaggedParserAtomIndex from the compiled
script, we can query the hash of both and compare them before converting the
TaggedParserAtomIndex into a JSAtom, which should make the comparison
faster.

The next patch is creating a cache which is capable of looking up different kind
of string types. However, each string type need some contextual information to
be able to compare them against each others, which adds complexity to the lookup
type. In addition, the keys are of only one string type, and therefore we try to
avoid storing this context as part of each key, but instead provide it with the
contextual information coming with the Lookup type.

Therefore, when we want to insert a key, which might already be present, using
put. We have to provide a aLookup argument which knows how to compare keys.

This also make the interface similar to putNew which already has the
distinctions between the Lookup argument and the KeyInput argument.

This patch isolate the boiler plate code to create a dummy ScopeBindingCache
with no logic to split the noise of this not-so mechanical patch driven by
compiler error messages.

This patch modify the interface of every function, adding the option to provide
a custom ScopeBindingCache when parsing a Stencil, except for publicly facing
functions such as from js/public/experimental/JSStencil.h.

The intent being that Helper thread functions, which are relying on Stencil
should have their own ScopeBindingCache which is either unused, or dedicated to
caching the bindings of multiple delazifications. The reasons are that in the
future we are interested in stripping the JSContext* out of HelperThreads, and
that there is no value in sharing the content across threads.

When generating bytecode, we are looking for names locations, using
ScopeContext::searchInEnclosingScope, to replace them by integers, which
improve the speed of the generated code.

However, looking for these names can it-self be extremlly costly if there is a
scope with a very large number of bindings. Multiple attempts have been made in
the past to improve this situation by caching bindings within the scope of each
compilation. Unfortunately, this was either a regression or a non-improvement,
as these lookups might be done by thousands of inner functions.

This patch adds content to the ScopeBindingCache, introduced in the previous
patch as placeholder. The ScopeBindingCache is a cache which persists across
multiple compilations. Therefore, if a scope chain is shared by multiple
delazified inner function, the caching cost would be mutualized.

The ScopeBindingCache, as the searchInEnclosingScope function have to deal
with 2 different kind of scopes. Either we are manipulating stencils, in case of
concurrent delazification, or we are manipulating GC objects, in case of
on-demand delazification. Thus the interface provide overloaded functions to be
able to write generic code, and contains 2 ScopeBindingMap, one for each type
of atoms TaggedParserAtomIndex and JSAtom* respectively coming from scope
*::ParserData and *::RuntimeData of each scope.

The ScopeBindingMap is a HashMap indexed on the AbstactBaseScopeData for
each Atom type. It maps the scope data, i-e the pointer to the array of
bindings, to the BindingMap.

The BindingMap provides a HashMap and a catchAll field. The HashMap is
used to map each binding to its NameLocation, while the catchAll is used to
handle the case where the scope is used to collect all new names, when
dynamically created.

The BindingMap's hashMap makes use of the lookup capability of HashTable
in order to avoid storing the context of each atom next to each key. As all
bindings are coming from the same scope data. Thus the Lookup provides the
context to interpret the keys which are stored in the hashMap. This HashMap
also feature the ability to compare all 3 different atoms possible. The
comparisons between these 3 different atoms is handled by the GenericAtom
class.

The GenericAtom class provides a uniform interface for comparing the 3
different type of atoms against each others, such that atoms can be registered
in the BindingMap with one type, while being look up with the same or
different type. The GenericAtom exposes a hash field, which is consistent
across all atoms types, as well as an equality operator. The 3 different atoms
types are: TaggedParserAtomIndex provided by the ScopeStencilRef, the
TaggedParserAtomIndex provided by the ParserAtomTable and the JSAtom* from
GC objects.

Pushed by npierron@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/bd1c92aacfbc
part 0 - Move the hops offsetting out of searchInEnclosingScope. r=arai
https://hg.mozilla.org/integration/autoland/rev/fa61db088844
part 1 - Short circuit TaggedParserAtomIndex and JSAtom comparisons. r=arai
https://hg.mozilla.org/integration/autoland/rev/be795fd450ab
part 2 - Add HashMap::put which relies on lookup. r=arai
https://hg.mozilla.org/integration/autoland/rev/9498e01e91e2
part 3 - Add dummy ScopeBindingCache, and get it across. r=arai
https://hg.mozilla.org/integration/autoland/rev/62177878bdfb
part 4 - Use ScopeBindingCache to speed-up binding lookup. r=arai

Profile with latest nightly: https://share.firefox.dev/3Kk8SfP

This is fixed.

You need to log in before you can comment on or make changes to this bug.