Tune the megamorphic cache and associated code
Categories
(Core :: JavaScript Engine: JIT, task, P1)
Tracking
()
Tracking | Status | |
---|---|---|
firefox107 | --- | fixed |
People
(Reporter: alexical, Assigned: alexical)
References
(Blocks 1 open bug)
Details
(Whiteboard: [sp3])
Attachments
(3 files)
This is a bit of grab-bag of proposed changes to the megamorphic cache. In summary:
- Slightly change the hashing of the shape to help avoid hitting pathological cases.
- Remove the
branchIfNonNativeObject
bailout path in the places which callGetNativeDataPropertyPure
and friends until after the cache lookup, since we'll never get a cache hit on a non-native object anyway. - Inline the cache lookup.
Assignee | ||
Updated•2 years ago
|
Assignee | ||
Comment 1•2 years ago
|
||
The first three bits of the Shape pointer will always be 0, so we lose out on
some entropy here. I initially tried a "real" hashing algorithm, but noticed as
you likely already did that this can be incredibly hot and even adding a single
multiply in meaningfully degrades the performance of this. However, adding this
shift in just adds one mov and one shr, and I couldn't get it to regress perf
for a microbenchmark that tried to make it irrelevant. However, it did speed up
a microbenchmark that used the same key over and over again against 100
different shapes by a factor of about 25%, so I think it's worth it to avoid
potential pathological cases.
Further observations: we do hit this issue reasonably frequently in the matrix
react bench as well as in speedometer - this patch reduces cache misses where
only the shape was different quite substantially. However, those misses are
a small enough portion of overall cache misses that this patch doesn't seem to
observably affect those benchmarks' overall numbers.
Lastly, I wish I could have computed these shifts instead of just leaving a
comment explaining how they were derived. However, our tools for doing this in
C++ are not constexpr-friendly it seems. Also, I made these public because
they will be used from the MacroAssembler in later patches.
Assignee | ||
Comment 2•2 years ago
|
||
Initially we explored caching this check on the shape, which still seems like
a decent idea. However, we can also just sidestep the problem for the most part
be moving the check after the cache lookup, since we can't get a cache hit if
it's not a NativeObject anyway, and we're hoping that a cache hit should be the
dominant path in most cases.
Depends on D156613
Assignee | ||
Comment 3•2 years ago
|
||
This inlines the megamorphic cache lookup, netting us about a 30-40%
improvement for the microbenchmark on top of the prior optimizations. There is
no measured performance losses for speedometer or matrix-react, and maybe a
1% win overall, though I need to do more runs with the cleaned up patch to be
confident.
We could do similar inlining for the CacheIRCompiler version and for
MegamorphicLoadSlotByValue - however, we wouldn't be able to include the id and
id hash as immediates, which would mean we'd need another register and several
more instructions to make it work, and it would be awkward to share the masm
code because of the different register requirements.
Depends on D156614
Updated•2 years ago
|
Comment 5•2 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/5ff12c46bc20
https://hg.mozilla.org/mozilla-central/rev/080d321a6baf
https://hg.mozilla.org/mozilla-central/rev/da8bb74f3daa
Updated•2 years ago
|
Updated•2 years ago
|
Description
•