Fix some slow calls into C++
Categories
(Core :: JavaScript Engine: JIT, task, P2)
Tracking
()
| | Tracking | Status |
|---|---|---|
| firefox132 | --- | fixed |
People
(Reporter: jandem, Assigned: jandem)
References
(Blocks 1 open bug)
Details
(Keywords: perf-alert, Whiteboard: [sp3])
Attachments
(2 files)
While looking into bug 1917628, I added some instrumentation for calls that push/pop all volatile registers. On Speedometer 3, the top two cases are:
- The C++ call in MacroAssembler::emitMegamorphicCachedSetSlot. Here we don't need to save all volatile registers because the LIR instruction is a call instruction, so almost no registers are live.
- Calls to js::EmulatesUndefined. Most of these are for non-wrapper proxy objects. We can check for this case in MacroAssembler::branchIfObjectEmulatesUndefined and avoid the call.
Comment 1 (Assignee) • 1 year ago
The LIR instructions are marked as call instructions, so we only need to save the registers
that are still used after the call.
ARM64 in particular has many volatile (float) registers, and saving and restoring them
is fairly expensive.
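The idea in this comment can be sketched with a toy bitmask model: around a call, only registers that are both volatile (caller-saved) and still live after the call need to be spilled. The names `RegMask`, `bit`, `registersToSave`, and `countRegs` are hypothetical and chosen for illustration; they are not SpiderMonkey's register-set API.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>

// Hypothetical toy model: one bit per machine register.
// This is NOT the engine's real register-set type.
using RegMask = std::uint64_t;

// Mask with only register `reg` set (reg must be < 64).
constexpr RegMask bit(unsigned reg) { return RegMask{1} << reg; }

// Registers worth saving around a call: those the callee may clobber
// (volatile) that the caller still needs afterwards (live after call).
constexpr RegMask registersToSave(RegMask volatileRegs, RegMask liveAfterCall) {
  return volatileRegs & liveAfterCall;
}

// Number of registers in a mask, i.e. the number of push/pop pairs.
inline unsigned countRegs(RegMask mask) {
  return static_cast<unsigned>(std::bitset<64>(mask).count());
}
```

With 16 volatile registers and a single one still live after the call, this model saves one register instead of sixteen, which is the kind of saving described for call instructions where almost nothing is live.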
Comment 2 (Assignee) • 1 year ago
This eliminates a few hundred thousand calls to js::EmulatesUndefined on Speedometer 3.
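A minimal sketch of the fast-path idea, under the assumption that only wrapper proxies need the out-of-line unwrap and that every other object can be answered from a flag on its class. `ToyClass`, `ToyObject`, `emulatesUndefinedSlow`, and `slowCallCount` are hypothetical stand-ins, not the engine's real types or the actual patch.

```cpp
#include <cassert>

// Hypothetical stand-ins for engine types; names and layout are assumptions.
struct ToyClass {
  bool emulatesUndefined;  // class-level flag
  bool isProxy;            // true for proxy classes
};

struct ToyObject {
  const ToyClass* clasp;
  bool isWrapper;           // only meaningful for proxies
  const ToyObject* target;  // wrapped object when isWrapper is true
};

// Counts how often the slow, out-of-line path runs.
static int slowCallCount = 0;

// Slow path: unwrap wrappers, then read the flag of the underlying class.
inline bool emulatesUndefinedSlow(const ToyObject& obj) {
  ++slowCallCount;
  const ToyObject* actual = &obj;
  while (actual->isWrapper && actual->target) {
    actual = actual->target;
  }
  return actual->clasp->emulatesUndefined;
}

// Fast path: non-proxies and non-wrapper proxies are answered inline
// from the class flag; only wrappers fall through to the slow call.
inline bool emulatesUndefined(const ToyObject& obj) {
  if (!obj.clasp->isProxy || !obj.isWrapper) {
    return obj.clasp->emulatesUndefined;
  }
  return emulatesUndefinedSlow(obj);
}
```

In this model a non-wrapper proxy never increments `slowCallCount`, which mirrors the reduction in calls reported above.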
Comment 4 (bugherder) • 1 year ago
https://hg.mozilla.org/mozilla-central/rev/8df7ffcad4d4
https://hg.mozilla.org/mozilla-central/rev/9e65655a96c1
Comment 5 • 1 year ago
"Perf" keyword?
Comment 6 (Assignee) • 1 year ago
Looks like this indeed fixed the ARM64 TodoMVC-Vue/DeletingAllItems/Sync regression in bug 1917628.
Comment 7 • 1 year ago
(In reply to Pulsebot from comment #3)
Pushed by jdemooij@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/8df7ffcad4d4
part 1 - Avoid saving all volatile registers in LMegamorphicStoreSlot and
LMegamorphicSetElement code. r=dthayer
https://hg.mozilla.org/integration/autoland/rev/9e65655a96c1
part 2 - Support non-wrapper proxies in
MacroAssembler::branchIfObjectEmulatesUndefined. r=anba
Perfherder has detected a browsertime performance change from push 9e65655a96c1cd90fbd3c98efc9bea04addcbb0f.
Improvements:
| Ratio | Test | Platform | Options | Absolute values (old vs new) | Performance Profiles |
|---|---|---|---|---|---|
| 5% | speedometer3 TodoMVC-Vue/DeletingAllItems/Sync | macosx1400-64-shippable-qr | fission webrender | 2.62 -> 2.49 | Before/After |
| 5% | speedometer3 TodoMVC-Vue/DeletingAllItems/Sync | linux1804-64-shippable-qr | fission webrender | 9.72 -> 9.24 | Before/After |
| 4% | speedometer3 TodoMVC-Vue/DeletingAllItems/Sync | linux1804-64-nightlyasrelease-qr | fission webrender | 9.66 -> 9.28 | Before/After |
| 4% | speedometer3 TodoMVC-Vue/DeletingAllItems/total | macosx1400-64-shippable-qr | fission webrender | 4.22 -> 4.07 | Before/After |
| 3% | speedometer3 TodoMVC-Vue/DeletingAllItems/total | linux1804-64-shippable-qr | fission webrender | 14.98 -> 14.47 | Before/After |
| 3% | speedometer3 TodoMVC-Vue/DeletingAllItems/total | macosx1400-64-shippable-qr | fission webrender | 4.21 -> 4.08 | Before/After |
Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests.
If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.
You can run these tests on try with ./mach try perf --alert 2195
For more information on performance sheriffing please see our FAQ.