Closed Bug 1919217 Opened 1 year ago Closed 1 year ago

Fix some slow calls into C++

Categories

(Core :: JavaScript Engine: JIT, task, P2)

RESOLVED FIXED
132 Branch
Tracking Status
firefox132 --- fixed

People

(Reporter: jandem, Assigned: jandem)

References

(Blocks 1 open bug)

Details

(Keywords: perf-alert, Whiteboard: [sp3])

Attachments

(2 files)

While looking into bug 1917628, I added some instrumentation for calls that push/pop all volatile registers. On Speedometer 3, the top two cases are:

  • The C++ call in MacroAssembler::emitMegamorphicCachedSetSlot. Here we don't need to save all volatile registers because the LIR instruction is a call instruction, so almost no registers are live.

  • Calls to js::EmulatesUndefined. Most of these are for non-wrapper proxy objects. We can check for this case in MacroAssembler::branchIfObjectEmulatesUndefined and avoid the call.

The LIR instructions are marked as call instructions, so we only need to save the
registers that are still used after the call.

ARM64 in particular has many volatile (floating-point) registers, and saving and
restoring all of them is expensive.
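
The saving described above can be illustrated with a small model. This is only a sketch under stated assumptions: `RegSet`, `makeRange`, `regsToSave`, and `spillCost` are hypothetical names invented for illustration, the 64-slot bitset is a stand-in for a real register file, and none of this is SpiderMonkey's actual register-set API. It shows that spilling only the registers that are both volatile and live after the call costs far fewer push/pop pairs than spilling every volatile register.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Toy register set: one bit per register slot. This is NOT the engine's
// real register-set type; 64 slots is an arbitrary choice for the sketch.
using RegSet = std::bitset<64>;

// Build a set containing `count` consecutive registers starting at `first`.
inline RegSet makeRange(std::size_t first, std::size_t count) {
    assert(first + count <= 64);
    RegSet s;
    for (std::size_t i = 0; i < count; ++i) {
        s.set(first + i);
    }
    return s;
}

// Registers that must be preserved around a C++ call: only those that the
// callee may clobber (volatile) AND that the caller still reads afterwards.
inline RegSet regsToSave(const RegSet& volatileRegs, const RegSet& liveAfterCall) {
    return volatileRegs & liveAfterCall;
}

// Cost model: one push/pop pair per saved register.
inline std::size_t spillCost(const RegSet& toSave) {
    return toSave.count();
}
```

With a made-up volatile set of 40 registers and a single volatile register live after the call, `spillCost(volatileRegs)` is 40 while `spillCost(regsToSave(volatileRegs, live))` is 1. The numbers are illustrative only and do not describe any real ABI.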

This eliminates a few hundred thousand calls to js::EmulatesUndefined on Speedometer 3.
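
The idea of answering the common case inline and calling into C++ only when necessary can be sketched as follows. This is a simplified model, not the patch itself: `ToyObject`, `slowEmulatesUndefined`, `emulatesUndefinedFast`, and the `slowCalls` counter are hypothetical, and the sketch assumes that only wrapper proxies need the out-of-line path (to look through to the wrapped object), while every other object can be decided from a class flag.

```cpp
#include <cassert>

// Hypothetical, simplified object description; not SpiderMonkey's real layout.
struct ToyObject {
    bool isProxy;
    bool isWrapper;               // only meaningful when isProxy is true
    bool classEmulatesUndefined;  // flag that can be read inline from the class
    const ToyObject* wrapped;     // wrapped target, for wrappers only
};

// Counts how often the expensive out-of-line path is taken.
static int slowCalls = 0;

// Stand-in for the out-of-line C++ call: unwraps wrappers, then reads the flag.
bool slowEmulatesUndefined(const ToyObject& obj) {
    assert(obj.isProxy && obj.isWrapper);  // only wrappers should get here
    ++slowCalls;
    const ToyObject* actual = &obj;
    while (actual->isProxy && actual->isWrapper && actual->wrapped) {
        actual = actual->wrapped;
    }
    return actual->classEmulatesUndefined;
}

// Fast path: non-proxies and non-wrapper proxies are answered inline;
// only wrapper proxies fall back to the slow call.
bool emulatesUndefinedFast(const ToyObject& obj) {
    if (obj.isProxy && obj.isWrapper) {
        return slowEmulatesUndefined(obj);
    }
    return obj.classEmulatesUndefined;
}
```

In this model a non-wrapper proxy is resolved without incrementing `slowCalls`, which mirrors the effect described above: the call is skipped for the case that was found to be most frequent.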

Severity: -- → N/A
Priority: -- → P2
Pushed by jdemooij@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/8df7ffcad4d4
part 1 - Avoid saving all volatile registers in LMegamorphicStoreSlot and LMegamorphicSetElement code. r=dthayer
https://hg.mozilla.org/integration/autoland/rev/9e65655a96c1
part 2 - Support non-wrapper proxies in MacroAssembler::branchIfObjectEmulatesUndefined. r=anba
Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 132 Branch

"Perf" key word?

Whiteboard: [sp3]

(In reply to Pulsebot from comment #3)

> Pushed by jdemooij@mozilla.com:
> https://hg.mozilla.org/integration/autoland/rev/8df7ffcad4d4
> part 1 - Avoid saving all volatile registers in LMegamorphicStoreSlot and
> LMegamorphicSetElement code. r=dthayer
> https://hg.mozilla.org/integration/autoland/rev/9e65655a96c1
> part 2 - Support non-wrapper proxies in
> MacroAssembler::branchIfObjectEmulatesUndefined. r=anba

Perfherder has detected a browsertime performance change from push 9e65655a96c1cd90fbd3c98efc9bea04addcbb0f.

Improvements:

| Ratio | Test | Platform | Options | Absolute values (old vs new) | Performance Profiles |
|---|---|---|---|---|---|
| 5% | speedometer3 TodoMVC-Vue/DeletingAllItems/Sync | macosx1400-64-shippable-qr | fission webrender | 2.62 -> 2.49 | Before/After |
| 5% | speedometer3 TodoMVC-Vue/DeletingAllItems/Sync | linux1804-64-shippable-qr | fission webrender | 9.72 -> 9.24 | Before/After |
| 4% | speedometer3 TodoMVC-Vue/DeletingAllItems/Sync | linux1804-64-nightlyasrelease-qr | fission webrender | 9.66 -> 9.28 | Before/After |
| 4% | speedometer3 TodoMVC-Vue/DeletingAllItems/total | macosx1400-64-shippable-qr | fission webrender | 4.22 -> 4.07 | Before/After |
| 3% | speedometer3 TodoMVC-Vue/DeletingAllItems/total | linux1804-64-shippable-qr | fission webrender | 14.98 -> 14.47 | Before/After |
| 3% | speedometer3 TodoMVC-Vue/DeletingAllItems/total | macosx1400-64-shippable-qr | fission webrender | 4.21 -> 4.08 | Before/After |

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests.

If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.

You can run these tests on try with ./mach try perf --alert 2195

For more information on performance sheriffing please see our FAQ.

Keywords: perf-alert