Created attachment 717265 [details] [diff] [review] WIP Passes jit-tests x86 and x64. This turned out to be pretty straight-forward: we know the native code address of the LOOPENTRY op (pc -> native map), so EnterJit reserves space for the frame and calls a C++ stub to copy everything from the StackFrame to the BaselineFrame. The debugger slightly complicates things, in debug mode we have to update frame objects for the StackFrame to point to the BaselineFrame. The patch also adds --baseline-eager and --baseline-uses-before-compile flags, the default is 10 now but we can tweak it later. TODO: ARM support, some cleanup/OOM handling, add --baseline-eager flag to TBPL jit-test config.
Created attachment 717834 [details] [diff] [review] Patch Passes jit-tests on x86, x64 and ARM. With this patch we still do eager compilation if JSD is enabled. This is done so that we don't have to update frame pointers in JSD's frame list when we OSR. Given the state JSD is in (unmaintained, undocumented, going away soon) I think this is the best option. I also changed --ion-eager to imply --baseline-eager, and when the baseline compiler is enabled JM is disabled. The patch also adds some interesting flags to the jit_test.py --tbpl config.
Created attachment 717837 [details] [diff] [review] Patch
Comment on attachment 717837 [details] [diff] [review] Patch Review of attachment 717837 [details] [diff] [review]: ----------------------------------------------------------------- Got to this a bit earlier than anticipated, and review was faster than anticipated. Nice work! ::: js/src/ion/arm/Trampoline-arm.cpp @@ +196,5 @@ > + masm.addPtr(Imm32(2 * sizeof(uint32_t)), r9); > + masm.storePtr(r9, Address(sp, 0)); > + masm.jump(&skipJump); > + masm.jump(&returnLabel); > + masm.bind(&skipJump); Nice :) @@ +208,5 @@ > + masm.mov(sp, r11); > + > + // Reserve space for locals and stack values. > + masm.ma_lsl(Imm32(3), r8, r9); > + masm.ma_sub(sp, r9, sp); I think the enterJIT code is getting long/complex enough that we should consider using GeneralRegisterSet + named register local vars for this code. Following the data flow through this had me a backtracking several times to check what was copied where and when. Same comment applies to x64. Not to x86, though, because I think the constant adding/removing of register from the set would make things less readable. @@ +232,5 @@ > + > + Label error; > + masm.addPtr(Imm32(IonExitFrameLayout::SizeWithFooter()), sp); > + masm.addPtr(Imm32(BaselineFrame::Size()), r11); > + masm.branchTest32(Assembler::Zero, r0, r0, &error); Nit: r0 should be ReturnReg? (Just for clarity) ::: js/src/shell/js.cpp @@ +5348,5 @@ > || !op.addBoolOption('\0', "no-baseline", "Disable baseline compiler") > + || !op.addBoolOption('\0', "baseline-eager", "Always baseline-compile methods") > + || !op.addIntOption('\0', "baseline-uses-before-compile", "COUNT", > + "Wait for COUNT calls or iterations before baseline-compiling " > + "(default: 10240)", -1) Nit: default: 10.
Attachment #717837 - Flags: review?(kvijayan) → review+
Pushed with comments addressed. The RegisterSet is a good idea, I tried it on x86 too and it seemed cleaner there too. https://hg.mozilla.org/projects/ionmonkey/rev/be125cabea26 https://hg.mozilla.org/projects/ionmonkey/rev/c856bc366e53
Status: ASSIGNED → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
This looks to have greatly affected Octane benchmarks. The overall doesn't change much, but individual tests show some large swings.
rev be125cabea26 seems to have regressed ss-cube 14.3% and it has not fully recovered yet. We have moved from first to second place on that benchmark (only a few seconds, but still). Is this an expected regression?
(In reply to Cyko_01 from comment #7) > Is this an expected regression? Yes it is. The interpreter is very slow, especially on scripts that make lots of calls. The main benefit of this bug is reducing memory usage - on typical web workloads we now compile only ~20% of all scripts we execute.
We maybe be able to tune this a bit by using a smaller OSR-to-baseline threshold for "large" scripts (e.g. UseCount of 3 instead of 10 for big scripts).
You need to log in before you can comment on or make changes to this bug.