Closed
Bug 898963
Opened 11 years ago
Closed 10 years ago
Odinmonkey (ARM): check that d14, the NANReg, is being restored if necessary in FFI calls.
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
RESOLVED
FIXED
mozilla31
People
(Reporter: dougc, Assigned: dougc)
References
Details
Attachments
(2 files, 12 obsolete files)
4.56 KB,
patch
|
luke
:
review+
|
Details | Diff | Splinter Review |
27.76 KB,
patch
|
dougc
:
review+
|
Details | Diff | Splinter Review |
With the fast path from Odin to Ion code added in bug 860838 it might be the case that floating point register d15 is not being restored after returning to Odin. This might interact with bug 893553 if it changes the calling conventions for Ion and Odin code. It should be possible to test for such an issue, and tests need to be developed.
Assignee | ||
Updated•11 years ago
|
Summary: Odinmonkey (ARM): check that d15, the NANReg, is being restored is necessary in FFI calls. → Odinmonkey (ARM): check that d15, the NANReg, is being restored if necessary in FFI calls.
Assignee | ||
Comment 1•11 years ago
|
||
Assignee | ||
Comment 2•11 years ago
|
||
Sorry, mistyped a bug number above. This might interact with bug 893552.
Comment 3•10 years ago
|
||
Note that bug 937944 changed the NANReg from d15 to d14.
Summary: Odinmonkey (ARM): check that d15, the NANReg, is being restored if necessary in FFI calls. → Odinmonkey (ARM): check that d14, the NANReg, is being restored if necessary in FFI calls.
Assignee | ||
Comment 4•10 years ago
|
||
Assignee: general → dtc-moz
Attachment #782412 -
Attachment is obsolete: true
Assignee | ||
Comment 5•10 years ago
|
||
Attachment #8391972 -
Attachment is obsolete: true
Assignee | ||
Comment 7•10 years ago
|
||
Optimize the stack usage further: * Allocate only one stack frame for GenerateFFIIonExit and GenerateOOLConvert, saving a little stack manipulation. * Keep the stack aligned, allowing the ABI calls to enable and disable the activation to avoid dynamic stack alignment. Extended setupAlignedABICall to handle this usage. This also avoided another push/pop pair. Also inline setActive() which is called from EnableActivation and DisableActivation. These are hot functions and are down to a half dozen instructions now, ripe to be inlined. Is it worth taking time to try and inline these now or are there other plans for the activations? Tested on jit-tests and some of the big asm.js demos. Shall do some more testing.
Attachment #8391973 -
Attachment is obsolete: true
Attachment #8395359 -
Flags: feedback?(luke)
Assignee | ||
Comment 8•10 years ago
|
||
* Fix non-unified builds. * Avoid a couple more instructions on the x64 and ARM. * Timing results for a micro-benchmark (run time with patch / run time without patch): x64: 70% x86: 65% ARM: 84% * Here's a perf profiling result (x64) for the microbenchmark. The result looks a little odd but suggests that Enable/DisableActivationFromAsmJS are still hot. 44.04% js js [.] DisableActivationFromAsmJS(js::AsmJSActivation*) 36.99% js perf-5107.map [.] 0x00000000f77170f7 17.73% js js [.] EnableActivationFromAsmJS(js::AsmJSActivation*) * Here's Enable/DisableActivationFromAsmJS (ARM) that are not inlined, but it looks possible. I suggest landing the current improvement first. EnableActivationFromAsmJS: 0x00073b6c <+0>: ldr r3, [r0, #0] 0x00073b6e <+2>: movs r1, #1 0x00073b70 <+4>: ldr r2, [r3, #0] 0x00073b72 <+6>: ldr r2, [r2, #104] ; 0x68 0x00073b74 <+8>: strb.w r1, [r2, #33] ; 0x21 0x00073b78 <+12>: ldr r1, [r3, #0] 0x00073b7a <+14>: ldr r1, [r1, #92] ; 0x5c 0x00073b7c <+16>: str r1, [r2, #24] 0x00073b7e <+18>: ldr r1, [r3, #0] 0x00073b80 <+20>: ldr r1, [r1, #96] ; 0x60 0x00073b82 <+22>: str r1, [r2, #28] 0x00073b84 <+24>: ldr r2, [r3, #0] 0x00073b86 <+26>: str r3, [r2, #96] ; 0x60 0x00073b88 <+28>: bx lr DisableActivationFromAsmJS: 0x00073b50 <+0>: ldr r3, [r0, #0] 0x00073b52 <+2>: movs r1, #0 0x00073b54 <+4>: ldr r2, [r3, #0] 0x00073b56 <+6>: ldr r2, [r2, #104] ; 0x68 0x00073b58 <+8>: strb.w r1, [r2, #33] ; 0x21 0x00073b5c <+12>: ldr r0, [r2, #24] 0x00073b5e <+14>: ldr r1, [r3, #0] 0x00073b60 <+16>: ldr r2, [r2, #28] 0x00073b62 <+18>: str r0, [r1, #92] ; 0x5c 0x00073b64 <+20>: ldr r3, [r3, #0] 0x00073b66 <+22>: str r2, [r3, #96] ; 0x60 0x00073b68 <+24>: bx lr
Attachment #8395359 -
Attachment is obsolete: true
Attachment #8395359 -
Flags: feedback?(luke)
Comment 9•10 years ago
|
||
Nice! Is the patch ready for me to review? On a side note: there is a bunch of duplication in GenerateFFIInterpreterExit that it looks like could be factored out to be more like GenerateFFIIonExit ... if by any chance you were interested :)
Assignee | ||
Comment 10•10 years ago
|
||
This patch just cleans up duplication in GenerateFFIInterpreterExit, as requested in comment 9, and should otherwise be neutral.
Attachment #8396358 -
Flags: review?(luke)
Comment 11•10 years ago
|
||
Comment on attachment 8396358 [details] [diff] [review] 2. Refactor GenerateFFIInterpreterExit. \m/
Attachment #8396358 -
Flags: review?(luke) → review+
Assignee | ||
Comment 12•10 years ago
|
||
Comment on attachment 8395561 [details] [diff] [review] 1. Optimize asm.js FFI calls to Ion functions trimming the registers saved on the stack and trim stack maniplation. Testing look good. Note GenerateOOLConvert no longer allocates its own stack frame, but rather re-uses the stack frame already set up. There is a frame check in GenerateOOLConvert but it's a bit rough.
Attachment #8395561 -
Attachment description: Optimize asm.js FFI calls to Ion functions trimming the registers saved on the stack and trim stack maniplation. → 1. Optimize asm.js FFI calls to Ion functions trimming the registers saved on the stack and trim stack maniplation.
Attachment #8395561 -
Flags: review?(luke)
Comment 13•10 years ago
|
||
Comment on attachment 8395561 [details] [diff] [review] 1. Optimize asm.js FFI calls to Ion functions trimming the registers saved on the stack and trim stack maniplation. Review of attachment 8395561 [details] [diff] [review]: ----------------------------------------------------------------- This is looking great; very excited to have these improvements. A few opening comments/questions: ::: js/src/jit/AsmJS.cpp @@ +6526,4 @@ > unsigned argBytes = 3 * sizeof(size_t) + (1 + exit.sig().args().length()) * sizeof(Value); > unsigned extraBytes = 0; > #if defined(JS_CODEGEN_ARM) > extraBytes += sizeof(size_t); I was looking at this code earlier and I was curious about extraBytes. It'd be nice to comment what this is and why we can't just masm.Push something so that sizeof(size_t) is already in masm.framePushed(). @@ +6529,5 @@ > extraBytes += sizeof(size_t); > #endif > + > + unsigned alreadyPushed = AlignmentAtPrologue + masm.framePushed(); > + unsigned stackDec = AlignBytes(alreadyPushed + argBytes + extraBytes, StackAlignment) - alreadyPushed; I'd like to use StackDecrementForCall as much as possible. Is the reason you unfolded the definition here to avoid the emptyVector? I guess StackArgBytes(argTypes) also adds in the ShadowStackSpace; is that the issue? @@ +6540,4 @@ > > // 2. Callee > Register callee = ABIArgGenerator::NonArgReturnVolatileReg0; > + Register scratch = ABIArgGenerator::NonVolatileReg; If a non-volatile register is used as scratch, won't that clobber a register of the caller? @@ +6543,5 @@ > + Register scratch = ABIArgGenerator::NonVolatileReg; > +#if defined(JS_CODEGEN_X64) || defined(JS_CODEGEN_ARM) > + Register arg0 = IntArgReg0; > +#else > + Register arg0 = callee; This is rather confusing at this point in the code. (It's not clear what arg0 means or why it's not IntArgReg0 on ARM.) It'd be nice to push this down closer to the use so it made more sense or have some comments to explain why the ARM variance. @@ +6598,5 @@ > + AssertStackAlignment(masm); > + LoadAsmJSActivationIntoRegister(masm, arg0); > + masm.setupAlignedABICall(1, AlignmentAtPrologue); > + masm.passABIArg(arg0); > + masm.callWithABI(AsmJSImm_EnableActivationFromAsmJS); In addition to making this path way faster, it seems like it'd be a nice simplification to remove these calls by inlining EnableActivation/DisableActivation (see bug 886411). ::: js/src/jit/arm/Assembler-arm.h @@ +79,5 @@ > ABIArg ¤t() { return current_; } > uint32_t stackBytesConsumedSoFar() const { return stackOffset_; } > static const Register NonArgReturnVolatileReg0; > static const Register NonArgReturnVolatileReg1; > + static const Register NonVolatileReg; I don't see this defined on non-ARM.
Assignee | ||
Comment 14•10 years ago
|
||
(In reply to Luke Wagner [:luke] from comment #13) > Comment on attachment 8395561 [details] [diff] [review] > 1. Optimize asm.js FFI calls to Ion functions trimming the registers saved > on the stack and trim stack maniplation. > > Review of attachment 8395561 [details] [diff] [review]: > ----------------------------------------------------------------- > > This is looking great; very excited to have these improvements. A few > opening comments/questions: Thank you for the quick feedback. I'll make another pass over this to try and address the feedback, but here's a few quick answers. > > ::: js/src/jit/AsmJS.cpp > @@ +6526,4 @@ > > unsigned argBytes = 3 * sizeof(size_t) + (1 + exit.sig().args().length()) * sizeof(Value); > > unsigned extraBytes = 0; > > #if defined(JS_CODEGEN_ARM) > > extraBytes += sizeof(size_t); > > I was looking at this code earlier and I was curious about extraBytes. It'd > be nice to comment what this is and why we can't just masm.Push something so > that sizeof(size_t) is already in masm.framePushed(). Shall do. The extraBytes here for the ARM is a word on the stack into which the return PC will be stored when making the call into Ion. The ARM Ion frames reserve a slot for the return PC and the caller stores it there when making the call. The x86 and x64 have a different convention as the return PC is pushed as part of the call instruction. ARM Odin has a different convention again. > @@ +6529,5 @@ > > extraBytes += sizeof(size_t); > > #endif > > + > > + unsigned alreadyPushed = AlignmentAtPrologue + masm.framePushed(); > > + unsigned stackDec = AlignBytes(alreadyPushed + argBytes + extraBytes, StackAlignment) - alreadyPushed; > > I'd like to use StackDecrementForCall as much as possible. Is the reason > you unfolded the definition here to avoid the emptyVector? I guess > StackArgBytes(argTypes) also adds in the ShadowStackSpace; is that the issue? While it is possible to thread this through StackDecrementForCall, it makes the code less readable for me because this is a call into Ion, not into Odin, and the conventions are different - I find myself mentally unrolling this just to understand what the net effect is. The ShadowStackSpace appears to be for Windows x64. Forgot about this and will need to check. It's not relevant for the ARM paths. > @@ +6540,4 @@ > > > > // 2. Callee > > Register callee = ABIArgGenerator::NonArgReturnVolatileReg0; > > + Register scratch = ABIArgGenerator::NonVolatileReg; > > If a non-volatile register is used as scratch, won't that clobber a register > of the caller? The caller is Odin code and it does not yet take advantage of non-volatile registers across calls and does not expect them to be preserved. This code will be rewritten when the calls are inlined anyway. > @@ +6543,5 @@ > > + Register scratch = ABIArgGenerator::NonVolatileReg; > > +#if defined(JS_CODEGEN_X64) || defined(JS_CODEGEN_ARM) > > + Register arg0 = IntArgReg0; > > +#else > > + Register arg0 = callee; > > This is rather confusing at this point in the code. (It's not clear what > arg0 means or why it's not IntArgReg0 on ARM.) It'd be nice to push this > down closer to the use so it made more sense or have some comments to > explain why the ARM variance. Shall look into moving this and documenting it. The issue is that the x64 and ARM can pass the argument in a register so it saves a move to load the value directly into this register. > @@ +6598,5 @@ > > + AssertStackAlignment(masm); > > + LoadAsmJSActivationIntoRegister(masm, arg0); > > + masm.setupAlignedABICall(1, AlignmentAtPrologue); > > + masm.passABIArg(arg0); > > + masm.callWithABI(AsmJSImm_EnableActivationFromAsmJS); > > In addition to making this path way faster, it seems like it'd be a nice > simplification to remove these calls by inlining > EnableActivation/DisableActivation (see bug 886411). This does look possible, and I'll look into it. > ::: js/src/jit/arm/Assembler-arm.h > @@ +79,5 @@ > > ABIArg ¤t() { return current_; } > > uint32_t stackBytesConsumedSoFar() const { return stackOffset_; } > > static const Register NonArgReturnVolatileReg0; > > static const Register NonArgReturnVolatileReg1; > > + static const Register NonVolatileReg; > > I don't see this defined on non-ARM. See Assembler-x64.cpp, and Assembler-x86.cpp
Comment 15•10 years ago
|
||
(In reply to Douglas Crosher [:dougc] from comment #14) Thanks for all the explanations! > While it is possible to thread this through StackDecrementForCall, it makes > the code less readable for me because this is a call into Ion, not into > Odin, and the conventions are different - I find myself mentally unrolling > this just to understand what the net effect is. Yeah, I guess you're right. > > I don't see this defined on non-ARM. > > See Assembler-x64.cpp, and Assembler-x86.cpp Ah, those changes seem not to have made it into the posted patch (or at least into bugzilla's diff viewer :).
Assignee | ||
Comment 16•10 years ago
|
||
Just rebasing after bug 988475.
Attachment #8395561 -
Attachment is obsolete: true
Attachment #8395561 -
Flags: review?(luke)
Assignee | ||
Comment 17•10 years ago
|
||
Inline EnableActivationFromAsmJS() and DisableActivationFromAsmJS(). * Timing results for a micro-benchmark show further improvement (run time with patch / run time without patch): x64: 61% x86: 31% ARM: 67% The patch passes the asm.js jit-tests on the x86, x64 and ARM. Needs some more testing, and cleaning up, and source comments.
Attachment #8398460 -
Attachment is obsolete: true
Assignee | ||
Comment 18•10 years ago
|
||
Clean up. Retested ARM and Linux x86 and x64. TODO check Windows 64 shadow stack usage and test on the Windows builds.
Attachment #8401833 -
Attachment is obsolete: true
Comment 20•10 years ago
|
||
Comment on attachment 8402702 [details] [diff] [review] 1. Optimize asm.js FFI calls to Ion functions, trim regs saved, inline activation and deactivation. Review of attachment 8402702 [details] [diff] [review]: ----------------------------------------------------------------- That is awesome! Can you add a comment to the JitActivation::active() function that the logic is inlined in odinmonkey. So that any change to that function should get reflected in AsmJS.cpp?
Assignee | ||
Comment 21•10 years ago
|
||
Thank you pointing out the shadow stack, it turned out to be quite an omission in prior patches. Jit-tests succeed on all platforms now, except win64 builds which are timing out but win64 works locally, so I'll request review. https://tbpl.mozilla.org/?tree=Try&rev=a935b239286c https://tbpl.mozilla.org/?tree=Try&rev=01a551423d18 The set of registers allocated is rather complex, and it might be hard to read. There are source comments, but feedback on this would be particularly welcomed.
Attachment #8402702 -
Attachment is obsolete: true
Attachment #8405329 -
Flags: review?(luke)
Comment 22•10 years ago
|
||
Comment on attachment 8405329 [details] [diff] [review] 1. Optimize asm.js FFI calls to Ion functions, trim regs saved, inline activation and deactivation. Review of attachment 8405329 [details] [diff] [review]: ----------------------------------------------------------------- This is looking great! Thanks for all the work. I'd like one more quick pass over this patch before r+. ::: js/src/jit/AsmJS.cpp @@ +6545,2 @@ > unsigned argBytes = 3 * sizeof(size_t) + (1 + exit.sig().args().length()) * sizeof(Value); > + // The extraBytes accounts for the stack difference when the call into Ion is made. \n before this comment please @@ +6550,1 @@ > unsigned extraBytes = 0; Could we name this 'bytesBeforeArgs'? Also, I didn't quite understand this comment until reading ma_callIonNoPush and seeing what was going on. Perhaps the comment could read more like: "On ARM, we call with ma_callIonNoPush which, following the Ion calling convention, stores the return address into *sp. This means we need to include an extra word of space before the arguments in the stack allocation. (On x86/x64, the call instruction does the push itself and the ABI just requires us to be aligned before the call instruction.)" @@ +6550,5 @@ > unsigned extraBytes = 0; > #if defined(JS_CODEGEN_ARM) > extraBytes += sizeof(size_t); > #endif > + // Can you convert this // into an empty \n line @@ +6556,5 @@ > + // The Ion frame does not use the system ABI for passing arguments and does not need > + // to allocate the shadow stack used on win64. > + unsigned alreadyPushed = AlignmentAtPrologue + masm.framePushed(); > + unsigned stackDec = AlignBytes(alreadyPushed + argBytes + extraBytes, StackAlignment) > + - alreadyPushed; Could you split StackDecrementForCall into: static unsigned StackDecrementForCall(MacroAssembler &masm, unsigned argBytes, unsigned extraBytes) { // the code you have above } and then change the existing StackDecrementForCall to just return StackDecrementForCall(masm, StackArgBytes(argTypes), extraBytes); ? @@ +6567,5 @@ > + callArgTypes.infallibleAppend(typeArray, ArrayLength(typeArray)); > + unsigned OOLStackDec = StackDecrementForCall(masm, callArgTypes, sizeof(Value)); > + > + // Allocate a frame large enough for both of the above calls. > + stackDec = Max(stackDec, OOLStackDec); Rather than mutating stackDec, could you name the first definition 'stackDecForIonCall' (and perhaps rename OOLStackDec to 'stackDecForOOLCall' to keep with the locals-have-lowercase-names convention)? @@ +6596,5 @@ > // 2.2. Get callee > masm.loadPtr(Address(callee, offsetof(AsmJSModule::ExitDatum, fun)), callee); > > // 2.3. Save callee > + masm.storePtr(callee, Address(StackPointer, extraBytes + sizeof(size_t))); Pre-existing, but since you are tweaking already: these manual stack offset computations seem a bit error prone. What about instead having one variable declared up above: size_t argOffset = byteBeforeArgs; and then each storePtr to Address(StackPointer, argOffset) is followed by a argOffset += sizeof(size_t) (or sizeof(Value)) then we can remove the 'offsetToArgs' computation below, and pass argOffset to FillArgumentsArray directly. Finally, we can have: argOffset += exit.sig().args().length() * sizeof(Value); JS_ASSERT(argOffset == extraBytes + argBytes); @@ +6624,5 @@ > masm.branchIfFunctionHasNoScript(callee, &ionFailed); > #endif > > masm.loadPtr(Address(callee, JSFunction::offsetOfNativeOrScript()), scratch); > masm.loadBaselineOrIonNoArgCheck(scratch, scratch, SequentialExecution, maybeDebugBreakpoint); IIUC, callee is dead after this loadPtr, so you could use 'callee' here and below instead of 'scratch'. Name-wise, that'd be nice since 'callee' would be the actual callee. @@ +6633,5 @@ > + // > + // This sequence requires four registers, and needs to preserve the 'scratch' > + // register, so there are five live registers. The 'scratch' register is currently > + // allocated to: ARM r6; x64 r13; x86 ebx. The four registers chosen for this > + // sequence are: ARM r0, r1, r2, r3; x64 rax, rdi, rbx, rsi; x86 edi, eax, ecx, edx. One idea for how to make this constraint more explicit is to add an AsmJSFFIIonExitRegCallee so that way, in all the Assembler-* files it is easy to see that all the named registers are different without needing any of these comments. Instead, we'd just have: JS_ASSERT(callee == AsmJSFFIIonExitRegCallee); @@ +6642,5 @@ > + > + LoadAsmJSActivationIntoRegister(masm, reg0); > + > + // Load the JSContext from the AsmJSActivation > + // JSContext *cx = activation->cx(); For blocks of assembly like this, mostly I just want to scan over the code and see what how registers are used and the interleaved comments makes it a bit harder to follow. Also, the masm function names mostly are clear enough w/o comments. Could you instead have one comment up front that perhaps is just a copy of the C++ code that is being effectively inlined below? @@ +6650,5 @@ > + masm.loadPtr(Address(reg3, JSContext::offsetOfRuntime()), reg0); > + // Load the Activation from the JSContext mainThread(). > + // Activation *act = cx->mainThread().activation(); > + size_t offsetOfActivation = offsetof(JSRuntime, mainThread) + > + PerThreadData::offsetOfActivation(); Could you indent 'PerThreadData' with the 'offsetof' above (here and below x2)? @@ +6678,1 @@ > masm.push(scratch); Instead of doing this, what if we just left the stack unaligned (and moved the AssertStackAlignment below into the #else branch)? Then instead we could have: // ma_callIonNoPush will store the return address into *sp and, according to the Ion ABI, // the Ion callee will pop this value when returning. Thus, on return from Ion, the stack // is unaligned and incremented by sizeof(void*). masm.ma_callIonNoPush(scratch); masm.adjustFrame(-sizeof(void*)); stackDec -= sizeof(void*); which would shave off one insn from Ion calls on ARM. @@ +6688,5 @@ > + // This sequence needs three registers, and must preserve the JSReturnReg_Data (ARM > + // r2, x64 rcx, x86 ecx), and JSReturnReg_Type (ARM r3, x64 rcx, x86 edx). The three > + // registers chosen are: ARM r0, r1, r4; x64 rax, rdi, rbx; x86 edi, eax, esi; The > + // definitions of reg0 and reg1 defined above are reused. > + Register reg4 = AsmJSFFIIonExitReg4; To capture this constraints in the code, how about we have a separate set of registers (one for enable activation, one for disable) named AsmJSActivateIonReg* and AsmJSDeactiveIonReg*, resp. Then we can have: JS_ASSERT(AsmJSDeactivateIonRegReturnData == JSReturnReg_Data); JS_ASSERT(AsmJSDeactivateIonRegReturnType == JSReturnReg_Type); Register reg0 = AsmJSDeactivateIonReg0; Also, could you wrap both the activate/deactivate blocks in { } which will avoid the declaration conflict for 'reg0' but also visually set off these blocks. @@ +6750,2 @@ > masm.ret(); > +#endif Can you JS_ASSERT(masm.framePushed() == 0); ? ::: js/src/jit/arm/Assembler-arm.h @@ +97,5 @@ > static MOZ_CONSTEXPR_VAR FloatRegister ScratchFloatReg = { FloatRegisters::d15 }; > > static MOZ_CONSTEXPR_VAR FloatRegister NANReg = { FloatRegisters::d14 }; > > +// Registers used in GenerateFFIIonExit. Can you add "None of these may be secondScratchRegister (lr)."
Attachment #8405329 -
Flags: review?(luke)
Assignee | ||
Comment 23•10 years ago
|
||
Thank you for the great feedback. This has improved readability a lot. All suggestions have been applied, except the removal of the stack push on return from the Ion call on the ARM (to keep the stack aligned) although it has been replaced by a subtraction instruction. It does appear possible to remove this stack adjustment instruction, but it would need to be added back to the OOL paths because the stack needs to be aligned when making the OOL path ABI calls. This could add a good deal of complexity for little gain. Do you want this done? A try run looks ok. Win64 debug failed to build, but it looks like an infrastructure issue: https://tbpl.mozilla.org/?tree=Try&rev=cfa316f52428
Attachment #8405329 -
Attachment is obsolete: true
Attachment #8405825 -
Flags: review?(luke)
Comment 24•10 years ago
|
||
(In reply to Douglas Crosher [:dougc] from comment #23) > Created attachment 8405825 [details] [diff] [review] > 1. Optimize asm.js FFI calls to Ion functions, trim regs saved, inline > activation and deactivation. > > Thank you for the great feedback. This has improved readability a lot. I agree. Awesome work. Could it be you overlooked comment 20? Would it still be possible to add a comment at http://dxr.mozilla.org/mozilla-central/source/js/src/vm/Stack.cpp#1520 That this is inlined in AsmJS.cpp?
Assignee | ||
Comment 25•10 years ago
|
||
Address comment 20 which was overlook, sorry. Adds a comment to setActive() to note that changes need to be reflected in GenerateFFIIonExit().
Attachment #8405825 -
Attachment is obsolete: true
Attachment #8405825 -
Flags: review?(luke)
Attachment #8405850 -
Flags: review?(luke)
Comment 26•10 years ago
|
||
Comment on attachment 8405850 [details] [diff] [review] 1. Optimize asm.js FFI calls to Ion functions, trim regs saved, inline activation and deactivation. Review of attachment 8405850 [details] [diff] [review]: ----------------------------------------------------------------- Beautiful, thanks! ::: js/src/jit/AsmJS.cpp @@ +6034,5 @@ > return iter.stackBytesConsumedSoFar(); > } > > +static unsigned > +StackDecrementForCall(MacroAssembler &masm, unsigned argBytes, unsigned extraBytes = 0) Nit on my own suggestion: since there is no real distinction between argBytes vs. extraBytes in this function, having them separate raises the unnecessary question of what bytes should go where. Can this function instead take a single "bytesToPush" argument (such that callers handle the addition)? @@ +6545,5 @@ > + // > + // Arguments to the Ion function are in the following order on the stack: > + // descriptor | callee | argc | this | arg1 | arg2 | ... > + // > + // Reserve and align space for the arguments. This last line of comment doesn't seem to belong anymore, can you remove it and the preceding \n? @@ +6651,5 @@ > + LoadAsmJSActivationIntoRegister(masm, reg0); > + > + // The following is inlined: > + // JSContext *cx = activation->cx(); > + // Activation *act = cx->mainThread().activation(); Can you add: act->active_ = true; ::: js/src/jit/x86/Assembler-x86.h @@ +91,5 @@ > +static MOZ_CONSTEXPR_VAR Register AsmJSFFIIonExitRegCallee = ecx; > +static MOZ_CONSTEXPR_VAR Register AsmJSFFIIonExitRegE0 = edi; > +static MOZ_CONSTEXPR_VAR Register AsmJSFFIIonExitRegE1 = eax; > +static MOZ_CONSTEXPR_VAR Register AsmJSFFIIonExitRegE2 = ebx; > +static MOZ_CONSTEXPR_VAR Register AsmJSFFIIonExitRegE3 = edx; Nit on my own previously-suggested name: JSFFII boy that's a lot of capital letters together. I think AsmJSIonExitReg* would be just as clear and less ugly. ::: js/src/vm/Stack.h @@ +1340,5 @@ > + } > + static size_t offsetOfPrevJitJSContext() { > + return offsetof(JitActivation, prevJitJSContext_); > + } > + static size_t offsetOfActive() { Could you rename this to offsetOfActiveUint8() and JS_STATIC_ASSERT(sizeof(active_) == 8) in the body?
Attachment #8405850 -
Flags: review?(luke) → review+
Comment 27•10 years ago
|
||
Hannes: could setActive be changed to only contain "active_ = active" and none of the rest? That is, it seems like the rest could be done in the JitActivation ctor/dtor since they aren't going to change while inside asm.js code. Is there some use of JitActivation that depends on these being null? It seems like such uses should instead be guarded by isActive().
Assignee | ||
Comment 28•10 years ago
|
||
Address reviewer feedback. Carrying forward r+.
Attachment #8405850 -
Attachment is obsolete: true
Attachment #8406888 -
Flags: review+
Assignee | ||
Updated•10 years ago
|
Keywords: checkin-needed
Comment 29•10 years ago
|
||
(In reply to Luke Wagner [:luke] from comment #27) > Hannes: could setActive be changed to only contain "active_ = active" and > none of the rest? That is, it seems like the rest could be done in the > JitActivation ctor/dtor since they aren't going to change while inside > asm.js code. Is there some use of JitActivation that depends on these being > null? It seems like such uses should instead be guarded by isActive(). 1) Theoretical: yes. My initial version did this. (Onlye active_ = active and cx()->mainThread().ionTop = asJit()->prevIonTop(); when active = false;) 2) Practical: no. See bug 886255. So doing this created crashes which are really very deep and hard to debug. Eventually I adjusted setActive to be more like the ctor/dtor of JitActivation, which fixed the issue. One brave man might try to remove them again. But be aware it is an error prone and should be done carefully to not introduce crashes. (I assume that some of the fields get overwritten or something, or get changed during execution)
Comment 30•10 years ago
|
||
(In reply to Hannes Verschore [:h4writer] from comment #29) Hmm, creepy. I'm a bit confused how bug 886255 hit this, though since the URL doesn't appear to run any asm.js. Perhaps there was a different/related bug that got fixed as part of the overall work?
Comment 31•10 years ago
|
||
(In reply to Luke Wagner [:luke] from comment #30) > (In reply to Hannes Verschore [:h4writer] from comment #29) > Hmm, creepy. I'm a bit confused how bug 886255 hit this, though since the > URL doesn't appear to run any asm.js. Perhaps there was a different/related > bug that got fixed as part of the overall work? I think I fixed one additional thing. If I am still able to reproduce. I can possible see if the additional thing fixed it. If that is the case. I'm willing to try to remove the extra code again...
Flags: needinfo?(hv1989)
Comment 32•10 years ago
|
||
Ah, that seems likely. In particular, setActive() is only called for asm.js FFI Ion calls so it seems like any change in behavior in setActive() couldn't possibly affect the site in bug 886255.
Comment 33•10 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/3ccbb670a699 https://hg.mozilla.org/integration/mozilla-inbound/rev/37e7cae3d8c8
Keywords: checkin-needed
Comment 34•10 years ago
|
||
Backed out for Windows debug bustage. https://hg.mozilla.org/integration/mozilla-inbound/rev/6e5d5104b3ea https://tbpl.mozilla.org/php/getParsedLog.php?id=37854779&tree=Mozilla-Inbound
Assignee | ||
Comment 35•10 years ago
|
||
Fix build failure on Windows, sorry. The compiler does not accept sizeof(slot), so use sizeof(bool). Also the failing test should have been using a static assert. Carrying forward r+. Successful try run on Windows: https://tbpl.mozilla.org/?tree=Try&rev=99a5029f34a8
Attachment #8406888 -
Attachment is obsolete: true
Attachment #8407532 -
Flags: review+
Updated•10 years ago
|
Whiteboard: checkin-needed
Comment 36•10 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/28d75105f7c8 https://hg.mozilla.org/integration/mozilla-inbound/rev/d1b66744e023
Whiteboard: checkin-needed
Comment 37•10 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/28d75105f7c8 https://hg.mozilla.org/mozilla-central/rev/d1b66744e023
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla31
Comment 38•10 years ago
|
||
(In reply to Luke Wagner [:luke] from comment #32) > Ah, that seems likely. In particular, setActive() is only called for asm.js > FFI Ion calls so it seems like any change in behavior in setActive() > couldn't possibly affect the site in bug 886255. Okay, I was able to reproduce the crash. (Thanks github for history of the website). And I can officially say the bug in bug 886255 was not fixed by the setActive changes. I tried to bisect the real issue, but I only have a range, since firefox didn't want to build in those revisions. So the good news is that there is no reason anymore against making setActive leaner. As in only retain the active=true/false part.
Flags: needinfo?(hv1989)
Comment 39•10 years ago
|
||
Thanks for the detective work Hannes!
You need to log in
before you can comment on or make changes to this bug.
Description
•