Bug 1277973 (Closed) - Opened 7 years ago, Closed 7 years ago
Baldr: set Activation::fp in GenerateTrapStub so faults have backtraces
Categories: Core :: JavaScript Engine (defect)

Tracking
Status: RESOLVED FIXED
Target Milestone: mozilla52

Tracking | Status
---|---
firefox52 | fixed
People
(Reporter: luke, Assigned: luke)
References
Details
Attachments
(8 files)
22.43 KB, patch (bbouvier: review+)
5.28 KB, patch (bbouvier: review+)
768 bytes, patch (bbouvier: review+)
16.92 KB, patch (bbouvier: review+)
16.78 KB, patch (bbouvier: review+)
101.91 KB, patch (bbouvier: review+)
4.78 KB, patch (bbouvier: review+)
370.02 KB, patch (bbouvier: review+)
Currently we just auto-align the stack and call out to C++, but this leaves Activation.fp null, which means that, when we try to build a backtrace for the error, we lose all the wasm frames in the activation. The trick is that we need a static frame depth so we can simply store (fp + frameDepth). It'd be good to assert this when emitting the masm.jump(JumpTarget). I think we might also be able to convert JumpTarget::StackOverflow to a Trap, since it also has a static frame depth.
Comment 1 • 7 years ago (Assignee)
D'oh, of course there isn't one static frame-depth: these JumpTargets are shared by all functions. So I think what we need here is a per-(JumpTarget, function) thunk that bumps sp to point to the frame before jumping to the real target. This is what CodeGenerator::generateWasm is already doing for stack overflow, so the fix here is just to generalize that.
Summary: Baldr: set Activation::fp in GenerateTrapStub → Baldr: set Activation::fp in GenerateTrapStub so faults have backtraces
Comment 2 • 7 years ago (Assignee)
What this means is that the jumps to JumpTargets can be plain jumps (no patching); it's just the jumps in the thunks that need to go "far".
Comment 3 • 7 years ago (Assignee)
Note, there are two additional related cases where we'll fail to have exitFP set:
- calls to JumpTarget::BadIndirectCall via entries in a Table
- calls to HandleTrap() in the C++ impl of wasm ops
Comment 4 • 7 years ago (Assignee)
This prep patch removes 'throw' from the list of traps and makes it a separate kind of stub that is linked using plain labels (since it is only used by stubs created during finishCodegen).
Comment 5 • 7 years ago (Assignee)
This patch switches profiling prologues to use a symbolic address, rather than the WasmTlsReg, to load the WasmActivation*. This is necessary since trap stubs will (in the last patch) use profiling prologues/epilogues, and trap stubs do (want to) have WasmTlsReg available. The downside is that prologues get a bit bigger (which is why I used WasmTlsReg initially). Another annoyance is that loading a SymbolicAddress on ARM has different codegen when movw/movt are not available (it uses an immediate in a pool), and this breaks the static code offsets necessary for stack walking. For now, the patch just disables wasm if movw/movt aren't available, since I expect this is a diminishing number of devices; if it ever mattered we could do something more complicated, but I'd like to see if anyone actually notices first.
Attachment #8799196 - Flags: review?(bbouvier)

Comment 6 • 7 years ago (Assignee)
Unrelated: I happened to notice that the SignalUsage-removal patch forgot to add in a dynamic check for gc::SystemPageSize > wasm::PageSize (which we assume).
Attachment #8799197 - Flags: review?(bbouvier)

Comment 7 • 7 years ago (Assignee)
This simple patch generalizes CallSiteAndTarget, in preparation for the next patch (which needs to add a new case), to use a proper enum instead of the current NOT_DEFINITION.
Attachment #8799198 - Flags: review?(bbouvier)

Comment 8 • 7 years ago (Assignee)
This patch simply renames the "thunks" that were added earlier to "far jumps". "Thunk" is a pretty overloaded term and these methods just emit jumps, so the new name is more accurate and makes the last patch read better.
Attachment #8799199 - Flags: review?(bbouvier)

Comment 9 • 7 years ago (Assignee)
This simple patch moves MWasmMemoryAccess out of MIR.h and into Assembler-shared.h, where it is named wasm::MemoryAccessDesc, and makes the load/store MIR nodes *contain*, rather than *derive from*, wasm::MemoryAccessDesc, which makes things a bit nicer in a few ways.
Attachment #8799200 - Flags: review?(bbouvier)

Comment 10 • 7 years ago (Assignee)
This patch fixes the ARM simulator so that f64 loads/stores issue a single 8-byte load/store rather than two 4-byte loads/stores. Without this patch, the fault-handling emulation gets confused in the case where an f64 memory access partially overlaps the guard page because one access will be "in" and one will be "out".
Attachment #8799201 - Flags: review?(bbouvier)

Comment 11 • 7 years ago (Assignee)
Here's the final patch, which actually makes the fix. In addition to setting fp, the patch also reports the trapping instruction's offset, which is what required most of the work; most of the patch is just plumbing the trap information from wasm decoding into MIR and then into masm, which is fortunately pretty mechanical.

The basic approach taken is to have a 3-step path for a trap:
- the "trap site": the inline hot path which contains the branch
- the "out-of-line trap path": jumped to by the trap site; goes at the very end of each function, 1 per trap site so it can record the trapping bytecode
- the "trap exit": the shared stubs that go at the end of the module and are jumped to by all same-kind out-of-line trap paths

Initially I was going to have the out-of-line trap path just set some registers before jumping to the trap exit. But in practice I ended up duplicating all this CallSite logic, so I found it much simpler to just have the out-of-line trap path make a proper call (with CallSite) to the trap exit, which then uses a plain profiling prologue/epilogue so it is picked up without any extra work by the stack iterators.

The only tricky thing was memory accesses, due to signal handling. Fundamentally, precise trap info requires storing a MemoryAccess for each load/store pc which conveys the bytecode (whereas before, wasm loads/stores had no metadata). Considering different options, the simplest thing I found was to give MemoryAccess a 'trapOutOfLineCode' field to which the signal handler redirects the pc on fault. This keeps the signal handler pretty simple and the rest of the trap path orthogonal to whether the trap was reached via branch vs. signal handler.
Attachment #8799547 - Flags: review?(bbouvier)

Updated • 7 years ago
Attachment #8799195 - Flags: review?(bbouvier) → review+

Updated • 7 years ago
Attachment #8799197 - Flags: review?(bbouvier) → review+

Updated • 7 years ago
Attachment #8799196 - Flags: review?(bbouvier) → review+

Updated • 7 years ago
Attachment #8799198 - Flags: review?(bbouvier) → review+
Comment 12 • 7 years ago
Comment on attachment 8799199 [details] [diff] [review]
rename-thunk-to-far-jump

Review of attachment 8799199 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for making a separate patch for this.

::: js/src/asmjs/WasmGenerator.cpp
@@ +304,5 @@
> offsets.end = masm_.currentOffset();
>
> if (!metadata_->codeRanges.emplaceBack(CodeRange::CallThunk, offsets))
>     return false;
> + if (!metadata_->callThunks.emplaceBack(jumpOffset, cs.funcDefIndex()))

I was wondering whether it'd make sense to rename callThunks to callFarJumps, but it seems that if we did so, we'd have to rename jumpThunks to jumpFarJumps, which is silly; so keeping callThunks / jumpThunks sounds good.

::: js/src/jit/MacroAssembler.h
@@ +509,5 @@
> + // other uint32_t offset without using a constant pool (thus returning a
> + // simple CodeOffset instead of a CodeOffsetJump).
> + CodeOffset farJumpWithPatch() PER_SHARED_ARCH;
> + void patchFarJump(CodeOffset farJump, uint32_t targetOffset) PER_SHARED_ARCH;
> + static void repatchFarJump(uint8_t* code, uint32_t farJumpOffset, uint32_t targetOffset) PER_SHARED_ARCH;

For consistency with the previous one, we *could* have farJumpOffset be a CodeOffset too (and be renamed to farJump), but I see how verbose it would be for the single caller.

::: js/src/jit/arm/MacroAssembler-arm.cpp
@@ +5034,5 @@
>     return u32Offset;
> }
>
> void
> +MacroAssembler::patchFarJump(CodeOffset u32Offset, uint32_t targetOffset)

Maybe rename u32Offset to farJump?

@@ +5048,5 @@
>     *u32 = (targetOffset - addOffset) - 8;
> }
>
> void
> +MacroAssembler::repatchFarJump(uint8_t* code, uint32_t u32Offset, uint32_t targetOffset)

(and maybe u32Offset to farJumpOffset?)

::: js/src/jit/mips-shared/MacroAssembler-mips-shared.cpp
@@ +1513,5 @@
>     return u32Offset;
> }
>
> void
> +MacroAssembler::patchFarJump(CodeOffset u32Offset, uint32_t targetOffset)

ditto

@@ +1522,4 @@
> }
>
> void
> +MacroAssembler::repatchFarJump(uint8_t* code, uint32_t u32Offset, uint32_t targetOffset)

and ditto
Attachment #8799199 - Flags: review?(bbouvier) → review+

Comment 13 • 7 years ago
Comment on attachment 8799200 [details] [diff] [review]
hoist-memory-access-desc

Review of attachment 8799200 [details] [diff] [review]:
-----------------------------------------------------------------

Much cleaner, thank you.

::: js/src/jit/shared/Assembler-shared.h
@@ +740,5 @@
> +
> +    void clearOffset() { offset_ = 0; }
> +};
> +
> +

nit: two blank lines
Attachment #8799200 - Flags: review?(bbouvier) → review+

Comment 14 • 7 years ago
Comment on attachment 8799201 [details] [diff] [review]
fix-simulator

Review of attachment 8799201 [details] [diff] [review]:
-----------------------------------------------------------------

Curious that ARM wants 8 bytes loads/stores to be aligned on 4 bytes boundary.
Attachment #8799201 - Flags: review?(bbouvier) → review+

Comment 15 • 7 years ago
(In reply to Benjamin Bouvier [:bbouvier] from comment #14) > Comment on attachment 8799201 [details] [diff] [review] > fix-simulator > > Review of attachment 8799201 [details] [diff] [review]: > ----------------------------------------------------------------- > > Curious that ARM wants 8 bytes loads/stores to be aligned on 4 bytes > boundary. Doubleword accesses are defined on early ARMv7 to be two single-word accesses, ie, doubleword loads and stores are not access-atomic always. (LDREXD/STREXD are different, as are 8-byte loads and stores on AArch64.) On ARMv7 that supports the Large Physical Address Extension, and presumably on ARMv8, LDRD/STRD are access-atomic *provided* the address is 8-byte aligned.
Comment 16 • 7 years ago
Comment on attachment 8799547 [details] [diff] [review]
fix-callstack

Review of attachment 8799547 [details] [diff] [review]:
-----------------------------------------------------------------

Looking very clean in general, great comments and good testing! Thanks for doing this. I think there's going to be a strange behavior for ARMv7 unaligned accesses, see below.

::: js/src/asmjs/WasmFrameIterator.cpp
@@ +632,2 @@
> // When the pc is inside the prologue/epilogue, the innermost
> // call's AsmJSFrame is not complete and thus fp points to the the

pre-existing: the the

::: js/src/asmjs/WasmGenerator.cpp
@@ +277,2 @@
>
> + OffsetMap callFarJumps;

I liked the previous name a bit better because of the "already" that made it more descriptive imo. Totally ok to remove thunk, though.

::: js/src/asmjs/WasmSignalHandlers.cpp
@@ -1154,5 @@
> - return HugeMemoryAccess(context, pc, faultingAddress, *instance, ppc);
> -#elif defined(JS_CODEGEN_ARM)
> - MOZ_RELEASE_ASSERT(signal == Signal::BusError || signal == Signal::SegFault);
> - if (signal == Signal::BusError)
> -     *ppc = instance->codeSegment().unalignedAccessCode();

Well, this is weird: on ARMv7, even where we allow dynamic unaligned accesses, this will report out-of-bounds instead of unaligned access, which we should handle correctly at some point. Shouldn't we revert this change?

::: js/src/jit-test/tests/asm.js/testAtomics.js
@@ +8,5 @@
> load(libdir + "asserts.js");
>
> setJitCompilerOption('asmjs.atomics.enable', 1);
>
> +const RuntimeError = WebAssembly.RuntimeError;

For the record, noting down my thoughts here: this is safe because asm.js has the same compiler support preconditions as wasm. Also, atomics/simd.js error reporting becomes conditional on wasm being enabled; that is safe, because all of these are nightly only at the moment. In the future, wasm will likely be enabled *before* atomics/simd.js (if they ever get into JS), so it will be safe too.

::: js/src/jit-test/tests/wasm/errors.js
@@ +150,5 @@
> +for (var i = 1; i < N; i++) {
> +    assertEq(isWasmFunction(stack[i].name), true);
> +    assertEq(stack[i].line > lastLine, true);
> +    lastLine = stack[i].line;
> +    assertEq(binary[stack[i].line], i == 3 ? CallIndirectCode : CallCode);

Sweet!

::: js/src/jit-test/tests/wasm/profiling.js
@@ +38,5 @@
> +    for (let j = 1; j < array.length; j++) {
> +        if (array[i] !== array[j])
> +            array[++i] = array[j];
> +    }
> +    array.length = i + 1;

or using the shiny ES2015 features,

function removeDuplicates(array) {
    return Array.from(new Set(array));
}

::: js/src/jit/CodeGenerator.cpp
@@ +11657,5 @@
> void
> CodeGenerator::visitWasmTrap(LWasmTrap* lir)
> {
>     MOZ_ASSERT(gen->compilingAsmJS());
> +    masm.jump(wasm::TrapDesc(lir->mir()->trapOffset(), lir->mir()->trap(), masm.framePushed()));

maybe put lir->mir() in a local?

::: js/src/jit/CodeGenerator.h
@@ +65,5 @@
> ~CodeGenerator();
>
> public:
>     MOZ_MUST_USE bool generate();
> +    MOZ_MUST_USE bool generateWasm(wasm::SigIdDesc sigId, wasm::TrapOffset trapOffset, wasm::FuncOffsets *offsets);

nit: more than 100 chars, please split in two lines

::: js/src/jit/MCallOptimize.cpp
@@ +3869,5 @@
> if (sign == SimdSign::NotApplicable)
>     sign = GetSimdSign(toType);
>
> // Possibly expand into multiple instructions.
> + ins = MSimdConvert::AddLegalized(alloc(), current, arg, mirType, sign, wasm::TrapOffset());

trapOffset = wasm::TrapOffset() could be the default value in the static method's signature

::: js/src/jit/MacroAssembler.cpp
@@ +2869,5 @@
> + // To solve both problems at once, the out-of-line path (far) jumps
> + // directly to the trap exit stub. This takes advantage of the fact
> + // that there is already a CallSite for call_indirect and the
> + // current stack/register state already match the trap exit's
> + // assumptions.

Nice comments in this entire function, thanks!

@@ +2891,5 @@
> + // exit's frame so that unwinding begins at the frame and offset of
> + // the trapping instruction.
> + wasm::CallSiteDesc desc(site.bytecodeOffset, wasm::CallSiteDesc::TrapExit);
> + call(desc, site.trap);
> + }

I like the idea below of having the breakpoint. In DEBUG mode, could we add a breakpoint() right here too?

::: js/src/jit/arm/CodeGenerator-arm.cpp
@@ +552,5 @@
> masm.as_cmp(rhs, Imm8(0));
> if (mir->canTruncateInfinities()) {
>     if (mir->trapOnError()) {
> +        wasm::TrapDesc trap(mir->trapOffset(), wasm::Trap::IntegerDivideByZero,
> +                            masm.framePushed());

I liked what you've done for the WasmBaselineCompiler; random idea: could we also have a CodeGenerator::trap() method, maybe looking like this, to make call sites less verbose?

template <class T>
wasm::TrapDesc trap(T* mir, wasm::Trap trap) {
    return wasm::TrapDesc(mir->trapOffset(), trap, masm.framePushed());
}

In some cases, when the MIR only has one associated wasm::Trap, it could have a wasm::Trap trap() getter (which would have the nice side effect of propagating less information in all the CodeGenerator* files), and then we could have a variant like this:

template <class T>
wasm::TrapDesc trap(T* mir) {
    return trap(mir, mir->trap());
}

And a call site would look like this in the worst case:

masm.ma_b(trap(mir, wasm::Trap::IntegerDivideByZero), Assembler::Equal);

Or like this in the best case:

masm.ma_b(trap(mir), Assembler::Equal);

(suppressing many braces as a bonus)

@@ +2398,5 @@
> if (type == Scalar::Int64) {
>     MOZ_ASSERT(INT64LOW_OFFSET == 0);
> +
> +    load = masm.ma_dataTransferN(IsLoad, 32, /* signed = */ false, HeapReg, ptr, output.low);
> +    masm.append(mir->access(), load.getOffset(), masm.framePushed());

Not sure why we need those here (and on x86 too)? Is it because of the guard page optimization for small offsets? (If so, we might need them for the unaligned loads/stores too.)

If this is because of the guard page optimization used to catch accesses with small offsets out of bounds, could we append them conditionally iff the access' offset (not trap offset) is not 0, to reduce the amount of MemoryAccesses? (Probably in the append method itself.) On !HUGE_WASM_MEMORY platforms, the bounds checks have their own TrapSite and will throw before the access is actually run.

::: js/src/jit/shared/Assembler-shared.h
@@ +771,5 @@
> +// the trap out-of-line path.
> +
> +struct TrapDesc : TrapOffset
> +{
> +    enum Kind { Jump, MemoryAccess };

enum class?

::: js/src/jit/x86-shared/CodeGenerator-x86-shared.cpp
@@ +2471,5 @@
> Register temp = ToRegister(ins->temp());
>
> masm.convertFloat32x4ToInt32x4(in, out);
>
> + auto *ool = new(alloc()) OutOfLineSimdFloatToIntCheck(temp, in, ins, ins->mir()->trapOffset());

pre-existing: auto* ool
Attachment #8799547 - Flags: review?(bbouvier) → review+

Comment 17 • 7 years ago
(In reply to Lars T Hansen [:lth] from comment #15)
> (In reply to Benjamin Bouvier [:bbouvier] from comment #14)
> > Curious that ARM wants 8 bytes loads/stores to be aligned on 4 bytes
> > boundary.
>
> Doubleword accesses are defined on early ARMv7 to be two single-word
> accesses, ie, doubleword loads and stores are not access-atomic always.
> (LDREXD/STREXD are different, as are 8-byte loads and stores on AArch64.)

Luke, this seems to suggest that the issue you were seeing in the simulator could actually happen in practice with a real device. Also, what happens with unaligned loads/stores, which do several accesses to simulate an atomic one?
Flags: needinfo?(luke)
Comment 18 • 7 years ago (Assignee)
(In reply to Benjamin Bouvier [:bbouvier] from comment #17)
> Luke, this seems to suggest that the issue you were seeing in the simulator
> could actually happen in practice with a real device.

So the failure here was more to do with the simulator's inner workings: we perform the first loadW, it faults, redirects the pc, then we perform the second, it faults, tries to look up the (now redirected) pc, finds no MemoryAccess, and then uses the generic out-of-bounds throw path (which provides no callstack, and thus failed jit-tests/tests/wasm/errors.js). So the attempted fix in fix-simulator was really just preventing that; I suppose instead I could have a "did I fault?" test in between the first and second loadW.

> Also, what happens with unaligned loads/stores, which do several accesses
> to simulate an atomic one?

Right, I forgot to mention that: since those don't create MemoryAccesses for each loading/storing instruction, they will get the generic out-of-bounds path that won't provide a stack trace. I had meant to say that should be a follow-up fix to add MemoryAccesses and add special tests for them. Sounds like it depends on bug 1308996.
Flags: needinfo?(luke)
Comment 19 • 7 years ago (Assignee)
(In reply to Benjamin Bouvier [:bbouvier] from comment #16)

Great points!

> ::: js/src/jit-test/tests/asm.js/testAtomics.js
> For the record, noting down my thoughts here: [...]
> In the future, wasm will likely be enabled *before* atomics/simd.js (if
> they ever get into JS), so it will be safe too.

Agreed on the overall reasoning, but I'd slightly refine that to say that Atomics will most definitely get into *JS*, just not *asmjs*. Atomics will continue to throw whatever JS says; but wasm will mostly have its own atomic operations (either by new barrier bits in existing load/store's flags immediate or new ops for cmpxchg et al) which will have trapping (and thus RuntimeError) semantics.

> or using the shiny ES2015 features,
>
> function removeDuplicates(array) {
>     return Array.from(new Set(array));
> }

Ah, cool technique. Unfortunately, we only want to remove *adjacent* duplicates, not all duplicates. I'll rename to removeAdjacentDuplicates, though.

> I liked what you've done for the WasmBaselineCompiler; random idea: could we
> also have a CodeGenerator::trap() method, maybe looking like this, to make
> call sites less verbose?

This suggestion in particular really made things cleaner, thanks!

> Not sure why we need those here (and on x86 too)? Is it because of the guard
> page optimization for small offsets? (If so, we might need them for the
> unaligned loads/stores too.)

Right and right.

> If this is because of the guard page optimization used to catch accesses
> with small offsets out of bounds, could we append them conditionally iff the
> access' offset (not trap offset) is not 0, to reduce the amount of
> MemoryAccesses?

Unfortunately, there's the unaligned+partially-oob case too.

> > +struct TrapDesc : TrapOffset
> > +{
> > +    enum Kind { Jump, MemoryAccess };
>
> enum class?

For nested enums, where the containing struct already nicely scopes the names, I tend to not think the extra type-checking is worth the extra syntactic burden. I wish one could 'using Kind' within TrapDesc to inject the enumerators and have both...
Comment 20 • 7 years ago
Pushed by lwagner@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/e3f5d150e1ba
Baldr: make the throw stub not be a JumpTarget (r=bbouvier)
https://hg.mozilla.org/integration/mozilla-inbound/rev/9a51acfd0094
Baldr: replace WasmTlsReg load with SymbolicAddress load in prologue/epilogue (r=bbouvier)
https://hg.mozilla.org/integration/mozilla-inbound/rev/e6d2b67b35e5
Baldr: dynamically check system page size (r=bbouvier)
https://hg.mozilla.org/integration/mozilla-inbound/rev/e29daffaaca4
Baldr: prepare CallSiteAndTarget for another Kind (r=bbouvier)
https://hg.mozilla.org/integration/mozilla-inbound/rev/33b295d58244
Baldr: rename masm 'thunk' to 'farJump' (r=bbouvier)
https://hg.mozilla.org/integration/mozilla-inbound/rev/959f1e7b26fa
Baldr: hoist wasm::MemoryAccessDesc (r=bbouvier)
https://hg.mozilla.org/integration/mozilla-inbound/rev/6b8136cb9a4d
Baldr: avoid clobbering pc in ARM simulator (r=bbouvier)
https://hg.mozilla.org/integration/mozilla-inbound/rev/860ba5468626
Baldr: provide precise trap locations and stacks (r=bbouvier)
Comment 21 • 7 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/e3f5d150e1ba
https://hg.mozilla.org/mozilla-central/rev/9a51acfd0094
https://hg.mozilla.org/mozilla-central/rev/e6d2b67b35e5
https://hg.mozilla.org/mozilla-central/rev/e29daffaaca4
https://hg.mozilla.org/mozilla-central/rev/33b295d58244
https://hg.mozilla.org/mozilla-central/rev/959f1e7b26fa
https://hg.mozilla.org/mozilla-central/rev/6b8136cb9a4d
https://hg.mozilla.org/mozilla-central/rev/860ba5468626
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
status-firefox52:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla52
Updated • 7 years ago
status-firefox49:
affected → ---