wasm: implement signal handlers for unaligned floating-point accesses on ARM
Categories
(Core :: JavaScript: WebAssembly, defect, P2)
Tracking
()
Tracking | Status | |
---|---|---|
firefox50 | --- | affected |
People
(Reporter: bbouvier, Assigned: lth)
References
Details
Attachments
(2 files, 4 obsolete files)
2.00 KB,
patch
|
bbouvier
:
review+
|
Details | Diff | Splinter Review |
34.84 KB,
patch
|
lth
:
review+
|
Details | Diff | Splinter Review |
See https://bugzilla.mozilla.org/show_bug.cgi?id=1268024#c32 for more context. Emulating an integer memory access in the signal handler for unaligned access is not too hard; for float memory access, it's harder as we have no way to retrieve the content of the FPU from the signal handler without using ptrace, it seems. As a temporary workaround, we forbid compilation to everything below ARMv7 and assume ARMv7 does the right thing. There are modes for which v7 won't do the right thing too, so this needs to be fixed later.
Reporter | ||
Comment 1•8 years ago
|
||
Some documentation I've found and that might show useful: - http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka15414.html - http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0388e/CIHEHDHJ.html (look at bit 1) - http://www.heyrick.co.uk/armwiki/Unaligned_data_access - the entire comment there: https://android.googlesource.com/platform/bionic/+/c124baa/libc/arch-arm/include/machine/ucontext.h#86
Reporter | ||
Comment 2•8 years ago
|
||
Some notes from discussion on IRC with sunfish: 15:08 <sunfish> bbouvier: my notes from last time I looked into this indicate that armv6 also has a mode where it traps on alignment 15:09 <sunfish> bbouvier: so it'd only be armv5 and armv6/7 in legacy mode that we wouldn't support 15:10 <sunfish> which i think we're totally ok with 15:16 <sunfish> oh, never mind. there are three levels here. 15:17 <sunfish> armv5 silently masks off the low bits, so it's totally unusable 15:17 <sunfish> armv6 introduced trapping on misaligned accesses, which is theoretically supportable, but is the thing we're talking about 15:18 <sunfish> and then there's the-hardware-takes-care-of-it 15:18 <sunfish> or, the-os-takes-care-of-it I've checked on the rpi: by default, the kernel takes care of it for integer memory accesses (/proc/cpu/alignment set to 2).
Updated•8 years ago
|
Assignee | ||
Comment 3•7 years ago
|
||
Test case: This causes wasm/spec/memory_redundancy.wast to fail (the program has unaligned f32 loads and stores) on my dev board [1]. We'll need this for shipping on ARM, I expect. [1] Nvidia Jetson TK1 4xTegra @ 2.3GHz, Ubuntu 14.
Assignee | ||
Comment 4•7 years ago
|
||
Test case: This causes wasm/spec/float_memory.wast to iloop - the program has unaligned floating point accesses and the signal handler does not advance the PC when it handles the SIGBUS resulting from the unaligned access. Same system as comment 3.
Reporter | ||
Comment 6•7 years ago
|
||
This is something I've seen in the past: according to the mode you're using to handle unaligned accesses (at the kernel level), or whether this is the simulator or a real device, the program might segfault at the expected PC or at PC+/-4. I haven't clearly identified the actual reason for this behavior...
Assignee | ||
Comment 7•7 years ago
|
||
Test case (sort of): wasm/errors.js. This is similar to comment 4 and comment 5 but the FP accesses are all out of bounds, they don't need to be emulated but we need to invoke the OOB handler instead.
Assignee | ||
Comment 9•7 years ago
|
||
There's the worrisome possibility that /proc/cpu/alignment is set to 0, which means the kernel will quietly swallow the exception and return garbage on an unaligned access. https://www.kernel.org/doc/Documentation/arm/mem_alignment https://gcc.gnu.org/ml/gcc/2010-05/msg00142.html https://wiki.debian.org/ArmEabiFixes#word_accesses_must_be_aligned_to_a_multiple_of_their_size One might hope that, with newer hardware supporting unaligned accesses so that the kernel does not have to, this is not an issue, but we probably want to check this on startup -- not that that's really sufficient, I expect the value can change at any time.
Assignee | ||
Comment 10•7 years ago
|
||
Not a fix, but a stopgap solution that allows testing: throw on unaligned accesses that trap, do not hang.
Reporter | ||
Comment 11•7 years ago
|
||
Comment on attachment 8829347 [details] [diff] [review] bug1283121-throw-on-unaligned.patch Review of attachment 8829347 [details] [diff] [review]: ----------------------------------------------------------------- wow, very good catch. To make sure this never happens again, and since this exit and the interrupt/OOB ones are always generated, and none of these is the first one to be generated, can you please try to add an assertion in CodeSegment::create that the linkData.{interrupt,outOfBounds,unalignedAccess}Offset are not 0? (we have plenty of tests with empty module, so if at least one offset is 0, that will show up during testing)
Assignee | ||
Comment 12•7 years ago
|
||
(In reply to Benjamin Bouvier [:bbouvier] from comment #11) > Comment on attachment 8829347 [details] [diff] [review] > bug1283121-throw-on-unaligned.patch > > Review of attachment 8829347 [details] [diff] [review]: > ----------------------------------------------------------------- > > wow, very good catch. > > To make sure this never happens again, and since this exit and the > interrupt/OOB ones are always generated, and none of these is the first one > to be generated, can you please try to add an assertion in > CodeSegment::create that the > linkData.{interrupt,outOfBounds,unalignedAccess}Offset are not 0? (we have > plenty of tests with empty module, so if at least one offset is 0, that will > show up during testing) Sure thing.
Assignee | ||
Comment 13•7 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/3a9a465bd55ed2b5f39cd75be7e1e33cd07cc632 Bug 1283121 - Wasm ARM: Throw on trapping unaligned access, do not hang. r=bbouvier
Assignee | ||
Updated•7 years ago
|
Comment 14•7 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/3a9a465bd55e
Assignee | ||
Comment 15•7 years ago
|
||
I think we're not targeting Wasm on ARM in earnest until the next iteration? Also, I understand there's some discussion going on around how much effort we want to spend here. So P3 but useful not to lose sight of this.
Comment 16•6 years ago
|
||
Per policy at https://wiki.mozilla.org/Bug_Triage/Projects/Bug_Handling/Bug_Husbandry#Inactive_Bugs. If this bug is not an enhancement request or a bug not present in a supported release of Firefox, then it may be reopened.
Reporter | ||
Updated•6 years ago
|
Assignee | ||
Updated•6 years ago
|
Assignee | ||
Comment 17•5 years ago
|
||
The Google Pixel 2 traps on unaligned FP accesses in ARMv7 mode, see bug 1517351 comment 15 et seq. THat hardware is mainstream enough that we probably have to deal with this problem. Increasing priority.
Assignee | ||
Comment 18•5 years ago
•
|
||
ARM-32 failures on Pixel 2 and Moto G5:
wasm/spec/address.wast.js fails because of an unaligned f32 load
wasm/spec/call_indirect.wast.js fails because of an unaligned f64 store
wasm/spec/memory_redundancy.wast.js fails because of an unaligned f32 load
wasm/spec/float_memory.wast.js fails because of an unaligned f32 load
https://treeherder.mozilla.org/#/jobs?repo=try&revision=4188c079bd4a2c8f10236333f717c4752d654930
These tests are all currently disabled on ARM for this reason.
Assignee | ||
Comment 19•5 years ago
|
||
Fairly grubby code to emulate trapping FP loads. Completely untested, not even test compiled, but everything used here is part of the Linux ABI in principle, so this is not entirely insane.
Assignee | ||
Comment 20•5 years ago
|
||
This compiles but does not run (it segfaults w/o much information when it should have emulated the instruction). Hard to debug locally unless you have a device that traps on unaligned FP accesses...
Assignee | ||
Comment 21•5 years ago
|
||
Wohoo, I have a device that has the problem.
(This is a very mainstream system, BTW, a Samsung Galaxy Tab S2, so the problem with trapping unaligned FP accesses appears to be widespread in the wild.)
Assignee | ||
Comment 22•5 years ago
|
||
Also my LG/Google Nexus 5x with Android 8.1. I think at this point the inescapable conclusion is that this problem is common or possibly even the majority case.
Simple test case (temporary home): https://axis-of-eval.org/sandbox/oob-arm.html, runs quickly and the result is reported in the window.
Assignee | ||
Comment 23•5 years ago
|
||
Works locally on my Nexus 5X. Needs:
- cleanup & maybe generalization
- test cases
- testing on various android versions (the VFP_MAGIC may have changed at
various points in the past) - scrutiny
- maybe additional ifdeffery so that we don't expose ARM Linux to it, just
Android
Assignee | ||
Comment 24•5 years ago
|
||
This is green on the Pixel 2 on try with my new test cases and after reinstating the test cases that were disabled. (https://treeherder.mozilla.org/#/jobs?repo=try&revision=9b21e231d93f13c7abde89a02f1542cf6f082ec0)
Primary worries for reviews:
- Not enough guards / risk of double-faulting because I've forgotten to check some precondition. I'm pretty sure I got "this is a known wasm instance with a PC that is for sure valid and a memory address that will for sure be in-bound when I read the data" right, but is there more?
- Unnecessary kernel dependencies
- Overbroad kernel assumptions
- "Unknown unknowns"
Bob, can you put a couple of Moto G5s back in the pool for me to use? I'll want to run jittest-10 in both opt and debug mode, just to be sure.
Comment 25•5 years ago
|
||
Done, I am on pto today. If you can kick these off soon before I have to leave, we can make sure they are working properly. Otherwise I'll check in around 9AM PT.
Comment 26•5 years ago
|
||
Comment on attachment 9036554 [details] [diff] [review] bug1283121-emulate-fp-accesses-v4.patch Review of attachment 9036554 [details] [diff] [review]: ----------------------------------------------------------------- r+ if at least |pc| is asserted to be 0 % 4. ::: js/src/wasm/WasmInstance.cpp @@ +1143,5 @@ > lastByteOffset < memoryMappedSize(); > } > > +bool Instance::memoryAccessInBounds(uint8_t* addr, > + unsigned numBytes) const { Does this correctly handle (reject) the case where [addr, +numBytes) wraps around the end of the address space? I asumme that a wasm memory itself can't wrap around the end of the address space. ::: js/src/wasm/WasmSignalHandlers.cpp @@ +573,5 @@ > +#if defined(__linux__) && defined(__arm__) > + // wasmLoadImpl() and wasmStoreImpl() in MacroAssembler-arm.cpp emit plain > + // unconditional VLDR and VSTR instructions that do not use the PC as the > + // base register. > + uint32_t instr = *(uint32_t*)pc; This appears to assume that |pc| refers to an ARM-encoded instruction. Please assert this; possibly release-assert so that we can't end up doing memory accesses on behalf of who-knows-what Thumb instruction by accident. @@ +580,5 @@ > + bool isVSTR = masked == 0x0D000A00; > + DebugOnly<bool> oob = true; > + if (isVLDR || isVSTR) { > + bool isUnconditional = (instr >> 28) == 0xE; > + MOZ_RELEASE_ASSERT(isUnconditional); To release-assert this (and below) is to imply that we can only get here as a result of a fault with a wasm-jit generated instruction. If we take a fault elsewhere in Gecko and end up in here, the release assert could cause the system to fail unnecessarily. So do we have that guarantee? If not, it would seem safer to simply ignore unexpected instructions.
Assignee | ||
Comment 27•5 years ago
|
||
(In reply to Julian Seward [:jseward] from comment #26)
Comment on attachment 9036554 [details] [diff] [review]
bug1283121-emulate-fp-accesses-v4.patchReview of attachment 9036554 [details] [diff] [review]:
r+ if at least |pc| is asserted to be 0 % 4.
::: js/src/wasm/WasmInstance.cpp
@@ +1143,5 @@lastByteOffset < memoryMappedSize();
}
+bool Instance::memoryAccessInBounds(uint8_t* addr,
unsigned numBytes) const {
Does this correctly handle (reject) the case where [addr, +numBytes) wraps
around the end of the address space? I asumme that a wasm memory itself
can't wrap around the end of the address space.
The wasm memory can't wrap but it's possible this method should be more paranoid. Will investigate.
::: js/src/wasm/WasmSignalHandlers.cpp
@@ +573,5 @@+#if defined(linux) && defined(arm)
- // wasmLoadImpl() and wasmStoreImpl() in MacroAssembler-arm.cpp emit plain
- // unconditional VLDR and VSTR instructions that do not use the PC as the
- // base register.
- uint32_t instr = (uint32_t)pc;
This appears to assume that |pc| refers to an ARM-encoded instruction.
Yes.
Please assert this; possibly release-assert so that we can't end up doing
memory accesses on behalf of who-knows-what Thumb instruction by accident.
I can assert pc % 4 == 0 at most and will do that, just because. For the instructions of interest, Arm and Thumb have the same encoding (the only difference is that Arm allows the instruction to be conditional), so further tests on arm-vs-thumb are not interesting.
@@ +580,5 @@
- bool isVSTR = masked == 0x0D000A00;
- DebugOnly<bool> oob = true;
- if (isVLDR || isVSTR) {
bool isUnconditional = (instr >> 28) == 0xE;
MOZ_RELEASE_ASSERT(isUnconditional);
To release-assert this (and below) is to imply that we can only get here as
a result of a fault with a wasm-jit generated instruction.
That is what the earlier guards are supposed to guarantee - that the faulting instruction is in a wasm code segment.
If we take a
fault elsewhere in Gecko and end up in here, the release assert could cause
the system to fail unnecessarily. So do we have that guarantee? If not, it
would seem safer to simply ignore unexpected instructions.
If we fail the earlier guards then we return false and the superstructure in this file will invoke the next handler.
Assignee | ||
Comment 28•5 years ago
|
||
Memo to self: in JSContext.cpp there is code that checks for certain Android versions (to disable the JIT in some cases). It might be possible to reduce the risk of going wrong in the instruction emulation by checking the OS version and only trying to emulate if the OS is new "enough".
Comment 29•5 years ago
|
||
Comment on attachment 9036554 [details] [diff] [review] bug1283121-emulate-fp-accesses-v4.patch Review of attachment 9036554 [details] [diff] [review]: ----------------------------------------------------------------- Really great work here; awesome to have this old hole finally fixed and verified on hardware. ::: js/src/jit/arm/MacroAssembler-arm.cpp @@ +5907,5 @@ > ScratchRegisterScope scratch(asMasm()); > ma_add(memoryBase, ptr, scratch); > > + // WasmSignalHandler.cpp depends on this being a single, unconditional > + // VLDR with a base pointer other than PC. How about "See HandleUnalignedTrap() in WasmSignalHandler.cpp" where we depend on ..."? (ditto below) ::: js/src/wasm/WasmSignalHandlers.cpp @@ +543,2 @@ > static MOZ_MUST_USE bool HandleTrap(CONTEXT* context, > + bool isSigbus = false, How about naming this isUnalignedSignal, making it an OS-specific question of whether a given signal means "unaligned". I'm thinking of ARM-on-Windows here (although maybe a non-issue b/c ARM64). @@ +569,5 @@ > trap == Trap::IndirectCallBadSig); > + > + // Handle trapping unaligned accesses. > + if (isSigbus) { > +#if defined(__linux__) && defined(__arm__) Can you hoist this whole #ifdef into a new HandleUnalignedTrap() function above in the above #ifdef (the one with ReadFPR32() et al). Can you also have an #else that defines HandleUnalignedTrap() to `return false` so that we can call HandleUnalignedTrap() from HandleTrap() without any #ifdefs? Also, I think it would be useful to also require `trap == Trap::OutOfBounds`? Actually, if `isUnalignedSignal && trap != Trap::OutOfBounds` I think we shouldn't even handle the signal at all. So that suggests the control structure: ``` if (isUnalignedSignal) { if (trap != Trap::OutOfBounds) return false; if (HandleUnalignedTrap()) return true; } ``` @@ +578,5 @@ > + uint32_t masked = instr & 0x0F300E00; > + bool isVLDR = masked == 0x0D100A00; > + bool isVSTR = masked == 0x0D000A00; > + DebugOnly<bool> oob = true; > + if (isVLDR || isVSTR) { It seems like we should be able to release assert it's one of these two since we know we are at a wasm-generated trap for an out-of-bounds. If it's false, that means we're generating a new form of OOB-trapping load/store and need to add a new pattern here. @@ +585,5 @@ > + bool isDouble = (instr & 0x00000100) != 0; > + bool isAdd = (instr & 0x00800000) != 0; > + uint32_t dBit = (instr >> 22) & 1; > + uint32_t offs = (instr & 0xFF) << 2; > + uint32_t rn = (instr >> 16) & 0xF; This is some nice Scandanavian Metal here ;) @@ +627,5 @@ > + } > + } > + } > +# ifdef DEBUG > + if (!oob) { With this hoisted into a function, I think you can kill `oob` and have the `!memoryAccessInBounds()` case `return false`. @@ +631,5 @@ > + if (!oob) { > + // If we get to this point, we did not understand the instruction or we > + // could not access the FP register. This is probably a bug / missing > + // feature or an incompatibility with the kernel. It is best to MOZ_CRASH > + // only in debug builds and throw an OOB error in release builds. I like this logic. I wonder if the __android_log_print() is necessary though: the MOZ_CRASH() will __android_log_print() its string before crashing and the instruction is asserted to be fine; the only way to reach here is that we're not able to find the f32/f64 register in WriteFPR*.
Comment 30•5 years ago
|
||
(In reply to Lars T Hansen [:lth] from comment #24)
- Not enough guards / risk of double-faulting because I've forgotten to check some precondition. I'm pretty sure I got "this is a known wasm instance with a PC that is for sure valid and a memory address that will for sure be in-bound when I read the data" right, but is there more?
That seems sufficient to me. Note that double-faulting is uniformly and conservatively handled by sAlreadyHandlingTrap; if you have a bug here it'll just end in a proper crash report.
Assignee | ||
Comment 31•5 years ago
|
||
(In reply to Luke Wagner [:luke] from comment #29)
Comment on attachment 9036554 [details] [diff] [review]
bug1283121-emulate-fp-accesses-v4.patch
Great feedback, thanks. Some followup questions:
@@ +578,5 @@
- uint32_t masked = instr & 0x0F300E00;
- bool isVLDR = masked == 0x0D100A00;
- bool isVSTR = masked == 0x0D000A00;
- DebugOnly<bool> oob = true;
- if (isVLDR || isVSTR) {
It seems like we should be able to release assert it's one of these two
since we know we are at a wasm-generated trap for an out-of-bounds. If it's
false, that means we're generating a new form of OOB-trapping load/store and
need to add a new pattern here.
That is a little risky. We know there may be device configurations that also trap on unaligned integer loads and stores, even if we haven't seen any, and by my later comment I think it would be better to just throw a range error if we encounter one of those.
@@ +631,5 @@
- if (!oob) {
// If we get to this point, we did not understand the instruction or we
// could not access the FP register. This is probably a bug / missing
// feature or an incompatibility with the kernel. It is best to MOZ_CRASH
// only in debug builds and throw an OOB error in release builds.
I like this logic. I wonder if the __android_log_print() is necessary
though: the MOZ_CRASH() will __android_log_print() its string before
crashing and the instruction is asserted to be fine; the only way to reach
here is that we're not able to find the f32/f64 register in WriteFPR*.
As above: it's here in case we encounter a fault from unaligned integer access.
But as you point out, having moved the code into a function makes it easier to create better logic around these things, and I think I see a good compromise here.
Comment 32•5 years ago
|
||
(In reply to Lars T Hansen [:lth] from comment #31)
It seems like we should be able to release assert it's one of these two
since we know we are at a wasm-generated trap for an out-of-bounds. If it's
false, that means we're generating a new form of OOB-trapping load/store and
need to add a new pattern here.That is a little risky. We know there may be device configurations that also trap on unaligned integer loads and stores, even if we haven't seen any, and by my later comment I think it would be better to just throw a range error if we encounter one of those.
Ah hah; I hadn't thought about trapping integer loads. You're right; it would be best if we trapped in those cases.
Assignee | ||
Comment 33•5 years ago
|
||
Updated after reviews. Green in opt and debug on pixel 2 and moto g5. Will land it after a more general try run.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=f7e0aca6faf936cf877722d0421acfbbdcbf5369
Bob, you should be able to reclaim the G5s now. Thanks!
Comment 35•5 years ago
|
||
Pushed by lhansen@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/1230184adda1 Handle SIGBUS for unaligned FP load/store on ARM Linux. r=luke, r=jseward
Comment 36•5 years ago
|
||
Do you know how this affects JS with typed arrays?
Assignee | ||
Comment 37•5 years ago
|
||
(In reply to Jan de Mooij [:jandem] from comment #36)
Do you know how this affects JS with typed arrays?
Should not affect JS at all. Float32Array and Float64Array are always aligned, and for DataView I'm pretty sure we're always careful to use memcpy for access. This is basically a consequence of wasm allowing unaligned accesses even when the instruction says the access is aligned, combined with the desire always to emit the best possible code for accesses that say they are aligned. (See commit msg for more, or lmk if you need more information.)
Comment 38•5 years ago
|
||
(In reply to Lars T Hansen [:lth] from comment #37)
Float32Array and Float64Array are always aligned, and for DataView I'm pretty sure we're always careful to use memcpy for access.
Ah right. I thought TypedArray views could be unaligned but you're right. Great!
Comment 39•5 years ago
|
||
bugherder |
Assignee | ||
Updated•5 years ago
|
Description
•