Open Bug 1666747 Opened 4 years ago Updated 2 years ago

Partially OOB unaligned stores on ARM, ARM64, x86(?) may write partial data at the end of the heap

Categories

(Core :: JavaScript: WebAssembly, defect, P3)

defect

Tracking

()

People

(Reporter: lth, Unassigned)

References

(Depends on 1 open bug, Regression)

Details

(Keywords: leave-open, regression)

Attachments

(1 file)

The background is bug 1609381 comment 26 et seq but also related are bug 1343981 and the long-neglected bug 1587757.

A partially OOB write must not write any data. On ARM64 the semantics are such that if unaligned accesses are enabled in the system control word then the data are written bytewise from low to high and can then perform a partial write; this violates wasm semantics.

We appear to have no good test cases for this. In jit-test/tests/wasm/memory.js we have tests for OOB stores that are in fact partially OOB, but we do not test the memory after the store to check that the bytes at the end of the memory are unaffected.

I ran into this problem with a SIMD test on an arm64 android device, where unaligned stores are apparently allowed without a trap and will write partial data, suggesting deeper problems in our test suite, jit, and simulators.

Our ARM64 and ARM emulators emulate all of this imperfectly by checking for OOB before writing anything (thus guaranteeing no partial OOB writes). After that, the behavior for unaligned writes diverges. On the ARM simulator there is an attempt to control for unaligned accesses but it looks half-implemented, for example, readQ and writeQ take an argument that controls alignment policy, but it is always ForbidUnaligned. On the ARM64 simulator, unaligned accesses appear to be allowed always. Neither is necessarily correct or comprehensive.

Blocks: 1609381

As expected, the new test cases that check against partial writes fail on the android device: https://treeherder.mozilla.org/#/jobs?repo=try&revision=e41ec46c354f79e901ce9df54cf4350abe9153a0&selectedTaskRun=Ke7hg_WeRSaKeUvKYl2FVA.0

Looking at the aarch64 semantics for the integer store instructions, they appear to be similarly affected. This is fairly bad, as the engine must assume every store is unaligned even if there's no alignment hint indicating that. Strategies to mitigate the problem would be: explicit bounds check; individual byte stores from high to low; or synthetic load of the highest byte to artificially generate a trap (we can use LDRB with an immediate and target the zero register or scratch). None of these are great, though the load is probably best, even if it brings the entire cache line into the cache in a way that is completely unnecessary. We would definitely want to rely on some kind of bounds checking machinery to avoid the check, but that machinery can't depend on the guard page trick working. We would not want to do it if the alignment checking is enabled in the hardware.

One thing I don't know is how widespread this problem is on devices, or whether it could be mitigated by flipping the control bit so that we could get alignment checking. Last I looked, the bit was not controllable or even readable from userspace.

Amusingly, the test also fails on 32-bit ARM. This is a little more surprising to me, since unaligned FP accesses must trap (and bug 1587757 is fallout from that). But unaligned integer accesses are still a problem, and may execute byte-at-a-time if the manual is to be believed (E2.4.4. implies this strongly, G2.3 subsection "aarch32/functions/memory", case MemU[] assignment form, which delegates to MemU_with_type[] assignment form, makes it pretty explicit). Indeed bug 1343981 was probably a result of observing that in a limited case but not seeing the whole picture.

What I will do here is factor the test cases as separate patches and also factor the SIMD test cases, and that will allow us a little more flexibility in when we run them, and allows us to track them and fixes properly.

Summary: Add tests that partial-OOB writes do not write any data → Partially OOB unaligned stores on ARM, ARM64 may write partial data at the end of the heap

Filed https://github.com/WebAssembly/spec/issues/1250, the wasm spec test suite appear to have no test cases for this either.

Factor partial-OOB-store checks out of the SIMD tests into a separate
file, and add similar non-SIMD tests in a separate file.

(The purpose of the separate file is to allow the tests to be disabled
on select architectures until the semantics are supported by the
engine.)

Thinking ahead to how this might be fixed:

  • it only affects plain multi-byte stores whose alignment is unknown; single-byte stores, known-aligned stores, and all loads are fine
  • if the store has an explicit bounds check that does not depend on the guard page trick then it is also fine
  • it only(?) is an issue on platforms with SCTLR.A = 0, and while we can't read that flag directly we may be able to get it out of /proc or registry or other data store, or we can run a test on startup that tests what happens
  • if SCTLR.A = 1 then vector stores can be implemented as two 8-byte integer stores, if we like (via integer registers)
  • if SCTLR.A = 0 then we can either use an explicit no-guard-page-trick bounds check or we can perform a read of the high byte of the datum or a read of the entire datum before the store to cause the trap if we can't guarantee alignment
  • if SCTLR.A = 1 then we will either depend on the kernel to do the fixup for us for unaligned accesses, or we must have trap handling of some kind to do the fixup, and this might itself be unpleasant (bug 1587757 et seq), so it might be in our best interest to simply pay for the bounds check in some form, see previous point
Attachment #9177864 - Attachment description: Bug 1666747 - Check for partial writes with partial-OOB stores → Bug 1666747 - Check for partial writes with partial-OOB stores. r?rhunt

When we fix this bug, the guards in the test case have to be removed.

Keywords: leave-open
Pushed by lhansen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/c028d53c18c7
Check for partial writes with partial-OOB stores. r=rhunt
Regressed by: 1669003
Has Regression Range: --- → yes
Keywords: regression

Turned out this was also a problem on x86-32 (!) and ARMv7 for 64-bit data, because the low-addressed word would be stored before the high-addressed word (bug 1669003). The MIPS32 port got it right.

Depends on: 1669179

Test cases were refined further in bug 1669003, bug 1669196.

A good place to start here would be to fix the ARMv7 and ARM64 simulators so that they actually behave in the same way as actual hardware. At the moment, tier-2 device test failures are increasingly a fact of life because the simulators are not accurate. Fixing the simulators might also entail running test cases on simulators both with trapping semantics and with untrapping semantics; not sure what's needed yet.

Most fun bug ever...

A number of Intel family CPUs do not implement copy-atomic unaligned writes (up to word size) that cross cache lines and page boundaries. The manual lists these models: "Intel Core 2 Duo, Intel Atom, Intel Core Duo, Pentium M, Pentium 4, Intel Xeon, P6 family, Pentium, and Intel486". Most of those are obsolete though "Xeon" is pretty generic and a current trademark. Not sure how much to worry about this.

However, "An x87 instruction or an SSE instruction that accesses data larger than a quadword may be implemented using multiple memory accesses. If such an instruction stores to memory, some of the accesses may complete (writing to memory) while another causes the operation to fault for architectural reasons (e.g. due an page-table entry that is marked “not present”). In this case, the effects of the completed accesses may be visible to software even though the overall instruction caused a fault." A quadword is the size of a double, so this might affect normal SIMD loads and could perhaps be demonstrated with a test case. There's no qualification about specific microarchitectures on the statement.

Summary: Partially OOB unaligned stores on ARM, ARM64 may write partial data at the end of the heap → Partially OOB unaligned stores on ARM, ARM64, x86 may write partial data at the end of the heap

I think any x86 issues (probably restricted to SIMD stores) are mostly theoretical and I'm unable to demonstrate any problems so far. That said, there's no definitive / authoritative prose that I've found that states that partial stores is not an issue on more recent systems. Links welcome.

Summary: Partially OOB unaligned stores on ARM, ARM64, x86 may write partial data at the end of the heap → Partially OOB unaligned stores on ARM, ARM64, x86(?) may write partial data at the end of the heap
Depends on: 1671065
Assignee: lhansen → nobody
Status: ASSIGNED → NEW

The leave-open keyword is there and there is no activity for 6 months.
:lth, maybe it's time to close this bug?

Flags: needinfo?(lhansen)
Flags: needinfo?(lhansen)
Priority: P3 → P5
See Also: → 1737225

Also see bug 1737225 comment 3 about ARMv8.4 and the Apple M1.

We'll run into this on pre-ARMv8.4 hardware the next time we pull the spec tests, see https://github.com/WebAssembly/spec/pull/1384.

Priority: P5 → P2
Priority: P2 → P3
Type: enhancement → defect

The leave-open keyword is there and there is no activity for 6 months.
:rhunt, maybe it's time to close this bug?
For more information, please visit auto_nag documentation.

Flags: needinfo?(rhunt)

We don't have a solution for this yet.

Flags: needinfo?(rhunt)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: