1432345 - Add index masking for 32-bit wasm loads and stores

Assignee

Description

•

8 years ago

On 64-bit, wasm Memory reserves a 6gb guard page and there are no bounds checks so Spectre is not a problem. On 32-bit ARM, conditional instructions are used instead of branching, and it doesn't appear these are speculated yet. However, x32 bounds checks require mitigations.

Luke Wagner [:luke]

Assignee

Comment 1

•

8 years ago

Attached patch spectre-wasm32 (obsolete) — Details — Splinter Review

This patch uses the masm and Ion primitives added by bug 1430602. I tried to also add index masking to rabldr, but masm.spectreMaskIndex() requires a separate register it can write into and this is rather difficult to do from within prepareMemoryAccess() since, at least for some of the new atomic rmws, if I simply needI32() to get a temp, there are no free GPRs on x32. Maybe Lars you could help do the baseline patch?

Assignee: nobody → luke

Attachment #8944593 - Flags: review?(lhansen)

Attachment #8944593 - Flags: feedback?(jdemooij)

Lars T Hansen [:lth]

Comment 2

•

8 years ago

I'll take care of the baseline patch. (prepareMemoryAccess should not be allocating registers anyway.)

Lars T Hansen [:lth]

Comment 3

•

8 years ago

Comment on attachment 8944593 [details] [diff] [review] spectre-wasm32 Review of attachment 8944593 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/wasm/WasmIonCompile.cpp @@ +969,5 @@ > { > if (inDeadCode()) > return nullptr; > > + checkOffsetAndAlignmentAndBounds(access, SpectreMask::DontNeed, &base); DontNeed is almost certainly wrong here. On x86 it is /maybe/ good enough (and it might not be...) /if/ the back-end uses XCHG (and it might not...) but on architectures where exchange is implemented as a LL/SC loop we should worry about speculation through LL. Effectively, except where you can optimize with XCHG the exchange case is equivalent to the binop case. If we have to use a loop with cmpxchg the initial load will even be a plain load on most systems and we will speculate through that for sure.

Attachment #8944593 - Flags: review?(lhansen) → review+

Luke Wagner [:luke]

Assignee

Comment 4

•

8 years ago

(In reply to Lars T Hansen [:lth] from comment #3) > > + checkOffsetAndAlignmentAndBounds(access, SpectreMask::DontNeed, &base); > > DontNeed is almost certainly wrong here. Oops, for some silly reason I forgot Atomics.exchange() returned a value.

Jan de Mooij [:jandem]

Comment 5

•

8 years ago

Comment on attachment 8944593 [details] [diff] [review] spectre-wasm32 Review of attachment 8944593 [details] [diff] [review]: ----------------------------------------------------------------- Just skimmed the patch, but LGTM.

Attachment #8944593 - Flags: feedback?(jdemooij) → feedback+

Lars T Hansen [:lth]

Comment 6

•

8 years ago

Attached patch bug1432345-rabaldr-spectre-boundscheck.patch (obsolete) — Details — Splinter Review

Less terrible than I had feared. See commit msg for technical details. Jan, the interesting bit here for you is the change to the bounds check logic in prepareMemoryAccess, diff chunk marked "-3945,40".

Attachment #8947479 - Flags: review?(luke)

Attachment #8947479 - Flags: feedback?(jdemooij)

Luke Wagner [:luke]

Assignee

Comment 7

•

8 years ago

Comment on attachment 8947479 [details] [diff] [review] bug1432345-rabaldr-spectre-boundscheck.patch Review of attachment 8947479 [details] [diff] [review]: ----------------------------------------------------------------- Thanks! ::: js/src/wasm/WasmBaselineCompile.cpp @@ +3918,5 @@ > + bool free; > + RegI32 tmp; > + > + public: > + SpectreTemp() : free(false) {} So the callsite reads a bit more explicitly and for a superficial symmetry with the WasmIonCompile.cpp patch, how about instead having a 'static SpectreTemp DontNeed()' and the default ctor is private? @@ +3938,5 @@ > + bc->freeI32(tmp); > + } > + > + bool isValid() const { return tmp.isValid(); } > + RegI32 operator*() const { return tmp; } Could this assert !spectreIndexMasking() && tmp.isValid()? @@ +8245,5 @@ > + RegI32 r = regs.expectLow(); > + fr.pushPtr(r); > + SpectreTemp stemp(this, r); > + address = prepareAtomicMemoryAccess(&access, &check, tls, rp, stemp); > + fr.popPtr(r); Ah hah, this is the case that tripped me up when I naively tried this patch.

Attachment #8947479 - Flags: review?(luke) → review+

Lars T Hansen [:lth]

Comment 8

•

8 years ago

(In reply to Luke Wagner [:luke] from comment #7) > Comment on attachment 8947479 [details] [diff] [review] > bug1432345-rabaldr-spectre-boundscheck.patch > > Review of attachment 8947479 [details] [diff] [review]: > ----------------------------------------------------------------- > > @@ +3938,5 @@ > > + bc->freeI32(tmp); > > + } > > + > > + bool isValid() const { return tmp.isValid(); } > > + RegI32 operator*() const { return tmp; } > > Could this assert !spectreIndexMasking() && tmp.isValid()? Not as such, because spectreIndexMasking() is not definitive, we also need to know what the operation is. But we can simply assert isValid(), because we'll only deref the spectreTemp if we actually need it for codegen.

Jan de Mooij [:jandem]

Comment 9

•

8 years ago

Comment on attachment 8947479 [details] [diff] [review] bug1432345-rabaldr-spectre-boundscheck.patch Review of attachment 8947479 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/wasm/WasmBaselineCompile.cpp @@ +4012,5 @@ > + masm.wasmBoundsCheck(Assembler::AboveOrEqual, ptr, limit, oldTrap(Trap::OutOfBounds)); > + > + if (spectreTemp.isValid()) { > + masm.spectreMaskIndex(ptr, limit, *spectreTemp); > + masm.and32(*spectreTemp, ptr); As mentioned on IRC, spectreMaskIndex will do the and32 so you can use *spectreTemp or masm.move32(*spectreTemp, ptr).

Attachment #8947479 - Flags: feedback?(jdemooij) → feedback+

Luke Wagner [:luke]

Assignee

Comment 10

•

8 years ago

Probably also mentioned in irc but; we should switch this patch to using the cmov masm ops added by bug 1435209. jan mentioned for wasm we don't need to zero a reg and just cmov length to index because we have the guard page.

Nicolas B. Pierron [:nbp]

Updated

•

8 years ago

status-firefox60: --- → affected

Priority: -- → P1

Luke Wagner [:luke]

Assignee

Comment 11

•

8 years ago

Comment on attachment 8947479 [details] [diff] [review] bug1432345-rabaldr-spectre-boundscheck.patch (Bug 1435209 landed.) Jan pointed out that, for wasm, we could rely on the guard page being present and thus we won't need a temp register; we just cmov length to index if index > length. That would super-simplify this patch and remove the need for SpectreTemp also probably be a bit faster so could we do that instead?

Attachment #8947479 - Flags: review+

Luke Wagner [:luke]

Assignee

Comment 12

•

8 years ago

Actually, w/ the temp business out of the way, I'll write the rabaldr patch.

Lars T Hansen [:lth]

Comment 13

•

8 years ago

Yes, we should absolutely not introduce the SpectreTemp business if we go the CMOV route.

Luke Wagner [:luke]

Assignee

Comment 14

•

7 years ago

Attached patch spectre-wasm32 — Details — Splinter Review

Wow, the new approach is fantastic: it adds a single cmov and fits in ever-so-neatly with the existing masm.wasmBoundsCheck(), so this tiny patch covers both Ion and Rabaldr with zero changes to Rabaldr. A few notes: - I considered forking off Spectre variants of (M|L)WasmBoundsCheck to reflect the difference that the Spectre variants returned the saturated index, but there seems to be no problem at either the MIR or LIR level with having these single nodes do both. Jan: in particular, can you see any theoretical problem with LInstructionHelper<1,...> even though there is no output defined? - the WasmBCE changes maintain the asserted invariant that MWasmBoundsCheck's for which isRedundant()=true aren't codegen'd.

Attachment #8944593 - Attachment is obsolete: true

Attachment #8947479 - Attachment is obsolete: true

Attachment #8953645 - Flags: review?(jdemooij)

Luke Wagner [:luke]

Assignee

Comment 15

•

7 years ago

Oof: so I measured the perf impact on Bullet, Fasta, Box2D and Lua-BinaryTrees and all 4 showed a roughly 15% slowdown. I confirmed: - before patch had the same perf as after patch with --spectre-mitigations=off - the exact same number of MWasmBoundsChecks are being marked as redundant by WasmBCE I forgot to mention above: this patch mitigates both loads and stores.

Summary: Add index masking for 32-bit wasm loads → Add index masking for 32-bit wasm loads and stores

Lars T Hansen [:lth]

Comment 16

•

7 years ago

(In reply to Luke Wagner [:luke] from comment #14) > Created attachment 8953645 [details] [diff] [review] > spectre-wasm32 > > Wow, the new approach is fantastic: it adds a single cmov and fits in > ever-so-neatly with the existing masm.wasmBoundsCheck(), so this tiny patch > covers both Ion and Rabaldr with zero changes to Rabaldr. /me rejoices.

Jan de Mooij [:jandem]

Comment 17

•

7 years ago

Comment on attachment 8953645 [details] [diff] [review] spectre-wasm32 Review of attachment 8953645 [details] [diff] [review]: ----------------------------------------------------------------- Looks good!

Attachment #8953645 - Flags: review?(jdemooij) → review+

Jan de Mooij [:jandem]

Comment 18

•

7 years ago

(In reply to Luke Wagner [:luke] from comment #15) > Oof: so I measured the perf impact on Bullet, Fasta, Box2D and > Lua-BinaryTrees and all 4 showed a roughly 15% slowdown. Do you know if there's also a slowdown from the bounds check having its own def now? (In reply to Luke Wagner [:luke] from comment #14) > Jan: in particular, can you see any > theoretical problem with LInstructionHelper<1,...> even though there is no > output defined? That should be fine. It will default to LDefinition::BogusTemp() and that's ignored by regalloc.

Jan de Mooij [:jandem]

Comment 19

•

7 years ago

Oh do we need a follow-up for ARM64?

Jan de Mooij [:jandem]

Comment 20

•

7 years ago

(In reply to Jan de Mooij [:jandem] from comment #19) > Oh do we need a follow-up for ARM64? Nevermind, we only care about 32-bit. I thought I saw x86-shared in your patch but it's x86.

Pulsebot

Comment 21

•

7 years ago

Pushed by lwagner@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/bf401fe9c95c Baldr: add index masking for 32-bit wasm loads and stores (r=jandem)

Luke Wagner [:luke]

Assignee

Comment 22

•

7 years ago

Oops, I missed this: (In reply to Jan de Mooij [:jandem] from comment #18) > (In reply to Luke Wagner [:luke] from comment #15) > > Oof: so I measured the perf impact on Bullet, Fasta, Box2D and > > Lua-BinaryTrees and all 4 showed a roughly 15% slowdown. > > Do you know if there's also a slowdown from the bounds check having its own > def now? Great question! Comparing a pre-patch shell to a post-patch build that has the "cmovCCl()" commented out (so the same regalloc, just no cmov being emitted), they are almost the same (the only slowdown I could find was fasta, with 1%). Thus, it appears that the overhead is entirely from cmov.

Arthur Iakab [arthur_iakab]

Comment 23

•

7 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/bf401fe9c95c

Status: NEW → RESOLVED

Closed: 7 years ago

status-firefox60: affected → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla60

Pulsebot

Comment 24

•

7 years ago

Pushed by lwagner@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/19c4eedf4d8e Baldr: remove accidental testing printf (r=me)

Cristina Coroiu [:ccoroiu]

Comment 25

•

7 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/19c4eedf4d8e

Luke Wagner [:luke]

Assignee

Comment 26

•

7 years ago

awfy wasm 32-bit confirms an overall 15% slowdown. Looking at v8's 32-bit graphs, they show almost the exact same magnitude drop when they landed their spectre 32-bit cset (much earlier). To wit: - their cset uses a mask loaded from their tls array instead of cmov - SM perf *after* this patch matches v8 perf *before* their spectre patch - v8 apparently doesn't have 64-bit guard pages enabled and thus they took the same hit on 64-bit

spectre-wasm32 8 years ago Luke Wagner [:luke] 12.12 KB, patch	lth : review+ jandem : feedback+	Details \| Diff \| Splinter Review
bug1432345-rabaldr-spectre-boundscheck.patch 8 years ago Lars T Hansen [:lth] 17.07 KB, patch	jandem : feedback+	Details \| Diff \| Splinter Review
spectre-wasm32 7 years ago Luke Wagner [:luke] 10.31 KB, patch	jandem : review+	Details \| Diff \| Splinter Review