Closed Bug 1846474 Opened 1 years ago Closed 1 year ago

Add a mechanism to detect incorrectly placed TrapSites

Categories

(Core :: JavaScript: WebAssembly, enhancement, P1)

enhancement

Tracking

()

RESOLVED FIXED
120 Branch
Tracking Status
firefox120 --- fixed

People

(Reporter: jseward, Assigned: jseward)

References

Details

Attachments

(9 files, 6 obsolete files)

3.77 KB, patch
Details | Diff | Splinter Review
276.39 KB, patch
Details | Diff | Splinter Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review

Bug 1843631 and bug 1843991 are the result of having a TrapSite pointing at
the "wrong" instruction in a sequence. Hence when the intended target
instruction does take a fault (SEGFAULT, SIGILL, etc) it falls through our
signal catcher and takes out the whole process. Currently there's no way to
detect such anomalies.

This bug is to add such a mechanism. When creating a TrapSite, it is now
necessary to specify approximately what kind of instruction (UD, load, store,
load/store size, etc) it is expected to apply to. Later, when the module
compilation is completed, the instruction pointed to by each TrapSite is
checked, to ensure that it matches the description provided when the TrapSite
was created.

As a ridealong, upgrade the stackmap-placement-checker -- which uses
an analogous scheme to that proposed in comment 0 -- from MOZ_ASSERT
to MOZ_DIAGNOSTIC_ASSERT.

WIP patch. This is the first working implementation of the scheme.
Currently works for x86_64 only. Passes the wasm jit-tests.

Now with support for x86_32 and x86_64. This is able to detect the error
(incorrectly located trapsite) described in bug 1843631 comment 5.

Attachment #9346634 - Attachment is obsolete: true
Severity: -- → N/A
Priority: -- → P1

Now with support for x86_32, x86_64 and arm64. The arm64 code
detects two errors in jit_tests, plus errors when compiling four
large .wasm files (for autocad, clang, earth, photoshop).

Attachment #9346691 - Attachment is obsolete: true

Update of "as a ridealong, upgrade the stackmap-placement-checker".

Attachment #9346630 - Attachment is obsolete: true

This patch contains:

  • a complete implementation of TrapSite-placement checking

  • the necessary assembler plumbing fixes needed get correct "offset of first
    faulting instruction" values to the places that need them

  • use of all of the above to fix known invalid-TrapSite bug 1843991 and
    bug 1843631

However, only for x86_32, x86_64 and arm64. arm32 remains to be done.

Attachment #9348533 - Attachment is obsolete: true

Update to comment 6 patch, with full support for x86_32, x86_64, arm64 and arm32.
arm32 does not fully work with this applied, since the validation machinery detects
a TrapSite that refers to the branch at the start of a constant pool, and so asserts.
This needs to be debugged/fixed. The other 3 targets are believed to be correct.

Attachment #9349985 - Attachment is obsolete: true

Update to comment 7 patch; now runs clean on mach try auto.

Attachment #9350791 - Attachment is obsolete: true

(In reply to Julian Seward [:jseward] from comment #8)

Created attachment 9350996 [details] [diff] [review]
bug1846474-2-trapsite-checks-diag.diff-25

In general this sounds good. However, I am troubled by 2 things:

  • OoFFI collects the location of the instruction, but its result is fed to the append function which accept TrapInsn sizes.
  • Instructions might be implemented with more than one instruction, which is reflected by the need for OoFFIPair need.

Thus what the caller of the lower-level assembly are doing with the append calls is guess what the instruction was implemented with, and the fact that OoFFIPair kind of leak this info is unexpected.

I think the following alternative might be of interest:

Instead of returning the OoFFI and OoFFIPair, to later guess what is used internally, I suggest setting up the lower-level assembler, such that it can call the append function on its own. If the lower-level assembler is not derived from AssemblerShared, providing a little wrapper which delegate the call to the append function would be alternative.

If we are dealing with non-wasm context, we might as well provide a no-op implementation which does not append anything.

Here's a summary of what's in the comment 8 patch. I will try to split it up
for review, but there's going to be a large central piece that can't be split
up further (without a lot of extra work).

======== (1) Validation machinery ========

This is relatively straightforward.

  • [WasmConstants.h] enum class TrapInsn. This categorises the behaviour of
    a trapping instruction enough that we have a good chance of verifying it
    correct later. Possible values are: Load{8,16,32,64,128} (it's a load insn
    of that size), Store{8,16,32,64,128}, OfficialUD and Atomic.

  • Every time we call masm.append (to register a trap site), we must supply a
    TrapInsn. That is stored in the TrapSite for later checking.

  • [WasmGenerator.cpp] ModuleGenerator::finishCodeTier(): at the end of
    compilation, visits all registered trap sites. Calls SummarizeInsn to
    inspect the actual insn in memory, returning a TrapInsn, which must be the
    same as the one that was registered. Currently is MOZ_DIAGNOSTIC_ASSERTed
    for extra coverage.

  • [WasmSummarizeInsn.cpp] SummarizeInsn(): this does the partial disassembly
    of in-memory insns. Tedious (and target dependent) but uncontroversial.

======== (2) Fixing the problem ========

The problem exists because the current code creates and registers trapping
insns thusly:

  uint32_t offset = masm.currentOffset();
  masm.trappingInsnOfSomeKind(..);
  masm.append(... offset .. bytecode offset ..)

This assumes that masm.trappingInsnOfSomeKind will create the instruction
that might trap (a.k.a "fault") immediately at offset. But that's not
always true, and the assembler offers no such guarantee:

  • On at least arm and arm64, almost any instruction can be preceded by a
    constant pool, since pools can go "out of range" at any point during
    assembly.

  • On at least arm, arm64 and x86_32, the assembler may insert instructions
    before the one that traps:

    • MacroAssembler::LoadStoreMacro in MacroAssembler-vixl.cpp
    • similarly for 32-bit arm
    • MacroAssemblerX86Shared::store8(Register, ..) MacroAssembler-x86-shared.h

Dealing with constant pools means any trapping insn we create needs to be
wrapped in AutoForbidPoolsAndNops. But that doesn't solve the second
problem. For those, we need the assembler to tell us the buffer offset of the
(first) instruction in the sequence it created, that might trap (fault): the
"Offset of the First Faulting Insn" (OoFFI).

Note that any change(s) need to have minimal performance impact, and need to
not affect the JS-side uses of the macroassembler stack. Hence it seems
reasonable to change some macroassembler methods to return an OoFFI rather
than void, since (1) returning it (a 32-bit int) is cheap, and (2) the JS
world can ignore the return value and will need no adjustment.

The fix has the following components:

  • [Assembler-shared.h] class OoFFI. A 32-bit int wrapped up in a class so
    that we don't confuse it with other 32-bit ints. It is the "Offset of the
    First Faulting Instruction" (an assembler buffer offset). Maybe we need a
    different name, but this is at least concise and IMO correctly descriptive.

  • [WasmBaselineCompile.cpp] *::emitTrapSite: (and equivalents in
    CodeGenerator.cpp): this no longer queries the assembler offset itself.
    Instead you must pass it an OoFFI (as well as a TrapInsn).

  • [WasmBaselineCompile.cpp] ::emitGcGet/Set (and equivalents in
    CodeGenerator.cpp): these now get a suitable OoFFI from the masm. methods
    they call. No more querying of currentOffset().

  • [WasmBaselineCompile.cpp] I adopted the consistent policy that all
    macroassembler methods reachable from ::emitGcGet/Set (and Ion
    equivalents) must return an OoFFI. This in turn drives OoFFI-fication down
    through the assembler stack. I tried hard to minimise the number of methods
    changed.

  • [MacroAssembler-vixl.cpp] MacroAssembler::LoadStoreMacro: now wraps
    faulting insns with AutoForbidPoolsAndNops and returns an OoFFI. This
    fixes the arm64 problems AFAIK. arm32 is not yet fixed.

  • As a result of the above, AutoForbidPoolsAndNops has been made nestable,
    since MacroAssembler::LoadStoreMacro may itself be called from within an
    AutoForbidPoolsAndNops scope.

  • I looked carefully at 64-bit integer transactions on 32 bit targets. It is
    hard (probably impossible) to argue that one of the two 32-bit transactions
    will fault before the other. Eg, MacroAssemblerX86::load64/store64 -- their
    sequencing depends on register-availability constraints. It seems safest to
    have a TrapSite for both transactions. I therefore created OoFFIPair to
    carry two offsets, and fixed up all uses accordingly.

======== (3) Rollout ========

The patch runs green on try. It provides validation for x86_64, x86_32, arm64
and arm32. It provides bug fixes for x86_64, x86_32, arm64 but not arm32.
Testing on large wasm inputs shows that the arm32 validation detects as-yet
unfixed bugs. Hence I propose this:

  • review, fix up, land, etc, the patch, including arm32 validation, but with
    arm32 validation disabled.

  • in a followup bug, fix arm32 bugs as discovered, then re-enable arm32
    validation.

  • for riscv64, loongson64, etc, these will need OoFFI plumbing fixes at a
    minimum, but they can have dummy validation that doesn't do anything,
    until/unless extern contribs write SummarizeInsn routines for them.

======== (4) Splitting for review ========

The following patch split up seems to me plausible:

  • definition of TrapInsn and supporting routines. No actual uses.

  • definition of SummarizeInsn (the partial disassembler) plus the check code
    in WasmGenerator.cpp.

  • make AutoForbidPoolsAndNops nestable.

  • the whole rest of it, which is mostly the assembler OoFFI return-type fixes,
    the wasm-end changes to pass OoFFIs and TrapInsns to masm.append(..), and
    the use of AutoForbidPoolsAndNops as necessary. Unfortunately this will
    still be a big patch.

(In reply to Nicolas B. Pierron [:nbp] from comment #9)

It is indeed true that there is nothing to guarantee that what the assembler
emits has any relationship to the TrapInsn that we register. For example, it
would seem reasonable to write

  OoFFI ooffi = masm.load32(..);
  emitTrapSite(.., ooffi, TrapInsn::Load32);

but it could be the case that masm. decides to implement the load32 using
two 16-bit loads for some reason.

However, this would be most unusual, and I don't think it actually ever
happens in our code base. It could happen if we are trying to generate code
for misaligned multi-byte loads on a target that doesn't support misaligned
loads, but that's not a case we have to handle.

If we ask for a masm.loadXbits, we really expect to get an X-bit load, if
only so that the performance of the generated code is what we expect. The
only exception is for 64-bit load/store on a 32-bit target, but that's already
handled with OoFFIPair.

Also, if this ever did happen, the generated instruction(s) would cause the
verification stage to fail, so we'd at least know it had happened.

(some more In reply to Nicolas B. Pierron [:nbp] from comment #9)

I suggest setting up the lower-level assembler, such that it can call the
append function on its own.

This could probably also be made to work. I must say though, to me this feels
like it works a bit against the separation-of-concerns principle, by
incorporating wasm-specific knowledge in the assembler. As you say, we could
no doubt hide that by using some inheritance magic in the *Assembler
hierarchy. But still, we'd end up with an assembler variant which does
"special other stuff on the side" when you call into it.

Also there's the question of how to tell it when to call append. For
example, masm.loadPtr (etc) calls are made from many places in SM, but only
a very few call sites will want it to call append; even inside wasm-world
there are many places where we want to generate a memory instruction that
doesn't have a TrapSite record. So we'd need some way to tell the assembler
when to call append.

The "return an offset" scheme is a bit primitive, but it is conceptually
simple, and callers that don't care about the returned value (which is, most
callers) can simply ignore it.

Post-compilation stack-map checking conceptually works the same way as the
post-compilation trap-site checking implemented in this bug: by inspecting bits
of the generated code. Here are a couple of ridealongs that make it more
stylistically consistent with the trap-site checking:

  • rename IsValidStackMapKey to IsPlausibleStackMapKey, and remove unused
    debugEnabled parameter

  • update a comment

This adds enum class TrapInsn and a couple of utility routines for it. A
TrapInsn indicates roughly what kind of instruction we expect to see at a
trap site (read/write+size, or UD, or "Atomic") as a basis for later
(post-compilation) comparison. The new enum is not used for anything in this
patch.

Depends on D187456

This adds a function SummarizeInsn, which inspects the machine instruction at
a specified location and tries to determine what kind of instruction it is,
returning a value in enum class TrapInfo (or none). This is necessarily
architecture-dependent. Currently this is implemented only for x86_{32,64} and
arm{32,64}; for other targets we will for now skip the checking enabled by this
function.

This function is only used by ModuleGenerator::finishCodeTier to audit wasm
trap sites. It doesn't need to handle the whole complexity of the target
machine's instruction set. It only needs to handle the subset used by the
trappable instructions we actually generate.

Depends on D187457

The trapsite-placement bug fixes later in this patch series lead to a situation
where, on arm64, a wasm-code-generation routine that creates an
AutoForbidPoolsAndNops scope calls a lower-level routine that also creates an
AutoForbidPoolsAndNops scope. This lower-level routine is also called from
other places, so it isn't possible to remove either scope; instead
AutoForbidPoolsAndNops must be changed so as to tolerate these nested scopes.

Because AutoForbidPoolsAndNops inherits from AutoForbidNops, the latter must
also be changed to handle nested scopes.

In both cases the change is basically a no-op: non-top-level entries/exits are
tracked (so we know when we exit the top level) but otherwise have no effect.

The changes are really to the methods
AssemblerBufferWithConstantPools::{enter,leave}No{Nops,Pool}:

  • AssemblerBufferWithConstantPools::inhibitNops_ is changed from a bool to an
    unsigned int, so as to track entry/exit depths.

  • Similarly, ::canNotPlacePool_ is changed to an unsigned int, and also renamed
    to ::inhibitPools_, for consistency of naming with ::inhibitNops_.

  • For ::{enter,leave}NoPool, nested entries must fall within the no-pool
    instruction area implied by the top level entry. This means, apart from
    tracking entries/exits, we don't need to do anything else.

Depends on D187459

This adds type OoFFI to hold an assembler offset. It's just a uint32_t
wrapped up so it can't be confused with any other uint32_t. If the
macroassembler should create a multiple instruction sequence, it can use this
type to return the buffer offset of the first instruction in the sequence that
might fault; this may well not be the first instruction in the sequence.

For 64-bit transactions on 32-bit targets, an OoFFIPair is also provided.

Depends on D187460

This is a big, although conceptually simple, patch. It makes some (macro)
assembler routines return an OoFFI rather than void, and uses these to
improve the accuracy of TrapSite placement. In doing so it fixes both known
bugs relating to incorrect TrapSite placement, that is, 1843631 and 1843991.

As a result of the following, the construction

  OoFFI ooffi = OoFFI(currentOffset());
  masm.emitSomethingThatMightTrap(..);
  // assume that `ooffi` is the offset of the trapping insn generated

is potentially wrong, and pretty much asking for trouble, unless you have some
way to know for sure that masm.emitSomethingThatMightTrap(..) won't add any
constant pools or other instructions before the instruction of interest. The
"correct" way to do this is:

  OoFFI ooffi = masm.emitSomethingThatMightTrap(..);
  // use `ooffi` as obtained from lower levels in the assembler stack

The patch has the following components:

  • [WasmBaselineCompile.cpp] *::emitTrapSite: (and Ion equivalents): this no
    longer queries the assembler offset itself. Instead you must pass it an
    OoFFI (as well as a TrapInsn). This pretty much drives all of the changes
    listed below.

  • [WasmBaselineCompile.cpp] ::emitGcGet/Set (and Ion equivalents): these now
    get a suitable OoFFI from the masm. methods they call. No more querying of
    currentOffset().

  • [WasmBaselineCompile.cpp] The policy has been adopted that all
    macroassembler methods reachable from ::emitGcGet/Set (and Ion equivalents)
    must return an OoFFI. This in turn drives OoFFI-fication down through the
    assembler stack. The number of methods changed has been kept to a minimum.

  • [MacroAssembler-vixl.cpp] MacroAssembler::LoadStoreMacro: now wraps faulting
    insns with AutoForbidPoolsAndNops and returns an OoFFI. This fixes the
    arm64 TrapSite bug, 1843991. arm32 is not yet fixed.

  • [MacroAssembler-x86-shared.h] MacroAssemblerX86Shared::store8: this is the
    specific fix for bug 1843631.

  • For 64-bit integer transactions on 32 bit targets, it is hard (probably
    impossible) to argue that one of the two 32-bit transactions will fault
    before the other. Eg, MacroAssemblerX86::load64/store64 -- their sequencing
    depends on register-availability constraints. It seems safest to have a
    TrapSite for both transactions, so that has been done.

  • [WasmBaselineCompile.cpp] BaseCompiler::callIndirect: ridealong: fix
    incorrect OOM check !oob; should be !nullref.

  • [WasmFrameIter.cpp] GenerateJitEntryPrologue: correctly delimit some
    AutoForbidPoolsAndNops scopes.

  • [As yet unresolved] There are many places where existing code queries
    masm.size() rather than masm.currentOffset(). These need to be
    investigated and made consistent.

Depends on D187461

Finally, this small patch enables checking of TrapSites at the end of wasm
compilation, for x86, x64 and arm64. Function SummarizeInsn also supports
arm32, but checking on arm32 is disabled until such time as the bug(s) it
detects have been fixed.

Depends on D187462

See Also: → 1855963
See Also: → 1855959
Attachment #9351593 - Attachment description: Bug 1846474 - Part 2: add `enum class TrapInsn`. r=rhunt. → Bug 1846474 - Part 2: add `enum class TrapMachineInsn`. r=rhunt.
Attachment #9351594 - Attachment description: Bug 1846474 - Part 3: add function `SummarizeInsn`. r=nbp. → Bug 1846474 - Part 3: add function `SummarizeTrapInstruction`. r=nbp.
Attachment #9351596 - Attachment description: Bug 1846474 - Part 5: add type `OoFFI` (Offset of the First Faulting Instruction). r=rhunt. → Bug 1846474 - Part 5: add type `FaultingCodeOffset`. r=rhunt.
Attachment #9351597 - Attachment description: Bug 1846474 - Part 6: make some assembler routines return OoFFIs, and use them to improve TrapSite placement accuracy. r=rhunt,nbp. → Bug 1846474 - Part 6: make some assembler routines return `FaultingCodeOffset`, and use them to improve TrapSite placement accuracy. r=rhunt,nbp.
Pushed by jseward@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/b8ad8503d4bf Part 1: (ridealong) cleanup of post-compilation stack-map checks. r=rhunt. https://hg.mozilla.org/integration/autoland/rev/8b4b42a31cd2 Part 2: add `enum class TrapMachineInsn`. r=rhunt. https://hg.mozilla.org/integration/autoland/rev/84ff0b59368d Part 3: add function `SummarizeTrapInstruction`. r=nbp. https://hg.mozilla.org/integration/autoland/rev/44a8e6272cad Part 4: arm64: make `AutoForbidPoolsAndNops` nestable. r=nbp. https://hg.mozilla.org/integration/autoland/rev/8179fce04692 Part 5: add type `FaultingCodeOffset`. r=rhunt. https://hg.mozilla.org/integration/autoland/rev/b3ab651f5ef8 Part 6: make some assembler routines return `FaultingCodeOffset`, and use them to improve TrapSite placement accuracy. r=rhunt,nbp. https://hg.mozilla.org/integration/autoland/rev/656ad1cea330 Part 7: enable TrapSite debug checking for x86, x64, arm64. r=rhunt. https://hg.mozilla.org/integration/autoland/rev/666d87a27481 apply code formatting via Lando
Blocks: 1857001
Duplicate of this bug: 1843631
Duplicate of this bug: 1843991
See Also: → 1855960
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: