Closed Bug 1675844 Opened 4 years ago Closed 4 years ago

Assertion failure: mLength + aIncr <= reserved(), at dist/include/mozilla/Vector.h:1143

Categories

(Core :: JavaScript: WebAssembly, defect, P1)

defect

Tracking

()

RESOLVED FIXED
86 Branch
Tracking Status
firefox-esr78 85+ fixed
firefox83 --- wontfix
firefox84 --- wontfix
firefox85 + fixed
firefox86 + fixed

People

(Reporter: decoder, Assigned: lth)

References

Details

(5 keywords, Whiteboard: [adv-main85+r][adv-esr78.7+r][sec-survey])

Attachments

(6 files)

The attached testcase crashes on mozilla-central revision 41effaf024a5 (debug build, run with --no-threads --fuzzing-safe --wasm-compiler=baseline --disable-oom-functions test.js).

Backtrace:

==5798==ERROR: UndefinedBehaviorSanitizer: SEGV on unknown address 0x000000000000 (pc 0xaaaaca964290 bp 0xffffcf2f0fe0 sp 0xffffcf2f0fe0 T5798)
==5798==The signal is caused by a WRITE memory access.
==5798==Hint: address points to the zero page.
    #0 0xaaaaca964290 in void mozilla::Vector<js::wasm::Stk, 0ul, js::SystemAllocPolicy>::infallibleEmplaceBack<js::wasm::Stk>(js::wasm::Stk&&) dist/include/mozilla/Vector.h
    #1 0xaaaaca8f6214 in js::wasm::BaseCompiler::pushReturnValueOfCall(js::wasm::BaseCompiler::FunctionCall const&, js::jit::MIRType) js/src/wasm/WasmBaselineCompile.cpp
    #2 0xaaaaca9052b0 in js::wasm::BaseCompiler::emitInstanceCall(unsigned int, js::wasm::SymbolicAddressSignature const&, bool) js/src/wasm/WasmBaselineCompile.cpp:11505:5
    #3 0xaaaaca91f644 in js::wasm::BaseCompiler::emitMemorySize() js/src/wasm/WasmBaselineCompile.cpp:11536:10
    #4 0xaaaaca91f644 in js::wasm::BaseCompiler::emitBody() js/src/wasm/WasmBaselineCompile.cpp:14650:9
    #5 0xaaaaca931258 in js::wasm::BaseCompiler::emitFunction() js/src/wasm/WasmBaselineCompile.cpp:15455:8
    #6 0xaaaaca931258 in js::wasm::BaselineCompileFunctions(js::wasm::ModuleEnvironment const&, js::wasm::CompilerEnvironment const&, js::LifoAlloc&, mozilla::Vector<js::wasm::FuncCompileInput, 8ul, js::SystemAllocPolicy> const&, js::wasm::CompiledCode*, mozilla::UniquePtr<char [], JS::FreePolicy>*) js/src/wasm/WasmBaselineCompile.cpp:15624:12
    #7 0xaaaacaa118cc in ExecuteCompileTask(js::wasm::CompileTask*, mozilla::UniquePtr<char [], JS::FreePolicy>*) js/src/wasm/WasmGenerator.cpp:769:12
    #8 0xaaaacaa1379c in js::wasm::ModuleGenerator::locallyCompileCurrentTask() js/src/wasm/WasmGenerator.cpp:819:8
    #9 0xaaaacaa1379c in js::wasm::ModuleGenerator::finishFuncDefs() js/src/wasm/WasmGenerator.cpp:957:24
    #10 0xaaaaca94d2f0 in bool DecodeCodeSection<js::wasm::Decoder>(js::wasm::ModuleEnvironment const&, js::wasm::Decoder&, js::wasm::ModuleGenerator&) js/src/wasm/WasmCompile.cpp:569:13
    #11 0xaaaaca94cc08 in js::wasm::CompileBuffer(js::wasm::CompileArgs const&, js::wasm::ShareableBytes const&, mozilla::UniquePtr<char [], JS::FreePolicy>*, mozilla::Vector<mozilla::UniquePtr<char [], JS::FreePolicy>, 0ul, js::SystemAllocPolicy>*, JS::OptimizedEncodingListener*, JSTelemetrySender) js/src/wasm/WasmCompile.cpp:592:8
    #12 0xaaaacaa6bc18 in js::WasmModuleObject::construct(JSContext*, unsigned int, JS::Value*) js/src/wasm/WasmJS.cpp:1631:25
    #13 0xaaaac9a675d4 in CallJSNative(JSContext*, bool (*)(JSContext*, unsigned int, JS::Value*), js::CallReason, JS::CallArgs const&) js/src/vm/Interpreter.cpp:507:13
    [...]

Marking s-s because this is a range assertion.

Attached file Testcase
Assignee: nobody → lhansen
Severity: -- → S2
Status: NEW → ASSIGNED
Priority: -- → P1

Unable to repro locally with either m-c or the given rev. Can I have a concrete option string for configure? The one I'm using is: ../configure --enable-simulator=arm64 --disable-jemalloc --enable-address-sanitizer --enable-undefined-sanitizer --enable-gczeal --enable-debug --enable-optimize="-O2 -g" --without-intl-api

This is on real hardware, as simulator builds are still unsupported in fuzzfetch.

It's possible this is related to what Julian's doing in another bug. Reassigning to Julian since he has hardware, but lowering to P2 since this will have to wait until a few other things are done.

Assignee: lhansen → jseward
Flags: needinfo?(jseward)
Priority: P1 → P2
Group: core-security → javascript-core-security

I can reproduce on real arm64 h/w. Interestingly, CL rejects this thusly:

test.js:17:14 CompileError: Invalid input WebAssembly code at offset 155: type mismatch: expected f32, found i32

whereas baseline fails as claimed:

Assertion failure: mLength + aIncr <= reserved(), at /home/sewardj/MOZ/MC/js/BUILD64DEB/dist/include/mozilla/Vector.h:1143
Flags: needinfo?(jseward)

The baseline failure is really quite strange (to me). It fails in an assertion
whilst infallibly trying to grow a vector, and that vector is the baseline
compiler's operand stack -- pushI32 as called from pushReturnValueOfCall.

If and how that is related to CL claiming the input is invalid, I struggle to guess.

The disassembly, wasm2wat rejects it without --no-check, so this is likely an error in our validation:

[lhansen@yojimbo tmp]$ wasm2wat --no-check test.wasm 
(module
  (type (;0;) (func (result f32 f32 i32)))
  (func $main (type 0) (result f32 f32 i32)
    call $main
    call $main
    call $main
    call $main
    call $main
    call $main
    call $main
    call $main
    call $main
    call $main
    call $main
    call $main
    call $main
    call $main
    call $main
    call $main
    call $main
    call $main
    call $main
    memory.size
    call $main
    call $main
    call $main
    call_indirect (type 0)
    call $main
    call $main
    call $main
    call $main
    call $main
    call $main
    memory.size
    call $main
    call $main
    call $main
    call_indirect (type 0)
    call $main
    call $main
    call $main
    call $main
    call $main
    call $main
    call $main
    call $main
    call $main
    call $main
    call $main
    call $main
    memory.size)
  (table (;0;) 62 255 funcref)
  (memory (;0;) 15 18)
  (export "t1" (table 0))
  (export "memory" (memory 0)))
Assignee: jseward → lhansen
Priority: P2 → P1
OS: Linux → All
Hardware: ARM64 → All

Ion gets the validation failure right but baseline hits the assert (on x64).

If I remove two of the call $main it no longer fails. If I remove one, it fails. In that case, there are 43*3 + 3 results on the stack at the end of the function. 43*3 = 129, and until proven otherwise I refuse to believe that it is a coincidence that once I drop below 128 the error goes away.

I'll attach a reduced test case that shows this in more detail.

At the moment I tend to think this is not a terrible bug. We fail in a debug build by hitting the assert, but in a release build the baseline compiler correctly rejects the program. There may of course be the risk of some memory corruption during compilation, and that can be bad, but there is no machine code generated from this program that is ever run.

Attached file bad.js

The problem seems to be that the baseline compiler assumes that opcodes can only push a limited number of values onto the value stack, and so (for speed) it can pre-reserve space on the stack in the decoding loop, doing this only so often. But with multiple values this is not so: functions can in principle push as many values as they like. This is a simpler TC:

var xs = new Array(300).fill(1);
var txt = `
(module
  (func $f (result ${xs.map(_ => 'i32 ').join(' ')})
    ${xs.map((_) => '(i32.const 1)').join(' ')})
  (func $g
    (call $f)))`;
new WebAssembly.Module(wasmTextToBinary(txt));

Here, f returns 300 results, which are pushed onto the stack in g. This triggers the assert immediately, because the pre-reserved stack is not that large.

This could be sort of gnarly to fix because there's no path along which to propagate an OOM from the reservation that has to happen fairly deep in the bowels of the compiler.

(This commit message, and some comments in the code, should be removed
before landing. There are test cases, but they are not in this
patch.)

Several interlocking parts to this patch:

First, the pre-reservation of the value stack in the decode loop does
not work for opcodes with more than a fixed number of results: a
function or a block can return a larger number of values than any
reasonable pre-reserved size. The pushResults method must reserve
space for the values it will push, and must propagate OOM up the call
tree. This requires a fair amount of plumbing but is straightforward.

Second, the way the reservation should be made both in pushResults and
in the decode loop is to take into account the max number of pushes
per opcode, which can be defined as a constant (here very
conservatively as 10), but which is not "2" as asserted by a comment
in the existing code, and even if it were "2" it would not be that for
the reason given in the comment.

Third, the reservation code in the decode loop is cleaned up by
removing the guard that runs reservation code only once every 50
decode iterations, since this guard was always of questionable utility
and seems only to complicate the reasoning around the reservation
logic.

Finally, the guarded reservation code also contained a check for
available space in the temp allocator the baseline compiler uses for
allocating delayed code generation objects. This check was necessary
because that allocator is infallible. However, the compiler uses the
allocator rarely, and all existing uses of the allocator check for
allocation failure already and propagate the OOM, so the reservation
check can be removed if the allocator is simply marked as fallible.

I bet we'll want to do something about ESR, minimally. It won't be this patch, but could be a release assert in pushResults + removing the guard in the decode loop so that the chance of hitting the release assert is very small.

Given where we are in the cycle, it's probably best for this patch to target 85/78.7esr.

(In reply to Ryan VanderMeulen [:RyanVM] from comment #15)

Given where we are in the cycle, it's probably best for this patch to target 85/78.7esr.

Noted.

The test case in comment 12, when cranked up a bit, actually causes a related but different failure in ion too. Funfunfun.

Attachment #9190614 - Attachment description: Bug 1675844 - Reserve space in pushResults. r?jseward → Bug 1675844 - tidy up. r?jseward

Commit message, comments, and test cases for s-s patch that landed
separately.

Several interlocking parts to the baseline part of the patch:

First, the pre-reservation of the value stack in the decode loop does
not work for opcodes with more than a fixed number of results: a
function or a block can return a larger number of values than any
reasonable pre-reserved size. The pushResults method must reserve
space for the values it will push, and must propagate OOM up the call
tree. This requires a fair amount of plumbing but is straightforward.

Second, the way the reservation should be made both in pushResults and
in the decode loop is to take into account the max number of pushes
per opcode, which can be defined as a constant (here very
conservatively as 10), but which is not "2" as asserted by a comment
in the existing code, and even if it were "2" it would not be that for
the reason given in the comment.

Third, the reservation code in the decode loop is cleaned up by
removing the guard that runs reservation code only once every 50
decode iterations, since this guard was always of questionable utility
and seems only to complicate the reasoning around the reservation
logic.

Finally, the guarded reservation code also contained a check for
available space in the temp allocator the baseline compiler uses for
allocating delayed code generation objects. This check was necessary
because that allocator is infallible. However, the compiler uses the
allocator rarely, and all existing uses of the allocator check for
allocation failure already and propagate the OOM, so the reservation
check can be removed if the allocator is simply marked as fallible.

On the ion side of things:

Every time we process an open-ended list of values for which we may
allocate nodes, we must call ensureBallast on the allocator, since it
is infallible. This was done in one place in WasmIonCompile, but not
both places where it was needed.

As far as the test case is concerned:

This test was enough to find the errors in both baseline and ion. It
can probably be pared down from the 100..300 loop it has now to
something smaller, but there you have it.

Depends on D98361

Comment on attachment 9190614 [details]
Bug 1675844 - tidy up. r?jseward

Security Approval Request

  • How easily could an exploit be constructed based on the patch?: I'm going to go with "hard". There are two separate issues, both are memory clobbers (writes past the end of an arena). It's possible some kind of spray (into data memory) would work to clobber data structures higher up in memory. In one of the issues the contents of the spray are slightly controllable but within very tight parameters; in the other issue, not really at all.
  • Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?: No
  • Which older supported branches are affected by this flaw?: ESR78 and later
  • If not all supported branches, which bug introduced the flaw?: Bug 1628321
  • Do you have backports for the affected branches?: Yes
  • If not, how different, hard to create, and risky will they be?: I have a backport for ESR78 and this should apply to 84 beta if we want.
  • How likely is this patch to cause regressions; how much testing does it need?: The patch is a conservative fix and not likely to cause functional regressions. There might be slight regressions in webassembly compilation speed but I doubt it's measurable.
Attachment #9190614 - Flags: sec-approval?

The second patch contains test cases and comments and a comprehensive commit message and should land once we've opened this bug up.

Flags: in-testsuite?

Comment on attachment 9190614 [details]
Bug 1675844 - tidy up. r?jseward

sec-approved to land on 1/10 or 1/11 - I'd like to avoid this sitting in the repo for an entire month before release. Date chosen to give a week of testing before RC week; if relman thinks that's cutting it too close, perhaps half a week earlier?

Attachment #9190614 - Flags: sec-approval? → sec-approval+

AIUI, that targets FF86 / ESR 78.7 then? Or do you also want to apply the smaller patch to FF85 beta?

As to the landing date, if I have to do it it would have to be on 1/11. But I'll be working the entire week before that, so we can hash something out then if we haven't before.

Tom, should I do approval-esr? and/or approval-beta? for the other patch or do I wait until we've landed the big patch?

Flags: needinfo?(tom)

No, when it lands in m-c on 1/11 we'll also uplift it for 85. 1/11 is fine, of course. Please set the approval flags after we release 84 on 12/15 otherwise it will confuse relman.

Flags: needinfo?(tom)
Whiteboard: [land after 2021-01-10]
Whiteboard: [land after 2021-01-10] → [land after 2021-01-10][no-nag]

Comment on attachment 9190614 [details]
Bug 1675844 - tidy up. r?jseward

Beta/Release Uplift Approval Request

  • User impact if declined: Crash, possible security problem, see comment 18, comment 20.
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: No
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Mostly this is plumbing, and the core of the patch is a simplification. We will land this patch on Nightly concurrently and test it there too. The main risk is a risk of secondary performance problems.
  • String changes made/needed:
Attachment #9190614 - Flags: approval-mozilla-beta?
Attachment #9190875 - Flags: approval-mozilla-beta?
Attachment #9190983 - Flags: approval-mozilla-beta?

Comment on attachment 9190983 [details]
Bug 1675844 - tidy up ESR. r?jseward

ESR Uplift Approval Request

  • If this is not a sec:{high,crit} bug, please state case for ESR consideration:
  • User impact if declined: Crashes and possible security problems.
  • Fix Landed on Version: will land in 86 + 85 beta
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Very local fix with simple logic.
  • String or UUID changes made by this patch:
Attachment #9190983 - Flags: approval-mozilla-esr78?
Attachment #9190614 - Flags: approval-mozilla-esr78?
Attachment #9190875 - Flags: approval-mozilla-esr78?
Attachment #9190614 - Flags: approval-mozilla-esr78?
Attachment #9190875 - Flags: approval-mozilla-esr78?
Attachment #9190875 - Flags: approval-mozilla-beta?
Attachment #9190983 - Flags: approval-mozilla-beta?

Removed some flags that were unhelpfully auto-set by bugzilla. Flags should be correct now: one patch for nightly + beta; the other for esr; the third for tests + docn to land later. Will rebase all patches first week of January to make sure they're current and continue to work.

Comment on attachment 9190614 [details]
Bug 1675844 - tidy up. r?jseward

approved for 85.0b8

Attachment #9190614 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

Comment on attachment 9190983 [details]
Bug 1675844 - tidy up ESR. r?jseward

approved for 78.7esr

Attachment #9190983 - Flags: approval-mozilla-esr78? → approval-mozilla-esr78+
Group: javascript-core-security → core-security-release
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 86 Branch
Whiteboard: [land after 2021-01-10][no-nag]
QA Whiteboard: [post-critsmash-triage]
Flags: qe-verify-
Whiteboard: [adv-main85+r]
Whiteboard: [adv-main85+r] → [adv-main85+r][adv-esr78.7+r]

As part of a security bug pattern analysis, we are requesting your help with a high level analysis of this bug. It is our hope to develop static analysis (or potentially runtime/dynamic analysis) in the future to identify classes of bugs.

Please visit this google form to reply.

Flags: needinfo?(lhansen)
Whiteboard: [adv-main85+r][adv-esr78.7+r] → [adv-main85+r][adv-esr78.7+r][sec-survey]
Flags: needinfo?(lhansen)
Group: core-security-release
Flags: in-testsuite? → in-testsuite+
Pushed by lhansen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/e23c3e39ac59
comments, test cases, commit msg.  r=jseward
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: