Assertion failure: mLength + aIncr <= reserved(), at dist/include/mozilla/Vector.h:1143
Categories
(Core :: JavaScript: WebAssembly, defect, P1)
Tracking
()
People
(Reporter: decoder, Assigned: lth)
References
Details
(5 keywords, Whiteboard: [adv-main85+r][adv-esr78.7+r][sec-survey])
Attachments
(6 files)
9.73 KB,
application/octet-stream
|
Details | |
172 bytes,
application/wasm
|
Details | |
806 bytes,
application/x-javascript
|
Details | |
47 bytes,
text/x-phabricator-request
|
jcristau
:
approval-mozilla-beta+
tjr
:
sec-approval+
|
Details | Review |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
jcristau
:
approval-mozilla-esr78+
|
Details | Review |
The attached testcase crashes on mozilla-central revision 41effaf024a5 (debug build, run with --no-threads --fuzzing-safe --wasm-compiler=baseline --disable-oom-functions test.js).
Backtrace:
==5798==ERROR: UndefinedBehaviorSanitizer: SEGV on unknown address 0x000000000000 (pc 0xaaaaca964290 bp 0xffffcf2f0fe0 sp 0xffffcf2f0fe0 T5798)
==5798==The signal is caused by a WRITE memory access.
==5798==Hint: address points to the zero page.
#0 0xaaaaca964290 in void mozilla::Vector<js::wasm::Stk, 0ul, js::SystemAllocPolicy>::infallibleEmplaceBack<js::wasm::Stk>(js::wasm::Stk&&) dist/include/mozilla/Vector.h
#1 0xaaaaca8f6214 in js::wasm::BaseCompiler::pushReturnValueOfCall(js::wasm::BaseCompiler::FunctionCall const&, js::jit::MIRType) js/src/wasm/WasmBaselineCompile.cpp
#2 0xaaaaca9052b0 in js::wasm::BaseCompiler::emitInstanceCall(unsigned int, js::wasm::SymbolicAddressSignature const&, bool) js/src/wasm/WasmBaselineCompile.cpp:11505:5
#3 0xaaaaca91f644 in js::wasm::BaseCompiler::emitMemorySize() js/src/wasm/WasmBaselineCompile.cpp:11536:10
#4 0xaaaaca91f644 in js::wasm::BaseCompiler::emitBody() js/src/wasm/WasmBaselineCompile.cpp:14650:9
#5 0xaaaaca931258 in js::wasm::BaseCompiler::emitFunction() js/src/wasm/WasmBaselineCompile.cpp:15455:8
#6 0xaaaaca931258 in js::wasm::BaselineCompileFunctions(js::wasm::ModuleEnvironment const&, js::wasm::CompilerEnvironment const&, js::LifoAlloc&, mozilla::Vector<js::wasm::FuncCompileInput, 8ul, js::SystemAllocPolicy> const&, js::wasm::CompiledCode*, mozilla::UniquePtr<char [], JS::FreePolicy>*) js/src/wasm/WasmBaselineCompile.cpp:15624:12
#7 0xaaaacaa118cc in ExecuteCompileTask(js::wasm::CompileTask*, mozilla::UniquePtr<char [], JS::FreePolicy>*) js/src/wasm/WasmGenerator.cpp:769:12
#8 0xaaaacaa1379c in js::wasm::ModuleGenerator::locallyCompileCurrentTask() js/src/wasm/WasmGenerator.cpp:819:8
#9 0xaaaacaa1379c in js::wasm::ModuleGenerator::finishFuncDefs() js/src/wasm/WasmGenerator.cpp:957:24
#10 0xaaaaca94d2f0 in bool DecodeCodeSection<js::wasm::Decoder>(js::wasm::ModuleEnvironment const&, js::wasm::Decoder&, js::wasm::ModuleGenerator&) js/src/wasm/WasmCompile.cpp:569:13
#11 0xaaaaca94cc08 in js::wasm::CompileBuffer(js::wasm::CompileArgs const&, js::wasm::ShareableBytes const&, mozilla::UniquePtr<char [], JS::FreePolicy>*, mozilla::Vector<mozilla::UniquePtr<char [], JS::FreePolicy>, 0ul, js::SystemAllocPolicy>*, JS::OptimizedEncodingListener*, JSTelemetrySender) js/src/wasm/WasmCompile.cpp:592:8
#12 0xaaaacaa6bc18 in js::WasmModuleObject::construct(JSContext*, unsigned int, JS::Value*) js/src/wasm/WasmJS.cpp:1631:25
#13 0xaaaac9a675d4 in CallJSNative(JSContext*, bool (*)(JSContext*, unsigned int, JS::Value*), js::CallReason, JS::CallArgs const&) js/src/vm/Interpreter.cpp:507:13
[...]
Marking s-s because this is a range assertion.
Reporter | ||
Comment 1•5 years ago
|
||
Assignee | ||
Updated•5 years ago
|
Assignee | ||
Comment 2•5 years ago
|
||
Unable to repro locally with either m-c or the given rev. Can I have a concrete option string for configure? The one I'm using is: ../configure --enable-simulator=arm64 --disable-jemalloc --enable-address-sanitizer --enable-undefined-sanitizer --enable-gczeal --enable-debug --enable-optimize="-O2 -g" --without-intl-api
Reporter | ||
Comment 3•5 years ago
|
||
This is on real hardware, as simulator builds are still unsupported in fuzzfetch.
Assignee | ||
Comment 4•5 years ago
|
||
It's possible this is related to what Julian's doing in another bug. Reassigning to Julian since he has hardware, but lowering to P2 since this will have to wait until a few other things are done.
Updated•5 years ago
|
Updated•5 years ago
|
Comment 5•5 years ago
|
||
I can reproduce on real arm64 h/w. Interestingly, CL rejects this thusly:
test.js:17:14 CompileError: Invalid input WebAssembly code at offset 155: type mismatch: expected f32, found i32
whereas baseline fails as claimed:
Assertion failure: mLength + aIncr <= reserved(), at /home/sewardj/MOZ/MC/js/BUILD64DEB/dist/include/mozilla/Vector.h:1143
Comment 6•5 years ago
|
||
The baseline failure is really quite strange (to me). It fails in an assertion
whilst infallibly trying to grow a vector, and that vector is the baseline
compiler's operand stack -- pushI32
as called from pushReturnValueOfCall
.
If and how that is related to CL claiming the input is invalid, I struggle to guess.
Assignee | ||
Comment 7•5 years ago
|
||
Assignee | ||
Comment 8•5 years ago
|
||
The disassembly, wasm2wat rejects it without --no-check, so this is likely an error in our validation:
[lhansen@yojimbo tmp]$ wasm2wat --no-check test.wasm
(module
(type (;0;) (func (result f32 f32 i32)))
(func $main (type 0) (result f32 f32 i32)
call $main
call $main
call $main
call $main
call $main
call $main
call $main
call $main
call $main
call $main
call $main
call $main
call $main
call $main
call $main
call $main
call $main
call $main
call $main
memory.size
call $main
call $main
call $main
call_indirect (type 0)
call $main
call $main
call $main
call $main
call $main
call $main
memory.size
call $main
call $main
call $main
call_indirect (type 0)
call $main
call $main
call $main
call $main
call $main
call $main
call $main
call $main
call $main
call $main
call $main
call $main
memory.size)
(table (;0;) 62 255 funcref)
(memory (;0;) 15 18)
(export "t1" (table 0))
(export "memory" (memory 0)))
Assignee | ||
Updated•5 years ago
|
Assignee | ||
Comment 9•5 years ago
|
||
Ion gets the validation failure right but baseline hits the assert (on x64).
Assignee | ||
Comment 10•5 years ago
•
|
||
If I remove two of the call $main
it no longer fails. If I remove one, it fails. In that case, there are 43*3 + 3 results on the stack at the end of the function. 43*3 = 129, and until proven otherwise I refuse to believe that it is a coincidence that once I drop below 128 the error goes away.
I'll attach a reduced test case that shows this in more detail.
At the moment I tend to think this is not a terrible bug. We fail in a debug build by hitting the assert, but in a release build the baseline compiler correctly rejects the program. There may of course be the risk of some memory corruption during compilation, and that can be bad, but there is no machine code generated from this program that is ever run.
Assignee | ||
Comment 11•5 years ago
|
||
Assignee | ||
Comment 12•5 years ago
|
||
The problem seems to be that the baseline compiler assumes that opcodes can only push a limited number of values onto the value stack, and so (for speed) it can pre-reserve space on the stack in the decoding loop, doing this only so often. But with multiple values this is not so: functions can in principle push as many values as they like. This is a simpler TC:
var xs = new Array(300).fill(1);
var txt = `
(module
(func $f (result ${xs.map(_ => 'i32 ').join(' ')})
${xs.map((_) => '(i32.const 1)').join(' ')})
(func $g
(call $f)))`;
new WebAssembly.Module(wasmTextToBinary(txt));
Here, f returns 300 results, which are pushed onto the stack in g. This triggers the assert immediately, because the pre-reserved stack is not that large.
This could be sort of gnarly to fix because there's no path along which to propagate an OOM from the reservation that has to happen fairly deep in the bowels of the compiler.
Assignee | ||
Comment 13•5 years ago
|
||
(This commit message, and some comments in the code, should be removed
before landing. There are test cases, but they are not in this
patch.)
Several interlocking parts to this patch:
First, the pre-reservation of the value stack in the decode loop does
not work for opcodes with more than a fixed number of results: a
function or a block can return a larger number of values than any
reasonable pre-reserved size. The pushResults method must reserve
space for the values it will push, and must propagate OOM up the call
tree. This requires a fair amount of plumbing but is straightforward.
Second, the way the reservation should be made both in pushResults and
in the decode loop is to take into account the max number of pushes
per opcode, which can be defined as a constant (here very
conservatively as 10), but which is not "2" as asserted by a comment
in the existing code, and even if it were "2" it would not be that for
the reason given in the comment.
Third, the reservation code in the decode loop is cleaned up by
removing the guard that runs reservation code only once every 50
decode iterations, since this guard was always of questionable utility
and seems only to complicate the reasoning around the reservation
logic.
Finally, the guarded reservation code also contained a check for
available space in the temp allocator the baseline compiler uses for
allocating delayed code generation objects. This check was necessary
because that allocator is infallible. However, the compiler uses the
allocator rarely, and all existing uses of the allocator check for
allocation failure already and propagate the OOM, so the reservation
check can be removed if the allocator is simply marked as fallible.
Assignee | ||
Comment 14•5 years ago
|
||
I bet we'll want to do something about ESR, minimally. It won't be this patch, but could be a release assert in pushResults + removing the guard in the decode loop so that the chance of hitting the release assert is very small.
Comment 15•5 years ago
|
||
Given where we are in the cycle, it's probably best for this patch to target 85/78.7esr.
Assignee | ||
Comment 16•5 years ago
|
||
(In reply to Ryan VanderMeulen [:RyanVM] from comment #15)
Given where we are in the cycle, it's probably best for this patch to target 85/78.7esr.
Noted.
Assignee | ||
Comment 17•5 years ago
|
||
The test case in comment 12, when cranked up a bit, actually causes a related but different failure in ion too. Funfunfun.
Updated•5 years ago
|
Assignee | ||
Comment 18•5 years ago
|
||
Commit message, comments, and test cases for s-s patch that landed
separately.
Several interlocking parts to the baseline part of the patch:
First, the pre-reservation of the value stack in the decode loop does
not work for opcodes with more than a fixed number of results: a
function or a block can return a larger number of values than any
reasonable pre-reserved size. The pushResults method must reserve
space for the values it will push, and must propagate OOM up the call
tree. This requires a fair amount of plumbing but is straightforward.
Second, the way the reservation should be made both in pushResults and
in the decode loop is to take into account the max number of pushes
per opcode, which can be defined as a constant (here very
conservatively as 10), but which is not "2" as asserted by a comment
in the existing code, and even if it were "2" it would not be that for
the reason given in the comment.
Third, the reservation code in the decode loop is cleaned up by
removing the guard that runs reservation code only once every 50
decode iterations, since this guard was always of questionable utility
and seems only to complicate the reasoning around the reservation
logic.
Finally, the guarded reservation code also contained a check for
available space in the temp allocator the baseline compiler uses for
allocating delayed code generation objects. This check was necessary
because that allocator is infallible. However, the compiler uses the
allocator rarely, and all existing uses of the allocator check for
allocation failure already and propagate the OOM, so the reservation
check can be removed if the allocator is simply marked as fallible.
On the ion side of things:
Every time we process an open-ended list of values for which we may
allocate nodes, we must call ensureBallast on the allocator, since it
is infallible. This was done in one place in WasmIonCompile, but not
both places where it was needed.
As far as the test case is concerned:
This test was enough to find the errors in both baseline and ion. It
can probably be pared down from the 100..300 loop it has now to
something smaller, but there you have it.
Depends on D98361
Assignee | ||
Comment 19•5 years ago
|
||
Assignee | ||
Comment 20•5 years ago
|
||
Comment on attachment 9190614 [details]
Bug 1675844 - tidy up. r?jseward
Security Approval Request
- How easily could an exploit be constructed based on the patch?: I'm going to go with "hard". There are two separate issues, both are memory clobbers (writes past the end of an arena). It's possible some kind of spray (into data memory) would work to clobber data structures higher up in memory. In one of the issues the contents of the spray are slightly controllable but within very tight parameters; in the other issue, not really at all.
- Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?: No
- Which older supported branches are affected by this flaw?: ESR78 and later
- If not all supported branches, which bug introduced the flaw?: Bug 1628321
- Do you have backports for the affected branches?: Yes
- If not, how different, hard to create, and risky will they be?: I have a backport for ESR78 and this should apply to 84 beta if we want.
- How likely is this patch to cause regressions; how much testing does it need?: The patch is a conservative fix and not likely to cause functional regressions. There might be slight regressions in webassembly compilation speed but I doubt it's measurable.
Assignee | ||
Comment 21•5 years ago
|
||
The second patch contains test cases and comments and a comprehensive commit message and should land once we've opened this bug up.
Comment 22•5 years ago
|
||
Comment on attachment 9190614 [details]
Bug 1675844 - tidy up. r?jseward
sec-approved to land on 1/10 or 1/11 - I'd like to avoid this sitting in the repo for an entire month before release. Date chosen to give a week of testing before RC week; if relman thinks that's cutting it too close, perhaps half a week earlier?
Assignee | ||
Comment 23•5 years ago
|
||
AIUI, that targets FF86 / ESR 78.7 then? Or do you also want to apply the smaller patch to FF85 beta?
As to the landing date, if I have to do it it would have to be on 1/11. But I'll be working the entire week before that, so we can hash something out then if we haven't before.
Tom, should I do approval-esr? and/or approval-beta? for the other patch or do I wait until we've landed the big patch?
Comment 24•5 years ago
|
||
No, when it lands in m-c on 1/11 we'll also uplift it for 85. 1/11 is fine, of course. Please set the approval flags after we release 84 on 12/15 otherwise it will confuse relman.
Updated•5 years ago
|
Updated•5 years ago
|
Assignee | ||
Comment 25•5 years ago
|
||
Comment on attachment 9190614 [details]
Bug 1675844 - tidy up. r?jseward
Beta/Release Uplift Approval Request
- User impact if declined: Crash, possible security problem, see comment 18, comment 20.
- Is this code covered by automated tests?: No
- Has the fix been verified in Nightly?: No
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): Mostly this is plumbing, and the core of the patch is a simplification. We will land this patch on Nightly concurrently and test it there too. The main risk is a risk of secondary performance problems.
- String changes made/needed:
Assignee | ||
Updated•5 years ago
|
Assignee | ||
Comment 26•5 years ago
|
||
Comment on attachment 9190983 [details]
Bug 1675844 - tidy up ESR. r?jseward
ESR Uplift Approval Request
- If this is not a sec:{high,crit} bug, please state case for ESR consideration:
- User impact if declined: Crashes and possible security problems.
- Fix Landed on Version: will land in 86 + 85 beta
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): Very local fix with simple logic.
- String or UUID changes made by this patch:
Assignee | ||
Updated•5 years ago
|
Assignee | ||
Updated•5 years ago
|
Assignee | ||
Updated•5 years ago
|
Assignee | ||
Updated•5 years ago
|
Assignee | ||
Comment 27•5 years ago
|
||
Removed some flags that were unhelpfully auto-set by bugzilla. Flags should be correct now: one patch for nightly + beta; the other for esr; the third for tests + docn to land later. Will rebase all patches first week of January to make sure they're current and continue to work.
Updated•5 years ago
|
Comment 29•5 years ago
|
||
Comment 30•5 years ago
|
||
Comment on attachment 9190614 [details]
Bug 1675844 - tidy up. r?jseward
approved for 85.0b8
Comment 31•5 years ago
|
||
Comment on attachment 9190983 [details]
Bug 1675844 - tidy up ESR. r?jseward
approved for 78.7esr
Comment 32•5 years ago
|
||
uplift |
![]() |
||
Comment 33•5 years ago
|
||
https://hg.mozilla.org/integration/autoland/rev/410530def0b511f090003b6a9e5f7a0d75678007
https://hg.mozilla.org/mozilla-central/rev/410530def0b5
Comment 34•5 years ago
|
||
uplift |
Updated•5 years ago
|
Updated•5 years ago
|
Updated•5 years ago
|
Updated•5 years ago
|
Comment 35•5 years ago
|
||
As part of a security bug pattern analysis, we are requesting your help with a high level analysis of this bug. It is our hope to develop static analysis (or potentially runtime/dynamic analysis) in the future to identify classes of bugs.
Please visit this google form to reply.
Assignee | ||
Updated•5 years ago
|
Updated•4 years ago
|
Assignee | ||
Updated•4 years ago
|
Comment 36•4 years ago
|
||
Comment 37•4 years ago
|
||
bugherder |
Description
•