Closed Bug 1777604 (CVE-2022-40957) Opened 2 years ago Closed 2 years ago

Assertion failure: instCache_[offset] == instValue, at jit/arm64/vixl/MozCachingDecoder.h:77

Categories

(Core :: JavaScript: WebAssembly, defect, P2)

ARM64
Linux
defect

Tracking

()

RESOLVED FIXED
105 Branch
Tracking Status
firefox-esr91 --- wontfix
firefox-esr102 105+ fixed
firefox103 --- wontfix
firefox104 --- wontfix
firefox105 + fixed

People

(Reporter: gkw, Assigned: rhunt)

References

(Blocks 2 open bugs)

Details

(Keywords: sec-low, testcase, Whiteboard: [adv-main105+][adv-esr102.3+])

Attachments

(5 files, 2 obsolete files)

for (let i = 0; i < 99; i++) {
    relazifyFunctions();
    const { Instance, compileStreaming } = WebAssembly;
    function f(x, g) {
        var cache = streamCacheEntry(wasmTextToBinary(x));
        compileStreaming(cache).then(function (m) {
            g(new Instance(m, undefined));
            return compileStreaming(cache);
        }).then(function (m) {
            g(new Instance(m, undefined));
        });
        drainJobQueue();
    }
    f('(module \
        (func $test (param i64) (result f64) local.get 0 f64.convert_u/i64 )\
        (func (export "run") (result i32) i64.const 1 call $test f64.const 1 f64.eq )\
    )', function (i) {});
    f('(module \
        (func (export "run") (result i64) (i64.const 1))\
    )', function (i) {
        i.exports.run();
    });
}
Thread 1 "js-dbg-64-armsi" received signal SIGSEGV, Segmentation fault.
0x00005555578cbe0d in vixl::CachingDecoder::Decode (this=0x7ffff6a03df0, instr=<optimized out>) at /home/skygentoo/trees/mozilla-central/js/src/jit/arm64/vixl/MozCachingDecoder.h:76
76	    MOZ_ASSERT_IF(val != InstDecodedKind::NotDecodedYet,
(gdb) bt
#0  0x00005555578cbe0d in vixl::CachingDecoder::Decode (this=0x7ffff6a03df0, instr=<optimized out>) at /home/skygentoo/trees/mozilla-central/js/src/jit/arm64/vixl/MozCachingDecoder.h:76
#1  0x00005555578cbbfc in vixl::Simulator::ExecuteInstruction (this=this@entry=0x7ffff6a35100) at /home/skygentoo/trees/mozilla-central/js/src/jit/arm64/vixl/MozSimulator-vixl.cpp:239
#2  0x00005555578cf5c3 in vixl::Simulator::Run (this=0x7ffff6a35100) at /home/skygentoo/trees/mozilla-central/js/src/jit/arm64/vixl/Simulator-vixl.cpp:70
#3  0x00005555578ccde0 in vixl::Simulator::RunFrom (this=0x7ffff6a35100, first=<optimized out>) at /home/skygentoo/trees/mozilla-central/js/src/jit/arm64/vixl/Simulator-vixl.cpp:78
#4  vixl::Simulator::call (this=0x7ffff6a35100, entry=<optimized out>, argument_count=<optimized out>) at /home/skygentoo/trees/mozilla-central/js/src/jit/arm64/vixl/MozSimulator-vixl.cpp:327
#5  0x0000555557d4b6bf in js::wasm::Instance::callExport (this=0x7ffff5583300, cx=0x7ffff6a2d000, funcIndex=0, args=..., level=js::wasm::CoercionLevel::Spec) at /home/skygentoo/trees/mozilla-central/js/src/wasm/WasmInstance.cpp:2197
#6  0x0000555557d8c945 in WasmCall (cx=cx@entry=0x7ffff6a2d000, argc=<optimized out>, vp=<optimized out>) at /home/skygentoo/trees/mozilla-central/js/src/wasm/WasmJS.cpp:2176
#7  0x0000555556c4edc0 in CallJSNative (cx=cx@entry=0x7ffff6a2d000, native=native@entry=0x555557d8c7e0 <WasmCall(JSContext*, unsigned int, JS::Value*)>, reason=<optimized out>, reason@entry=js::CallReason::Call, args=...) at /home/skygentoo/trees/mozilla-central/js/src/vm/Interpreter.cpp:421
#8  0x0000555556c40bd8 in js::InternalCallOrConstruct (cx=0x7ffff6a2d000, cx@entry=0x7ffff6af8e98, args=..., construct=construct@entry=js::NO_CONSTRUCT, reason=(js::CallReason::Getter | js::CallReason::Setter | unknown: 0xf7c86720), reason@entry=js::CallReason::Call) at /home/skygentoo/trees/mozilla-central/js/src/vm/Interpreter.cpp:508
#9  0x0000555556c4181e in InternalCall (cx=<optimized out>, args=..., reason=reason@entry=js::CallReason::Call) at /home/skygentoo/trees/mozilla-central/js/src/vm/Interpreter.cpp:575
#10 0x0000555556c41789 in js::CallFromStack (cx=0x7ffff7c87a60 <_IO_stdfile_2_lock>, args=...) at /home/skygentoo/trees/mozilla-central/js/src/vm/Interpreter.cpp:579
#11 0x0000555557668199 in js::jit::DoCallFallback (cx=0x7ffff7c87a60 <_IO_stdfile_2_lock>, frame=<optimized out>, stub=0x7ffff6af8e98, argc=<optimized out>, vp=0x7ffff687fa08, res=...) at /home/skygentoo/trees/mozilla-central/js/src/jit/BaselineIC.cpp:1580
#12 0x00005555578ce4d4 in vixl::Simulator::VisitCallRedirection (this=this@entry=0x7ffff6a35100, instr=<optimized out>, instr@entry=0x7ffff6a42c68) at /home/skygentoo/trees/mozilla-central/js/src/jit/arm64/vixl/MozSimulator-vixl.cpp:739
#13 0x00005555578cd090 in vixl::Simulator::VisitException (this=0x7ffff6a35100, instr=0x7ffff6a42c68) at /home/skygentoo/trees/mozilla-central/js/src/jit/arm64/vixl/MozSimulator-vixl.cpp:447
#14 0x000055555787ae2d in vixl::Decoder::VisitException (this=<optimized out>, instr=0x7ffff6a42c68) at /home/skygentoo/trees/mozilla-central/js/src/jit/arm64/vixl/Decoder-vixl.cpp:876
#15 0x00005555578cc192 in vixl::CachingDecoder::Decode (this=0x7ffff6a03df0, instr=0x7ffff6a42c68) at /home/skygentoo/trees/mozilla-central/js/src/jit/arm64/vixl/MozSimulator-vixl.cpp:1127
#16 0x00005555578cbbfc in vixl::Simulator::ExecuteInstruction (this=this@entry=0x7ffff6a35100) at /home/skygentoo/trees/mozilla-central/js/src/jit/arm64/vixl/MozSimulator-vixl.cpp:239
#17 0x00005555578cf5c3 in vixl::Simulator::Run (this=0x7ffff6a35100) at /home/skygentoo/trees/mozilla-central/js/src/jit/arm64/vixl/Simulator-vixl.cpp:70
#18 0x00005555578ccde0 in vixl::Simulator::RunFrom (this=0x7ffff6a35100, first=<optimized out>) at /home/skygentoo/trees/mozilla-central/js/src/jit/arm64/vixl/Simulator-vixl.cpp:78
#19 vixl::Simulator::call (this=0x7ffff6a35100, entry=<optimized out>, argument_count=<optimized out>) at /home/skygentoo/trees/mozilla-central/js/src/jit/arm64/vixl/MozSimulator-vixl.cpp:327
#20 0x0000555557b0e009 in EnterJit (cx=0x7ffff6a2d000, state=..., code=<optimized out>) at /home/skygentoo/trees/mozilla-central/js/src/jit/Jit.cpp:107
#21 js::jit::MaybeEnterJit (cx=cx@entry=0x7ffff6a2d000, state=...) at /home/skygentoo/trees/mozilla-central/js/src/jit/Jit.cpp:205
#22 0x0000555556c2dc81 in js::RunScript (cx=cx@entry=0x7ffff6a2d000, state=...) at /home/skygentoo/trees/mozilla-central/js/src/vm/Interpreter.cpp:380
#23 0x0000555556c40b14 in js::InternalCallOrConstruct (cx=0x7ffff6a2d000, cx@entry=0x7ffff6af8d48, args=..., construct=construct@entry=js::NO_CONSTRUCT, reason=<optimized out>, reason@entry=js::CallReason::Getter) at /home/skygentoo/trees/mozilla-central/js/src/vm/Interpreter.cpp:540
#24 0x0000555556c4181e in InternalCall (cx=<optimized out>, args=..., reason=reason@entry=js::CallReason::Call) at /home/skygentoo/trees/mozilla-central/js/src/vm/Interpreter.cpp:575
#25 0x0000555556c41789 in js::CallFromStack (cx=0x7ffff7c87a60 <_IO_stdfile_2_lock>, args=...) at /home/skygentoo/trees/mozilla-central/js/src/vm/Interpreter.cpp:579
#26 0x0000555557668199 in js::jit::DoCallFallback (cx=0x7ffff7c87a60 <_IO_stdfile_2_lock>, frame=<optimized out>, stub=0x7ffff6af8d48, argc=<optimized out>, vp=0x7ffff687fbe8, res=...) at /home/skygentoo/trees/mozilla-central/js/src/jit/BaselineIC.cpp:1580
#27 0x00005555578ce4d4 in vixl::Simulator::VisitCallRedirection (this=this@entry=0x7ffff6a35100, instr=<optimized out>, instr@entry=0x7ffff6a42c68) at /home/skygentoo/trees/mozilla-central/js/src/jit/arm64/vixl/MozSimulator-vixl.cpp:739
#28 0x00005555578cd090 in vixl::Simulator::VisitException (this=0x7ffff6a35100, instr=0x7ffff6a42c68) at /home/skygentoo/trees/mozilla-central/js/src/jit/arm64/vixl/MozSimulator-vixl.cpp:447
#29 0x000055555787ae2d in vixl::Decoder::VisitException (this=<optimized out>, instr=0x7ffff6a42c68) at /home/skygentoo/trees/mozilla-central/js/src/jit/arm64/vixl/Decoder-vixl.cpp:876
#30 0x00005555578cc192 in vixl::CachingDecoder::Decode (this=0x7ffff6a03df0, instr=0x7ffff6a42c68) at /home/skygentoo/trees/mozilla-central/js/src/jit/arm64/vixl/MozSimulator-vixl.cpp:1127
#31 0x00005555578cbbfc in vixl::Simulator::ExecuteInstruction (this=this@entry=0x7ffff6a35100) at /home/skygentoo/trees/mozilla-central/js/src/jit/arm64/vixl/MozSimulator-vixl.cpp:239
#32 0x00005555578cf5c3 in vixl::Simulator::Run (this=0x7ffff6a35100) at /home/skygentoo/trees/mozilla-central/js/src/jit/arm64/vixl/Simulator-vixl.cpp:70
#33 0x00005555578ccde0 in vixl::Simulator::RunFrom (this=0x7ffff6a35100, first=<optimized out>) at /home/skygentoo/trees/mozilla-central/js/src/jit/arm64/vixl/Simulator-vixl.cpp:78
#34 vixl::Simulator::call (this=0x7ffff6a35100, entry=<optimized out>, argument_count=<optimized out>) at /home/skygentoo/trees/mozilla-central/js/src/jit/arm64/vixl/MozSimulator-vixl.cpp:327
#35 0x0000555557b0e009 in EnterJit (cx=0x7ffff6a2d000, state=..., code=<optimized out>) at /home/skygentoo/trees/mozilla-central/js/src/jit/Jit.cpp:107
#36 js::jit::MaybeEnterJit (cx=cx@entry=0x7ffff6a2d000, state=...) at /home/skygentoo/trees/mozilla-central/js/src/jit/Jit.cpp:205
#37 0x0000555556c2dc81 in js::RunScript (cx=cx@entry=0x7ffff6a2d000, state=...) at /home/skygentoo/trees/mozilla-central/js/src/vm/Interpreter.cpp:380
#38 0x0000555556c40b14 in js::InternalCallOrConstruct (cx=0x7ffff6a2d000, cx@entry=0x0, args=..., construct=construct@entry=js::NO_CONSTRUCT, reason=<optimized out>, reason@entry=(unknown: 0xffffb058)) at /home/skygentoo/trees/mozilla-central/js/src/vm/Interpreter.cpp:540
#39 0x0000555556c4181e in InternalCall (cx=cx@entry=0x7ffff6a2d000, args=..., reason=reason@entry=js::CallReason::Call) at /home/skygentoo/trees/mozilla-central/js/src/vm/Interpreter.cpp:575
#40 0x0000555556c41a13 in js::Call (cx=cx@entry=0x7ffff6a2d000, fval=..., fval@entry=..., thisv=..., thisv@entry=..., args=..., rval=rval@entry=..., reason=reason@entry=js::CallReason::Call) at /home/skygentoo/trees/mozilla-central/js/src/vm/Interpreter.cpp:606
#41 0x0000555556d071dd in js::Call (cx=0x7ffff6a2d000, fval=..., thisv=..., arg0=..., rval=...) at /home/skygentoo/trees/mozilla-central/js/src/vm/Interpreter.h:105
#42 0x0000555556f0a491 in PromiseReactionJob (cx=0x7ffff7c87a60 <_IO_stdfile_2_lock>, cx@entry=0x7ffff6a2d000, argc=<optimized out>, vp=<optimized out>) at /home/skygentoo/trees/mozilla-central/js/src/builtin/Promise.cpp:2241
#43 0x0000555556c4edc0 in CallJSNative (cx=cx@entry=0x7ffff6a2d000, native=native@entry=0x555556f09330 <PromiseReactionJob(JSContext*, unsigned int, JS::Value*)>, reason=<optimized out>, reason@entry=js::CallReason::Call, args=...) at /home/skygentoo/trees/mozilla-central/js/src/vm/Interpreter.cpp:421
#44 0x0000555556c40bd8 in js::InternalCallOrConstruct (cx=0x7ffff6a2d000, cx@entry=0x0, args=..., construct=construct@entry=js::NO_CONSTRUCT, reason=(js::CallReason::Getter | js::CallReason::Setter | unknown: 0xf7c86720), reason@entry=(unknown: 0xffffb438)) at /home/skygentoo/trees/mozilla-central/js/src/vm/Interpreter.cpp:508
#45 0x0000555556c4181e in InternalCall (cx=cx@entry=0x7ffff6a2d000, args=..., reason=reason@entry=js::CallReason::Call) at /home/skygentoo/trees/mozilla-central/js/src/vm/Interpreter.cpp:575
#46 0x0000555556c41a13 in js::Call (cx=cx@entry=0x7ffff6a2d000, fval=..., thisv=..., args=..., rval=rval@entry=..., reason=reason@entry=js::CallReason::Call) at /home/skygentoo/trees/mozilla-central/js/src/vm/Interpreter.cpp:606
#47 0x0000555556d9bdb2 in JS::Call (cx=0x7ffff7c87a60 <_IO_stdfile_2_lock>, cx@entry=0x7ffff6a2d000, thisv=..., fval=..., args=..., rval=..., rval@entry=...) at /home/skygentoo/trees/mozilla-central/js/src/vm/CallAndConstruct.cpp:117
#48 0x0000555556e62c0e in JS::Call (cx=0x7ffff6a2d000, thisv=..., funObj=..., args=..., rval=...) at /home/skygentoo/shell-cache/js-dbg-64-armsim64-linux-x86_64-65e579f52525/objdir-js/dist/include/js/CallAndConstruct.h:110
#49 js::InternalJobQueue::runJobs (this=0x7ffff6a40d00, cx=<optimized out>) at /home/skygentoo/trees/mozilla-central/js/src/vm/JSContext.cpp:849
#50 0x0000555556e62666 in js::RunJobs (cx=cx@entry=0x7ffff6a2d000) at /home/skygentoo/trees/mozilla-central/js/src/vm/JSContext.cpp:786
#51 0x0000555556b44ff8 in RunShellJobs (cx=cx@entry=0x7ffff6a2d000) at /home/skygentoo/trees/mozilla-central/js/src/shell/js.cpp:1166
#52 0x0000555556b5ebc2 in DrainJobQueue (cx=0x7ffff6a2d000, argc=<optimized out>, vp=<optimized out>) at /home/skygentoo/trees/mozilla-central/js/src/shell/js.cpp:1188
#53 0x00005555578ce2f5 in vixl::Simulator::VisitCallRedirection (this=this@entry=0x7ffff6a35100, instr=<optimized out>, instr@entry=0x7ffff553b588) at /home/skygentoo/trees/mozilla-central/js/src/jit/arm64/vixl/MozSimulator-vixl.cpp:724
#54 0x00005555578cd090 in vixl::Simulator::VisitException (this=0x7ffff6a35100, instr=0x7ffff553b588) at /home/skygentoo/trees/mozilla-central/js/src/jit/arm64/vixl/MozSimulator-vixl.cpp:447
#55 0x000055555787ae2d in vixl::Decoder::VisitException (this=<optimized out>, instr=0x7ffff553b588) at /home/skygentoo/trees/mozilla-central/js/src/jit/arm64/vixl/Decoder-vixl.cpp:876
#56 0x00005555578cc192 in vixl::CachingDecoder::Decode (this=0x7ffff6a03df0, instr=0x7ffff553b588) at /home/skygentoo/trees/mozilla-central/js/src/jit/arm64/vixl/MozSimulator-vixl.cpp:1127
#57 0x00005555578cbbfc in vixl::Simulator::ExecuteInstruction (this=this@entry=0x7ffff6a35100) at /home/skygentoo/trees/mozilla-central/js/src/jit/arm64/vixl/MozSimulator-vixl.cpp:239
#58 0x00005555578cf5c3 in vixl::Simulator::Run (this=0x7ffff6a35100) at /home/skygentoo/trees/mozilla-central/js/src/jit/arm64/vixl/Simulator-vixl.cpp:70
#59 0x00005555578ccde0 in vixl::Simulator::RunFrom (this=0x7ffff6a35100, first=<optimized out>) at /home/skygentoo/trees/mozilla-central/js/src/jit/arm64/vixl/Simulator-vixl.cpp:78
#60 vixl::Simulator::call (this=0x7ffff6a35100, entry=<optimized out>, argument_count=<optimized out>) at /home/skygentoo/trees/mozilla-central/js/src/jit/arm64/vixl/MozSimulator-vixl.cpp:327
#61 0x0000555557b0e009 in EnterJit (cx=0x7ffff6a2d000, state=..., code=<optimized out>) at /home/skygentoo/trees/mozilla-central/js/src/jit/Jit.cpp:107
#62 js::jit::MaybeEnterJit (cx=cx@entry=0x7ffff6a2d000, state=...) at /home/skygentoo/trees/mozilla-central/js/src/jit/Jit.cpp:205
#63 0x0000555556c2dc81 in js::RunScript (cx=cx@entry=0x7ffff6a2d000, state=...) at /home/skygentoo/trees/mozilla-central/js/src/vm/Interpreter.cpp:380
#64 0x0000555556c43324 in js::ExecuteKernel (cx=cx@entry=0x7ffff6a2d000, script=script@entry=..., envChainArg=envChainArg@entry=..., evalInFrame=evalInFrame@entry=..., result=result@entry=...) at /home/skygentoo/trees/mozilla-central/js/src/vm/Interpreter.cpp:782
#65 0x0000555556c4370c in js::Execute (cx=cx@entry=0x7ffff6a2d000, script=..., envChain=..., rval=..., rval@entry=...) at /home/skygentoo/trees/mozilla-central/js/src/vm/Interpreter.cpp:814
#66 0x0000555556dbfe6f in ExecuteScript (cx=cx@entry=0x7ffff6a2d000, envChain=..., script=..., rval=rval@entry=...) at /home/skygentoo/trees/mozilla-central/js/src/vm/CompilationAndEvaluation.cpp:509
#67 0x0000555556dc0098 in JS_ExecuteScript (cx=cx@entry=0x7ffff6a2d000, scriptArg=scriptArg@entry=...) at /home/skygentoo/trees/mozilla-central/js/src/vm/CompilationAndEvaluation.cpp:533
#68 0x0000555556b7348a in RunFile (cx=cx@entry=0x7ffff6a2d000, filename=0x7fffffffde57 "114.js", filename@entry=0x7ffff7864020 "\230$\255\373\344\344\344", <incomplete sequence \344>, file=file@entry=0x7ffff7864020, compileMethod=<optimized out>, compileMethod@entry=CompileUtf8::DontInflate, compileOnly=false) at /home/skygentoo/trees/mozilla-central/js/src/shell/js.cpp:1069
#69 0x0000555556b72b4f in Process (cx=cx@entry=0x7ffff6a2d000, filename=<optimized out>, forceTTY=false, kind=kind@entry=FileScript) at /home/skygentoo/trees/mozilla-central/js/src/shell/js.cpp:1659
#70 0x0000555556b38e97 in ProcessArgs (cx=0x7ffff6a2d000, op=<optimized out>) at /home/skygentoo/trees/mozilla-central/js/src/shell/js.cpp:10964
#71 Shell (cx=0x7ffff6a2d000, op=op@entry=0x7fffffffd6d8) at /home/skygentoo/trees/mozilla-central/js/src/shell/js.cpp:11703
#72 0x0000555556b31d6a in main (argc=<optimized out>, argv=<optimized out>) at /home/skygentoo/trees/mozilla-central/js/src/shell/js.cpp:12825
(gdb)

Run with --fuzzing-safe --ion-offthread-compile=off --ion-eager, compile with AR=ar sh ./configure --enable-simulator=arm64 --enable-debug --with-ccache --enable-gczeal --enable-debug-symbols --disable-bootstrap --disable-tests, tested on m-c rev 65e579f52525.

Bisection result coming up...

Note that this may be slightly intermittent. Not sure if this is s-s.

Flags: sec-bounty?

Might any of the following regressions (due to build failures), or possibly bug 1661016, be the potential cause?

Unsure who to needinfo specifically, so setting Jan as a start.

Flags: needinfo?(jdemooij)
Group: core-security → javascript-core-security

This fails in the instruction cache coherency verifier that's part of the ARM64 simulator (added in bug 1441436). It could be an actual bug or an issue with the verification code. Looking at the test, I wonder if this is related to Wasm machine code serialization.

Flags: needinfo?(jdemooij) → needinfo?(rhunt)
Severity: -- → S4
Flags: needinfo?(nicolas.b.pierron)
Priority: -- → P2

We're not flushing the icache correctly when deserializing code on a different thread from the one we'll be executing on.

Assignee: nobody → rhunt
Flags: needinfo?(rhunt)
Flags: needinfo?(nicolas.b.pierron)

(In reply to Ryan Hunt [:rhunt] from comment #4)

We're not flushing the icache correctly when deserializing code on a different thread from the one we'll be executing on.

Yes, this is a requirement that this should be done on the core which is executing the code, as ARM64 can have big/small cores which are not in-sync in terms of icache invalidation. The executing core should refresh the memory which is mapped as executable, such that the icache does not pick the previously mapped executable content.

The Simulator is extra pessimistic in this case, due to inability to fully emulate the icache, but this only make this issue unlikely instead of impossible.

Fixing this is simple for the serialized module case, but I'm not really sure why this works for the case where we compile a module on one thread and then postMessage the module to another thread.

As far as I can see, we do FlushIcache::LocalThreadOnly for the first tier of a module. If that's the only tier (or the race is won) and the module is posted to another thread, I don't see any synchronization that would prevent this issue. Unless there's some other mechanism to guarantee correctness here.

There are some WPT's for this situation that theoretically should be hit, I wonder if we just don't run them or run them with the ARM64 simulator.

When compiling, we do the following:
https://searchfox.org/mozilla-central/source/js/src/wasm/WasmCode.cpp#342-346

  // Optimized compilation finishes on a background thread, so we must make sure
  // to flush the icaches of all the executing threads.
  FlushICacheSpec flushIcacheSpec = isTier2 == IsTier2::Tier2
                                        ? FlushICacheSpec::AllThreads
                                        : FlushICacheSpec::LocalThreadOnly;

Which set the codeIsThreadLocal argument to true in CPU::EnsureIAndDCacheCoherency, which calls a membarrier with the intent of flushing all instruction caches.

(In reply to Ryan Hunt [:rhunt] from comment #6)

Fixing this is simple for the serialized module case, but I'm not really sure why this works for the case where we compile a module on one thread and then postMessage the module to another thread.

The case where we share modules across workers might not exists in the JS shell, and thus we might not have caught it before.

There are some WPT's for this situation that theoretically should be hit, I wonder if we just don't run them or run them with the ARM64 simulator.

Probably not, the simulator has a huge cost and we limit our usage of it to JS shell test cases.

Okay, I was able to modify the test case to confirm that this is not related to just code caching. I can also trigger it using our normal background thread compiling code. I'm experimenting with using the evalInWorker builtin to trigger this, but it's tricky.

Nicolas, do you know the relative difference in cost for FlushICacheSpec::AllThreads, versus LocalThreadOnly?

I can write a fix for this that flushes the local thread's icache when it receives a module from another thread (either from a background compile, or via postMessage), but I want to make sure it's worth it versus just always performing an all threads flush when a module is finished compiling.

Flags: needinfo?(nicolas.b.pierron)

(In reply to Ryan Hunt [:rhunt] from comment #10)

Nicolas, do you know the relative difference in cost for FlushICacheSpec::AllThreads, versus LocalThreadOnly?

I can write a fix for this that flushes the local thread's icache when it receives a module from another thread (either from a background compile, or via postMessage), but I want to make sure it's worth it versus just always performing an all threads flush when a module is finished compiling.

So far, I've never saw any bug report mentioning the FlushICacheSpec::AllThreads as being an issue.
On the other hand detecting it might be hard without system-wide profiling tools to detect these kind of effects.

I would not worry much about flushing all threads i-cache today, knowing that spidermonkey alone does not fit in the instruction cache.

Flags: needinfo?(nicolas.b.pierron)

Wasm code may be used from multiple threads in several ways:

  1. Tier2 code is committed from a background thread
  2. Modules may be compiled/deserialized on a background thread
  3. Modules may be postMessage'ed to different threads

The ARM64 I&D-cache is not coherent and requires us to perform
synchronization when creating new code.

There has been previous discussion about how to handle
this in wasm:

Lars/Luke originally discussed this with someone at ARM
and decided that our existing cache synchronization for
compiling, our allocating of fresh code pages, and use
of mprotect was sufficient.

Later, Benjamin ran into a cache simulator failure and
researched this again and came to the conclusion that
we need to force synchronization on all threads when
tiering.

We now have a test case that can cause the cache simulator
to fail again, and it's not clear if this is a real bug
or the simulator being pessimistic compared to real
hardware.

This commit changes all module compilations to trigger
the membarrier on all threads, instead of just the local
thread. This is very conservative and to be extra safe
in the presence of poor documentation and possible
hardware errata.

I am conservatively rating this sec-low. The commit message above explains the issue in some detail. I'm removing that from the commit and just letting it live in this bug.

The gist is that on ARM64 we need to keep the instruction cache coherent with the data cache when creating wasm code. If we don't do this correctly, previously compiled wasm code that exists in a stale I-cache may execute. When we are simulating ARM64 code we have a cache coherence simulation that attempts to catch issues here, but it's not a perfect simulator of 'real hardware' as there are potential synchronization points that exist in real hardware, but not the simulator. So a failure in the simulator doesn't necessarily mean we're failing on real hardware. At the same time, real hardware has been known to have errata and so it's good to be conservative if the performance penalty is not extreme.

The above linked two bugs show the reasons we have for thinking that our existing synchronization is sufficient for real hardware, and AFAICS are still valid. But just in the case that we are missing something, this commit adds an extra sync point to work around this debug assertion and to conservatively cover misbehaving hardware.

Keywords: sec-low

Landed: https://hg.mozilla.org/integration/autoland/rev/9d028d37a9dbc1e24f5c4f314519b2df84dab342

Backed out for failing own test:

https://hg.mozilla.org/integration/autoland/rev/9d028d37a9dbc1e24f5c4f314519b2df84dab342

Push with failures
Failure log

TEST-UNEXPECTED-FAIL | js/src/jit-test/tests/wasm/regress/bug1777604/test.js | /builds/worker/checkouts/gecko/js/src/jit-test/tests/wasm/regress/bug1777604/test.js:11:25 Error: WebAssembly.compileStreaming not supported with --no-threads (code 3, args "--ion-offthread-compile=off --ion-eager --ion-eager --ion-offthread-compile=off --ion-check-range-analysis --ion-extra-checks --no-sse3 --no-threads") [0.1 s]

Flags: needinfo?(rhunt)

Needs an extra guard for when streaming compilation is disabled.

Flags: needinfo?(rhunt)

(In reply to Sebastian Hengst [:aryx] (needinfo me if it's about an intermittent or backout) from comment #14)

Landed: https://hg.mozilla.org/integration/autoland/rev/9d028d37a9dbc1e24f5c4f314519b2df84dab342

Backed out for failing own test:

Whoa …!!!

Daniel, what is the procedure in this case? I do not think the test case would help much in creating an exploit, as it is relying on the simulator instrumentation. However this is definitely a spot light on the issue.

The issue is that the instruction cache might be stale, and that one attacker might use the stale content of the instruction cache as a mean to do random code execution, as the new code provide potentially a new entry point into a random position of code which was properly freed. However, being able to make use of this attack implies that the instruction cache was not trashed by all the code present in the rest of SpiderMonkey/Firefox which sounds very unliekly.

Flags: needinfo?(dveditz)

(In reply to Nicolas B. Pierron [:nbp] from comment #16)

The issue is that the instruction cache might be stale, and that one attacker might use the stale content of the instruction cache as a mean to do random code execution, as the new code provide potentially a new entry point into a random position of code which was properly freed. However, being able to make use of this attack implies that the instruction cache was not trashed by all the code present in the rest of SpiderMonkey/Firefox which sounds very unliekly.

In addition to the i-cache needing to not be trashed, as stated above we have other reasons to think that our synchronization is sufficient for real hardware and this commit is just trying to be conservative for if there is something we have missed and to workaround the simulator. So the sec-low rating is to just be extra careful. The test was included so that we didn't have to circle back. It only triggers an assertion failure in the simulator, no failure happens on real hardware.

The above linked two bugs show the reasons we have for thinking that our existing synchronization is sufficient for real hardware, and AFAICS are still valid. But just in the case that we are missing something, this commit adds an extra sync point to work around this debug assertion and to conservatively cover misbehaving hardware.

Split the test out, will land it separately.

Regressions: 1779844
Regressions: 1779840
Regressions: 1779839
Group: javascript-core-security → core-security-release
Status: NEW → RESOLVED
Closed: 2 years ago
Flags: in-testsuite?
Resolution: --- → FIXED
Target Milestone: --- → 104 Branch
QA Whiteboard: [post-critsmash-triage]
Flags: qe-verify-
Regressions: 1779968
Status: RESOLVED → REOPENED
Flags: needinfo?(rhunt)
Resolution: FIXED → ---
Target Milestone: 104 Branch → ---

Hmm. Looking at the crash notes for the linked bug, it looks like the linux version is below the minimum required for membarrier syscall support. What's odd, is that the patch that landed here just caused us to try to call membarrier more frequently. Before we would only do it when tiering wasm code. So I'm surprised we haven't run into this before. Looking into this more.

Flags: needinfo?(rhunt)

It looks like we need to support android devices that don't have membarrier (MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE?) support. This is likely a pre-existing bug made more frequent. Although I haven't found any crash reports to confirm this.

(In reply to Kevin Brosnan [:kbrosnan] from comment #24)

There are a three crashes on Android 102 possibly all from the same install which would not have this fix. https://crash-stats.mozilla.org/signature/?major_version=%3C%3D103&signature=vixl%3A%3ACPU%3A%3AEnsureIAndDCacheCoherency&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&_columns=install_time&_columns=startup_crash&_sort=-date&page=1

There are 8k crashes on Nightly Android.

Taking a look at those three, it looks like they are all unrelated.

Looking into this deeper it turns out this was not a pre-existing issue as the previous code relied on a IsICacheSafe check to gate this. This would turn off tiering on linux kernels below 4.16, avoiding this issue.

Now if we start trying to use membarrier on all wasm compilations (not just tiering), we either have to find an alternative to membarrier or disable wasm on these kernel versions (likely not this option).

Looking at other VM's I can get the source code for (V8, dart, dotnetcore/mono) I don't see any of them using the membarrier syscall for this purpose. They all 'flush the icache', but I can't find any synchronization beyond that.

(In reply to Nicolas B. Pierron [:nbp] from comment #16)

Backed out for failing own test:

Daniel, what is the procedure in this case? I do not think the test case would help much in creating an exploit, as it is relying on the simulator instrumentation. However this is definitely a spot light on the issue.

That's a good reason to keep pushing to fix this somewhat quickly, but if the sec-low rating is accurate it's not too troubling. There's no well-defined "procedure", but we do need to ask ourselves whether we just zero-day'd ourselves and if so how badly. In this case it doesn't seem terrible.

Flags: needinfo?(dveditz)

The ic ivau instruction performs a broadcasted invalidation of
the instruction cache for the address range provided. The simulator
should perform this flush for all active simulators to emulate this
behavior. Currently this is only done when a membarrier is called,
which is too conservative.

Attachment #9285227 - Attachment is obsolete: true

ARM and ARM64 require a 'context synchronization operation' to occur before
running JIT'ed code. This operation forces any stale data in the execution
pipeline to be reloaded. As we globally flush the icache after compiling code,
the reload will get the newly JIT'ed code.

A 'context synchronization operation' can be performed several ways. Interrupts
and processor exceptions are the most common events. Executing an isb
instruction can also accomplish this. There is no instruction that will broadcast
context synchronization to all cores.

Newly compiled wasm code can be used on a different thread in several ways:

  1. Compiled on helper thread and received on JS thread
  2. Passed from one JS thread to another JS thread
  3. Patched into the jump table while finishing tier2 code

We currently don't do anything special for (1)/(2) and rely on non-specified
behavior (or luck) to avoid this issue. As far as I can tell, no other VM
(V8, Dart, Mono) does anything for this situation, so we're likely not
vulnerable. We should still fix this though.

For (3), we rely on the inter-processor-interrupt performed by the membarrier
syscall to achieve this when tiering.

This commit adds a manually placed isb instruction to conservatively
handle (1) and (2). The most convenient place for this is when we are
constructing a WasmModuleObject, as that is used on all paths where a
module may be received from another thread.

JS does not require this, as while it may be compiled on another thread,
it will be linked on the thread it will run on, which will trigger an
icache flush which will flush the execution context.

Depends on D152230

The membarrier is really used for a 'FlushExecutionContextForAllThreads`
operation, it doesn't flush the icache for all threads. Moving it to a
separate function makes this clear and allows wasm to call it just once
before we commit our tier2 code.

Depends on D152304

Okay, I believe we've figured this out now. Here's the important points:

  1. We have a mismatch between our ARM64 cache simulator and real hardware that causes the original testcase to fail in an assertion
    - The issue is that the simulator will only flush the icache of all threads when flags = AllThreads. But real hardware will in-fact clear the icache for all threads even if flags = AllThreads.
    - Fixing this mismatch, fixes the original test case.
  2. flags = AllThreads triggered a membarrier call which was described as flushing the icache for all threads. It's actually more subtle in triggering a 'context synchronization' on all threads, which will clear out any stale data in in-progress instruction pipelines. The ic ivau instruction we execute before the membarrier will flush the icache of all threads. We should make this clear in the code.
  3. We currently just use the membarrier to do the global 'context sync' when we're patching in tier2 code. This makes sense as we don't have any other good way to interrupt code to perform this otherwise.
  4. From discussion with ARM engineer, Anton Kirilov, we should probably execute a 'context sync' using an isb instruction when we receive a module from another thread. It's possible this is not required due to some other operation consistently triggering a 'context sync', but it's fragile if we don't explicitly do it. I checked V8, Dart, Mono and couldn't find this being done in other VM's so we're not far afield in this regard.

In summary, I think we should still keep this sec-low due to (4), although I think we're very likely to not be vulnerable to a code UAF through this due to the very small window of the instruction pipeline, and the likelyhood that something else in the architecture keeps this safe enough that other VM's don't worry about it.

I'm linking to past discussions of interest for future reference, because digging this all up was challenging.

  1. https://bytecodealliance.zulipchat.com/#narrow/stream/217126-wasmtime/topic/Code.20generation.20and.20CPU.20cache.20maintenance
  2. https://bytecodealliance.zulipchat.com/#narrow/stream/217126-wasmtime/topic/Using.20membarrier.20when.20finalizing.20code
  3. https://github.com/bytecodealliance/wasmtime/pull/3426
  4. https://bugzilla.mozilla.org/show_bug.cgi?id=1529933
  5. https://bugzilla.mozilla.org/show_bug.cgi?id=1661016
Attachment #9286144 - Attachment is obsolete: true

The delay in landing was caused by difficulty getting the platform ifdef's and includes exactly right. The two commits are queued up now. I've removed their commit descriptions, so I'm posting them here for posterity.

Bug 1777604 - wasm: Perform a pipeline flush while creating a module object. r?nbp

ARM and ARM64 require a 'context synchronization operation' to occur before
running JIT'ed code. This operation forces any stale data in the execution
pipeline to be reloaded. As we globally flush the icache after compiling code,
the reload will get the newly JIT'ed code.

A 'context synchronization operation' can be performed several ways. Interrupts
and processor exceptions are the most common events. Executing an isb
instruction can also accomplish this. There is no instruction that will broadcast
context synchronization to all cores.

Newly compiled wasm code can be used on a different thread in several ways:

  1. Compiled on helper thread and received on JS thread
  2. Passed from one JS thread to another JS thread
  3. Patched into the jump table while finishing tier2 code

We currently don't do anything special for (1)/(2) and rely on non-specified
behavior (or luck) to avoid this issue. As far as I can tell, no other VM
(V8, Dart, Mono) does anything for this situation, so we're likely not
vulnerable. We should still fix this though.

For (3), we rely on the inter-processor-interrupt performed by the membarrier
syscall to achieve this when tiering.

This commit adds a manually placed isb instruction to conservatively
handle (1) and (2). The most convenient place for this is when we are
constructing a WasmModuleObject, as that is used on all paths where a
module may be received from another thread.

JS does not require this, as while it may be compiled on another thread,
it will be linked on the thread it will run on, which will trigger an
icache flush which will flush the execution context.

Differential Revision: https://phabricator.services.mozilla.com/D152304

Bug 1777604 - wasm: Move membarrier call to separate functions. r?nbp

The membarrier is really used for a 'FlushExecutionContextForAllThreads`
operation, it doesn't flush the icache for all threads. Moving it to a
separate function makes this clear and allows wasm to call it just once
before we commit our tier2 code.

Differential Revision: https://phabricator.services.mozilla.com/D152305

Status: REOPENED → RESOLVED
Closed: 2 years ago2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 105 Branch
Flags: sec-bounty? → sec-bounty-

Hi Ryan, are we feeling good about an ESR102 uplift on this now? It's baked for about a month with no known regressions.

Flags: needinfo?(rhunt)

I think this should be okay to uplift as long as it rebases and builds cleanly. Let me know if you need help resolving conflicts.

Flags: needinfo?(rhunt)

It grafts fine, I just need the approval request :)

Flags: needinfo?(rhunt)

Comment on attachment 9286299 [details]
Bug 1777604 - wasm: Move membarrier call to separate functions. r?nbp

ESR Uplift Approval Request

  • If this is not a sec:{high,crit} bug, please state case for ESR consideration: These patches fix a potential corner case that could result in crashes on ARM64 hardware when running WebAssembly code.
  • User impact if declined: Slightly increased chance of crashes when using WebAssembly code.
  • Fix Landed on Version: 105
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): These patches increase the frequency we call a clearing instruction on ARM64, but doesn't introduce a new operation. They have been in nightly for a month with no regressions identified.
Flags: needinfo?(rhunt)
Attachment #9286299 - Flags: approval-mozilla-esr102?
Attachment #9286298 - Flags: approval-mozilla-esr102?

Comment on attachment 9286298 [details]
Bug 1777604 - wasm: Perform a pipeline flush while creating a module object. r?nbp

Approved for 102.3esr.

Attachment #9286298 - Flags: approval-mozilla-esr102? → approval-mozilla-esr102+
Attachment #9286299 - Flags: approval-mozilla-esr102? → approval-mozilla-esr102+
Whiteboard: [adv-main105+][adv-esr102.3+]
Attached file advisory.txt
Alias: CVE-2022-40957
Flags: sec-bounty-hof+
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: