Closed Bug 2042279 Opened 10 days ago Closed 6 days ago

WebAssembly.compileStreaming hangs a wasm helper thread forever when the code section is truncated mid-stream

Categories

(Core :: JavaScript: WebAssembly, defect, P2)

defect

Tracking

()

RESOLVED FIXED
153 Branch
Tracking Status
firefox-esr115 --- wontfix
firefox-esr140 --- wontfix
firefox151 --- wontfix
firefox152 --- fixed
firefox153 --- fixed

People

(Reporter: jandem, Assigned: jpages)

References

(Blocks 1 open bug, Regression)

Details

(Keywords: ai-involved, regression)

Attachments

(2 files)

Summary

WebAssembly.compileStreaming / WebAssembly.instantiateStreaming permanently deadlocks a wasm compilation helper thread when the streamed module's code-section header declares a size larger than the number of bytes actually delivered before the stream ends cleanly (i.e. a malformed but complete HTTP response). The returned promise never settles, and the helper thread cannot be joined at shutdown.

Root cause

CompileStreamTask in js/src/wasm/WasmJS.cpp drives a state machine Env -> Code -> Tail -> Closed. The helper thread runs CompileStreaming() (js/src/wasm/WasmCompile.cpp:1207) whose StreamingDecoder::waitForBytes() (WasmCompile.cpp:1181-1192) blocks on the exclusiveCodeBytesEnd_ condvar and only exits when either more bytes arrive or cancelled_ is set:

bool waitForBytes(size_t numBytes) {
  numBytes = std::min(numBytes, d_.bytesRemain());
  const uint8_t* requiredEnd = d_.currentPosition() + numBytes;
  auto codeBytesEnd = codeBytesEnd_.lock();
  while (codeBytesEnd < requiredEnd) {
    if (cancelled_) { return false; }
    codeBytesEnd.wait();
  }
  return true;
}

CompileStreamTask::streamEnd() (WasmJS.cpp:5459-5486) handles Code and Tail identically — it only marks exclusiveStreamEnd_.reached = true and notifies exclusiveStreamEnd_. When the stream ends while still in Code, the helper thread is blocked on exclusiveCodeBytesEnd_ (not exclusiveStreamEnd_); neither codeBytesEnd_ advancing nor cancelled_ (i.e. streamFailed_) being set ever happens, so it waits indefinitely.

Contrast with the error path rejectAndDestroyAfterHelperThreadStarted() (WasmJS.cpp:5369-5377), which correctly sets streamFailed_ = true and notifies both condvars. A clean end-of-stream while still in Code is effectively a truncation and should be treated the same way.

The DOM consumer behaves the same as the shell harness: a normal stream close (NS_BASE_STREAM_CLOSED) calls streamEnd() (dom/fetch/FetchUtil.cpp:569), not streamError() — so a complete HTTP response carrying a malformed wasm module reaches this path from web content.

Test case

//   timeout 15 obj-shell-dbgopt/dist/bin/js --fuzzing-safe repro.js
//
// Expected (after fix): "bad: rejected CompileError: ..." printed, exit 0.
// Observed (buggy): "script end reached" prints, then the process hangs
// because the wasm helper thread is stuck in StreamingDecoder::waitForBytes
// and cannot be joined at shutdown.

function streamCompile(bytes, tag) {
  WebAssembly.compileStreaming(bytes).then(
      m => print(tag + ": compiled OK"),
      e => print(tag + ": rejected " + e));
}

// Code-section header declares size=0x0a (10) but only 3 content bytes
// follow before the stream ends (a complete but malformed module).
var bad = new Uint8Array([
  0x00,0x61,0x73,0x6d, 0x01,0x00,0x00,0x00,   // preamble
  0x01,0x04,0x01,0x60,0x00,0x00,              // type section: () -> ()
  0x03,0x02,0x01,0x00,                        // function section: func0 : type0
  0x0a,0x0a,                                  // code section, declared size = 10
  0x01,0x08,0x00                              // count=1, body0 size=8, locals=0 ... then EOF
]);

streamCompile(bad, "bad");
print("script end reached");

Locally reproduced on current main:

$ timeout 15 obj-shell-dbgopt/dist/bin/js --fuzzing-safe test.js
script end reached
EXIT=124   # killed by timeout: hung

$ # control: same module with code section size 4 matching its 4 content bytes
$ timeout 15 obj-shell-dbgopt/dist/bin/js --fuzzing-safe test.js
script end reached
ok: compiled OK
EXIT=0

The hang reproduces even though drainJobQueue() is not called — "script end reached" is printed before the process hangs at shutdown, demonstrating the leaked/blocked helper thread is independent of any job-queue draining.

Suggested fix

When streamEnd() arrives in the Code state, the declared code section is necessarily shorter than what was delivered: treat it as a truncation and reject, mirroring rejectAndDestroyAfterHelperThreadStarted(). This also notifies exclusiveCodeBytesEnd_, which unblocks waitForBytes() via the cancelled_ check.

Sketch (js/src/wasm/WasmJS.cpp, in CompileStreamTask::streamEnd):

   void streamEnd(
       JS::OptimizedEncodingListener* completeTier2Listener) override {
     switch (streamState_.lock().get()) {
       case Env: {
         BytecodeBuffer bytecode(envBytes_, nullptr, nullptr);
         module_ = CompileModule(*compileArgs_,
                                 BytecodeBufferOrSource(std::move(bytecode)),
                                 &compileError_, &warnings_, nullptr);
         setClosedAndDestroyBeforeHelperThreadStarted();
         return;
       }
-      case Code:
+      case Code:
+        // The stream ended mid-code-section: the declared code-section
+        // size was larger than the bytes actually delivered. The helper
+        // thread is blocked in StreamingDecoder::waitForBytes() on
+        // exclusiveCodeBytesEnd_, and only the streamFailed_/cancelled_
+        // path will unblock it. Treat this as a truncation error.
+        rejectAndDestroyAfterHelperThreadStarted(
+            JSMSG_WASM_COMPILE_ERROR /* or a dedicated "truncated" code */);
+        return;
       case Tail:
         // Unlock exclusiveStreamEnd_ before locking streamState_.
         {
           auto streamEnd = exclusiveStreamEnd_.lock();
           MOZ_ASSERT(!streamEnd->reached);
           streamEnd->reached = true;
           streamEnd->tailBytes = tailBytes_;
           streamEnd->completeTier2Listener = completeTier2Listener;
           streamEnd.notify_one();
         }
         setClosedAndDestroyAfterHelperThreadStarted();
         return;
       case Closed:
         MOZ_CRASH("streamEnd() in Closed state");
     }
   }

Equivalent alternatives:

  • Always set streamFailed_ and notify exclusiveCodeBytesEnd_ from the Code branch (so the streaming decoder fails the same way it does on stream error), and rely on the existing Tail/Code end-handling for the rest.
  • Give StreamingDecoder::waitForBytes() visibility into exclusiveStreamEnd_.reached and fail with a "code section truncated" CompileError when reached is set but codeBytesEnd_ < requiredEnd. This produces a clearer error message than reusing the stream-error path, at the cost of a wider API change.

Claude thinks this goes back all the way to bug 1406421 (2017), when streaming compilation was implemented.

Keywords: ai-involved
Keywords: regression
Regressed by: 1406421

Set release status flags based on info from the regressing bug 1406421

Severity: -- → S3
Priority: -- → P2

This could lead to some DoS, but this does not sounds like a security issue.

Blocks: wasm-jit-bugs
No longer blocks: wasm-lang
Assignee: nobody → jpages

I agree, this sounds like DoS but not a security issue. Jan is there any reason we can't open this? Just triple-checking before doing it.

Flags: needinfo?(jdemooij)

(In reply to Ryan Hunt [:rhunt] from comment #4)

I agree, this sounds like DoS but not a security issue. Jan is there any reason we can't open this? Just triple-checking before doing it.

I'll open it up. I sometimes file these as security bugs just to be safe.

Group: javascript-core-security
Flags: needinfo?(jdemooij)
Status: NEW → RESOLVED
Closed: 6 days ago
Resolution: --- → FIXED
Target Milestone: --- → 153 Branch

The patch landed in nightly and beta is affected.
:jpages, is this bug important enough to require an uplift?

For more information, please visit BugBot documentation.

Flags: needinfo?(jpages)
Attachment #9593588 - Flags: approval-mozilla-beta?

firefox-beta Uplift Approval Request

  • User impact if declined/Reason for urgency: This bug can cause a deadlock if an incomplete wasm module is being streamed. In this case the helper thread can become stuck waiting for the rest of the module.
  • Code covered by automated testing?: yes
  • Fix verified in Nightly?: yes
  • Needs manual QE testing?: no
  • Steps to reproduce for manual QE testing:
  • Risk associated with taking this patch: low
  • Explanation of risk level: Small patch with changes in the wasm streaming code to wake up the stuck thread as well as propagating the error.
  • String changes made/needed?: N/A
  • Is Android affected?: yes
Flags: needinfo?(jpages)
Flags: in-testsuite+
Attachment #9593588 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: