Closed Bug 1739683 (CVE-2021-43539) Opened 3 years ago Closed 3 years ago

Crash in Wasm Ion code when gczeal is used with reference types and builtin instance method calls

Categories

(Core :: JavaScript: WebAssembly, defect, P1)

defect

Tracking

()

RESOLVED FIXED
96 Branch
Tracking Status
firefox-esr91 95+ fixed
firefox94 --- wontfix
firefox95 + fixed
firefox96 + fixed

People

(Reporter: asumu, Assigned: asumu)

References

(Blocks 1 open bug)

Details

(Keywords: csectype-uaf, sec-high, Whiteboard: [sec-survey][adv-main95+][adv-ESR91.4.0+])

Attachments

(3 files, 2 obsolete files)

The following Wasm jit-test test case triggers a crash in Ion:

// crash.js
gczeal(2, 1); // Collect on every allocation.

let exports = wasmEvalText(
  `(module
     (table $tab (export "tab") 5 externref)
     (elem declare funcref (ref.func $g))
     (func $g)
     (func $f (export "f") (param externref) (result)
       (ref.func $g) ;; force a collection via allocation in instance call
                     ;; stack map is not at the right offset as failure mode is FailOnInvalidRef
       (ref.is_null)
       (if
         (then)
         ;; crashes here due to GC happening above
         ;; at this point the externref on stack is a poisoned value
         (else (i32.const 0) (local.get 0) (table.set $tab)))
       )
     )`,
  {}
).exports;
exports.f("foo");

Running this produces a crash like the following

$ mach jit-test --ion crash.js
Assertion failure: ThreadId::ThisThreadId() == owningThread_, at /home/asumu/mozilla-unified/js/src/threading/Mutex.cpp:63                                                                   
UndefinedBehaviorSanitizer:DEADLYSIGNAL                                                                                                                                                      
==28940==ERROR: UndefinedBehaviorSanitizer: SEGV on unknown address 0x000000000000 (pc 0x557b685d175c bp 0x7ffec0503ba0 sp 0x7ffec0503b70 T28940)                                            
==28940==The signal is caused by a WRITE memory access.                                                                                                                                      
==28940==Hint: address points to the zero page.     

This is reduced from another test case (in exception handling Ion code) where it was more obvious that the externref parameter to the function $f is getting collected and overwritten with a poisoned value, but you can see in gdb that the externref is overwritten with 0x2F (JS_FRESH_NURSERY_PATTERN) if you break in the tableSet method.

What appears to be happening is that stack maps for Ion calls to Wasm instance method calls are not being installed correctly (at the right offset), so they are not actually looked up during GC. The GC then misses roots in the stack frame. This happens on methods where the failure mode is not Infallible, as there are some error handling instructions inserted between the call return point and the location of the LSafepoint.

I'll attach a patch that should fix the offsets later.

I think it is difficult to trigger this bug without gczeal turned on, but I've marked this as a security bug just in case.

Note: this particular test case seems to only crash on debug builds, and observing the memory poisoning also requires having debug on so it's difficult to tell if something bad is happening on non-debug builds.

Stack map problem with reftypes.

Severity: -- → S3
Priority: -- → P1
Assignee: nobody → asumu
Status: NEW → ASSIGNED

It sounds like a GC rooting issue, so I'm going to mark it sec-high.

Group: core-security → javascript-core-security

I can't tell yet if this is some corner case in not-yet-shipped code or a real emergency, will discuss with Ryan later today.

Analysis:

  • this is about wasm builtin calls only (other calls are ok)
  • this is about ion only (baseline is ok)
  • this is specific to builtin calls that can return an error and can trigger gc, probably only ref.func
  • a pointer on the stack is not traced because the stackmap is not found by the gc because its location marker is off, if the correct phase of gc is triggered by the allocation in question
  • an attacker could perhaps place many pointers on the stack in an attempt to get one that points to something interesting, but this will be hard to count on and will easily crash
  • fission helps us (attack does not escape to other domains) but also hurts us (gc behavior may be less turbulent than it would be in the single-process browser)
  • the issue is serious, we do uplifts all the way
Flags: sec-bounty?

Obviously not asking for a bounty for myself, but for Asumu, if eligible. This is an important bug.

Comment on attachment 9249520 [details]
Bug 1739683 - Fix code offset for safepoints on Wasm instance method calls

Security Approval Request

  • How easily could an exploit be constructed based on the patch?: Unsure, but my guess is not easily. The patch changes how we record offsets for stack maps for wasm calls so it could lead someone to suspect that certain calls could lead to the wrong stack map. From there they would have to figure out that it's only certain kinds, specifically just ref.func to our knowledge (see Lars' post). From there they would have to manage the heap non-determinism, as Lars again points out, to trigger a GC in the ref.func call that will collect an object they control on the stack and then allocate an object into the same pointer and then use it somehow.
  • Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?: Yes
  • Which older supported branches are affected by this flaw?: All
  • If not all supported branches, which bug introduced the flaw?: None
  • Do you have backports for the affected branches?: Yes
  • If not, how different, hard to create, and risky will they be?:
  • How likely is this patch to cause regressions; how much testing does it need?: Unlikely, we have good coverage of many stack map possible issues.
Attachment #9249520 - Flags: sec-approval?

As marked in the security approval request, I think we should change the commit message to not mention safepoints nor builtin instance methods.

Attachment #9249520 - Attachment description: Bug 1739683 - Fix code offset for safepoints on Wasm instance method calls → Bug 1739683 - Fix code offset use in Wasm calls

I retitled the patch given the above comments to not mention the details. Happy to adjust it again if needed.

Comment on attachment 9249520 [details]
Bug 1739683 - Fix code offset for safepoints on Wasm instance method calls

Approved to land and request uplift

Attachment #9249520 - Attachment description: Bug 1739683 - Fix code offset use in Wasm calls → Bug 1739683 - Fix code offset for safepoints on Wasm instance method calls
Attachment #9249520 - Flags: sec-approval? → sec-approval+
Attachment #9249520 - Attachment description: Bug 1739683 - Fix code offset for safepoints on Wasm instance method calls → Bug 1739683 - Fix code offset use in Wasm calls
Group: javascript-core-security → core-security-release
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 96 Branch

Please nominate this patch for Beta and ESR91 approval when you get a chance.

Comment on attachment 9249520 [details]
Bug 1739683 - Fix code offset for safepoints on Wasm instance method calls

Beta/Release Uplift Approval Request

  • User impact if declined: Potentially exploitable GC rooting hazard due to stack maps issue.
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Not risky, we have good coverage of many stack map possible issues.
  • String changes made/needed:

ESR Uplift Approval Request

  • If this is not a sec:{high,crit} bug, please state case for ESR consideration:
  • User impact if declined: Potentially exploitable GC rooting hazard due to stack maps issue.
  • Fix Landed on Version: 96
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Not risky, we have good coverage of many stack map possible issues.
  • String or UUID changes made by this patch:
Attachment #9249520 - Attachment description: Bug 1739683 - Fix code offset use in Wasm calls → Bug 1739683 - Fix code offset for safepoints on Wasm instance method calls
Attachment #9249520 - Flags: approval-mozilla-esr91?
Attachment #9249520 - Flags: approval-mozilla-beta?

Comment on attachment 9249520 [details]
Bug 1739683 - Fix code offset for safepoints on Wasm instance method calls

Approved for 95.0b9.

Attachment #9249520 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

As part of a security bug pattern analysis, we are requesting your help with a high level analysis of this bug. It is our hope to develop static analysis (or potentially runtime/dynamic analysis) in the future to identify classes of bugs.

Please visit this google form to reply.

Flags: needinfo?(asumu)
Whiteboard: [sec-survey]
Flags: needinfo?(asumu)

Comment on attachment 9249520 [details]
Bug 1739683 - Fix code offset for safepoints on Wasm instance method calls

Approved for 91.4esr.

Attachment #9249520 - Flags: approval-mozilla-esr91? → approval-mozilla-esr91+
Flags: sec-bounty? → sec-bounty+
QA Whiteboard: [post-critsmash-triage]
Flags: qe-verify-
Whiteboard: [sec-survey] → [sec-survey][adv-main95+][adv-ESR91.4.0+]
Attached file advisory.txt (obsolete) —

My understanding of wasm and wasm instance methods is... limited. The advisory draft is addressing this by being very vague. It's generally OK if it's "vague but not wrong", but, can you review the advisory to make sure I'm not saying something incorrect or stupid? :)

Flags: needinfo?(lhansen)

Hi Frederik, I think the advisory is probably misleading as it stands. A title like "GC rooting failure" with explanation "Failure to correctly record the location of live pointers across wasm instance calls resulted in a GC occuring within the call not tracing those live pointers. This could cause some live objects to be collected and the caller could later reference unused or reused memory. " would be comprehensive.

Flags: needinfo?(lhansen) → needinfo?(fbraun)
Attached file advisory.txt (obsolete) —

Thanks, Lars!

I've added the relevant feature to the title and replaced your last sentence with our template sentence of "This can lead to a user-after-free causing a potentially exploitable crash."

Attachment #9253450 - Attachment is obsolete: true
Flags: needinfo?(fbraun)

That sounds good, though (nitpicking) I think the preferred spelling is "use-after-free"?

Attached file advisory.txt

Oops. Thanks!

Attachment #9253572 - Attachment is obsolete: true
Alias: CVE-2021-43539
Group: core-security-release
Attachment #9249521 - Attachment description: Bug 1739683 - Add gczeal test for Wasm instance call safepoint issue. → WIP: Bug 1739683 - Add gczeal test for Wasm instance call safepoint issue.
Attachment #9249521 - Attachment description: WIP: Bug 1739683 - Add gczeal test for Wasm instance call safepoint issue. → Bug 1739683 - Add gczeal test for Wasm instance call safepoint issue. r=rhunt
Pushed by rhunt@eqrion.net:
https://hg.mozilla.org/integration/autoland/rev/3637a0367fb6
Add gczeal test for Wasm instance call safepoint issue. r=lth
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: