Closed Bug 1687391 Opened 4 years ago Closed 4 years ago

Race Condition in ThreadSafeWeakPtr

Categories

(Core :: MFBT, defect)

defect

Tracking

()

RESOLVED FIXED
87 Branch
Tracking Status
firefox-esr78 86+ fixed
firefox85 --- wontfix
firefox86 + fixed
firefox87 + fixed

People

(Reporter: nika, Assigned: nika)

References

(Regression)

Details

(Keywords: csectype-race, regression, sec-high, Whiteboard: [sec-survey][post-critsmash-triage][adv-main86+r][adv-esr78.8+r])

Attachments

(1 file)

While looking at the ThreadSafeWeakPtr code trying to make some unrelated changes, I noticed a potential race condition in the locking code.

Example Racy Flow:

  1. Thread A progresses until https://searchfox.org/mozilla-central/rev/85f20360d898501f0fac12dd35fe8b7475e01848/mfbt/ThreadSafeWeakPtr.h#211, having detected a strong mRefCnt == 1 condition, but does not enter the critical section.
  2. Thread B executes getRefPtr(), acquiring and releasing a read lock on the ThreadSafeWeakReference, incrementing mRefCnt to 2.
  3. Thread A progresses through ThreadSafeWeakReference::tryDetach, acquiring and releasing a write lock. mPtr is not cleared as aOwner->hasOneRef no longer evaluates to true, due to Thread B incrementing the count. Thread A stops before the call to AtomicRefCounted::Release.
  4. Thread B releases the reference it acquired in step 2. It does not enter ThreadSafeWeakReference::tryDetach as mRefCnt is still 2, and reduces the reference count back to 1.
  5. Thread A enters AtomicRefCounted::Release, which decreases the refcount to 0, and invokes delete this. The ThreadSafeWeakReference type now holds a dangling raw pointer to the SupportsThreadSafeWeakPointer type which will never be detached, and can be upgraded through the safe API.

This issue can be avoided by using a different architecture for tracking strong and weak references, such as one where the strong reference count is kept within the weakref type, and incremented from a nonzero value using a compare_exchange_weak loop when attempting to upgrade, rather than using a mutex and raw pointers.

It would also be beneficial to eliminate the existing spinlock system due to bug 1646468.

This new approach to weak references is roughly modeled after the approach used
by Rust's Arc<T>, and uses an atomic compare-and-swap loop to perform weak to
strong reference upgrades. This approach ends up moving the strong reference
count out of the tracked object and into the weak reference object, as the
strong reference count atomic needs to outlife the object itself.

Rust's Arc Weak::upgrade implementation:
https://github.com/rust-lang/rust/blob/d98d2f57d9b98325ff075c343d2c7695b66dfa7d/library/alloc/src/sync.rs#L1806-L1837

Group: core-security → dom-core-security
Keywords: csectype-race
Keywords: sec-high

Comment on attachment 9197841 [details]
Bug 1687391 - Avoid using a spinlock in ThreadSafeWeakReference,

Security Approval Request

  • How easily could an exploit be constructed based on the patch?: If a reader guesses that the reason for this being a sec bug is due to an unsoundness in the previous ThreadSafeWeakPtr design, they may be able to find and determine the particular issue I mention in comment 1, however it may be challenging to create the specific race conditions required to trigger the specific use-after-free, especially due to the relatively infrequent use of ThreadSafeWeakPtr in our codebase.
  • Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?: No
  • Which older supported branches are affected by this flaw?: All branches
  • If not all supported branches, which bug introduced the flaw?: bug 1404742
  • Do you have backports for the affected branches?: No
  • If not, how different, hard to create, and risky will they be?: The patch may not apply cleanly to all branches, but should be possible to adapt if needed.
  • How likely is this patch to cause regressions; how much testing does it need?: I do not believe this patch will cause any regressions, though it's possible that some errors may be discovered due to build errors when the patch lands on autoland, as I have been unable to test on all platforms without try.
Attachment #9197841 - Flags: sec-approval?
Has Regression Range: --- → yes
Keywords: regression

Comment on attachment 9197841 [details]
Bug 1687391 - Avoid using a spinlock in ThreadSafeWeakReference,

sec-approval=dveditz

Feel free to test other branches on try at this point since we're landing this on trunk

Attachment #9197841 - Flags: sec-approval? → sec-approval+

Landed and backed out for bustages complaining about ThreadSafeWeakPtr:

https://hg.mozilla.org/integration/autoland/rev/63684ddac654284f269640e514b632963ef795f9

Push with failues: https://treeherder.mozilla.org/jobs?repo=autoland&group_state=expanded&resultStatus=testfailed%2Cbusted%2Cexception%2Cretry%2Cusercancel%2Crunnable&revision=cf95a79e64f6f06278e4a0a11185284b2be49ad2
Failure log: https://treeherder.mozilla.org/logviewer?job_id=328715314&repo=autoland

[task 2021-02-03T16:34:33.667Z] 16:34:33 ERROR - /builds/worker/workspace/obj-build/dist/include/mozilla/ThreadSafeWeakPtr.h:261:3: error: bad implicit conversion operator for 'ThreadSafeWeakPtr'
[task 2021-02-03T16:34:33.667Z] 16:34:33 INFO - operator bool() const = delete;
[task 2021-02-03T16:34:33.667Z] 16:34:33 INFO - ^

Flags: needinfo?(nika)
Flags: needinfo?(nika)
Group: dom-core-security → core-security-release
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 87 Branch

The patch landed in nightly and beta is affected.
:nika, is this bug important enough to require an uplift?
If not please set status_beta to wontfix.

For more information, please visit auto_nag documentation.

Flags: needinfo?(nika)

As part of a security bug pattern analysis, we are requesting your help with a high level analysis of this bug. It is our hope to develop static analysis (or potentially runtime/dynamic analysis) in the future to identify classes of bugs.

Please visit this google form to reply.

Flags: needinfo?(nika)
Whiteboard: [sec-survey]
Flags: qe-verify-
Whiteboard: [sec-survey] → [sec-survey][post-critsmash-triage]

Comment on attachment 9197841 [details]
Bug 1687391 - Avoid using a spinlock in ThreadSafeWeakReference,

Beta/Release Uplift Approval Request

  • User impact if declined: potential race condition which could lead to an exploit in an infrequently-used MFBT primitve
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): The change has already landed in central and appears to be behaving well.
  • String changes made/needed: None

ESR Uplift Approval Request

  • If this is not a sec:{high,crit} bug, please state case for ESR consideration:
  • User impact if declined: potential race condition which could lead to an exploit in an infrequently-used MFBT primitve
  • Fix Landed on Version: 87
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): The change has already landed in central and appears to be behaving well.
  • String or UUID changes made by this patch: None
Flags: needinfo?(nika)
Attachment #9197841 - Flags: approval-mozilla-esr78?
Attachment #9197841 - Flags: approval-mozilla-beta?

Comment on attachment 9197841 [details]
Bug 1687391 - Avoid using a spinlock in ThreadSafeWeakReference,

Approved for 86.0rc1 and 78.8esr.

Attachment #9197841 - Flags: approval-mozilla-esr78?
Attachment #9197841 - Flags: approval-mozilla-esr78+
Attachment #9197841 - Flags: approval-mozilla-beta?
Attachment #9197841 - Flags: approval-mozilla-beta+
Whiteboard: [sec-survey][post-critsmash-triage] → [sec-survey][post-critsmash-triage][adv-main86+r]
Whiteboard: [sec-survey][post-critsmash-triage][adv-main86+r] → [sec-survey][post-critsmash-triage][adv-main86+r][adv-esr78.8+r]
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: