Closed Bug 1867982 Opened 2 years ago Closed 2 years ago

Crash in [@ mozilla::ThreadBound<T>::AssertIsCorrectThread] with poison value in register

Categories

(Core :: DOM: Workers, defect, P2)

Other
Windows
defect

Tracking

()

RESOLVED FIXED
124 Branch
Tracking Status
firefox-esr115 123+ fixed
firefox122 --- wontfix
firefox123 + fixed
firefox124 + fixed

People

(Reporter: release-mgmt-account-bot, Assigned: jstutte)

References

(Blocks 1 open bug)

Details

(Keywords: crash, csectype-uaf, sec-high, Whiteboard: [adv-main123+r][adv-esr115.8+r])

Crash Data

Attachments

(3 files, 2 obsolete files)

Crash report: https://crash-stats.mozilla.org/report/index/7d6ca59a-ad61-42f8-955a-38c180231118

MOZ_CRASH Reason: MOZ_DIAGNOSTIC_ASSERT(IsCorrectThread())

Top 10 frames of crashing thread:

0  xul.dll  AnnotateMozCrashReason  mfbt/Assertions.h:46
0  xul.dll  mozilla::ThreadBound<mozilla::dom::WorkerPrivate::WorkerThreadAccessible>::AssertIsCorrectThread const  xpcom/threads/ThreadBound.h:135
0  xul.dll  mozilla::ThreadBound<mozilla::dom::WorkerPrivate::WorkerThreadAccessible>::Accessor<1>::Accessor  xpcom/threads/ThreadBound.h:100
0  xul.dll  mozilla::ThreadBound<mozilla::dom::WorkerPrivate::WorkerThreadAccessible>::Access const  xpcom/threads/ThreadBound.h:125
0  xul.dll  mozilla::dom::WorkerPrivate::GetCurrentEventLoopGlobal const  dom/workers/WorkerPrivate.h:478
0  xul.dll  mozilla::dom::WorkerRunnable::Run  dom/workers/WorkerRunnable.cpp:245
1  xul.dll  nsThread::ProcessNextEvent  xpcom/threads/nsThread.cpp:1192
2  xul.dll  NS_ProcessNextEvent  xpcom/threads/nsThreadUtils.cpp:480
2  xul.dll  mozilla::ipc::MessagePumpForNonMainThreads::Run  ipc/glue/MessagePump.cpp:300
3  xul.dll  MessageLoop::RunInternal  ipc/chromium/src/base/message_loop.cc:370

By querying Nightly crashes reported within the last 2 months, here are some insights about the signature:

  • First crash report: 2023-10-10
  • Process type: Content
  • Is startup crash: No
  • Has user comments: No
  • Is null crash: No

The Bugbug bot thinks this bug should belong to the 'Core::DOM: Workers' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.

Component: General → DOM: Workers

Eden, does this ring any bells? This is something quite new.

Severity: -- → S3
Flags: needinfo?(echuang)
Priority: -- → P3
Group: dom-core-security

I cracked up a minidump and I see "rbx": "0xe5e5e5e5e5e5e5e5". Apparently we run a WorkerRunnable after our WorkerThreadPrimaryRunnable already finished and the WorkerPrivate object has been freed. This seems to be a clear UAF, hiding behind that Accessible assertion.

Summary: Crash in [@ mozilla::ThreadBound<T>::AssertIsCorrectThread] → Crash in [@ mozilla::ThreadBound<T>::AssertIsCorrectThread] with poison value in register

The severity field for this bug is set to S3. However, the bug is flagged with the sec-high keyword.
:jstutte, could you consider increasing the severity of this security bug?

For more information, please visit BugBot documentation.

Flags: needinfo?(jstutte)

Yes, that needs more attention.

Severity: S3 → S2
Flags: needinfo?(jstutte)
Priority: P3 → P2

(In reply to Jens Stutte [:jstutte] from comment #3)

Apparently we run a WorkerRunnable after our WorkerThreadPrimaryRunnable already finished and the WorkerPrivate object has been freed. .

I wonder if this is somehow related to bug 1836937. I'd assume any event able to crash like this would have triggered that assertion (though not any event triggering it might cause this crash, of course).

Technically we have all the information to see, if we have an "illegitimate" WorkerRunnable::Run - on a worker thread we could check nsThread::RecursionDepth to see if we are still running on top of the WorkerThreadPrimaryRunnable and we should know, if we are at all running on the worker thread, too.

(In reply to Jens Stutte [:jstutte] from comment #6)

Technically we have all the information to see, if we have an "illegitimate" WorkerRunnable::Run - on a worker thread we could check nsThread::RecursionDepth to see if we are still running on top of the WorkerThreadPrimaryRunnable and we should know, if we are at all running on the worker thread, too.

The WIP patch is basically proposing that, and providing the runnable name in the log based on bug 1875800.

Assignee: nobody → jstutte
Attachment #9375820 - Attachment description: WIP: Bug 1867982 - Check minimum nesting level on worker thread of WorkerRunnable::Run. → Bug 1867982 - Check minimum nesting level on worker thread of WorkerRunnable::Run. r?#dom-worker-reviewers
Status: NEW → ASSIGNED

Comment on attachment 9375820 [details]
Bug 1867982 - Check if WorkerRunnable::Run runs on top of WorkerThreadPrimaryRunnable::Run in a worker thread. r?#dom-worker-reviewers

Security Approval Request

  • How easily could an exploit be constructed based on the patch?: The patch is seemingly just adding diagnostics (except for the early return), although it does so in a suspiciously complicated way (as we need to avoid to touch mWorkerPrivate). Neither do we know how to force a WorkerRunnable to be run such late.
  • Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?: No
  • Which older supported branches are affected by this flaw?: probably all
  • If not all supported branches, which bug introduced the flaw?: None
  • Do you have backports for the affected branches?: No
  • If not, how different, hard to create, and risky will they be?: The larger changeset from bug 1875800 we use here for diagnostics is probably not needed, we could make a variant of this patch that just introduces the early return and the generic diagnostic assertion. Other than that the code should be largely the same in ESR.
  • How likely is this patch to cause regressions; how much testing does it need?: The patch is expected to move the intermittent assertion from bug 1836937 here, which could be seen as a regression but actually is not.
    Other than that, normal baking times should be enough.
  • Is Android affected?: Yes
Attachment #9375820 - Flags: sec-approval?

Comment on attachment 9375820 [details]
Bug 1867982 - Check if WorkerRunnable::Run runs on top of WorkerThreadPrimaryRunnable::Run in a worker thread. r?#dom-worker-reviewers

Approved to land and uplift if desired

Attachment #9375820 - Flags: sec-approval? → sec-approval+
Pushed by jstutte@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/59caf3ade866 Check minimum nesting level on worker thread of WorkerRunnable::Run. r=dom-worker-reviewers,smaug
Backout by abutkovits@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/638b234fcefa Backed out changeset 59caf3ade866 for causing failures at WorkerRunnable.cpp. CLOSED TREE

Backed out for causing failures at WorkerRunnable.cpp:
https://hg.mozilla.org/integration/autoland/rev/638b234fcefab7a54ef2d2f7e5c872cba1bf4bb7

Push with failures
Failure log

task 2024-01-25T08:14:11.988Z] 08:14:11     INFO - TEST-START | dom/workers/test/test_WorkerDebugger.postMessage.xhtml
[task 2024-01-25T08:14:11.988Z] 08:14:11     INFO - GECKO(1941) | [Parent 1941, Main Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005 (NS_ERROR_FAILURE): file /builds/worker/checkouts/gecko/chrome/nsChromeRegistry.cpp:182
[task 2024-01-25T08:14:11.988Z] 08:14:11     INFO - GECKO(1941) | [Parent 1941, Main Thread] WARNING: 'NS_FAILED(rv)', file /builds/worker/checkouts/gecko/chrome/nsChromeProtocolHandler.cpp:73
[task 2024-01-25T08:14:12.209Z] 08:14:12     INFO - GECKO(1941) | MEMORY STAT | vsize 11530MB | residentFast 521MB | heapAllocated 227MB
[task 2024-01-25T08:14:12.222Z] 08:14:12     INFO - GECKO(1941) | [Parent 1941, Main Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005 (NS_ERROR_FAILURE): file /builds/worker/checkouts/gecko/chrome/nsChromeRegistry.cpp:182
[task 2024-01-25T08:14:12.223Z] 08:14:12     INFO - GECKO(1941) | [Parent 1941, Main Thread] WARNING: 'NS_FAILED(rv)', file /builds/worker/checkouts/gecko/chrome/nsChromeProtocolHandler.cpp:73
[task 2024-01-25T08:14:12.251Z] 08:14:12     INFO - GECKO(1941) | [Parent 1941, Main Thread] WARNING: Failed to retarget HTML data delivery to the parser thread.: file /builds/worker/checkouts/gecko/parser/html/nsHtml5StreamParser.cpp:1227
[task 2024-01-25T08:14:12.348Z] 08:14:12     INFO - GECKO(1941) | Assertion failure: false (On a worker, a worker runnable should only run on top of WorkerThreadPrimaryRunnable.), at /builds/worker/checkouts/gecko/dom/workers/WorkerRunnable.cpp:253
[task 2024-01-25T08:14:12.349Z] 08:14:12     INFO - GECKO(1941) | #01: mozilla::dom::WorkerRunnable::Run() [dom/workers/WorkerRunnable.cpp:251]
[task 2024-01-25T08:14:12.350Z] 08:14:12     INFO - GECKO(1941) | #02: mozilla::dom::WorkerControlRunnable::Cancel() [dom/workers/WorkerRunnable.cpp:524]
[task 2024-01-25T08:14:12.352Z] 08:14:12     INFO - GECKO(1941) | #03: mozilla::dom::WorkerPrivate::DoRunLoop(JSContext*) [dom/workers/WorkerPrivate.cpp:3319]
[task 2024-01-25T08:14:12.353Z] 08:14:12     INFO - GECKO(1941) | #04: mozilla::dom::workerinternals::(anonymous namespace)::WorkerThreadPrimaryRunnable::Run() [dom/workers/RuntimeService.cpp:2111]
[task 2024-01-25T08:14:12.354Z] 08:14:12     INFO - GECKO(1941) | #05: nsThread::ProcessNextEvent(bool, bool*) [xpcom/threads/nsThread.cpp:1194]
[task 2024-01-25T08:14:12.358Z] 08:14:12     INFO - GECKO(1941) | #06: NS_ProcessNextEvent(nsIThread*, bool) [xpcom/threads/nsThreadUtils.cpp:480]
[task 2024-01-25T08:14:12.360Z] 08:14:12     INFO - GECKO(1941) | #07: mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) [ipc/glue/MessagePump.cpp:301]
[task 2024-01-25T08:14:12.360Z] 08:14:12     INFO - GECKO(1941) | #08: MessageLoop::Run() [ipc/chromium/src/base/message_loop.cc:346]
[task 2024-01-25T08:14:12.361Z] 08:14:12     INFO - GECKO(1941) | #09: nsThread::ThreadFunc(void*) [xpcom/threads/nsThread.cpp:372]
[task 2024-01-25T08:14:12.385Z] 08:14:12     INFO - GECKO(1941) | #10: _pt_root [nsprpub/pr/src/pthreads/ptthread.c:204]
[task 2024-01-25T08:14:12.386Z] 08:14:12     INFO - GECKO(1941) | #11: ??? [/builds/worker/workspace/build/application/firefox/firefox + 0x6a95d]
[task 2024-01-25T08:14:12.387Z] 08:14:12     INFO - GECKO(1941) | #12: ??? [/lib/x86_64-linux-gnu/libpthread.so.0 + 0x76db]
[task 2024-01-25T08:14:12.390Z] 08:14:12     INFO - GECKO(1941) | #13: clone [/lib/x86_64-linux-gnu/libc.so.6 + 0x121a3f]
[task 2024-01-25T08:14:12.391Z] 08:14:12     INFO - GECKO(1941) | #14: ??? (???:???)
Flags: needinfo?(jstutte)

Strange that this did not fall out with previous testing.

There are some runnables that call Run on Cancel.

Edit: This seems to happen mostly for runs without e10s / fission enabled, which is definitely a less tested configuration these days.

Flags: needinfo?(jstutte)
Attachment #9375820 - Attachment description: Bug 1867982 - Check minimum nesting level on worker thread of WorkerRunnable::Run. r?#dom-worker-reviewers → Bug 1867982 - Check if WorkerRunnable::Run runs on top of WorkerThreadPrimaryRunnable::Run in a worker thread. r?#dom-worker-reviewers
Attached file DataReviewRequest_1867982.md (obsolete) —
Attachment #9376582 - Flags: data-review?(chutten)

Before I pop into the data review portion: you might not know this, but data reviews must be publicly available to the populations having data collected. Will this bug be made public before the data starts being collected? If not, we'll need to take the data review to a separate, public, bug.

Flags: needinfo?(jstutte)

Oh, yes, I did not know that, thanks. And no, we cannot make this bug public right now. I'll downgrade the thing to a diagnostic assertion without runnable name for now and move the data review part to a new bug (or bug 1836937).

Flags: needinfo?(jstutte)
Attachment #9376582 - Flags: data-review?(chutten)
Attachment #9376582 - Attachment is obsolete: true
Pushed by jstutte@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/233014f304ae Check if WorkerRunnable::Run runs on top of WorkerThreadPrimaryRunnable::Run in a worker thread. r=dom-worker-reviewers,smaug,asuth
Group: dom-core-security → core-security-release
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 124 Branch

Since nightly and release are affected, beta will likely be affected too.
For more information, please visit BugBot documentation.

Attachment #9376900 - Flags: approval-mozilla-beta?
Attachment #9377125 - Flags: approval-mozilla-beta?

Uplift Approval Request

  • Risk associated with taking this patch: Low
  • String changes made/needed: n/a
  • Steps to reproduce for manual QE testing: n/a
  • Fix verified in Nightly: yes
  • Is Android affected?: yes
  • Code covered by automated testing: yes
  • Explanation of risk level: Essentially this adds just a check if the worker runnable will find the expected conditions to run.
  • User impact if declined: Potential UAF and/or crashes.
  • Needs manual QE test: no
Attachment #9376900 - Attachment is obsolete: true
Attachment #9376900 - Flags: approval-mozilla-beta?
Attachment #9377128 - Flags: approval-mozilla-esr115?

Uplift Approval Request

  • Steps to reproduce for manual QE testing: n/a
  • Is Android affected?: yes
  • Code covered by automated testing: yes
  • Fix verified in Nightly: yes
  • User impact if declined: Potential UAF and/or crashes.
  • Explanation of risk level: Essentially this adds just a check if the worker runnable will find the expected conditions to run.
  • Needs manual QE test: no
  • Risk associated with taking this patch: Low
  • String changes made/needed: n/a
Attachment #9377125 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
QA Whiteboard: [post-critsmash-triage]
Flags: qe-verify-
Attachment #9377128 - Flags: approval-mozilla-esr115? → approval-mozilla-esr115+

Looks like the ESR patch needs rebasing.

Flags: needinfo?(jstutte)
Flags: needinfo?(jstutte)
Flags: needinfo?(echuang)

Thanks for solving the rebase!

Whiteboard: [adv-main123+r]
Whiteboard: [adv-main123+r] → [adv-main123+r][adv-esr115.8+r]
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: