Closed Bug 1633880 Opened 4 years ago Closed 4 years ago

Intermittent PID 4148 | SUMMARY: ThreadSanitizer: data race /builds/worker/workspace/obj-build/dist/include/mozilla/StaticPtr.h:171:13 in AssignAssumingAddRef | After xpcshell return code: -6

Categories

(Core :: DOM: File, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla80
Tracking Status
firefox-esr68 79+ fixed
firefox-esr78 79+ fixed
firefox77 - wontfix
firefox78 - wontfix
firefox79 + fixed
firefox80 + fixed

People

(Reporter: intermittent-bug-filer, Assigned: ssengupta)

References

(Blocks 1 open bug)

Details

(Keywords: csectype-race, intermittent-failure, sec-moderate, Whiteboard: [post-critsmash-triage][adv-main79+r][adv-ESR78.1+r] [adv-esr68.11+r])

Attachments

(6 files)

Filed by: nbeleuzu [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer.html#?job_id=299839709&repo=autoland
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/NXuwz04MTzGpuWJMXmLtOg/runs/0/artifacts/public/logs/live_backing.log


[task 2020-04-28T19:31:18.611Z] 19:31:18 INFO - PID 4148 | Write of size 8 at 0x7f2c1893cd18 by main thread:
[task 2020-04-28T19:31:18.611Z] 19:31:18 INFO - PID 4148 | #0 AssignAssumingAddRef /builds/worker/workspace/obj-build/dist/include/mozilla/StaticPtr.h:171:13 (libxul.so+0x3912542)
[task 2020-04-28T19:31:18.611Z] 19:31:18 INFO - PID 4148 | #1 AssignWithAddref /builds/worker/workspace/obj-build/dist/include/mozilla/StaticPtr.h:166:5 (libxul.so+0x3912542)
[task 2020-04-28T19:31:18.611Z] 19:31:18 INFO - PID 4148 | #2 operator= /builds/worker/workspace/obj-build/dist/include/mozilla/StaticPtr.h:120:5 (libxul.so+0x3912542)
[task 2020-04-28T19:31:18.612Z] 19:31:18 INFO - PID 4148 | #3 mozilla::dom::IPCBlobInputStreamStorage::Observe(nsISupports*, char const*, char16_t const*) /builds/worker/checkouts/gecko/dom/file/ipc/IPCBlobInputStreamStorage.cpp:64:14 (libxul.so+0x3912542)
[task 2020-04-28T19:31:18.612Z] 19:31:18 INFO - PID 4148 | #4 nsObserverList::NotifyObservers(nsISupports*, char const*, char16_t const*) /builds/worker/checkouts/gecko/xpcom/ds/nsObserverList.cpp:65:19 (libxul.so+0xa4dbd3)
[task 2020-04-28T19:31:18.612Z] 19:31:18 INFO - PID 4148 | #5 nsObserverService::NotifyObservers(nsISupports*, char const*, char16_t const*) /builds/worker/checkouts/gecko/xpcom/ds/nsObserverService.cpp:288:19 (libxul.so+0xa50a0c)
[task 2020-04-28T19:31:18.613Z] 19:31:18 INFO - PID 4148 | #6 mozilla::ShutdownXPCOM(nsIServiceManager*) /builds/worker/checkouts/gecko/xpcom/build/XPCOMInit.cpp:618:26 (libxul.so+0xb13c22)
[task 2020-04-28T19:31:18.613Z] 19:31:18 INFO - PID 4148 | #7 NS_ShutdownXPCOM /builds/worker/checkouts/gecko/xpcom/build/XPCOMInit.cpp:565:10 (libxul.so+0xb13ae5)
[task 2020-04-28T19:31:18.614Z] 19:31:18 INFO - PID 4148 | #8 XRE_XPCShellMain(int, char**, char**, XREShellData const*) /builds/worker/checkouts/gecko/js/xpconnect/src/XPCShellImpl.cpp:1383:8 (libxul.so+0x1923f85)
[task 2020-04-28T19:31:18.614Z] 19:31:18 INFO - PID 4148 | #9 mozilla::BootstrapImpl::XRE_XPCShellMain(int, char**, char**, XREShellData const*) /builds/worker/checkouts/gecko/toolkit/xre/Bootstrap.cpp:54:12 (libxul.so+0x63a26ab)
[task 2020-04-28T19:31:18.614Z] 19:31:18 INFO - PID 4148 | #10 main /builds/worker/checkouts/gecko/js/xpconnect/shell/xpcshell.cpp:66:27 (xpcshell+0xc8b86)

[task 2020-04-28T19:31:18.648Z] 19:31:18 INFO - PID 4148 | #65 XRE_XPCShellMain(int, char**, char**, XREShellData const*) /builds/worker/checkouts/gecko/js/xpconnect/src/XPCShellImpl.cpp:1353:14 (libxul.so+0x1924634)
[task 2020-04-28T19:31:18.649Z] 19:31:18 INFO - PID 4148 | #66 mozilla::BootstrapImpl::XRE_XPCShellMain(int, char**, char**, XREShellData const*) /builds/worker/checkouts/gecko/toolkit/xre/Bootstrap.cpp:54:12 (libxul.so+0x63a26ab)
[task 2020-04-28T19:31:18.649Z] 19:31:18 INFO - PID 4148 | #67 main /builds/worker/checkouts/gecko/js/xpconnect/shell/xpcshell.cpp:66:27 (xpcshell+0xc8b86)
[task 2020-04-28T19:31:18.649Z] 19:31:18 INFO - PID 4148 | SUMMARY: ThreadSanitizer: data race /builds/worker/workspace/obj-build/dist/include/mozilla/StaticPtr.h:171:13 in AssignAssumingAddRef

IPCBlobInputStreamStorage is at the top of the stack.

Component: XPCOM → DOM: File

Looks like the reported write is gStorage getting set to null in IPCBlobInputStreamStorage::Observe() and that's racing with IPCBlobInputStreamParent::HasValidStream().

So IPCBlobInputStreamStorage cleans up at "xpcom-shutdown" but PBackground where the storage can be used is alive until "xpcom-shutdown-threads", and StaticRefPtr doesn't do anything to be threadsafe on its own and IPCBlobInputStreamStorage doesn't do any kind of synchronization to protect its gStorage StaticRefPtr.

The initial reporting test is an xpcshell test, which we expect to have less shutdown fidelity, but https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=306094025&repo=try&lineNumber=4671 is an example from comment 9 of a mochitest experiencing the failure. In that case, it's happening in PBackgroundParent::OnChannelError which then tears down the actors. That seems like a very likely thing to happen in real world situations because the content processes should be gone by xpcom-shutdown, but there's no guarantee the IPC fallout of that has been handled.

And the race between IPCBlobInputStreamStorage::Get() and gStorage = nullptr; indicates potential for UAF, even if this is probably/hopefully only possible during shutdown.

So I see two things to do here:

  1. protect the gStorage pointer itself against concurrent access from multiple threads (it is the memory of the pointer on which we race, not the memory of the object)
  2. let IPCBlobInputStreamStorage::Get() return a StaticRefPtr, not a raw* in order to keep the object alife during usage even if gStorage is set to null

Subhamoy, you might want to take a look?

Group: core-security
Flags: needinfo?(ssengupta)
Severity: normal → S3
Priority: P5 → P3

(In reply to Jens Stutte [:jstutte] from comment #11)

  1. let IPCBlobInputStreamStorage::Get() return a StaticRefPtr, not a raw* in order to keep the object alife during usage even if gStorage is set to null

Not a StaticRefPtr, but rather a regular RefPtr (or SafeRefPtr), this one isn't static.

Flags: needinfo?(ssengupta)
Assignee: nobody → ssengupta
Status: NEW → ASSIGNED

Not a StaticRefPtr, but rather a regular RefPtr (or SafeRefPtr), this one isn't static.

The most important thing is to update the callers because, if ::Get() returns a nullptr (or a smart pointer with a nullptr value), the existing code paths must be updated. Almost all the uses of IPCBlobInputStreamStorage::Get() do not handle nullptr or failures.

Is there a reason to not keep IPCBlobInputStreamStorage alive until PBackground is fully shutdown or all of the PIPCBlobInputStream actors are shutdown and no new actors are allowed to be created? Which could also mean explicitly terminating the actors as part of an "xpcom-shutdown" shutdown blocker (or earlier)?

Adding the null-handling seems like adding new edge-cases that don't really need to exist given that there's no reason why any streams should be active post (or during) "xpcom-shutdown".

(see comment 15)

Flags: needinfo?(amarchesini)

My first approach was to move the shutdown of IPCBlobInputStream a bit forward down the shutdown phases. But this still means that ::Get() could return nullptr and we do not support it. As a temporary solution, I'm OK with what you propose, but in general, I would like to have some better checks of the return value of ::Get().

Flags: needinfo?(amarchesini)

Then it probably makes sense to do the get/null handling first as it's more uplift-able, etc.

Comment on attachment 9158336 [details]
Bug 1633880 - P1 - gStorage in RemoteLazyInputStreamStorage is protected by gMutex r=baku

Security Approval Request

  • How easily could an exploit be constructed based on the patch?: An exploit is -- in my opinion -- not easy to construct.
  • Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?: No
  • Which older supported branches are affected by this flaw?: Added checks never existed before, so all of them
  • If not all supported branches, which bug introduced the flaw?: N/A
  • Do you have backports for the affected branches?: No
  • If not, how different, hard to create, and risky will they be?: N/A
  • How likely is this patch to cause regressions; how much testing does it need?: This patch is unlikely to cause regression.
Attachment #9158336 - Flags: sec-approval?
Attachment #9158337 - Flags: sec-approval?

Comment on attachment 9158336 [details]
Bug 1633880 - P1 - gStorage in RemoteLazyInputStreamStorage is protected by gMutex r=baku

Approved to land and uplift.

Attachment #9158336 - Flags: sec-approval?
Attachment #9158336 - Flags: sec-approval+
Attachment #9158336 - Flags: approval-mozilla-esr78+
Attachment #9158336 - Flags: approval-mozilla-esr68+
Attachment #9158336 - Flags: approval-mozilla-beta+

Changing the priority to p1 as the bug is tracked by a release manager for the current beta.
See What Do You Triage for more information

Priority: P3 → P1
Attachment #9158337 - Flags: sec-approval? → sec-approval+
Group: core-security → dom-core-security
Attachment #9158336 - Attachment description: Bug 1633880 - P1 - gStorage in IPCBlobInputStreamStorage is protected by gMutex r=baku → Bug 1633880 - P1 - gStorage in RemoteLazyInputStreamStorage is protected by gMutex r=baku
Attachment #9158337 - Attachment description: Bug 1633880 - P2 - IPCBlobInputStreamStorage::Get() now returns mozilla::Result<RefPtr<IPCBlobInputStreamStorage>, nsresult> r=baku → Bug 1633880 - P2 - RemoteLazyInputStreamStorage::Get() now returns mozilla::Result<RefPtr<RemoteLazyInputStreamStorage>, nsresult> r=baku

Looks like these patches will need rebasing for uplift to all of the affected branches. Subhamoy, can you please attach those patches when you get a chance? Thanks!

Flags: needinfo?(ssengupta)
Group: dom-core-security → core-security-release
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla80

Hi Ryan, Subhamoy is on holiday this week. If it is more urgent, I can ask someone else to take it.

Flags: needinfo?(ssengupta) → needinfo?(ryanvm)

It can wait until next week, thanks.

Flags: needinfo?(ryanvm)

Subhamoy, this waits for you then ;-)

Flags: needinfo?(ssengupta)

Thank you! Working on it right now :)

Flags: needinfo?(ssengupta)
Attachment #9163722 - Attachment description: Bug 1633880 - P1 - gStorage in IPCBlobInputStreamStorage is protected by gMutex r=baku → Bug 1633880 - P1 - gStorage in IPCBlobInputStreamStorage is protected by gMutex [beta] r=baku
Attachment #9163723 - Attachment description: Bug 1633880 - P2 - IPCBlobInputStreamStorage::Get() now returns mozilla::Result<RefPtr<IPCBlobInputStreamStorage>, nsresult> r=baku → Bug 1633880 - P2 - IPCBlobInputStreamStorage::Get() now returns mozilla::Result<RefPtr<IPCBlobInputStreamStorage>, nsresult> [beta] r=baku

Comment on attachment 9163722 [details]
Bug 1633880 - P1 - gStorage in IPCBlobInputStreamStorage is protected by gMutex [beta] r=baku

Beta/Release Uplift Approval Request

  • User impact if declined: Possible crashes due to data race.
  • Is this code covered by automated tests?: Yes
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): It mutex-locks certain variable accesses and introduces more robust error checking for certain subroutines. I cannot determine any downside to introducing these changes.
  • String changes made/needed:
Attachment #9163722 - Flags: approval-mozilla-beta?
Attachment #9163723 - Flags: approval-mozilla-beta?
Attached patch ESR68.patchSplinter Review

[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration:
User impact if declined: Possible crashes due to data race and/or null pointer return
Fix Landed on Version: 80
Risk to taking this patch (and alternatives if risky): None I can determine

Attachment #9164039 - Flags: approval-mozilla-esr68?
Attached patch ESR78.patchSplinter Review

[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration:
User impact if declined: Possible crashes due to data race and/or null pointer return
Fix Landed on Version: 80
Risk to taking this patch (and alternatives if risky): None I can determine

Attachment #9164041 - Flags: approval-mozilla-esr78?

Comment on attachment 9163722 [details]
Bug 1633880 - P1 - gStorage in IPCBlobInputStreamStorage is protected by gMutex [beta] r=baku

Approved for 79.0b9, 78.1esr, and 68.11esr.

Attachment #9163722 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #9163723 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #9164039 - Flags: approval-mozilla-esr68? → approval-mozilla-esr68+
Attachment #9164041 - Flags: approval-mozilla-esr78? → approval-mozilla-esr78+
Flags: qe-verify-
Whiteboard: [post-critsmash-triage]
Blocks: tsan
Whiteboard: [post-critsmash-triage] → [post-critsmash-triage][adv-main79+r]
Whiteboard: [post-critsmash-triage][adv-main79+r] → [post-critsmash-triage][adv-main79+r][adv-ESR78.1+r]
Whiteboard: [post-critsmash-triage][adv-main79+r][adv-ESR78.1+r] → [post-critsmash-triage][adv-main79+r][adv-ESR78.1+r] [adv-esr68.11+r]
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: