Closed Bug 988698 (CVE-2015-2715) Opened 10 years ago Closed 9 years ago

heap-use-after-free in nsThreadManager::RegisterCurrentThread during shutdown

Categories

(Core :: Audio/Video, defect)

31 Branch
x86_64
Linux
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla39
Tracking Status
firefox37 --- wontfix
firefox38 --- fixed
firefox39 --- fixed
firefox-esr31 --- wontfix
b2g-v1.4 --- wontfix
b2g-v2.0 --- wontfix
b2g-v2.0M --- wontfix
b2g-v2.1 --- wontfix
b2g-v2.1S --- wontfix
b2g-v2.2 --- fixed
b2g-master --- fixed

People

(Reporter: tsmith, Assigned: jesup)

Details

(Keywords: crash, csectype-uaf, sec-moderate, Whiteboard: [adv-main38+][b2g-adv-main2.2+])

Attachments

(2 files, 1 obsolete file)

Attached file stack_trace.txt
Found by the BlackBerry Security Automated Analysis Team's fuzzing framework ALF.

At this time we do not have a test case that will reproduce the issue.
Keywords: crash, csectype-uaf
Severity: normal → critical
Keywords: sec-critical
Thread from MediaTaskQueue is alive after XPCOM shutdown.
Component: General → Video/Audio
Without a testcase, we can't be certain we've fixed the underlying issue here.

Tyson: Can you change your fuzzer to output the files it's loaded when a problem is detected please? Filling these bugs is of limited value without test cases.
Flags: needinfo?(tysmith)
Whiteboard: testcas
Chris: Sorry I don't have a test case for this one. Sometimes devs can fix issues with only a stack trace so even if I don't have a test case for an issue I will log it. Feel free to close the issue if there is insufficient info and if I do get a test case I will post it ASAP.
Flags: needinfo?(tysmith)
ThreadManager has been shut down, and some thread either hadn't Registered yet, or was started after the ThreadManager shut down.  Since ThreadManager appears to be a static singleton vie ::get(), it should only be deleted during static cleanups.  This implies to me (maybe) that the nsThread was created and stalled, then the rest of shutdown happened, then it ran during mainthread cleanup.

A log from a run that fails would also be useful (i.e. NS_Assertions being logged).

I do think there's a small hole between thread creation and RegisterCurrentThread() where the nsThreadManager::Shutdown could miss cleaning this thread up, though that alone wouldn't have this impact.

I'm not sure how being that deep in shutdown would impact the severity rating here.
Flags: needinfo?(benjamin)
There's this nice comment at the top of nsThreadManager::Shutdown:

109   // XXX What happens if shutdown happens before NewThread completes?
110   //     Fortunately, NewThread is only called on the main thread for now.

Since that's clearly not true from the stack traces here, we need to make sure that these thread pools are fully cleaned up and aren't mid-starting new threads by the time we get here.
Flags: needinfo?(benjamin)
(In reply to Randell Jesup [:jesup] from comment #4)

> I do think there's a small hole between thread creation and
> RegisterCurrentThread() where the nsThreadManager::Shutdown could miss
> cleaning this thread up, though that alone wouldn't have this impact.

Should we plug that hole anyway?
Flags: needinfo?(rjesup)
(In reply to David Bolter [:davidb] from comment #6)
> (In reply to Randell Jesup [:jesup] from comment #4)
> 
> > I do think there's a small hole between thread creation and
> > RegisterCurrentThread() where the nsThreadManager::Shutdown could miss
> > cleaning this thread up, though that alone wouldn't have this impact.
> 
> Should we plug that hole anyway?

If analysis bears it up, probably.
Flags: needinfo?(rjesup)
Keywords: testcase
Whiteboard: testcas
bsmedberg, this is marked sec-critical and hasn't moved in 10 days.  Can you confirm that this should be sec-critical or dveditz)?  If so, we need an owner.
Flags: needinfo?(benjamin)
This is a case where one of the media decoder thread pools is not shutting down early enough, and so the proximate fix is in audio/video not XPCOM. Please talk to jesup.
Flags: needinfo?(benjamin) → needinfo?(rjesup)
-> jwwang and cpearce to make sure the media decoder thread pools get shut down correctly
Flags: needinfo?(rjesup) → needinfo?(jwwang)
The shutdown process of MediaDecoderStateMachine is asynchronous. We might need a synchronous way when MediaDecoderStateMachine::Shutdown() is called from MediaShutdownManager which is called by XPCOM-SHUTDOWN.
Flags: needinfo?(jwwang) → needinfo?(cpearce)
MediaShutdownManager::Shutdown(), the media stack's xpcom-shutdown observer blocks until all MediaDecoderStateMachines, media threads, and thread pools have completed their async shutdown.

I suspect what's happening here is that some event is causing the media stack to startup again after its shutdown has completed. We could try setting a flag in MediaShutdownManager::Shutdown() to prevent the media stack starting up again, but without a testcase we can't be sure that's actually the issue.
Flags: needinfo?(cpearce)
Can anything start new threads while we're in MediaShutdownManager::Shutdown()?  Perhaps those threads started after the async shutdown starts are getting missed?

I don't see any creation guards in place before we EnumerateEntries() shutting them down (so new ones could be created and queue for additions, at least in theory).  You could have MediaDecoder::Init() which calls MediaShutdownManager::Instance().Register()
a) return failure if we're in shutdown, forcing Init() to fail (and make sure the failure path works)
b) if sInstance is null and we're shut down, MOZ_CRASH() or change the API to be fallible.

or put in creation guards and make certain we can't have any in mid-creation when we hit the Shutdown observer

This assumes we believe that new instances could be created/in-mid-creation
Group: media-core-security
Still no owner and sounds like we have more questions than answers...

During critsmash we speculated whether this could be downgraded as it happens on shutdown... (Dan?)
Flags: needinfo?(dveditz)
Guessing that an apparently racy shutdown crash would be difficult to exploit with any reliability.
Flags: needinfo?(dveditz)
Summary: heap-use-after-free in nsThreadManager::RegisterCurrentThread → heap-use-after-free in nsThreadManager::RegisterCurrentThread during shutdown
The message printed on exit is:

ERROR: Caught a segmentation fault while loading plugin file:
(null)

Please either:
- remove it and restart.
- run with --gst-disable-segtrap --gst-disable-registry-fork and debug.
Group: media-core-security
Does comment 16 trigger any ideas?
Flags: needinfo?(benjamin)
No.
Flags: needinfo?(benjamin)
So comment 16 implies there's something related to gstreamer going on:
See http://gstreamer-devel.966125.n4.nabble.com/ERROR-Caught-a-segmentation-fault-while-loading-plugin-file-td1458955.html

This might be a red herring, though, given the ASAN hit above
I think this will plug the hole I identified in comment 4.  I don't think this is the cause of this ASAN failure, but this might block the actual bug from causing this sort of problem, perhaps (especially the mLock change).
Assignee: nobody → rjesup
Status: NEW → ASSIGNED
Attachment #8581430 - Flags: review?(nfroyd)
Attachment #8581430 - Flags: feedback?(benjamin)
Comment on attachment 8581430 [details] [diff] [review]
Ensure threads can't be started during nsThreadManager::Shutdown()

I didn't do a detailed review of the threading details here, but I think the general plan is good.
Attachment #8581430 - Flags: feedback?(benjamin) → feedback+
Comment on attachment 8581430 [details] [diff] [review]
Ensure threads can't be started during nsThreadManager::Shutdown()

Review of attachment 8581430 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with some comments below.

::: xpcom/threads/nsThreadManager.cpp
@@ +250,5 @@
>    mMainThread = nullptr;
> +  // Don't release *mLock here, in case there are threads in mid-creation state.
> +  // They'll fail to finish creation, but may make calls (RegisterCurrentThread)
> +  // that need *mLock.  Also this would be a TSAN issue perhaps.  Let mLock
> +  // be freed automatically.

The pointer-ness of mLock doesn't make much sense to me nowadays; maybe we should just convert mLock to a plain member instead?

Does mLock living through ::Shutdown and whatnot mean that this change looks like a leak from the perspective of leak-checking builds?  If so, we should convert mLock to an OffTheBooksMutex.

@@ +343,5 @@
>                             uint32_t aStackSize,
>                             nsIThread** aResult)
>  {
>    // No new threads during Shutdown
>    if (NS_WARN_IF(!mInitialized)) {

If I'm reading the discussion correctly, we can call ::NewThread on non-main-thread threads, correct?  If so, mInitialized should become an Atomic<bool> for the purposes of this access and the one that you're adding below.

@@ +352,3 @@
>    if (!thr) {
>      return NS_ERROR_OUT_OF_MEMORY;
>    }

Thank you for cleaning up the raw NS_{ADDREF,RELEASE} business.  Might as well remove this useless null-check while you're here.

@@ +358,4 @@
>      return rv;
>    }
>  
>    // At this point, we expect that the thread has been registered in mThread;

I assume this is supposed to be |mThreadsByPRThread| nowadays...could you fix that while you're in the area?

@@ +369,5 @@
> +    }
> +    return NS_ERROR_NOT_INITIALIZED;
> +  }
> +
> +  *aResult = thr.forget().take();

thr.forget(aResult);
Attachment #8581430 - Flags: review?(nfroyd) → review+
Updated per review; try = https://treeherder.mozilla.org/#/jobs?repo=try&revision=348b18ef9d04 -- Note that this is sec-mod and also it hits fundamental low-level stuff, so I wanted a full Try.  As sec-mod, it doesn't *need* sec-approval to land, but I think it might make sense to ping al/dan before landing, as this is a UAF failure, albeit during shutdown.
Comment on attachment 8581926 [details] [diff] [review]
Ensure threads can't be started during nsThreadManager::Shutdown()

Carry forward r=froydnj

[Security approval request comment]
How easily could an exploit be constructed based on the patch?
Hard.

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?
The patch does point out the race between shutdown and thread creation; trying to cause thread creation to break that would not be easy though (we still have no idea how to force it ourselves).  It does expose that such a potential exists.

Which older supported branches are affected by this flaw?  
All

If not all supported branches, which bug introduced the flaw?

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?
Trivial to backport; no extra risk.


How likely is this patch to cause regressions; how much testing does it need?
Unlikely unless this has broken some offbeat usage (like the reason it avoided a member Mutex before).  However, anything touching nsThread* needs care.
Attachment #8581926 - Flags: sec-approval?
Attachment #8581926 - Flags: review+
If this is truly a sec-moderate, it can just go in and doesn't need approval.
It's marked sec-moderate; however per comment 23 I wanted to dot my i's and give the sec team a chance to re-assess (as it's a UAF).  Likely it's very hard/near-impossible to leverage since deep in shutdown you can't easily make freed memory get reused with anything predictable.

I'll land it, then request aurora and (maybe) ESR31 once we're stable.
Attachment #8581926 - Flags: sec-approval?
https://hg.mozilla.org/mozilla-central/rev/5627301b95de

Please nominate this for uplift to whatever branches you feel it should land on.
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
status-b2g-v2.1: --- → ?
Resolution: --- → FIXED
Target Milestone: --- → mozilla39
Attachment #8581430 - Attachment is obsolete: true
Comment on attachment 8581926 [details] [diff] [review]
Ensure threads can't be started during nsThreadManager::Shutdown()

Approval Request Comment
[Feature/regressing bug #]: N/A

[User impact if declined]: Longer from landing a sec-mod (but UAF) bug to when it's in release

[Describe test coverage new/current, TreeHerder]: Nothing is specifically testing this race; we don't have an STR

[Risks and why]: Pretty low risk given clean landing; fails thread creation in shutdown more reliably, and moves a lock to be always available instead of allocated.

[String/UUID change made/needed]: none
Attachment #8581926 - Flags: approval-mozilla-aurora?
al, dan - should we consider landing on ESR31 to roll into the next ESR push?  Normally as sec-moderate I'd say no, but it's a (very-hard-to-exploit) UAF.  Getting it into Aurora will ensure it's in the next ESR, which is good.
Flags: needinfo?(dveditz)
Flags: needinfo?(abillings)
Comment on attachment 8581926 [details] [diff] [review]
Ensure threads can't be started during nsThreadManager::Shutdown()

That makes sense to me. Let's get it on Aurora.
Flags: needinfo?(abillings)
Attachment #8581926 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Do we want this in 31.6.0? If we get this landed today it can go.
Flags: needinfo?(abillings)
(In reply to Benjamin Kerensa [:bkerensa] from comment #32)
> Do we want this in 31.6.0? If we get this landed today it can go.

If this isn't shipping until Fx38, shouldn't it wait for 31.7?
This is a sec-moderate. We don't need to take this in ESR31 and it will be fixed in ESR38.
Flags: needinfo?(abillings)
Comment on attachment 8581926 [details] [diff] [review]
Ensure threads can't be started during nsThreadManager::Shutdown()

NOTE: Please see https://wiki.mozilla.org/Release_Management/B2G_Landing to better understand the B2G approval process and landings.

[Approval Request Comment]
Bug caused by (feature/regressing bug #): N/A

User impact if declined: See aurora approval request

Testing completed: On m-c and aurora.  No specific testing for the failure known to be possible.

Risk to taking this patch (and alternatives if risky): See Aurora request

String or UUID changes made by this patch: none
Attachment #8581926 - Flags: approval-mozilla-b2g37?
Looks like abillings answered already: yes 38-ESR, not necessary 31-ESR
Flags: needinfo?(dveditz)
Comment on attachment 8581926 [details] [diff] [review]
Ensure threads can't be started during nsThreadManager::Shutdown()

Taking this sec-moderate on the b2g37 branch since this is going o be long lived and the next version is a bit far away.
Attachment #8581926 - Flags: approval-mozilla-b2g37? → approval-mozilla-b2g37+
Whiteboard: [adv-main38+]
Alias: CVE-2015-2715
Whiteboard: [adv-main38+] → [adv-main38+][b2g-adv-main2.2+]
Group: core-security → core-security-release
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.