Refcounting on nsTimerImpl is not actually threadsafe

RESOLVED FIXED in Firefox 37, Firefox OS v1.4

Status

()

Core
XPCOM
RESOLVED FIXED
4 years ago
a year ago

People

(Reporter: bwc, Assigned: froydnj)

Tracking

({csectype-race, sec-high})

Trunk
mozilla38
csectype-race, sec-high
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox35 wontfix, firefox36+ wontfix, firefox37+ fixed, firefox38+ fixed, firefox-esr3137+ fixed, b2g-v1.4 fixed, b2g-v2.0 fixed, b2g-v2.0M fixed, b2g-v2.1 fixed, b2g-v2.1S fixed, b2g-v2.2 fixed, b2g-master fixed)

Details

(Whiteboard: [adv-main37+][adv-esr31.6+][b2g-adv-main2.2+])

Attachments

(1 attachment)

(Reporter)

Description

4 years ago
While the refcounting on nsTimerImpl is declared threadsafe, we override Release in a way that completely negates any threadsafety protection:

http://dxr.mozilla.org/mozilla-central/source/xpcom/threads/nsTimerImpl.cpp#218

If two threads call Release at the same time, it is possible that one thread will destroy the nsTimerImpl before the other thread returns from Release, leading to a UAF. (Credit to jwwang for spotting this)
(Reporter)

Updated

4 years ago
Blocks: 1036531
Keywords: csectype-race, sec-high
Any plan?
Nathan, who do you think might be able to look at this?
Flags: needinfo?(nfroyd)
(Assignee)

Comment 3

4 years ago
I'm not even sure the scenario described in comment 0 is plausible.  My impression was that timers have threadsafe refcounting because the arming/creating thread and the timer thread have to refcount a given timer, not so any two arbitrary threads can refcount the timer.  And there's a scary amount of interplay between the timer thread and the arming/creating thread to ensure that we don't UAF.

I could be reading or remembering things wrongly, though.  I'll look at this a little more closely tomorrow.
(Reporter)

Comment 4

4 years ago
There is a window of opportunity for both the timer and target thread to manipulate this refcount at the same time, when the timer is firing (we unlock the monitor while getting ready to dispatch the timer).
(Assignee)

Comment 5

4 years ago
(In reply to Byron Campen [:bwc] from comment #4)
> There is a window of opportunity for both the timer and target thread to
> manipulate this refcount at the same time, when the timer is firing (we
> unlock the monitor while getting ready to dispatch the timer).

Hm, it looks to me like the timer thread is holding a strong ref over the entire firing process:

http://dxr.mozilla.org/mozilla-central/source/xpcom/threads/TimerThread.cpp#260

I guess the concern is that if we fail to post the timer event, we wind up here:

http://dxr.mozilla.org/mozilla-central/source/xpcom/threads/TimerThread.cpp#282

and then we can race with the target thread?  I'm not entirely sure that's right; please outline the sequence of steps necessary for the race if I'm off the mark.
(Reporter)

Comment 6

4 years ago
Yes, that is the race I'm referring to. Note that the release on the target thread is not necessarily due to the target thread letting go of its nsCOMPtr; it could also be caused by an earlier TimerEvent popping (which also holds a ref). So, it doesn't really help if the target thread ensures that it has released its timers before its thread stops accepting messages.
(Assignee)

Comment 7

4 years ago
I think bug 1036550 is the ultimate way to fix this.

The short-term fix that comes to mind involves deleting the grotty code in nsTimerImpl::Release and TimerThread::Run (and maybe threading the locked monitor through the call to PostTimerEvent), but I'm unable to convince myself that actually solves anything.  I think it just moves the problem someplace else...
Flags: needinfo?(nfroyd)
(Reporter)

Comment 8

4 years ago
I think that rewriting so the monitor is not unlocked is the fix here. I've even taken a crack at this, but after working around the obvious deadlocks, I ran into a deadlock here: http://dxr.mozilla.org/mozilla-central/source/dom/workers/WorkerPrivate.cpp#1299

This stuff is just plain gross.
Is there somebody who can work on this in the near term, either the short term or long term fix?  If not I can try to drum up some resources from jst or something.
(Assignee)

Comment 10

4 years ago
Can Byron post his patch here and bent weight in on rewriting that WorkerPrivate.cpp code?
(Reporter)

Comment 11

4 years ago
Let me try to track that patch down.
(Reporter)

Updated

4 years ago
See Also: → bug 1059572
Byron, any luck with your patch?
Flags: needinfo?(docfaraday)
(Reporter)

Comment 13

3 years ago
I haven't messed with it since comment 8
Flags: needinfo?(docfaraday)
(Assignee)

Comment 14

3 years ago
For avoidance of doubt, is the patch in bug 1059572 the patch before or after fixing the "obvious deadlocks" mentioned in comment 8?  (Testing it seems to reveal several hangs, mostly in what looks like IndexedDB-related things.)  If it's not, could you provide an updated patch?
Flags: needinfo?(docfaraday)
(Reporter)

Comment 15

3 years ago
Yes, the patch on bug 1059572 has all of the deadlock fixes I have written.
Flags: needinfo?(docfaraday)
(Assignee)

Comment 16

3 years ago
Created attachment 8561398 [details] [diff] [review]
narrow the scope of unlocking mMonitor in nsTimerImpl::PostTimerEvents

I haven't redone all the wandering through the timer code I did for comment 3
and comment 5.  But based on the potential race identified in comment 5 and
Byron's positive response in comment 6, I believe this patch closes that
particular hole.  We only need to have the monitor unlocked during the timer
posting process; we don't need to have it unlocked for twiddling with the
refcount in the case of an error.

I'm not convinced this solves all the threadsafety problems...the
nsTimerImpl::Release implementation is still pretty scary...
Attachment #8561398 - Flags: review?(benjamin)

Updated

3 years ago
Attachment #8561398 - Flags: review?(benjamin) → review+
(Assignee)

Comment 17

3 years ago
Comment on attachment 8561398 [details] [diff] [review]
narrow the scope of unlocking mMonitor in nsTimerImpl::PostTimerEvents

[Security approval request comment]
How easily could an exploit be constructed based on the patch?

Not easily.  The UAF has survived many rounds of ASAN/Valgrind testing and never manifested there; I suspect constructing an actual exploit would be difficult.

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?

No.

Which older supported branches are affected by this flaw?

All of them.

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?

I have not made branch-specific patches, but the creation of them should not be difficult (this is stable code), and their associated risk is no greater than this patch.

How likely is this patch to cause regressions; how much testing does it need?

The patch is handling a specific failure mode that is rarely seen.  Our normal testing regimen should be sufficient.
Attachment #8561398 - Flags: sec-approval?
Assignee: nobody → nfroyd
status-b2g-v1.4: --- → affected
status-b2g-v2.0: --- → affected
status-b2g-v2.0M: --- → affected
status-b2g-v2.1: --- → affected
status-b2g-v2.1S: --- → affected
status-b2g-v2.2: --- → affected
status-b2g-master: --- → affected
status-firefox35: --- → wontfix
status-firefox36: --- → affected
status-firefox37: --- → affected
status-firefox38: --- → fixed
status-firefox-esr31: --- → affected
(Assignee)

Comment 18

3 years ago
We haven't landed the patch yet. :p
status-firefox38: fixed → affected
Let's get this into trunk but also get patches for Aurora, Beta, and ESR31 nominated and in by the end of the week if possible.
tracking-firefox36: --- → +
tracking-firefox37: --- → +
tracking-firefox38: --- → +
tracking-firefox-esr31: --- → 36+
Attachment #8561398 - Flags: sec-approval? → sec-approval+
It turns out the last Beta build goes to build tomorrow so we need this on the Beta branch by then if it is going to make Firefox 36 (and probably ESR31) then too.
Flags: needinfo?(nfroyd)
(Assignee)

Comment 21

3 years ago
(In reply to Al Billings [:abillings] from comment #20)
> It turns out the last Beta build goes to build tomorrow so we need this on
> the Beta branch by then if it is going to make Firefox 36 (and probably
> ESR31) then too.

No special futzing is needed to apply this patch to beta.  But given the tight schedule, how does landing this work?  Landing it on inbound, and then on aurora and beta straightaway, without waiting for a merge?
Flags: needinfo?(nfroyd) → needinfo?(abillings)
If you land on inbound now, you shouldn't have any issue getting it to aurora/beta on time by following the usual practices. Just please get it nominated ASAP.
(Assignee)

Comment 24

3 years ago
Comment on attachment 8561398 [details] [diff] [review]
narrow the scope of unlocking mMonitor in nsTimerImpl::PostTimerEvents

Approval Request Comment
[Feature/regressing bug #]: Something lost in the mists of time
[User impact if declined]: Possible exploitable security hole
[Describe test coverage new/current, TreeHerder]:
[Risks and why]: Low.  It does change locking behavior, but not in unusual ways.
[String/UUID change made/needed]: None.
Attachment #8561398 - Flags: approval-mozilla-beta?
Attachment #8561398 - Flags: approval-mozilla-aurora?
(Assignee)

Comment 25

3 years ago
Comment on attachment 8561398 [details] [diff] [review]
narrow the scope of unlocking mMonitor in nsTimerImpl::PostTimerEvents

[Approval Request Comment]
User impact if declined: Possibly exploitable security hole.
Fix Landed on Version: Gecko 38.
Risk to taking this patch (and alternatives if risky): Low.  It does change locking behavior, but not in unusual ways.
String or UUID changes made by this patch: None.

See https://wiki.mozilla.org/Release_Management/ESR_Landing_Process for more info.
Attachment #8561398 - Flags: approval-mozilla-esr31?
https://hg.mozilla.org/mozilla-central/rev/faf3c7ba3985
Status: NEW → RESOLVED
Last Resolved: 3 years ago
status-b2g-master: affected → fixed
status-firefox38: affected → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla38
Attachment #8561398 - Flags: approval-mozilla-esr31?
Attachment #8561398 - Flags: approval-mozilla-esr31+
Attachment #8561398 - Flags: approval-mozilla-beta?
Attachment #8561398 - Flags: approval-mozilla-beta+
Attachment #8561398 - Flags: approval-mozilla-aurora?
Attachment #8561398 - Flags: approval-mozilla-aurora+
Keywords: checkin-needed
Nathan, Al, this introduced bug 1133381 and it is blocking the 36 release.
I would like to backout this patch (or get a fix for bug 1133381). Can you help?
Flags: needinfo?(nfroyd)
Flags: needinfo?(abillings)
(Assignee)

Comment 29

3 years ago
(In reply to Sylvestre Ledru [:sylvestre] from comment #28)
> Nathan, Al, this introduced bug 1133381 and it is blocking the 36 release.
> I would like to backout this patch (or get a fix for bug 1133381). Can you
> help?

I personally don't understand how this could have caused bug 1133381, but I also have to admit that given the regression range there, this bug is the likeliest candidate.

However, I don't know what the protocol for dealing with sec-sensitive bugs that cause crashes is.  Do we just back the change out on beta, or all affected branches?  (I'd really like some input from the security team, unless the release process has dealt with things like this before...)
Flags: needinfo?(nfroyd) → needinfo?(sledru)
If this bug is the cause, I'd just back it out on the Beta branch so it doesn't affect the release. We have more time on other branches.
Flags: needinfo?(abillings)
Looks like Al answered to Nathan's question.
Ryan, could you backout the patch ? (beta only for now). Thanks
CC Lawrence to make sure he knows for 37.
Flags: needinfo?(sledru)
Flags: needinfo?(ryanvm)
Flags: needinfo?(lmandel)
Backed out from all Gecko <37 per IRC conversation with Sylvestre.

https://hg.mozilla.org/releases/mozilla-beta/rev/ad8bd14634dd
https://hg.mozilla.org/releases/mozilla-b2g34_v2_1/rev/da49bdf1cf41
https://hg.mozilla.org/releases/mozilla-b2g32_v2_0/rev/8fbfc80ec260
https://hg.mozilla.org/releases/mozilla-b2g30_v1_4/rev/4d96752782bf
https://hg.mozilla.org/releases/mozilla-esr31/rev/b5228115a202
status-b2g-v1.4: fixed → affected
status-b2g-v2.0: fixed → affected
status-b2g-v2.1: fixed → affected
status-firefox36: fixed → wontfix
status-firefox-esr31: fixed → affected
tracking-firefox-esr31: 36+ → ?
Flags: needinfo?(ryanvm)
Depends on: 1133381
I have tracked bug 1133381. We'll need to decide whether we can fix that bug or need to back this out of 37 as well.
Flags: needinfo?(lmandel)
No longer depends on: 1133381
tracking-firefox-esr31: ? → 37+
Whiteboard: [adv-main37+][adv-esr31.6+]
Whiteboard: [adv-main37+][adv-esr31.6+] → [adv-main37+][adv-esr31.6+][b2g-adv-main2.2+]

Updated

3 years ago
Group: core-security → core-security-release
Group: core-security-release
Depends on: 1334701
You need to log in before you can comment on or make changes to this bug.