Closed
Bug 1033121
Opened 10 years ago
Closed 10 years ago
Race in nsTimerEvent destructor
Categories
(Core :: XPCOM, defect)
Core
XPCOM
Tracking
()
People
(Reporter: jwwang, Assigned: jwwang)
References
Details
(Keywords: csectype-race, sec-moderate, Whiteboard: [adv-main32+][adv-esr31.1+])
Attachments
(1 file)
1.07 KB,
patch
|
bzbarsky
:
review+
Sylvestre
:
approval-mozilla-aurora+
Sylvestre
:
approval-mozilla-beta-
Sylvestre
:
approval-mozilla-esr31+
|
Details | Diff | Splinter Review |
I found this problem when debugging timeouts in test_playback_rate.html. [1] It looks like it is possible for |event| [2] to be executed in the target thread before nsTimerImpl::PostTimerEvent returns. Then nsTimerEvent will be deleted on exit of nsTimerImpl::PostTimerEvent in the timer thread. In the destructor of nsTimerEvent, it will call nsTimerImpl::Release in the timer thread. At the same time, it is also possible for nsTimerImpl::InitWithCallback to be called in the target thread. Therefore we might hit bug 997844 comment 0 again. [1] bug 1020707 comment 42 [2] http://hg.mozilla.org/mozilla-central/file/3a28d3bb4f55/xpcom/threads/nsTimerImpl.cpp#l763
Assignee | ||
Comment 1•10 years ago
|
||
Hi Boris, Can you check if this is a problem of nsTimer?
Flags: needinfo?(bzbarsky)
Updated•10 years ago
|
Group: core-security
Comment 3•10 years ago
|
||
Doesn't seem so. Does Dispatch not take the reference of the passed Runnable when it succeeds? If not, then yeah we could end up back in the scenario described in bug 997844. Ideally, the behavior I would want from Dispatch is an optional reference transfer (ie; if it works, the reference is transferred, otherwise it is not), but this may not be feasible. If it isn't, I think we'd have to refactor so that we did not need to unlock the monitor here: http://dxr.mozilla.org/mozilla-central/source/xpcom/threads/TimerThread.cpp#253 This would prevent the target thread from getting to AddTimerInternal while all this is going on, which would also fix the race in bug 1024765 comment 33 I think.
Assignee | ||
Comment 4•10 years ago
|
||
The signature is Dispatch(nsIRunnable *event, uint32_t flags). The event target can only add reference but not take reference on success.
Comment 5•10 years ago
|
||
> It looks like it is possible for |event| [2] to be executed in the target thread before
> nsTimerImpl::PostTimerEvent returns.
Yes, this is always possible for event-dispatching situations. In particular, |event| can execute before Dispatch() has returned...
You're right that this is a problem.
I think the right thing to do here is to null out mTimer in nsTimerEvent::Run() before any of its return statements. Agreed?
Flags: needinfo?(bzbarsky) → needinfo?(jwwang)
OS: Android → All
Hardware: ARM → All
Assignee | ||
Comment 10•10 years ago
|
||
Release |mTimer| in nsTimerEvent::Run() instead of ~nsTimerEvent() to avoid race.
Comment 11•10 years ago
|
||
Comment on attachment 8451627 [details] [diff] [review] 1033121_fix_race_in_nsTimerEvent_destructor-v1.patch r=me. I can't think of as sane way to add a testcase for this. :(
Attachment #8451627 -
Flags: review?(bzbarsky) → review+
Assignee | ||
Comment 12•10 years ago
|
||
How about adding assertions to check most nsITimer functions should be called in the target thread.
Assignee | ||
Comment 13•10 years ago
|
||
try: https://tbpl.mozilla.org/?tree=Try&rev=10b52bb4f343 most green.
Keywords: checkin-needed
Comment 14•10 years ago
|
||
> How about adding assertions
That might work, yes. Want to give it a shot on try without this patch?
Comment 16•10 years ago
|
||
Is it exploitable and rated sec-high or sec-crit? Does it affect more than trunk? If yes to both, then it does. It seems kind of sec-high-ish to me, because it sounds like it could lead to a use after free or something.
Flags: needinfo?(continuation)
Keywords: csectype-race
(I was kind of hoping you would rate it ...)
Comment 18•10 years ago
|
||
I don't know what is actually going wrong, and I don't understand comment 0 here or in bug 997844.
Comment 19•10 years ago
|
||
So, I don't think this could lead to a UAF. The fundamental problem here is that an nsTimerImpl could fail to be scheduled if the user gets really unlucky. This could cascade and cause other problems, but likely not a UAF (more likely stuff like leaks and hangs, and asserts in debug builds probably). Coupled with the fact that this is an extremely tight race, it is probably somewhat difficult to trigger in an attack. @jwwang Were you seeing dangerous-looking crashes (ie; anything other than an assert) with this?
Flags: needinfo?(jwwang)
Updated•10 years ago
|
Keywords: sec-moderate
Assignee | ||
Comment 20•10 years ago
|
||
It seems theoretically possible to cause use-after-free or memory corruption if [1] is run in the timer thread after [2] is run in the target thread when 2 threads are calling nsTimerImpl::Release() at the same time. [1] http://hg.mozilla.org/mozilla-central/file/7f9db2379b3f/xpcom/threads/nsTimerImpl.cpp#l264 [2] http://hg.mozilla.org/mozilla-central/file/7f9db2379b3f/xpcom/threads/nsTimerImpl.cpp#l230 Btw, what is UAF?
Flags: needinfo?(jwwang) → needinfo?(bzbarsky)
UAF = Use After Free
Comment 22•10 years ago
|
||
(In reply to JW Wang [:jwwang] from comment #20) > It seems theoretically possible to cause use-after-free or memory corruption > if [1] is run in the timer thread after [2] is run in the target thread when > 2 threads are calling nsTimerImpl::Release() at the same time. Ah, that's lovely. We've managed to take a threadsafe refcount, and make it non-threadsafe by overriding Release to do a whole bunch of extra stuff.
Updated•10 years ago
|
Flags: needinfo?(bzbarsky)
Assignee | ||
Comment 23•10 years ago
|
||
Btw, there is also a race in nsTimerImpl::SetDelayInternal() which is called by nsTimerImpl::PostTimerEvent which is called in the timer thread. We should really think about making nsTimerImpl thread-safe. Lockless code is just hard to get right.
Comment 24•10 years ago
|
||
(In reply to JW Wang [:jwwang] from comment #23) > Btw, there is also a race in nsTimerImpl::SetDelayInternal() which is called > by nsTimerImpl::PostTimerEvent which is called in the timer thread. We > should really think about making nsTimerImpl thread-safe. Lockless code is > just hard to get right. Yes, I believe we do have another race here, because we have no assurance that |Fire| will not be called on the target thread at the same time as |PostTimerEvent| is called on the timer thread, either because we're using repeating precise timers, or because something racy has happened with |mGeneration| (yes, there is a race here too, because it is written on the target thread and read on the timer thread with no sync barrier). IMO, What we really need to do is greatly expand unit-test for this, and run it through its paces with TSAN.
Assignee | ||
Comment 27•10 years ago
|
||
So the way forward is to request sec-approval, write comments about that, and then put "check-in-needed" again?
Comment 28•10 years ago
|
||
sec-moderates don't need approval. https://wiki.mozilla.org/Security/Bug_Approval_Process
Assignee | ||
Updated•10 years ago
|
Keywords: checkin-needed
Comment 29•10 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/2f00e114a489 Probably worth nominating this for at least Aurora uplift. Not sure about beta/b2g30 given where we are in the release cycle. I can always disable my way to victory on any media tests being bitten by this if need-be.
status-b2g-v1.4:
--- → ?
status-b2g-v2.0:
--- → ?
status-b2g-v2.1:
--- → affected
status-firefox31:
--- → ?
status-firefox32:
--- → ?
status-firefox33:
--- → affected
Keywords: checkin-needed
Comment 30•10 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/2f00e114a489
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla33
Updated•10 years ago
|
status-firefox-esr24:
--- → wontfix
Assignee | ||
Comment 31•10 years ago
|
||
Comment on attachment 8451627 [details] [diff] [review] 1033121_fix_race_in_nsTimerEvent_destructor-v1.patch Approval Request Comment [Feature/regressing bug #]:unknown [User impact if declined]:timer callback will cease to fire which will impact the overall system functionality. [Describe test coverage new/current, TBPL]: aurora - https://tbpl.mozilla.org/?tree=Try&rev=cb7a6367e147 beta - https://tbpl.mozilla.org/?tree=Try&rev=2e704d9d39d6 It already runs fine on inbound/central for a week. [Risks and why]: low for the change is simple and root cause is well known [String/UUID change made/needed]:none
Attachment #8451627 -
Flags: approval-mozilla-beta?
Attachment #8451627 -
Flags: approval-mozilla-aurora?
Updated•10 years ago
|
Comment 32•10 years ago
|
||
Comment on attachment 8451627 [details] [diff] [review] 1033121_fix_race_in_nsTimerEvent_destructor-v1.patch Too late for 31. Not sure it is critical enough for ESR 31.
Attachment #8451627 -
Flags: approval-mozilla-esr31?
Attachment #8451627 -
Flags: approval-mozilla-beta?
Attachment #8451627 -
Flags: approval-mozilla-beta-
Attachment #8451627 -
Flags: approval-mozilla-aurora?
Attachment #8451627 -
Flags: approval-mozilla-aurora+
Assignee | ||
Comment 33•10 years ago
|
||
I don't think so. For now it seems only Android will hit this bug.
Updated•10 years ago
|
Comment 34•10 years ago
|
||
https://hg.mozilla.org/releases/mozilla-aurora/rev/86f331357783 Not sure if this is something that could affect real B2G devices in the wild or not.
Assignee | ||
Comment 35•10 years ago
|
||
Comment on attachment 8451627 [details] [diff] [review] 1033121_fix_race_in_nsTimerEvent_destructor-v1.patch [Approval Request Comment] Bug caused by (feature/regressing bug #): unknown User impact if declined: timer callback will cease to fire which will impact the overall system functionality. Testing completed: It has run fine on inbound/central for about 2 weeks. try: https://tbpl.mozilla.org/?tree=Try&rev=95651f209cc9 Risk to taking this patch (and alternatives if risky): low for the change is simple and root cause is well known String or UUID changes made by this patch:none
Attachment #8451627 -
Flags: approval-mozilla-b2g30?
Updated•10 years ago
|
Comment 37•10 years ago
|
||
Let's land this in v1.4. Hi JW, Do we need to rebase the patch for v1.4. Thanks?
blocking-b2g: 1.4? → 1.4+
Flags: needinfo?(jwwang)
Updated•10 years ago
|
Attachment #8451627 -
Flags: approval-mozilla-b2g30?
Comment 40•10 years ago
|
||
Comment on attachment 8451627 [details] [diff] [review] 1033121_fix_race_in_nsTimerEvent_destructor-v1.patch Taking it because ESR31 is also used for Android 2.2 and ARMv6 31 release...
Attachment #8451627 -
Flags: approval-mozilla-esr31? → approval-mozilla-esr31+
Comment 42•10 years ago
|
||
Marking qe-verify- for verification purposes due to lack of test case or easier STR. Please feel free to provide if you'd like me to verify the fix. Thank you.
Flags: qe-verify-
Updated•10 years ago
|
Whiteboard: [adv-main32+][adv-esr31.1+]
Updated•9 years ago
|
Group: core-security → core-security-release
Updated•8 years ago
|
Group: core-security-release
You need to log in
before you can comment on or make changes to this bug.
Description
•