Closed
Bug 1339588
Opened 4 years ago
Closed 3 years ago
Various rare crashes in nsTimer, nsTimerImpl, nsTimerEvent
Categories
(Core :: XPCOM, defect)
Tracking
()
People
(Reporter: bwc, Assigned: bwc)
Details
(4 keywords, Whiteboard: [post-critsmash-triage])
Attachments
(3 files, 3 obsolete files)
1.57 KB,
patch
|
froydnj
:
review+
abillings
:
sec-approval+
|
Details | Diff | Splinter Review |
1.93 KB,
patch
|
froydnj
:
review+
abillings
:
sec-approval+
|
Details | Diff | Splinter Review |
2.99 KB,
patch
|
froydnj
:
review+
dveditz
:
approval-mozilla-aurora+
abillings
:
sec-approval+
|
Details | Diff | Splinter Review |
crash-stats has a smallish background rate of crashes that look like memory management problems (the races in nsTimerImpl itself seem to be fixed, though). https://crash-stats.mozilla.com/search/?signature=~nsTimer&version=52.0b5&date=%3E%3D2017-02-07T21%3A13%3A50.000Z&date=%3C2017-02-14T21%3A13%3A50.000Z&_sort=-date&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-signature In particular, the crashes in nsTimerEvent::Run are suspicious, and hint that something racy is being done to nsTimerEvent::mTimer (perhaps nsTimerEvent::Cancel is racing?).
Assignee | ||
Comment 1•4 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=b60722676a21
Assignee | ||
Comment 2•4 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=3241c0b2b6c7
Updated•4 years ago
|
Group: core-security → dom-core-security
Comment 3•4 years ago
|
||
I'm going to mark this sec-high, but maybe lower would be okay, too.
Keywords: sec-high
Assignee | ||
Comment 4•4 years ago
|
||
MozReview-Commit-ID: 6br6DaDqxR0
Attachment #8838611 -
Flags: review?(nfroyd)
Assignee | ||
Updated•4 years ago
|
Assignee: nobody → docfaraday
Status: NEW → ASSIGNED
Assignee | ||
Comment 5•4 years ago
|
||
MozReview-Commit-ID: BXCGYWnFqSj
Attachment #8838613 -
Flags: review?(nfroyd)
Assignee | ||
Comment 6•4 years ago
|
||
These changes might do the trick. Hard to say for sure.
![]() |
||
Comment 7•4 years ago
|
||
Comment on attachment 8838611 [details] [diff] [review] Part 1: Simplify nsTimerEvent::Cancel, since there's no need to release the nsTimerImpl itself Review of attachment 8838611 [details] [diff] [review]: ----------------------------------------------------------------- I don't think this patch is right, but I'm happy to hear counterarguments as to why I'm wrong. ::: xpcom/threads/TimerThread.cpp @@ -284,5 @@ > { > - if (!mTimer) { > - MOZ_ASSERT(false); > - return NS_OK; > - } Why is this block safe to remove? If the timer gets Cancel'd after the event is already in the event queue, we don't want Run() to do anything. ::: xpcom/threads/nsTimerImpl.cpp @@ +128,5 @@ > mImpl->Cancel(); > mImpl = nullptr; > + // This might not be 0, technically speaking, but as far as the caller of > + // Release is concerned this object is dead. > + return 0; I'm skeptical of this change, since we use the return value of this method for general invariant checking.
Attachment #8838611 -
Flags: review?(nfroyd)
![]() |
||
Comment 8•4 years ago
|
||
Comment on attachment 8838613 [details] [diff] [review] Part 2: Help prevent nullptr crashes due to misuse of the timer API Review of attachment 8838613 [details] [diff] [review]: ----------------------------------------------------------------- Canceling review while we get this worked out. ::: xpcom/threads/nsTimerImpl.cpp @@ +452,5 @@ > oldDelay = mDelay; > oldTimeout = mTimeout; > + // Ensure that the nsITimer does not unhook from the nsTimerImpl during > + // Fire; this will cause null pointer crashes if the user of the timer drops > + // its reference, and then uses the nsITimer* passed in the callback. The scenario here is that: T1: Create timer. T1: timer->SetTarget(T2) ....time passes T2: nsTimeImpl::Fire is running. T1: null out nsCOMPtr to timer (nsITimer). T2: Pass nsITimer* from strong reference, which is bogus Is that right? TimerImpl is still holding a strong reference to nsITimer* thing, though...or does that get removed somehow?
Attachment #8838613 -
Flags: review?(nfroyd)
Assignee | ||
Comment 9•4 years ago
|
||
Comment on attachment 8838611 [details] [diff] [review] Part 1: Simplify nsTimerEvent::Cancel, since there's no need to release the nsTimerImpl itself Review of attachment 8838611 [details] [diff] [review]: ----------------------------------------------------------------- ::: xpcom/threads/TimerThread.cpp @@ -284,5 @@ > { > - if (!mTimer) { > - MOZ_ASSERT(false); > - return NS_OK; > - } If nsTimerEvent::Cancel() is called, we call nsTimerImpl::Cancel() which will prevent nsTimerEvent::Run() from doing anything besides logging. Are you saying to want to avoid the logging below if either the Runnable or the timer itself is cancelled? That could be done, I suppose. ::: xpcom/threads/nsTimerImpl.cpp @@ +128,5 @@ > mImpl->Cancel(); > mImpl = nullptr; > + // This might not be 0, technically speaking, but as far as the caller of > + // Release is concerned this object is dead. > + return 0; I worry that something will see a return of 1 from NS_RELEASE and assume the object isn't dead yet (which would be an invalid assumption in the general case). If you think I shouldn't worry about this, I can remove.
Assignee | ||
Updated•4 years ago
|
Flags: needinfo?(nfroyd)
![]() |
||
Comment 10•4 years ago
|
||
(In reply to Byron Campen [:bwc] from comment #9) > ::: xpcom/threads/TimerThread.cpp > @@ -284,5 @@ > > { > > - if (!mTimer) { > > - MOZ_ASSERT(false); > > - return NS_OK; > > - } > > If nsTimerEvent::Cancel() is called, we call nsTimerImpl::Cancel() which > will prevent nsTimerEvent::Run() from doing anything besides logging. Are > you saying to want to avoid the logging below if either the Runnable or the > timer itself is cancelled? That could be done, I suppose. Oh, oh, I see. Sorry I got mixed up here: nsTimerImpl::Fire will take care of all the details of making the timer not actually do things, not nsTimerEvent::Run. OK, that part makes sense. > ::: xpcom/threads/nsTimerImpl.cpp > @@ +128,5 @@ > > mImpl->Cancel(); > > mImpl = nullptr; > > + // This might not be 0, technically speaking, but as far as the caller of > > + // Release is concerned this object is dead. > > + return 0; > > I worry that something will see a return of 1 from NS_RELEASE and assume the > object isn't dead yet (which would be an invalid assumption in the general > case). If you think I shouldn't worry about this, I can remove. I think you shouldn't worry about it; I'd rather make minimal changes to the timer code if possible, and breaking Release's contract doesn't seem like a good idea in any event.
Flags: needinfo?(nfroyd)
Assignee | ||
Comment 11•4 years ago
|
||
(In reply to Nathan Froyd [:froydnj] from comment #8) > Comment on attachment 8838613 [details] [diff] [review] > Part 2: Help prevent nullptr crashes due to misuse of the timer API > > Review of attachment 8838613 [details] [diff] [review]: > ----------------------------------------------------------------- > > Canceling review while we get this worked out. > > ::: xpcom/threads/nsTimerImpl.cpp > @@ +452,5 @@ > > oldDelay = mDelay; > > oldTimeout = mTimeout; > > + // Ensure that the nsITimer does not unhook from the nsTimerImpl during > > + // Fire; this will cause null pointer crashes if the user of the timer drops > > + // its reference, and then uses the nsITimer* passed in the callback. > > The scenario here is that: > > T1: Create timer. > T1: timer->SetTarget(T2) > ....time passes > T2: nsTimeImpl::Fire is running. > T1: null out nsCOMPtr to timer (nsITimer). > T2: Pass nsITimer* from strong reference, which is bogus > > Is that right? TimerImpl is still holding a strong reference to nsITimer* > thing, though...or does that get removed somehow? The scenario is: T2: nsTimerImpl is running, during which the following happens: 1. T2 drops its (only) reference to the nsITimer, causing the refcount to go to 1. 2. nsTimer::Release sees the refcount of 1, then turns around and nulls out nsTimer::mTimerImpl. 3. Then, still in the callback, T2 uses the nsITimer* param in the callback to grab a _new_ (non-functional!) reference to the nsITimer. Now we have T2 holding onto an nsITimer with no implementation, leading to a null pointer crash when it is used. Adding another ref during fire prevents the refcount from going to 1, and this scenario from unfolding. We could do the same thing if we could pass an already_addRefed<nsITimer> or something to the callback.
Assignee | ||
Comment 12•4 years ago
|
||
(In reply to Nathan Froyd [:froydnj] from comment #10) > (In reply to Byron Campen [:bwc] from comment #9) > > ::: xpcom/threads/nsTimerImpl.cpp > > @@ +128,5 @@ > > > mImpl->Cancel(); > > > mImpl = nullptr; > > > + // This might not be 0, technically speaking, but as far as the caller of > > > + // Release is concerned this object is dead. > > > + return 0; > > > > I worry that something will see a return of 1 from NS_RELEASE and assume the > > object isn't dead yet (which would be an invalid assumption in the general > > case). If you think I shouldn't worry about this, I can remove. > > I think you shouldn't worry about it; I'd rather make minimal changes to the > timer code if possible, and breaking Release's contract doesn't seem like a > good idea in any event. Ok, will change.
Assignee | ||
Comment 13•4 years ago
|
||
.
Assignee | ||
Updated•4 years ago
|
Attachment #8838611 -
Attachment is obsolete: true
Assignee | ||
Updated•4 years ago
|
Attachment #8839688 -
Flags: review?(nfroyd)
Assignee | ||
Updated•4 years ago
|
Attachment #8838613 -
Flags: review?(nfroyd)
![]() |
||
Comment 14•4 years ago
|
||
(In reply to Byron Campen [:bwc] from comment #11) > (In reply to Nathan Froyd [:froydnj] from comment #8) > > Comment on attachment 8838613 [details] [diff] [review] > > Part 2: Help prevent nullptr crashes due to misuse of the timer API > > > > Review of attachment 8838613 [details] [diff] [review]: > > ----------------------------------------------------------------- > > > > Canceling review while we get this worked out. > > > > ::: xpcom/threads/nsTimerImpl.cpp > > @@ +452,5 @@ > > > oldDelay = mDelay; > > > oldTimeout = mTimeout; > > > + // Ensure that the nsITimer does not unhook from the nsTimerImpl during > > > + // Fire; this will cause null pointer crashes if the user of the timer drops > > > + // its reference, and then uses the nsITimer* passed in the callback. > > > > The scenario here is that: > > > > T1: Create timer. > > T1: timer->SetTarget(T2) > > ....time passes > > T2: nsTimeImpl::Fire is running. > > T1: null out nsCOMPtr to timer (nsITimer). > > T2: Pass nsITimer* from strong reference, which is bogus > > > > Is that right? TimerImpl is still holding a strong reference to nsITimer* > > thing, though...or does that get removed somehow? > > The scenario is: > > T2: nsTimerImpl is running, during which the following happens: > 1. T2 drops its (only) reference to the nsITimer, causing the refcount to > go to 1. > 2. nsTimer::Release sees the refcount of 1, then turns around and nulls > out nsTimer::mTimerImpl. > 3. Then, still in the callback, T2 uses the nsITimer* param in the > callback to grab a _new_ (non-functional!) reference to the nsITimer. > > Now we have T2 holding onto an nsITimer with no implementation, leading to a > null pointer crash when it is used. Adding another ref during fire prevents > the refcount from going to 1, and this scenario from unfolding. We could do > the same thing if we could pass an already_addRefed<nsITimer> or something > to the callback. Ugh, who is doing this? I wish we didn't have to addref/release an extra check just because people are doing weird things.
![]() |
||
Updated•4 years ago
|
Attachment #8839688 -
Flags: review?(nfroyd) → review+
![]() |
||
Updated•4 years ago
|
Attachment #8838613 -
Flags: review?(nfroyd) → review+
Assignee | ||
Comment 15•4 years ago
|
||
(In reply to Nathan Froyd [:froydnj] from comment #14) > (In reply to Byron Campen [:bwc] from comment #11) > > (In reply to Nathan Froyd [:froydnj] from comment #8) > > > Comment on attachment 8838613 [details] [diff] [review] > > > Part 2: Help prevent nullptr crashes due to misuse of the timer API > > > > > > Review of attachment 8838613 [details] [diff] [review]: > > > ----------------------------------------------------------------- > > > > > > Canceling review while we get this worked out. > > > > > > ::: xpcom/threads/nsTimerImpl.cpp > > > @@ +452,5 @@ > > > > oldDelay = mDelay; > > > > oldTimeout = mTimeout; > > > > + // Ensure that the nsITimer does not unhook from the nsTimerImpl during > > > > + // Fire; this will cause null pointer crashes if the user of the timer drops > > > > + // its reference, and then uses the nsITimer* passed in the callback. > > > > > > The scenario here is that: > > > > > > T1: Create timer. > > > T1: timer->SetTarget(T2) > > > ....time passes > > > T2: nsTimeImpl::Fire is running. > > > T1: null out nsCOMPtr to timer (nsITimer). > > > T2: Pass nsITimer* from strong reference, which is bogus > > > > > > Is that right? TimerImpl is still holding a strong reference to nsITimer* > > > thing, though...or does that get removed somehow? > > > > The scenario is: > > > > T2: nsTimerImpl is running, during which the following happens: > > 1. T2 drops its (only) reference to the nsITimer, causing the refcount to > > go to 1. > > 2. nsTimer::Release sees the refcount of 1, then turns around and nulls > > out nsTimer::mTimerImpl. > > 3. Then, still in the callback, T2 uses the nsITimer* param in the > > callback to grab a _new_ (non-functional!) reference to the nsITimer. > > > > Now we have T2 holding onto an nsITimer with no implementation, leading to a > > null pointer crash when it is used. Adding another ref during fire prevents > > the refcount from going to 1, and this scenario from unfolding. We could do > > the same thing if we could pass an already_addRefed<nsITimer> or something > > to the callback. > > Ugh, who is doing this? I wish we didn't have to addref/release an extra > check just because people are doing weird things. Not sure. This is the only way I can think of that would explain some of the crashes I'm seeing. I could be totally wrong, of course. We'll just have to watch crash-stats and see...
Assignee | ||
Comment 16•4 years ago
|
||
Comment on attachment 8838613 [details] [diff] [review] Part 2: Help prevent nullptr crashes due to misuse of the timer API [Security approval request comment] How easily could an exploit be constructed based on the patch? This would be really, really hard probably. Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? The problem is subtle. Someone could figure it out. Then again, it is just a nullptr crash. Which older supported branches are affected by this flaw? 52 I think. If not all supported branches, which bug introduced the flaw? Bug 1157323 Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be? I suspect it won't be a totally clean application, but it shouldn't be too hard. How likely is this patch to cause regressions; how much testing does it need? I can't see how this would make anything less safe.
Attachment #8838613 -
Flags: sec-approval?
Assignee | ||
Comment 17•4 years ago
|
||
Comment on attachment 8839688 [details] [diff] [review] Part 1: Simplify nsTimerEvent::Cancel, since there's no need to release the nsTimerImpl itself [Security approval request comment] How easily could an exploit be constructed based on the patch? Pretty hard, honestly. It can only be triggered by misuse in c++ land, which would probably be really hard to trigger from JS. Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? Not really, it just looks like a simplification. Which older supported branches are affected by this flaw? I think this was present in 46 on. If not all supported branches, which bug introduced the flaw? Bug 901097. Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be? Until 52, it should be easy. Past that, much much harder, since the necessary threadsafety guarantees just aren't there. How likely is this patch to cause regressions; how much testing does it need? I don't think it is very likely. If something is already leaking Cancelled nsTimerEvents, now nsTimerImpl and nsITimer will be leaked too.
Attachment #8839688 -
Flags: sec-approval?
Comment 18•4 years ago
|
||
We'll need branch patches hee but this may have missed the last beta. I'll need to loop in release management to see if we can take this.
Flags: needinfo?(rkothari)
Flags: needinfo?(lhenry)
Flags: needinfo?(jcristau)
I'm on the fence here, leaning towards not taking this for 52 since we are just about to go to build with the last beta and it may be hard to exploit. If julien disagrees we could still take it for next monday's RC build.
Flags: needinfo?(lhenry)
Assignee | ||
Comment 20•4 years ago
|
||
I'm kinda tempted to say land in nightly, and see what happens before we decide to uplift, since I'm not 100% sure these crashes are caused by the problems these patches fix.
Comment 21•4 years ago
|
||
I agree with comment 19.
Flags: needinfo?(rkothari)
Flags: needinfo?(jcristau)
Comment 22•4 years ago
|
||
(In reply to Byron Campen [:bwc] from comment #20) > I'm kinda tempted to say land in nightly, and see what happens before we > decide to uplift, since I'm not 100% sure these crashes are caused by the > problems these patches fix. All right. I'll sec-approval+ this but let's only take it on nightly for now.
Updated•4 years ago
|
Attachment #8838613 -
Flags: sec-approval? → sec-approval+
Updated•4 years ago
|
Attachment #8839688 -
Flags: sec-approval? → sec-approval+
Assignee | ||
Comment 23•4 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=f21b4b01a1a9
Assignee | ||
Comment 24•4 years ago
|
||
Ugh, lots of permaorange on that push. Looks like it had other causes, but I'll do another try based on latest m-c to be sure.
Assignee | ||
Comment 25•4 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=0cde3d66e0b7
Assignee | ||
Comment 26•4 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/e133ed915c7685885f67f589fc2f72f0ddfe35eb Bug 1339588 - Part 1: Simplify nsTimerEvent::Cancel, since there's no need to release the nsTimerImpl itself. r=froydnj https://hg.mozilla.org/integration/mozilla-inbound/rev/56fbe9964a0bccccb6b75a2be4190f52f7a0a502 Bug 1339588 - Part 2: Help prevent nullptr crashes due to misuse of the timer API. r=froydnj
https://hg.mozilla.org/mozilla-central/rev/e133ed915c76 https://hg.mozilla.org/mozilla-central/rev/56fbe9964a0b
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
status-firefox55:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Updated•4 years ago
|
Group: dom-core-security → core-security-release
Updated•4 years ago
|
status-firefox52:
--- → wontfix
status-firefox53:
--- → affected
status-firefox54:
--- → affected
status-firefox-esr45:
--- → unaffected
status-firefox-esr52:
--- → affected
tracking-firefox53:
--- → ?
tracking-firefox54:
--- → ?
tracking-firefox55:
--- → ?
tracking-firefox-esr52:
--- → ?
Comment 28•4 years ago
|
||
Track 53+/54+/55+ as sec-high. Hi :bwc, Since this bug also affects 53/54, do you think it's worth uplifting to 54 at least?
Flags: needinfo?(docfaraday)
Assignee | ||
Comment 29•4 years ago
|
||
We want to see whether these patches actually help the crash rate before uplifting anything.
Flags: needinfo?(docfaraday)
Byron, can you tell if the crash rate improved here? Do you need more help analyzing crash-stats results?
Flags: needinfo?(docfaraday)
Assignee | ||
Comment 31•4 years ago
|
||
It seems to have helped some, but there's still a race it looks like. I've made it harder to hit, but not eliminated it. I need to think of a way to get it for good.
Flags: needinfo?(docfaraday)
Assignee | ||
Comment 32•4 years ago
|
||
MozReview-Commit-ID: J6TNJqGsBv4
Assignee | ||
Updated•4 years ago
|
Attachment #8851650 -
Flags: review?(nfroyd)
Assignee | ||
Comment 33•4 years ago
|
||
Part 3 was written to address the crashes you're seeing over in bug 1352047.
Flags: needinfo?(nfroyd)
![]() |
||
Comment 34•4 years ago
|
||
Comment on attachment 8851650 [details] [diff] [review] Part 3: Don't break the nsTimer/nsTimerImpl cycle during Fire Review of attachment 8851650 [details] [diff] [review]: ----------------------------------------------------------------- Apologies for taking so long to review this. ::: xpcom/threads/nsTimerImpl.cpp @@ +326,5 @@ > ++mGeneration; > > + if (mCallbackDuringFire.mType != Callback::Type::Unknown) { > + // Callback is firing; Cancel did not work! > + return NS_ERROR_FAILURE; I am not comfortable with this; there are a decent number of places that check for timer cancellation failure, and we've never tested those failure codepaths. A couple have things like MOZ_ALWAYS_SUCCEEDS attached to them as well. Maybe if we returned something successful, but bogus (NS_SUCCESS_LOSS_OF_INSIGNIFICANT_DATA?) that we could explicitly check for in Release(), that would be better.
Attachment #8851650 -
Flags: review?(nfroyd) → review-
Assignee | ||
Comment 35•4 years ago
|
||
MozReview-Commit-ID: J6TNJqGsBv4
Assignee | ||
Updated•4 years ago
|
Attachment #8851650 -
Attachment is obsolete: true
Assignee | ||
Comment 36•4 years ago
|
||
Comment on attachment 8853053 [details] [diff] [review] Part 3: Don't break the nsTimer/nsTimerImpl cycle during Fire Review of attachment 8853053 [details] [diff] [review]: ----------------------------------------------------------------- https://treeherder.mozilla.org/#/jobs?repo=try&revision=be4781c70e1c16c471055176af7c77e45926aaf5&selectedJob=87064301
Attachment #8853053 -
Flags: review?(nfroyd)
![]() |
||
Comment 37•4 years ago
|
||
Comment on attachment 8853053 [details] [diff] [review] Part 3: Don't break the nsTimer/nsTimerImpl cycle during Fire Review of attachment 8853053 [details] [diff] [review]: ----------------------------------------------------------------- ::: xpcom/threads/nsTimerImpl.cpp @@ +311,5 @@ > return InitCommon(aDelay, aType); > } > > nsresult > +nsTimerImpl::CancelFailIfFiring() This works too, thank you. @@ +335,5 @@ > + > +nsresult > +nsTimerImpl::Cancel() > +{ > + (void)CancelFailIfFiring(); WDYT about adding a brief comment here, like "// Nobody expects timer canceling to actually fail, so suppress errors." ?
Attachment #8853053 -
Flags: review?(nfroyd) → review+
![]() |
||
Updated•4 years ago
|
Flags: needinfo?(nfroyd)
Assignee | ||
Comment 38•4 years ago
|
||
MozReview-Commit-ID: J6TNJqGsBv4
Assignee | ||
Updated•4 years ago
|
Attachment #8853053 -
Attachment is obsolete: true
Assignee | ||
Comment 39•4 years ago
|
||
Comment on attachment 8854137 [details] [diff] [review] Part 3: Don't break the nsTimer/nsTimerImpl cycle during Fire Review of attachment 8854137 [details] [diff] [review]: ----------------------------------------------------------------- Decided to rename and change signature here, since the name was misleading.
Attachment #8854137 -
Flags: review?(nfroyd)
Assignee | ||
Comment 40•4 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=d9c8a7211e2b660e09dbb226a073f39c7b83aabc
![]() |
||
Comment 41•4 years ago
|
||
Comment on attachment 8854137 [details] [diff] [review] Part 3: Don't break the nsTimer/nsTimerImpl cycle during Fire Review of attachment 8854137 [details] [diff] [review]: ----------------------------------------------------------------- ::: xpcom/threads/nsTimerImpl.h @@ +49,5 @@ > static nsresult Startup(); > static void Shutdown(); > > void SetDelayInternal(uint32_t aDelay, TimeStamp aBase = TimeStamp::Now()); > + bool CancelCheckIfFiring(); This reads oddly to me. How about "AttemptCancel()", with a documentation string describing what the return value indicates?
Attachment #8854137 -
Flags: review?(nfroyd) → review+
Assignee | ||
Comment 42•4 years ago
|
||
I'm not going to try to get part 3 in 53. I don't want to rush this that much.
Assignee | ||
Comment 43•4 years ago
|
||
(In reply to Nathan Froyd [:froydnj] from comment #41) > Comment on attachment 8854137 [details] [diff] [review] > Part 3: Don't break the nsTimer/nsTimerImpl cycle during Fire > > Review of attachment 8854137 [details] [diff] [review]: > ----------------------------------------------------------------- > > ::: xpcom/threads/nsTimerImpl.h > @@ +49,5 @@ > > static nsresult Startup(); > > static void Shutdown(); > > > > void SetDelayInternal(uint32_t aDelay, TimeStamp aBase = TimeStamp::Now()); > > + bool CancelCheckIfFiring(); > > This reads oddly to me. How about "AttemptCancel()", with a documentation > string describing what the return value indicates? A return of true doesn't indicate that the Cancel failed, though. This is literally just Cancel, but it happens to return a bool indicating whether the timer was firing or not.
Comment 44•4 years ago
|
||
(In reply to Byron Campen [:bwc] from comment #42) > I'm not going to try to get part 3 in 53. I don't want to rush this that > much. If you want parts 1 & 2 for 53, they need to be nominated now. Otherwise, we need to wontfix the entire bug for 53. Thanks
Flags: needinfo?(docfaraday)
Comment 46•4 years ago
|
||
wontfix for 53 per comment 45.
Comment 47•4 years ago
|
||
clearing tracking for 53 since we've WONTFIXed it
tracking-firefox53:
+ → ---
Assignee | ||
Comment 48•4 years ago
|
||
Comment on attachment 8854137 [details] [diff] [review] Part 3: Don't break the nsTimer/nsTimerImpl cycle during Fire [Security approval request comment] How easily could an exploit be constructed based on the patch? It would be really hard, I think. Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? Not exactly. Which older supported branches are affected by this flaw? This question is kinda muddy; the timer stuff has been messed up for a long time, and when trying to squeeze the bugs out they manifest in different ways. It would be worth landing this on 54. Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be? These should be easy. How likely is this patch to cause regressions; how much testing does it need? It could cause leaks I suppose, but I think it should be fine.
Attachment #8854137 -
Flags: sec-approval?
Updated•4 years ago
|
Comment 49•4 years ago
|
||
Comment on attachment 8854137 [details] [diff] [review] Part 3: Don't break the nsTimer/nsTimerImpl cycle during Fire sec-approval=dveditz Since we're holding off on 53 (a good idea) we need to wait on landing on ESR-52 until after merge day, but you can land on aurora 54 now. a=dveditz for aurora
Attachment #8854137 -
Flags: approval-mozilla-aurora+
Updated•4 years ago
|
Attachment #8854137 -
Flags: sec-approval? → sec-approval+
Comment 50•4 years ago
|
||
This needs part 3 to land still?
Updated•4 years ago
|
Flags: needinfo?(docfaraday)
Assignee | ||
Comment 51•4 years ago
|
||
Yes. I will be landing this in a few minutes.
Assignee | ||
Comment 52•4 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/721db1be4f27c3d1eec87b40a342f88cfc433716 Bug 1339588 - Part 3: Don't break the nsTimer/nsTimerImpl cycle during Fire. r=froydnj, a=dveditz
Comment 53•4 years ago
|
||
uplift |
https://hg.mozilla.org/releases/mozilla-aurora/rev/48bcbeacbf01 https://hg.mozilla.org/releases/mozilla-aurora/rev/36e47083c3ba https://hg.mozilla.org/releases/mozilla-aurora/rev/4a1d14fea9e0
Assignee | ||
Updated•4 years ago
|
Flags: needinfo?(docfaraday)
Comment 55•4 years ago
|
||
Byron -- Is this ready to be uplifted to ESR52? (I'm getting regular, automatic emails from the relman tracking bot since it's being tracked for ESR52.)
Flags: needinfo?(docfaraday)
Assignee | ||
Comment 56•4 years ago
|
||
Sadly, it looks like we're still seeing crashes in crash-stats for this stuff. Back to the drawing board I guess...
Flags: needinfo?(docfaraday)
Assignee | ||
Updated•4 years ago
|
Assignee | ||
Updated•4 years ago
|
tracking-firefox-esr52:
54+ → ---
Updated•4 years ago
|
Whiteboard: [post-critsmash-triage]
Comment 57•4 years ago
|
||
Resetting "fixed" flags and re-opening since comment 56 since this isn't fixed.
Comment 58•4 years ago
|
||
Byron, did you have a chance to investigate this bug again?
Flags: needinfo?(docfaraday)
Comment 59•4 years ago
|
||
Byron, did you have a chance to investigate this bug again?
Assignee | ||
Comment 60•4 years ago
|
||
No. I am neck-deep in a major rework of the webrtc signaling code.
Flags: needinfo?(docfaraday)
Comment 61•4 years ago
|
||
I suggest Byron continue work on the signaling code, which is important and a large effort, and we see who else can take another look at this, with Byron as a resource to discuss things with. This is very touchy, cross-thread and performance-important code, so a new set of eyes on this might help. A current search shows a fair number of problems (3-400/week, ignoring the top hit which is all ff33 and before): https://crash-stats.mozilla.com/search/?signature=~nsTimer&date=%3E%3D2017-06-08T16%3A15%3A28.000Z&date=%3C2017-06-15T16%3A15%3A28.000Z&_sort=-date&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-signature Many of these may be misuses of data outside of timer itself, such as two threads using obj->mTimer without a lock, and one of them nulling it out. Due to cache synchronization, timing, etc that could cause either a null-deref or a UAF (since the nulling might not be visible in the other, or might already be cached in a register). Finding a way to catch (in debug or otherwise) bad uses of timers would help. Also: billm/etc are changing the MainThread model to a pseudo-coroutine setup (N threads that are "MainThread" and never run at the same time, but all process the virtual event queue for mainthread). See Bug 1365102 for recent changes to nsTimerImpl.cpp, for example. However, I'm concerned if this works perfectly with nsTimerImpl::SetTarget() -- the targets fed in need to be the virtual nsIEventTargets, not the real thread eventtargets, and we need to ensure this doesn't violate any assumptions in the timer code. Bill, can you check this?
Flags: needinfo?(wmccloskey)
(In reply to Randell Jesup [:jesup] from comment #61) > Also: billm/etc are changing the MainThread model to a pseudo-coroutine > setup (N threads that are "MainThread" and never run at the same time, but > all process the virtual event queue for mainthread). See Bug 1365102 for > recent changes to nsTimerImpl.cpp, for example. However, I'm concerned if > this works perfectly with nsTimerImpl::SetTarget() -- the targets fed in > need to be the virtual nsIEventTargets, not the real thread eventtargets, > and we need to ensure this doesn't violate any assumptions in the timer > code. Bill, can you check this? I don't think anything can go wrong here regardless of which kind of target you use. The only thing we do with the target is to call Dispatch and IsOnCurrentThread. Those methods will work regardless of whether you're using a cooperative thread as the event target or the original main thread. I've considered requiring, for consistency, that the cooperative threads never have methods like Dispatch called on them. That way people always have to use the original main thread. However, I think that's a separate matter from this crash.
Flags: needinfo?(wmccloskey)
Comment 63•4 years ago
|
||
(In reply to Bill McCloskey (:billm) from comment #62) > I don't think anything can go wrong here regardless of which kind of target > you use. The only thing we do with the target is to call Dispatch and > IsOnCurrentThread. Those methods will work regardless of whether you're > using a cooperative thread as the event target or the original main thread. My concern was that the Timer will Dispatch to a specific cooperative thread, and not the meta-thread nsIEventTarget. If that's ok, then callers to this may be fine. And this was just a note based on this bug causing me to look at the code; I don't think any of the cooperative threading changes have anything to do with this bug.
Updated•4 years ago
|
Status: REOPENED → NEW
status-firefox57:
--- → affected
tracking-firefox56:
--- → ?
tracking-firefox57:
--- → ?
tracking-firefox-esr52:
--- → ?
Target Milestone: mozilla55 → ---
Comment 66•4 years ago
|
||
Byron -- Do you have any new ideas for this? If not, I think it makes sense for you to unassign yourself from this bug and for us to ask froydnj to find a new owner. Thanks.
Flags: needinfo?(docfaraday)
Assignee | ||
Comment 67•4 years ago
|
||
Yeah, I simply don't have time to chase this right now.
Assignee: docfaraday → nobody
Flags: needinfo?(docfaraday) → needinfo?(nfroyd)
Wontfix for 56 at this point. We could still take a patch for 57.
Comment 69•3 years ago
|
||
Nathan, who could be assigned this bug?
![]() |
||
Comment 70•3 years ago
|
||
(In reply to Stephanie Ouillon [:arroway] from comment #69) > Nathan, who could be assigned this bug? I'm not sure if there's as much left to do here; adjusting the query in comment 61 for the last month: https://crash-stats.mozilla.com/search/?signature=~nsTimer&date=%3E%3D2017-09-04T14%3A07%3A35.000Z&date=%3C2017-10-04T14%3A07%3A35.000Z&_sort=-date&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-signature and browsing through the top ~10 results indicates that we have very few (~10-20) crashes on 56/57/58. So maybe this is fixed? Byron landed a patch during the...57 cycle, I think? that might have addressed some of these.
Flags: needinfo?(nfroyd)
Comment 71•3 years ago
|
||
I don't see any obvious UAFs in 57, a very small handful in 56, and more in 55 and earlier. It might make more sense to mark this one fixed (since we did reduce the volume -- something got better) and start and new bug.
Status: NEW → RESOLVED
Closed: 4 years ago → 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla54
Comment 72•3 years ago
|
||
What if anything do we want to do for ESR52 here?
Assignee | ||
Comment 73•3 years ago
|
||
I don't think this work will apply to ESR.
Updated•3 years ago
|
Group: core-security-release
You need to log in
before you can comment on or make changes to this bug.
Description
•