Closed
Bug 330128
Opened 19 years ago
Closed 17 years ago
Calling cancel() on a timer doesn't drop ref to the callback
Categories
(Core :: XPCOM, defect)
Tracking
()
RESOLVED
FIXED
mozilla1.9beta1
People
(Reporter: bzbarsky, Assigned: sayrer)
References
Details
(Keywords: memory-leak, top-memory-leak)
Attachments
(2 files, 4 obsolete files)
2.39 KB,
text/html
|
Details | |
3.82 KB,
patch
|
bzbarsky
:
superreview+
beltzner
:
approval1.9+
|
Details | Diff | Splinter Review |
As far as I can see, XPCOM timers don't drop their ref to the callback unless they are reinited or destroyed... I think that they should do it in the following cases:
1) When cancel() is called.
2) After firing, if they're ONE_SHOT.
Otherwise callers have to explicitly drop refs to the timer to avoid leaking...
This sounds like a good plan to me, although it might upset some existing callers that expect the timer to hold on to them.
Comment 2•17 years ago
|
||
See also: bug 387341.
Sayre, would you have the cycles (pun intended) to look at this?
Assignee: nobody → sayrer
Flags: blocking1.9?
Assignee | ||
Comment 4•17 years ago
|
||
This gets through mochitest without crashing, but I think it broke a couple of prototype.js and scriptaculous testcases.
Assignee | ||
Comment 5•17 years ago
|
||
Attachment #285136 -
Attachment is obsolete: true
Assignee | ||
Comment 6•17 years ago
|
||
(In reply to comment #4)
>
> This gets through mochitest without crashing, but I think it broke a couple of
> prototype.js and scriptaculous testcases.
Dropping callbacks from one shot timers in this patch breaks testPeriodicalExecuterStop in prototype.js for sure.
Reporter | ||
Comment 7•17 years ago
|
||
Thats' very interesting. The DOM setInterval code (which is what that test seems to exercise to me) uses nsITimer::InitWithFuncCallback and reinits after every firing. So this patch shouldn't affect it....
Assignee | ||
Comment 8•17 years ago
|
||
Assignee | ||
Comment 9•17 years ago
|
||
(In reply to comment #7)
> Thats' very interesting. The DOM setInterval code (which is what that test
> seems to exercise to me) uses nsITimer::InitWithFuncCallback and reinits after
> every firing. So this patch shouldn't affect it....
>
If the call to ReleaseCallback() is included, the order of events changes. The third call to peEventFired happens, but after the setTimeout call fires.
Reporter | ||
Comment 10•17 years ago
|
||
So in other words the interval timer no longer fires every 100ms?
Assignee | ||
Comment 11•17 years ago
|
||
(In reply to comment #10)
> So in other words the interval timer no longer fires every 100ms?
>
Here's a detailed description. It looks like nsGlobalWindow is depending on some combination of its timer list and the TimerImpl to keep timers alive. If I take out the setTimeout call, the setInterval call only fires once. nsGlobalWindow::SetTimeoutOrInterval uses one shot timers:
rv = timeout->mTimer->InitWithFuncCallback(TimerCallback, timeout,
realInterval,
nsITimer::TYPE_ONE_SHOT);
The firing sequence should be something like
100ms -- fire interval
200ms -- fire interval
300ms -- fire interval
...
1000ms -- fire setTimeout
I tried lengthening the setTimeout, and it made no difference. With a setTimeout(2000) call included, the firing sequence is something like
100ms -- fire
...
2000ms -- fire interval
2000ms -- fire setTimeout
2000ms -- fire interval
without the setTimeout call, I get
100ms -- fire
Reporter | ||
Comment 12•17 years ago
|
||
Oh, I see what's going on here. The nsGlobalWindow code reinits the timer from inside the call into it. In other words, the call sequence is:
nsTimerImpl::Fire
nsGlobalWindow::TimerCallback
nsGlobalWindow::RunTimeout
nsTimerImpl::InitWithFuncCallback
nsTimerImpl::ReleaseCallback
So the ReleaseCallback at the end of Fire() nulls clears out the reiniting (or more importantly clears out the type-setting) that happened during the InitWithFuncCallback call earlier in Fire(). Then when the timer actually fires its type is unknown, so it does nothing. Adding a setTimeout() "helps" because then we run all the timeouts/intervals that should have come before it.
I think it makes sense for one-shot timers to support this reinit pattern. Furthermore, if this were not using InitWithFuncCallback the existing timer code could really screw us over: if using InitWithCallback(), say, a reinit of a one-shot timer from under Fire() will cause it to drop the ref to the object it's making a call on (a violation of XPCOM policy).
The right thing to do her might be to have Fire() make a copy of mCallback and mCallbackType, then ReleaseCallback(), _then_ actually make the callback, then if its type is still UNKNOWN and it's not one-shot restore mCallback and mCallbackType, and then release the local copies. Or something like that. That would allow all timers to be reinited while firing.
Assignee | ||
Comment 13•17 years ago
|
||
OK, the problem was reseting CALLBACK_TYPE_FUNC callbacks in ReleaseCallback() to CALLBACK_TYPE_UNKNOWN. This patch is still pretty shady, but it no longer fails the test. I think we should add some explicit state to track initialization.
Attachment #285137 -
Attachment is obsolete: true
Assignee | ||
Comment 14•17 years ago
|
||
Comment on attachment 285164 [details] [diff] [review]
drop references [WIP]
bz's analysis is correct, marking obsolete.
Attachment #285164 -
Attachment is obsolete: true
Comment 15•17 years ago
|
||
Oh I do hate nsTimerImpl.cpp. Sayrer, you got this one? I have guilt and blame because I patched up thread safety after pavlov landed this XPCOM timer stuff way back when, but I would love to duck. I'll review FWIW.
/be
Assignee | ||
Comment 16•17 years ago
|
||
(In reply to comment #15)
> Oh I do hate nsTimerImpl.cpp. Sayrer, you got this one?
Yeah, I'll deal with it.
Assignee | ||
Comment 17•17 years ago
|
||
Attachment #285421 -
Flags: review?(brendan)
Comment 18•17 years ago
|
||
Comment on attachment 285421 [details] [diff] [review]
allow reinitialization
>+ // Handle callbacks that re-init the timer, but avoid leaking
>+ // see bug 330128
Nit only: prevailing style uses full stop at end of sentences, capitalizes sentences.
>+ CallbackUnion callback = mCallback;
>+ PRUint8 callbackType = mCallbackType;
Old-school C lusers like me would use PRUintn callbackType, to avoid any chop-to-byte or pad-to-word hazards in comparisons, but that might actually make for fatter code. It shouldn't hurt to use the same type as the member, but if it does, PRUintn should fix that hurt.
>- mCallback.o->Observe(static_cast<nsITimer*>(this),
>+ callback.o->Observe(static_cast<nsITimer*>(this),
> NS_TIMER_CALLBACK_TOPIC,
> nsnull);
Bonus points for unrelated, underhanging indentation fix here.
Nice work, thanks for doing this.
/be
Attachment #285421 -
Flags: review?(brendan) → review+
Comment 19•17 years ago
|
||
s/unrelated, //
/be
Assignee | ||
Comment 20•17 years ago
|
||
Attachment #285421 -
Attachment is obsolete: true
Attachment #285433 -
Flags: superreview?(bzbarsky)
Reporter | ||
Comment 21•17 years ago
|
||
Comment on attachment 285433 [details] [diff] [review]
address brendan's comments
sr=bzbarsky
Attachment #285433 -
Flags: superreview?(bzbarsky) → superreview+
Assignee | ||
Updated•17 years ago
|
Attachment #285433 -
Flags: approval1.9?
Updated•17 years ago
|
Attachment #285433 -
Flags: approval1.9? → approval1.9+
Assignee | ||
Updated•17 years ago
|
Attachment #285433 -
Flags: approval1.9+ → approval1.9?
Comment 22•17 years ago
|
||
Comment on attachment 285433 [details] [diff] [review]
address brendan's comments
I don't think there's a prohibition on reviewer being approver. If I missed a memo, please point me to it.
/be
Attachment #285433 -
Flags: approval1.9? → approval1.9+
Comment on attachment 285433 [details] [diff] [review]
address brendan's comments
No, but the tree rules are currently that only endgame drivers can give approval :(
Sayre, do you really feel safe enough about this patch that it can be shipped in the beta with just a few days of testing? I'm slightly worried, though only because there are so many users of timers.
Attachment #285433 -
Flags: approval1.9+ → approval1.9?
Assignee | ||
Comment 24•17 years ago
|
||
(In reply to comment #23)
> (From update of attachment 285433 [details] [diff] [review])
> No, but the tree rules are currently that only endgame drivers can give
> approval :(
>
> Sayre, do you really feel safe enough about this patch that it can be shipped
> in the beta with just a few days of testing?
It is pretty isolated, it will be easy to back out if we find it causes problems, and we also don't want to ship leaks in the beta.
Comment 25•17 years ago
|
||
The patch has good context. The first hunk is the leak fix. The second, big one is doing manual ref-counting but there are no returns or other non-local jumps -- it is all straight line code and you can see the counting balances once you figure out the logic. The third hunk contains a crucial fix to reset the callback type to unknown, and give the union a name so it can be used when declaring the local in the second hunk to save the current callback.
What's not to love? Seriously, this is about as easy to analyze a patch as you are likely to get in nsTimerImpl.cpp.
/be
I'm not worried about bugs in the patch, I'm worried about any of the numerous clients of timers behaving awkwardly due to this change. For example if someone happens to rely on the timer keeping them alive past the timer firing.
Thinking about it more it seems pretty unlikely that that'll happen. I'm slightly worried still that someone will get .callback after the timer has fired. It would be great if you could check on that.
Other than that this patch looks fine to me for beta.
Assignee | ||
Comment 28•17 years ago
|
||
So can I check this in to hg.mozilla.org?
If you checked the .callback users I think we should take this on trunk. Sorry if I wasn't clear on that before.
Assignee | ||
Comment 30•17 years ago
|
||
Yep, I'll check it in to CVS whenever and endgame-driver approves it.
Comment 31•17 years ago
|
||
Oh for crying out loud -- approve, already, or I will (again, and don't clear my approval if I do).
/be
Comment 32•17 years ago
|
||
+'ing this. It's requesting approval as well. End game drivers will go through blockers that are requesting approval and we will approve this for the beta.
Flags: blocking1.9? → blocking1.9+
Comment 33•17 years ago
|
||
Comment on attachment 285433 [details] [diff] [review]
address brendan's comments
a=endgame drivers before brendan could carry out his threat ;)
Attachment #285433 -
Flags: approval1.9? → approval1.9+
Comment 34•17 years ago
|
||
(that's approval for M9 landing, to be clear!)
Target Milestone: --- → mozilla1.9 M9
Assignee | ||
Updated•17 years ago
|
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Flags: wanted1.8.1.x?
Comment 35•17 years ago
|
||
I don't think this should go on branch for the reason noted in comment 1 (especially as applied to extensions).
Given that we haven't had a single caller rely on that in our internal code, i'm not that worried that extensions will.
Comment 37•17 years ago
|
||
I'm very interested in this patch. However I found a couple of bugs with it:
1) This does not work for ONE_SHOT timers
2) This does not work when cancel() is called when the callback is fired.
In order to fix those replace in Fire()
if (mCallbackType == CALLBACK_TYPE_UNKNOWN && callbackType != TYPE_ONE_SHOT) {
with
if (mCallbackType == CALLBACK_TYPE_UNKNOWN && mType != TYPE_ONE_SHOT && !mCanceled) {
Reporter | ||
Comment 38•17 years ago
|
||
> 1) This does not work for ONE_SHOT timers
Uh... how so?
> 2) This does not work when cancel() is called when the callback is fired.
Indeed. We should fix that.
Comment 39•17 years ago
|
||
The reason the callback is not always released for ONE_SHOT timers is because mType should be compared to TYPE_ONE_SHOT instead of comparing callbackType to TYPE_ONE_SHOT (like comparing apples and oranges)
Assignee | ||
Comment 40•17 years ago
|
||
(In reply to comment #39)
> The reason the callback is not always released for ONE_SHOT timers is because
> mType should be compared to TYPE_ONE_SHOT instead of comparing callbackType to
> TYPE_ONE_SHOT (like comparing apples and oranges)
Doh, typo. I'll run these bug fixes through the test suite.
Comment 41•17 years ago
|
||
Jonas: help us out here with reasons for your nomination for 1.8.1.x
We already know 1.8 is leaky, and FF3 fixes tons and tons of other leaks, so it'd have to be a huge leak and a very safe patch to be worth considering. Also given there appear to be a couple regressions linked to this bug, are there others? Are you or Sayre willing to write up a branch patch that includes the regression fixes?
Flags: blocking1.8.1.15?
Don't recall, probably happened by mistake.
Flags: wanted1.8.1.x?
Comment 43•17 years ago
|
||
OK, then I think we don't want this on the branch. A known leak that's not particularly drawing complaints is better than risking a use-after-free crash if we get too aggressive. For people worried about memory use they should upgrade to FF3.
Flags: blocking1.8.1.15? → blocking1.8.1.15-
You need to log in
before you can comment on or make changes to this bug.
Description
•