Closed Bug 330128 Opened 19 years ago Closed 18 years ago

Calling cancel() on a timer doesn't drop ref to the callback

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla1.9beta1

People

(Reporter: bzbarsky, Assigned: sayrer)

References

Details

(Keywords: memory-leak, top-memory-leak)

Attachments

(2 files, 4 obsolete files)

drop references [WIP] 18 years ago Robert Sayre 4.88 KB, patch		Details \| Diff \| Splinter Review
drop references [WIP] 18 years ago Robert Sayre 3.12 KB, patch		Details \| Diff \| Splinter Review
somewhat reduced test 18 years ago Robert Sayre 2.39 KB, text/html		Details
drop references [WIP] 18 years ago Robert Sayre 4.57 KB, patch		Details \| Diff \| Splinter Review
allow reinitialization 18 years ago Robert Sayre 3.73 KB, patch	brendan : review+	Details \| Diff \| Splinter Review
address brendan's comments 18 years ago Robert Sayre 3.82 KB, patch	bzbarsky : superreview+ beltzner : approval1.9+	Details \| Diff \| Splinter Review

Boris Zbarsky [:bzbarsky]

Reporter

Description

•

19 years ago

As far as I can see, XPCOM timers don't drop their ref to the callback unless they are reinited or destroyed... I think that they should do it in the following cases: 1) When cancel() is called. 2) After firing, if they're ONE_SHOT. Otherwise callers have to explicitly drop refs to the timer to avoid leaking...

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Updated

•

19 years ago

Keywords: mlk, topmlk

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Comment 1

•

18 years ago

This sounds like a good plan to me, although it might upset some existing callers that expect the timer to hold on to them.

:Gavin Sharp [email: gavin@gavinsharp.com]

Comment 2

•

18 years ago

Comment 3

•

18 years ago

Sayre, would you have the cycles (pun intended) to look at this?

Assignee: nobody → sayrer

Flags: blocking1.9?

Robert Sayre

Assignee

Comment 4

•

18 years ago

Attached patch drop references [WIP] (obsolete) — Details — Splinter Review

This gets through mochitest without crashing, but I think it broke a couple of prototype.js and scriptaculous testcases.

Robert Sayre

Assignee

Comment 5

•

18 years ago

Attached patch drop references [WIP] (obsolete) — Details — Splinter Review

Attachment #285136 - Attachment is obsolete: true

Robert Sayre

Assignee

Comment 6

•

18 years ago

(In reply to comment #4) > > This gets through mochitest without crashing, but I think it broke a couple of > prototype.js and scriptaculous testcases. Dropping callbacks from one shot timers in this patch breaks testPeriodicalExecuterStop in prototype.js for sure.

Boris Zbarsky [:bzbarsky]

Reporter

Comment 7

•

18 years ago

Thats' very interesting. The DOM setInterval code (which is what that test seems to exercise to me) uses nsITimer::InitWithFuncCallback and reinits after every firing. So this patch shouldn't affect it....

Robert Sayre

Assignee

Comment 8

•

18 years ago

Attached file somewhat reduced test — Details

Robert Sayre

Assignee

Comment 9

•

18 years ago

(In reply to comment #7) > Thats' very interesting. The DOM setInterval code (which is what that test > seems to exercise to me) uses nsITimer::InitWithFuncCallback and reinits after > every firing. So this patch shouldn't affect it.... > If the call to ReleaseCallback() is included, the order of events changes. The third call to peEventFired happens, but after the setTimeout call fires.

Boris Zbarsky [:bzbarsky]

Reporter

Comment 10

•

18 years ago

So in other words the interval timer no longer fires every 100ms?

Robert Sayre

Assignee

Comment 11

•

18 years ago

(In reply to comment #10) > So in other words the interval timer no longer fires every 100ms? > Here's a detailed description. It looks like nsGlobalWindow is depending on some combination of its timer list and the TimerImpl to keep timers alive. If I take out the setTimeout call, the setInterval call only fires once. nsGlobalWindow::SetTimeoutOrInterval uses one shot timers: rv = timeout->mTimer->InitWithFuncCallback(TimerCallback, timeout, realInterval, nsITimer::TYPE_ONE_SHOT); The firing sequence should be something like 100ms -- fire interval 200ms -- fire interval 300ms -- fire interval ... 1000ms -- fire setTimeout I tried lengthening the setTimeout, and it made no difference. With a setTimeout(2000) call included, the firing sequence is something like 100ms -- fire ... 2000ms -- fire interval 2000ms -- fire setTimeout 2000ms -- fire interval without the setTimeout call, I get 100ms -- fire

Boris Zbarsky [:bzbarsky]

Reporter

Comment 12

•

18 years ago

Oh, I see what's going on here. The nsGlobalWindow code reinits the timer from inside the call into it. In other words, the call sequence is: nsTimerImpl::Fire nsGlobalWindow::TimerCallback nsGlobalWindow::RunTimeout nsTimerImpl::InitWithFuncCallback nsTimerImpl::ReleaseCallback So the ReleaseCallback at the end of Fire() nulls clears out the reiniting (or more importantly clears out the type-setting) that happened during the InitWithFuncCallback call earlier in Fire(). Then when the timer actually fires its type is unknown, so it does nothing. Adding a setTimeout() "helps" because then we run all the timeouts/intervals that should have come before it. I think it makes sense for one-shot timers to support this reinit pattern. Furthermore, if this were not using InitWithFuncCallback the existing timer code could really screw us over: if using InitWithCallback(), say, a reinit of a one-shot timer from under Fire() will cause it to drop the ref to the object it's making a call on (a violation of XPCOM policy). The right thing to do her might be to have Fire() make a copy of mCallback and mCallbackType, then ReleaseCallback(), _then_ actually make the callback, then if its type is still UNKNOWN and it's not one-shot restore mCallback and mCallbackType, and then release the local copies. Or something like that. That would allow all timers to be reinited while firing.

Robert Sayre

Assignee

Comment 13

•

18 years ago

Attached patch drop references [WIP] (obsolete) — Details — Splinter Review

OK, the problem was reseting CALLBACK_TYPE_FUNC callbacks in ReleaseCallback() to CALLBACK_TYPE_UNKNOWN. This patch is still pretty shady, but it no longer fails the test. I think we should add some explicit state to track initialization.

Attachment #285137 - Attachment is obsolete: true

Robert Sayre

Assignee

Comment 14

•

18 years ago

Comment on attachment 285164 [details] [diff] [review] drop references [WIP] bz's analysis is correct, marking obsolete.

Attachment #285164 - Attachment is obsolete: true

Brendan Eich [:brendan]

Comment 15

•

18 years ago

Oh I do hate nsTimerImpl.cpp. Sayrer, you got this one? I have guilt and blame because I patched up thread safety after pavlov landed this XPCOM timer stuff way back when, but I would love to duck. I'll review FWIW. /be

Robert Sayre

Assignee

Comment 16

•

18 years ago

(In reply to comment #15) > Oh I do hate nsTimerImpl.cpp. Sayrer, you got this one? Yeah, I'll deal with it.

Robert Sayre

Assignee

Comment 17

•

18 years ago

Attached patch allow reinitialization (obsolete) — Details — Splinter Review

Attachment #285421 - Flags: review?(brendan)

Brendan Eich [:brendan]

Comment 18

•

18 years ago

Comment on attachment 285421 [details] [diff] [review] allow reinitialization >+ // Handle callbacks that re-init the timer, but avoid leaking >+ // see bug 330128 Nit only: prevailing style uses full stop at end of sentences, capitalizes sentences. >+ CallbackUnion callback = mCallback; >+ PRUint8 callbackType = mCallbackType; Old-school C lusers like me would use PRUintn callbackType, to avoid any chop-to-byte or pad-to-word hazards in comparisons, but that might actually make for fatter code. It shouldn't hurt to use the same type as the member, but if it does, PRUintn should fix that hurt. >- mCallback.o->Observe(static_cast<nsITimer*>(this), >+ callback.o->Observe(static_cast<nsITimer*>(this), > NS_TIMER_CALLBACK_TOPIC, > nsnull); Bonus points for unrelated, underhanging indentation fix here. Nice work, thanks for doing this. /be

Attachment #285421 - Flags: review?(brendan) → review+

Brendan Eich [:brendan]

Comment 19

•

18 years ago

s/unrelated, // /be

Robert Sayre

Assignee

Comment 20

•

18 years ago

Attached patch address brendan's comments — Details — Splinter Review

Attachment #285421 - Attachment is obsolete: true

Attachment #285433 - Flags: superreview?(bzbarsky)

Boris Zbarsky [:bzbarsky]

Reporter

Comment 21

•

18 years ago

Comment on attachment 285433 [details] [diff] [review] address brendan's comments sr=bzbarsky

Attachment #285433 - Flags: superreview?(bzbarsky) → superreview+

Robert Sayre

Assignee

Updated

•

18 years ago

Attachment #285433 - Flags: approval1.9?

(not currently active) Ted Mielczarek

Updated

•

18 years ago

Attachment #285433 - Flags: approval1.9? → approval1.9+

Robert Sayre

Assignee

Updated

•

18 years ago

Attachment #285433 - Flags: approval1.9+ → approval1.9?

Brendan Eich [:brendan]

Comment 22

•

18 years ago

Comment on attachment 285433 [details] [diff] [review] address brendan's comments I don't think there's a prohibition on reviewer being approver. If I missed a memo, please point me to it. /be

Attachment #285433 - Flags: approval1.9? → approval1.9+

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 23

•

18 years ago

Comment on attachment 285433 [details] [diff] [review] address brendan's comments No, but the tree rules are currently that only endgame drivers can give approval :( Sayre, do you really feel safe enough about this patch that it can be shipped in the beta with just a few days of testing? I'm slightly worried, though only because there are so many users of timers.

Attachment #285433 - Flags: approval1.9+ → approval1.9?

Robert Sayre

Assignee

Comment 24

•

18 years ago

(In reply to comment #23) > (From update of attachment 285433 [details] [diff] [review]) > No, but the tree rules are currently that only endgame drivers can give > approval :( > > Sayre, do you really feel safe enough about this patch that it can be shipped > in the beta with just a few days of testing? It is pretty isolated, it will be easy to back out if we find it causes problems, and we also don't want to ship leaks in the beta.

Brendan Eich [:brendan]

Comment 25

•

18 years ago

The patch has good context. The first hunk is the leak fix. The second, big one is doing manual ref-counting but there are no returns or other non-local jumps -- it is all straight line code and you can see the counting balances once you figure out the logic. The third hunk contains a crucial fix to reset the callback type to unknown, and give the union a name so it can be used when declaring the local in the second hunk to save the current callback. What's not to love? Seriously, this is about as easy to analyze a patch as you are likely to get in nsTimerImpl.cpp. /be

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 26

•

18 years ago

I'm not worried about bugs in the patch, I'm worried about any of the numerous clients of timers behaving awkwardly due to this change. For example if someone happens to rely on the timer keeping them alive past the timer firing.

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 27

•

18 years ago

Thinking about it more it seems pretty unlikely that that'll happen. I'm slightly worried still that someone will get .callback after the timer has fired. It would be great if you could check on that. Other than that this patch looks fine to me for beta.

Robert Sayre

Assignee

Comment 28

•

18 years ago

So can I check this in to hg.mozilla.org?

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 29

•

18 years ago

If you checked the .callback users I think we should take this on trunk. Sorry if I wasn't clear on that before.

Robert Sayre

Assignee

Comment 30

•

18 years ago

Yep, I'll check it in to CVS whenever and endgame-driver approves it.

Brendan Eich [:brendan]

Comment 31

•

18 years ago

Oh for crying out loud -- approve, already, or I will (again, and don't clear my approval if I do). /be

Damon Sicore (:damons)

Comment 32

•

18 years ago

+'ing this. It's requesting approval as well. End game drivers will go through blockers that are requesting approval and we will approve this for the beta.

Flags: blocking1.9? → blocking1.9+

Mike Beltzner [:beltzner, not reading bugmail]

Comment 33

•

18 years ago

Comment on attachment 285433 [details] [diff] [review] address brendan's comments a=endgame drivers before brendan could carry out his threat ;)

Attachment #285433 - Flags: approval1.9? → approval1.9+

Mike Beltzner [:beltzner, not reading bugmail]

Comment 34

•

18 years ago

(that's approval for M9 landing, to be clear!)

Target Milestone: --- → mozilla1.9 M9

Robert Sayre

Assignee

Updated

•

18 years ago

Status: NEW → RESOLVED

Closed: 18 years ago

Resolution: --- → FIXED

neil@parkwaycc.co.uk

Updated

•

18 years ago

Depends on: 401137

Jonas Sicking (:sicking) No longer reading bugmail consistently

Updated

•

17 years ago

Flags: wanted1.8.1.x?

Jeff Walden [:Waldo]

Comment 35

•

17 years ago

I don't think this should go on branch for the reason noted in comment 1 (especially as applied to extensions).

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 36

•

17 years ago

Given that we haven't had a single caller rely on that in our internal code, i'm not that worried that extensions will.

Bruno Calvignac

Comment 37

•

17 years ago

I'm very interested in this patch. However I found a couple of bugs with it: 1) This does not work for ONE_SHOT timers 2) This does not work when cancel() is called when the callback is fired. In order to fix those replace in Fire() if (mCallbackType == CALLBACK_TYPE_UNKNOWN && callbackType != TYPE_ONE_SHOT) { with if (mCallbackType == CALLBACK_TYPE_UNKNOWN && mType != TYPE_ONE_SHOT && !mCanceled) {

Boris Zbarsky [:bzbarsky]

Reporter

Comment 38

•

17 years ago

> 1) This does not work for ONE_SHOT timers Uh... how so? > 2) This does not work when cancel() is called when the callback is fired. Indeed. We should fix that.

Bruno Calvignac

Comment 39

•

17 years ago

The reason the callback is not always released for ONE_SHOT timers is because mType should be compared to TYPE_ONE_SHOT instead of comparing callbackType to TYPE_ONE_SHOT (like comparing apples and oranges)

Robert Sayre

Assignee

Comment 40

•

17 years ago

(In reply to comment #39) > The reason the callback is not always released for ONE_SHOT timers is because > mType should be compared to TYPE_ONE_SHOT instead of comparing callbackType to > TYPE_ONE_SHOT (like comparing apples and oranges) Doh, typo. I'll run these bug fixes through the test suite.

Robert Sayre

Assignee

Updated

•

17 years ago

Depends on: 407201

Daniel Veditz [:dveditz]

Comment 41

•

17 years ago

Jonas: help us out here with reasons for your nomination for 1.8.1.x We already know 1.8 is leaky, and FF3 fixes tons and tons of other leaks, so it'd have to be a huge leak and a very safe patch to be worth considering. Also given there appear to be a couple regressions linked to this bug, are there others? Are you or Sayre willing to write up a branch patch that includes the regression fixes?

Flags: blocking1.8.1.15?

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 42

•

17 years ago

Don't recall, probably happened by mistake.

Flags: wanted1.8.1.x?

Daniel Veditz [:dveditz]

Comment 43

•

17 years ago

OK, then I think we don't want this on the branch. A known leak that's not particularly drawing complaints is better than risking a use-after-free crash if we get too aggressive. For people worried about memory use they should upgrade to FF3.

Flags: blocking1.8.1.15? → blocking1.8.1.15-

Olli Pettay [:smaug][bugs@pettay.fi]

Updated

•

16 years ago

Blocks: 467295

Nickolay_Ponomarev

Updated

•

16 years ago

Blocks: 472258

You need to log in before you can comment on or make changes to this bug.