Bug 640629 - (nsITimer-fail) Fix all creations of nsITimer that are locally-scoped and susceptible to GC before firing
[good first bug][mentor=jdm][inbound]
Product: Core
Classification: Components
Component: General
Version: Trunk
Platform: All All
Importance: -- normal
Target Milestone: mozilla8
Assigned To: Han Chang
Depends on: 611807 619026 619167 641174 641175 656881 661784 661998 662178
Blocks: 758585 508128
Reported: 2011-03-10 08:23 PST by Josh Matthews [:jdm]
Modified: 2012-05-25 05:37 PDT

Patch for bugfix (11.71 KB, patch)
2011-07-03 23:41 PDT, Han Chang
no flags
Proposed fix v2 (10.69 KB, patch)
2011-07-05 20:57 PDT, Han Chang
josh: review+
Proposed fix v3 (10.68 KB, patch)
2011-07-05 21:40 PDT, Han Chang
review+

Description Josh Matthews [:jdm] 2011-03-10 08:23:25 PST
When timers are created, there is no extra reference held until they fire.  That means that if the variable storing the timer object goes out of scope, it's susceptible to a GC which will destroy the timer without it ever firing.
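
To make the hazard concrete, here is a minimal sketch of the unsafe pattern next to one possible fix. `makeTimer` is a hypothetical stub standing in for however chrome code creates an nsITimer; it exists only so the snippet runs outside Gecko and it does not model GC. `scheduleUnsafe`, `scheduleSafe` and `gTimer` are likewise illustrative names, not code from the tree.

```javascript
// Hypothetical stand-in for an nsITimer so this sketch runs outside Gecko.
// It mimics initWithCallback()/cancel() only; it does not model GC.
const TYPE_ONE_SHOT = 0; // same value as nsITimer.TYPE_ONE_SHOT

function makeTimer() {
  return {
    _callback: null,
    initWithCallback(callback, delayMs, type) { this._callback = callback; },
    cancel() { this._callback = null; },
    // Test helper (not part of nsITimer): simulate the timer firing.
    fire() { if (this._callback) this._callback.notify(this); },
  };
}

// UNSAFE: `timer` is a local. Once scheduleUnsafe() returns, nothing in
// script references it, so a real nsITimer could be collected before firing.
function scheduleUnsafe(callback) {
  var timer = makeTimer();
  timer.initWithCallback({ notify: callback }, 1000, TYPE_ONE_SHOT);
}

// SAFER: keep the timer in a longer-lived variable until it is done.
var gTimer = null;
function scheduleSafe(callback) {
  gTimer = makeTimer();
  gTimer.initWithCallback({ notify: callback }, 1000, TYPE_ONE_SHOT);
  return gTimer;
}
```

Whether a global, an object field or some other owner is the right place to hold the reference depends on the file being fixed.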
Comment 1 Josh Matthews [:jdm] 2011-03-10 08:28:22 PST
Here's a first cut at scripts needing fixing:

Most of these references are wrong.  The ones with comments about GC hazards and global scope are not!  They are delightful and brighten my day a little bit.
Comment 5 Josh Matthews [:jdm] 2011-03-11 12:17:26 PST
Having searched for nsITimer in js files and investigated all the results, I believe this is a pretty complete list of all the problem areas.
Comment 6 Andrew Sutherland [:asuth] 2011-03-11 13:26:42 PST
Am I right in reading nsTimerImpl that it does not participate in cycle collection and so a JS callback closure that closes over the timer reference will therefore keep the timer alive?
Comment 7 Josh Matthews [:jdm] 2011-03-11 13:32:35 PST
I believe that is true.
Comment 8 Kyle Huey [:khuey] (Exited; not receiving bugmail, email if necessary) 2011-03-11 13:33:54 PST
That doesn't sound like something we should rely on.
Comment 9 Jeff Walden [:Waldo] (remove +bmo to email) 2011-03-11 13:37:30 PST
"closes over" how?  Just having the timer variable in scope is not enough to "close over" it, in theory.  There are already certain cases where the JS engine will optimize that in-scopeness out of existence.  And there may well be more in the future.  If the closure actually uses the timer variable, that will keep it alive, and will continue to do so in the future.  If it doesn't, that's playing with fire, and it'll break if the right GC optimizations happen.  Basically what comment 8 said, except more verbosely.  :-)
Comment 10 Andrew Sutherland [:asuth] 2011-03-11 14:03:11 PST
Yeah, closed over in the sense that the variable is used inside the closure and therefore has to be closed over/maintain a reference for correctness.

I agree that it's not a good code pattern: it could be brittle under refactoring of the code that uses the idiom, and it would break should nsITimer begin participating in cycle collection.  I just wanted to make sure I understood the current state of things, as it affects whether I can attribute intermittent oranges to this bug or need to keep looking for another root cause of failures.  (Hypothetically speaking.)
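
For illustration, the distinction drawn in comments 9 and 10 might look like the sketch below. `makeStubTimer`, `inScopeOnly` and `usesTimer` are hypothetical names, and the stub cannot reproduce GC behaviour, so this only shows the shape of the two closures rather than proving which one survives.

```javascript
// Hypothetical stand-in for an nsITimer; fire() is a test helper only.
function makeStubTimer() {
  return {
    initWithCallback(callback, delayMs, type) { this._cb = callback; },
    cancel() { this._cb = null; },
    fire() { if (this._cb) this._cb.notify(this); },
  };
}

// `timer` is merely in scope: the callback never mentions it, so an engine
// may drop it from the closure and nothing in script keeps the timer alive.
function inScopeOnly(makeTimer, onFire) {
  var timer = makeTimer();
  timer.initWithCallback({
    notify: function () { onFire(); },
  }, 500, 0 /* nsITimer.TYPE_ONE_SHOT */);
}

// `timer` is actually used inside the callback, so the closure has to
// capture it. Per the comments above this keeps the timer alive today,
// but only for as long as nsITimer stays out of cycle collection.
function usesTimer(makeTimer, onFire) {
  var timer = makeTimer();
  timer.initWithCallback({
    notify: function () {
      onFire();
      timer = null; // the use that forces the capture
    },
  }, 500, 0 /* nsITimer.TYPE_ONE_SHOT */);
}
```

Holding the timer in an explicit, longer-lived owner avoids depending on either behaviour.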
Comment 11 Josh Matthews [:jdm] 2011-05-30 11:50:46 PDT
There's a very easily-recognizable pattern that is also easily fixed. This would be a great bug for someone to get started with.
Comment 12 Han Chang 2011-07-03 23:41:28 PDT
Created attachment 543720
Patch for bugfix

I should've gotten them all, but not entirely sure if the scope on some of them is 100% correct. I'd put my confidence at 95+% though.

My assumption was that the really short unit test javascript files are just scoped locally such that any variables declared globally in that file only exist for that one unit test and no more.
Comment 13 Josh Matthews [:jdm] 2011-07-04 02:14:49 PDT
Comment on attachment 543720
Patch for bugfix

Thanks a lot, this is great!  There's a couple details to fix up before this goes into the tree, however.

>diff --git a/browser/base/content/nsContextMenu.js b/browser/base/content/nsContextMenu.js
>--- a/browser/base/content/nsContextMenu.js
>+++ b/browser/base/content/nsContextMenu.js

I read over saveLink again, and it turns out this isn't a problem situation. Please revert the changes to this file.

>diff --git a/js/src/tests/e4x/XML/regress-324688.js b/js/src/tests/e4x/XML/regress-324688.js
>--- a/js/src/tests/e4x/XML/regress-324688.js
>+++ b/js/src/tests/e4x/XML/regress-324688.js

>-            var t = Components.classes[";1"].
>+            timer = Components.classes[";1"].

Can you avoid renaming this?

>diff --git a/testing/mozmill/tests/shared-modules/testModalDialogAPI.js b/testing/mozmill/tests/shared-modules/testModalDialogAPI.js
>--- a/testing/mozmill/tests/shared-modules/testModalDialogAPI.js
>+++ b/testing/mozmill/tests/shared-modules/testModalDialogAPI.js

>+	this.modalDialogTimer = null; // put timer in class def so it doesn't get GC'ed

You've added hard tabs here and later in the file. We prefer to use spaces instead of tabs for indentation purposes.

>diff --git a/toolkit/components/url-classifier/tests/unit/head_urlclassifier.js b/toolkit/components/url-classifier/tests/unit/head_urlclassifier.js
>--- a/toolkit/components/url-classifier/tests/unit/head_urlclassifier.js
>+++ b/toolkit/components/url-classifier/tests/unit/head_urlclassifier.js

>+var timer = null; // Declare timer outside to prevent premature GC

So, the Timer function in this file is actually used in tests to create multiple timer objects. What we should do here is create a timers array, and each time Timer is called just push the new timer variable on to the array.
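
A rough sketch of that suggestion, assuming a `Timer(delay, cb)` helper of roughly this shape (the real helper in head_urlclassifier.js may differ). `createTimer` is a hypothetical stub so the snippet runs outside Gecko, and `timers` is just an illustrative name for the array.

```javascript
// Hypothetical stand-in for creating an nsITimer, so the sketch runs
// outside Gecko. fire() is a test helper, not part of nsITimer.
function createTimer() {
  return {
    TYPE_ONE_SHOT: 0,
    initWithCallback(callback, delayMs, type) { this._cb = callback; },
    fire() { if (this._cb) this._cb.notify(this); },
  };
}

// File-level array: every timer created by Timer() stays referenced here,
// so several timers can be pending at once without any being collected.
var timers = [];

function Timer(delay, cb) {
  this.cb = cb;
  var timer = createTimer();
  timer.initWithCallback(this, delay, timer.TYPE_ONE_SHOT);
  timers.push(timer);
}

// nsITimerCallback-style entry point, invoked when the timer fires.
Timer.prototype.notify = function (timer) {
  this.cb();
};
```

The array only ever grows in this sketch, which is probably acceptable for a short-lived unit test but would need pruning in long-running code.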

>diff --git a/toolkit/mozapps/downloads/tests/chrome/test_space_key_pauses_resumes.xul b/toolkit/mozapps/downloads/tests/chrome/test_space_key_pauses_resumes.xul
>--- a/toolkit/mozapps/downloads/tests/chrome/test_space_key_pauses_resumes.xul
>+++ b/toolkit/mozapps/downloads/tests/chrome/test_space_key_pauses_resumes.xul

>+	this.timer = null; // timer declared here to prevent premature GC 

Another hard tab instead of spaces.

There are also a couple extra additions to the list that I'd like you to fix in the same way:
Comment 14 Josh Matthews [:jdm] 2011-07-04 02:36:32 PDT
Also, please ensure that the commit message is in the style "Bug 12345678 - Message goes here. r=jdm".
Comment 15 Han Chang 2011-07-05 20:57:56 PDT
Created attachment 544141
Proposed fix v2

Trying again, hopefully will be better this time!
Thanks for your guidance and patience Josh.
Comment 16 Josh Matthews [:jdm] 2011-07-05 21:16:19 PDT
Comment on attachment 544141
Proposed fix v2

Great! I have a couple remaining nits, but they're tiny things I can fix when I push this patch.  For completeness' sake, however:

>+var timerArray = new Array();

This is better written as |var timerArray = [];|

>+  closeTimer: null,

This should be renamed to match the other similar fields, i.e. |_closeTimer|
Comment 17 Han Chang 2011-07-05 21:40:44 PDT
Created attachment 544144
Proposed fix v3

Should contain all the fixes and nitpicks now ;)
Comment 18 Josh Matthews [:jdm] 2011-07-05 21:55:11 PDT
Pushed to try:
Comment 19 Robert O'Callahan (:roc) (email my personal email if necessary) 2011-07-05 22:02:51 PDT
It seems like the nsITimer ownership model is fundamentally broken. Timers should not need to be manually "kept alive" to keep firing, since "manually keeping something alive" is not a well-defined concept.
Comment 20 Josh Matthews [:jdm] 2011-07-06 11:06:11 PDT
Bug 647998 was moving towards some sort of resolution but stalled. I think this is worth landing right now, and pursuing a better ownership model to avoid this problem (which is a fantastic idea) can stay in the other bug.
Comment 21 Robert O'Callahan (:roc) (email my personal email if necessary) 2011-07-06 15:56:35 PDT
Comment 22 Dão Gottwald [:dao] 2011-07-09 06:47:15 PDT
(In reply to comment #20)
> Bug 647998 was moving towards some sort of resolution but stalled. I think
> this is worth landing right now,

Maybe, but the browser-places.js part of the patch needs peer review. To me it looks like that code should just use setTimeout, but maybe that doesn't work there for some reason...
Comment 23 Josh Matthews [:jdm] 2011-07-09 16:39:05 PDT
Comment on attachment 544144
Proposed fix v3

Gavin, can you look at browser-places.js, please?
Comment 24 :Gavin Sharp [email:] 2011-07-15 15:09:24 PDT
Comment on attachment 544144
Proposed fix v3

Seems like you should drop the reference to _closeTimer somewhere (when it's fired, at least). Really all of this code should be using setTimeout, I think.
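
One way the "drop the reference when it's fired" suggestion could look is sketched below. The `menu` object and its `scheduleClose`/`cancelClose` methods are hypothetical and are not the browser-places.js code; `makeStubTimer` is a stub so the snippet runs outside Gecko.

```javascript
// Hypothetical stand-in for an nsITimer; fire() is a test helper only.
function makeStubTimer() {
  return {
    initWithCallback(callback, delayMs, type) { this._cb = callback; },
    cancel() { this._cb = null; },
    fire() { if (this._cb) this._cb.notify(this); },
  };
}

var menu = {
  _closeTimer: null,

  scheduleClose(makeTimer, delayMs, onClose) {
    this.cancelClose(); // never leave an older timer pending
    var self = this;
    this._closeTimer = makeTimer();
    this._closeTimer.initWithCallback({
      notify: function () {
        self._closeTimer = null; // drop the reference once the timer fired
        onClose();
      },
    }, delayMs, 0 /* nsITimer.TYPE_ONE_SHOT */);
  },

  cancelClose() {
    if (this._closeTimer) {
      this._closeTimer.cancel();
      this._closeTimer = null;
    }
  },
};
```

If setTimeout is usable in that scope, as comments 22 and 24 suggest, the field could instead hold the id returned by setTimeout and cancelClose could call clearTimeout on it.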
Comment 26 Marco Bonardo [::mak] 2011-07-27 03:44:00 PDT
