When timers are created, there is no extra reference held until they fire. That means that if the variable storing the timer object goes out of scope, it's susceptible to a GC which will destroy the timer without it ever firing.
Here's a first cut at scripts needing fixing: http://mxr.mozilla.org/mozilla-central/search?string=var+timer Most of these references are wrong. The ones with comments about GC hazards and global scope are not! They are delightful and brighten my day a little bit.
http://mxr.mozilla.org/mozilla-central/source/browser/base/content/nsContextMenu.js#1001 <- kind of scary http://mxr.mozilla.org/mozilla-central/source/toolkit/mozapps/downloads/tests/chrome/test_space_key_pauses_resumes.xul#91 http://mxr.mozilla.org/mozilla-central/source/toolkit/components/url-classifier/tests/unit/head_urlclassifier.js#291 http://mxr.mozilla.org/mozilla-central/source/modules/plugin/test/mochitest/neverending.sjs#9 http://mxr.mozilla.org/mozilla-central/source/layout/reftests/backgrounds/delay-image-response.sjs#47 http://mxr.mozilla.org/mozilla-central/source/parser/htmlparser/tests/mochitest/file_bug568470-script.sjs#8 http://mxr.mozilla.org/mozilla-central/source/parser/htmlparser/tests/mochitest/file_bug568470.sjs#13 http://mxr.mozilla.org/mozilla-central/source/parser/htmlparser/tests/mochitest/file_bug534293-slow.sjs#7 http://mxr.mozilla.org/mozilla-central/source/netwerk/test/unit/test_auth_proxy.js#161
http://mxr.mozilla.org/mozilla-central/source/testing/mozmill/tests/shared-modules/testModalDialogAPI.js#129 http://mxr.mozilla.org/mozilla-central/source/js/src/tests/e4x/XML/regress-324688.js#82 http://mxr.mozilla.org/mozilla-central/source/browser/base/content/test/tabview/browser_tabview_bug599626.js#127 http://mxr.mozilla.org/mozilla-central/source/browser/base/content/browser-places.js#875
Having searched for nsITimer in js files and investigated all the results, I believe this is a pretty complete list of all the problem areas.
Am I right in reading nsTimerImpl that it does not participate in cycle collection and so a JS callback closure that closes over the timer reference will therefore keep the timer alive?
I believe that is true.
That doesn't sound like something we should rely on.
"closes over" how? Just having the timer variable in scope is not enough to "close over" it, in theory. There are already certain cases where the JS engine will optimize that in-scopeness out of existence. And there may well be more in the future. If the closure actually uses the timer variable, that will keep it alive, and will continue to do so in the future. If it doesn't, that's playing with fire, and it'll break if the right GC optimizations happen. ...so basically what comment 8 said, except more verbosely. :-)
Yeah, closed over in the sense that the variable is used inside the closure and therefore has to be closed over/maintain a reference for correctness. I agree that it's not a good code pattern as it could be brittle in the face of certain code refactorings of the code using the idiom and should nsITimer begin participating in cycle collection. I just wanted to make sure I understood the current state of things as it impacts whether I would be able to attribute intermittent oranges to this bug or would need to keep looking for another root cause of failures. (Hypothetically speaking.)
There's a very easily-recognizable pattern that is also easily fixed. This would be a great bug for someone to get started with.
Comment on attachment 543720 [details] [diff] [review] Patch for bugfix https://bugzilla.mozilla.org/show_bug.cgi?id=640629 Thanks a lot, this is great! There's a couple details to fix up before this goes into the tree, however. >diff --git a/browser/base/content/nsContextMenu.js b/browser/base/content/nsContextMenu.js >--- a/browser/base/content/nsContextMenu.js >+++ b/browser/base/content/nsContextMenu.js I read over saveLink again, and it turns out this isn't a problem situation. Please revert the changes to this file. >diff --git a/js/src/tests/e4x/XML/regress-324688.js b/js/src/tests/e4x/XML/regress-324688.js >--- a/js/src/tests/e4x/XML/regress-324688.js >+++ b/js/src/tests/e4x/XML/regress-324688.js >- var t = Components.classes["@mozilla.org/timer;1"]. >+ timer = Components.classes["@mozilla.org/timer;1"]. createInstance(Components.interfaces.nsITimer); Can you avoid renaming this? >diff --git a/testing/mozmill/tests/shared-modules/testModalDialogAPI.js b/testing/mozmill/tests/shared-modules/testModalDialogAPI.js >--- a/testing/mozmill/tests/shared-modules/testModalDialogAPI.js >+++ b/testing/mozmill/tests/shared-modules/testModalDialogAPI.js >+ this.modalDialogTimer = null; // put timer in class def so it doesn't get GC'ed You've added hard tabs here and later in the file. We prefer to use spaces instead of tabs for indentation purposes. >diff --git a/toolkit/components/url-classifier/tests/unit/head_urlclassifier.js b/toolkit/components/url-classifier/tests/unit/head_urlclassifier.js >--- a/toolkit/components/url-classifier/tests/unit/head_urlclassifier.js >+++ b/toolkit/components/url-classifier/tests/unit/head_urlclassifier.js >+var timer = null; // Declare timer outside to prevent premature GC So, the Timer function in this file is actually used in tests to create multiple timer objects. What we should do here is create a timers array, and each time Timer is called just push the new timer variable on to the array. >diff --git a/toolkit/mozapps/downloads/tests/chrome/test_space_key_pauses_resumes.xul b/toolkit/mozapps/downloads/tests/chrome/test_space_key_pauses_resumes.xul >--- a/toolkit/mozapps/downloads/tests/chrome/test_space_key_pauses_resumes.xul >+++ b/toolkit/mozapps/downloads/tests/chrome/test_space_key_pauses_resumes.xul >+ this.timer = null; // timer declared here to prevent premature GC Another hard tab instead of spaces. There are also a couple extra additions to the list that I'd like you to fix in the same way: http://mxr.mozilla.org/mozilla-central/source/dom/plugins/test/mochitest/neverending.sjs http://mxr.mozilla.org/mozilla-central/source/browser/base/content/browser-places.js#881
Also, please ensure that the commit message in is the style "Bug 12345678 - Message goes here. r=jdm".
Created attachment 544141 [details] [diff] [review] Proposed fix v2 Trying again, hopefully will be better this time! Thanks for your guidance and patience Josh.
Comment on attachment 544141 [details] [diff] [review] Proposed fix v2 Great! I have a couple remaining nits, but they're tiny things I can fix when I push this patch. For completeness' sake, however: >+var timerArray = new Array(); This is better written as |var timerArray = ;| >+ closeTimer: null, This should be renamed to match the other similar fields, ie. |_closeTimer|
Created attachment 544144 [details] [diff] [review] Proposed fix v3 Should contain all the fixes and nitpicks now ;)
Pushed to try: http://tbpl.mozilla.org/?tree=Try&rev=4f0e99c18637
It seems like the nsITimer ownership model is fundamentally broken. Timers should not need to be manually "kept alive" to keep firing, since "manually keeping something alive" is not a well-defined concept.
Bug 647998 was moving towards some sort of resolution but stalled. I think this is worth landing right now, and pursuing a better ownership model to avoid this problem (which is a fantastic idea) can stay in the other bug.
(In reply to comment #20) > Bug 647998 was moving towards some sort of resolution but stalled. I think > this is worth landing right now, Maybe, but the browser-places.js part of the patch needs peer review. To me it looks like that code should just use setTimeout, but maybe that doesn't work there for some reason...
Comment on attachment 544144 [details] [diff] [review] Proposed fix v3 Gavin, can you look at browser-places.js, please?
Comment on attachment 544144 [details] [diff] [review] Proposed fix v3 Seems like you should drop the reference to _closeTimer somewhere (when it's fired, at least). Really all of this code should be using setTimeout, I think.