Closed Bug 631447 Opened 13 years ago Closed 13 years ago

Intermittent infinite loop of "ASSERTION: XPConnect is being called on a scope without a 'Components' property!" in browser/components/places/tests/browser/browser_0_library_left_pane_migration.js winding up with "command timed out: 5400 seconds elapsed"

Categories

(Firefox :: General, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
Firefox 6

People

(Reporter: philor, Assigned: mak)

References

Details

(Keywords: intermittent-failure, regression, Whiteboard: [test which aborts the suite])

Attachments

(2 files, 5 obsolete files)

This has been going on for around 10 days, but we've just been blowing it off, because digging around in 50MB logs for errors that aren't highlighted isn't much fun.

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1296707279.1296713760.12672.gz
Rev3 Fedora 12x64 mozilla-central debug test mochitest-other on 2011/02/02 20:27:59
s: talos-r3-fed64-038

TEST-PASS | chrome://mochitests/content/browser/browser/components/places/tests/browser/browser_0_library_left_pane_migration.js | Left pane version correctly set
++DOCSHELL 0x34b65a0 == 26
++DOMWINDOW == 128 (0x5aff008) [serial = 1810] [outer = (nil)]
WARNING: Subdocument container has no content: file /builds/slave/cen-lnx64-dbg/build/layout/base/nsDocumentViewer.cpp, line 2367
++DOMWINDOW == 129 (0x6853718) [serial = 1811] [outer = 0x5afef90]
WARNING: Subdocument container has no content: file /builds/slave/cen-lnx64-dbg/build/layout/base/nsDocumentViewer.cpp, line 2367
WARNING: NS_ENSURE_TRUE(sf) failed: file /builds/slave/cen-lnx64-dbg/build/docshell/base/nsDocShell.cpp, line 4985
WARNING: NS_ENSURE_TRUE(sf) failed: file /builds/slave/cen-lnx64-dbg/build/docshell/base/nsDocShell.cpp, line 4985
WARNING: Subdocument container has no content: file /builds/slave/cen-lnx64-dbg/build/layout/base/nsDocumentViewer.cpp, line 2367
###!!! ASSERTION: XPConnect is being called on a scope without a 'Components' property!  (stack and details follow): 'Error', file /builds/slave/cen-lnx64-dbg/build/js/src/xpconnect/src/xpcwrappednativescope.cpp, line 761
DEBUG_CheckForComponentsInScope [js/src/xpconnect/src/xpcwrappednativescope.cpp:762]
XPCWrappedNativeScope::FindInJSObjectScope [js/src/xpconnect/src/xpcwrappednativescope.cpp:840]
XPCConvert::NativeInterface2JSObject [js/src/xpconnect/src/xpcconvert.cpp:1181]
XPCConvert::NativeData2JS [js/src/xpconnect/src/xpcconvert.cpp:492]
XPCConvert::NativeData2JS [js/src/xpconnect/src/xpcprivate.h:3211]
nsXPCWrappedJSClass::CallMethod [js/src/xpconnect/src/xpcwrappedjsclass.cpp:1594]
nsXPCWrappedJS::CallMethod [js/src/xpconnect/src/xpcwrappedjs.cpp:588]
PrepareAndDispatch [xpcom/reflect/xptcall/src/md/unix/xptcstubs_x86_64_linux.cpp:153]
nsXPTCVariant::IsPtrData [xptcall.h:112]
The current JS stack is:
###!!! ASSERTION: XPConnect is being called on a scope without a 'Components' property!  (stack and details follow): 'Error', file /builds/slave/cen-lnx64-dbg/build/js/src/xpconnect/src/xpcwrappednativescope.cpp, line 761
DEBUG_CheckForComponentsInScope [js/src/xpconnect/src/xpcwrappednativescope.cpp:762]
XPCWrappedNativeScope::FindInJSObjectScope [js/src/xpconnect/src/xpcwrappednativescope.cpp:840]
XPCConvert::NativeInterface2JSObject [js/src/xpconnect/src/xpcconvert.cpp:1181]
XPCConvert::NativeData2JS [js/src/xpconnect/src/xpcconvert.cpp:492]
XPCConvert::NativeData2JS [js/src/xpconnect/src/xpcprivate.h:3211]
nsXPCWrappedJSClass::CallMethod [js/src/xpconnect/src/xpcwrappedjsclass.cpp:1594]
nsXPCWrappedJS::CallMethod [js/src/xpconnect/src/xpcwrappedjs.cpp:588]
PrepareAndDispatch [xpcom/reflect/xptcall/src/md/unix/xptcstubs_x86_64_linux.cpp:153]
nsXPTCVariant::IsPtrData [xptcall.h:112]
The current JS stack is:
###!!! ASSERTION: XPConnect is being called on a scope without a 'Components' property!  (stack and details follow): 'Error', file /builds/slave/cen-lnx64-dbg/build/js/src/xpconnect/src/xpcwrappednativescope.cpp, line 761
DEBUG_CheckForComponentsInScope [js/src/xpconnect/src/xpcwrappednativescope.cpp:762]
XPCWrappedNativeScope::FindInJSObjectScope [js/src/xpconnect/src/xpcwrappednativescope.cpp:840]
XPCConvert::NativeInterface2JSObject [js/src/xpconnect/src/xpcconvert.cpp:1181]
XPCConvert::NativeData2JS [js/src/xpconnect/src/xpcconvert.cpp:492]
XPCConvert::NativeData2JS [js/src/xpconnect/src/xpcprivate.h:3211]
nsXPCWrappedJSClass::CallMethod [js/src/xpconnect/src/xpcwrappedjsclass.cpp:1594]
nsXPCWrappedJS::CallMethod [js/src/xpconnect/src/xpcwrappedjs.cpp:588]
PrepareAndDispatch [xpcom/reflect/xptcall/src/md/unix/xptcstubs_x86_64_linux.cpp:153]
nsXPTCVariant::IsPtrData [xptcall.h:112]
The current JS stack is:
###!!! ASSERTION: XPConnect is being called on a scope without a 'Components' property!  (stack and details follow): 'Error', file /builds/slave/cen-lnx64-dbg/build/js/src/xpconnect/src/xpcwrappednativescope.cpp, line 761
(repeats several times, until)
The current JS stack is:
WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80570021: file /builds/slave/cen-lnx64-dbg/build/toolkit/components/places/src/nsNavBookmarks.cpp, line 2138
WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80570021: file /builds/slave/cen-lnx64-dbg/build/toolkit/components/places/src/nsNavBookmarks.cpp, line 2897
(that line repeats dozens of times)
0 CB_create_query(aQueryUrl = "place:type=3&sort=4", aParentId = 92, aQueryName = "History") ["resource:///modules/PlacesUIUtils.jsm":1102]
    itemId = undefined
    this = [object Object]
    this.runBatched = [function]
    this.create_folder = [function]
    this.create_query = [function]
1 CB_runBatched(aUserData = null) ["resource:///modules/PlacesUIUtils.jsm":1153]
    this = [object Object]
    this.runBatched = [function]
    this.create_folder = [function]
    this.create_query = [function]
(etc., down to)
121 onload(event = [object Event @ 0x5f4d8f0 (native @ 0x474d870)]) ["chrome://browser/content/places/places.xul":1]
    this = [object ChromeWindow @ 0x61d1a40 (native @ 0x6853718)]
And the object whose scope lacks a 'Components' property is:
object 0x7f69f10b7958
class 0x7f6a1a132060 Object
flags: branded own_shape inDictionaryMode hasPropertyTable
properties:
    enumerate "_unstarredTooltip": slot 18
    enumerate "_ignoreClicks": slot 17
    enumerate "_itemIds": slot 16
    enumerate "_uri": slot 15
    enumerate "_starIcon": slot 14
    enumerate "onItemMoved": slot 13
    enumerate "onItemVisited": slot 12
    enumerate "onBeforeItemRemoved": slot 11
    enumerate "onEndUpdateBatch": slot 10
    enumerate "onBeginUpdateBatch": slot 9
    enumerate "onItemChanged": slot 8
    enumerate "onItemRemoved": slot 7
    enumerate "onItemAdded": slot 6
    enumerate "onClick": slot 5
    enumerate "_updateStateInternal": slot 4
    enumerate "updateState": slot 3
    enumerate getter shared "_starredTooltip": slot -1
    enumerate "QueryInterface": slot 2
    enumerate "uninit": slot 1
    enumerate "_hasBookmarksObserver": slot 0
proto <Object at 0x7f6a04c8fcf0>
parent <ChromeWindow object at 0x7f6a04c8fc60>
slots:
   0 = true
   1 = <function PSB_uninit (chrome://browser/content/browser.js:3383) at 0x7f69f1176160 (JSFunction at 0x7f69f2811500)>
   2 = <function XPCOMUtils_QueryInterface (resource://gre/modules/XPCOMUtils.jsm:254) at 0x7f69f1176268 (JSFunction at 0x7f6a05612200)>
   3 = <function PSB_updateState (chrome://browser/content/browser.js:3407) at 0x7f69f1176370 (JSFunction at 0x7f69f2811680)>
   4 = <function PSB__updateStateInternal (chrome://browser/content/browser.js:3445) at 0x7f69f11763c8 (JSFunction at 0x7f69f2811780)>
   5 = <function PSB_onClick (chrome://browser/content/browser.js:3461) at 0x7f69f1176420 (JSFunction at 0x7f69f2811800)>
   6 = <function PSB_onItemAdded (chrome://browser/content/browser.js:3472) at 0x7f69f1176478 (JSFunction at 0x7f69f2811880)>
   7 = <function PSB_onItemRemoved (chrome://browser/content/browser.js:3488) at 0x7f69f11764d0 (JSFunction at 0x7f69f2811900)>
   8 = <function PSB_onItemChanged (chrome://browser/content/browser.js:3503) at 0x7f69f1176528 (JSFunction at 0x7f69f2811980)>
   9 = <unnamed function (chrome://browser/content/browser.js:3526) at 0x7f69f1176580 (JSFunction at 0x7f69f2811a00)>
  10 = <unnamed function (chrome://browser/content/browser.js:3527) at 0x7f69f11765d8 (JSFunction at 0x7f69f2811a80)>
  11 = <unnamed function (chrome://browser/content/browser.js:3528) at 0x7f69f1176630 (JSFunction at 0x7f69f2811b00)>
  12 = <unnamed function (chrome://browser/content/browser.js:3529) at 0x7f69f1176688 (JSFunction at 0x7f69f2811b80)>
  13 = <unnamed function (chrome://browser/content/browser.js:3530) at 0x7f69f11766e0 (JSFunction at 0x7f69f2811c00)>
  14 = <XULElement object at 0x7f69f1d67ec8>
  15 = <XPCWrappedNative_NoHelper object at 0x7f69cfe4b478>
  16 = <Array object at 0x7f69f1121aa0>
  17 = false
  18 = "Bookmark this page"

at which point the loop starts over again, down to

Output exceeded 52428800 bytes, remaining output has been truncated

command timed out: 5400 seconds elapsed, killing pid 2128

I'm torn between places and xpconnect, so filing in Core: Dumping Ground and cc'ing both.
Do we have a regression range?
Note that because it's pretty much impossible to load the full log in the browser, and the brief log doesn't actually show much other than a random bit of the loop just before the timeout, it's possible that anything which I call this will actually not be, or will be this but started by a different test, or possibly will be meatcake.

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1296768758.1296775425.23948.gz#err2
Rev3 Fedora 12 mozilla-central debug test mochitest-other on 2011/02/03 13:32:38
s: talos-r3-fed-031
I'll be working my way slowly back to the start, but the reason it's gotten away with not being filed for two weeks is because it only hits a couple times a day, so the range is going to be pretty roughly "that weekend I took off back in January, I think the 22nd/23rd."
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1296670476.1296677413.31396.gz
Rev3 WINNT 6.1 mozilla-central debug test mochitest-other on 2011/02/02 10:14:36
s: talos-r3-w7-016

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1296513481.1296520057.15075.gz
Rev3 Fedora 12 mozilla-central debug test mochitest-other on 2011/01/31 14:38:01
s: talos-r3-fed-051

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1296496508.1296503055.21660.gz
Rev3 MacOSX Snow Leopard 10.6.2 mozilla-central debug test mochitest-other on 2011/01/31 09:55:08
s: talos-r3-snow-020

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1296340606.1296347500.10493.gz
Rev3 WINNT 6.1 mozilla-central debug test mochitest-other on 2011/01/29 14:36:46
s: talos-r3-w7-022

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1296209517.1296216319.29168.gz
Rev3 MacOSX Snow Leopard 10.6.2 mozilla-central debug test mochitest-other on 2011/01/28 02:11:57
s: talos-r3-snow-050

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1296176733.1296183918.9117.gz
Rev3 WINNT 6.1 mozilla-central debug test mochitest-other on 2011/01/27 17:05:33
s: talos-r3-w7-031

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1296170503.1296179076.17439.gz
Rev3 MacOSX Snow Leopard 10.6.2 mozilla-central debug test mochitest-other on 2011/01/27 15:21:43
s: talos-r3-snow-053

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1295997584.1296004090.314.gz
Rev3 Fedora 12 mozilla-central debug test mochitest-other on 2011/01/25 15:19:44
s: talos-r3-fed-049

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1295846361.1295852925.5978.gz
Rev3 Fedora 12 mozilla-central debug test mochitest-other on 2011/01/23 21:19:21
s: talos-r3-fed-016

I didn't expect much of the regression window, since it was going to have to be a fuzzy 2-3 days before the first one, but, that first instance was on http://tbpl.mozilla.org/?rev=bc2fda9a8b32 which includes the obvious cause of an infinite loop while dumping the JS stack, since it includes the cset that dumps the JS stack.
Blocks: 510489
Whiteboard: [orange] → [orange][test which aborts the suite]
hm, interesting failures.
This is the batch that creates the left pane of the Library, it is failing due to a call to GetBookmarkURI that I have not yet identified, that fails on a NS_newURI.
The object without a scope is PlacesStarButton, that starts observing bookmarks onLocationChange and stops at BrowserShutdown.
I have to note this is the first Places b-c test that runs in our harness, so it's even possible a bad interaction with the tests that runs just before it, I see base/content/test/tabview/browser_tabview_undo_group.js and components/certerror/test/browser_bug431826.js
If I run the test alone I see some warning but not this kind of scope missing loop.
After "Left pane version correctly set" message, we register a listener to the window watcher and open the Library window, at this point we wait for "load" event, onload the library initializes the left pane that runs a batch that creates some bookmarks. We are at the point where the assertion happens. The star button object is most likely listening, but should be in the browser scope, unless this is a ghost object left around by a missing call to BrowserShutdown, but that is called in the onunload handler of the browser window.
It's possible a previous test causes some part of BrowserShutdown to throw before running to completion. A bunch of calls there should be wrapped by a try/catch.
previous tests are showing a bunch of "reference to undefined property aBrowser.webNavigation"
ah-ha here is the problem:

TEST-INFO | chrome://mochitests/content/browser/browser/base/content/test/tabview/browser_tabview_undo_group.js | Console message: [JavaScript Error: "uncaught exception: [Exception... "Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIObserverService.removeObserver]"  nsresult: "0x80004005 (NS_ERROR_FAILURE)"  location: "JS frame :: chrome://browser/content/browser.js :: BrowserShutdown :: line 6885"  data: no]"]

a Panorama b-c test (looks like browser_tabview_undo_group.js) is causing an uncaught assertion in BrowserShutdown

looks like one of these is failing
http://mxr.mozilla.org/mozilla-central/source/browser/base/content/browser.js#1691
1691   Services.obs.removeObserver(gSessionHistoryObserver, "browser:purge-session-history");
1692   Services.obs.removeObserver(gXPInstallObserver, "addon-install-disabled");
1693   Services.obs.removeObserver(gXPInstallObserver, "addon-install-started");
1694   Services.obs.removeObserver(gXPInstallObserver, "addon-install-blocked");
1695   Services.obs.removeObserver(gXPInstallObserver, "addon-install-failed");
1696   Services.obs.removeObserver(gXPInstallObserver, "addon-install-complete");
1697   Services.obs.removeObserver(gPluginHandler.pluginCrashed, "plugin-crashed");
1698   Services.obs.removeObserver(gFormSubmitObserver, "invalidformsubmit");
It's even possible I blamed the wrong test and that browser_tabview_undo_group.js is only bringing the problem to evidence but it's a previous test causing it (by removing a global browser observer).

I'll add some try catch and try to see if I can find the culprit topic. Taking for now.
Assignee: nobody → mak77
Attached patch patch v1.0 (obsolete) — Splinter Review
I think we could want to do something like this in the shutdown function, to ensure in future people won't add stuff that could interrupt the cleanup.
This way instead of getting a looping error we would get a notification, still usable to investigate issues through the log.

The way and the name of the helper are not finalized, feedback welcome.

I'll follow with the log after a run of browser/base/content tests (I think the problem lives here).
Attachment #509756 - Flags: review?(gavin.sharp)
There are three tests printing out shutdown errors in my last run in browser/base/content:

browser_bug491431.js
browser_tab_dragdrop2.js
browser_tabview_undo_group.js
also

browser_bug609700.js
http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1297279047.1297285563.8794.gz
Rev3 Fedora 12 tracemonkey debug test mochitest-other on 2011/02/09 11:17:27
s: talos-r3-fed-022
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1297468461.1297475029.22474.gz
Rev3 MacOSX Snow Leopard 10.6.2 mozilla-central debug test mochitest-other on 2011/02/11 15:54:21
s: talos-r3-snow-053
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1297645826.1297652637.13069.gz
Rev3 WINNT 6.1 mozilla-central debug test mochitest-other on 2011/02/13 17:10:26
s: talos-r3-w7-027
Attachment #509756 - Flags: review?(gavin.sharp)
Attachment #509756 - Flags: review+
Attachment #509756 - Flags: approval2.0+
Attached patch patch for check-in (obsolete) — Splinter Review
Ok, here is the patch ready for check-in.

As a plan of action, once this bug is fixed, I'll open a meta bug on BrowserShutdown exceptions, and attach a dependent bug on each test causing them.
Attachment #509756 - Attachment is obsolete: true
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1297714913.1297721492.9935.gz
Rev3 MacOSX Leopard 10.5.8 mozilla-central debug test mochitest-other on 2011/02/14 12:21:53
s: talos-r3-leopard-015
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1297716275.1297722811.17575.gz
Rev3 MacOSX Snow Leopard 10.6.2 mozilla-central debug test mochitest-other on 2011/02/14 12:44:35
s: talos-r3-snow-029
Attached patch alternative approach (obsolete) — Splinter Review
This is an alternative approach that I briefly proposed to Gavin on IRC.
The problem with the above patch is that I still see a bunch of assertions in those tests where we call BrowserShutdown before DelayedStartup.

Managing all of this at shutdown is a lot error-prone both for current and future code, and makes things less readable. I suggested to proceed like we do in BrowserTest harness, provide a RegisterBrowserShutdownFunction() global and use it everytime something that needs cleanup is initialized.
This has 2 advantages:
- finalizers are near initializers, thus the blame for finalizing stuff is clear.
- anything that has not been initialized won't be finalized

With this patch I ran browser_privatebrowsing_windowTitle.js test, this was usually throwing a lot of assertions, after patching there is none.

I'll run this through tryserver.
Comment on attachment 512342 [details] [diff] [review]
alternative approach

passed try
Attachment #512342 - Flags: review?(gavin.sharp)
http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1297801781.1297808707.1084.gz
Rev3 WINNT 6.1 tracemonkey debug test mochitest-other on 2011/02/15 12:29:41
s: talos-r3-w7-013
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1297878377.1297884961.19101.gz
Rev3 MacOSX Leopard 10.5.8 mozilla-central debug test mochitest-other on 2011/02/16 09:46:17
s: talos-r3-leopard-039
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1297960457.1297967000.32591.gz
Rev3 MacOSX Leopard 10.5.8 mozilla-central debug test mochitest-other on 2011/02/17 08:34:17
s: talos-r3-leopard-013
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1298276956.1298283517.6122.gz
Rev3 Fedora 12 mozilla-central debug test mochitest-other on 2011/02/21 00:29:16
s: talos-r3-fed-037
http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1298457556.1298464360.4270.gz
Rev3 WINNT 6.1 tracemonkey debug test mochitest-other on 2011/02/23 02:39:16
s: talos-r3-w7-035
Comment on attachment 509756 [details] [diff] [review]
patch v1.0

(bookkeeping)
Attachment #509756 - Flags: approval2.0+
http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1298556526.1298563038.12029.gz
Rev3 MacOSX Leopard 10.5.8 tracemonkey debug test mochitest-other on 2011/02/24 06:08:46
s: talos-r3-leopard-041
http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1298613097.1298619626.23377.gz
Rev3 Fedora 12x64 tracemonkey debug test mochitest-other on 2011/02/24 21:51:37
s: talos-r3-fed64-013
Please don't let the term shutdown slip into new API. The browser isn't shutting down here, but a browser window is closing.
(In reply to comment #38)
> Please don't let the term shutdown slip into new API. The browser isn't
> shutting down here, but a browser window is closing.

makes sense, what about RegisterOnWindowCloseHandler? At this point we could delay this fix to post-fx4 and also rename BrowserShutdown.
(In reply to comment #39)
> makes sense, what about RegisterOnWindowCloseHandler?

Seems a bit clunky. How about addEventListener("unload", ...)? :)
(In reply to comment #40)
> (In reply to comment #39)
> > makes sense, what about RegisterOnWindowCloseHandler?
> 
> Seems a bit clunky. How about addEventListener("unload", ...)? :)

A lot of these registrations could be in the startup path, using addEventListener is more expensive than just adding a function to an array and have a single event listener. Also here I'd like to wrap each finalization in a try catch to make "shutdown" more reliable and have meaningful errors, so using a registration function seems simpler.
(In reply to comment #42)
> A lot of these registrations could be in the startup path, using
> addEventListener is more expensive than just adding a function to an array and
> have a single event listener.

The changes in BrowserStartup aren't needed anyway, as the load event is guaranteed to occur before the unload event.

> Also here I'd like to wrap each finalization in a
> try catch to make "shutdown" more reliable and have meaningful errors, so using
> a registration function seems simpler.

I don't think I see the need. A call to an uninit function shouldn't fail, as long as the corresponding init function was called beforehand.
http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1298930953.1298937562.21522.gz
Rev3 MacOSX Leopard 10.5.8 tracemonkey debug test mochitest-other on 2011/02/28 14:09:13
s: talos-r3-leopard-016
(In reply to comment #43)
> (In reply to comment #42)
> The changes in BrowserStartup aren't needed anyway, as the load event is
> guaranteed to occur before the unload event.

> I don't think I see the need. A call to an uninit function shouldn't fail, as
> long as the corresponding init function was called beforehand.

Ideally this bug is not needed as well, but it happened because the way this is done now is confusing and hard to report.
Breaking init/uninit stuff is easy, but noticing and reporting the breakage is not, that is what this tries to handle.
Thus I'm trying to find alternatives to better associate window-close stuff with relative window-open stuff, and to better report errors (So that bugs can be filed earlier and with more context).
Now, we can figure out a better approach, even better than mine, but the problem exists.
The current setup isn't really confusing or even wrong as far as I can see. It just assumes that windows aren't closed before delayedStartup is called, which is a sane assumption for Firefox in the wild but not for tests.
Also note that addEventListener("load", ...) accompanied by addEventListener("unload", ...) is the common pattern used by extensions.
I don't think current setup is plain wrong, just it doesn't help bug reporters.
I guess I'll make a minimal patch (just handle clearTimeout and move shutdown for delayed stuff in a condition) to fix this issue early in the 4.next reopening, and then we can discuss the rest apart.
Could we please have whichever whatever patch, in a sort of a desperate hurry?

About half of the newly added WinXP debug runs (http://tbpl.mozilla.org/?noignore=1 since they're hidden for a permaorange in mochitest-chrome) are hitting this, the half that aren't hitting bug 629610 and getting taken out before this can hit, and according to bug 614955 comment 42 instead of just creating a 50MB log and then tying up the slave for 90 minutes before timing out, they may be tying up the slave for 20 hours.
(In reply to comment #54)
> Could we please have whichever whatever patch, in a sort of a desperate hurry?

I think even if I do a patch in the next 2 seconds, it won't be allowed to land, we cannot land a browser change till the tree reopens for check-ins that are not fennec-only. Btw, plan to to do it today.
(In reply to comment #55)
> (In reply to comment #54)
> > Could we please have whichever whatever patch, in a sort of a desperate hurry?
> 
> I think even if I do a patch in the next 2 seconds, it won't be allowed to
> land, we cannot land a browser change till the tree reopens for check-ins that
> are not fennec-only. Btw, plan to to do it today.

FWIW, people have been breaking this rule by landing all sorts of (most likely) unnecessary test fixes ever since we've closed the tree down to only blockers, so if you can come up with a patch which fixes this, you have my approval on landing it (and yes, you can quote me on this!)  :D
NPOTB test fixes can land, of course. This isn't going to be such a fix.
(In reply to comment #57)
> NPOTB test fixes can land, of course. This isn't going to be such a fix.

Well, I'll shut my mouth then.  :-)
Attachment #512342 - Attachment is obsolete: true
Attachment #512342 - Flags: review?(gavin.sharp)
Attachment #512170 - Attachment is obsolete: true
Attached patch reduced patch v1.0 (obsolete) — Splinter Review
This is the basic idea, to avoid doing too many changes here (I avoided to do any change to the uninit calls themselves).
It just tracks if delayed startup did run, and avoids running delayed shutdown tasks if it didn't.
Before going for review I still have to run tests, but attaching nontheless to collect eventual earlier feedback.
Blocks: 646492
Comment on attachment 522678 [details] [diff] [review]
reduced patch v1.0

local testing was fine.
Attachment #522678 - Flags: review?(gavin.sharp)
Blocks: 646862
Use an early return to avoid the indentation changes? Would make this a lot easier to review.
No longer blocks: 646862
Oh, I guess that'd require re-ordering stuff anyways, nevermind.
yes, I should still reorder stuff, if you want an easier patch I could make a patch with ignoring whitespaces, it would save some line. Btw, I mostly copy/pasted preserving original calls ordering to avoid surprises.
Attached patch reduced patch, slightly tweaked (obsolete) — Splinter Review
I found that the easiest way to review this was to write it myself and then compare to yours - turns out our patches are almost identical, which is a good sign! (I snuck in some minor addition style/comment tweaks). I also audited the uninit methods being called to ensure there were no hidden dependencies.

In testing this, I discovered another case we need to handle: unload can be called before load does, if you call close() before the load is complete, so we need to handle that too. In this case, I just made BrowserShutdown a no-op.

My tests were basically using:
var win=OpenBrowserWindow(); setTimeout(function () {win.close();}, SOMETIMEOUT);

For me, SOMETIMEOUT of 100ms was enough to hit the second case (unload firing before load), and 150ms was enough to catch the other condition (after load, but before delayedStartup).
Attachment #522678 - Attachment is obsolete: true
Attachment #528183 - Flags: feedback?(mak77)
Attachment #528183 - Flags: feedback?(dao)
Attachment #522678 - Flags: review?(gavin.sharp)
Attachment #528183 - Flags: feedback?(dao) → feedback+
Comment on attachment 528183 [details] [diff] [review]
reduced patch, slightly tweaked

It's fine, minor changes mostly.

In the nonBrowserWindowShutdown case there is only delayed shutdown to do, so you didn't use gStartupRan, personally I'd still set it to true in nonBrowserWindowStartup though, so that if somebody in future uses it it's correctly setup.

In browser.js the bracing coding style is mixed up, since I see you moved braces to same line as function definitions I assume this is the wanted style in browser.js, you should then probably also fix nonBrowserWindowStartup style.
Attachment #528183 - Flags: feedback?(mak77) → feedback+
Attached patch final patchSplinter Review
Final patch, ready for checkin. Passed tests on try.
Attachment #528183 - Attachment is obsolete: true
http://hg.mozilla.org/mozilla-central/rev/8deabe91c24b
Status: NEW → RESOLVED
Closed: 13 years ago
Keywords: checkin-needed
Product: Core → Firefox
QA Contact: general → general
Resolution: --- → FIXED
Target Milestone: --- → Firefox 6
Blocks: 619631
Whiteboard: [orange][test which aborts the suite] → [test which aborts the suite]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: