Closed Bug 797263 Opened 7 years ago Closed 7 years ago

Intermittent browser_586068-reload.js | Test timed out

Categories

(Firefox :: Session Restore, defect)

x86
Windows 7
defect
Not set

Tracking

()

RESOLVED FIXED
Firefox 21
Tracking Status
b2g18 --- disabled
b2g18-v1.0.1 --- disabled

People

(Reporter: philor, Assigned: ttaubert)

References

Details

(Keywords: intermittent-failure)

Attachments

(3 files, 1 obsolete file)

https://tbpl.mozilla.org/php/getParsedLog.php?id=15762886&tree=Mozilla-Inbound
Rev3 WINNT 6.1 mozilla-inbound debug test mochitest-other on 2012-10-02 19:30:54 PDT for push acf91f25f228
slave: talos-r3-w7-005

TEST-PASS | chrome://mochitests/content/browser/browser/components/sessionstore/test/browser_586068-reload.js | load 6 - correct tab was reloaded
WARNING: NS_ENSURE_TRUE(mMutable) failed: file e:/builds/moz2_slave/m-in-w32-dbg/build/netwerk/base/src/nsSimpleURI.cpp, line 273
WARNING: NS_ENSURE_TRUE(mMutable) failed: file e:/builds/moz2_slave/m-in-w32-dbg/build/netwerk/base/src/nsSimpleURI.cpp, line 273
WARNING: No outer window available!: file e:/builds/moz2_slave/m-in-w32-dbg/build/dom/base/nsGlobalWindow.cpp, line 9197
WARNING: No outer window available!: file e:/builds/moz2_slave/m-in-w32-dbg/build/dom/base/nsGlobalWindow.cpp, line 9197
WARNING: No outer window available!: file e:/builds/moz2_slave/m-in-w32-dbg/build/dom/base/nsGlobalWindow.cpp, line 9197
WARNING: No outer window available!: file e:/builds/moz2_slave/m-in-w32-dbg/build/dom/base/nsGlobalWindow.cpp, line 9197
WARNING: No outer window available!: file e:/builds/moz2_slave/m-in-w32-dbg/build/dom/base/nsGlobalWindow.cpp, line 9197
WARNING: No outer window available!: file e:/builds/moz2_slave/m-in-w32-dbg/build/dom/base/nsGlobalWindow.cpp, line 9197
WARNING: No outer window available!: file e:/builds/moz2_slave/m-in-w32-dbg/build/dom/base/nsGlobalWindow.cpp, line 9197
WARNING: No outer window available!: file e:/builds/moz2_slave/m-in-w32-dbg/build/dom/base/nsGlobalWindow.cpp, line 9197
WARNING: No outer window available!: file e:/builds/moz2_slave/m-in-w32-dbg/build/dom/base/nsGlobalWindow.cpp, line 9197
++DOMWINDOW == 50 (0E56D518) [serial = 4247] [outer = 1EFB6D50]
WARNING: NS_ENSURE_TRUE(mMutable) failed: file e:/builds/moz2_slave/m-in-w32-dbg/build/netwerk/base/src/nsSimpleURI.cpp, line 273
TEST-UNEXPECTED-FAIL | chrome://mochitests/content/browser/browser/components/sessionstore/test/browser_586068-reload.js | Test timed out
(screenshot of a window claiming to be about:blank (and "Connecting...") but showing the index of the root of the mochitest server)
INFO TEST-END | chrome://mochitests/content/browser/browser/components/sessionstore/test/browser_586068-reload.js | finished in 31408ms
https://tbpl.mozilla.org/php/getParsedLog.php?id=15737409&tree=Fx-Team
Rev3 WINNT 6.1 fx-team debug test mochitest-other on 2012-10-02 02:28:07 PDT for push 85dd8e346102
slave: talos-r3-w7-099

TEST-UNEXPECTED-FAIL | chrome://mochitests/content/browser/browser/components/sessionstore/test/browser_586068-reload.js | Test timed out
Now one of our top (non-linux weirdness) oranges.

Test was happily running for a week (after landing in bug 789102), before the first occurrence here:
https://tbpl.mozilla.org/?rev=85dd8e346102&tree=Fx-Team

No idea what regressed it...
(In reply to Ed Morley [:edmorley UTC+1] from comment #110)
> Now one of our top (non-linux weirdness) oranges.
> 
> Test was happily running for a week (after landing in bug 789102), before
> the first occurrence here:
> https://tbpl.mozilla.org/?rev=85dd8e346102&tree=Fx-Team
> 
> No idea what regressed it...

Believe the cause to have been bug 731974; we'd actually been permaorange with this on WinXP debug for 3 days, but due to coalescing it hadn't been overly obvious :-(

Anyway, bug 731974 is now backed out, so hoping this will go away on all platforms.
Has not gone away... I'll take a look at the test.
Going to assert that bug 731974 was not the cause, since there are plenty of examples even after the backout.  I'm going to reland that (with the crash fix).

I did some digging into this, however.  The "testCascade" part is what's failing.  Two randomly-chosen failures and a success look like:

fail-1:
...
TEST-PASS | browser_586068-reload.js | load 4 - correct tab was reloaded
TEST-PASS | browser_586068-reload.js | load 5 - correct tab was reloaded
TEST-UNEXPECTED-FAIL | browser_586068-reload.js | Test timed out

fail-2:
...
TEST-PASS | browser_586068-reload.js | load 5 - correct tab was reloaded
TEST-PASS | browser_586068-reload.js | load 6 - correct tab was reloaded
TEST-UNEXPECTED-FAIL | browser_586068-reload.js | Test timed out

success:
...
TEST-PASS | browser_586068-reload.js | load 3 - correct tab was reloaded
TEST-PASS | browser_586068-reload.js | load 4 - correct tab was reloaded
TEST-PASS | browser_586068-reload.js | load 5 - correct tab was reloaded
TEST-PASS | browser_586068-reload.js | load 6 - correct tab was reloaded
TEST-INFO | browser_586068-reload.js | running test_finish
INFO TEST-END | browser_586068-reload.js | finished in 30620ms

Note that in fail-1, "load 6" never happened.  In fail-2, "load 6" did, but we never executed the test_finish callback!  In between those two is a waitForBrowserState() of the state at the start of the test.

In the attached screenshots after the two failing states, in fail-1, there are 6 tabs visible, but tab 6 is still reloading (throbber is throbbing).  In fail-2, there's only one tab visible, the URL is about:blank, and the throbber is throbbing.  I don't know what the saved state looks like, but in "load 6", the reload is of example.com/#6.  I know we redirect example.com to our own local http server, but could there be load issues there that are causing the load to never finish, either on the 6th tab, or on resetting things back to the state before the test started?  This test is heavily triggered off those state changes, so if the loads never finish, that would cause the test to time out.
(In reply to Vladimir Vukicevic [:vlad] [:vladv] from comment #213)
> Going to assert that bug 731974 was not the cause, since there are plenty of
> examples even after the backout.  I'm going to reland that (with the crash
> fix).

Please do not reland yet.

Whilst this still occurs with that bug backed out, the frequency is *much* less:
http://brasstacks.mozilla.com/orangefactor/?display=Bug&tree=trunk&startday=2012-10-04&endday=2012-10-11&bugid=797263

(or just look at the bug comment frequency)
(And thank you for looking at this :-))
Erf.  Too late, relanded already :/  I guess let's see what happens?  I'll keep looking...
Perhaps if we do a bunch of retriggers on m-oth (particularly winxp debug) we can tell for sure :-)
Yeah, I'll watch it throughout the night, probably kicking off some retriggers if needed.  Having the frequency go up might actually be a good thing, if it increases the chances of someone reproducing it locally.  I really wish we could have replay debugging for this stuff on the tinderboxes :(
It actually fails more than you see here - while you were out of the tree, we weren't having the https://tbpl.mozilla.org/php/getParsedLog.php?id=16036887&tree=Mozilla-Inbound pattern of failing several hundred more tests after the timeout in this test, but when we do have it, tbplbot fails at commenting in the bug.
Regressed again, will back out bug 731974 once inbound's .hg store is fixed.