Closed Bug 562326 Opened 10 years ago Closed 10 years ago

bug451286_window.xul (used by test_bug451286.xul) randomly times out on the tinderbox if the HTML5 parser is enabled

Categories

(Toolkit :: Find Toolbar, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla1.9.3a5

People

(Reporter: hsivonen, Assigned: graememcc)

References

Details

(Keywords: intermittent-failure)

Attachments

(1 file, 2 obsolete files)

bug451286_window.xul (used by bug451286_window.xul) assumes data: URLs are parsed synchronously. As of the HTML5 parser, they aren't, so the test should wait for the load event for the iframes into which the data: URLs are loaded.
Locally with the old parser this test fails for me on line
window[import] = window.opener.wrappedJSObject[import];
with null opener.
Summary: bug451286_window.xul (used by bug451286_window.xul) assumes data: URLs are parsed synchronously → bug451286_window.xul (used by test_bug451286.xul) randomly fails on the tinderbox if the HTML5 parser is enabled
Attached patch Potential fix (obsolete) — Splinter Review
This patch doesn't make the test fail for me locally. However, since I'm unable to reproduce the failure locally or on the tryserver, I can't tell if this fixes the problem without pushing this into m-c. :-(
Assignee: nobody → hsivonen
Status: NEW → ASSIGNED
Attachment #443081 - Flags: review?(bzbarsky)
Attachment #443081 - Attachment is obsolete: true
Attachment #443082 - Flags: review?(bzbarsky)
Attachment #443081 - Flags: review?(bzbarsky)
Comment on attachment 443082 [details] [diff] [review]
Potential fix, without debugging code

Would just checking in onPageShow that event.target == gBrowser.contentDocument be good enough?  I would bet that in the failing cases it's not...
Summary: bug451286_window.xul (used by test_bug451286.xul) randomly fails on the tinderbox if the HTML5 parser is enabled → bug451286_window.xul (used by test_bug451286.xul) randomly times out on the tinderbox if the HTML5 parser is enabled
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1273260986.1273262727.4727.gz
WINNT 5.2 mozilla-central debug test mochitest-other on 2010/05/07 12:36:26
s: win32-slave32
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1273572934.1273575061.11172.gz#err0
WINNT 5.2 mozilla-central debug test mochitest-other on 2010/05/11 03:15:34
Apologies for not having been about recently, particularly given how noisy an orange this has turned out to be.

Left my box continually running the chrome test suite for over 10 hours, and didn't fail once locally.

Henri's patch in comment 5 seems the right approach, the delayed loading of the iframe's src tags do look like the obvious point of failure.

Boris, in comment 32, I guess you're thinking about the problem being the possibility of the test starting after receiving a pageshow event unrelated to the loadURI request for the test page? We would still then need to check each iframe's readystate is complete before the test could do anything useful.
Comment 32 just says that the pageshow for the toplevel page should imply that the readyState for all subframes is complete.  If that's not the case, something is very wrong.
FWIW, the last few logs (the only ones I checked) all have this right before the timeout:
> JavaScript error: , line 0: uncaught exception: [Exception... "Parameter is not an object"  code: "1003" nsresult: "0x805303eb (NS_ERROR_DOM_NOT_OBJECT_ERR)"  location: "chrome://global/content/bindings/findbar.xml Line: 940"]
> ###!!! ASSERTION: stylesheet not found: 'Not Reached', file e:/builds/moz2_slave/mozilla-central-win32-debug/build/content/base/src/nsDocument.cpp, line 3392

with a total of 5 instances of the assertion-failure between the "JavaScript error" and the timeout.

(There appear to be many other instances of that assertion in the log, though, so maybe it's just noise)
(In reply to comment #133)
> FWIW, the last few logs (the only ones I checked) all have this right before
> the timeout:
> > JavaScript error: , line 0: uncaught exception: [Exception... "Parameter is not an object"  code: "1003" nsresult: "0x805303eb (NS_ERROR_DOM_NOT_OBJECT_ERR)"  location: "chrome://global/content/bindings/findbar.xml Line: 940"]

which is being thrown by the call 
  this._searchRange.selectNodeContents(doc.body);
so presumably body must be null/undefined at this point (code above this already checks doc is an instance of HTMLDocument). The _highlightDoc method this line comes from recurses through frames first, so doc could be the document of either of the frames or the top-level document.
Can we add logging to the test to log the URI of the event.target in that pageshow handler?  Or log whether it's == gBrowser.contentDocument?  Or both?

We could also add try/catch around the relevant calls into gFindBar to see which one this is happening in, right?
(In reply to comment #146)
> Can we add logging to the test to log the URI of the event.target in that
> pageshow handler?  Or log whether it's == gBrowser.contentDocument?  Or both?
> 
> We could also add try/catch around the relevant calls into gFindBar to see
> which one this is happening in, right?

Pushed http://hg.mozilla.org/mozilla-central/rev/8e79b68f2d6c, and will back it out when I return home from work today - I'm sure by then we'll have more than enough failures.
Yeah, so the test is indeed starting too early, responding to a pageshow event from an iframe...

7450 INFO TEST-PASS | chrome://mochikit/content/chrome/toolkit/content/tests/chrome/test_bug451286.xul | [Bug 562326] pageshow event received for data:text/html,text
7451 INFO TEST-PASS | chrome://mochikit/content/chrome/toolkit/content/tests/chrome/test_bug451286.xul | [Bug 562326] Event target gBrowser.contentDocument? false

and in the cases where it's failing the second iframe presumably hasn't been parsed yet...
NEXT ERROR 7507 ERROR TEST-UNEXPECTED-FAIL | chrome://mochikit/content/chrome/toolkit/content/tests/chrome/test_bug451286.xul | [Bug 562326] Mark this failure as Bug 526326...
7508 ERROR TEST-UNEXPECTED-FAIL | chrome://mochikit/content/chrome/toolkit/content/tests/chrome/test_bug451286.xul | [Bug 562326] First toggleHighlight call failed [Exception... "Parameter is not an object"  code: "1003" nsresult: "0x805303eb (NS_ERROR_DOM_NOT_OBJECT_ERR)"  location: "chrome://global/content/bindings/findbar.xml Line: 946"]
JavaScript error: chrome://mochikit/content/chrome/toolkit/content/tests/chrome/bug451286_window.xul, line 144: ifBody is null

I'd hope this'll fix the orange, but we should take it in any case, the test starting based on a subframe pageshow is bogus.
Attachment #445413 - Flags: review?(gavin.sharp)
Attachment #445413 - Flags: review?(gavin.sharp) → review+
Backed out the temporary code:
http://hg.mozilla.org/mozilla-central/rev/984404851f87

and pushed the patch:
http://hg.mozilla.org/mozilla-central/rev/2c63457bc391

I'll close this out tomorrow morning - given the frequency of this failure, if it hasn't failed in ~12 hours we can assume it's fixed.

Gavin, thanks, was gambling (correctly) that you'd quickly give an r+ in time to push this with the debugging backout.
I haven't seen any failures since. The two logs above were from builds from changesets before I pushed.

Please reopen if any more failures pop up.
Assignee: hsivonen → graememcc_firefox
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Flags: in-testsuite+
Target Milestone: --- → mozilla1.9.3a5
Version: unspecified → Trunk
Attachment #443082 - Attachment is obsolete: true
Attachment #443082 - Flags: review?(bzbarsky)
Whiteboard: [orange]
You need to log in before you can comment on or make changes to this bug.