Bug 450637
Opened 16 years ago
Closed 16 years ago
Single reftest fails on win32 unittest VMs but passes on win32 hardware
Categories
(Core :: Layout, defect)
VERIFIED
FIXED
People
(Reporter: lsblakk, Assigned: jruderman)
References
Details
(Keywords: verified1.9.0.4)
Attachments
(1 file)
1.21 KB, patch
On the morning of Tuesday, August 12th, we moved the 1.9 unittest boxes over to a new network. Since getting the buildslaves up and running again, the same test has failed repeatedly on all three win32 boxes:
REFTEST UNEXPECTED FAIL: file:///E:/slave/trunk_2k3_pgo/mozilla/layout/reftests/bugs/212563-1.html
Any information on what could be causing this is appreciated.
Comment 1•16 years ago
This failure is keeping the 1.9 tree closed (I'm closing it now) right before code freeze tomorrow.
Updated•16 years ago
Severity: normal → blocker
Comment 2•16 years ago
This was "fixed" by Lukas by bringing back old machines. Do we want to close this or keep investigating the cause later?
Comment 3•16 years ago
What was the rest of the failure log for that test? Like the two data: URIs?
Comment 4•16 years ago
The test was just showing an empty iframe, without "foo".
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.0/1218748797.1218756216.5978.gz#err0
Reporter
Comment 5•16 years ago
This is still failing on fx-win32-1.9-slave09 though - the pgo box. So unless we can switch that one to a physical box and see if it stops happening, this bug is still valid.
Assignee
Updated•16 years ago
Assignee: nobody → jruderman
Comment 6•16 years ago
Jesse and I discussed this, and I believe it's a race condition in the test that's related to it being a PGO build run in a VM. Reftest takes the snapshot onload of the iframe, but in this test, the iframe document rewrites the iframe onload, so I think the behavior might be undefined. I think we should change the test to bubble up an event from the iframe when it's finished rewriting the frame; the parent can then use the "reftest-wait" class and remove it upon receiving the event from the frame.
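The proposal above can be sketched roughly as follows. This is a simplified model with no real DOM, not the actual reftest harness: `FakeRoot`, `harnessMaySnapshot`, and `onFrameRewritten` are hypothetical names, and the only behavior taken from the discussion is that the snapshot is held back while the root element carries the "reftest-wait" class.

```javascript
// Simplified stand-in for the test document's root element (hypothetical).
class FakeRoot {
  constructor(className) {
    this.className = className;
  }
}

// Model of the rule: snapshot only after load has fired AND the root
// element no longer carries the "reftest-wait" class.
function harnessMaySnapshot(root, loadFired) {
  const classes = root.className.split(/\s+/).filter(Boolean);
  return loadFired && !classes.includes("reftest-wait");
}

// What the parent would do when the iframe signals that it has finished
// rewriting itself: drop "reftest-wait" so the snapshot can proceed.
function onFrameRewritten(root) {
  root.className = root.className
    .split(/\s+/)
    .filter((c) => c && c !== "reftest-wait")
    .join(" ");
}

const root = new FakeRoot("reftest-wait");
const beforeSignal = harnessMaySnapshot(root, true); // load fired, still waiting
onFrameRewritten(root);
const afterSignal = harnessMaySnapshot(root, true); // class removed
```

In this model the load event alone is never enough; the test itself decides when it is ready, which removes the dependence on how quickly the rewrite completes.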
Comment 7•16 years ago
Why would that cause a race? The onload for the iframe should fire, the rewriting happen, then the snapshot be taken. Does the rewriting start new loads or something?
Comment 8•16 years ago
The parent document onload is what triggers reftest to take the snapshot. Clearly something bad happens here, as the failure mode is that the test document's snapshot comes out with an empty iframe.
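A toy model of the suspected ordering problem, assuming (as hypothesized in this thread, not confirmed) that the iframe's rewrite can complete after the parent's load-triggered snapshot. `runScenario` and the step names are made up for illustration.

```javascript
// Run named steps in the given order against shared state and return what
// the snapshot captured (hypothetical model, no real DOM involved).
function runScenario(order) {
  const state = { iframeContent: "", snapshot: null };
  const steps = {
    // The iframe's rewrite finishes and "foo" becomes visible.
    rewriteDone: () => { state.iframeContent = "foo"; },
    // The parent's load event fires and the snapshot is taken.
    parentLoadSnapshot: () => { state.snapshot = state.iframeContent; },
  };
  for (const name of order) steps[name]();
  return state.snapshot;
}

// Rewrite wins the race: the snapshot contains "foo".
const rewriteFirst = runScenario(["rewriteDone", "parentLoadSnapshot"]);
// Snapshot wins the race: the iframe is still empty when captured.
const snapshotFirst = runScenario(["parentLoadSnapshot", "rewriteDone"]);
```

The second ordering matches the observed failure mode, an iframe without "foo", though the model says nothing about why a VM would favor it.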
Assignee
Comment 9•16 years ago
bz, the onload "rewriting" is a document.write. Does that count as a "new load" that's allowed to happen asynchronously?
Comment 10•16 years ago
Hmm. document.write might be async in some cases in terms of the DOM appearing (even in cases when it's not writing out <script>), yes. I'm actually not sure. Blake, do you know?
Ted, I realize something is going wrong. I just want to make sure it's not a bug that webpages would run into.
Assignee
Comment 11•16 years ago
Done:
* Checked that it still crashes Mozilla 1.6 Alpha 1.
* Checked that the reftest still passes on my machine.
To do:
* Make sure that "document.write that blows away the document" is allowed to be asynchronous (bz/mrbkap).
* Find out whether this change actually fixes the problem on the PGO box.
Reporter
Comment 12•16 years ago
Something fun to add to the mix:
On unittest 1.9 staging (see Tinderbox tree UnitTest), the Windows VM fx-win32-1.9-slave07 has also been failing the single reftest, but the other Windows VM, fx-win32-1.9-slave08, which does PGO builds, is not.
These two VMs were created at the same time and afaik have the same configuration.
Reporter
Comment 13•16 years ago
fx-win32-1.9-slave07 just passed this reftest on a build starting at 9am PDT on staging-master
Reporter
Comment 14•16 years ago
that was an anomaly - it's back to failing again.
Comment 16•16 years ago
FWIW, the clobber build (from bug 454696) at 2008/09/10 16:27 was green (i.e. it passed the reftest for bug 212563), but the subsequent three runs all failed it. Could it matter if dist/bin is blown away each run?
Comment 17•16 years ago
(In reply to comment #11)
> Created an attachment (id=334355) [details]
> patch: change the reftest to use reftest-wait
>
> Done:
> * Checked that it still crashes Mozilla 1.6 Alpha 1.
> * Checked that the reftest still passes on my machine.
>
> To do:
> * Make sure that "document.write that blows away the document" is allowed to be
> asynchronous (bz/mrbkap).
> * Find out whether this change actually fixes the problem on the PGO box.
Any progress here? This machine is pretty much perma-orange on the Firefox 3.0 tinderbox.
Assignee
Comment 18•16 years ago
I think mrbkap will look soon.
Comment 19•16 years ago
bz and I talked on IRC the other day. We decided that this fix is correct (we shouldn't be relying on the parser not to interrupt itself during document.write).
Reporter
Comment 20•16 years ago
Changed summary to reflect that this is occurring on both 1.9 and mozilla-central on VMs only.
Summary: Single reftest fail on win32 unittest 1.9 boxes since network switch → Single reftest fail on win32 unittest VMs
Comment 21•16 years ago
(In reply to comment #19)
> bz and I talked on IRC the other day. We decided that this fix is correct (we
> shouldn't be relying on the parser not to interrupt itself during
> document.write).
Hi,
Sorry, I didn't follow. Are you saying that this test failure is expected?
Note: we are seeing consistent failure on VMs, on both 1.9 and m-c, and consistent passing on physical hardware, on both 1.9 and m-c.
For background, note that we are migrating a bunch of machines from QA network to Build network, and are currently blocked mid-migration trying to figure out what is causing this test failure. Hence the urgency.
Updated•16 years ago
Summary: Single reftest fail on win32 unittest VMs → Single reftest fails on win32 unittest VMs but passes on win32 hardware
Comment 22•16 years ago
(In reply to comment #21)
> Sorry, I didn't follow. Are you saying that this test failure is expected?
Yes.
Assignee
Comment 23•16 years ago
I checked in the test change (hg & cvs). I hope it fixes the orange!
Reporter
Comment 24•16 years ago
So what happens now is that the first build goes green and then subsequent builds continue to fail this reftest - on both 1.9 and m-c, still only an issue on VMs.
Should have more results on this tomorrow morning after a few builds have cycled to see if it ever passes again without intervention.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee
Comment 25•16 years ago
Whoa:
REFTEST UNEXPECTED FAIL (LOADING): file:///E:/slave/trunk_2k3_pgo/mozilla/layout/reftests/bugs/212563-1.html
That's not at all what I was expecting, but I guess it's consistent with the "foo in iframe" vs "blank iframe" reftest failures we saw before my test change went in.
I guess this timing issue somehow manages to stop one of the onloads from firing. I added some dumps so we can at least figure out which one. I hope the unit test build machines on which this test fails have dump enabled (debug builds or browser.dom.window.dump.enabled).
New theories:
* the innermost frame's onload fires before the middle frame finishes HTML-parsing, causing the document.write() to append to the <frameset> document instead of replacing it with an <html> document.
* the innermost frame's onload 'd.close();' call fails because of xpconnect being weird.
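The first theory can be modeled roughly like this. It is a toy sketch, not Gecko's parser: `makeDoc` and `modelWrite` are hypothetical, the markup strings are placeholders, and the only assumption is the one the theory states, namely that a write to a document that is still being parsed is appended, while a write to a finished document replaces it.

```javascript
// Hypothetical model of a document: its markup plus a flag saying whether
// the parser is still running.
function makeDoc(markup, stillParsing) {
  return { markup, stillParsing };
}

// Model of document.write() under the theory's assumption: append while
// the parser is active, replace once parsing has finished.
function modelWrite(doc, text) {
  if (doc.stillParsing) {
    doc.markup += text;
  } else {
    doc.markup = text;
  }
  return doc.markup;
}

// Intended case: the middle frame has finished parsing, so the write
// replaces the <frameset> document.
const replaced = modelWrite(makeDoc("<frameset>...</frameset>", false), "<html>foo</html>");
// Raced case: the middle frame is still parsing, so the write is appended.
const appended = modelWrite(makeDoc("<frameset>...", true), "<html>foo</html>");
```

If the real behavior resembles the raced case, the written content ends up inside a frameset document, which would be consistent with the blank iframe seen in the failures.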
Assignee
Comment 26•16 years ago
I wish reftest "loading" errors said whether the test didn't finish loading, the test continued to have "reftest-wait", or both.
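A sketch of the kind of reporting wished for here. `describeLoadingFailure` and its message wording are purely hypothetical; nothing below reflects what the reftest harness actually prints.

```javascript
// Hypothetical helper: name which of the two conditions kept the test in
// the "loading" state, or both.
function describeLoadingFailure(loadFired, stillHasReftestWait) {
  const reasons = [];
  if (!loadFired) reasons.push("load event never fired");
  if (stillHasReftestWait) reasons.push('"reftest-wait" class still present');
  if (reasons.length === 0) return "no loading problem detected";
  return reasons.join("; ");
}

const onlyLoad = describeLoadingFailure(false, false);
const onlyWait = describeLoadingFailure(true, true);
const both = describeLoadingFailure(false, true);
```

Distinguishing the cases would show directly whether an onload failed to fire or the test never removed its wait marker.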
Comment 27•16 years ago
Theory 1 sounds pretty plausible to me, for what it's worth.
Assignee
Comment 28•16 years ago
I wasn't able to make a test that was both deterministic and capable of crashing Mozilla 1.6 Alpha 1. So instead I split it into two tests, one of which is deterministic, and one of which crashes Mozilla 1.6 Alpha 1 and happens to pass regardless of who wins the race.
Fun fact: the deterministic testcase fails in Mozilla 1.6 Alpha 1 for an entirely different reason, involving privileges and/or xpconnect being weird ;)
Assignee
Comment 29•16 years ago
Assignee
Comment 30•16 years ago
I think that greened it.
Status: REOPENED → RESOLVED
Closed: 16 years ago → 16 years ago
Resolution: --- → FIXED
Reporter
Comment 31•16 years ago
so much green it hurts the eyes. thanks Jesse!
Status: RESOLVED → VERIFIED
Comment 32•16 years ago
"It's not easy being green"... but it sure looks very very very nice. Thanks Jesse!
Comment 33•16 years ago
(In reply to comment #31)
> so much green it hurts the eyes. thanks Jesse!
I take it, Lukas, that I can mark this as verified by you for 1.9.0.4?
Comment 34•16 years ago
The beautiful greenness of the Firefox3.0 tree (minus bug 460474) is good enough for me!
Keywords: fixed1.9.0.4 → verified1.9.0.4