Closed Bug 589457 Opened 14 years ago Closed 13 years ago

Intermittent failure in test_x-frame-options.html | test deny - got true, expected false and/or test sameorigin2 - got [object HTMLHeadingElement], expected null

Categories

(Core :: DOM: Core & HTML, defect)

x86
All
defect
Not set
normal

Tracking

()

RESOLVED FIXED
Tracking Status
blocking2.0 --- .x+

People

(Reporter: philor, Assigned: bsterne)

References

Details

(Keywords: assertion, intermittent-failure, Whiteboard: [softblocker][fixed by bug 632835])

Attachments

(3 files, 1 obsolete file)

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1282356182.1282357263.25769.gz
Rev3 WINNT 6.1 mozilla-central opt test mochitests-1/5 on 2010/08/20 19:03:02
s: talos-r3-w7-014

30412 INFO TEST-START | /tests/content/base/test/test_x-frame-options.html
30413 INFO TEST-PASS | /tests/content/base/test/test_x-frame-options.html | test control1 - true should equal true
30414 INFO TEST-PASS | /tests/content/base/test/test_x-frame-options.html | test control2 - true should equal true
30415 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_x-frame-options.html | test deny - got true, expected false
30416 INFO TEST-PASS | /tests/content/base/test/test_x-frame-options.html | test sameorigin1 - true should equal true
30417 INFO TEST-PASS | /tests/content/base/test/test_x-frame-options.html | test sameorigin2 - false should equal false
30418 INFO SimpleTest finished /tests/content/base/test/test_x-frame-options.html in 548ms
Summary: Intermittent failure in test_x-frame-options.html | test deny - got true, expected false → Intermittent failure in test_x-frame-options.html | test deny - got true, expected false and/or test sameorigin2 - got true, expected false
Nomming since it's an apparently unexamined failure in an apparently security-related test.
blocking2.0: --- → ?
bsterne, sounds like you can probably catch this in record/replay
Assignee: nobody → bsterne
blocking2.0: ? → final+
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1284746057.1284750321.18196.gz
30521 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_x-frame-options.html | test sameorigin2 - got true, expected false
Attached patch fixSplinter Review
bent and I were unable to catch this test error in in his VM after a few nights of trying.  I have a hypothesis about the cause of the bug, and we can reason a few things about the behavior we're seeing, so ideally if I push this fix and we don't see the random oranges for a sufficient period of time we can resolve this bug.

Note all the failures are of the form "got true, expected false" meaning there was a node in a document that we didn't expect to be there (because it should have been blocked by X-F-O).  It's possible that the getElementById call failed without returning something, so |test| winds up still holding a reference to the previous test element.  This patch unrolls the loop of tests and doesn't reuse the |test| variable, instead using a unique variable name for each subtest.  I also changed the expected values to be unique where possible (e.g. not null) to help with further debugging if this patch doesn't fix the issue.
Attachment #488359 - Flags: review?(jst)
Attachment #488359 - Flags: review?(jst) → review+
Summary: Intermittent failure in test_x-frame-options.html | test deny - got true, expected false and/or test sameorigin2 - got true, expected false → Intermittent failure in test_x-frame-options.html | test deny - got true, expected false and/or test sameorigin2 - got [object HTMLHeadingElement], expected null
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey/1291709174.1291711151.28262.gz&fulltext=1
OS X 10.6 comm-central-trunk debug test mochitests-1/5 on 2010/12/07 00:06:14
{
...
Security Error: Content at http://example.com/tests/content/base/test/file_x-frame-options_page.sjs?testid=sameorigin2&xfo=sameorigin may not load data from http://mochi.test:8888/tests/?autorun=1&closeWhenDone=1&consoleLevel=INFO&totalChunks=5&thisChunk=1&chunkByDir=4.
...
###!!! ASSERTION: XPConnect is being called on a scope without a 'Components' property!: 'Error', file /builds/slave/comm-central-trunk-macosx64-debug/build/mozilla/js/src/xpconnect/src/xpcwrappednativescope.cpp, line 779
DEBUG_CheckForComponentsInScope [js/src/xpconnect/src/xpcwrappednativescope.cpp:780]
XPCWrappedNativeScope::FindInJSObjectScope [js/src/xpconnect/src/xpcwrappednativescope.cpp:838]
XPCConvert::NativeInterface2JSObject [js/src/xpconnect/src/xpcconvert.cpp:1164]
XPCConvert::NativeData2JS [js/src/xpconnect/src/xpcconvert.cpp:489]
XPCConvert::NativeData2JS [js/src/xpconnect/src/xpcprivate.h:3189]
nsXPCWrappedJSClass::CallMethod [js/src/xpconnect/src/xpcwrappedjsclass.cpp:1587]
nsXPCWrappedJS::CallMethod [js/src/xpconnect/src/xpcwrappedjs.cpp:588]
PrepareAndDispatch [xpcom/reflect/xptcall/src/md/unix/xptcstubs_x86_64_darwin.cpp:153]
...
31019 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/test_x-frame-options.html | test deny - got [object HTMLHeadingElement @ 0x11e70d3a0 (native @ 0x11b466680)], expected null
}
Keywords: assertion
OS: Windows 7 → All
Attached patch add loggingSplinter Review
Adds some HTTP logging to hopefully debug the random orange.
Attachment #501167 - Flags: review?(jst)
Whiteboard: [orange] → [orange][softblocker]
Attachment #501167 - Flags: review?(jst) → review+
** PRODUCT DRIVERS PLEASE NOTE **

This bug is one of 7 automatically changed from blocking2.0:final+ to blocking2.0:.x during the endgame of Firefox 4 for the following reasons:

 - it was marked as a soft blocking issue without a requirement for beta coverage
blocking2.0: final+ → .x+
Brandon, have you had time to investigate this after the logging patch you landed in comment 278?  This is one of the most frequent oranges right now on our test machines, and we need to investigate it as soon as we can! :-)
I read the test's code to figure out if I can see anything obviously wrong.  I saw something which I didn't like:

<http://mxr.mozilla.org/mozilla-central/source/content/base/test/file_x-frame-options_page.sjs#15>

This seems to be a loop injected for the purpose of generating a delay in the server's response.  Is this intended?  If yes, then it's a bad way of doing that; you should use a timer instead.  I removed it locally, and the test seemed to work fine (albeit a bit faster, which is expected!).
I've looked many hours at this bug.  bent and I were never able to catch it on the record-and-replay box even letting it run all night.

(In reply to comment #460)
> <http://mxr.mozilla.org/mozilla-central/source/content/base/test/file_x-frame-options_page.sjs#15>
> 
> This seems to be a loop injected for the purpose of generating a delay in the
> server's response.  Is this intended?  If yes, then it's a bad way of doing
> that; you should use a timer instead.

I think I wrote it that way because I had heard that timers should be avoided in mochitests, but I can see now that our mochitest docs have recommended standard ways to use timers.

> I removed it locally, and the test
> seemed to work fine (albeit a bit faster, which is expected!).

I'm happy to switch that to a timer, but I'm not sure the hypothesis is valid.  Note the "sameorigin" case doesn't have a timer, but there are still test failures that follow that branch.  Then again, maybe the test harness gets screwed up somehow by that loop.  I'll make a patch to switch the loop to a timer, but it's unclear when I'll be able to land that (I'm happy to land it whenever I can get clearance).

Thanks for poking the bug.  I'd be happy to entertain any other suggestions for fixing it.
We've unilaterally declared that the current tree state includes the usual exception that test-only patches can land without approval, and gotten away with it, so as long as you're not touching code, just tests, you already have clearance.
Attached patch switch loop to nsITimer (obsolete) — Splinter Review
Removes the loop in the "deny" case and moves the whole response into an nsITimer.
Attachment #519672 - Flags: review?(ehsan)
Attached patch Remove the delaySplinter Review
Attachment #519672 - Attachment is obsolete: true
Attachment #519681 - Flags: review?(ehsan)
Attachment #519672 - Flags: review?(ehsan)
Comment on attachment 519681 [details] [diff] [review]
Remove the delay

r=me, although I have a hard time seeing how this solves anything.  But let's not lose our hopes!  ;-)
Attachment #519681 - Flags: review?(ehsan) → review+
Brandon, please note that you're creating a timer in a local scope.  There exists the possibility that it could be GCed before firing; please don't don't that.
(In reply to comment #466)
> Brandon, please note that you're creating a timer in a local scope.  There
> exists the possibility that it could be GCed before firing; please don't don't
> that.

He's not going to land that patch.
As far as I read those failure logs:

In the case of sameorigin2 failure, sameorigin1 is loaded twice.
In the case of deny failure, control2 is loaded twice.
(In reply to comment #497)
> As far as I read those failure logs:
> 
> In the case of sameorigin2 failure, sameorigin1 is loaded twice.
> In the case of deny failure, control2 is loaded twice.

That is indeed true!  It looks to me like this failure could be caused by the sameorigin1 document to be loaded into sameorigin2 iframe (or the control2 document being loaded into the deny iframe.

This is _really_ bad.  Brandon, can you determine what might cause this?  At this point, I really think this is a code problem...
(In reply to comment #502)
> This is _really_ bad.  Brandon, can you determine what might cause this?  At
> this point, I really think this is a code problem...

I am still working on this, everyone.  Sorry for the trouble.  This is my top priority code-wise right now.

I'm currently using Clint's patch from bug 603517 to run this (and a handful of other tests) in a loop until I hopefully catch it.  It has run several hundred times already :-/  I wonder if it's possible that one of the majority of tests that I've disabled currently is hosing the test harness before test_x-frame-options runs?
(In reply to comment #504)
> (In reply to comment #502)
> > This is _really_ bad.  Brandon, can you determine what might cause this?  At
> > this point, I really think this is a code problem...
> 
> I am still working on this, everyone.  Sorry for the trouble.  This is my top
> priority code-wise right now.
> 
> I'm currently using Clint's patch from bug 603517 to run this (and a handful of
> other tests) in a loop until I hopefully catch it.  It has run several hundred
> times already :-/  I wonder if it's possible that one of the majority of tests
> that I've disabled currently is hosing the test harness before
> test_x-frame-options runs?

That's a possibility.  Another thing that you can try is to file a bug with releng to get access to one of the machines on which this failure happens, and run Firefox exactly as it runs on our test machines.
(In reply to comment #504)
> (In reply to comment #502)
> > This is _really_ bad.  Brandon, can you determine what might cause this?  At
> > this point, I really think this is a code problem...
> 
> I am still working on this, everyone.  Sorry for the trouble.  This is my top
> priority code-wise right now.
> 
> I'm currently using Clint's patch from bug 603517 to run this (and a handful of
> other tests) in a loop until I hopefully catch it.  It has run several hundred
> times already :-/  I wonder if it's possible that one of the majority of tests
> that I've disabled currently is hosing the test harness before
> test_x-frame-options runs?

Have you tried running the entire test suite in a loop?
(In reply to comment #506)
> Have you tried running the entire test suite in a loop?

What I'm doing currently is running the exact chunk of tests that was running on Tinderbox during one of the most recent failures.
I am now suspecting this failure has been fixed in mozilla-central because there is no failure in mozilla-central after comment #489.

I have no idea which changeset fixed this though.
(In reply to comment #507)
> What I'm doing currently is running the exact chunk of tests that was running
> on Tinderbox during one of the most recent failures.

I ran that chunk of tests 500 times last night on the record and reply box with no failures of this test :-|


(In reply to comment #508)
> I am now suspecting this failure has been fixed in mozilla-central because
> there is no failure in mozilla-central after comment #489.
> 
> I have no idea which changeset fixed this though.

(In reply to comment #509)
> Maybe bug 632835?

That sounds like quite a good hypothesis, actually.  Bug 632835 comment 3 has a testcase where reloading the page causes iframes to be duplicated and misplaced.  That would be consistent to comment 497 where hiro was seeing duplicate documents show up in the test where there should have been none.

I'll hopefully await others' opinions.
Comment 510 is a fair assumption, I'd say.  So I'm going to announce this as FIXED unless it happens again.
Status: NEW → RESOLVED
Closed: 13 years ago
Depends on: 632835
Resolution: --- → FIXED
Whiteboard: [orange][softblocker] → [orange][softblocker][fixed by bug 632835]
Whiteboard: [orange][softblocker][fixed by bug 632835] → [softblocker][fixed by bug 632835]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: