Closed Bug 1284763 Opened 3 years ago Closed 2 years ago

Intermittent dom/base/test/test_anchor_area_referrer.html | Test timed out.

Categories

(Core :: DOM: Core & HTML, defect, P3)

defect

Tracking

()

RESOLVED FIXED
mozilla55
Tracking Status
firefox52 --- wontfix
firefox-esr52 --- unaffected
firefox53 --- fixed
firefox54 --- fixed
firefox55 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: gbrown)

References

Details

(Keywords: intermittent-failure, Whiteboard: [stockwell fixed])

Attachments

(1 file)

Bulk assigning P3 to all open intermittent bugs without a priority set in Firefox components per bug 1298978.
Priority: -- → P3
See Also: → 1305622
wow this spiked on March 7th, all on android debug mochitest-1, I have done some retriggers:
https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&filter-searchStr=Android%204.3%20API15%2B%20debug%20Mochitests%20executed%20by%20TaskCluster%20test-android-4.3-arm7-api-15%2Fdebug-mochitest-1%20tc-M(1)&tochange=5891f51354b2aee761210589d35b2bd92470ba54&fromchange=1a9059a55ce0d12376465994f43060cb7cc537a5&selectedJob=82733088

I suspect in a couple hours I will have data here, ni? myself to report on the root cause.
Flags: needinfo?(jmaher)
Whiteboard: [stockwell needswork]
the culprit is bug 1335751 (which I know :jonco has worked hard to fix the gc::GRAY assertions), so why would we be causing some tests to time out.

Specifically this bug (dom/base/test/test_anchor_area_referrer.html) and bug 1305622 (dom/base/test/test_bug704320.html) are perma failing on android debug.  I assume that is the full scope of this.

for this bug, I see this in the log file:
[task 2017-03-09T12:42:09.525751Z] 12:42:09     INFO -  337 INFO TEST-PASS | dom/base/test/test_anchor_area_referrer.html | origin in meta strict-origin in attr --- origin (https://example.com/)
[task 2017-03-09T12:42:09.526076Z] 12:42:09     INFO -  Buffered messages logged at 12:41:58
[task 2017-03-09T12:42:09.526698Z] 12:42:09     INFO -  338 INFO {"origin-in-meta-strict-origin-when-cross-origin-in-attr":{"referrer":"https://example.com/tests/dom/base/test/referrer_testserver.sjs?ACTION=generate-area-policy-test&NAME=origin-in-meta-strict-origin-when-cross-origin-in-attr&ATTRIBUTE_POLICY=strict-origin-when-cross-origin&META_POLICY=origin&SCHEME_FROM=https&SCHEME_TO=https","policy":"full","expected":"strict-origin-when-cross-origin"}}
[task 2017-03-09T12:42:09.526954Z] 12:42:09     INFO -  339 INFO TEST-PASS | dom/base/test/test_anchor_area_referrer.html | origin-in-meta-strict-origin-when-cross-origin-in-attr tests have to be performed.
[task 2017-03-09T12:42:09.527264Z] 12:42:09     INFO -  340 INFO TEST-PASS | dom/base/test/test_anchor_area_referrer.html | origin in meta strict-origin-when-cross-origin in attr --- full (https://example.com/tests/dom/base/test/referrer_testserver.sjs?ACTION=generate-area-policy-test&NAME=origin-in-meta-strict-origin-when-cross-origin-in-attr&ATTRIBUTE_POLICY=strict-origin-when-cross-origin&META_POLICY=origin&SCHEME_FROM=https&SCHEME_TO=https)
[task 2017-03-09T12:42:09.527492Z] 12:42:09     INFO -  Buffered messages finished
[task 2017-03-09T12:42:09.527759Z] 12:42:09     INFO -  341 INFO TEST-UNEXPECTED-FAIL | dom/base/test/test_anchor_area_referrer.html | Test timed out.
[task 2017-03-09T12:42:09.528003Z] 12:42:09     INFO -      reportError@SimpleTest/TestRunner.js:120:7
[task 2017-03-09T12:42:09.528308Z] 12:42:09     INFO -      TestRunner._checkForHangs@SimpleTest/TestRunner.js:141:7
[task 2017-03-09T12:42:09.528820Z] 12:42:09     INFO -  342 INFO {"strict-origin-when-cross-origin-with-origin-in-meta":{"referrer":"http://example.com/","policy":"origin","expected":"strict-origin-when-cross-origin"}}
[task 2017-03-09T12:42:09.529012Z] 12:42:09     INFO -  343 INFO TEST-PASS | dom/base/test/test_anchor_area_referrer.html | strict-origin-when-cross-origin-with-origin-in-meta tests have to be performed.
[task 2017-03-09T12:42:09.529334Z] 12:42:09     INFO -  344 INFO TEST-PASS | dom/base/test/test_anchor_area_referrer.html | strict-origin-when-cross-origin with origin in meta --- origin (http://example.com/)
[task 2017-03-09T12:42:19.953445Z] 12:42:19     INFO -  345 INFO TEST-OK | dom/base/test/test_anchor_area_referrer.html | took 310578ms

we are failing at this location in the test:
https://dxr.mozilla.org/mozilla-central/source/dom/base/test/test_anchor_area_referrer.html?q=path%3Atest_anchor_area_referrer.html&redirect_type=single#95

which is the first time we go from https -> http in that test case.



oddly enough in the other failure, I see:
[task 2017-03-09T06:03:03.335995Z] 06:03:03     INFO -  280 INFO TEST-PASS | dom/base/test/test_bug704320.html | window.location http-to-https no-referrer-when-downgrade
[task 2017-03-09T06:03:03.336604Z] 06:03:03     INFO -  281 INFO TEST-PASS | dom/base/test/test_bug704320.html | window.location http-to-https no-referrer
[task 2017-03-09T06:03:03.337197Z] 06:03:03     INFO -  282 INFO TEST-PASS | dom/base/test/test_bug704320.html | window.location http-to-https unsafe-url
[task 2017-03-09T06:03:03.337783Z] 06:03:03     INFO -  283 INFO TEST-PASS | dom/base/test/test_bug704320.html | window.location http-to-https origin
[task 2017-03-09T06:03:03.338378Z] 06:03:03     INFO -  284 INFO TEST-PASS | dom/base/test/test_bug704320.html | window.location http-to-https origin-when-cross-origin
[task 2017-03-09T06:03:03.338977Z] 06:03:03     INFO -  285 INFO TEST-PASS | dom/base/test/test_bug704320.html | window.location http-to-https same-origin
[task 2017-03-09T06:03:03.339526Z] 06:03:03     INFO -  286 INFO TEST-PASS | dom/base/test/test_bug704320.html | window.location http-to-https strict-origin
[task 2017-03-09T06:03:03.340019Z] 06:03:03     INFO -  287 INFO TEST-PASS | dom/base/test/test_bug704320.html | window.location http-to-https strict-origin-when-cross-origin
[task 2017-03-09T06:03:03.340400Z] 06:03:03     INFO -  Buffered messages finished
[task 2017-03-09T06:03:03.341101Z] 06:03:03     INFO -  288 INFO TEST-UNEXPECTED-FAIL | dom/base/test/test_bug704320.html | Test timed out.
[task 2017-03-09T06:03:03.341615Z] 06:03:03     INFO -      reportError@SimpleTest/TestRunner.js:120:7
[task 2017-03-09T06:03:03.342165Z] 06:03:03     INFO -      TestRunner._checkForHangs@SimpleTest/TestRunner.js:141:7
[task 2017-03-09T06:03:03.342740Z] 06:03:03     INFO -  289 INFO TEST-PASS | dom/base/test/test_bug704320.html | stylesheet http-to-http no-referrer-when-downgrade
[task 2017-03-09T06:03:03.343441Z] 06:03:03     INFO -  290 INFO TEST-PASS | dom/base/test/test_bug704320.html | stylesheet http-to-http no-referrer
[t


which is failing in the same check!  This is good, a single problem (although do we have duplicate coverage?!!).  I am not clear why the JS changes would cause this failure.

both use .sjs files:
https://dxr.mozilla.org/mozilla-central/source/dom/base/test/referrer_testserver.sjs
https://dxr.mozilla.org/mozilla-central/source/dom/base/test/bug704320.sjs

possibly that is our culprit.

:jonco, can you think of anything that would cause this?
Blocks: 1335751
Flags: needinfo?(jmaher) → needinfo?(jcoppeard)
It's possible that the extra checks in bug 1335751 have pushed this test just over the timeout limit.

Steve do you think you could take a quick look since I'm away?

It's really unlikely that the assertions I added are causing a timeout themselves.
Flags: needinfo?(jcoppeard) → needinfo?(sphink)
I overlooked the obvious that we are timing out and might have been already a long running test:
Test durations for dom/base/test/test_anchor_area_referrer.html on mozilla-central,mozilla-inbound,autoland between 2017-03-02 and 2017-03-09
android-4-3-armv7-api15/debug-chunked: 269.77 s (238.27 s - 318.97 s over 64 runs)

I think on android we have a 300s timeout, so it looks like we were pretty close to it to start with.
Yes, it looks like the added checks are triggering at least half a dozen tests to go over the timeout. It very much looks like it's "just" a slowdown, not anything more subtle.

I'm looking at these right now to try to figure out what the right tradeoff might be. I may take the coward's way out and have it only do the mark stack walking check 1 out of every k times, at least for now.
Flags: needinfo?(sphink)
gbrown, how problematic are the additional Android timeouts? Now I'm thinking of maybe skipping the extra checks partially or entirely on Android, since they're so slow to start with. It would reduce GC test coverage on Android, which isn't great.
Flags: needinfo?(gbrown)
The change slowed down Android Debug mochitests collectively by about 20%; we increased the test "chunks" and now those jobs are no longer timing out (bug 1345350).

There are also specific test failures:
 - this bug
 - bug 1282207
 - bug 1292735
 - bug 1305622

I wouldn't be surprised if there are a few more, but I don't think there are many. We could probably handle the test failures by increasing test timeouts or perhaps skipping a few tests on Android Debug - probably not a big deal.

On the other hand, Android Debug was already a very slow platform, and I hate to see it get slower.
Flags: needinfo?(gbrown)
Depends on: 1346874
This is looking pretty bad. At least for now, my vote is for disabling the extra checks on Android debug. (Bug 1346874)
as a note, the fix for bug 1346874 landed yesterday, so we should see a reduction here.
no reduction yet, let me see if a fix is pending, otherwise a temporary disabling sounds like a next step.
oh, this really dropped this weekend- will revisit after a couple high volume days
(In reply to Joel Maher ( :jmaher) from comment #27)
> oh, this really dropped this weekend- will revisit after a couple high
> volume days

Yes, I would expect this one to be much improved with the second attempt in bug 1346874.
a few instances remain, probably unrelated to gc::GREY stuff, just old fashioned issues now.
Whiteboard: [stockwell needswork] → [stockwell fixed]
Certainly this is much improved since bug 1346974, but failures continue, always Android Debug.

Logs usually show the test progressing a few seconds before the timeout:

[task 2017-03-30T10:04:17.119605Z] 10:04:17     INFO -  Buffered messages logged at 10:03:48
[task 2017-03-30T10:04:17.120036Z] 10:04:17     INFO -  159 INFO {"strict-origin-when-cross-origin-with-origin-in-meta":{"referrer":"http://example.com/","policy":"origin","expected":"strict-origin-when-cross-origin"}}
[task 2017-03-30T10:04:17.120465Z] 10:04:17     INFO -  160 INFO TEST-PASS | dom/base/test/test_anchor_area_referrer.html | strict-origin-when-cross-origin-with-origin-in-meta tests have to be performed.
[task 2017-03-30T10:04:17.120898Z] 10:04:17     INFO -  161 INFO TEST-PASS | dom/base/test/test_anchor_area_referrer.html | strict-origin-when-cross-origin with origin in meta --- origin (http://example.com/)
[task 2017-03-30T10:04:17.121156Z] 10:04:17     INFO -  Buffered messages logged at 10:03:59
[task 2017-03-30T10:04:17.121411Z] 10:04:17     INFO -  162 INFO {"same-origin-with-origin-in-meta":{"referrer":"","policy":"none","expected":"same-origin"}}
[task 2017-03-30T10:04:17.121826Z] 10:04:17     INFO -  163 INFO TEST-PASS | dom/base/test/test_anchor_area_referrer.html | same-origin-with-origin-in-meta tests have to be performed.
[task 2017-03-30T10:04:17.122186Z] 10:04:17     INFO -  164 INFO TEST-PASS | dom/base/test/test_anchor_area_referrer.html | same-origin with origin in meta --- none ()
[task 2017-03-30T10:04:17.122533Z] 10:04:17     INFO -  Buffered messages logged at 10:04:10
[task 2017-03-30T10:04:17.123206Z] 10:04:17     INFO -  165 INFO {"origin-with-no-meta":{"referrer":"http://example.com/","policy":"origin","expected":"origin"}}
[task 2017-03-30T10:04:17.123283Z] 10:04:17     INFO -  166 INFO TEST-PASS | dom/base/test/test_anchor_area_referrer.html | origin-with-no-meta tests have to be performed.
[task 2017-03-30T10:04:17.123698Z] 10:04:17     INFO -  167 INFO TEST-PASS | dom/base/test/test_anchor_area_referrer.html | origin (anchor) with no meta --- origin (http://example.com/)
[task 2017-03-30T10:04:17.123752Z] 10:04:17     INFO -  Buffered messages finished
[task 2017-03-30T10:04:17.124148Z] 10:04:17     INFO -  168 INFO TEST-UNEXPECTED-FAIL | dom/base/test/test_anchor_area_referrer.html | Test timed out.
[task 2017-03-30T10:04:17.124258Z] 10:04:17     INFO -      reportError@SimpleTest/TestRunner.js:120:7
[task 2017-03-30T10:04:17.124615Z] 10:04:17     INFO -      TestRunner._checkForHangs@SimpleTest/TestRunner.js:141:7
[task 2017-03-30T10:04:17.124968Z] 10:04:17     INFO -  169 INFO TEST-OK | dom/base/test/test_anchor_area_referrer.html | took 302458ms
[task 2017-03-30T10:04:17.125090Z] 10:04:17     INFO -  170 ERROR [SimpleTest.finish()] this test already called finish!
[task 2017-03-30T10:04:17.125466Z] 10:04:17     INFO -  171 INFO TEST-UNEXPECTED-ERROR | dom/base/test/test_anchor_area_referrer.html | called finish() multiple times


[task 2017-03-29T05:20:32.508317Z] 05:20:32     INFO -  Buffered messages logged at 05:20:07
[task 2017-03-29T05:20:32.508774Z] 05:20:32     INFO -  156 INFO {"origin-in-meta-strict-origin-when-cross-origin-in-attr":{"referrer":"https://example.com/tests/dom/base/test/referrer_testserver.sjs?ACTION=generate-area-policy-test&NAME=origin-in-meta-strict-origin-when-cross-origin-in-attr&ATTRIBUTE_POLICY=strict-origin-when-cross-origin&META_POLICY=origin&SCHEME_FROM=https&SCHEME_TO=https","policy":"full","expected":"strict-origin-when-cross-origin"}}
[task 2017-03-29T05:20:32.509209Z] 05:20:32     INFO -  157 INFO TEST-PASS | dom/base/test/test_anchor_area_referrer.html | origin-in-meta-strict-origin-when-cross-origin-in-attr tests have to be performed.
[task 2017-03-29T05:20:32.509473Z] 05:20:32     INFO -  158 INFO TEST-PASS | dom/base/test/test_anchor_area_referrer.html | origin in meta strict-origin-when-cross-origin in attr --- full (https://example.com/tests/dom/base/test/referrer_testserver.sjs?ACTION=generate-area-policy-test&NAME=origin-in-meta-strict-origin-when-cross-origin-in-attr&ATTRIBUTE_POLICY=strict-origin-when-cross-origin&META_POLICY=origin&SCHEME_FROM=https&SCHEME_TO=https)
[task 2017-03-29T05:20:32.509618Z] 05:20:32     INFO -  Buffered messages logged at 05:20:20
[task 2017-03-29T05:20:32.509898Z] 05:20:32     INFO -  159 INFO {"strict-origin-when-cross-origin-with-origin-in-meta":{"referrer":"http://example.com/","policy":"origin","expected":"strict-origin-when-cross-origin"}}
[task 2017-03-29T05:20:32.510076Z] 05:20:32     INFO -  160 INFO TEST-PASS | dom/base/test/test_anchor_area_referrer.html | strict-origin-when-cross-origin-with-origin-in-meta tests have to be performed.
[task 2017-03-29T05:20:32.510344Z] 05:20:32     INFO -  Buffered messages logged at 05:20:21
[task 2017-03-29T05:20:32.510661Z] 05:20:32     INFO -  161 INFO TEST-PASS | dom/base/test/test_anchor_area_referrer.html | strict-origin-when-cross-origin with origin in meta --- origin (http://example.com/)
[task 2017-03-29T05:20:32.510962Z] 05:20:32     INFO -  Buffered messages finished
[task 2017-03-29T05:20:32.511331Z] 05:20:32     INFO -  162 INFO TEST-UNEXPECTED-FAIL | dom/base/test/test_anchor_area_referrer.html | Test timed out.
[task 2017-03-29T05:20:32.511574Z] 05:20:32     INFO -      reportError@SimpleTest/TestRunner.js:120:7
[task 2017-03-29T05:20:32.511822Z] 05:20:32     INFO -      TestRunner._checkForHangs@SimpleTest/TestRunner.js:141:7
[task 2017-03-29T05:20:42.885922Z] 05:20:42     INFO -  163 INFO {"same-origin-with-origin-in-meta":{"referrer":"","policy":"none","expected":"same-origin"}}
[task 2017-03-29T05:20:42.887157Z] 05:20:42     INFO -  164 INFO TEST-PASS | dom/base/test/test_anchor_area_referrer.html | same-origin-with-origin-in-meta tests have to be performed.
[task 2017-03-29T05:20:42.887539Z] 05:20:42     INFO -  165 INFO TEST-PASS | dom/base/test/test_anchor_area_referrer.html | same-origin with origin in meta --- none ()
[task 2017-03-29T05:20:42.887871Z] 05:20:42     INFO -  166 INFO TEST-OK | dom/base/test/test_anchor_area_referrer.html | took 311736ms
Assignee: nobody → gbrown
Test durations for dom/base/test/test_anchor_area_referrer.html on mozilla-central,mozilla-inbound,autoland between 2017-03-23 and 2017-03-30

android-4-3-armv7-api15/debug-chunked: 276.14 s (247.90 s - 309.11 s over 54 runs)
android-4-3-armv7-api15/opt-chunked: 102.16 s (90.42 s - 121.39 s over 164 runs)
linux32-nightly/opt-chunked:     9.48 s (8.09 s - 11.81 s over 6 runs)
linux32-nightly/opt-e10s:       10.83 s (9.84 s - 11.81 s over 3 runs)
linux32/debug-chunked:          54.19 s (36.79 s - 72.82 s over 105 runs)
linux32/debug-e10s:             60.93 s (53.56 s - 72.82 s over 47 runs)
linux32/opt-chunked:            12.58 s (8.41 s - 38.57 s over 97 runs)
linux32/opt-e10s:               14.49 s (10.57 s - 38.57 s over 51 runs)
linux64-ccov/opt-chunked:       15.42 s (15.42 s - 15.42 s over 1 runs)
linux64-jsdcov/opt-chunked:      6.63 s (6.41 s - 6.78 s over 3 runs)
linux64-nightly/opt-chunked:     9.65 s (7.43 s - 12.10 s over 6 runs)
linux64-nightly/opt-e10s:       11.31 s (9.99 s - 12.10 s over 3 runs)
linux64-nightly/opt:           194.02 s (194.02 s - 194.02 s over 1 runs)
linux64/asan-chunked:           23.86 s (16.09 s - 49.59 s over 218 runs)
linux64/asan-e10s:              27.29 s (20.76 s - 32.64 s over 54 runs)
linux64/debug-chunked:          47.77 s (30.53 s - 77.33 s over 226 runs)
linux64/debug-e10s:             49.98 s (36.86 s - 77.33 s over 167 runs)
linux64/opt-chunked:            11.15 s (7.11 s - 16.65 s over 96 runs)
linux64/opt-e10s:               12.81 s (9.58 s - 16.65 s over 51 runs)
linux64/opt:                   220.23 s (208.44 s - 229.89 s over 7 runs)
linux64/pgo-chunked:             9.10 s (6.15 s - 13.61 s over 116 runs)
linux64/pgo-e10s:               10.29 s (8.06 s - 13.61 s over 58 runs)
macosx64/opt-e10s:               3.71 s (3.25 s - 5.85 s over 98 runs)
macosx64/opt:                    3.86 s (3.48 s - 4.17 s over 101 runs)
win32/debug-e10s:               10.30 s (9.85 s - 12.66 s over 78 runs)
win32/debug:                    12.09 s (11.43 s - 15.15 s over 85 runs)
win32/opt-e10s:                  3.09 s (2.85 s - 3.85 s over 84 runs)
win32/opt:                       3.45 s (3.28 s - 4.28 s over 80 runs)
win32/pgo-e10s:                  2.55 s (2.42 s - 3.52 s over 53 runs)
win32/pgo:                       2.85 s (2.67 s - 3.45 s over 48 runs)
win64/debug-e10s:                9.01 s (8.60 s - 12.84 s over 165 runs)
win64/debug:                    10.29 s (9.69 s - 10.80 s over 78 runs)
win64/opt-e10s:                  2.95 s (2.76 s - 3.93 s over 184 runs)
win64/opt:                       3.32 s (3.25 s - 3.57 s over 85 runs)
win64/pgo-e10s:                  2.46 s (2.33 s - 2.57 s over 50 runs)
win64/pgo:                       2.78 s (2.71 s - 2.88 s over 47 runs)


This is a long-running test on some Linux flavors and Android. Android Debug seems to be the only one that times out.
Attached patch increase timeoutSplinter Review
Attachment #8853090 - Flags: review?(jmaher)
Attachment #8853090 - Flags: review?(jmaher) → review+
Pushed by gbrown@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/b3bc7acffbda
Request longer timeout for test_anchor_area_referrer.html; r=jmaher
https://hg.mozilla.org/mozilla-central/rev/b3bc7acffbda
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.