Closed Bug 1041075 Opened 5 years ago Closed 5 years ago

Intermittent test_value_storage.html | Test timed out.; regression from mochitest structured logging, fixed with requestLongerTimeout()

Categories

(Core :: Layout, defect)

x86_64
Android
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla34
Tracking Status
firefox32 --- unaffected
firefox33 --- fixed
firefox34 --- fixed
firefox-esr24 --- unaffected
firefox-esr31 --- unaffected

People

(Reporter: KWierso, Assigned: dbaron)

References

Details

(Keywords: intermittent-failure)

Attachments

(2 files)

https://tbpl.mozilla.org/php/getParsedLog.php?id=44132529&tree=Mozilla-Inbound
Android 2.3 Armv6 Emulator mozilla-inbound opt test mochitest-12 on 2014-07-18 11:17:26 PDT for push c57608929ef7

slave: tst-linux64-spot-939



12:24:07     INFO -  600 INFO TEST-PASS | /tests/layout/style/test/test_value_storage.html | serialization should match property value 
12:24:07     INFO -  601 INFO TEST-PASS | /tests/layout/style/test/test_value_storage.html | parse+serialize should be idempotent for 'grid-column: auto/2' 
12:24:07     INFO -  602 INFO TEST-PASS | /tests/layout/style/test/test_value_storage.html | serialize(grid-column-start)+parse should be the identity transform for 'grid-column: auto/2' 
12:24:07     INFO -  603 INFO TEST-PASS | /tests/layout/style/test/test_value_storage.html | serialize(grid-column-end)+parse should be the identity transform for 'grid-column: auto/2' 
12:24:07     INFO -  604 INFO TEST-PASS | /tests/layout/style/test/test_value_storage.html | parse+split+serialize should be idempotent for 'grid-column: auto/2' 
12:24:07     INFO -  605 INFO TEST-PASS | /tests/layout/style/test/test_value_storage.html | parse+compute+serialize(grid-column-start) should be idempotent for 'grid-column: auto/2' 
12:24:07     INFO -  606 INFO TEST-PASS | /tests/layout/style/test/test_value_storage.html | parse+compute+serialize(grid-column-end) should be idempotent for 'grid-column: auto/2' 
12:24:07     INFO -  607 INFO TEST-PASS | /tests/layout/style/test/test_value_storage.html | unremovable properties when setting property grid-column to auto/2 
12:24:07     INFO -  608 INFO TEST-PASS | /tests/layout/style/test/test_value_storage.html | non-empty serialization after removing all properties when setting property grid-column to auto/2 
12:24:07     INFO -  609 INFO TEST-PASS | /tests/layout/style/test/test_value_storage.html | setting 'var(--a)' on 'grid-column' 
12:24:07     INFO -  610 INFO TEST-PASS | /tests/layout/style/test/test_value_storage.html | setting 'var(--a)' on 'grid-column' (for 'grid-column-start') 
12:24:07     INFO -  611 INFO TEST-PASS | /tests/layout/style/test/test_value_storage.html | setting 'var(--a)' on 'grid-column' (for 'grid-column-end') 
12:24:07     INFO -  612 INFO TEST-PASS | /tests/layout/style/test/test_value_storage.html | serialization should match property value 
12:24:07     INFO -  613 INFO TEST-PASS | /tests/layout/style/test/test_value_storage.html | parse+serialize should be idempotent for 'grid-column: var(--a)' 
12:24:07     INFO -  614 INFO TEST-PASS | /tests/layout/style/test/test_value_storage.html | parse+compute+serialize(grid-column-start) should be idempotent for 'grid-column: var(--a)' 
12:24:07     INFO -  615 INFO TEST-PASS | /tests/layout/style/test/test_value_storage.html | parse+compute+serialize(grid-column-end) should be idempotent for 'grid-column: var(--a)' 
12:24:07     INFO -  616 INFO TEST-PASS | /tests/layout/style/test/test_value_storage.html | unremovable properties when setting property grid-column to var(--a) 
12:24:07     INFO -  617 INFO TEST-PASS | /tests/layout/style/test/test_value_storage.html | non-empty serialization after removing all properties when setting property grid-column to var(--a) 
12:24:07     INFO -  618 INFO TEST-UNEXPECTED-FAIL | /tests/layout/style/test/test_value_storage.html | Test timed out. 
12:24:07     INFO -  TEST-INFO | expected PASS
12:24:07     INFO -  619 INFO TEST-OK | /tests/layout/style/test/test_value_storage.html | took 632049ms
12:24:07     INFO -  620 INFO dumping last 100 message(s)
12:24:07     INFO -  621 INFO if you need more context, please use SimpleTest.requestCompleteLog() in your test
12:24:07     INFO -  622 INFO TEST-PASS | /tests/layout/style/test/test_value_storage.html | setting 'color-dodge' on 'background-blend-mode' 
12:24:07     INFO -  623 INFO TEST-PASS | /tests/layout/style/test/test_value_storage.html | serialization should match property value 
12:24:07     INFO -  624 INFO TEST-PASS | /tests/layout/style/test/test_value_storage.html | parse+serialize should be idempotent for 'background-blend-mode: color-dodge' 
12:24:07     INFO -  625 INFO TEST-PASS | /tests/layout/style/test/test_value_storage.html | serialize+parse should be identity transform for 'background-blend-mode: color-dodge' 
12:24:07     INFO -  626 INFO TEST-PASS | /tests/layout/style/test/test_value_storage.html | parse+compute+serialize should be idempotent for 'background-blend-mode: color-dodge' 
12:24:07     INFO -  627 INFO TEST-PASS | /tests/layout/style/test/test_value_storage.html | setting 'var(--a)' on 'background-blend-mode' 
12:24:07     INFO -  628 INFO TEST-PASS | /tests/layout/style/test/test_value_storage.html | serialization should match property value 




LATER



12:24:07     INFO -  712 INFO TEST-PASS | /tests/layout/style/test/test_value_storage.html | setting 'luminosity' on 'background-blend-mode' 
12:24:07     INFO -  713 INFO TEST-PASS | /tests/layout/style/test/test_value_storage.html | serialization should match property value 
12:24:07     INFO -  714 INFO TEST-PASS | /tests/layout/style/test/test_value_storage.html | parse+serialize should be idempotent for 'background-blend-mode: luminosity' 
12:24:07     INFO -  715 INFO TEST-PASS | /tests/layout/style/test/test_value_storage.html | serialize+parse should be identity transform for 'background-blend-mode: luminosity' 
12:24:07     INFO -  716 INFO TEST-PASS | /tests/layout/style/test/test_value_storage.html | parse+compute+serialize should be idempotent for 'background-blend-mode: luminosity' 
12:24:07     INFO -  717 INFO TEST-PASS | /tests/layout/style/test/test_value_storage.html | setting 'var(--a)' on 'background-blend-mode' 
12:24:07     INFO -  718 INFO TEST-PASS | /tests/layout/style/test/test_value_storage.html | serialization should match property value 
12:24:07     INFO -  719 INFO TEST-PASS | /tests/layout/style/test/test_value_storage.html | parse+serialize should be idempotent for 'background-blend-mode: var(--a)' 
12:24:07     INFO -  720 INFO TEST-PASS | /tests/layout/style/test/test_value_storage.html | serialize+parse should be identity transform for 'background-blend-mode: var(--a)' 
12:24:07     INFO -  721 INFO TEST-PASS | /tests/layout/style/test/test_value_storage.html | parse+compute+serialize should be idempotent for 'background-blend-mode: var(--a)' 
12:24:07     INFO -  722 INFO TEST-UNEXPECTED-FAIL | /tests/layout/style/test/test_value_storage.html | Test timed out. 
12:24:07     INFO -  TEST-INFO | expected PASS
12:24:07     INFO -  723 INFO TEST-START | /tests/layout/style/test/test_variable_serialization_computed.html
12:24:07     INFO -  724 INFO TEST-OK | /tests/layout/style/test/test_variable_serialization_computed.html | took 6044ms
12:24:07     INFO -  725 INFO TEST-START | /tests/layout/style/test/test_variable_serialization_specified.html
12:24:07     INFO -  726 INFO TEST-OK | /tests/layout/style/test/test_variable_serialization_specified.html | took 7222ms
12:24:07     INFO -  727 INFO TEST-START | /tests/layout/style/test/test_variables.html
12:24:07     INFO -  728 INFO TEST-OK | /tests/layout/style/test/test_variables.html | took 4879ms
12:24:07     INFO -  729 INFO TEST-START | /tests/layout/style/test/test_viewport_units.html
12:24:07     INFO -  730 INFO TEST-OK | /tests/layout/style/test/test_viewport_units.html | took 4458ms
12:24:07     INFO -  731 INFO TEST-START | /tests/layout/tables/test/test_bug337124.html
12:24:07     INFO -  732 INFO TEST-OK | /tests/layout/tables/test/test_bug337124.html | took 2935ms
12:24:07     INFO -  733 INFO TEST-START | /tests/layout/tables/test/test_bug541668_table_event_delivery.html
12:24:07     INFO -  734 INFO TEST-OK | /tests/layout/tables/test/test_bug541668_table_event_delivery.html | took 3292ms
Given the frequency, it seems pretty likely that the source of the regression ought to be in https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=17fad150b6b0&tochange=c57608929ef7
The oldest place this occurred appears to be https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=c57608929ef7 , so https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=c57608929ef7&jobname=Android%202\.3%20.*%20mochitest-12 (at least if the down arrow works, which I'm not sure if it will) should be a good place to start doing retriggers.

I did some retriggers on https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=ecdb409898a6 .
Now I realize there were a few starred in bug 965655 prior to that.
I think this has become needed due to changes in the configuration of
the VMs on which we're running tests.
Attachment #8461809 - Flags: review?(dholbert)
Assignee: nobody → dbaron
Status: NEW → ASSIGNED
Just to check that, I did a bunch of retriggers on a build from more than a day before the spike in reports started:

https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=4829189e7999&jobname=Android%202\.3%20.*%20mochitest-12
Comment on attachment 8461809 [details] [diff] [review]
Request longer timeout for test_value_storage.html

r=me
Attachment #8461809 - Flags: review?(dholbert) → review+
comment 102's test showed that this probably wasn't from a VM configuration change, so I'm continuing to bisect to find the cause.
Based on:
https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=3a0e9a83d8c5&jobname=Android%202\.3%20.*%20mochitest-12
https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=e97c96651f93&jobname=Android%202\.3%20.*%20mochitest-12
(although I've done a few more retriggers on the latter), I'm converging on the regression range here being:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=e97c96651f93&tochange=3a0e9a83d8c5
which I think almost certainly means that the cause of the regression is:
https://hg.mozilla.org/integration/mozilla-inbound/rev/a283c1237d96

Is this an expected regression from integrating structured logging into mochitest?  Does structured logging have performance problems that slow down tests with large numbers of assertions substantially slower than they used to be?
Blocks: 886570
Flags: needinfo?(james)
Flags: needinfo?(akachkach)
(In reply to David Baron [:dbaron] (UTC-7) (needinfo? for questions) from comment #120)
> which I think almost certainly means that the cause of the regression is:
> https://hg.mozilla.org/integration/mozilla-inbound/rev/a283c1237d96

Further evidence for this is bug 965655 comment 19, which is from a try run by akachkach.
Summary: Intermittent test_value_storage.html | Test timed out. → Intermittent test_value_storage.html | Test timed out.; regression from mochitest structured logging, fixed with requestLongerTimeout()
It's more than possible that this patch adds substantial overhead (creating structured messages, serializing them, deserializing them... then deserializing them and/or formatting them for those that aren't buffered), so if this test has a large number of assertions it might need a bigger timeout delay.
Flags: needinfo?(akachkach)
https://hg.mozilla.org/mozilla-central/rev/c5feae4f583b
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla34
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: mozilla34 → ---
This does appear to be fixed by increasing the timeout, although I think it seems worth filing a separate bug on the performance regression from structured logging.
Status: REOPENED → RESOLVED
Closed: 5 years ago5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla34
It will probably get worse with bug 1042921, so we'll definitely have to follow the performance impact of structured logging.
I filed bug 1045205 on the performance regression.
Flags: needinfo?(james)
QA Whiteboard: [qa-]
I looked at the failure from comment 131.
Log: https://tbpl.mozilla.org/php/getParsedLog.php?id=47819311&tree=Mozilla-Inbound
Full logcat: http://mozilla-releng-blobs.s3.amazonaws.com/blobs/mozilla-inbound/sha512/39291468a5542b31a3461bc0b7b6ed24dfc129c8f0ade16d6d461dc746521c2d2810f3e8795f22220c555c6db96b78f6104f9761c9107c1b45624344a4ad632a


The test does seem to be skirting the 15 minute timeout (requestLongerTimeout(3) --> 3 * 5 minutes), though it looks like we should be done with ~22 seconds to spare. In particular, the full logcat contains this, before the timeout message (not sure where the bogus characters come from; ignoring those for now):

09-10 14:39:16.217 I/GeckoDump(  584): ⰲ겿{"action":"test_end","time":1410385156214,"thread":"","js_source":"TestRunner","pid":null,"source":"mochitest","test":"/tests/layout/style/test/test_value_storage.html","status":"OK","message":"Finished in 878560ms","extra":{"runtime":878560}}ⰲ겿

The 878560ms runtime there is 14 minutes, 38.56 seconds. Under the allotted 15 minutes. So, shouldn't be counted as a timeout.

Yet, 30 seconds later, we spam this:
09-10 14:39:44.057 I/GeckoDump(  584): ⰲ겿{"action":"test_status","time":1410385184055,"thread":"","js_source":"TestRunner","pid":null,"source":"mochitest","test":"/tests/layout/style/test/test_value_storage.html","subtest":"Test timed out.","status":"FAIL","expected":"PASS"}ⰲ겿

The intervening time seems to be a GC (with several "GC_EXPLICIT freed" lines) -- maybe that's credited as part of the test's allotted time... That seems a bit odd, since we've already reported the test as complete.

In any case:
 (1) we seem to be pretty close to the 15 minute timeout (for this platform), and we probably need to bump up the requestLongerTimeout again.
...and also:
 (2) the harness might have a bug where a post-test-completion GC gets counted against the test (which really only matters if the test is close to its time-limit, but is still worth fixing).
(I filed bug 1068951 on the test-harness timeout-reporting bug (labeled as (2) in my previous comment).)
Per comment 136 (and previous tbplbot comments), we're running right up to the edge of this test's current 15-minute limit, in what looks like an otherwise-successful testrun.

(In the log I looked at in comment 136, there's a final GC that seems to be pushing us over the limit; but even without that, we're within 30 seconds of the limit, which is too close for comfort & reliability.)
Comment on attachment 8491104 [details] [diff] [review]
followup: request even longer timeout (20 min instead of 15 min)

Oops, meant to request review on this change. See comment 138 for why I think we should bump up the requestLongerTimeout() factor here.
Attachment #8491104 - Flags: review?(dbaron)
Attachment #8491104 - Flags: review?(dbaron) → review+
See Also: → 1073761
See Also: → 1317732
You need to log in before you can comment on or make changes to this bug.