Closed Bug 1381839. Opened 8 years ago. Closed 8 years ago.

Intermittent Android crashtest [taskcluster:error] Task timeout after 3600 seconds. Force killing container.

Categories

(Firefox for Android Graveyard :: Testing, defect, P1)

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 1381933

People

(Reporter: aryx, Assigned: gbrown)

Details

(Keywords: intermittent-failure, Whiteboard: [stockwell fixed:other])

Runs of the Android crashtests suite randomly time out. It seems this started last Friday ~3pm PDT:
https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&fromchange=f5687797261d69daf1a788d15275909be03c4c2f&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=retry&filter-resultStatus=runnable&filter-resultStatus=success&filter-searchStr=android%20crashtest&tochange=922f398557affed695563755c1b92469075aea1b
There is usually nothing in the log between a TEST-END and the information that the task got killed, e.g.
    [task 2017-07-14T22:40:43.262883Z] 22:40:43 INFO - REFTEST TEST-PASS | http://10.0.2.2:8854/tests/layout/base/crashtests/373919.xhtml | (LOAD ONLY)
    [task 2017-07-14T22:40:43.263310Z] 22:40:43 INFO - REFTEST TEST-END | http://10.0.2.2:8854/tests/layout/base/crashtests/373919.xhtml
    [taskcluster:error] Task timeout after 3600 seconds. Force killing container.
    [taskcluster 2017-07-14 23:33:32.864Z] === Task Finished ===
gbrown, can you take a look at this, please?
Flags: needinfo?(gbrown)
Blocks: 1204281
See also bug 1381283 just because it seems unusual for crashtests to start timing out on 2 different platforms at around the same time; the logs look quite different.
Assignee: nobody → gbrown
Flags: needinfo?(gbrown)
Priority: -- → P1
See Also: → 1381283
It looks like there are a few tests that hang in a similar way:
Debug crashtest-6:
    22:31:02 INFO - REFTEST NoneREFTEST TEST-START | http://10.0.2.2:8854/tests/layout/generic/crashtests/370174-3.html
    22:31:02 INFO - REFTEST TEST-LOAD | http://10.0.2.2:8854/tests/layout/generic/crashtests/370174-3.html | 187 / 319 (58%)
    22:31:02 INFO - REFTEST TEST-PASS | http://10.0.2.2:8854/tests/layout/generic/crashtests/370174-3.html | (LOAD ONLY)
    22:31:02 INFO - REFTEST TEST-END | http://10.0.2.2:8854/tests/layout/generic/crashtests/370174-3.html
    [taskcluster:error] Task timeout after 3600 seconds. Force killing container.
    [taskcluster 2017-07-15 23:17:34.140Z] === Task Finished ===
Debug crashtest-5:
    22:40:43 INFO - REFTEST NoneREFTEST TEST-START | http://10.0.2.2:8854/tests/layout/base/crashtests/373919.xhtml
    22:40:43 INFO - REFTEST TEST-LOAD | http://10.0.2.2:8854/tests/layout/base/crashtests/373919.xhtml | 10 / 323 (3%)
    22:40:43 INFO - REFTEST TEST-PASS | http://10.0.2.2:8854/tests/layout/base/crashtests/373919.xhtml | (LOAD ONLY)
    22:40:43 INFO - REFTEST TEST-END | http://10.0.2.2:8854/tests/layout/base/crashtests/373919.xhtml
    [taskcluster:error] Task timeout after 3600 seconds. Force killing container.
    [taskcluster 2017-07-14 23:33:32.864Z] === Task Finished ===
Debug crashtest-3:
    09:31:28 INFO - REFTEST NoneREFTEST TEST-START | http://10.0.2.2:8854/tests/editor/composer/crashtests/351236-1.html
    09:31:28 INFO - REFTEST TEST-LOAD | http://10.0.2.2:8854/tests/editor/composer/crashtests/351236-1.html | 185 / 322 (57%)
    09:31:39 INFO - REFTEST TEST-PASS | http://10.0.2.2:8854/tests/editor/composer/crashtests/351236-1.html | (LOAD ONLY)
    09:31:39 INFO - REFTEST TEST-END | http://10.0.2.2:8854/tests/editor/composer/crashtests/351236-1.html
    [taskcluster:error] Task timeout after 3600 seconds. Force killing container.
    [taskcluster 2017-07-18 10:17:11.019Z] === Task Finished ===
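For anyone triaging more of these logs, the pattern above (a TEST-END followed directly by the task timeout) can be picked out mechanically. The following is only a rough sketch, not part of any Mozilla tooling: the function name `last_test_before_timeout` and the two regular expressions are assumptions derived solely from the log excerpts quoted in this bug, and real logs may contain variations they do not cover.

```python
import re
from typing import Optional

# Hypothetical patterns, based only on the log lines quoted in this bug.
TEST_END = re.compile(r"REFTEST TEST-END \| (\S+)")
TIMEOUT = re.compile(r"\[taskcluster:error\] Task timeout after \d+ seconds")


def last_test_before_timeout(log_text: str) -> Optional[str]:
    """Return the URL from the last TEST-END line seen before the timeout line.

    Returns None if the log has no timeout line, or if no TEST-END
    appears before it.
    """
    last_url = None
    for line in log_text.splitlines():
        if TIMEOUT.search(line):
            return last_url
        match = TEST_END.search(line)
        if match:
            last_url = match.group(1)
    return None


if __name__ == "__main__":
    # Excerpt copied from the Debug crashtest-5 log above.
    sample = "\n".join([
        "22:40:43 INFO - REFTEST TEST-PASS | http://10.0.2.2:8854/tests/layout/base/crashtests/373919.xhtml | (LOAD ONLY)",
        "22:40:43 INFO - REFTEST TEST-END | http://10.0.2.2:8854/tests/layout/base/crashtests/373919.xhtml",
        "[taskcluster:error] Task timeout after 3600 seconds. Force killing container.",
    ])
    print(last_test_before_timeout(sample))
```

Running this over several failed jobs and counting the returned URLs would show whether the hangs cluster on a small set of tests, as the three excerpts suggest.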
The range in comment 3 strongly suggests this problem started with https://hg.mozilla.org/integration/mozilla-inbound/rev/c038d1ebf74fa3b58d01937f553b050520b45ffa, bug 1362903. https://treeherder.mozilla.org/#/jobs?repo=try&revision=cb0ea9caa7e5c3b993090b89ac191ecf2a226d75 appears to confirm that bug 1362903 is responsible.
https://bugzilla.mozilla.org/show_bug.cgi?id=1362903#c20 even notes the issue, but assumes it was a pre-existing condition. Job timeouts did occur before this change, but we did not have any significant number of Android crashtest job timeouts.
:freesamael -- Do you understand what is going wrong here? Note that 2 of the 3 problematic tests (comment 2) rely on reload().
Flags: needinfo?(sawang)
Blocks: 1362903
Thanks Samael!
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → DUPLICATE
Whiteboard: [stockwell fixed:other]
Blocks: 1411358
No longer blocks: 1411358
Product: Firefox for Android → Firefox for Android Graveyard