Open Bug 1420394 Opened 2 years ago Updated 6 days ago

Intermittent reftest [taskcluster:error] Task aborted - max run time exceeded

Categories

(Testing :: Reftest, defect)

Version 3
defect
Not set

Tracking

(Not tracked)

People

(Reporter: aiakab, Unassigned)

References

Details

(Keywords: intermittent-failure, Whiteboard: [stockwell unknown])

Here is the bug from autoland https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=6cd9206792f9f0cdb4828fba6d6f9dd0745d550b&selectedJob=147327027

It is e10s test-windows7-32/debug-reftest-gpu-e10s-1 tc-R-e10s(Rg1). It has failed with Aborting task - max run time exceeded!       and      Result: IDLENESS_LIMIT_EXCEEDED.  I have re triggered the job twice and  the job finished sucessfuly 

Log link: https://treeherder.mozilla.org/logviewer.html#?job_id=147327027&repo=autoland

Part of a log:
06:12:59     INFO -  [Child 2844, Main Thread] WARNING: Subsequent listeners are not retargetable: file z:/build/build/src/netwerk/protocol/http/HttpChannelChild.cpp, line 3416
10486
06:13:01     INFO -  REFTEST TEST-LOAD | http://localhost:49308/1511501287663/83/encoder.html?img=size-33x33.png&mime=image/vnd.microsoft.icon | 796 / 1936 (41%)
10487
06:13:01     INFO -  --DOMWINDOW == 12 (09147000) [pid = 2844] [serial = 2161] [outer = 00000000] [url = http://localhost:49308/1511501287649/80/encoder.html?img=size-17x17.png&mime=image/vnd.microsoft.icon]
10488
06:13:01     INFO -  --DOMWINDOW == 11 (09149000) [pid = 2844] [serial = 2160] [outer = 00000000] [url = http://localhost:49308/1511501287649/80/size-17x17.png]
10489
06:13:01     INFO -  --DOMWINDOW == 10 (0EB6F000) [pid = 2844] [serial = 2162] [outer = 00000000] [url = data:text/html;charset=UTF-8,%3C%21%2D%2DCLEAR%2D%2D%3E]
10490
06:13:01     INFO -  --DOMWINDOW == 9 (0EB6B800) [pid = 2844] [serial = 2158] [outer = 00000000] [url = http://localhost:49308/1511501287646/79/encoder.html?img=size-16x16.png&mime=image/vnd.microsoft.icon]
10491
06:13:01     INFO -  --DOMWINDOW == 8 (0914D800) [pid = 2844] [serial = 2159] [outer = 00000000] [url = data:text/html;charset=UTF-8,%3C%21%2D%2DCLEAR%2D%2D%3E]
10492
06:13:02     INFO -  ++DOMWINDOW == 9 (09149000) [pid = 2844] [serial = 2170] [outer = 061843A0]
10493
[taskcluster 2017-11-24T06:13:03.201Z] Aborting task - max run time exceeded!
10494
[taskcluster 2017-11-24T06:13:03.419Z]   Exit Code: 0
10495
[taskcluster 2017-11-24T06:13:03.419Z]   User Time: 0s
10496
[taskcluster 2017-11-24T06:13:03.422Z] Kernel Time: 31.2002ms
10497
[taskcluster 2017-11-24T06:13:03.422Z]   Wall Time: 59m55.211s
10498
[taskcluster 2017-11-24T06:13:03.422Z] Peak Memory: 2248704
10499
[taskcluster 2017-11-24T06:13:03.422Z]      Result: IDLENESS_LIMIT_EXCEEDED
10500
[taskcluster 2017-11-24T06:13:03.424Z] === Task Finished ===
10501
[taskcluster 2017-11-24T06:13:03.424Z] Task Duration: 59m55.264s
Duplicate of this bug: 1442893
Summary: Intermittent reftest Aborting task - max run time exceeded! → Intermittent reftest [taskcluster:error] Task aborted - max run time exceeded
This bug failed 38 times in the last 7 days, occurs on windows7-32 opt, pgo and debug.

Log:
https://treeherder.mozilla.org/logviewer.html#?job_id=206763080&repo=autoland&lineNumber=17179

1:05:05     INFO - REFTEST TEST-PASS | file:///Z:/task_1540029163/build/tests/reftest/tests/layout/reftests/text-overflow/vertical-decorations-4.html != file:///Z:/task_1540029163/build/tests/reftest/tests/layout/reftests/text-overflow/vertical-decorations-3-4-notref.html | image comparison, max difference: 255, number of differing pixels: 125
11:05:05     INFO - REFTEST TEST-END | file:///Z:/task_1540029163/build/tests/reftest/tests/layout/reftests/text-overflow/vertical-decorations-4.html != file:///Z:/task_1540029163/build/tests/reftest/tests/layout/reftests/text-overflow/vertical-decorations-3-4-notref.html
[taskcluster:error] Aborting task...
11:05:06     INFO - REFTEST INFO | Slowest test took 6198ms (file:///Z:/task_1540029163/build/tests/reftest/tests/layout/reftests/text-overflow/selection.html)
11:05:06     INFO - REFTEST INFO | Total canvas count = 3
[taskcluster 2018-10-20T11:05:06.461Z] SUCCESS: The process with PID 5916 (child process of PID 5812) has been terminated.
[taskcluster 2018-10-20T11:05:06.461Z] SUCCESS: The process with PID 5424 (child process of PID 5812) has been terminated.
[taskcluster 2018-10-20T11:05:06.461Z] SUCCESS: The process with PID 5192 (child process of PID 5812) has been terminated.
[taskcluster 2018-10-20T11:05:06.461Z] SUCCESS: The process with PID 4392 (child process of PID 5812) has been terminated.
[taskcluster 2018-10-20T11:05:06.461Z] SUCCESS: The process with PID 5812 (child process of PID 5752) has been terminated.
[taskcluster 2018-10-20T11:05:06.461Z] SUCCESS: The process with PID 5752 (child process of PID 2932) has been terminated.
[taskcluster 2018-10-20T11:05:06.461Z] SUCCESS: The process with PID 2508 (child process of PID 1188) has been terminated.
[taskcluster 2018-10-20T11:05:06.461Z] SUCCESS: The process with PID 2932 (child process of PID 1188) has been terminated.
[taskcluster 2018-10-20T11:05:06.461Z] SUCCESS: The process with PID 1188 (child process of PID 2104) has been terminated.
[taskcluster 2018-10-20T11:05:06.461Z] SUCCESS: The process with PID 2104 (child process of PID 1164) has been terminated.
[taskcluster 2018-10-20T11:05:06.461Z] 
[taskcluster 2018-10-20T11:05:06.468Z] === Task Finished ===
[taskcluster 2018-10-20T11:05:06.469Z] Task Duration: 59m32.742s
[taskcluster 2018-10-20T11:05:07.424Z] Uploading artifact public/logs/localconfig.json from file logs\localconfig.json with content encoding "gzip", mime type "application/octet-stream" and expiry 2019-10-20T08:45:01.371Z
[taskcluster 2018-10-20T11:05:08.388Z] Uploading artifact public/logs/log_critical.log from file logs\log_critical.log with content encoding "gzip", mime type "text/plain" and expiry 2019-10-20T08:45:01.371Z
[taskcluster 2018-10-20T11:05:08.832Z] Uploading artifact public/logs/log_error.log from file logs\log_error.log with content encoding "gzip", mime type "text/plain" and expiry 2019-10-20T08:45:01.371Z
[taskcluster 2018-10-20T11:05:09.477Z] Uploading artifact public/logs/log_fatal.log from file logs\log_fatal.log with content encoding "gzip", mime type "text/plain" and expiry 2019-10-20T08:45:01.371Z
[taskcluster 2018-10-20T11:05:09.896Z] Uploading artifact public/logs/log_info.log from file logs\log_info.log with content encoding "gzip", mime type "text/plain" and expiry 2019-10-20T08:45:01.371Z
[taskcluster 2018-10-20T11:05:10.938Z] Uploading artifact public/logs/log_raw.log from file logs\log_raw.log with content encoding "gzip", mime type "text/plain" and expiry 2019-10-20T08:45:01.371Z
[taskcluster 2018-10-20T11:05:11.634Z] Uploading artifact public/logs/log_warning.log from file logs\log_warning.log with content encoding "gzip", mime type "text/plain" and expiry 2019-10-20T08:45:01.371Z
[taskcluster 2018-10-20T11:05:12.128Z] Uploading artifact public/test_info/reftest-gpu_errorsummary.log from file build\blobber_upload_dir\reftest-gpu_errorsummary.log with content encoding "gzip", mime type "text/plain" and expiry 2019-10-20T08:45:01.371Z
[taskcluster 2018-10-20T11:05:12.558Z] Uploading artifact public/test_info/reftest-gpu_raw.log from file build\blobber_upload_dir\reftest-gpu_raw.log with content encoding "gzip", mime type "text/plain" and expiry 2019-10-20T08:45:01.371Z
[taskcluster 2018-10-20T11:05:13.662Z] Uploading artifact public/test_info/system-info.log from file build\blobber_upload_dir\system-info.log with content encoding "gzip", mime type "text/plain" and expiry 2019-10-20T08:45:01.371Z
[taskcluster 2018-10-20T11:05:14.260Z] Uploading redirect artifact public/logs/live.log to URL https://queue.taskcluster.net/v1/task/bt3YEyWcSvCl1IPb9lpCpg/runs/0/artifacts/public/logs/live_backing.log with mime type "text/plain; charset=utf-8" and expiry 2019-10-20T08:45:01.371Z
[taskcluster:error] Task aborted - max run time exceeded

ahal: Can you please take a look at the this bug?
Flags: needinfo?(ahal)
Whiteboard: [stockwell needswork]
This might just be a case of a task near the borderline of the timeout threshold. I'll keep the needinfo to dig deeper.
In the last 7 days there have been 83 occurrences mostly on Windows 10 64, some on Os X, build types Asan and Debug.
Whiteboard: [stockwell unknown] → [stockwell unknown][stockwell needswork]
This bug has failed 32 times in the last 7 days. Occurs on osx-10-10, linux64-qr, windows10-64-ccov and windows7-32 on debug opt and pgo build types.

Log:
https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=216733713&repo=mozilla-inbound&lineNumber=21901

 INFO - REFTEST TEST-START | file:///Users/cltbld/tasks/task_1544669417/build/tests/jsreftest/tests/jsreftest.html?test=non262/lexical-environment/block-scoped-functions-annex-b-with.js
20:41:57     INFO - REFTEST TEST-LOAD | file:///Users/cltbld/tasks/task_1544669417/build/tests/jsreftest/tests/jsreftest.html?test=non262/lexical-environment/block-scoped-functions-annex-b-with.js | 1700 / 17967 (9%)
20:41:57     INFO - ++DOMWINDOW == 97 (0x122930400) [pid = 2036] [serial = 3628] [outer = 0x112ec0800]
20:41:57     INFO - TEST-INFO |  FAILED!  TypeError: Assertion failed: got "true", expected "false"
20:41:57     INFO - JavaScript error: file:///Users/cltbld/tasks/task_1544669417/build/tests/jsreftest/tests/shell.js, line 127: TypeError: Assertion failed: got "true", expected "false"
20:41:57     INFO - [Child 2036, Main Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005: file /builds/worker/workspace/build/src/dom/base/nsContentUtils.cpp, line 3764
20:41:57     INFO - REFTEST TEST-PASS | file:///Users/cltbld/tasks/task_1544669417/build/tests/jsreftest/tests/jsreftest.html?test=non262/lexical-environment/block-scoped-functions-annex-b-with.js | (LOAD ONLY)
20:41:57     INFO - REFTEST TEST-END | file:///Users/cltbld/tasks/task_1544669417/build/tests/jsreftest/tests/jsreftest.html?test=non262/lexical-environment/block-scoped-functions-annex-b-with.js
20:41:58     INFO - ++DOMWINDOW == 98 (0x122934800) [pid = 2036] [serial = 3629] [outer = 0x112ec0800]
20:41:58     INFO - [Child 2036, Main Thread] WARNING: 'NS_FAILED(rv)', file /builds/worker/workspace/build/src/dom/base/nsContentUtils.cpp, line 8218
20:41:58     INFO - REFTEST TEST-START | file:///Users/cltbld/tasks/task_1544669417/build/tests/jsreftest/tests/jsreftest.html?test=non262/lexical-environment/block-scoped-functions-annex-b.js
(In reply to Dorel Luca [:dluca] from comment #48)
> This bug has failed 32 times in the last 7 days. Occurs on osx-10-10,
> linux64-qr, windows10-64-ccov and windows7-32 on debug opt and pgo build
> types.

I think that latest issue got exposed by bug 1512989. To be clear, bug 1512989 didn't cause it, but the jsreftests were perma-green (i.e even when there was a failure). So since the fix landed in bug 1512989 I'd expect to see all sorts of intermittent jsreftest stuff popping up. Unfortunately there won't be any easy way of tracking down when the intermittents started happening :(.

On another note, the failure in there is more specific than this bug. "Task aborted - max runtime exceeded" is just a generic message that can happen for a multitude of reasons. I think we should avoid starring things against this bug unless there is no other more specific log to star against. Maybe we can improve the logging of the harnesses so no intermittents get starred against this.
Flags: needinfo?(ahal)
See Also: → 1411358

In the last 7 days, there are 32 failures.

They occur on
16 linux64 (asan, debug)
5 linux32 (debug)
5 macosx64-qr (debug, opt)
3 linux64-qr (debug)
2 windows7-32 (debug)
1 osx-cross (debug)

Recent failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=221131401&repo=mozilla-central&lineNumber=115821

Whiteboard: [stockwell unknown][stockwell needswork] → [stockwell needswork]
See Also: → 1529777
You need to log in before you can comment on or make changes to this bug.