Closed Bug 1578336 Opened 6 years ago Closed 2 years ago

Intermittent [taskcluster:error] exit status 1

Categories

(Core :: JavaScript Engine, defect, P5)

Tracking

RESOLVED WORKSFORME

People

(Reporter: intermittent-bug-filer, Assigned: sfink, NeedInfo)

References

(Blocks 1 open bug)

Details

(Keywords: intermittent-failure, regression)

Attachments

(2 files)

Filed by: rmaries [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer.html#?job_id=264653142&repo=mozilla-central
Full log: https://queue.taskcluster.net/v1/task/LSGsZjPKShKs0jLTmxd-uA/runs/0/artifacts/public/logs/live_backing.log


03:01:12 INFO - TEST-START | /animation-worklet/worklet-animation-with-invalid-effect.https.html
03:01:12 INFO - Closing window 69
03:01:12 INFO - PID 2108 | [Child 7988, Chrome_ChildThread] WARNING: pipe error: 109: file z:/task[Parent 11092, Gecko_IOThread] WARNING: pipe error: 109: file z:/task_1567463977/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 341[taskcluster 2019-09-03T04:36:36.061Z] Uploading redirect artifact public/logs/live.log to URL https://queue.taskcluster.net/v1/task/LSGsZjPKShKs0jLTmxd-uA/runs/0/artifacts/public/logs/live_backing.log with mime type "text/plain; charset=utf-8" and expiry 2020-09-01T21:42:40.019Z
[taskcluster:error] exit status 1

All the logs in the past week are SM bustages (https://treeherder.mozilla.org/intermittent-failures.html#/bugdetails?startday=2020-02-25&endday=2020-03-03&tree=all&bug=1578336) with these lines at the end:

[task 2020-02-26T06:29:22.295Z] Considering var_in_fun_plus_arg.js
[task 2020-02-26T06:29:22.313Z] + BUILD_STATUS=1
[task 2020-02-26T06:29:22.313Z] + mkdir -p Z:/task_1582686773/public/build
[task 2020-02-26T06:29:22.327Z] + cp -rL Z:/task_1582686773/src/obj-spider/dist/bin/js Z:/task_1582686773/src/obj-spider/dist/bin/jsapi-tests Z:/task_1582686773/src/obj-spider/dist/bin/js-gdb.py Z:/task_1582686773/public/build
[task 2020-02-26T06:29:22.571Z] + gzip -c Z:/task_1582686773/fetches/clang/bin/llvm-symbolizer
[task 2020-02-26T06:29:22.590Z] gzip: Z:/task_1582686773/fetches/clang/bin/llvm-symbolizer: No such file or directory
[task 2020-02-26T06:29:22.591Z] + true
[task 2020-02-26T06:29:22.591Z] + '[' -n 20200226031119 ']'
[task 2020-02-26T06:29:22.591Z] + '[' -n 1b4db5eb7f6f73d6b6b06956c34deab11854e12b ']'
[task 2020-02-26T06:29:22.591Z] + cat
[task 2020-02-26T06:29:22.607Z] + exit 1
[fetches 2020-02-26T06:29:22.609Z] removing Z:/task_1582686773/fetches
[fetches 2020-02-26T06:29:23.070Z] finished
[taskcluster 2020-02-26T06:29:23.097Z] Exit Code: 1
[taskcluster 2020-02-26T06:29:23.097Z] User Time: 0s
[taskcluster 2020-02-26T06:29:23.097Z] Kernel Time: 15.625ms
[taskcluster 2020-02-26T06:29:23.097Z] Wall Time: 3h14m12.5185109s
[taskcluster 2020-02-26T06:29:23.097Z] Result: FAILED
[taskcluster 2020-02-26T06:29:23.097Z] === Task Finished ===
[taskcluster 2020-02-26T06:29:23.097Z] Task Duration: 3h14m12.5194898s
[taskcluster 2020-02-26T06:29:23.399Z] Uploading artifact public/build/js-gdb.py from file public\build\js-gdb.py with content encoding "gzip", mime type "application/octet-stream" and expiry 2021-02-25T03:13:39.356Z
[taskcluster 2020-02-26T06:29:23.657Z] Uploading artifact public/build/js.exe from file public\build\js.exe with content encoding "gzip", mime type "application/x-msdownload" and expiry 2021-02-25T03:13:39.356Z
[taskcluster 2020-02-26T06:29:26.092Z] Uploading artifact public/build/jsapi-tests.exe from file public\build\jsapi-tests.exe with content encoding "gzip", mime type "application/x-msdownload" and expiry 2021-02-25T03:13:39.356Z
[taskcluster 2020-02-26T06:29:29.878Z] Uploading artifact public/build/llvm-symbolizer.gz from file public\build\llvm-symbolizer.gz with content encoding "identity", mime type "application/x-gzip" and expiry 2021-02-25T03:13:39.356Z
[taskcluster 2020-02-26T06:39:20.290Z] [taskcluster-proxy] Successfully refreshed taskcluster-proxy credentials: task-client/fbG8xJtZTD-OuxknTU28UA/0/on/aws/i-093ccef4bf98a4810/until/1582700360.536
[taskcluster 2020-02-26T06:45:41.584Z] Uploading artifact public/build/target.json from file public\build\target.json with content encoding "gzip", mime type "application/octet-stream" and expiry 2021-02-25T03:13:39.356Z
[taskcluster 2020-02-26T06:45:41.645Z] [mounts] Preserving cache: Moving "Z:\task_1582686773\build" to "Z:\caches\IZq_McTBQGeWakgl7Pxf3g"
[taskcluster 2020-02-26T06:45:41.645Z] [mounts] Denying task_1582686773 access to 'Z:\caches\IZq_McTBQGeWakgl7Pxf3g'
[taskcluster 2020-02-26T06:45:41.791Z] Uploading redirect artifact public/logs/live.log to URL https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/fbG8xJtZTD-OuxknTU28UA/runs/0/artifacts/public/logs/live_backing.log with mime type "text/plain; charset=utf-8" and expiry 2021-02-25T03:13:39.356Z
[taskcluster:error] exit status 1

Recent failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=291356214&repo=autoland
Jan, could you take a look at this or redirect it to someone? Thank you.

Component: web-platform-tests → JavaScript Engine
Flags: needinfo?(jdemooij)
Product: Testing → Core
Version: Version 3 → unspecified

In a green win64 cgc build we have:

[task 2020-03-04T07:30:13.209Z] Testing jsapi-tests\binast\parser\multipart\unit\var_in_fun_plus_arg.js
[task 2020-03-04T07:30:13.210Z] Got the same AST when parsing jsapi-tests\binast\parser\multipart\unit\var_in_fun_plus_arg.js
[task 2020-03-04T07:30:13.210Z] Considering var_in_fun_plus_arg.js
[task 2020-03-04T07:30:13.223Z] + BUILD_STATUS=0

In a failing one we have:

[task 2020-02-28T02:01:22.676Z] Testing jsapi-tests\binast\parser\multipart\unit\var_in_fun_plus_arg.js
[task 2020-02-28T02:01:22.676Z] Got the same AST when parsing jsapi-tests\binast\parser\multipart\unit\var_in_fun_plus_arg.js
[task 2020-02-28T02:01:22.676Z] Considering var_in_fun_plus_arg.js
[task 2020-02-28T02:01:22.692Z] + BUILD_STATUS=1

So build-sm.sh is getting a non-zero exit code from autospider.py, but I don't see an obvious cause for it in the log.
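
For context, here is a minimal sketch of the control flow that would produce exactly this symptom (hypothetical code, not the actual autospider.py; the stage list is illustrative): the driver remembers a failure but keeps running later stages, and only surfaces the failure as the final exit status, so nothing near the end of the log points at the stage that broke.

    # Hypothetical sketch, not the real autospider.py.
    import subprocess
    import sys

    STAGES = [  # illustrative stage list
        ["mozmake", "-j8"],
        ["mozmake", "check-jit-test"],
        ["dist/bin/jsapi-tests"],
        ["mozmake", "check-jstests"],
    ]

    def main():
        status = 0
        for cmd in STAGES:
            if subprocess.call(cmd) != 0:
                status = 1  # remember the failure, but keep going
        return status  # build-sm.sh would see this as BUILD_STATUS

    if __name__ == "__main__":
        sys.exit(main())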

Any thoughts, Steve?

Flags: needinfo?(jdemooij) → needinfo?(sphink)

I don't see what's actually failing there either, but it seems unnecessarily difficult to determine. I'll put up a diagnostic patch.
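
A minimal sketch of what such diagnostic reporting could look like (hypothetical; not the actual patch): announce the failing command and its exit status the moment a stage fails, instead of only folding it into the final status.

    # Hypothetical diagnostic sketch, not the landed patch.
    rc = subprocess.call(cmd)
    if rc != 0:
        print("Command %r failed with status %d" % (cmd, rc), flush=True)
        status = 1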

Assignee: nobody → sphink
Flags: needinfo?(sphink)
Keywords: leave-open
Pushed by sfink@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/73887fc2caf8
Better reporting on what failed during autospider.py run r=jandem

For example, I'm seeing an error at "check-jit-test"
https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=293186561&repo=autoland&lineNumber=29742

Slightly surprisingly, test results appear after this in the log as well.

In the jobs that fail, the jit-tests keep running after make has moved on and started running jsapi-tests. Since the tests are still running, the python.exe .. jit_test.py process is probably still running normally. It seems that mozmake.exe has decided it is dead, which makes me wonder if there is a timeout somewhere. The jit_test.py run has been going for over three hours at this point.

From the log:

[task 2020-03-14T21:08:42.081Z] + python3 Z:/task_1584219157/src/js/src/devtools/automation/autospider.py --platform=win64 compacting
...
[task 2020-03-15T00:08:42.534Z] in directory Z:\task_1584219157\src\obj-spider, running ['Z:\\task_1584219157\\src\\obj-spider\\dist\\bin\\jsapi-tests']

Three hours... autospider.py has a timeout of 3 hours, after which processes are killed. It looks like something is misbehaving there.
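
For illustration, a rough sketch of that kind of watchdog (hypothetical; not the actual autospider.py code). Note that killing the immediate child does not necessarily kill its whole process tree, which would explain jit_test.py workers surviving while make moves on.

    # Hypothetical watchdog sketch, not the actual implementation.
    import subprocess

    TIMEOUT_SECONDS = 3 * 60 * 60  # the 3-hour limit mentioned above

    proc = subprocess.Popen(["mozmake", "check-jit-test"])
    try:
        proc.wait(timeout=TIMEOUT_SECONDS)
    except subprocess.TimeoutExpired:
        proc.kill()  # kills mozmake itself, not necessarily its grandchildren
        proc.wait()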

Also, do we know why the cgc jit-tests on Windows are so much slower than on the Linux64 cgc build?

Flags: needinfo?(sphink)

The Windows cgc builds are currently very close to the 3 hr timeout and are
failing quite often in automation. This patch bumps the timeout to 3.5 hr as
a stop-gap until we have a better fix (e.g. reducing overhead and splitting
jobs).

See Also: → 1622824

I spot-checked two of these failures. Both report that they failed in jsapi-tests. The test output shows:

[task 2021-04-11T04:30:30.612Z] testThreadingThreadDetach
[task 2021-04-11T04:30:30.612Z] TEST-PASS | testThreadingThreadDetach | ok
[task 2021-04-11T04:30:30.612Z] testThreadingThreadJoin
[taskcluster 2021-04-11T04:43:08.382Z] [taskcluster-proxy] Successfully refreshed taskcluster-proxy credentials: task-client/BOvXUIcDRTKT_aieWXlpbw/0/on/us-east-1/i-0dc66e9879dd8ab42/until/1618117388.481
[taskcluster 2021-04-11T05:00:08.676Z] [taskcluster-proxy] Successfully refreshed taskcluster-proxy credentials: task-client/BOvXUIcDRTKT_aieWXlpbw/0/on/us-east-1/i-0dc66e9879dd8ab42/until/1618118408.755
[taskcluster 2021-04-11T05:17:08.992Z] [taskcluster-proxy] Successfully refreshed taskcluster-proxy credentials: task-client/BOvXUIcDRTKT_aieWXlpbw/0/on/us-east-1/i-0dc66e9879dd8ab42/until/1618119429.071
[taskcluster 2021-04-11T05:34:09.309Z] [taskcluster-proxy] Successfully refreshed taskcluster-proxy credentials: task-client/BOvXUIcDRTKT_aieWXlpbw/0/on/us-east-1/i-0dc66e9879dd8ab42/until/1618120449.37
[taskcluster 2021-04-11T05:51:09.537Z] [taskcluster-proxy] Successfully refreshed taskcluster-proxy credentials: task-client/BOvXUIcDRTKT_aieWXlpbw/0/on/us-east-1/i-0dc66e9879dd8ab42/until/1618121469.588
[taskcluster 2021-04-11T06:08:09.799Z] [taskcluster-proxy] Successfully refreshed taskcluster-proxy credentials: task-client/BOvXUIcDRTKT_aieWXlpbw/0/on/us-east-1/i-0dc66e9879dd8ab42/until/1618122489.838
[taskcluster 2021-04-11T06:25:10.038Z] [taskcluster-proxy] Successfully refreshed taskcluster-proxy credentials: task-client/BOvXUIcDRTKT_aieWXlpbw/0/on/us-east-1/i-0dc66e9879dd8ab42/until/1618123510.062
[taskcluster 2021-04-11T06:42:10.292Z] [taskcluster-proxy] Successfully refreshed taskcluster-proxy credentials: task-client/BOvXUIcDRTKT_aieWXlpbw/0/on/us-east-1/i-0dc66e9879dd8ab42/until/1618124530.311
[taskcluster 2021-04-11T06:59:10.520Z] [taskcluster-proxy] Successfully refreshed taskcluster-proxy credentials: task-client/BOvXUIcDRTKT_aieWXlpbw/0/on/us-east-1/i-0dc66e9879dd8ab42/until/1618125550.537
[taskcluster 2021-04-11T07:16:10.747Z] [taskcluster-proxy] Successfully refreshed taskcluster-proxy credentials: task-client/BOvXUIcDRTKT_aieWXlpbw/0/on/us-east-1/i-0dc66e9879dd8ab42/until/1618126570.758
[task 2021-04-11T07:17:10.949Z] TEST-PASS | testThreadingThreain directory ./obj-spider, running ['mozmake', 'check-jstests']
[task 2021-04-11T07:17:10.964Z] mozmake -C js/src check-jstests
[task 2021-04-11T07:17:10.977Z] mozmake[1]: Entering directory 'Z:/task_1618111956/obj-spider/js/src'
[task 2021-04-11T07:17:10.977Z] Z:/task_1618111956/obj-spider/_virtualenvs/init_py3/Scripts/python.exe -u z:/task_1618111956/src/js/src/tests/jstests.py \
[task 2021-04-11T07:17:10.977Z]         --no-progress --format=automation --timeout 300 \
[task 2021-04-11T07:17:10.977Z]         -j16 --jitflags=jstests \
[task 2021-04-11T07:17:10.977Z]         ../../dist/bin/js.exe

So jsapi-tests started, made it to at least testThreadingThreadJoin, which hung for nearly 3 hours before it "passed" (it reported TEST-PASS). Then the output got truncated (the finishing "adJoin | ok" never appears, nor even "adJ") and was clobbered by the output of a check-jstests run that was apparently kicked off next.
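
Here is a minimal, self-contained illustration of that truncation theory (hypothetical; not code from the harness): a line sitting in a block-buffered stdio buffer is simply lost when the hung process is killed, so the log can end mid-word while later commands keep appending to it.

    # Hypothetical demo of buffered output being "eaten" when a hung child is killed.
    import subprocess
    import sys
    import textwrap

    child_code = textwrap.dedent("""
        import sys, time
        # stdout is a pipe here, so it is block-buffered; this short line stays
        # in the buffer because the process hangs before anything flushes it.
        sys.stdout.write("TEST-PASS | testThreadingThreadJoin | ok\\n")
        time.sleep(3600)  # simulated hang
    """)

    proc = subprocess.Popen([sys.executable, "-c", child_code], stdout=subprocess.PIPE)
    try:
        proc.wait(timeout=2)  # stand-in for the harness timeout
    except subprocess.TimeoutExpired:
        proc.kill()
    print("captured:", proc.stdout.read())  # b'' -- the buffered line was lost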

So again, it seems like

  • the test is timing out, perhaps testThreadingThreadJoin in particular
  • whatever timeout it hits still allows it to report that particular sub-test as passed
  • its buffered output is getting eaten
  • the error code is now being correctly reported (at the end of the log, jsapi-tests reports an exit status of 1)
  • this mostly happens on Windows

I glanced at four or five non-Windows failures. One or two were misfiled; the rest were different problems with the same symptom. It might be worth emitting an error line that treeherder would pick up on to distinguish these.
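
A hedged sketch of what that could look like (hypothetical; the exact line format treeherder's parser matches would need to be confirmed):

    # Hypothetical: emit a distinctive failure line for treeherder to highlight.
    print("TEST-UNEXPECTED-FAIL | autospider.py | stage %s exited with status %d"
          % (stage_name, rc), flush=True)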

Severity: normal → S3

At this point this bug only hides other failures that get starred against it, such as Bug 1805765, Bug 1823395, and Bug 1825562.
Closing, as there are no remaining occurrences of the failure this bug was filed for.

Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → WORKSFORME