Closed Bug 1405388 Opened 7 years ago Closed 6 years ago

Intermittent ESLINT (ES-only) [taskcluster:error] Task timeout after 1800 seconds.

Categories

(Developer Infrastructure :: Lint and Formatting, defect, P5)

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: intermittent-bug-filer, Unassigned)

Details

(Keywords: intermittent-failure)

Filed by: archaeopteryx [at] coole-files.de

https://treeherder.mozilla.org/logviewer.html#?job_id=134664254&repo=autoland

https://queue.taskcluster.net/v1/task/f_OmgKqQQaiy9vBbJFmpEw/runs/0/artifacts/public/logs/live_backing.log

[vcs 2017-10-03T16:05:27.237Z] 235738 files updated, 0 files merged, 0 files removed, 0 files unresolved
[vcs 2017-10-03T16:05:27.334Z] updated to 12107922ed6df7dc0a681f0f088817dc58e59e83
[task 2017-10-03T16:05:27.699Z] executing ['bash', '-cx', 'cd /builds/worker/checkouts/gecko/ && cp -r /build/node_modules_eslint node_modules && ln -s ../tools/lint/eslint/eslint-plugin-mozilla node_modules && ln -s ../tools/lint/eslint/eslint-plugin-spidermonkey-js node_modules && ./mach lint -l eslint -f treeherder --quiet\n']
[task 2017-10-03T16:05:27.703Z] + cd /builds/worker/checkouts/gecko/
[task 2017-10-03T16:05:27.703Z] + cp -r /build/node_modules_eslint node_modules
[task 2017-10-03T16:05:27.933Z] + ln -s ../tools/lint/eslint/eslint-plugin-mozilla node_modules
[task 2017-10-03T16:05:27.935Z] + ln -s ../tools/lint/eslint/eslint-plugin-spidermonkey-js node_modules
[task 2017-10-03T16:05:27.935Z] + ./mach lint -l eslint -f treeherder --quiet
[task 2017-10-03T16:05:28.750Z] New python executable in /builds/worker/checkouts/gecko/obj-x86_64-pc-linux-gnu/_virtualenv/bin/python2.7
[task 2017-10-03T16:05:28.751Z] Also creating executable in /builds/worker/checkouts/gecko/obj-x86_64-pc-linux-gnu/_virtualenv/bin/python
[task 2017-10-03T16:05:30.735Z] Installing setuptools, pip, wheel...done.
[task 2017-10-03T16:05:31.423Z] running build_ext
[task 2017-10-03T16:05:31.423Z] building 'psutil._psutil_linux' extension
[task 2017-10-03T16:05:31.423Z] creating build
[task 2017-10-03T16:05:31.423Z] creating build/temp.linux-x86_64-2.7
[task 2017-10-03T16:05:31.423Z] creating build/temp.linux-x86_64-2.7/psutil
[task 2017-10-03T16:05:31.423Z] x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fno-strict-aliasing -Wdate-time -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security -fPIC -DPSUTIL_VERSION=311 -I/usr/include/python2.7 -c psutil/_psutil_linux.c -o build/temp.linux-x86_64-2.7/psutil/_psutil_linux.o
[task 2017-10-03T16:05:31.423Z] creating build/lib.linux-x86_64-2.7
[task 2017-10-03T16:05:31.424Z] creating build/lib.linux-x86_64-2.7/psutil
[task 2017-10-03T16:05:31.424Z] x86_64-linux-gnu-gcc -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -Wdate-time -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wl,-Bsymbolic-functions -Wl,-z,relro -Wdate-time -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security build/temp.linux-x86_64-2.7/psutil/_psutil_linux.o -o build/lib.linux-x86_64-2.7/psutil/_psutil_linux.so
[task 2017-10-03T16:05:31.424Z] building 'psutil._psutil_posix' extension
[task 2017-10-03T16:05:31.424Z] x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fno-strict-aliasing -Wdate-time -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security -fPIC -I/usr/include/python2.7 -c psutil/_psutil_posix.c -o build/temp.linux-x86_64-2.7/psutil/_psutil_posix.o
[task 2017-10-03T16:05:31.424Z] x86_64-linux-gnu-gcc -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -Wdate-time -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wl,-Bsymbolic-functions -Wl,-z,relro -Wdate-time -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security build/temp.linux-x86_64-2.7/psutil/_psutil_posix.o -o build/lib.linux-x86_64-2.7/psutil/_psutil_posix.so
[task 2017-10-03T16:05:31.424Z] copying build/lib.linux-x86_64-2.7/psutil/_psutil_linux.so -> psutil
[task 2017-10-03T16:05:31.424Z] copying build/lib.linux-x86_64-2.7/psutil/_psutil_posix.so -> psutil
[task 2017-10-03T16:05:31.425Z] 

[taskcluster:error] Task timeout after 1800 seconds. Force killing container.
[taskcluster 2017-10-03 16:11:55.698Z] === Task Finished ===
[taskcluster 2017-10-03 16:11:55.699Z] Unsuccessful task run with exit code: -1 completed in 1859.796 seconds
Component: Buildduty → Lint
Product: Release Engineering → Testing
QA Contact: catlee
Chris, do we have any stats on how long the build runs take? I'm wondering if we're running up against the limit here on these VMs.

(And if so, how do we extend the timeout?)
Flags: needinfo?(catlee)
I don't know off the top of my head. Treeherder should have this data.
Flags: needinfo?(catlee)
Ok, according to https://activedata.allizom.org/tools/query.html#query_id=I0qzxxds most tasks are finishing in about 10 mins. So we don't need to extend the timeout. This looks more like something hung.

We haven't touched ESLint's versions recently, so my best guess is that this is a one-off glitch.
https://wiki.mozilla.org/Bugmasters#Intermittent_Test_Failure_Cleanup
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → INCOMPLETE
https://wiki.mozilla.org/Bug_Triage#Intermittent_Test_Failure_Cleanup
Status: REOPENED → RESOLVED
Closed: 7 years ago → 6 years ago
Resolution: --- → INCOMPLETE
Reopening since I was hit by this on autoland.

https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=984d714aa987dd15ffb9dbde66f8a06559b49077&selectedJob=152066833
Status: RESOLVED → REOPENED
Resolution: INCOMPLETE → ---
Summary: Intermittent eslint [taskcluster:error] Task timeout after 1800 seconds. Force killing container. → Intermittent eslint [taskcluster:error] Task timeout after 1800 seconds. Force killing container. [taskcluster:error] Task timeout after 3600 seconds. Force killing container. / [taskcluster:error] Task timeout after 5400 seconds. Force killing container.
This needs splitting out into multiple bugs. The original bug was for ESLint, but recent markings have mainly been "source-test-file-metadata-bugzilla-components".

Just looking briefly at the logs, these are likely to be different issues. The ESLint failure typically looks something like:

---------
+ ./mach lint -l eslint -f treeherder --quiet
New python executable in /builds/worker/checkouts/gecko/obj-x86_64-pc-linux-gnu/_virtualenv/bin/python2.7

... lots of venv setup stuff ...

Error processing command. Ignoring because optional. (optional:packages.txt:comm/build/virtualenv_packages.txt)

[taskcluster:error] Task timeout after 1800 seconds. Force killing container.
---------

The bugzilla metadata is something like:

---------
+ ./mach file-info bugzilla-automation /builds/worker/artifacts
WARNING: Not a supported OS_TARGET for NSPR in moz.build: "". Use --with-system-nspr
[taskcluster:error] Task timeout after 1800 seconds. Force killing container.
---------

They're obviously different. Timeouts shouldn't be treated the same for different builders imo.
Flags: needinfo?(sheriffs)
Andrew, any idea how we could get output in the very rare instances when this does fail? It looks like the output is the same, save for the task timeout message.

I'm wondering if there's a way we could at least tell whether ESLint is still running or something...
Flags: needinfo?(ahalberstadt)
We could busy wait for the eslint process here (pass in timeout=10 in a while loop):
https://searchfox.org/mozilla-central/source/tools/lint/eslint/__init__.py#93

Then we could test whether the PID is still alive and print the results to stdout or a file. I don't think that's something we'd want to land permanently, though we could do 1000 retriggers on try and hopefully reproduce it...

Maybe a better first step would be to add some actually useful print debugging that we *can* land. Even just knowing whether we get past that call to proc.wait() would be valuable.
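
For illustration only, here is a minimal sketch of that busy-wait idea. It is not the code at the searchfox link above: it assumes a plain standard-library subprocess.Popen object rather than whatever process wrapper the linter actually uses, and the name wait_with_heartbeat and its parameters are hypothetical.

```python
import subprocess
import sys
import time


def wait_with_heartbeat(proc, interval=10, out=sys.stdout):
    """Wait for `proc` to exit, printing a status line every `interval` seconds.

    Hypothetical helper: `proc` is assumed to be a subprocess.Popen instance,
    which is an assumption and not necessarily what the linter uses.
    """
    start = time.monotonic()
    while True:
        try:
            # Block for at most `interval` seconds instead of indefinitely.
            returncode = proc.wait(timeout=interval)
        except subprocess.TimeoutExpired:
            # The child is still alive; leave a trace in the log and retry.
            elapsed = time.monotonic() - start
            print("eslint (pid %d) still running after %.0fs"
                  % (proc.pid, elapsed), file=out)
            out.flush()
            continue
        # Reaching this line shows that we got past the wait call.
        print("eslint (pid %d) exited with code %d"
              % (proc.pid, returncode), file=out)
        out.flush()
        return returncode
```

With something like this in place, a hung task's log would show either a run of "still running" lines (the ESLint process itself is stuck) or an "exited" line followed by silence (the hang is somewhere after the wait).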
Flags: needinfo?(ahalberstadt)
Summary: Intermittent eslint [taskcluster:error] Task timeout after 1800 seconds. Force killing container. [taskcluster:error] Task timeout after 3600 seconds. Force killing container. / [taskcluster:error] Task timeout after 5400 seconds. Force killing container. → Intermittent ESLINT (ES-only) [taskcluster:error] Task timeout after 1800 seconds.
Product: Testing → Firefox Build System
https://wiki.mozilla.org/Bug_Triage#Intermittent_Test_Failure_Cleanup
Status: REOPENED → RESOLVED
Closed: 6 years ago → 6 years ago
Resolution: --- → INCOMPLETE
Product: Firefox Build System → Developer Infrastructure