Closed
Bug 1502269
Opened 6 years ago
Closed 6 years ago
docker-worker task still running even after it exceeded maxruntime
Categories
(Release Engineering :: Release Automation: Other, defect)
Tracking
(Not tracked)
RESOLVED
WONTFIX
People
(Reporter: jlund, Assigned: wcosta)
References
Details
maxruntime is 90m, I cancelled and reran the task after 115m:
https://tools.taskcluster.net/groups/SjVFa0xFSqe78-GZj-becQ/tasks/WMa_uLdcRSS6QHHPzGpQfg/details
Comment 1•6 years ago
I'm not sure what counts as 'task execution time'; if it's measured from when the 'Task Starting' log line is produced, that might explain it. The first log entry is at 03:12:25, and 'Task Starting' is at 03:44:08. It took over 20 minutes to prep the docker image from the downloaded archive.
[taskcluster 2018-10-26 03:21:46.261Z] Loading docker image from downloaded archive.
[taskcluster 2018-10-26 03:43:58.726Z] Image 'public/image.tar.zst' from task 'CMbXFR28R8SSwDl0GY6NTw' loaded. Using image ID sha256:b8c62222c5366444530fce8627a1fa8a52ddf5d25aceefed4e1bc6047ef99215.
[taskcluster 2018-10-26 03:44:08.093Z] === Task Starting ===
Comment 2•6 years ago
https://github.com/taskcluster/docker-worker/blob/master/src/lib/task.js#L976
It looks like the timeout is set up between the 'docker pull' and the 'docker start', so the extra-long image download accounts for this. That seems the fairest approach (infrastructure slowness shouldn't eat into a task's runtime), but it's worth highlighting.
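The behavior described above can be sketched as follows. This is not the actual docker-worker code, just an illustrative Node.js sketch of arming the maxRunTime timer only after the image pull completes, so pull time never counts against the task's runtime; the names `pullImage` and `runTask` are hypothetical.

```javascript
// Sketch (assumed, not docker-worker's real implementation): maxRunTime is
// enforced starting between "docker pull" and "docker start".
async function executeTask(task, { pullImage, runTask }) {
  // Image download/load happens first and is NOT bounded by maxRunTime.
  await pullImage(task.image);

  // The timer starts here, after the pull has finished.
  let timer;
  const timeout = new Promise((resolve, reject) => {
    timer = setTimeout(
      () => reject(new Error(`task exceeded maxRunTime of ${task.maxRunTime}s`)),
      task.maxRunTime * 1000,
    );
  });

  try {
    // Whichever settles first wins: the task itself or the deadline.
    return await Promise.race([runTask(task), timeout]);
  } finally {
    clearTimeout(timer);
  }
}
```

Under this model, a 20-minute image load shifts the entire 90-minute window later, which is consistent with the timestamps in comment 1.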
Comment 4•6 years ago
I think this is possibly a Docker-Worker issue. Wander, do you know what's going on here?
Flags: needinfo?(jhford) → needinfo?(wcosta)
Assignee
Comment 5•6 years ago
I am putting this in my backlog.
Assignee: nobody → wcosta
Status: NEW → ASSIGNED
Flags: needinfo?(wcosta)
Reporter
Comment 6•6 years ago
Hit this again with https://tools.taskcluster.net/groups/USzychG6QQCdkzutn7Ws_w/tasks/W1zTBzieSXC8WvwIqaH4Mw run 0, I think.
Also noteworthy: using the "terminate" UI button on that worker gave me an "internal server error": https://tools.taskcluster.net/provisioners/aws-provisioner-v1/worker-types/gecko-3-b-linux/workers/us-east-1/i-070095455bc22e18b
The server error made me think it wasn't a scopes issue.
I cancelled the task and reran it to hopefully unblock the release.
Summary: Update Verify task still running even after it exceeded maxruntime → docker-worker task still running even after it exceeded maxruntime
Comment 7•6 years ago
We will live with this until we move to generic-worker.
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX