Closed
Bug 1474953
Opened 7 years ago
Closed 5 years ago
a workers' recent tasks may not exist
Categories
(Taskcluster :: UI, defect)
Tracking
(Not tracked)
RESOLVED
WORKSFORME
People
(Reporter: dhouse, Assigned: hassan)
References
Details
(Whiteboard: taskcluster-web)
I have a worker that is not getting tasks when it requests them, even though there are tasks in the queue. Please remove/reset/clear this worker's definition in Taskcluster so that it does not have a missing task ID in recentTasks.
In the worker explorer (https://tools.taskcluster.net/provisioners/releng-hardware/worker-types/gecko-t-osx-1010/workers/mdc1/t-yosemite-r7-415), this error is displayed:
```
Error ResourceNotFound
eXaoNqntTqirwYAIuPYzcQ does not correspond to a task that exists. Are you sure this task has already been submitted?
```
Here is the current worker state from the queue API:
```
{
"workerType": "gecko-t-osx-1010",
"provisionerId": "releng-hardware",
"workerId": "t-yosemite-r7-415",
"workerGroup": "mdc1",
"recentTasks": [
{
"taskId": "HuE1-zbqQsWfcnHqr6ZZaQ",
"runId": 0
},
{
"taskId": "O4h25IxNQF6oNwHJTtEgRw",
"runId": 0
},
{
"taskId": "LxvEBScLRICJyX9hcL-FIw",
"runId": 0
},
{
"taskId": "Z2oqfWayS4GMAk7o780hZg",
"runId": 0
},
{
"taskId": "eXaoNqntTqirwYAIuPYzcQ",
"runId": 0
},
{
"taskId": "FBcNC4elRm-OFIQaMc2G4w",
"runId": 0
},
{
"taskId": "dGhh_iXdTu6rrfKfMeRXjw",
"runId": 0
},
{
"taskId": "HfDBWMjDQESoxI8k2E0TxQ",
"runId": 0
},
{
"taskId": "WU-P-dBkSCGuqbGy2nzLxQ",
"runId": 0
},
{
"taskId": "aFNOVf4tRsmczcovtMjgAQ",
"runId": 0
},
{
"taskId": "aDii_Q6STnK61ZzgiY674w",
"runId": 0
},
{
"taskId": "Y8l6yREoSDG91Kj5Soth-Q",
"runId": 0
},
{
"taskId": "TTBLg2LGS5S2-LBK8CcxbQ",
"runId": 0
},
{
"taskId": "Pto0whIXQTyEKkiNBu4T9A",
"runId": 0
},
{
"taskId": "CBnIRFlQSkerkeA4sUv0Cg",
"runId": 0
},
{
"taskId": "MSbGeKb9RqOZWwJn1rUf6A",
"runId": 0
},
{
"taskId": "Eif3Usc9SE2eoCrWHrnWnA",
"runId": 0
},
{
"taskId": "X6Hxk3TkSiC_-I_TLcRuow",
"runId": 0
},
{
"taskId": "BZCmtql-Tm2h_iDJyaL_vw",
"runId": 0
},
{
"taskId": "JaMpPUB2TqaNqdW_tqX26w",
"runId": 0
}
],
"expires": "2018-07-12T15:47:27.437Z",
"firstClaim": "2018-05-10T13:33:11.530Z",
[...]
}
```
Comment 1•7 years ago
That's just a tools bug -- that error should be swallowed and the UI should omit the row or show it as missing. Presumably that task has expired.
The only state stored in the queue for a worker is this sort of advisory stuff used for debugging via the UI -- none of it affects claiming. So something else is wrong with the claiming process.
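As a rough illustration of the behaviour described above (swallow the error for the one task and show it as missing, rather than failing the whole worker view), here is a minimal Python sketch. It is not the taskcluster-tools code: `TaskNotFound`, `build_recent_task_rows`, `fake_fetch_status`, and the `{"state": ...}` response shape are all hypothetical names and assumptions chosen for the example.

```python
from typing import Callable, Dict, List


class TaskNotFound(Exception):
    """Hypothetical stand-in for the queue's ResourceNotFound (HTTP 404) error."""


def build_recent_task_rows(
    recent_tasks: List[Dict],
    fetch_status: Callable[[str], Dict],
) -> List[Dict]:
    """Return one table row per recent task, tolerating tasks that no longer exist."""
    rows = []
    for entry in recent_tasks:
        task_id = entry["taskId"]
        try:
            # Assumed response shape: a dict with a "state" key.
            state = fetch_status(task_id)["state"]
        except TaskNotFound:
            # Swallow the 404 for this one task instead of failing the whole view.
            state = "missing"
        rows.append({"taskId": task_id, "runId": entry["runId"], "state": state})
    return rows


def fake_fetch_status(task_id: str) -> Dict:
    """Hypothetical fetcher that pretends one task has expired."""
    if task_id == "eXaoNqntTqirwYAIuPYzcQ":
        raise TaskNotFound(task_id)
    return {"state": "completed"}


rows = build_recent_task_rows(
    [
        {"taskId": "Z2oqfWayS4GMAk7o780hZg", "runId": 0},
        {"taskId": "eXaoNqntTqirwYAIuPYzcQ", "runId": 0},
    ],
    fake_fetch_status,
)
for row in rows:
    print(row["taskId"], row["state"])
```

The point of the sketch is only that the error handling sits inside the per-task loop, so one expired task cannot take down the rows for the other tasks.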
Component: Operations → Tools
Summary: Invalid task id. Please clear/remove worker definition. → a workers' recent tasks may not exist
I'm okay with the tasks not existing as long as that is not breaking the claiming process.
From the worker logs, it looks like generic-worker is requesting tasks and getting nothing (despite tasks being in the queue). What might cause that?
Comment 3•7 years ago
We looked in the logs and were unable to answer the question in comment 2. But let's leave this open to fix the UI issue.
This worker is taking tasks now. It "reclaimed" a task soon after I created this bug and has been running tasks since then, but the tools UI still shows the error. I expect that once the bad taskId "scrolls" out of recentTasks, the UI will display this worker correctly.
I see this problem happened again for another worker:
https://tools.taskcluster.net/provisioners/releng-hardware/worker-types/gecko-t-osx-1010/workers/mdc1/t-yosemite-r7-357
```
Error ResourceNotFound
eTZxpnbWRxWqU825BrrRkQ does not correspond to a task that exists. Are you sure this task exists?
{
"method": "status",
"params": {
"taskId": "eTZxpnbWRxWqU825BrrRkQ"
},
"payload": {},
"time": "2018-07-26T15:14:37.308Z"
}
```
```
$ wget -O - https://queue.taskcluster.net/v1/task/eTZxpnbWRxWqU825BrrRkQ/status
--2018-07-26 09:20:47-- https://queue.taskcluster.net/v1/task/eTZxpnbWRxWqU825BrrRkQ/status
Loaded CA certificate '/etc/ssl/certs/ca-certificates.crt'
Resolving queue.taskcluster.net (queue.taskcluster.net)... 184.72.216.59, 50.19.109.135, 50.16.233.7
Connecting to queue.taskcluster.net (queue.taskcluster.net)|184.72.216.59|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2018-07-26 09:20:48 ERROR 404: Not Found.
```
```
$ wget -q -O - https://queue.taskcluster.net/v1/provisioners/releng-hardware/worker-types/gecko-t-osx-1010/workers/mdc1/t-yosemite-r7-357
{
"workerType": "gecko-t-osx-1010",
"provisionerId": "releng-hardware",
"workerId": "t-yosemite-r7-357",
"workerGroup": "mdc1",
"recentTasks": [
{
"taskId": "eTZxpnbWRxWqU825BrrRkQ",
"runId": 0
},
{
"taskId": "TvfbKwmvSj29Q-kHS9kiyQ",
"runId": 0
},
{
"taskId": "DqRilpRuSyeGQgUQTkKPCQ",
"runId": 0
},
{
"taskId": "BsNAhcmhQjuCldEOlqSRkg",
"runId": 0
},
{
"taskId": "SdiZaW9_T9mh816wP5_JZQ",
"runId": 0
},
{
"taskId": "aRtdkQUJSAu2Ff2g59f73Q",
"runId": 0
},
{
"taskId": "aelzKWQkS52xSg8HyTZKbQ",
"runId": 0
},
{
"taskId": "fwRJXdPXT3GQKG0nUsg1PA",
"runId": 0
},
{
"taskId": "aRbAnwa7Q8OWhegJWQ0pPQ",
"runId": 0
},
{
"taskId": "PSUmmEg2Te-ifiFi8B7jGA",
"runId": 0
},
{
"taskId": "GKFj3kjORwSbGkrJ-8PSGw",
"runId": 0
},
{
"taskId": "Vj3wO7H6Q5-R7Je9IuYPsw",
"runId": 0
},
{
"taskId": "FJchuXl4TV-OnVCmtTJycA",
"runId": 0
},
{
"taskId": "FG6NKFhzTYeCW2CRqQ43nA",
"runId": 0
},
{
"taskId": "JJ0DETYPQRiKik94R-KvTg",
"runId": 0
},
{
"taskId": "Pg1tJeoMTPioZnVNKUSrMA",
"runId": 0
},
{
"taskId": "RAZRPgoZQ-qHf8H9LfBSLg",
"runId": 0
},
{
"taskId": "QEFHAK_zRsidfTgLsGB1Ww",
"runId": 0
},
{
"taskId": "J2hVj6GGQwqXKxbk8MkZIg",
"runId": 0
},
{
"taskId": "SJas7EHxTP6PlxPb_r9nmw",
"runId": 0
}
],
"expires": "2018-06-13T13:17:02.366Z",
"firstClaim": "2017-10-18T18:27:29.898Z",
"quarantineUntil": "3018-06-12T13:29:07.000Z",
"actions": [
{
"name": "ping",
"title": "ping",
"context": "worker",
"url": "https://roller1.srv.releng.mdc1.mozilla.com:443/api/v1/workers/<workerId>/jobs?provisioner_id=<provisionerId>&worker_type=<workerType>&worker_group=<workerGroup>&task_name=ping",
"method": "POST",
"description": "ping server"
},
{
"name": "reboot",
"title": "reboot",
"context": "worker",
"url": "https://roller1.srv.releng.mdc1.mozilla.com:443/api/v1/workers/<workerId>/jobs?provisioner_id=<provisionerId>&worker_type=<workerType>&worker_group=<workerGroup>&task_name=reboot",
"method": "POST",
"description": "reboot hardware"
}
]
}
```
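The two wget checks above (fetch the worker record, then probe a task's status endpoint) could be combined into one small script that lists every recentTasks entry whose status returns 404. This is only a sketch: the function names are hypothetical, and the base URL is the one used in the commands above, which may not be reachable from every environment. The probe is injectable so the logic can be tried without network access.

```python
import json
import urllib.error
import urllib.request
from typing import Callable, List

# Host and path layout taken from the wget commands above.
QUEUE_BASE = "https://queue.taskcluster.net/v1"


def task_exists(task_id: str) -> bool:
    """Return False if the status endpoint answers 404, True on success."""
    url = f"{QUEUE_BASE}/task/{task_id}/status"
    try:
        with urllib.request.urlopen(url, timeout=10):
            return True
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise  # other HTTP errors are not evidence that the task is missing


def missing_recent_tasks(
    worker_json: str,
    exists: Callable[[str], bool] = task_exists,
) -> List[str]:
    """Return the taskIds in recentTasks for which `exists` reports False."""
    worker = json.loads(worker_json)
    return [
        entry["taskId"]
        for entry in worker.get("recentTasks", [])
        if not exists(entry["taskId"])
    ]


# Offline demonstration with a fake probe; the "exists" answers here are pretend values.
sample = json.dumps(
    {
        "recentTasks": [
            {"taskId": "eTZxpnbWRxWqU825BrrRkQ", "runId": 0},
            {"taskId": "TvfbKwmvSj29Q-kHS9kiyQ", "runId": 0},
        ]
    }
)
print(missing_recent_tasks(sample, exists=lambda t: t != "eTZxpnbWRxWqU825BrrRkQ"))
```

To run it against a live queue, pass the body returned by the worker endpoint as `worker_json` and leave `exists` at its default.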
Comment 6•7 years ago
Per earlier discussion, this is a UI bug, and it's not fixed yet so not surprising it still occurs :)
Comment 7•6 years ago
Commit pushed to master at https://github.com/taskcluster/taskcluster-tools
https://github.com/taskcluster/taskcluster-tools/commit/d801d11d7b3effe0d56603c1e1d88f90c14b9d95
Bug 1474953 - catch task status 404s and show in table (#582)
Comment 8•6 years ago (Assignee)
This has been implemented in taskcluster-tools but still needs to be done in taskcluster-web. I can take care of the rest. Thank you Dave for the pull request :)
Updated•6 years ago (Assignee)
Assignee: nobody → helfi92
Whiteboard: taskcluster-web
Updated•6 years ago
Component: Tools → UI and Tools
Comment 10•5 years ago (Reporter)
Closing this sounds good.
I think this was a problem for CIDuty when they could not view workers in the UI because the task data had been dropped (and the UI would fail the entire worker view/page when it could not load the tasks).
Flags: needinfo?(dhouse)
Updated•5 years ago
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → WORKSFORME