Closed Bug 1474539 (t-yosemite-r7-415) Opened 7 years ago Closed 7 years ago

[MDC1] t-yosemite-r7-415 problem tracking

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dlabici, Unassigned)

The machine was missing from TC, so I went ahead and re-imaged it. The Deploy Studio email reported Successful; waiting for the machine to take jobs.
Rebooted the machine. It has been 3 hours since the reimage and it has yet to claim a job. If it takes no tasks after the reboot, I will reimage it again.
The reimage looks successful/good. I checked on the machine and it is correctly running generic-worker and has no puppet errors. This may be a problem in the worker's definition in taskcluster. In the taskcluster worker explorer, this error is displayed:
```
Error ResourceNotFound eXaoNqntTqirwYAIuPYzcQ does not correspond to a task that exists. Are you sure this task has already been submitted?
```
On the host, 415, the logs only show that it requested a task. I tried redefining the worker as expiring last year and leaving it unquarantined. I expect the worker to be cleaned up because of the old expiration, and I want to see whether it gets redefined and avoids the above error.
The worker info shows the reported bad task id as a recent task. So I think the tools.taskcluster.net ui is failing when requesting that task's info from the queue:
```
{'workerType': 'gecko-t-osx-1010',
 'provisionerId': 'releng-hardware',
 'workerId': 't-yosemite-r7-415',
 'workerGroup': 'mdc1',
 'recentTasks': [{'taskId': 'CWeq691CR32efHiAsVLvjw', 'runId': 0},
                 {'taskId': 'HuE1-zbqQsWfcnHqr6ZZaQ', 'runId': 0},
                 {'taskId': 'O4h25IxNQF6oNwHJTtEgRw', 'runId': 0},
                 {'taskId': 'LxvEBScLRICJyX9hcL-FIw', 'runId': 0},
                 {'taskId': 'Z2oqfWayS4GMAk7o780hZg', 'runId': 0},
                 {'taskId': 'eXaoNqntTqirwYAIuPYzcQ', 'runId': 0},
                 {'taskId': 'FBcNC4elRm-OFIQaMc2G4w', 'runId': 0},
                 {'taskId': 'dGhh_iXdTu6rrfKfMeRXjw', 'runId': 0},
                 {'taskId': 'HfDBWMjDQESoxI8k2E0TxQ', 'runId': 0},
                 {'taskId': 'WU-P-dBkSCGuqbGy2nzLxQ', 'runId': 0},
                 {'taskId': 'aFNOVf4tRsmczcovtMjgAQ', 'runId': 0},
                 {'taskId': 'aDii_Q6STnK61ZzgiY674w', 'runId': 0},
                 {'taskId': 'Y8l6yREoSDG91Kj5Soth-Q', 'runId': 0},
                 {'taskId': 'TTBLg2LGS5S2-LBK8CcxbQ', 'runId': 0},
                 {'taskId': 'Pto0whIXQTyEKkiNBu4T9A', 'runId': 0},
                 {'taskId': 'CBnIRFlQSkerkeA4sUv0Cg', 'runId': 0},
                 {'taskId': 'MSbGeKb9RqOZWwJn1rUf6A', 'runId': 0},
                 {'taskId': 'Eif3Usc9SE2eoCrWHrnWnA', 'runId': 0},
                 {'taskId': 'X6Hxk3TkSiC_-I_TLcRuow', 'runId': 0},
                 {'taskId': 'BZCmtql-Tm2h_iDJyaL_vw', 'runId': 0}],
 'expires': '2017-07-11T15:47:23.835Z',
 'firstClaim': '2018-05-10T13:33:11.530Z',
 'actions': [{'name': 'ping',
              'title': 'ping',
              'context': 'worker',
              'url': 'https://roller1.srv.releng.mdc1.mozilla.com:443/api/v1/workers/<workerId>/jobs?provisioner_id=<provisionerId>&worker_type=<workerType>&worker_group=<workerGroup>&task_name=ping',
              'method': 'POST',
              'description': 'ping server'},
             {'name': 'reboot',
              'title': 'reboot',
              'context': 'worker',
              'url': 'https://roller1.srv.releng.mdc1.mozilla.com:443/api/v1/workers/<workerId>/jobs?provisioner_id=<provisionerId>&worker_type=<workerType>&worker_group=<workerGroup>&task_name=reboot',
              'method': 'POST',
              'description': 'reboot hardware'}]}
```
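For anyone repeating this check, here is a minimal Python sketch of the two observations made so far: that the task ID from the ResourceNotFound error appears in the worker's `recentTasks`, and that the `expires` value is already in the past. It only inspects an abbreviated copy of the record quoted above and makes no calls to the queue. The helper names (`parse_worker_info`, `recent_task_ids`, `is_expired`) are hypothetical and are not part of any taskcluster client.

```python
import ast
from datetime import datetime, timezone

# Abbreviated copy of the worker record quoted in this bug. It is printed in
# Python-repr style (single quotes), so ast.literal_eval is used, not json.
WORKER_INFO_TEXT = (
    "{'workerId': 't-yosemite-r7-415', "
    "'recentTasks': [{'taskId': 'Z2oqfWayS4GMAk7o780hZg', 'runId': 0}, "
    "{'taskId': 'eXaoNqntTqirwYAIuPYzcQ', 'runId': 0}, "
    "{'taskId': 'FBcNC4elRm-OFIQaMc2G4w', 'runId': 0}], "
    "'expires': '2017-07-11T15:47:23.835Z', "
    "'firstClaim': '2018-05-10T13:33:11.530Z'}"
)

# Task ID taken from the ResourceNotFound error shown in the worker explorer.
BAD_TASK_ID = "eXaoNqntTqirwYAIuPYzcQ"


def parse_worker_info(text):
    """Parse the repr-style worker record into a dict (hypothetical helper)."""
    return ast.literal_eval(text)


def recent_task_ids(info):
    """Return the task IDs listed under 'recentTasks', in order."""
    return [entry["taskId"] for entry in info.get("recentTasks", [])]


def parse_ts(value):
    """Parse timestamps of the form 2017-07-11T15:47:23.835Z as UTC."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def is_expired(info, now=None):
    """True if the record's 'expires' timestamp is earlier than 'now'."""
    if now is None:
        now = datetime.now(timezone.utc)
    return parse_ts(info["expires"]) < now


info = parse_worker_info(WORKER_INFO_TEXT)
print("bad task listed as recent:", BAD_TASK_ID in recent_task_ids(info))
print("worker record expired:", is_expired(info))
```

If both checks are true, the worker record itself looks consistent with what was described, which fits the suspicion that the failure is in looking up the task rather than in the worker.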
Depends on: 1474953
Summary: t-yosemite-r7-415.test.releng.mdc1.mozilla.com problem tracking → [MDC1] t-yosemite-r7-415 problem tracking
No problems with this machine right now. The bad-task issue is a bug in taskcluster. This worker is running tasks, reporting logs to papertrail, and available to ssh.
```
[dhouse@rejh2.srv.releng.mdc1.mozilla.com ~]$ ssh root@t-yosemite-r7-415.test.releng.mdc1.mozilla.com
This host is set to follow security level "low"
Unauthorized access prohibited
[root@t-yosemite-r7-415.test.releng.mdc1.mozilla.com ~]# ps -ef|grep worker
   29   330     1   0  8:21AM ??         0:00.00 /bin/bash /usr/local/bin/run-generic-worker.sh run --config /etc/generic-worker.config
   29   332   330   0  8:21AM ??         0:00.63 /usr/local/bin/generic-worker run --config /etc/generic-worker.config
   29   333   330   0  8:21AM ??         0:00.02 logger -t generic-worker -s
    0   654   647   0  8:23AM ttys000    0:00.00 grep worker
[root@t-yosemite-r7-415.test.releng.mdc1.mozilla.com ~]# ls -ltr /Users/cltbld/tasks/task*/logs/
total 488
-rw-r--r--  1 cltbld  staff       0 Jul 26 08:21 log_fatal.log
-rw-r--r--  1 cltbld  staff       0 Jul 26 08:21 log_error.log
-rw-r--r--  1 cltbld  staff       0 Jul 26 08:21 log_critical.log
-rw-r--r--  1 cltbld  staff    1942 Jul 26 08:21 localconfig.json
-rw-r--r--  1 cltbld  staff      43 Jul 26 08:23 log_warning.log
-rw-r--r--  1 cltbld  staff  110400 Jul 26 08:23 log_raw.log
-rw-r--r--  1 cltbld  staff  129140 Jul 26 08:23 log_info.log
[root@t-yosemite-r7-415.test.releng.mdc1.mozilla.com ~]# uptime
 8:24  up 3 mins, 2 users, load averages: 3.43 2.08 0.94
```
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard