Closed Bug 1462820 (T-W1064-MS-118) Opened 7 years ago Closed 6 years ago

[MDC1] T-W1064-MS-118 problem tracking

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: riman, Unassigned)

References

Details

This worker has not taken any tasks for the past 5 hours. Checked in Nagios: Host Status: DOWN (for 0d 5h 5m 49s)
The host status is Critical and it has now been down for 23 hours, with 100% packet loss. I tried rebooting the worker on tools.taskcluster.net but I'm getting a 404 error. Yes, I am behind the VPN, and yes, I added the SSL certificate.
Currently the worker cannot be found on Taskcluster. I have rebooted the machine via iLO. We will continue to monitor it.
Checked the worker. It seems to be back on Taskcluster and has taken jobs. I will close the ticket for now, as the problem seems to be fixed.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Re-opened the bug to continue tracking it.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Rebooted and re-imaged the machine (as it wasn't visible on Taskcluster). It is currently available and has taken jobs.
Summary: T-W1064-MS-118 problem tracking → [MDC1]T-W1064-MS-118 problem tracking
Summary: [MDC1]T-W1064-MS-118 problem tracking → [MDC1] T-W1064-MS-118 problem tracking
Alias: T-W1064-MS-118
Depends on: 1452133
I've rebooted the worker and it's running jobs now.
Status: REOPENED → RESOLVED
Closed: 7 years ago → 6 years ago
Resolution: --- → FIXED
Worker was not taking jobs. Logs were showing: T-W1064-MS-118.mdc1.mozilla.com Service_Control_Manager: The sshd service terminated unexpectedly. A reboot did nothing, so I reimaged it. It has started working.
Re-opened the bug. Worker is not taking jobs. Tried rebooting, then resetting the BIOS, and then reimaging. It seems that nothing helped. On Papertrail the last entry is from 27.08.2018.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Status: REOPENED → RESOLVED
Closed: 6 years ago → 6 years ago
Resolution: --- → FIXED
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Depends on: 1492238
The machine seems to be up and running tasks. We will close the bug for now. If the problem persists in the future, we will re-open the bug.
Status: REOPENED → RESOLVED
Closed: 6 years ago → 6 years ago
Resolution: --- → FIXED

Re-opening the bug. The machine is not available on Taskcluster. There is no document on the ServiceNow portal, and I couldn't find any recent logs on Papertrail.

Status: RESOLVED → REOPENED
Resolution: FIXED → ---

The machine seems to be up and running and taking jobs.
https://tools.taskcluster.net/provisioners/releng-hardware/worker-types/gecko-t-win10-64-hw/workers/mdc1/T-W1064-MS-118
We will close the bug for now. If the problem persists in the future, we will re-open this bug.

Status: REOPENED → RESOLVED
Closed: 6 years ago → 6 years ago
Resolution: --- → FIXED
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard