Closed
Bug 1477779
Opened 7 years ago
Closed 7 years ago
[MDC2] t-yosemite-r7-229.test.releng.mdc2.mozilla.com. is unreachable
Categories
(Infrastructure & Operations :: DCOps, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: dhouse, Assigned: van)
References
Details
(Whiteboard: REQ0260221, REQ0259960, requires on site vist, REQ0239167)
Please physically check and reboot+reimage t-yosemite-r7-229.test.releng.mdc2.mozilla.com
It does not respond to ping or ssh, and power-cycling it over SNMP did not bring it back:
Checking power:
```
# snmpget -v 2c -c public pdu1.gc131.ops.releng.mdc2.mozilla.com 1.3.6.1.4.1.1718.3.2.3.1.5.1.3.4
iso.3.6.1.4.1.1718.3.2.3.1.5.1.3.4 = INTEGER: 1
```
Then I powered it off, waited a few seconds, powered it back on, and finally started pinging it to see whether it would come back (no response):
```
# snmpget -v 2c -c public pdu1.gc131.ops.releng.mdc2.mozilla.com 1.3.6.1.4.1.1718.3.2.3.1.5.1.3.4
iso.3.6.1.4.1.1718.3.2.3.1.5.1.3.4 = INTEGER: 1
# snmpset -v 2c -c secret pdu1.gc131.ops.releng.mdc2.mozilla.com 1.3.6.1.4.1.1718.3.2.3.1.11.1.3.4 i 2
iso.3.6.1.4.1.1718.3.2.3.1.11.1.3.4 = INTEGER: 2
# snmpget -v 2c -c public pdu1.gc131.ops.releng.mdc2.mozilla.com 1.3.6.1.4.1.1718.3.2.3.1.5.1.3.4
iso.3.6.1.4.1.1718.3.2.3.1.5.1.3.4 = INTEGER: 0
# snmpget -v 2c -c public pdu1.gc131.ops.releng.mdc2.mozilla.com 1.3.6.1.4.1.1718.3.2.3.1.5.1.3.4
iso.3.6.1.4.1.1718.3.2.3.1.5.1.3.4 = INTEGER: 0
# snmpset -v 2c -c secret pdu1.gc131.ops.releng.mdc2.mozilla.com 1.3.6.1.4.1.1718.3.2.3.1.11.1.3.4 i 1
iso.3.6.1.4.1.1718.3.2.3.1.11.1.3.4 = INTEGER: 1
# snmpget -v 2c -c public pdu1.gc131.ops.releng.mdc2.mozilla.com 1.3.6.1.4.1.1718.3.2.3.1.5.1.3.4
iso.3.6.1.4.1.1718.3.2.3.1.5.1.3.4 = INTEGER: 1
# ping t-yosemite-r7-229.test.releng.mdc2.mozilla.com
PING t-yosemite-r7-229.test.releng.mdc2.mozilla.com (10.51.56.45) 56(84) bytes of data.
^C
--- t-yosemite-r7-229.test.releng.mdc2.mozilla.com ping statistics ---
60 packets transmitted, 0 received, 100% packet loss, time 59297ms
```
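For reference, the manual sequence above could be scripted roughly as follows. This is only a sketch: the PDU host, target host and OIDs are copied from the transcript, while the function names, the environment variables (`PDU`, `TARGET`, `READ_COMMUNITY`, `WRITE_COMMUNITY`, `OFF_SECONDS`, `ON_SECONDS`) and the wait times are made up for illustration. The meaning of the integer values (status 0 = off, 1 = on; action 2 = off, 1 = on) is inferred from the transcript, not taken from vendor documentation.

```shell
#!/bin/sh
# Sketch only: power-cycle one PDU outlet over SNMP, then wait for ping.
# Hosts and OIDs come from the transcript; everything else is hypothetical.

PDU="${PDU:-pdu1.gc131.ops.releng.mdc2.mozilla.com}"
TARGET="${TARGET:-t-yosemite-r7-229.test.releng.mdc2.mozilla.com}"
READ_COMMUNITY="${READ_COMMUNITY:-public}"
# WRITE_COMMUNITY must be supplied by the caller; it is not hard-coded here.
STATUS_OID="1.3.6.1.4.1.1718.3.2.3.1.5.1.3.4"
ACTION_OID="1.3.6.1.4.1.1718.3.2.3.1.11.1.3.4"

# Extract the integer from a reply line such as
# "iso.3.6.1.4.1.1718.3.2.3.1.5.1.3.4 = INTEGER: 1".
snmp_int() {
    sed -n 's/.*INTEGER: *\([0-9][0-9]*\).*/\1/p'
}

# Print the current outlet status (0 or 1 in the transcript).
outlet_status() {
    snmpget -v 2c -c "$READ_COMMUNITY" "$PDU" "$STATUS_OID" | snmp_int
}

# Send a control action to the outlet (2 = off, 1 = on in the transcript).
outlet_action() {
    snmpset -v 2c -c "$WRITE_COMMUNITY" "$PDU" "$ACTION_OID" i "$1" >/dev/null
}

# Power off, confirm off, power on, confirm on.
power_cycle() {
    outlet_action 2 || return 1
    sleep "${OFF_SECONDS:-5}"
    if [ "$(outlet_status)" != "0" ]; then
        echo "outlet did not report off" >&2
        return 1
    fi
    outlet_action 1 || return 1
    sleep "${ON_SECONDS:-5}"
    if [ "$(outlet_status)" != "1" ]; then
        echo "outlet did not report on" >&2
        return 1
    fi
}

# Send up to $1 single pings, one per second; succeed on the first reply.
wait_for_ping() {
    tries=$1
    while [ "$tries" -gt 0 ]; do
        ping -c 1 "$TARGET" >/dev/null 2>&1 && return 0
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}
```

To use it, source the file, export `WRITE_COMMUNITY`, and run `power_cycle && wait_for_ping 60`. Checking the outlet status after each action mirrors the manual `snmpget` calls in the transcript.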
Please physically check this machine and reboot it again. It does not need to be reimaged.
I'm not sure if anything was done, but this machine is responding today. I cycled it again with SNMP power off and then on (it did not respond to ping/ssh beforehand, although the PDU showed it had power), and it then came up and responded to ping and then ssh (successful login).
I'll close this. If the machine gets stuck and stops responding to ping/ssh again, I'll open a new bug.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
The problem has repeated. This machine appears fine after a reboot, but then stops responding to ping/ssh and no logs are forwarded.
Please physically inspect this machine and netboot/reimage it.
Comment 4 (Assignee) • 7 years ago
opened REQ0239167 for reimage.
Assignee: server-ops-dcops → vle
Whiteboard: REQ0239167
Comment 5 (Assignee) • 7 years ago
will need to check next visit.
07-26-2018 18:04 EDT - Nicholas Trout Additional comments
Could not pull up a display on any of the three affected Mac minis. Spoke to Van. He will correct this on his next visit.
Updated (Assignee) • 7 years ago
Whiteboard: REQ0239167 → requires on site vist, REQ0239167
Updated (Assignee) • 7 years ago
Summary: t-yosemite-r7-229.test.releng.mdc2.mozilla.com. is unreachable → [MDC2] -yosemite-r7-229.test.releng.mdc2.mozilla.com. is unreachable
Comment 6 (Assignee) • 7 years ago
back online after reimage.
vle@DESKTOP-3HK51T3:~$ fping t-yosemite-r7-229.test.releng.mdc2.mozilla.com
t-yosemite-r7-229.test.releng.mdc2.mozilla.com is alive
Status: REOPENED → RESOLVED
Closed: 7 years ago → 7 years ago
Resolution: --- → FIXED
Comment 7 • 7 years ago
Looks like the worker is not reachable over ssh again. However, it does respond to ping.
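Since the failure mode here is "answers ping but not ssh", a quick check that tells the two states apart may help when triaging. The following is a minimal sketch under stated assumptions: `classify_host` is a hypothetical helper, it assumes key-based ssh login is already set up (with `BatchMode=yes` a password prompt counts as a failure), and a host that blocks ICMP but accepts ssh would be reported as unreachable.

```shell
#!/bin/sh
# Sketch only: report whether a host answers ping, ssh, or neither.
# Prints one of: unreachable, ping-only, ping+ssh.
classify_host() {
    host=$1
    # One ICMP echo request; no reply means we stop here.
    if ! ping -c 1 "$host" >/dev/null 2>&1; then
        echo "unreachable"
        return 0
    fi
    # Non-interactive ssh login that runs `true` and exits.
    if ssh -n -o BatchMode=yes -o ConnectTimeout=5 "$host" true >/dev/null 2>&1; then
        echo "ping+ssh"
    else
        echo "ping-only"
    fi
}
```

For example, `classify_host t-yosemite-r7-229.test.releng.mdc2.mozilla.com` would print `ping-only` for the state described in this comment.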
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Summary: [MDC2] -yosemite-r7-229.test.releng.mdc2.mozilla.com. is unreachable → [MDC2] t-yosemite-r7-229.test.releng.mdc2.mozilla.com. is unreachable
Comment 8 (Assignee) • 7 years ago
opened REQ0259960 with QTS for reimage.
Whiteboard: requires on site vist, REQ0239167 → REQ0259960, requires on site vist, REQ0239167
Comment 9 (Assignee) • 7 years ago
QTS might have missed this one, opened REQ0260221 for reimage.
Whiteboard: REQ0259960, requires on site vist, REQ0239167 → REQ0260221, REQ0259960, requires on site vist, REQ0239167
Comment 10 (Assignee) • 7 years ago
QTS reimaged the mini.
vle@DESKTOP-3HK51T3:~$ fping t-yosemite-r7-229.test.releng.mdc2.mozilla.com
t-yosemite-r7-229.test.releng.mdc2.mozilla.com is alive
Status: REOPENED → RESOLVED
Closed: 7 years ago → 7 years ago
Resolution: --- → FIXED
Comment 11 • 7 years ago
Seems like the worker is not reachable once again.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 12 • 7 years ago
The machine seems to be up and running and taking jobs.
https://tools.taskcluster.net/provisioners/releng-hardware/worker-types/gecko-t-osx-1010/workers/mdc2/t-yosemite-r7-229
We will close the bug for now. If the problem persists, we will reopen this bug.
Status: REOPENED → RESOLVED
Closed: 7 years ago → 7 years ago
Resolution: --- → FIXED