Closed Bug 1477385 Opened 7 years ago Closed 7 years ago

mac-v2-signing13.srv.releng.mdc2.mozilla.com is unreachable

Categories

(Infrastructure & Operations :: DCOps, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dhouse, Assigned: van)

References

Details

(Whiteboard: REQ0239132 , REQ0238705)

Please netboot/reimage mac-v2-signing13.srv.releng.mdc2.mozilla.com.

The machine was not allowing me in through ssh (it showed a password prompt, and none of the gpg private passwords matched), and we have no logs reported to papertrail. I attempted an snmp reboot of the machine (pdu1.gc131.ops.releng.mdc2.mozilla.com:ba13), and also powering it off, waiting (verified off with snmpget), and then powering it back on, but each time it has not come back up to respond to ssh or ping.

```
# snmpget -v 2c -c public pdu1.gc131.ops.releng.mdc2.mozilla.com 1.3.6.1.4.1.1718.3.2.3.1.5.2.1.13
iso.3.6.1.4.1.1718.3.2.3.1.5.2.1.13 = INTEGER: 1
# snmpset -v 2c -c secret pdu1.gc131.ops.releng.mdc2.mozilla.com 1.3.6.1.4.1.1718.3.2.3.1.11.2.1.13 i 2
iso.3.6.1.4.1.1718.3.2.3.1.11.2.1.13 = INTEGER: 2
# snmpget -v 2c -c public pdu1.gc131.ops.releng.mdc2.mozilla.com 1.3.6.1.4.1.1718.3.2.3.1.5.2.1.13
iso.3.6.1.4.1.1718.3.2.3.1.5.2.1.13 = INTEGER: 0
# snmpget -v 2c -c public pdu1.gc131.ops.releng.mdc2.mozilla.com 1.3.6.1.4.1.1718.3.2.3.1.5.2.1.13
iso.3.6.1.4.1.1718.3.2.3.1.5.2.1.13 = INTEGER: 0
# snmpset -v 2c -c secret pdu1.gc131.ops.releng.mdc2.mozilla.com 1.3.6.1.4.1.1718.3.2.3.1.11.2.1.13 i 1
iso.3.6.1.4.1.1718.3.2.3.1.11.2.1.13 = INTEGER: 1
# snmpget -v 2c -c public pdu1.gc131.ops.releng.mdc2.mozilla.com 1.3.6.1.4.1.1718.3.2.3.1.5.2.1.13
iso.3.6.1.4.1.1718.3.2.3.1.5.2.1.13 = INTEGER: 1

[dhouse@rejh2.srv.releng.mdc1.mozilla.com ~]$ ping 10.51.48.37
PING 10.51.48.37 (10.51.48.37) 56(84) bytes of data.
^C
--- 10.51.48.37 ping statistics ---
329 packets transmitted, 0 received, 100% packet loss, time 327999ms
```
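For reference, the off / verify / on sequence from the transcript above could be wrapped in a small script. This is only a sketch under assumptions: the host, OIDs and community strings are copied from the transcript, the meaning of the values (status 0 = off, 1 = on; control 2 = off, 1 = on) is inferred from the outputs shown there, and the function and variable names (`outlet_status`, `set_outlet`, `wait_for_status`, `power_cycle_outlet`, `POLL_INTERVAL`, `MAX_POLLS`) are hypothetical.

```shell
#!/bin/sh
# Sketch: power-cycle one PDU outlet over SNMP and confirm each state change.
# Values below are taken from the transcript in this bug; names are made up.

PDU_HOST="${PDU_HOST:-pdu1.gc131.ops.releng.mdc2.mozilla.com}"
STATUS_OID="${STATUS_OID:-1.3.6.1.4.1.1718.3.2.3.1.5.2.1.13}"    # read:  0 = off, 1 = on (as observed)
CONTROL_OID="${CONTROL_OID:-1.3.6.1.4.1.1718.3.2.3.1.11.2.1.13}" # write: 1 = on, 2 = off (as observed)
READ_COMMUNITY="${READ_COMMUNITY:-public}"
WRITE_COMMUNITY="${WRITE_COMMUNITY:-secret}"
POLL_INTERVAL="${POLL_INTERVAL:-5}"   # seconds between status checks
MAX_POLLS="${MAX_POLLS:-12}"          # give up after this many checks

# Print the outlet status as a bare number (-Oqve: value only, numeric enums).
outlet_status() {
    snmpget -v 2c -c "$READ_COMMUNITY" -Oqve "$PDU_HOST" "$STATUS_OID"
}

# Write a control action (1 or 2) to the outlet.
set_outlet() {
    snmpset -v 2c -c "$WRITE_COMMUNITY" "$PDU_HOST" "$CONTROL_OID" i "$1" >/dev/null
}

# Poll until the outlet reports the wanted status, or fail after MAX_POLLS.
wait_for_status() {
    want="$1"
    polls=0
    while [ "$polls" -lt "$MAX_POLLS" ]; do
        [ "$(outlet_status)" = "$want" ] && return 0
        polls=$((polls + 1))
        sleep "$POLL_INTERVAL"
    done
    return 1
}

# Off, verify off, pause, on, verify on.
power_cycle_outlet() {
    set_outlet 2 || return 1
    wait_for_status 0 || { echo "outlet did not report off" >&2; return 1; }
    sleep "$POLL_INTERVAL"
    set_outlet 1 || return 1
    wait_for_status 1 || { echo "outlet did not report on" >&2; return 1; }
    echo "outlet power-cycled"
}
```

Sourcing the file and calling `power_cycle_outlet` would repeat the manual steps. Note that a successful cycle only confirms the outlet state on the PDU, not that the machine itself booted.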
Jake, will we need to change anything in deploystudio to get this reimaged? (Or would you suggest that we have the machine physically checked/rebooted first?)
Flags: needinfo?(jwatkins)
This host already exists in the deploystudio database, so when you 'bless' it and start a reimage, deploystudio knows exactly what to do with it. The problem here is that the host is unreachable, and therefore we can't run 'bless' on the host remotely. Someone will need to physically check the host and see it through a reimage.
Flags: needinfo?(jwatkins)
:dividehex, no need to flip this to the .srv vlan before reimaging or doing something on the deploystudio back end? if not, i will open a ticket with QTS later today for the reimage.
Flags: needinfo?(jwatkins)
Assignee: server-ops-dcops → vle
Whiteboard: REQ0238705
QTS netbooted the mini but i didn't get the 'success' email from deploystudio.
:van, because the host is in the srv vlan, and NOT in the test vlan, QTS will need to boot into the recovery uefi with cmd+R. From there, they will need to open a terminal and use the bless command with the proper netboot IP. The typical 'netboot' way (cmd+N) doesn't work because the host is not in the same vlan as the netboot/deploystudio host. I would also recommend that QTS use the crash cart to monitor the progress of the reimaging, or at least verify that the host has properly loaded the deploystudio workflow and is proceeding to reimage. Since this is MDC2, use this command from the recovery terminal:

/usr/sbin/bless --netboot --server bsdp://10.51.56.16; reboot
Flags: needinfo?(jwatkins)
REQ0239132 opened with QTS to bless this mini per c#5.
Whiteboard: REQ0238705 → REQ0239132 , REQ0238705
host is back online.

vle@DESKTOP-3HK51T3:~$ fping mac-v2-signing13.srv.releng.mdc2.mozilla.com
mac-v2-signing13.srv.releng.mdc2.mozilla.com is alive
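A reachability check like the one above could also be repeated until the host answers, for example while waiting on a reimage. The following is a minimal sketch, assuming `fping` is installed and relying on its exit status (0 when the target is reachable); the function name `wait_for_host` and its arguments are hypothetical.

```shell
#!/bin/sh
# Sketch: poll a host with fping until it answers or the attempts run out.
# Usage: wait_for_host HOST [TRIES] [INTERVAL_SECONDS]
wait_for_host() {
    host="$1"
    tries="${2:-30}"
    interval="${3:-10}"
    i=0
    while [ "$i" -lt "$tries" ]; do
        # -q suppresses per-host output; only the exit status is used here.
        if fping -q "$host" 2>/dev/null; then
            echo "$host is alive"
            return 0
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    echo "$host did not respond after $tries attempts" >&2
    return 1
}
```

For example, `wait_for_host mac-v2-signing13.srv.releng.mdc2.mozilla.com 60 30` would check every 30 seconds for up to 60 attempts.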
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED