Closed Bug 1307131 Opened 8 years ago Closed 8 years ago

Reimaging Windows testers via ipmitool isn't working

Categories

(Infrastructure & Operations :: RelOps: General, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: aselagea, Unassigned)

Not sure if this is a general issue, but I've tried reimaging t-w864-ix-207 a couple of times because it's failing jobs. Unfortunately, it didn't work. Commands used: "ipmitool -U releng -P XXXXXXXX -H t-w864-ix-207-mgmt.build.mozilla.org chassis bootdev;ipmitool -U releng -P XXXXXXXX -H t-w864-ix-207-mgmt.build.mozilla.org pxe chassis power reset" These commands seem to run fine, but the machine simply went off and needed manual intervention to get back online. To check whether the machine had been reimaged, I logged in via VNC, opened a cmd window and ran "systeminfo". The original install date is Sep 12 2016.
Well, the first command is "ipmitool -U releng -P XXXXXXXX -H t-w864-ix-207-mgmt.build.mozilla.org chassis bootdev pxe". Wrong copy&paste.
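For reference, the two-step sequence (set the next boot device to PXE, then reset the chassis) can be sketched as a small shell helper. This is an illustration only: `reimage_host`, `IPMI_USER` and `DRY_RUN` are made-up names, and the sketch swaps `-P <password>` for ipmitool's `-E` option (password read from the `IPMI_PASSWORD` environment variable) so the password stays off the command line.

```shell
#!/bin/sh
# Sketch only: reimage_host, IPMI_USER and DRY_RUN are hypothetical names.
# With -E, ipmitool reads the password from the IPMI_PASSWORD environment variable.
reimage_host() {
    mgmt_host="$1"
    user="${IPMI_USER:-releng}"

    # In dry-run mode, print the commands instead of executing them.
    run=""
    [ -n "${DRY_RUN:-}" ] && run="echo"

    # Step 1: ask the BMC to boot from the network (PXE) on the next boot.
    $run ipmitool -U "$user" -E -H "$mgmt_host" chassis bootdev pxe || return 1

    # Step 2: reset the machine so the PXE boot actually happens.
    $run ipmitool -U "$user" -E -H "$mgmt_host" chassis power reset
}
```

Running it with `DRY_RUN=1 reimage_host <mgmt-fqdn>` prints both commands without touching the machine, which makes copy&paste slips like the one above easier to spot.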
:aselagea, win7 uses an external (Other) graphics card, while winXP and win8 use the onboard graphics card. If you're reimaging these hosts to one of the other OSes, you need to change the graphics setting in the BIOS: hit Del during bootup, then go to Advanced > PCI/PnP Configuration > Boot Graphics Adapter Priority and change it to Other or Onboard VGA, depending on the OS. Let me know if you're still having issues with this box.
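The OS-to-BIOS-setting rule above can be written down as a lookup, which may help when a host is being moved between pools. This is a sketch: `expected_boot_gfx` is a hypothetical helper, and it assumes the hostname prefix (`t-w732-`, `t-w864-`, and by analogy `t-xp32-`) reflects the OS being installed.

```shell
#!/bin/sh
# Sketch only: expected_boot_gfx is a hypothetical helper.
# Assumption: the hostname prefix names the target OS pool; "t-xp32-" is
# inferred by analogy with the t-w732-/t-w864- names seen in this bug.
expected_boot_gfx() {
    case "$1" in
        t-w732-*)
            # win7 uses the external card.
            echo "Other" ;;
        t-xp32-*|t-w864-*)
            # winXP and win8 use the onboard card.
            echo "Onboard VGA" ;;
        *)
            echo "unknown"
            return 1 ;;
    esac
}
```

For example, `expected_boot_gfx t-w864-ix-207` prints the value that "Boot Graphics Adapter Priority" should have before that host is reimaged as win8.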
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Assignee: server-ops-dcops → vle
QA Contact: cshields
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Let me try to run ipmitool and watch the process. I thought this bug was about the graphics issue.
The ipmitool command works as expected. Can you check the FQDN? I tried the .org name and it didn't resolve for me. However, the host is not pingable, just like my other w8 reimages. Is it in the correct OU?
ipmitool -U releng -P XXXXX -H t-w864-ix-207-mgmt.inband.releng.scl3.mozilla.com chassis bootdev pxe; ipmitool -U releng -P XXXXX -H t-w864-ix-207-mgmt.inband.releng.scl3.mozilla.com chassis power reset
I pinged :arr to inquire about this issue and she suggested:
arr> sounds like it's not finishing the reimage and applying the GPO that shuts off the firewall
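Since the two management FQDNs behaved differently, a quick resolution check before running ipmitool can rule out a naming problem. A minimal sketch, assuming a glibc system where `getent hosts` is available; `mgmt_resolves`, `first_resolving` and `RESOLVE_CMD` are hypothetical names introduced here.

```shell
#!/bin/sh
# Sketch only: helper names and RESOLVE_CMD are hypothetical.
# Assumption: `getent hosts NAME` exits non-zero when NAME does not resolve.
mgmt_resolves() {
    ${RESOLVE_CMD:-getent hosts} "$1" >/dev/null 2>&1
}

# Print the first candidate name that resolves; fail if none does.
first_resolving() {
    for name in "$@"; do
        if mgmt_resolves "$name"; then
            echo "$name"
            return 0
        fi
    done
    return 1
}
```

Called as `first_resolving t-w864-ix-207-mgmt.build.mozilla.org t-w864-ix-207-mgmt.inband.releng.scl3.mozilla.com`, it would print whichever of the two names in this bug resolves from the machine it runs on.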
Status: REOPENED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Sorry for reopening this, but something doesn't feel right here. I tried reimaging one random machine from each of the following pools: xp32, w732, w864. While the process went fine for xp32 and w864 (both machines had onboard graphics cards enabled), it ended with an unreachable w732 machine - bug 1307390. Regarding t-w864-ix-207, it is a former w732 machine and it has no onboard graphics card atm. I did another reimage using the commands from #c4 and got an unreachable machine again - bug 1307054. While this may be out of scope for this bug, I was wondering if the re-image process encounters issues for machines having no onboard graphics?
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Moving this and setting needinfo for Q and markco to see if this is fallout from the winpocalypse.
Assignee: vle → relops
Component: DCOps → RelOps
Flags: needinfo?(q)
Flags: needinfo?(mcornmesser)
I think this is unrelated to the winpocalypse and the graphics cards. It is not so much that they do not have onboard graphics cards, but that the priority in the BIOS is set to the other card. We need to get someone on site to take a look at these. I would guess either a network issue or something is up with the hardware itself, particularly since the management IP is reachable but the other IP is not.
Flags: needinfo?(mcornmesser)
From what I understand, w7 hosts become unreachable only after an attempted reimage. That's why I think it's something with the imaging process, not the hardware. My suspicion was that the imaging process was not finishing correctly and that the firewall was not being disabled (and therefore blocking all traffic).
I am taking a deeper look into this.
It looks like it is at least communicating with wds1. From a netstat:
TCP 10.22.69.26:445 t-w864-ix-012:49412 ESTABLISHED
TCP 10.22.69.26:445 t-w732-ix-026:49412 ESTABLISHED
TCP 10.22.69.26:445 t-w864-ix-207:49412 ESTABLISHED
I re-kicked the install on t-w864-ix-207 and started a parallel install on t-w864-ix-012, while running a continuous ping to the production IP of both nodes. 012 eventually started to reply to the ping. 207 has not replied to the ping.
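The "ping until it answers" check can be bounded so that a host that never comes back is reported instead of being watched indefinitely. A rough sketch, assuming a ping that accepts `-c 1`; `wait_for_ping` and `PING_CMD` are hypothetical names, and the default attempt count and delay are arbitrary.

```shell
#!/bin/sh
# Sketch only: wait_for_ping and PING_CMD are hypothetical names.
# Usage: wait_for_ping HOST [ATTEMPTS] [DELAY_SECONDS]
wait_for_ping() {
    host="$1"
    attempts="${2:-60}"
    delay="${3:-10}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # One echo request per attempt; success means the host replied.
        if ${PING_CMD:-ping -c 1} "$host" >/dev/null 2>&1; then
            echo "$host is alive after $((i + 1)) attempt(s)"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "$host did not reply after $attempts attempt(s)"
    return 1
}
```

Running one instance per node (for example 012 and 207 in parallel) gives the same comparison as above, with a clear failure message for the node that stays silent.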
Van: Would you be able to check and see what t-w864-ix-207 is actually displaying?
Flags: needinfo?(vle)
207 is back up now after a hard reboot. t-w864-ix-207.wintest.releng.scl3.mozilla.com is alive [vle@admin1b.private.scl3 ~]$ ssh !$
Flags: needinfo?(vle)
Flags: needinfo?(q)
Status: REOPENED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED