Closed Bug 1451835 Opened 7 years ago Closed 6 years ago

Deploy Windows 10 moonshot nodes in MDC2

Categories

(Infrastructure & Operations :: RelOps: General, task)

Type: task
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: markco, Assigned: markco)

No description provided.
Assignee: relops → mcornmesser
Blocks: 1443589
No longer blocks: 1443589
The plan is to do this after the Moonshot firmware upgrade. jmaher: Should we set up a staging pool for you all to take a look at before doing the complete deployment to MDC2?
Flags: needinfo?(jmaher)
I don't think we need a staging pool, as the firmware and OS install will be identical.
Flags: needinfo?(jmaher)
Hello, as discussed, I just finished re-imaging the 8th chassis, and I want to report a few machines that still do not appear in Taskcluster or take jobs: 318, 320, 321, 322, 323, 324, 345. These machines have been re-imaged twice. After the first re-image I waited about 30 minutes for them to take jobs (at the moment there are more than 900 pending tests); meanwhile, the other machines each took at least 3-4 jobs and completed them without exceptions.
Continued with the re-imaging process. All of the machines from chassis 9 keep PXE booting, except for MS-361, which did not have that problem, and 377, which has no display output. So far I have tried rebooting the machines and booting from the SSD/HDD: the install continues, but at the next reboot the machines PXE boot again. I also tried modifying the boot order, but it made no difference. I'll check whether the machines from chassis 10 behave the same way.
The same boot behavior can be seen on chassis 10. At this point I stopped re-imaging them, since I cannot get them up and working, and I also shut down the ones I worked on (all of the PXE-booting machines from chassis 9 and 10 (the first six)).
Because those nodes have not been imaged previously, their boot order needs to be set to M.2, then PXE, from the chassis.
I've set up chassis 9 for the Windows machines so that the boot order is M.2, PXE. I will check with the first 3 machines whether the issue still persists and report here.
After changing the node boot order, I was able to reimage almost all of the machines from chassis 9.
I have finished re-imaging all the machines from chassis 10. No issues encountered. Waiting for them to take jobs. We'll continue with chassis 11.
Update: I've re-imaged the following machines once again: 318, 320, 321, 322, 323, 324, 345. The process seems to have finished successfully, but the workers are still missing from TC. The rest of the re-imaged machines from chassis 8, 9 and 10 are working well.
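For cross-checking which re-imaged nodes are still missing from TC, a minimal sketch follows. It assumes the nodes follow the T-W1064-MS-NNN hostname pattern and that a list of active worker names has already been exported from Taskcluster by some means. The function name, the sample worker list, and the export step are hypothetical; nothing here calls a real Taskcluster API.

```python
def missing_workers(reimaged_ids, active_workers, prefix="T-W1064-MS-"):
    """Return hostnames of re-imaged nodes that are not in the active worker list.

    reimaged_ids   -- node numbers that were re-imaged (e.g. 318, 320)
    active_workers -- worker names seen in Taskcluster (hypothetical export)
    prefix         -- assumed hostname prefix for the Windows 10 moonshot nodes
    """
    # Build the expected hostnames, zero-padding node numbers to three digits.
    expected = {f"{prefix}{node_id:03d}" for node_id in reimaged_ids}
    # Compare case-insensitively, since worker names may be reported in lowercase.
    seen = {name.strip().upper() for name in active_workers}
    return sorted(expected - seen)


if __name__ == "__main__":
    reimaged = [318, 320, 321, 322, 323, 324, 345, 361]
    # Illustrative data only: pretend just MS-361 has checked in.
    active = ["t-w1064-ms-361"]
    for host in missing_workers(reimaged, active):
        print(host)
```

Using a set difference keeps the check independent of the order in which chassis were re-imaged, so the same list can be re-run after each chassis.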
Depends on: 1493321
Seems like chassis 11 wasn't a lucky one. I proceeded to reimage the first 12 machines, and 11 of them have the PXE boot problem. Skipping chassis 11, I've managed to finish reimaging chassis 12. It's going to need a check later on, but for now the machines have started picking up jobs.
I have reimaged chassis 11, 13 and 14. I had issues with some of the machines and have created bugs for them: T-W1064-MS-599, T-W1064-MS-600, T-W1064-MS-474, T-W1064-MS-471.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED