Closed Bug 1484870 Opened 6 years ago Closed 6 years ago

Deploy generic-worker 10 to MDC1 Win 10 hardware nodes

Categories

(Infrastructure & Operations :: RelOps: OpenCloudConfig, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: markco, Assigned: markco)

Attachments

(2 files)

This bug is to track the deployment of the imaging needed to support generic-worker 10.
Assignee: nobody → mcornmesser
Depends on: 1443589
Beginning deployment with ms-016 - ms-045
Summary: Deploy generic-worker 10 to MDC1 and MDC2 hardware nodes → Deploy generic-worker 10 to MDC1 Win 10 hardware nodes
Hit a snag with this. For some reason ms-016 through ms-030 are failing during deployment at the OS installation step. However, ms-031 through ms-045, with the exception of one node, are installing.
After a third try, ms-016 through ms-045, except for one node, are installed or in the process of being installed. I will identify which node it was tomorrow morning.
I've re-imaged each worker from Chassis 2. Except for T-W1064-MS-{065, 071}, all were re-imaged successfully. I'll keep monitoring to see whether they take jobs.
Attached image during-the-reimage.PNG
The issue encountered during the re-image process for T-W1064-MS-{065, 071}
Attached image after-reboot.PNG
The issue encountered after reboot for T-W1064-MS-{065, 071}
T-W1064-MS-072 has the same issues.
See Also: → 1485840
(In reply to Radu Iman[:riman] from comment #7)
> T-W1064-MS-072 has the same issues.

I will address these nodes this afternoon or Monday.
I am adding a disk partition clean command to the deployment that will address the issues happening with 65, 71, and 72. It might also address the continuous PXE boot issue. Anecdotally, I have not seen it on nodes that I have run the partition clean command on. CIDuty: NI'ing you all as a heads-up; please also let me know if you see any issues. I will reinstall the 3 nodes mentioned above this afternoon.
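For readers unfamiliar with this kind of step, here is a minimal sketch of what a partition clean could look like. The actual command added to the deployment is not shown in this bug, so everything here is an assumption: it assumes the Windows `diskpart` tool driven by a script file, and the function name and the default disk number 0 are hypothetical. The sketch only builds the script text; it does not run anything, because `clean` wipes the partition table of the selected disk.

```python
def build_diskpart_clean_script(disk_number: int = 0) -> str:
    """Return the text of a diskpart script that wipes one disk's partition table.

    Assumption: the deployment uses diskpart; the bug does not confirm this.
    The disk number is hypothetical and must match the target node's layout.
    """
    if disk_number < 0:
        raise ValueError("disk_number must be non-negative")
    lines = [
        f"select disk {disk_number}",  # choose the disk to operate on
        "clean",                       # remove all partition information
        "exit",                        # leave diskpart
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the script so it can be reviewed before it is ever used.
    print(build_diskpart_clean_script(0), end="")
```

A script like this would typically be saved to a file and passed to `diskpart /s <file>` from the install environment before the OS installation step.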
Flags: needinfo?(ciduty)
Can we close this bug or is there more outstanding work? Thanks!
Flags: needinfo?(mcornmesser)
I am comfortable with closing this today. As of last night we were down to 2 nodes that weren't installed, ms-118 (bug 1462820) and ms-131 (bug 1481068).
Status: NEW → RESOLVED
Closed: 6 years ago
Flags: needinfo?(mcornmesser)
Resolution: --- → FIXED
Thank you for the heads up Mark! Would you like CIDuty to double check the workers that were continuously pxe booting before and see if it still is the case?
Flags: needinfo?(mcornmesser)
(In reply to Zsolt Fay [:zsoltfay] from comment #12)
> Thank you for the heads up Mark! Would you like CIDuty to double check the
> workers that were continuously pxe booting before and see if it still is the
> case?

Yes, please. There are 2 separate cases. The first is where the Windows install environment loads and the node PXE boots again on the next startup; I am hoping that is resolved. The second is any case where the node continuously tries to PXE boot but never gets to the Windows environment; that node is most likely on the wrong VLAN. In those cases you may want to open a netops bug and ask them to check the network configuration for the blade.
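The two cases in the comment above can be written down as a small triage helper. This is only an illustrative sketch: the enum, the function names, and the wording of the suggested actions are hypothetical and not part of any CIDuty tooling. The single input is whether the node was seen reaching the Windows install environment before it PXE booted again.

```python
from enum import Enum


class PxeLoopCase(Enum):
    # Install environment loads, then the node PXE boots again on next start.
    REBOOTS_AFTER_INSTALL_ENV = 1
    # Node keeps trying to PXE boot and never reaches the install environment.
    NEVER_REACHES_INSTALL_ENV = 2


def classify(reached_install_env: bool) -> PxeLoopCase:
    """Map the one observation we have onto the two cases from the comment."""
    if reached_install_env:
        return PxeLoopCase.REBOOTS_AFTER_INSTALL_ENV
    return PxeLoopCase.NEVER_REACHES_INSTALL_ENV


def suggested_action(case: PxeLoopCase) -> str:
    """Return the follow-up described in the comment for each case."""
    if case is PxeLoopCase.REBOOTS_AFTER_INSTALL_ENV:
        return "re-check after the partition clean change; report if it recurs"
    return "likely wrong VLAN; open a netops bug to check the blade's network config"


if __name__ == "__main__":
    for reached in (True, False):
        case = classify(reached)
        print(case.name, "->", suggested_action(case))
```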
Flags: needinfo?(mcornmesser)
We've had the following machines with the continuous PXE boot issue; they all belong to the first case that Mark described above: 066, 077, 085, 116, 151, 165, 168, 241, 281. I've checked them all today and none showed signs of the issue anymore. I'm guessing the first-case issue has been resolved.
Flags: needinfo?(ciduty)