mac minis to physically check
Categories
(Infrastructure & Operations :: RelOps: Posix OS, task)
Tracking
(Not tracked)
People
(Reporter: dhouse, Assigned: dividehex)
User Story
host serial asset rack shelf notes/bug?

signing (notarization) minis:
mac-v3-signing8 C07TQ095G1J2 35245 IT42 15.20 (https://bugzilla.mozilla.org/show_bug.cgi?id=1575615#c11)
* Firmware not upgraded. Currently: MM71.022.B14 Needs to be: MM71.0232.B00
mac-v3-signing9 C07T610MG1J2 44214 IT41 16.10 (https://bugzilla.mozilla.org/show_bug.cgi?id=1570187#c8)
* Firmware not upgraded. Currently: MM71.022.B08 Needs to be: MM71.0232.B00

testers:
257 C07RJ11CG1J2 03962 IT44 14.10
* still running yosemite even though the hostname was changed to mojave. reimaged to mojave
262 C07RJ0X9G1J2 03967 IT44 16.2
* hung. rebooted via power button
272 C07RJ0YPG1J2 03977 IT44 26.2
* dead. no power up
336 C07RJ10UG1J2 02984 IT40 5.2
* powered off
339 C07RJ0ZSG1J2 02987 IT40 7.1
* powered off; would not enter recovery, forced to create user in order to bless and reimage. must press power from the rear
340 C07RJ11GG1J2 02988 IT40 7.2
* powered off; would not enter recovery, forced to create user in order to bless and reimage. must press power from the rear
356 C07RJ133G1J2 18463 IT40 17.2
* powered off; would not enter recovery, forced to create user in order to bless and reimage.
364 C07RJ10CG1J2 18471 IT40 26.20 powered off, no video. needs more troubleshooting
369 C07RJ11EG1J2 18476 IT40 30.1
315 C07RJ13AG1J2 02963 IT41 19.1
* reimaged (took longer than normal)
461 C07SQ0PCG1J2 15745 IT41 4.10 blessed to reimage as staging but didn't come back
* bad switch port? The port seemed dead with no light, and the OS reported the cable as not connected. I moved the cable to another port and it seemed to connect, but ssh was really spotty.
~~462 C07SQ0QHG1J2 15746 IT41 4.20~~ good staging; for comparison
463 C07SQ0RAG1J2 15747 IT41 5.10 went offline after multiuser test (softwareupdate in log)
464 C07SQ0NUG1J2 15748 IT41 5.20 went offline after multiuser test (softwareupdate in log)
465 C07SQ0RNG1J2 15749 IT41 6.10 went offline after multiuser test (softwareupdate in log)
* All 3 multiuser were in a reboot loop. I entered recovery and reimaged.
468 C07SQ0QMG1J2 15752 IT41 7.2
* completely dead; no power up
mac minis to physically inspect and troubleshoot.
I checked over all of the mdc1 minis again. I was able to bring up 264, kicked off reimaging of 257, and added 257 and 364 to the list. I'll remove 257 if it succeeds in reimaging.
I kicked off a reimage of #256 to make sure deploystudio is working on install2 (since it cycled for the power maintenance wednesday):
256 C07RJ134G1J2 03961 mdc1 IT44 12.20
I haven't seen a deploystudio email yet. bsdpy was not affected by the power (uptime is around 1 year on there)
(In reply to Dave House [:dhouse] from comment #2)
> I kicked off a reimage of #256 to make sure deploystudio is working on install2 (since it cycled for the power maintenance wednesday):
> 256 C07RJ134G1J2 03961 mdc1 IT44 12.20
> I haven't seen a deploystudio email yet. bsdpy was not affected by the power (uptime is around 1 year on there)
The reimages are failing with the last log entries:
2019-08-30 19:09:08.133 DeployStudio Runtime.bin[360:18919] Network address: 10.49.56.196 (t-mojave-r7-256.test.releng.mdc1.mozilla.com)
2019-08-30 19:09:08.155 DeployStudio Runtime.bin[360:18919] Network interface speed: AUTOSELECT (1000BASET <FULL-DUPLEX,FLOW-CONTROL>)
2019-08-30 19:09:08.176 DeployStudio Runtime.bin[360:18919] Operating System: Mac OS X Version 10.13.6 (Build 17G65)
2019-08-30 19:09:08.177 DeployStudio Runtime.bin[360:18919] Date: 19/08/30 12:09:08
2019-08-30 19:09:08.177 DeployStudio Runtime.bin[360:18919] ====================================================================================================
2019-08-30 19:09:09.047 DeployStudio Runtime.bin[360:18919] 24 plugins were successfully loaded!
2019-08-30 19:09:12.174 DeployStudio Runtime.bin[360:51587] The user 'dsadmin' was successfully authenticated.
2019-08-30 19:09:12.283 DeployStudio Runtime.bin[360:18919] Connected to server install2.test.releng.mdc1.mozilla.com (1.7.8)
2019-08-30 19:09:12.424 DeployStudio Runtime.bin[360:52105] Checking server reachability (server=install2.test.releng.mdc1.mozilla.com port=445) ...
2019-08-30 19:09:13.589 DeployStudio Runtime.bin[360:52105] Checking server reachability (server=10.49.56.17 port=445) ...
So, something isn't responding/connecting (on 445?). I started deploystudio after the power maintenance, but I must not have started everything (or correctly?).
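A quick way to confirm the port 445 suspicion from another host on the same network is a plain TCP connect test. This is a minimal sketch, not what was actually run here: the helper name `port_reachable` and the 3-second default timeout are my own assumptions, and a successful connect only shows that something is listening on the port, not that the file share itself is healthy.

```python
import socket


def port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    Hypothetical helper for illustration. Any failure (refused, timed out,
    name resolution error) is reported as "not reachable".
    """
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError, timeouts, and socket.gaierror.
        return False
```

For the case in the log, that would be `port_reachable("install2.test.releng.mdc1.mozilla.com", 445)` and `port_reachable("10.49.56.17", 445)`; if both return False while the DeployStudio port still answers, the file server side is the likely culprit.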
Comment 4 • 6 years ago (Assignee)
> So, something isn't responding/connecting (on 445?). I started deploystudio after the power maintenance, but I must not have started everything (or correctly?).
I restarted the file server on install2 mdc1. I think that cleared up the issue. I've been able to re-image since.
We got the same deploystudio success mail repeatedly over the weekend, like:
The workflow 'Update Firmware' was launched on the computer C07TQ095G1J2 (name: mac-v3-signing8, ip: 10.49.48.23, mac: a8:60:b6:39:b7:78) with a SUCCESSFUL termination status. This mail was generated automatically by DeployStudio Server. -- The DeployStudio Team.
So I'll check it later today. I think I can set the next workflow for this machine (in the deploystudio database) as the mojave reimage and then it will run that instead of repeating the firmware update.
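The success mail quoted above has a regular shape, so a workflow that keeps re-running on the same machine can be spotted by parsing the mail bodies and counting (serial, workflow) pairs. This is a rough sketch only: the pattern is derived from the single mail quoted in this bug, so the assumption that other DeployStudio mails follow the same wording is mine, and `parse_mail` and `repeated_runs` are hypothetical helper names.

```python
import re
from collections import Counter
from typing import Dict, Iterable, Optional, Tuple

# Pattern built from the one mail quoted in this bug (assumed to be typical).
MAIL_RE = re.compile(
    r"The workflow '(?P<workflow>[^']+)' was launched on the computer "
    r"(?P<serial>\S+) \(name: (?P<name>[^,]+), ip: (?P<ip>[^,]+), "
    r"mac: (?P<mac>[^)]+)\) with a (?P<status>\w+) termination status"
)


def parse_mail(body: str) -> Optional[Dict[str, str]]:
    """Extract workflow, serial, name, ip, mac, and status from one mail body."""
    match = MAIL_RE.search(body)
    return match.groupdict() if match else None


def repeated_runs(
    bodies: Iterable[str], threshold: int = 2
) -> Dict[Tuple[str, str], int]:
    """Count mails per (serial, workflow) and keep those seen `threshold`+ times."""
    counts: Counter = Counter()
    for body in bodies:
        info = parse_mail(body)
        if info:
            counts[(info["serial"], info["workflow"])] += 1
    return {key: n for key, n in counts.items() if n >= threshold}
```

Feeding it the weekend's mails would flag C07TQ095G1J2 with 'Update Firmware' as looping, which is the cue to move the machine to its next workflow.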
I moved #8 over to the signing workflow and that completed. I am doing the same for #9, but I may need to ask QTS to reboot and bless it (deploystudio said it was waiting at the workflow selector prompt).
I asked QTS to check the remaining problem minis in MDC1:
Rack  Shelf  Asset  Host
IT40  5.2    02984  336
IT40  26.2   18471  364
IT40  30.1   18476  369
IT41  4.1    15745  461
IT41  7.2    15752  468
IT41  16.10  44214  mac-v3-s9
IT44  26.2   03977  272
Comment 8 • 6 years ago (Assignee)
:dhouse, what is the status of these minis? Did QTS finish investigating them?
Yes, this is out of date. One had dead video and was replaced with an r8.
Current state is in the spreadsheet. I'll double-check over them and make sure I don't have other abandoned bugs for them.