please reimage mac-v2-signing8-13 as notarization boxes
Categories
(Infrastructure & Operations :: RelOps: Hardware, task)
Tracking
(Not tracked)
People
(Reporter: mozilla, Unassigned)
Old hosts (from bug 1567235):
mac-v2-signing8.srv.releng.mdc1
mac-v2-signing9.srv.releng.mdc1
mac-v2-signing10.srv.releng.mdc1
mac-v2-signing11.srv.releng.mdc2
mac-v2-signing12.srv.releng.mdc2
mac-v2-signing13.srv.releng.mdc2
I've killed the signing servers on these hosts and performed a secrets wipe. We should image these for ronin-puppet, as
mac-v3-signing7.srv.releng.mdc1.mozilla.com
mac-v3-signing8.srv.releng.mdc1.mozilla.com
mac-v3-signing9.srv.releng.mdc1.mozilla.com
mac-v3-signing10.srv.releng.mdc2.mozilla.com
mac-v3-signing11.srv.releng.mdc2.mozilla.com
mac-v3-signing12.srv.releng.mdc2.mozilla.com
(Looks like we already have backfill for mac-v3-signing4.mdc2 in bug 1561117).
We also need to add these to the NATs in bug 1552305, and make sure they have outbound connectivity to Apple.
Comment 1•5 years ago
These hosts have been renamed, re-imaged and puppetized to the proper puppet role, including the backfill in bug 1561117. I've also filed bug 1570504 to have their IPs added to the autograph NAT policy.
All releng networks should allow full outbound connectivity to Apple since the firewall deny rule was removed a while back.
Reporter | ||
Comment 2•5 years ago
Thank you!
Reporter | ||
Comment 3•5 years ago
mac-v3-signing9.srv.releng.mdc1.mozilla.com looks unreachable atm.
mac-v3-signing8.srv.releng.mdc1.mozilla.com may not have finished its puppet run. I killed puppet, configured it manually, but a) I had to create ~cltbld manually, and b) sudoers keeps getting overwritten.
7, 10, 11, 12, and 13 all look good and are live, thank you!
Comment 4•5 years ago
We're in much better shape with the 5 extra hosts, but it would be good to close this out.
mac-v3-signing8.srv.releng.mdc1 seems to still have SIP enabled, so I can't attempt to reimage it:
# csrutil status
System Integrity Protection status: enabled.
# /usr/sbin/bless --netboot --nextonly --server [redacted] && reboot
Could not set boot device property: 0xe00002e2
Can't set EFI
I can't reach mac-v3-signing9.srv.releng.mdc1.mozilla.com either. It'll need some onsite hands to investigate ?
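The SIP check above could be scripted as a gate before attempting a netboot reimage. This is a minimal sketch, assuming the `csrutil status` output format shown in this comment; the function name `sip_enabled` is hypothetical, and it takes the status text as an argument so it can be tried without a Mac.

```shell
#!/bin/sh
# sip_enabled (hypothetical helper): succeeds (exit 0) when the given
# `csrutil status` text reports System Integrity Protection as enabled.
sip_enabled() {
    case "$1" in
        *"System Integrity Protection status: enabled"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example using the output seen on mac-v3-signing8 in this comment.
sample='System Integrity Protection status: enabled.'
if sip_enabled "$sample"; then
    echo "SIP enabled: bless --netboot is expected to fail, disable SIP from recovery first"
else
    echo "SIP not enabled: netboot reimage can be attempted"
fi
```

On a real host the call would presumably be `sip_enabled "$(csrutil status)"`.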
Comment 5•5 years ago
dhouse, could you do your magic on mac-v3-signing8.srv.releng.mdc1 to disable SIP ?
Comment 6•5 years ago
(In reply to Nick Thomas [:nthomas] (UTC+12) from comment #4)
We're in much better shape with the 5 extra hosts, but it would be good to close this out.
mac-v3-signing8.srv.releng.mdc1 seems to still have SIP enabled, so I can't attempt to reimage it:
# csrutil status
System Integrity Protection status: enabled.
# /usr/sbin/bless --netboot --nextonly --server [redacted] && reboot
Could not set boot device property: 0xe00002e2
Can't set EFI
I can't reach mac-v3-signing9.srv.releng.mdc1.mozilla.com either. It'll need some onsite hands to investigate ?
Since mac-v3-signing8.srv.releng.mdc1 got imaged to Mojave with SIP enabled, SIP will need to be disabled and the host reimaged from the recovery console. We'll need to file a ticket with QTS.
Same for mac-v3-signing9.srv.releng.mdc1 since it seems to be offline. My guess is it's asleep and might also have SIP enabled. I've already tried power-cycling it.
Comment 7•5 years ago
I've filed a ticket with QTS to disable SIP and reimage the 2 minis.
"""
Short description
Disable SIP and reimage 2 mac minis from recovery console
Description
Please disable SIP and initiate a reimage from the recovery console for the 2 following minis:
Asset Tag: 35245, Rack: IT42 - 15.2
Asset Tag: 44214, Rack: IT41 - 16.1
1. Reboot into recovery mode with: Option-⌘-R (option + command + r)
2. Open Terminal session: Utilities->Terminal
3. Disable SIP: csrutil disable
4. Trigger a reimage: /usr/sbin/bless --netboot --server bsdp://10.51.56.233
5. Reboot from terminal: reboot
"""
Comment 8•5 years ago
QTS reports one mini was successful and the other won't boot. Either way I can't reach either of them. At this point, I'll investigate these 2 minis when I go to MDC1.
"""
Please pardon the slight delay on our part for not providing this update to you sooner. We have completed 1 of 2 mac minis. We have successfully disabled SIP and initiated a reimage for Asset Tag 35245. We have made a total of 3 attempts to do the same with Asset Tag 44214; however, we continue to receive no video output from the machine. We have verified the HDMI cable is firmly seated for Asset Tag 44214, at both ends of the machines. Please advise us if you would like for us to swap the cable on the machine or instead how you wish for us to proceed with the task. We are prepared to swap the cable for you and have staged another cable we can use which we have verified is a good known working cable. Thank you for allowing us to support your needs.
"""
Jake, for swapping power (https://bugzilla.mozilla.org/show_bug.cgi?id=1575615) I was not able to get #8, asset 35245, recovered with QTS (tried different power and all other cables and directly connecting the crashcart display, and holding power (10s) with the power cable disconnected to clear SMC (power savings)). Could you include this one, #8, in your review of #9?
Comment 10•5 years ago
Jake recovered #8 and #9 when he visited MDC1 last Friday. He found that they needed firmware upgrades. I'm seeing deploystudio success mails for #8 repeating the firmware upgrade successfully; I'm switching it to the signing reimage workflow.
Comment 11•5 years ago
(In reply to Dave House [:dhouse] from comment #10)
Jake recovered #8 and #9 when he visited MDC1 last Friday. He found that they needed firmware upgrades. I'm seeing deploystudio success mails for #8 repeating the firmware upgrade successfully; I'm switching it to the signing reimage workflow.
#8 completed the reimage successfully: "The workflow 'Deploy Mojave Signing v3' was launched on the computer C07TQ095G1J2 (name: mac-v3-signing8, ip: 10.49.48.23, mac: a8:60:b6:39:b7:78) with a SUCCESSFUL termination status."
[dhouse@mac-v3-signing8.srv.releng.mdc1.mozilla.com ~]$ w
19:12 up 2 mins, 2 users, load averages: 4.04 3.34 1.50
USER TTY FROM LOGIN@ IDLE WHAT
cltbld console - 19:10 2 -
dhouse s000 10.49.48.101 19:12 - w
[dhouse@mac-v3-signing8.srv.releng.mdc1.mozilla.com ~]$ sw_vers
ProductName: Mac OS X
ProductVersion: 10.14.5
BuildVersion: 18F132
[dhouse@mac-v3-signing8.srv.releng.mdc1.mozilla.com ~]$ csrutil status
System Integrity Protection status: disabled.
[dhouse@mac-v3-signing8.srv.releng.mdc1.mozilla.com code]$ cat /etc/puppet_role
mac_v3_signing
[dhouse@mac-v3-signing8.srv.releng.mdc1.mozilla.com ~]$ git -C /etc/puppet/environments/production/code rev-parse HEAD
5cc237276154c064e1592f93fbea269d3efc6a51
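The post-reimage checks in the session above could be bundled into one pass/fail helper. This is a rough sketch only: `check_signing_host` is a hypothetical name, the expected role string `mac_v3_signing` and the SIP status wording are taken from the output above, and the values are passed in as arguments so the logic can be exercised off-host.

```shell
#!/bin/sh
# check_signing_host ROLE SIP_STATUS (hypothetical helper):
# prints one line per check and returns non-zero if any check fails.
check_signing_host() {
    role="$1"
    sip="$2"
    rc=0
    if [ "$role" = "mac_v3_signing" ]; then
        echo "ok: puppet role is mac_v3_signing"
    else
        echo "FAIL: puppet role is '$role'"
        rc=1
    fi
    case "$sip" in
        *"status: disabled"*)
            echo "ok: SIP disabled"
            ;;
        *)
            echo "FAIL: SIP not reported as disabled"
            rc=1
            ;;
    esac
    return "$rc"
}

# Values copied from the mac-v3-signing8 session above.
check_signing_host "mac_v3_signing" "System Integrity Protection status: disabled."
```

On the host itself, the inputs would presumably come from `cat /etc/puppet_role` and `csrutil status`.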
Comment 12•5 years ago
I'm doing the same for #9:
- confirmed firmware update success:
The workflow 'Update Firmware' was launched on the computer C07T610MG1J2 (name: mac-v3-signing9, ip: 10.49.48.24, mac: a8:60:b6:24:f7:8e) with a SUCCESSFUL termination status.
- pinning to mojave signing workflow and rebooting
Comment 13•5 years ago
(In reply to Dave House [:dhouse] from comment #12)
I'm doing the same for #9:
- confirmed firmware update success:
The workflow 'Update Firmware' was launched on the computer C07T610MG1J2 (name: mac-v3-signing9, ip: 10.49.48.24, mac: a8:60:b6:24:f7:8e) with a SUCCESSFUL termination status.
- pinning to mojave signing workflow and rebooting
I cycled the pdu power for #9 (I think it was waiting for a workflow section to be manually chosen on-screen).
https://inventory1.corpdmz.mdc1.mozilla.com/systems/show/38507/
Comment 14•5 years ago
Thanks for fixing up mac-v3-signing8.srv.releng.mdc1 - I've set that up and turned it on in prod.
How's mac-v3-signing9.srv.releng.mdc1 going ? Not responding to pings right now.
Comment 15•5 years ago
Thanks Nick for getting #8 up and checking on this!
I've asked the QTS remote hands to check and reimage mac-v3-signing9. When Jake was last there, he fixed the firmware, but it didn't come back after the reimages (https://bugzilla.mozilla.org/show_bug.cgi?id=1577813)
Comment 16•5 years ago
(In reply to Dave House [:dhouse] from comment #15)
Thanks Nick for getting #8 up and checking on this!
I've asked the QTS remote hands to check and reimage mac-v3-signing9. When Jake was last there, he fixed the firmware, but it didn't come back after the reimages (https://bugzilla.mozilla.org/show_bug.cgi?id=1577813)
QTS was not able to recover #9 (no video, reseated all cables and multiple power cycles; I'll ask them to try a new video cable)
Comment 17•5 years ago
(In reply to Dave House [:dhouse] from comment #16)
(In reply to Dave House [:dhouse] from comment #15)
Thanks Nick for getting #8 up and checking on this!
I've asked the QTS remote hands to check and reimage mac-v3-signing9. When Jake was last there, he fixed the firmware, but it didn't come back after the reimages (https://bugzilla.mozilla.org/show_bug.cgi?id=1577813)
QTS was not able to recover #9 (no video, reseated all cables and multiple power cycles; I'll ask them to try a new video cable)
I'm re-purposing a test worker to replace #9 as it could not be recovered with QTS's help.
Comment 18•5 years ago
Aki/Nick, with replacing mac-v3-signing9, we'll de-commission the previous machine (bug 1588852). For the old mini, what is the security process to de-comm it? I'm guessing DCOps has a process they've followed before, but I don't know it and want to confirm we're covered.
Reporter | ||
Comment 19•5 years ago
We need to wipe
/builds/scriptworker/{nightly,release}-signing.keychain
/builds/scriptworker/{scriptworker,script_config}.yaml
/Users/cltbld/ed25519_privkey
most likely with rm -P or some other command that overwrites the data multiple times before removing.
Are there ronin-puppet secrets populated on the macs? If so, we may need to wipe those as well.
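The wipe list above might be captured in a small script so nothing is missed. This is a sketch under stated assumptions: `secret_paths` and `wipe_secrets` are hypothetical names, the brace patterns are expanded by hand, and it defaults to a dry run. `rm -P` is the BSD/macOS flag named in this comment; whether the overwrite is actually effective on a given disk and filesystem is not something this sketch checks.

```shell
#!/bin/sh
# secret_paths (hypothetical helper): print the files named in this comment,
# one per line, with the {a,b} patterns written out explicitly.
secret_paths() {
    printf '%s\n' \
        /builds/scriptworker/nightly-signing.keychain \
        /builds/scriptworker/release-signing.keychain \
        /builds/scriptworker/scriptworker.yaml \
        /builds/scriptworker/script_config.yaml \
        /Users/cltbld/ed25519_privkey
}

# wipe_secrets (hypothetical helper): dry run by default; pass "--run" to
# call rm -P on each listed file that exists.
wipe_secrets() {
    mode="$1"
    secret_paths | while IFS= read -r path; do
        if [ "$mode" = "--run" ]; then
            if [ -e "$path" ]; then
                rm -P "$path"
            fi
        else
            echo "would run: rm -P $path"
        fi
    done
}

# Dry run: only lists what would be removed.
wipe_secrets
```

Any ronin-puppet secret locations would need to be added to `secret_paths` once they are known.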
Comment 20•5 years ago
(In reply to Aki Sasaki [:aki] (he/him) (UTC-7) from comment #19)
We need to wipe
/builds/scriptworker/{nightly,release}-signing.keychain
/builds/scriptworker/{scriptworker,script_config}.yaml
/Users/cltbld/ed25519_privkey
most likely with rm -P or some other command that overwrites the data multiple times before removing.
Are there ronin-puppet secrets populated on the macs? If so, we may need to wipe those as well.
Thanks! There are ronin-puppet secrets on there also.
I'll ask DCOps if they can wipe those locations or the full disk.
Comment 21•5 years ago
A secure erase of the whole disk would be best, given the SSD disk used in the mini and the snapshots that APFS makes.
Comment 22•5 years ago
Nick/Aki, the new mac-v3-signing9.srv.releng.mdc1.mozilla.com is reimaged and available. This is a re-purposed tester. Please let me know if you have any trouble with it.
Comment 23•5 years ago
Thanks Dave. Simon is just finishing up the ronin puppet changes so we'll use that to re-image mac-v3-signing9 and use it as a canary in prod.
Reporter | ||
Comment 24•4 years ago
I think we may be done here?