Closed Bug 1294655 Opened 8 years ago Closed 8 years ago

Please decommission t-snow-r4 machines that have been shut down

Categories

(Infrastructure & Operations :: DCOps, task)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: aselagea, Assigned: van)

Attachments

(1 file)

In bugs 1292656 and 1279394, most of the t-snow-r4 machines were disabled and removed from bb-configs. They now need to be decommissioned.

Taking a look at slavealloc, the ranges would be the following (including interval ends):
[1,2] + [4,28] + [30,75] + [98,146]
Attached file decomm-t-snow-r4.txt
Attached the list of machines that would need to be decommissioned. 

0003, 0029, 0034, 0093, 0115 and 0130-0163 are all marked as "decommissioned" in inventory, so skipping them from the list. Also updated slavealloc notes.
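For anyone rebuilding the list, here is a minimal sketch of how the slavealloc ranges above could be expanded into hostnames while skipping the hosts already marked as decommissioned in inventory. The helper name `expand` and the variable names are illustrative only, and the four-digit zero padding is assumed from the hostnames used elsewhere in this bug.

```python
def expand(ranges):
    """Expand inclusive (start, end) pairs into a sorted list of ints."""
    nums = set()
    for start, end in ranges:
        nums.update(range(start, end + 1))
    return sorted(nums)


# Ranges taken from slavealloc, endpoints inclusive.
SLAVEALLOC_RANGES = [(1, 2), (4, 28), (30, 75), (98, 146)]

# Hosts already marked "decommissioned" in inventory (0130-0163 inclusive).
DECOMMISSIONED = {3, 29, 34, 93, 115} | set(range(130, 164))

candidates = [n for n in expand(SLAVEALLOC_RANGES) if n not in DECOMMISSIONED]

# Assumed naming scheme: t-snow-r4-NNNN, zero-padded to four digits.
hostnames = [f"t-snow-r4-{n:04d}" for n in candidates]

for name in hostnames:
    print(name)
```

The output can be diffed against decomm-t-snow-r4.txt or against inventory to spot hosts that were missed.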
Noisebridge (a local hackerspace charity organization) might be interested in taking a bunch of the r4 mac minis. When is the next recycling pickup, so I can give them a deadline by which to make a decision?
no scheduled disposal at the moment. please give me some lead time to unrack the minis for the donation. we've been taking advantage of MTV's e-waste for the bad/decommissioned minis after removing the drives and memory as no other parts are salvageable and they're difficult to palletize.
Assignee: server-ops-dcops → vle
QA Contact: cshields
Hello!  This is the organization: https://noisebridge.net/  And we are a 501(c)(3) incorporated in California.  Details with EIN for tax purposes:  https://noisebridge.net/wiki/Incorporation.  

You can email if you would like to confirm, or if we need to do any paperwork, to secretary@noisebridge.net. (That goes to several people, including me.)    

If there is a date/time range for us to arrange to come down with a truck, please let me know! Thanks very much, we will put the hardware to good use for things like dedicated 3-d printer stations.
Ah, I assumed this was for mac minis with drives. If that isn't the case we likely don't want them!
:alin: do you have a list of which hosts should REMAIN in production? That might be easier if I'm wiping all of the disks on the decom ones so that we can donate them.
Flags: needinfo?(aselagea)
no, the drives are still intact. :arr is going to create a default task to secure wipe the drives on the minis and i'll netboot them. after this is complete, i can unrack and put them in some type of box for you.

please note that we are using c7-c14 power cables for these minis. if you use the standard 5-15 power cables, you'll need to purchase them.

please give me a few weeks to complete this as i'm the only one in scl3 and on several projects.
I think the list we have left is 25 machines:

22-24, 76-97 (inclusive)
Note that I did not touch the following machines since they aren't marked as decommissioned in slavealloc:

t-snow-r4-0022
t-snow-r4-0023
t-snow-r4-0024

The secure wipe was initiated on the following but wasn't successful (expected on the unresolvable ones):

t-snow-r4-0003 - not reachable
t-snow-r4-0029 - could not resolve
t-snow-r4-0034 - could not resolve
t-snow-r4-0040 - not reachable
t-snow-r4-0044 - still at workflow selector
t-snow-r4-0115 - not reachable

The rest of the machines below seem to be chugging away and should probably take 30+ hours (30 hours for a 7-pass wipe of a 500G disk).

t-snow-r4-0001
t-snow-r4-0002
t-snow-r4-0004
t-snow-r4-0005
t-snow-r4-0006
t-snow-r4-0007
t-snow-r4-0008
t-snow-r4-0009
t-snow-r4-0010
t-snow-r4-0011
t-snow-r4-0012
t-snow-r4-0013
t-snow-r4-0014
t-snow-r4-0015
t-snow-r4-0016
t-snow-r4-0017
t-snow-r4-0018
t-snow-r4-0019
t-snow-r4-0020
t-snow-r4-0021
t-snow-r4-0025
t-snow-r4-0026
t-snow-r4-0027
t-snow-r4-0028
t-snow-r4-0030
t-snow-r4-0031
t-snow-r4-0032
t-snow-r4-0033
t-snow-r4-0035
t-snow-r4-0036
t-snow-r4-0037
t-snow-r4-0038
t-snow-r4-0039
t-snow-r4-0041
t-snow-r4-0042
t-snow-r4-0043
t-snow-r4-0045
t-snow-r4-0046
t-snow-r4-0047
t-snow-r4-0048
t-snow-r4-0049
t-snow-r4-0050
t-snow-r4-0051
t-snow-r4-0052
t-snow-r4-0053
t-snow-r4-0054
t-snow-r4-0055
t-snow-r4-0056
t-snow-r4-0057
t-snow-r4-0058
t-snow-r4-0059
t-snow-r4-0060
t-snow-r4-0061
t-snow-r4-0062
t-snow-r4-0063
t-snow-r4-0064
t-snow-r4-0065
t-snow-r4-0066
t-snow-r4-0067
t-snow-r4-0068
t-snow-r4-0069
t-snow-r4-0070
t-snow-r4-0071
t-snow-r4-0072
t-snow-r4-0073
t-snow-r4-0074
t-snow-r4-0075
t-snow-r4-0098
t-snow-r4-0099
t-snow-r4-0100
t-snow-r4-0101
t-snow-r4-0102
t-snow-r4-0103
t-snow-r4-0104
t-snow-r4-0105
t-snow-r4-0106
t-snow-r4-0107
t-snow-r4-0108
t-snow-r4-0109
t-snow-r4-0110
t-snow-r4-0111
t-snow-r4-0112
t-snow-r4-0113
t-snow-r4-0114
t-snow-r4-0116
t-snow-r4-0117
t-snow-r4-0118
t-snow-r4-0119
t-snow-r4-0120
t-snow-r4-0121
t-snow-r4-0122
t-snow-r4-0123
t-snow-r4-0124
t-snow-r4-0125
t-snow-r4-0126
t-snow-r4-0127
t-snow-r4-0128
t-snow-r4-0129
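As a rough sanity check on the 30-hour figure quoted above, here is a small sketch of the arithmetic. It assumes 500G means 500 decimal gigabytes and that the wipe time is dominated by sequential writes; the function names are hypothetical and are not part of any wipe tooling.

```python
def sustained_write_mb_per_s(disk_gb, passes, hours):
    """Average write rate (MB/s) implied by a multi-pass wipe duration."""
    total_bytes = disk_gb * 1e9 * passes
    return total_bytes / (hours * 3600) / 1e6


def wipe_hours(disk_gb, passes, mb_per_s):
    """Estimated wipe duration in hours at a given sustained write rate."""
    total_mb = disk_gb * 1e3 * passes
    return total_mb / mb_per_s / 3600


# Figures from the comment above: 7 passes over a 500G disk in 30 hours.
rate = sustained_write_mb_per_s(disk_gb=500, passes=7, hours=30)
print(f"implied sustained write rate: {rate:.1f} MB/s")

# The same rate fed back in should give the original 30 hours.
print(f"estimated hours at that rate: {wipe_hours(500, 7, rate):.1f}")
```

A host whose wipe runs far longer than this estimate may be worth checking for a stalled task or a failing disk.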
(In reply to Amy Rich [:arr] [:arich] from comment #6)
> :alin: do you have a list of which hosts should REMAIN in production? That
> might be easier if I'm wiping all of the disks on the decom ones so that we
> can donate them.

Per Coop's comment in https://bugzilla.mozilla.org/show_bug.cgi?id=1292656#c6:
"In that case, we should step down to 20 machines for now, and then 10 machines once Firefox 49 is out."

I don't have more info other than this :-)
Flags: needinfo?(aselagea)
Status for the 9 machines for which the secure wipe hasn't been initiated yet or was unsuccessful:

t-snow-r4-0003 - missing from slavealloc, marked as "decommissioned" in inventory, decommissioned in bug 1095241
t-snow-r4-0022 - no notes in slavealloc, inventory: "production", reachable, no tracking bug
t-snow-r4-0023 - no notes in slavealloc, inventory: "production", reachable, tracking bug: 938285
t-snow-r4-0024 - no notes in slavealloc, inventory: "production", reachable, tracking bug: 820499
t-snow-r4-0029 - missing from slavealloc, marked as "decommissioned" in inventory, decommissioned in bug 1116572
t-snow-r4-0034 - missing from slavealloc, marked as "decommissioned" in inventory, decommissioned in bug 1202415
t-snow-r4-0040 - slavealloc: "Disabled in bug 1279394", inventory: "production", reachable after a PDU reboot
t-snow-r4-0044 - slavealloc: "Disabled in bug 1279394", inventory: "production", still at workflow selector
t-snow-r4-0115 - missing from slavealloc, marked as "decommissioned" in inventory, decommissioned in bug 996314
I've hand started the wipe on t-snow-r4-0044.

22, 23, and 24 didn't actually have keys on them and had the wrong root pw, so I don't know what state they were in. I've kicked off a wipe on those now as well. That leaves us 20 machines (76 - 97) still in production.
unracked about 80 of the 100 mac minis. i'm velcroing the cables so we can reuse these chassis/slots if we plan to order more for capacity/DR planning. should be able to wrap this up next week.

:lizzard, do you want to meet at the data center (SCL3) next week with a truck and some boxes - preferably Wednesday or Friday?
Removed from nagios in svn revision 121354.
remaining minis have been removed. taking 1 mini for netops/gcox (bug 1301216) testing in PDX1.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED