Closed
Bug 774829
Opened 12 years ago
Closed 12 years ago
upgrade heatsink/fan/memory and move mw32-ix-slave13 - mw32-ix-slave26 to scl1
Categories
(Infrastructure & Operations :: DCOps, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: arich, Unassigned)
References
Details
(Whiteboard: mtv to scl1 [reit])
Derek, you asked for a heads-up when we need to upgrade and move these boxes. Coop says that we'll be ready to go soon, so I'm filing this bug to coordinate. He'll update this bug when the machines are disabled and ready to move.

The following machines are slated to be upgraded and moved:

mw32-ix-slave11
mw32-ix-slave12
mw32-ix-slave13
mw32-ix-slave14
mw32-ix-slave15
mw32-ix-slave16
mw32-ix-slave17
mw32-ix-slave18
mw32-ix-slave19
mw32-ix-slave20
mw32-ix-slave21
mw32-ix-slave22
mw32-ix-slave23
mw32-ix-slave24
mw32-ix-slave25
mw32-ix-slave26

They will be changing hostnames when they reach the new datacenter, so if they have hostnames on the labels, please let me know and we can get you an updated list. Matt or Jake can tell you where the parts for the upgrades are and give any special instructions.
Reporter | ||
Comment 1•12 years ago
The parts are stored in boxes in haxxor.
Updated•12 years ago
Summary: upgrade heatfink/fan/memory and move mw32-ix-slave11 - mw32-ix-slave26 to scl1 → upgrade heatsink/fan/memory and move mw32-ix-slave11 - mw32-ix-slave26 to scl1
Updated•12 years ago
Whiteboard: mtv to scl1
Reporter | ||
Comment 2•12 years ago
The mapping of the old to new hostnames for this batch (each line lists the new hostname first, then the old one):

w64-ix-slave85 mw32-ix-slave11
w64-ix-slave86 mw32-ix-slave12
w64-ix-slave87 mw32-ix-slave13
w64-ix-slave88 mw32-ix-slave14
w64-ix-slave89 mw32-ix-slave15
w64-ix-slave90 mw32-ix-slave16
w64-ix-slave91 mw32-ix-slave17
w64-ix-slave92 mw32-ix-slave18
w64-ix-slave93 mw32-ix-slave19
w64-ix-slave94 mw32-ix-slave20
w64-ix-slave95 mw32-ix-slave21
w64-ix-slave96 mw32-ix-slave22
w64-ix-slave97 mw32-ix-slave23
w64-ix-slave98 mw32-ix-slave24
w64-ix-slave99 mw32-ix-slave25
w64-ix-slave100 mw32-ix-slave26

w64-ix-slave85-mgmt mw32-ix-slave11-mgmt
w64-ix-slave86-mgmt mw32-ix-slave12-mgmt
w64-ix-slave87-mgmt mw32-ix-slave13-mgmt
w64-ix-slave88-mgmt mw32-ix-slave14-mgmt
w64-ix-slave89-mgmt mw32-ix-slave15-mgmt
w64-ix-slave90-mgmt mw32-ix-slave16-mgmt
w64-ix-slave91-mgmt mw32-ix-slave17-mgmt
w64-ix-slave92-mgmt mw32-ix-slave18-mgmt
w64-ix-slave93-mgmt mw32-ix-slave19-mgmt
w64-ix-slave94-mgmt mw32-ix-slave20-mgmt
w64-ix-slave95-mgmt mw32-ix-slave21-mgmt
w64-ix-slave96-mgmt mw32-ix-slave22-mgmt
w64-ix-slave97-mgmt mw32-ix-slave23-mgmt
w64-ix-slave98-mgmt mw32-ix-slave24-mgmt
w64-ix-slave99-mgmt mw32-ix-slave25-mgmt
w64-ix-slave100-mgmt mw32-ix-slave26-mgmt
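The pairs listed in this comment appear to follow a fixed pattern: the slave number goes up by 74 and any -mgmt suffix is kept. A minimal sketch of that pattern in Python follows, in case the list needs to be regenerated or checked. The helper name new_hostname and the OFFSET constant are hypothetical and are inferred only from the pairs in this comment; they are not part of inventory or any other tooling.

```python
# Hypothetical helper: derives the new hostname from the old one using the
# pattern visible in this comment (slave number + 74, "-mgmt" suffix kept).
OLD_PREFIX = "mw32-ix-slave"
NEW_PREFIX = "w64-ix-slave"
OFFSET = 74  # assumption inferred from the list: slave11 -> slave85


def new_hostname(old: str) -> str:
    """Return the w64 hostname for an mw32 hostname in this batch."""
    if not old.startswith(OLD_PREFIX):
        raise ValueError(f"not an mw32-ix-slave hostname: {old!r}")
    rest = old[len(OLD_PREFIX):]
    # Split "11-mgmt" into ("11", "-", "mgmt"); a plain "11" gives ("11", "", "").
    number, sep, suffix = rest.partition("-")
    return f"{NEW_PREFIX}{int(number) + OFFSET}{sep}{suffix}"


if __name__ == "__main__":
    for n in range(11, 27):
        old = f"{OLD_PREFIX}{n}"
        print(new_hostname(old), old)
        print(new_hostname(old + "-mgmt"), old + "-mgmt")
```

Running the script prints one "new old" pair per line, which can be compared against the list in this comment before any DNS or DHCP change is made.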
Updated•12 years ago
Whiteboard: mtv to scl1 → mtv to scl1 [reit]
Updated•12 years ago
colo-trip: --- → mtv1
Reporter | ||
Comment 3•12 years ago
The focus/purpose of this bug is on re-tasking a portion of the 32-bit builder pool from windows 32 to windows 64 to facilitate bug 758275 (increasing the size of the 64-bit builder pool). This should have no impact on any chemspills or regular builds because the only machines that should be moved are those that are extra capacity in the 32-bit builder pool.

Changing any of these machines to 64-bit builders necessitates moving them to scl1, and moving to scl1 necessitates upgrading the hardware.

The information needed here from releng (and the reason that this bug is currently blocked) is how many servers (some number between 0 and 16) can be retasked from 32-bit builders to 64-bit builders.

* IFF we only need 10 32-bit builders, then we go ahead and finish this move in its entirety, and it doesn't matter how long it takes us (within reason, say a day) to move and reimage them. We are only ADDING capacity to the 64-bit builder pool and removing UNUSED capacity from the 32-bit builder pool.

* IFF we need fewer than 26 32-bit builders but more than 10, then we can move some subset of machines, and it doesn't matter how long it takes us (within reason, say a day) to move and reimage them. We are only ADDING capacity to the 64-bit builder pool and removing UNUSED capacity from the 32-bit builder pool.

* IFF we still need all 26 32-bit builders, then we don't do the move at all and we R/F this bug without further action.
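The three cases in this comment reduce to simple arithmetic on the pool size (26) and the batch size (16). A minimal sketch in Python follows; the function name machines_to_move and its parameters are hypothetical and only restate the numbers given in this comment.

```python
# Hypothetical helper restating the three cases from this comment.
POOL_SIZE = 26   # current 32-bit builder pool, per this comment
BATCH_SIZE = 16  # machines in this bug that could be retasked


def machines_to_move(needed_32bit: int) -> int:
    """Return how many machines from this batch can be retasked to 64-bit."""
    if needed_32bit < 0:
        raise ValueError("needed_32bit must not be negative")
    spare = POOL_SIZE - needed_32bit
    # Never move more than the batch, and never a negative number.
    return max(0, min(BATCH_SIZE, spare))


if __name__ == "__main__":
    for needed in (10, 18, 26):
        print(needed, "needed ->", machines_to_move(needed), "to move")
```

With 10 builders needed the whole batch of 16 moves, with all 26 needed nothing moves, and any value in between moves a subset.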
Comment 4•12 years ago
Hal, I wanted to check in for an update on this. Any feedback on how many machines we want to repurpose as 64-bit?
Reporter | ||
Comment 5•12 years ago
Releng has requested that we move 14 of these 16 machines. Does dcops have a preference on which those are? If not, I suggest we upgrade/move:

mw32-ix-slave11
mw32-ix-slave12
mw32-ix-slave13
mw32-ix-slave14
mw32-ix-slave15
mw32-ix-slave16
mw32-ix-slave17
mw32-ix-slave18
mw32-ix-slave19
mw32-ix-slave20
mw32-ix-slave21
mw32-ix-slave22
mw32-ix-slave23
mw32-ix-slave24

In terms of priority, this is behind getting the tegras fully functional, so do you guys know when you might get a chance to upgrade and move them?
Comment 6•12 years ago
Hi Amy,

I spoke with Derek earlier about this bug, and he wants our "SLA" for inter-colo moves to be 10 business days. However, we're not that busy, so we can start working on the upgrades today or tomorrow and start moving them Monday. Have these hosts been brought down, and can we upgrade them at our convenience?

Thanks,
Van
Reporter | ||
Comment 7•12 years ago
Van: hwine will follow up when releng is ready for the machines to be taken out of service.
Comment 8•12 years ago
We need to leave 2 in this block, so we will be working on the following:

mw32-ix-slave13
mw32-ix-slave14
mw32-ix-slave15
mw32-ix-slave16
mw32-ix-slave17
mw32-ix-slave18
mw32-ix-slave19
mw32-ix-slave20
mw32-ix-slave21
mw32-ix-slave22
mw32-ix-slave23
mw32-ix-slave24

Hal is coordinating with RelEng to take these out of service and will update the bug when you can get started.
Reporter | ||
Comment 9•12 years ago
Actually, based on the list hal provided, the following should be moved (I didn't notice that 25 was not in his list):

mw32-ix-slave12
mw32-ix-slave13
mw32-ix-slave14
mw32-ix-slave15
mw32-ix-slave16
mw32-ix-slave17
mw32-ix-slave18
mw32-ix-slave19
mw32-ix-slave20
mw32-ix-slave21
mw32-ix-slave22
mw32-ix-slave23
mw32-ix-slave24
mw32-ix-slave25

That leaves 11 and 26 in mtv1 (no upgrade, no move).
Comment 10•12 years ago
2 corrections to comment #9:
- total count is 14 (not 12)
- the final, official, shut them down any time you want list is:

mw32-ix-slave13
mw32-ix-slave14
mw32-ix-slave15
mw32-ix-slave16
mw32-ix-slave17
mw32-ix-slave18
mw32-ix-slave19
mw32-ix-slave20
mw32-ix-slave21
mw32-ix-slave22
mw32-ix-slave23
mw32-ix-slave24
mw32-ix-slave25
mw32-ix-slave26

This is the official go to start unracking them! Thanks!
Reporter | ||
Comment 11•12 years ago
These hosts have been renamed in inventory (see comment 2 for the new hostname mappings). DCops: please update the rack and switch info once the machines have moved.
Reporter | ||
Comment 12•12 years ago
The old hosts have been removed from nagios, and commented-out entries have been added to nagios for the new hosts.

Still to be done: remove the old hostnames from DHCP and DNS once the move is complete and verified.
Comment 13•12 years ago
:arr, we spoke in #dcops and you said the hard drives were to be upgraded as well. The HDs inside the machines are currently 250 GB 7200 rpm SATA drives, and the ones I found inside haxxor are also 250 GB 7200 rpm SATA drives. I didn't bother swapping out the drives since they're identical. Please let me know if this is not correct and I should be looking for another set of drives.

Thanks,
Van
Comment 14•12 years ago
The hosts have been upgraded to 8 GB of memory, and the heat sinks and fans have been replaced. Lisa is scheduling a pick-up from MV and delivery to SCL1 for us Thursday at 4pm. We can probably rack, cable and inventory them Friday. Please let me know if there are any issues.

Thanks,
Van
Comment 15•12 years ago
Van, we can't schedule the move sooner?
Comment 16•12 years ago
Melissa, as Van noted in Comment 6, we generally need 10 business days to schedule a move of this size. When we have more than two or three servers, it becomes necessary to involve WPR and their third-party moving service. Van has actually managed to get everyone scheduled within 6 days of the request, which is already better than the expected forecast.
Reporter | ||
Updated•12 years ago
Summary: upgrade heatsink/fan/memory and move mw32-ix-slave11 - mw32-ix-slave26 to scl1 → upgrade heatsink/fan/memory and move mw32-ix-slave13 - mw32-ix-slave26 to scl1
Comment 17•12 years ago
(In reply to Van Le [:van] from comment #13)
> :arr, we spoke in #dcops and you said the hard drives were to be upgraded as
> well. the hds inside the machine are currently 250gb 7200 rpm SATA drives,
> and the ones I found inside haxxor are also 250gb 7200 rpm SATA drives. i
> didnt bother with swapping out the drives since they're identical. please
> let me know if this is not correct and I should be looking for another set
> of drives.
>
> thanks,
> van

:arr, was this question about disks ever resolved?
Comment 18•12 years ago
(In reply to Derek Moore from comment #16)
> Melissa, as Van noted in Comment 6, we generally need 10 business days to
> schedule a move of this size. When we have more than two or three servers,
> it becomes necessary to involve WPR and their third-party moving service.
> Van has actually managed to get everyone scheduled within 6 days of the
> request, which is already better than the expected forecast.

:dmoore:

1) Could we shuffle 2-3 machines at a time? :-) We're talking 12 machines here total, but each batch of 2-3 would help out as soon as they came online.

2) Are there other things we could be doing while we wait?
** machine imaging?
** verify netflows?
** nagios?
...?

As you can probably tell, I'm looking for a way we can have these machines in production helping clear our win2008 backlog asap.
Reporter | ||
Comment 19•12 years ago
There's nothing left that we can do to these machines until they're in scl1.
Reporter | ||
Comment 20•12 years ago
(In reply to Amy Rich [:arich] [:arr] from comment #19) Actually, I should be more specific there. If the hardware's been upgraded, there's nothing more that dcops or relops can do for these machines. If releng has other bugs that they want to file to prep things on their end, there may be stuff to do there (buildbot, graphs, etc). This bug is just for the hardware, though, and there's a corresponding bug that relops has for reimaging the machines (which will include monitoring) once the hardware is up.
Comment 21•12 years ago
Hosts have been moved to SCL1. We should have them up by tomorrow.
colo-trip: mtv1 → scl1
Comment 22•12 years ago
Correction: by end of day tomorrow. We still need to rack, cable, update inventory and configure the switch these hosts will be residing on.
Comment 23•12 years ago
The move has been completed. A new rack, switch, and PDU have been installed and configured. All hosts should be reachable, and inventory has been updated. Please let me know of any issues.

https://inventory.mozilla.org/en-US/systems/racks/?location=0&status=&rack=246&allocation=
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Updated•11 years ago
Assignee: server-ops → server-ops-dcops
Updated•10 years ago
Product: mozilla.org → Infrastructure & Operations