Closed Bug 759466 Opened 13 years ago Closed 12 years ago

Install the new r5 minis in scl3

Categories

(Infrastructure & Operations :: RelOps: General, task)

x86_64
macOS
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: coop, Assigned: dividehex)

I've heard back from the Graphics team and they don't care about having a distinct GPU on the testing Macs. Given that, I think we should order more minis in the existing rev5 configuration so we can migrate machines between pools as necessary. Amy tells me that 106 minis will fill two racks to capacity, one of which already has some minis in it, so let's go ahead and order that many.
Assignee: server-ops-releng → arich
The order for the minis has been placed. We'll be claiming 6 of these to replace relops minis in scl1 (so we can reclaim those that are currently being used as staging for the r5 builders). Are we also going to make some of these builders?
Blocks: 765223
(In reply to Amy Rich [:arich] [:arr] from comment #1)
> Are we also going to make some of these builders?

I've filed bug 765223 to get 10 of these imaged as builders.
We have 106 new minis in scl3 that are being racked and cabled. 6 of them should replace the r5-miniNN machines in scl1 that we temporarily set up for releng before we had scl3 infrastructure. I believe these belong on the build network (not try), but coop can confirm. 10 of them will be split between build and try (coop to designate where these go). I believe we have one mini set aside for the 10.8 talos ref image already, so we don't need to reserve one here. So the other 90 will be set up as 10.8 preview 4 release 2 machines. We need to notify dcops regarding which vlans these all belong on.
Assignee: arich → jwatkins
Summary: Order 106 rev5 minis as 10.8 testers → Install the new r5 minis in scl3
Inventory information should be forthcoming from dcops as part of bug 763579.

90 of these machines should be in vlan 256 and be called:
talos-mtnlion-r5-NNN.test.releng.scl3.mozilla.com where NNN is 001 - 090
They should get a clean (unpuppetized) copy of 10.7 upgraded to preview 4 release 2 (with power saving turned off, remote login turned on, and the cltbld account created).

14 of them should be in vlan 252 and be called:
bld-lion-r5-NNN.build.releng.scl3.mozilla.com where NNN is 081 - 094
They should get the same image as other bld-lion-r5-NNN machines.

2 of them should be in vlan 264 and be called:
bld-lion-r5-NNN.try.releng.scl3.mozilla.com where NNN is 095 - 096
They should get the same image as other bld-lion-r5-NNN machines.
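For anyone who needs the full host list for inventory, DNS, or DHCP, the naming scheme in this comment can be expanded mechanically. What follows is only a rough sketch in Python: the `POOLS` table and the `expand_hosts` helper are hypothetical names made up for illustration, not part of any releng tooling. The prefixes, domains, vlans, and number ranges are copied from the comment.

```python
# Hypothetical sketch: expand the naming scheme from this comment into
# one (fqdn, vlan) pair per machine. Nothing here talks to inventory/DNS.

# (prefix, domain, vlan, first NNN, last NNN), as stated in the comment.
POOLS = [
    ("talos-mtnlion-r5", "test.releng.scl3.mozilla.com", 256, 1, 90),
    ("bld-lion-r5", "build.releng.scl3.mozilla.com", 252, 81, 94),
    ("bld-lion-r5", "try.releng.scl3.mozilla.com", 264, 95, 96),
]


def expand_hosts(pools=POOLS):
    """Return a list of (fqdn, vlan) tuples covering every pool."""
    hosts = []
    for prefix, domain, vlan, first, last in pools:
        for n in range(first, last + 1):
            # NNN is zero-padded to three digits, e.g. 001 or 081.
            hosts.append((f"{prefix}-{n:03d}.{domain}", vlan))
    return hosts


if __name__ == "__main__":
    for fqdn, vlan in expand_hosts():
        print(vlan, fqdn)
```

The three ranges cover 90 + 14 + 2 machines, which matches the 106 minis ordered.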
Inventory, DNS, and DHCP entries have been added for all 106 hosts. Nagios is broken at the moment, so I'm waiting until that's fixed before adding any checks.
The builders have been added to nagios and downtimed until the 13th. The mtnlion machines have been added to the configs, but are commented out until we get an image working and get them online. Jake: It might be best to just have dcops netboot them and put lion on them for now so that we can manage them remotely (unless you're near getting the mountain lion stuff done). That will at least help us identify any DOAs.
I have run into problems with deploying 10.8 with our current version of DS. So in the meantime, let's start by asking dcops to netboot bld-lion-r5-{081-096} and . I've set the default DS group to the 'Restore bld-lion-r5' workflow. Once these are all imaged (AND accounted for), we can change the default workflow to 'Restore fresh 10.7.2 image to R5' and ask dcops to netboot the rest so that we can identify any DOAs. These should be netbooted in batches of no more than 10 at a time so as not to choke the server.
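To make the batching request concrete for dcops, the host list can be cut into groups of at most 10. This is just a sketch in Python: the `batches` helper is a hypothetical name, and it only prints the groups (it does not drive DS or netboot anything). The short hostnames follow the bld-lion-r5-{081-096} range mentioned in this comment.

```python
# Hypothetical sketch: split a host list into netboot batches of <= 10.
# This only groups names; it does not contact DS or any machine.


def batches(hosts, size=10):
    """Yield consecutive slices of `hosts`, each at most `size` long."""
    if size < 1:
        raise ValueError("batch size must be at least 1")
    for start in range(0, len(hosts), size):
        yield hosts[start:start + size]


# Builder range from this comment: bld-lion-r5-081 through -096.
builders = [f"bld-lion-r5-{n:03d}" for n in range(81, 97)]

if __name__ == "__main__":
    for number, group in enumerate(batches(builders), start=1):
        print(f"batch {number}: {group[0]} .. {group[-1]} ({len(group)} hosts)")
```

With 16 builders this gives one batch of 10 and one of 6. The same helper would apply to the remaining machines once the default workflow is switched.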
Depends on: 770736
The following builders have been imaged and are now ready for use (this includes the replacement for the 6 preprod machines r5-mini-001 through r5-mini-006):

bld-lion-r5-080.build.releng.scl3
bld-lion-r5-081.build.releng.scl3
bld-lion-r5-082.build.releng.scl3
bld-lion-r5-083.build.releng.scl3
bld-lion-r5-084.build.releng.scl3
bld-lion-r5-085.build.releng.scl3
bld-lion-r5-086.build.releng.scl3
bld-lion-r5-087.build.releng.scl3
bld-lion-r5-088.build.releng.scl3
bld-lion-r5-089.build.releng.scl3
bld-lion-r5-090.build.releng.scl3
bld-lion-r5-091.build.releng.scl3
bld-lion-r5-092.build.releng.scl3
bld-lion-r5-093.build.releng.scl3
bld-lion-r5-094.build.releng.scl3
bld-lion-r5-095.try.releng.scl3
bld-lion-r5-096.try.releng.scl3
Blocks: 773331
Just want to make a note that we are stalled on this work while waiting on releng to finish the puppet modules before we deploy the final image. I know Kim is working on it, but there was no note in the bug.
Okay, please enable these machines. I'd like to try to puppetize them all and run tests invoked via buildbot.
All machines have been reimaged except for those needing physical touch because I was unable to log in. Filed bug 781564 to handle those.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
No longer depends on: 760093
Component: Server Operations: RelEng → RelOps
Product: mozilla.org → Infrastructure & Operations