Closed Bug 739787 Opened 12 years ago Closed 12 years ago

migrate two or more vms from IT vmware to releng vmware in sjc1

Categories

(Infrastructure & Operations :: Virtualization, task)

x86
macOS
task
Not set
major

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: arich, Assigned: dparsons)

Details

Per our irc conversation:

The infra vmware hosts (pm-vmware*) that are home to some of the releng machines in sjc1 are moving on the D train, so we need to migrate some or all of the following vms off of those vmware hosts to the releng vmware hosts (bm-vmware*) before the 6th.

There are two that definitely need to move:

ganglia2.build
bm-admin01

The first is relatively easy to take a short downtime on with coordination up front.  That should also be the same for the second, but that's a bit trickier since it's the nagios host.

Two others that *may* need to move:

    dm-wwwbuild01 - also need disk storage - 729667 (sensitive to downtime)
    byob-keymaster1 - 726707 (would need to double check about downtime)

And two others that likely will not need to move, but I'm including for the sake of completeness in case any configuration needs to be done up front:

    dm-pvtbuild01 (fuzzer)  - 723340
    mozillabuild-builder - off, used by ted, just needs to be in scl3


These would all be moving to bm-vmware<something> and its attached storage.  Whichever hosts look like they're best able to handle the load would be fine, since they *should* all be configured the same.
Severity: normal → major
Status: NEW → ASSIGNED
The following VMs have been migrated to bm-vmware*:
ganglia2.build
bm-admin01
mozillabuild-builder

The following VMs have NOT been migrated:
dm-wwwbuild01
byob-keymaster1
dm-pvtbuild01

Those VMs were not migrated because they need to be shut down to be migrated and that outage needs to be scheduled. Additionally, VLAN plumbing needs to be completed (bug opened for that).
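For anyone tracking the remaining work, the status above can be checked with a small set difference. This is only an illustrative sketch: the VM names come from this bug, while the variable names are made up here and nothing in it talks to vmware.

```python
# VM names taken from the bug description and the status comment above.
# The variable names are hypothetical and used only for this illustration.
requested = {
    "ganglia2.build",
    "bm-admin01",
    "dm-wwwbuild01",
    "byob-keymaster1",
    "dm-pvtbuild01",
    "mozillabuild-builder",
}
migrated = {"ganglia2.build", "bm-admin01", "mozillabuild-builder"}

# VMs that were listed in the request but have not moved yet.
remaining = sorted(requested - migrated)

# VMs reported as moved that were never on the list (should be empty).
unexpected = sorted(migrated - requested)

print("remaining:", remaining)
print("unexpected:", unexpected)
```

The remaining set is exactly the three VMs that need a scheduled outage.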
Waiting to hear from releng when the tree closure will be so that we can schedule these last three.  Speculation seems to point to Tuesday at the moment.
Tentative tree closure for 9:00 - 12:00 on 4/6 to move the rest of these hosts.  coop to confirm in this bug tomorrow.
Hrm, no one updated this bug while I was out.  The tree closure is scheduled for today from 9:00 - 12:00 PDT.  Dan said he'll be performing the migration around 10:00.
Chris sent the following downtime notice at April 5, 2012 10:33:21 AM PDT to tree-management, planning, release@ and relops@:


The Mozilla IT and RelEng teams need to take a downtime on Friday, April
6th (tomorrow), to move a VM that runs several core services. The
affected services are:

clobberer
talos bundles
builddata
buildapi
trychooser
tryserver-symbols

The downtime is scheduled for 3 hours, starting at 09:00 PST. We will
open the trees and inform #developers as soon as possible after the
maintenance is complete.

As always, please let RelEng/myself know ASAP if there is any reason we
should not proceed with this downtime.

cheers,
--
coop
The following VMs have been migrated to the releng cluster:

dm-wwwbuild01
byob-keymaster1
dm-pvtbuild01

byob-keymaster1 has been left down (the state it was in before the migration) because the actual vm is supposed to be migrating to scl3 as soon as it's done being copied there.  Please test your services to make sure all is well.
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
dm-wwwbuild01 is back up.  I've done the following tests (all destructive changes made to my own pushes)

-loaded the clobberer and set a couple of build-system android clobbers

-cancelled a running test on build-system using buildapi
{
  "completed_at": null, 
  "what": {
    "bid": 10344251
  }, 
  "action": "cancel_build", 
  "who": "jford@mozilla.com", 
  "when": 1333735003.278666, 
  "complete_data": null
}
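
To read the buildapi response above, it may help to decode the `when` field. The sketch below assumes `when` is seconds since the Unix epoch and that a null `completed_at` means the request had not finished yet; neither is confirmed in this bug. PDT is modelled as a fixed UTC-7 offset to keep the example self-contained.

```python
import json
from datetime import datetime, timedelta, timezone

# Response body copied from the comment above.
response_text = """
{
  "completed_at": null,
  "what": {
    "bid": 10344251
  },
  "action": "cancel_build",
  "who": "jford@mozilla.com",
  "when": 1333735003.278666,
  "complete_data": null
}
"""

# Assumption: fixed UTC-7 offset standing in for PDT.
PDT = timezone(timedelta(hours=-7), "PDT")

record = json.loads(response_text)

# Assumption: "when" is a Unix timestamp in seconds.
requested_at = datetime.fromtimestamp(record["when"], tz=PDT)

# Assumption: a null "completed_at" means the request is still pending.
is_pending = record["completed_at"] is None

print(record["action"], record["what"]["bid"])
print(requested_at.strftime("%Y-%m-%d %H:%M:%S %Z"))
print("pending" if is_pending else "completed")
```

Under those assumptions the cancel request was made at 10:56:43 PDT on 2012-04-06, inside the 9:00 - 12:00 tree closure.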

-pending and running builds are not working
http://build.mozilla.org/builds/running.html
http://build.mozilla.org/builds/pending.html

-generated a wait times report with the web ui
Product: mozilla.org → Infrastructure & Operations