As part of Bug 820918 we've upgraded Bugzilla in phx1 (both hardware and software). This is supposed to be active - passive, meaning if SCL3 fails, we can switch over to Bugzilla in phx1 until scl3 is back online. We should test this failover to phx1 as well as failback to scl3.

* date, time, duration of maintenance

13 July (tentative), happy to go with the tree closing window. We need about 4 hours. 1 hour to failover, 2 to test and 1 to fail back.

* system(s) affected

All of Bugzilla in phx1 and scl3, i.e.;

PHX1
----
web[1-5].bugs.phx1
bugzilla[1-4].db.phx1
push1.bugs.phx1
jobqueue1.bugs.phx1
Zeus Load balancers in phx1 (front and back end, config changes only)

SCL3
----
web[1-5].bugs.scl3
bugzilla[1-4].db.scl3
bugzillaadm.private.scl3
push1.bugs.scl3
jobqueue[1-2].bugs.scl3
Zeus Load balancers in scl3 (front and back end, config changes only)

* end-user impact

The site (bugzilla.mozilla.org) will be intermittently offline for the duration of the maintenance window. This is a test and is meant to identify possible issues (both infra and procedure wise) on how we can failover better.

* maintenance plan and timeline (link to a wiki or etherpad is fine)

TBD. Will work with Sheeri and glob

* rollback plan / rollback point (at which point will you determine to roll back)

Same as above

* notification mechanisms

As the CAB deems fit.

* who will be point, who else will be involved

fox2mike and sheeri, at the minimum.
Approved on July 3rd for the July 13th window.
This was tested and worked. I will file follow-up bugs for some minor things that we noticed. Bugzilla is back in scl3. The test bug was https://bugzilla.mozilla.org/show_bug.cgi?id=893470.