Closed Bug 974449 Opened 10 years ago Closed 10 years ago

Request for downtime for bouncer failover

Categories

(Infrastructure & Operations :: Change Requests, task)

x86
macOS
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: bjohnson, Unassigned)

References

Details

To reduce the size and improve the performance of bouncer database replication, we need to switch the bouncer databases to MIXED mode replication instead of statement-based only, as per bug 943214.

We would like a short downtime window to bring down the replication master and set it to MIXED mode. Before that, each slave will be removed from load balancing, brought down, and set to MIXED.

- date, time, duration of maintenance:
  Sunday February 23rd, 3:00 PST - 30 minutes.

- system(s) affected:
   - bouncer databases.  (bouncer[1-3].db.phx1.mozilla.com)

- end-user impact:
   - bouncer writes will be inhibited for about 10-15 seconds.


- maintenance plan and timeline (link to a wiki or etherpad is fine):
   - Remove the slaves from the load balancer one at a time and restart the mysql service on each, setting the replication mode to MIXED.
   - Upon success, remove the master and do the same. This will result in 10-15 seconds of downtime per server while it restarts.
   - No impact from the slave restarts. 10-15 seconds of downtime for writes only while the master restarts.
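
A minimal sketch of the ordering described in the plan above, written in Python purely for illustration. The function name `rolling_restart_plan`, the step wording, and the step that returns each slave to the load balancer are assumptions, not part of the original plan. `binlog_format=MIXED` is the MySQL setting that selects mixed-mode replication. The choice of bouncer1 as the master in the usage lines is also an assumption.

```python
from typing import List


def rolling_restart_plan(master: str, slaves: List[str]) -> List[str]:
    """Return the ordered steps: every slave first, one at a time, then the master."""
    steps: List[str] = []
    for host in slaves:
        steps.append(f"remove {host} from load balancer")
        steps.append(f"restart mysql on {host} with binlog_format=MIXED")
        # Assumption: the slave goes back into rotation before the next one is touched.
        steps.append(f"return {host} to load balancer")
    # The master is restarted last; this is the only step that blocks writes.
    steps.append(f"restart mysql on {master} with binlog_format=MIXED")
    return steps


# Hypothetical usage with short host names.
plan = rolling_restart_plan("bouncer1", ["bouncer2", "bouncer3"])
for number, step in enumerate(plan, start=1):
    print(number, step)
```

Keeping the master last means reads are served by the remaining hosts throughout, and only the final step causes the 10-15 second write outage.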

- rollback plan / rollback point (at which point will you determine to roll back)
   - This only changes a replication method flag, which should not need a rollback. If necessary, we'll reset to statement-based replication and re-sync the slaves.

- notification mechanisms
   - cyborgshadow will notify bouncer teams of approved downtime window.

- who will be point, who else will be involved 
   - db team: Brandon Johnson :cyborgshadow (point)
Flags: cab-review?
Moved the timing to the Tree Closing Window on 2/22 at 1200 PST
Flags: cab-review? → cab-review+
Changing this as discussed in the meeting; it's easier just to do the failover so bouncer1 is freed up. I will be on point for this.
Summary: Request for downtime for bouncer restarts. → Request for downtime for bouncer failover
Blocks: 971818
I have moved all the slaves to slaves of bouncer2, so now it goes:

bouncer1<->bouncer2->{bouncer3/4/backup2}

This is in preparation for the failover, when bouncer2 will be the master.
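
For illustration only, the topology above can be written down as a map from each host to the host it replicates from. This is a sketch under assumptions: it reads "bouncer3/4/backup2" as three hosts named bouncer3, bouncer4 and backup2, and the names `SOURCES` and `replicas_of` are hypothetical.

```python
from typing import Dict, List

# Each key replicates from its value. bouncer1 and bouncer2 replicate from each other.
SOURCES: Dict[str, str] = {
    "bouncer1": "bouncer2",
    "bouncer2": "bouncer1",
    "bouncer3": "bouncer2",
    "bouncer4": "bouncer2",
    "backup2": "bouncer2",
}


def replicas_of(source: str, sources: Dict[str, str] = SOURCES) -> List[str]:
    """Return the hosts that replicate directly from `source`, sorted by name."""
    return sorted(host for host, src in sources.items() if src == source)


print(replicas_of("bouncer2"))  # hosts fed by bouncer2
print(replicas_of("bouncer1"))  # hosts fed by bouncer1
```

With this layout only bouncer2 replicates from bouncer1, so bouncer1 can be taken out after the failover without re-pointing any other host.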
bouncer has been failed over. bouncer2 is the master, bouncer1 is out of the load balancer.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Product: mozilla.org → Infrastructure & Operations
Change Request: --- → approved
Flags: cab-review+