Closed Bug 825036 Opened 12 years ago Closed 11 years ago

Maintenance window for Socorro DBs to upgrade disks, failover and consolidate directories

Categories

(Data & BI Services Team :: DB: MySQL, task)

x86_64
Linux
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: selenamarie, Assigned: selenamarie)

Ticket for managing a maintenance window for Socorro DBs. 

Targeting 1/3/13.
Depends on: 812548, 790690, 825032
Disks have been replaced in tp-socorro01-master02 and replication has been restored.

Next, tp-socorro01-master01 when we can schedule a maintenance window.
Thinking this is more coordination with DBAs than WebOps... moving. If you need a webop on standby for the migration, let us know.

I understand DCOps might need to be available too... CC'ing dmoore.
Assignee: server-ops-webops → server-ops-database
Component: Server Operations: Web Operations → Server Operations: Database
QA Contact: nmaul → cshields
When will DCOps be onsite next?
Via IRC, we settled on January 17th as the day to conduct the maintenance. Time TBD. Waiting for confirmation of travel for DCOps.
Time for window will be 5-7pm PT, which will cover: 

* Hardhatting crash-stats.mozilla.com
* Failover from master->replica database
* Restoring crash-stats.mozilla.com to service

Then, once the site is back up and running on master02, we'll upgrade the disks and re-create the replica.
FWIW, this wasn't done due to phoenix load balancer issues as well as both :solarce and :mpressman being out.

I'd like to explore failing over to socorro-master02 when Matt is back next week. I know that the load balancer isn't used for everything, so failover may have many manual parts, but we should get those written down in a procedure so that we've been through at least one "fire drill" before a real failover.

That way, we can fail over to socorro-master02 and use it, and very little coordination with DCOps is needed to actually install the disks.
Sure thing. I talked with DCOps and they're up for replacing the disks 1/31.
We have started this window.
User-visible changes were completed at 5:47pm PT. 

The system is now replicating to master01, and the maintenance window has ended.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Assignee: server-ops-database → sdeckelmann
Depends on: 837705
Product: mozilla.org → Data & BI Services Team