Closed Bug 838159 Opened 11 years ago Closed 11 years ago

Additional master server for Socorro Postgres databases

Categories

(Data & BI Services Team :: DB: MySQL, task)

x86_64
Linux
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: selenamarie, Assigned: mpressman)

The strange failure of the drive in master01 brought to light a failure mode we'd like to avoid: 

* Secondary database goes offline or is unavailable
* Our master database fails and we need to fail over, or restore from a backup

We would like a third system that is the same as master01/master02 for failover. 

Because we've made read-only (RO) access to the secondary system available, people are building applications that connect only to it for data. We need a third system running online as a replica at all times to support the RO Zeus node.
Summary: Additional master server for Socorro databases → Additional master server for Socorro Postgres databases
Blocks: 823507
Is this third system running as a replica this? https://bugzilla.mozilla.org/show_bug.cgi?id=813317

Does that count as n+1 or do we need another one?
(In reply to Sheeri Cabral [:sheeri] from comment #1)
> Is this third system running as a replica this?
> https://bugzilla.mozilla.org/show_bug.cgi?id=813317
> 
> Does that count as n+1 or do we need another one?

We need another one. The Reporting replica is going to be used for a different purpose (long-running experimental queries) and will have a different configuration.
Awesome! Wasn't sure if it could also be used as read-only. Sounds like no.

I'm all for n+1 configuration!

cc'ing Corey for hardware ordering.
(In reply to Sheeri Cabral [:sheeri] from comment #3)
> Awesome! Wasn't sure if it could also be used as read-only. Sounds like no.

It can be used as an RO, but the query timeout will be set to 5 minutes. The query timeout on the reporting replica is going to be much longer, possibly an hour or more.

> I'm all for n+1 configuration!
> 
> cc'ing Corey for hardware ordering.

Woot.
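To make the timeout difference discussed above concrete, here is a minimal sketch of how the two replicas' settings might differ. It assumes the limit is enforced with PostgreSQL's `statement_timeout` setting in each server's `postgresql.conf`, and that the 5-minute limit applies to the regular RO replica. The thread confirms neither point. The replica labels are hypothetical, and `1h` is only a stand-in for "an hour or more".

```python
# Hypothetical replica labels. The 5-minute value comes from the comment
# above; "1h" is an illustrative stand-in for "an hour or more".
REPLICA_TIMEOUTS = {
    "read_only_replica": "5min",
    "reporting_replica": "1h",
}


def conf_line(timeout):
    """Build the postgresql.conf line that caps query run time."""
    # statement_timeout is a standard PostgreSQL setting; quoting the value
    # lets it carry a time unit such as min or h.
    return f"statement_timeout = '{timeout}'"


def render_all(timeouts):
    """Map each replica label to its postgresql.conf line."""
    return {name: conf_line(value) for name, value in timeouts.items()}


if __name__ == "__main__":
    for name, line in render_all(REPLICA_TIMEOUTS).items():
        print(f"{name}: {line}")
```

Setting the value per server rather than per role is a design assumption here: a streaming replica is read-only, so a role-level setting could not be changed on one replica alone.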
Blade server shipped via FedEx 294466306093837 ETA 2/18
Storage blades shipped via FedEx 812085414085796 ETA 2/15
Server hard drives (2) shipped via FedEx 812085414085864 ETA 2/15
Storage blade hard drives (12) shipped via FedEx 294466306090096 ETA 2/19
Removing Rich: the server is now racked, so it's ready for SRE kickstarting and then puppetizing. Matt, I'm leaving this in your hands.
Assignee: server-ops-database → mpressman
This is up and running as socorro3.db.phx1 and is in Puppet.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Product: mozilla.org → Data & BI Services Team