Closed Bug 1267793 Opened 8 years ago Closed 8 years ago

Make hgssh1 the failover

Categories

(Developer Services :: Mercurial: hg.mozilla.org, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: gps, Assigned: gps)

Details

Currently, hgssh3 (a temporary loaner from IT) is our master, running CentOS 7. hgssh2 is our standby, running CentOS 6. hgssh1 is sitting idle, freshly imaged with CentOS 7.

We want to restore hgssh1 as our master and hgssh2 as our standby so we can give hgssh3 back to IT.

Alternatively, we can buy new hardware, rack it as hgssh1, make it master, then update hgssh2 to CentOS 7.
I think the new hardware will take a week or two to get in place, so continuing the shell game with hgssh1/2/3 is OK. Maybe we just make hgssh1 master with hgssh3 as standby, so they're both CentOS 7, until the new hardware is in? I don't think Corey needs the loaner blade back immediately, so this splits the difference between the plans.
We decided to stand up hgssh1 as a failover so we have 2 machines running CentOS 7. When the new hardware arrives, we can install it as hgssh2 and something else, then switch things over to the new hardware. We'll keep hgssh3 as primary until then because its hardware is newer and IT doesn't need it back yet.
Assignee: nobody → gps
Status: NEW → ASSIGNED
Summary: Restore hgssh1 as hg.mo master, hgssh2 as standby → Make hgssh1 the failover
hgssh1:/etc/mercurial/pulse.json is missing. I installed it manually. But it should be present in Puppet.
Flags: needinfo?(klibby)
I also think /etc/mercurial/mirror and /etc/mercurial/mirror.pub need some Puppet work.

Actually, those files need more than just Puppet work: we should start some better practices around how we manage that shared SSH key. But that's for another bug.

I also noticed hgssh3:/etc/mercurial/fubar-out.txt
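
For spotting this kind of drift between hosts, a minimal sketch of a check that could be run on each hgssh machine. The list of expected files is taken only from the paths mentioned in this bug; the function name and the idea of running it per host are assumptions, not something Puppet or the existing tooling provides.

```python
from pathlib import Path

# File names this bug mentions as expected under /etc/mercurial.
# The list is an assumption drawn from the comments above, not a full inventory.
EXPECTED = ("pulse.json", "mirror", "mirror.pub")


def missing_files(config_dir, expected=EXPECTED):
    """Return the expected file names that are absent from config_dir."""
    base = Path(config_dir)
    return [name for name in expected if not (base / name).is_file()]


if __name__ == "__main__":
    import sys

    # Default to the directory discussed in this bug; allow an override.
    target = sys.argv[1] if len(sys.argv) > 1 else "/etc/mercurial"
    for name in missing_files(target):
        print(f"missing: {target}/{name}")
```

This only reports absent files. It would not catch stray files such as fubar-out.txt, and it says nothing about whether Puppet manages them.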
(In reply to Gregory Szorc [:gps] from comment #3)
> hgssh1:/etc/mercurial/pulse.json is missing. I installed it manually. But it
> should be present in Puppet.

added.

(In reply to Gregory Szorc [:gps] from comment #4)
> I also noticed hgssh3:/etc/mercurial/fubar-out.txt

nuked.

(In reply to Gregory Szorc [:gps] from comment #4)
> I also think /etc/mercurial/mirror and /etc/mercurial/mirror.pub need some
> Puppet work.
> Actually, those files need more than just Puppet work: we should start some
> better practices around how we manage that shared SSH key. But that's for
> another bug.

wfm.
Flags: needinfo?(klibby)
At this point, I'm pretty confident making hgssh1 the failover in the zlb.

There is still some work that could be done on the Ansible side of things to denote which server is the master and which services should be running. But we can track that elsewhere.
hgssh1 is enabled and hgssh2 is disabled in the hg-ssh-failover pool in Zeus.

We really ought to fail over and back sometime soon to verify that everything works as expected.
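
As a rough illustration of the pool state described above, here is a sketch that compares an observed node-to-enabled mapping against the desired one. It does not talk to Zeus at all: the mapping is supplied by hand, and the function and variable names are hypothetical.

```python
# Desired state from this bug: hgssh1 enabled, hgssh2 disabled
# in the hg-ssh-failover pool.
DESIRED = {"hgssh1": True, "hgssh2": False}


def pool_mismatches(actual, desired=DESIRED):
    """Return {node: (actual_state, desired_state)} for nodes that differ.

    A node missing from `actual` is reported with an actual state of None.
    """
    return {
        node: (actual.get(node), want)
        for node, want in desired.items()
        if actual.get(node) != want
    }


if __name__ == "__main__":
    # Hand-entered example of what the pool might look like before the change.
    observed = {"hgssh1": False, "hgssh2": True}
    for node, (have, want) in pool_mismatches(observed).items():
        print(f"{node}: enabled={have}, expected enabled={want}")
```

An empty result only means the pool configuration matches; an actual failover test is still needed to show that the standby works.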
We tested the failover. Not sure what else there is to do in this bug, so closing.
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED