Closed Bug 1237811 Opened 5 years ago Closed 2 years ago
Establish a unified log of hg
This bug is about building a unified log of hg.mo events that is derived from the replication log. It will be used to power publishing to Pulse, creating an aggregate Gecko repo, etc. More details will be provided in the commit messages.
https://hg.mozilla.org/hgcustom/version-control-tools/rev/a58cb053f645e0b3fb2c877de0ab250dca47dd74 vcsreplicator: add message aggregation daemon (bug 1237811)
fubar: can you please add a Nagios monitor for a process with "vcsreplicator-aggregator /etc/mercurial/pushdataaggregator.ini" in its arguments to the hgssh master server? The full process is "/var/hg/venv_tools/bin/python2.7 /var/hg/venv_tools/bin/vcsreplicator-aggregator /etc/mercurial/pushdataaggregator.ini" but the actual python path may change over time. There should be at most 1 process.
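For illustration, the requested check could be sketched roughly as below. This is not the NRPE check that was actually deployed. The function names are made up for the example, the `ps` invocation assumes a procps-style `ps`, and treating zero processes as critical is an assumption, since the request only states the upper bound of one process.

```python
import subprocess

# Substring from the request above. The interpreter path is deliberately left
# out of the match because it may change over time.
NEEDLE = "vcsreplicator-aggregator /etc/mercurial/pushdataaggregator.ini"

# Standard Nagios plugin exit codes.
OK, CRITICAL, UNKNOWN = 0, 2, 3


def count_matching(cmdlines, needle=NEEDLE):
    """Count the command lines that contain the needle."""
    return sum(1 for line in cmdlines if needle in line)


def evaluate(count, min_procs=1, max_procs=1):
    """Map a process count to (exit_code, message).

    min_procs=1 is an assumption; only the upper bound is stated in the bug.
    """
    if count < min_procs or count > max_procs:
        return CRITICAL, "PROCS CRITICAL: {} aggregator processes (expected {}-{})".format(
            count, min_procs, max_procs)
    return OK, "PROCS OK: {} aggregator process(es)".format(count)


def list_cmdlines():
    """Return the full argument string of every process (assumes procps-style ps)."""
    out = subprocess.check_output(
        ["ps", "-e", "-ww", "-o", "args="], universal_newlines=True)
    return out.splitlines()


def main():
    """Run the check, print the status line, and return the Nagios exit code."""
    try:
        code, message = evaluate(count_matching(list_cmdlines()))
    except (OSError, subprocess.CalledProcessError) as exc:
        code, message = UNKNOWN, "PROCS UNKNOWN: {}".format(exc)
    print(message)
    return code
```

A wrapper script would call `sys.exit(main())` so that Nagios sees the exit code.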
I can, though what happens when we have a failover event? There isn't any automated way for Nagios to know which machine is the master at any given time, AFAIK.
We could have puppet or something put a file on the machine indicating which machine is master or if the current machine is master. Then we could write custom Nagios checks that take the master into consideration. e.g. if you aren't the master, the Nagios check verifies 0 processes are present. This all draws more attention to the fact that our failover situation is far from robust...
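A minimal sketch of that idea, assuming a marker file is dropped on the master: the path `/etc/mercurial/master` and all function names here are hypothetical and exist only for illustration. The check expects exactly one aggregator process on the master and none anywhere else.

```python
import os

# Hypothetical marker file that configuration management would place on the
# current master only. The path is an assumption, not an existing convention.
MASTER_MARKER = "/etc/mercurial/master"

# Standard Nagios plugin exit codes.
OK, CRITICAL = 0, 2


def is_master(marker_path=MASTER_MARKER):
    """Treat the host as master if the marker file exists."""
    return os.path.exists(marker_path)


def expected_count(master):
    """One aggregator on the master, zero on every other host."""
    return 1 if master else 0


def check(process_count, master):
    """Compare the observed process count with what the host's role expects."""
    role = "master" if master else "non-master"
    want = expected_count(master)
    if process_count != want:
        return CRITICAL, "CRITICAL: {} aggregator processes on {} (expected {})".format(
            process_count, role, want)
    return OK, "OK: {} aggregator processes on {}".format(process_count, role)
```

For example, `check(0, is_master())` returns an OK status on a host without the marker file and a CRITICAL status on the master. The process count itself would come from whatever process listing the real check uses.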
Meh. Between this and bug 1196915, I'll just make a hostgroup containing only hgssh3 for the short term. Added a 'procs - hg vcsreplicator aggregator' NRPE check. It will need a mana page and docs on what to do when it goes off.
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED