Closed Bug 823497 Opened 12 years ago Closed 12 years ago

Monitoring for ReplayDB

Categories

(mozilla.org Graveyard :: Server Operations, task)

x86_64
Linux
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: selenamarie, Assigned: ashish)

Could I get a list of current monitoring for Socorro's ReplayDB? I think that the IP for this system is 10.8.70.124. I'd like to review that and add a few checks.
Blocks: 823507
No Postgres-specific checks are currently configured on socorro1.dev.db.phx1.mozilla.com (except for a Disk check on /pgdata).
Assignee: server-ops → ashish
Let's add checks for:
* Postgres running on port 5499
* hot_standby_delay is < 16777216
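The two checks requested above can be sketched roughly as follows. This is a minimal illustration, not the Nagios plugin that was actually deployed. It assumes the 16777216 threshold is a byte count (16 MiB), that WAL positions are available as `X/Y` hex strings, and that the byte offset is `(X << 32) + Y`, which holds for current PostgreSQL releases but may not match older ones. The function names `lsn_to_bytes`, `delay_status`, and `port_open` are hypothetical, and how the two positions are fetched from the primary and the replica is left out.

```python
import socket

# Threshold from the request above; treating it as bytes is an assumption.
MAX_DELAY_BYTES = 16777216


def lsn_to_bytes(lsn: str) -> int:
    """Convert a WAL position such as '0/3000000' to an absolute byte offset."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) + int(low, 16)


def delay_status(primary_lsn: str, standby_lsn: str,
                 limit: int = MAX_DELAY_BYTES) -> tuple[int, str]:
    """Return a Nagios-style (exit_code, message) pair for the replay delay."""
    delay = lsn_to_bytes(primary_lsn) - lsn_to_bytes(standby_lsn)
    if delay < limit:
        return 0, f"OK: standby is {delay} bytes behind"
    return 2, f"CRITICAL: standby is {delay} bytes behind (limit {limit})"


def port_open(host: str, port: int = 5499, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

The exit codes 0 and 2 follow the usual Nagios plugin convention for OK and CRITICAL. A TCP connect only shows that something is listening on the port; it does not prove that Postgres accepts queries, so a real check would also log in and run a trivial statement.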
Ping!
(In reply to Selena Deckelmann :selenamarie :selena from comment #3)
> Ping!

Sorry I was out sick all of last week. I'll set these up in your AM. Since this is dev, would the alerts need oncall intervention?
(In reply to Ashish Vijayaram [:ashish] from comment #4)
> (In reply to Selena Deckelmann :selenamarie :selena from comment #3)
> > Ping!
>
> Sorry I was out sick all of last week. I'll set these up in your AM. Since
> this is dev, would the alerts need oncall intervention?

Not for now. We are running a restore test on this system this week, so having it alert on #socorro-alerts will be sufficient for now.
The Hot Standby Delay check runs against the production DB by default. Is that expected, or should I swap it for a different host? Thanks!
That is expected. The replication is going from prod->dev for historical reasons. We will be moving at some point in the near future. The replica is down today because of a restore test, and will be restored by EOD.
Added the checks. The Postgres check is failing; the nagios user might need to be authorized. Please do the needful. Thanks!
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Product: mozilla.org → mozilla.org Graveyard