Closed
Bug 697269
Opened 13 years ago
Closed 13 years ago
modify hg-mirrors sync check to operate on a per-repo basis
Categories
(Infrastructure & Operations :: RelOps: General, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: arich, Assigned: dustin)
Details
Right now the hg-mirrors check looks at all the repos at once. We would be better served by checking each repo individually. With the combined check, if one repo is out of sync (because of an ongoing push) and recovers, but then another goes out of sync before the next check (because of a different ongoing push), we get notified even though no single repo stayed out of sync (a false positive). A per-repo check would not notify in that case.
Handing this to dustin to whip up a new check.
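The false-positive scenario described above can be sketched roughly as follows. This is only an illustration: the repo names and the sample results are made up, and it assumes the monitoring system notifies only after a check has failed on consecutive runs (the bug does not state the actual retry settings).

```python
# Hypothetical sync results for two consecutive check runs.
# True means the mirror matches upstream; repo names are illustrative only.
runs = [
    {"repo-a": False, "repo-b": True},   # run 1: push to repo-a in progress
    {"repo-a": True, "repo-b": False},   # run 2: repo-a recovered, repo-b mid-push
]


def aggregate_failures(runs):
    """Length of the current failure streak when all repos share one check."""
    streak = 0
    for run in runs:
        streak = streak + 1 if not all(run.values()) else 0
    return streak


def per_repo_failures(runs):
    """Length of the current failure streak, tracked separately per repo."""
    streaks = {}
    for run in runs:
        for repo, in_sync in run.items():
            streaks[repo] = 0 if in_sync else streaks.get(repo, 0) + 1
    return streaks


print(aggregate_failures(runs))
print(per_repo_failures(runs))
```

The combined check sees two failing runs in a row, while no individual repo fails more than once in a row, so only the combined check would cross a two-failure notification threshold.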
Comment 1 (Assignee) • 13 years ago
I added the following as check_hg_mirror (separate from the existing check_hg_mirrorsync). This is committed in puppet, but the nagios checks are still using check_hg_mirrorsync, so no alerts should occur. I'll make the nagios changes next week.
----
import yaml
import os
import os.path
import sys
import time
import argparse

parser = argparse.ArgumentParser(description='check status of a mirrored hg repo')
parser.add_argument('-r', dest='repo', required=True,
                    help='repo to check')
parser.add_argument('-W', dest='warning', type=int, default=200,
                    help='data age (s) for WARNING')
parser.add_argument('-C', dest='critical', type=int, default=300,
                    help='data age (s) for CRITICAL')
args = parser.parse_args()

datafile = "/dev/shm/check_hg_mirrorsync/state"

# age of the state file, based on its modification time
statd = os.stat(datafile)
now = time.time()
data_age = now - statd.st_mtime

# safe_load assumes the state file holds only plain YAML mappings and strings
with open(datafile) as f:
    data = yaml.safe_load(f)
total_count = len(data.keys())

if args.repo not in data:
    print("'%s' not a known repo" % args.repo)
    sys.exit(3)  # UNKNOWN

# check sync before data age, since it's more important
master_tip = data[args.repo]['upstream_tip']
local_tip = data[args.repo]['mirror_tip']
if master_tip != local_tip:
    print("repo '%s' is out of sync" % args.repo)
    sys.exit(2)  # CRITICAL

if data_age > min(args.warning, args.critical):
    print("sync data is stale. %i seconds" % data_age)
    if data_age > args.critical:
        sys.exit(2)  # CRITICAL
    sys.exit(1)  # WARNING
else:
    print("SYNC OK")
    sys.exit(0)
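
The decision logic of the script above can also be written as a pure function, which makes it easier to exercise without a state file. This is a sketch, not the committed check: the function name `evaluate`, the sample state, and the tip values are hypothetical, and the state layout (a mapping of repo name to `upstream_tip` and `mirror_tip`) is inferred from the keys the script reads.

```python
# Exit codes as used in the script above (the usual Nagios plugin convention).
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate(data, repo, data_age, warning=200, critical=300):
    """Return (exit_code, message) for one repo, in the script's check order."""
    if repo not in data:
        return UNKNOWN, "'%s' not a known repo" % repo
    entry = data[repo]
    # sync state is checked before data age, as in the script
    if entry["upstream_tip"] != entry["mirror_tip"]:
        return CRITICAL, "repo '%s' is out of sync" % repo
    if data_age > critical:
        return CRITICAL, "sync data is stale. %i seconds" % data_age
    if data_age > warning:
        return WARNING, "sync data is stale. %i seconds" % data_age
    return OK, "SYNC OK"


# Hypothetical state: repo names and tip values are made up for illustration.
state = {
    "repo-a": {"upstream_tip": "abc123", "mirror_tip": "abc123"},
    "repo-b": {"upstream_tip": "def456", "mirror_tip": "000000"},
}

for name in ("repo-a", "repo-b", "missing"):
    print(evaluate(state, name, data_age=10))
```

Each repo gets its own result, so an out-of-sync `repo-b` does not affect the status reported for `repo-a`.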
Comment 2 (Assignee) • 13 years ago
These are starting to go green in nagios.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Updated • 12 years ago
Component: Server Operations: RelEng → RelOps
Product: mozilla.org → Infrastructure & Operations