Nagios monitoring for crontabber

RESOLVED DUPLICATE of bug 818736

Status

mozilla.org Graveyard
Server Operations
6 years ago
3 years ago

People

(Reporter: peterbe, Assigned: ashish)

(Reporter)

Description

6 years ago
We're moving to a new system for running *all* cron jobs called Crontabber. 
It records any failures and we want to be alerted if any job doesn't exist cleanly. 

The status of all cron jobs is written to a local crontabbers.json file on sp-admin01, and we also replicate this information to the Postgres server every time.

We also log every failure to the log files, by the way.

Comment 1

6 years ago
We're rolling crontabber back right now, at least partly because we don't have a monitoring method.  Let's work this out before we attempt to deploy it again.
Assignee: nobody → server-ops
Component: Infra → Server Operations
Product: Socorro → mozilla.org
QA Contact: jdow
Version: unspecified → other
(Reporter)

Comment 2

6 years ago
(In reply to Peter Bengtsson [:peterbe] from comment #0)
> We're moving to a new system for running *all* cron jobs called Crontabber. 
> It records any failures and we want to be alerted if any job doesn't exist
> cleanly. 
> 
s/exist/exit/

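In Python you can find any errors like this:

import json
import sys

def run():
    errors = 0
    # crontabbers.json maps each job name to its recorded state
    with open('crontabbers.json') as f:
        for key, info in json.load(f).items():
            if info.get('last_error'):
                print(key)
                print(info['last_error'])
                print(info['error_count'])
                print('')
                errors += 1

    # a non-zero exit status means at least one job has a recorded error
    sys.exit(errors)

if __name__ == '__main__':
    run()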


...in case that helps.
(Assignee)

Comment 3

6 years ago
What is the location of the JSON file? We already have a Nagios check that looks for JSON output from an HTTP request and alerts based on the response key/value pairs. We can modify that to check for a local file instead.
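
For illustration only, here is a minimal sketch of what a local-file variant of such a check might look like. It is not the existing Nagios check mentioned above. It assumes the state file is a JSON object mapping each job name to a dict with an optional `last_error` key (as the snippet in comment 2 implies), and it uses the standard Nagios plugin exit codes. The function name `check_crontabber` and the message wording are hypothetical.

```python
import json

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_crontabber(path):
    """Return (exit_code, message) for the crontabber state file at `path`.

    Assumes the file holds a JSON object of {job_name: {"last_error": ...}}.
    """
    try:
        with open(path) as f:
            state = json.load(f)
    except (IOError, OSError, ValueError) as exc:
        # Missing, unreadable, or malformed file: the state is unknown.
        return UNKNOWN, "UNKNOWN - cannot read %s: %s" % (path, exc)

    # A job counts as failing if it has a non-empty last_error entry.
    failing = sorted(
        name for name, info in state.items() if info.get("last_error")
    )
    if failing:
        return CRITICAL, "CRITICAL - %d job(s) failing: %s" % (
            len(failing), ", ".join(failing))
    return OK, "OK - %d job(s), no errors" % len(state)
```

A wrapper script would print the returned message and pass the returned code to `sys.exit()`, since Nagios reads the first line of output as the status text and the exit status as the state. Whether a failing job should be CRITICAL or only WARNING is a policy choice this sketch does not settle.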
Assignee: server-ops → ashish
Status: NEW → ASSIGNED
(Reporter)

Comment 4

6 years ago
Whichever head it's installed on, the file is in 
/home/socorro/persistent/crontabbers.json

We configure it like this
https://github.com/mozilla/socorro/blob/master/config/crontabber.ini#L23

E.g.
[pbengtsson@sp-admin01.phx1 ~]$ ls -l /home/socorro/persistent/crontabbers.json
-rw-r--r-- 1 socorro socorro 8133 Jul 31 16:46 /home/socorro/persistent/crontabbers.json

Comment 5

6 years ago
Crons are run from the admin boxes. You can find those listed in the crash-stats.mozilla.com mana docs. Alternatively, if you have access to the other Socorro Nagios alerts, all of our current cron monitoring should be taking place on the same box as this.
(Assignee)

Updated

6 years ago
Depends on: 781077
(Reporter)

Comment 6

6 years ago
Marking as dupe of 818736 because that one has more (recent) information. 
I guess I forgot about this bug when I filed the second one.
Status: ASSIGNED → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 818736
Product: mozilla.org → mozilla.org Graveyard