Closed Bug 1023873 Opened 7 years ago Closed 7 years ago

documentation update for "crontab is CRITICAL: CRITICAL - duplicates (DuplicatesCronApp)" required

Categories

(Socorro :: Infra, task)

x86
macOS
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dmaher, Assigned: lonnen)

Details

tl;dr is that alerts on "duplicates (DuplicatesCronApp)" need to be triaged immediately.  The documentation should reflect a greater sense of urgency for this alert.


Example alert:

05:42:31 < nagios-phx1> Wed 05:42:31 PDT [1398] sp-admin01.phx1.mozilla.com:Socorro Admin - crontab is CRITICAL: CRITICAL - duplicates (DuplicatesCronApp) (http://m.mozilla.org/Socorro+Admin+-+crontab)
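
For anyone wiring this urgency into tooling, here is a minimal sketch of how an alert line in the format above might be parsed and flagged for immediate triage. The regular expression, the function names (`parse_alert`, `needs_immediate_triage`) and the rule itself are illustrative assumptions based only on the single example line in this bug. They are not part of Nagios, Socorro, or the actual run book.

```python
import re

# Assumed layout, inferred from the one example alert in this bug:
#   <irc prefix> [<id>] <host>:<service> is <STATE>: <detail> (<url>)
ALERT_RE = re.compile(
    r"\[(?P<id>\d+)\]\s+"
    r"(?P<host>[^:\s]+):(?P<service>.+?) is (?P<state>[A-Z]+): "
    r"(?P<detail>.*?)\s+\((?P<url>https?://[^)\s]+)\)\s*$"
)

SAMPLE = (
    "05:42:31 < nagios-phx1> Wed 05:42:31 PDT [1398] "
    "sp-admin01.phx1.mozilla.com:Socorro Admin - crontab is CRITICAL: "
    "CRITICAL - duplicates (DuplicatesCronApp) "
    "(http://m.mozilla.org/Socorro+Admin+-+crontab)"
)


def parse_alert(line):
    """Return the alert fields as a dict, or None if the line does not match."""
    match = ALERT_RE.search(line)
    return match.groupdict() if match else None


def needs_immediate_triage(alert):
    """Hypothetical rule: a CRITICAL crontab alert naming DuplicatesCronApp."""
    return (
        alert is not None
        and alert["state"] == "CRITICAL"
        and "DuplicatesCronApp" in alert["detail"]
    )


if __name__ == "__main__":
    alert = parse_alert(SAMPLE)
    print(alert)
    print("triage now:", needs_immediate_triage(alert))
```

The rule is deliberately narrow: it matches only the job named in this bug, so other crontabber jobs would need their own entries if they deserve the same urgency.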


Ramifications:

06:45:48 < selenamarie> phrawzty:  see bug 1023867
06:45:56 < selenamarie> phrawzty: /var/log/socorro/crontabber.log
06:51:59 <@phrawzty> selenamarie: so it looks like some "bad" data was inserted into postgres ?
06:53:11 < selenamarie> phrawzty: "attempted" to be inserted :) it did not succeed
06:53:17 < selenamarie> the underlying problem is not that.
06:53:49 < selenamarie> underlying problem could be any number of things so i'm going to sit here and read through update_reports_duplicates, reformatting it so that it is readable to start, and then see if either a data fix or a stored proc fix is most appropriate
06:53:53 < selenamarie> we might do both
06:54:03 < selenamarie> this is a pretty bad failure.
06:54:19 < selenamarie> i think we should update the documentation for this particular issue in Mana so that it is clear someone should start triaging it immediately
06:54:29 < selenamarie> phrawzty: this blocked all data processing on postgres overnight
I've updated the run book entry for that alert to indicate that a bug should be filed for Critical alerts from crontabber and that further escalation should go to the admin contact listed on the Mana page for crash stats. I've also listed myself as the first admin contact on the Mana page.
Assignee: nobody → chris.lonnen
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED