Closed
Bug 843843
Opened 11 years ago
Closed 11 years ago
Remove the old socorro cron log nagios check
Categories
(mozilla.org Graveyard :: Server Operations, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: ericz, Assigned: ericz)
Details
As per :lonnen, we should remove the check for this alert as it is no longer pertinent as of today.

<nagios-phx1> Thu 14:50:41 PST [152] sp-admin01.phx1.mozilla.com:Socorro Admin - cron_bugzilla.log is CRITICAL: FILE_AGE CRITICAL: /var/log/socorro/cron_bugzilla.log is 5951 seconds old and 9806793 bytes

IRC conversation:
[16:37] <lonnen> ericz: we made changes to our crontab today, and introduced a new single cron that manages and runs scripts that used to be crons
[16:37] <lonnen> ericz: used to be meaning... this morning
[16:37] <ericz> Does that mean it's ok if /var/log/socorro/cron_bugzilla.log is old?
[16:37] <lonnen> ericz: yes. it also means we should disable that alert
[16:38] <lonnen> because that job is no longer running
[16:40] <ericz> lonnen: Ok, should we disable that alert everywhere (assuming it runs in more than one place) or just specific servers/environments?
[16:41] <lonnen> ericz: I believe it runs on the admin node for stage and prod (and dev, but I don't know if nagios is hooked up on dev)
[16:41] <lonnen> ericz: and yeah, everywhere. although it should already be disabled on dev and stage
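For readers unfamiliar with this kind of alert, here is a minimal Python sketch of a file-age check that produces a message in the same shape as the FILE_AGE line above. It is an illustration only, not the plugin actually deployed on sp-admin01: the function name `check_file_age` and the warning/critical thresholds are assumptions, since the thread does not state the real thresholds.

```python
import os
import time

# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def check_file_age(path, warn_seconds, crit_seconds, now=None):
    """Return (exit_code, message) based on how long ago `path` was modified.

    `warn_seconds` and `crit_seconds` are hypothetical thresholds chosen by
    the caller; `now` can be injected to make the check deterministic.
    """
    now = time.time() if now is None else now
    try:
        st = os.stat(path)
    except OSError:
        # A missing log file is treated as critical in this sketch.
        return CRITICAL, f"FILE_AGE CRITICAL: File not found - {path}"

    age = int(now - st.st_mtime)
    if age >= crit_seconds:
        code, label = CRITICAL, "CRITICAL"
    elif age >= warn_seconds:
        code, label = WARNING, "WARNING"
    else:
        code, label = OK, "OK"
    return code, f"FILE_AGE {label}: {path} is {age} seconds old and {st.st_size} bytes"
```

Because the check only looks at the file's modification time, a log that stops being written (for example, because the cron job that wrote it was retired) goes critical even though nothing is actually broken, which is the situation described in this bug.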
Comment 1•11 years ago
Do the other cron checks need to go too? This is the full list:
/var/log/socorro/cron_bugzilla.log
/var/log/socorro/cron_status.log
/var/log/socorro/cron_create_partitions.log
/var/log/socorro/cron_submitter-crash-reports.allizom.org.log
Assignee
Comment 2•11 years ago
We're going to hold on this for a bit as there is debate about rolling back to the old checks.
Comment 3•11 years ago
Due to the looming weekend, we're going to config off this new cron system and re-enable the old cron job system. We'll need to re-enable the cron checks listed above, including the bugzilla cron. Apologies for the confusion before.
Assignee
Comment 4•11 years ago
No problem. None of them have been removed yet (just one was ack'd) so we should be good for the weekend. We can reconvene next week.
Assignee
Updated•11 years ago
Assignee: server-ops → eziegenhorn
Comment 5•11 years ago
Thank you!
Comment 6•11 years ago
We have pushed the same change to our crontab back into production. We are watching closely overnight to make sure we've fixed the bugs from last week. The following may alert overnight:
/var/log/socorro/cron_bugzilla.log
/var/log/socorro/cron_status.log
/var/log/socorro/cron_create_partitions.log
/var/log/socorro/cron_submitter-crash-reports.allizom.org.log
If all is well in the morning, they will need to be removed.
Comment 7•11 years ago
We've had two good overnight runs, so I think it's safe to proceed.
Comment 8•11 years ago
It's a no-change Friday, but from the nagios side it seems only the following two are in a "Critical" state:
/var/log/socorro/cron_bugzilla.log
/var/log/socorro/cron_submitter-crash-reports.allizom.org.log
The rest are ok:
/var/log/socorro/cron_create_partitions.log
/var/log/socorro/cron_status.log
/var/log/socorro/socorro-monitor.log
So to make sure, which ones are we now keeping? From the original request, it seems we were only removing:
/var/log/socorro/cron_bugzilla.log
To make sure this is done correctly, please clarify which ones you want removed come Monday.
Updated•11 years ago
Flags: needinfo?(chris.lonnen)
Comment 9•11 years ago
Sorry for the confusion. Please remove: /var/log/socorro/cron_bugzilla.log
Flags: needinfo?(chris.lonnen)
Assignee
Comment 10•11 years ago
Removed socorro-admin-cron_bugzilla check.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Updated•9 years ago
Product: mozilla.org → mozilla.org Graveyard