[some?] tegra nagios alerts seem to be notifying immediately rather than waiting

RESOLVED FIXED

Status

Infrastructure & Operations
RelOps
RESOLVED FIXED
6 years ago
4 years ago

People

(Reporter: aki, Assigned: arr)

(Reporter)

Description

6 years ago
We're seeing a fair number of alert pairs like this:

* tegra XXX PING is CRITICAL
* 10-20 minutes later, tegra XXX PING is OK.

Out of curiosity, what are these set to?  The tegras need 12 of 12 failed checks before notifying, but I don't remember what the interval is between those checks.

The minis appear to check 30/30 before notifying; if the intervals are the same, we might want to increase the time-before-notify on the tegras as well.
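For comparing the two settings, here is a rough sketch of the arithmetic. It assumes Nagios's usual soft/hard state behaviour, where the first failed check counts as attempt 1 and each retry runs one retry interval later. The check intervals are not stated in this bug, so the 1-minute retry interval below is purely a placeholder, and `minutes_to_hard_state` is a hypothetical helper name.

```python
def minutes_to_hard_state(max_check_attempts, retry_interval_min):
    """Approximate minutes from the first failed check to a hard state.

    Assumes Nagios's usual soft/hard state model: the first failure counts as
    attempt 1, and each later attempt runs one retry interval after the last.
    """
    if max_check_attempts < 1:
        raise ValueError("max_check_attempts must be at least 1")
    return (max_check_attempts - 1) * retry_interval_min


# Placeholder only: the real retry interval is not given in this bug.
ASSUMED_RETRY_INTERVAL_MIN = 1

for name, attempts in (("tegra", 12), ("mini", 30)):
    wait = minutes_to_hard_state(attempts, ASSUMED_RETRY_INTERVAL_MIN)
    print(f"{name}: {attempts}/{attempts} -> about {wait} min before a hard state")
```

If the intervals really are the same for both host types, the minis would wait roughly two and a half times as long as the tegras before a notification could go out.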
(Assignee)

Comment 1

6 years ago
The script that generates the hosts files had a bug: it ignored the first_notification_delay for the PING checks (host checks and tegra-tcp checks were fine).  I've fixed the script and regenerated the config files.  The 720-minute delay should now be in effect.
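To illustrate the kind of fix described, here is a minimal sketch of a generator that writes first_notification_delay into every service definition, PING included. This is not the actual script: the function names, check commands, `use` template, and the host name `tegra-001` are all hypothetical. Only the directive name and the 720-minute value come from this bug.

```python
# 720 minutes is the delay mentioned in this bug.
FIRST_NOTIFICATION_DELAY_MIN = 720

# Hypothetical service descriptions mapped to hypothetical check commands.
CHECKS = {
    "PING": "check_ping_hypothetical",
    "tegra-tcp": "check_tegra_tcp_hypothetical",
}


def render_service(host_name, description, check_command,
                   delay_min=FIRST_NOTIFICATION_DELAY_MIN):
    """Render one Nagios service definition as text."""
    lines = [
        "define service {",
        "    use                       generic-service-hypothetical",
        f"    host_name                 {host_name}",
        f"    service_description       {description}",
        f"    check_command             {check_command}",
        # Emitted for every check type, so no service is skipped.
        f"    first_notification_delay  {delay_min}",
        "}",
    ]
    return "\n".join(lines)


def render_host_services(host_name):
    """Render all service definitions for one host."""
    return "\n\n".join(
        render_service(host_name, description, command)
        for description, command in CHECKS.items()
    )


if __name__ == "__main__":
    print(render_host_services("tegra-001"))  # hypothetical host name
```

Putting the delay in the one function that renders every service avoids a per-check code path where a single check type could miss the directive.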
(Assignee)

Comment 2

6 years ago
Status: NEW → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
(Assignee)

Updated

6 years ago
Assignee: server-ops-releng → arich
Component: Server Operations: RelEng → RelOps
Product: mozilla.org → Infrastructure & Operations