Closed
Bug 1378457
Opened 8 years ago
Closed 8 years ago
nagios-releng bot is not showing same alerts in different channels
Categories
(Infrastructure & Operations :: MOC: Service Requests, task, P2)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: arich, Assigned: jlaz)
Details
The nagios-releng bot is on two engops-related channels, #buildduty and #platform-ops-alerts. We are seeing alerts in #platform-ops-alerts that do not show up in #buildduty, which means people are not seeing specific alerts and cannot respond to them (hence marking this as a P2).
I'm not sure if this has something to do with the recent cutover from nagios3 to nagios4 or what. Maybe the old machine is still up and not notifying on all the checks?
Here are some examples that don't show up in #buildduty:
Wed 17:48:03 UTC [7833] [] signing-linux-1.srv.releng.use1.mozilla.com:Pending Scriptworker Tasks is CRITICAL: PENDING_TASKS CRITICAL - 220/100 pending tasks for scriptworker-prov-v1:signing-linux-v1 (http://m.mozilla.org/Pending+Scriptworker+Tasks)
Wed 17:48:12 UTC [7834] [] signing-linux-4.srv.releng.usw2.mozilla.com:Pending Scriptworker Tasks is CRITICAL: PENDING_TASKS CRITICAL - 220/100 pending tasks for scriptworker-prov-v1:signing-linux-v1 (http://m.mozilla.org/Pending+Scriptworker+Tasks)
Wed 17:51:14 UTC [7835] [] signing-linux-2.srv.releng.usw2.mozilla.com:Pending Scriptworker Tasks is CRITICAL: PENDING_TASKS CRITICAL - 236/100 pending tasks for scriptworker-prov-v1:signing-linux-v1 (http://m.mozilla.org/Pending+Scriptworker+Tasks)
If I look at the web GUI https://nagios1.private.releng.scl3.mozilla.com/releng-scl3/ they show up there.
Comment 1•8 years ago
modules/nagios4/manifests/prod/mozilla/contactgroups.pp:
There's a contactgroup 'build' that sends to buildduty and platform-ops-alerts.
There's a contactgroup 'platformops' that sends to just platform-ops-alerts.
"Pending Scriptworker Tasks" sends to just contactgroup 'platformops'. And it appears to be the only check config'ed that way.
Probably just flip that check to 'build', and remove the 'platformops' contactgroup if you don't have a case for alerting one channel without the other.
Assignee
Comment 2•8 years ago
The 'scriptworker tasks' Nagios checks were created recently in bug 1332640, and the initial alerts were sent to #platform-ops-alerts as requested. We can definitely add the buildduty contact group if you'd like, but note that this will generate some noise, which is being addressed in bug 1377147.
Reporter
Comment 3•8 years ago
Please change the contact group to build, not platformops. The platformops contact group is only for non-releng stuff (like dev-services) that doesn't go to #buildduty but is managed by relops.
Assignee
Comment 4•8 years ago
Changed the Nagios contactgroup to build; we should be set now.
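For illustration only, the change described here might look roughly like the fragment below. It is a sketch, assuming the check is defined as a hash entry in the same style as the "puppet_catalog" entry in modules/nagios4/manifests/prod/releng/services.pp; the hash key and the check_command name are hypothetical and not taken from this bug.

```puppet
# Hypothetical sketch: the key and check_command names are assumptions.
"pending_scriptworker_tasks" => {
    service_description => "Pending Scriptworker Tasks",
    check_command       => 'check_pending_scriptworker_tasks',  # assumed name
    # Before: contact_groups => 'platformops'  (notifies #platform-ops-alerts only)
    contact_groups      => 'build',  # notifies #buildduty and #platform-ops-alerts
},
```

With the 'build' contactgroup, the alert goes to both channels, which is the behaviour the reporter asked for.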
Assignee: nobody → jlaz
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Comment 5•8 years ago
Related to this bug, :arr also requested the following change.
commit 78bcea674e7ac9af81dbbec5a75b80e6abe25b2d
Author: Keegan Ferrando :fauweh <kferrando@mozilla.org>
Date: Thu Jul 6 13:27:20 2017 -0700
Send releng puppet compilation errors to #sysadmins as per :arr
diff --git a/modules/nagios4/manifests/prod/releng/services.pp b/modules/nagios4/manifests/prod/releng/services.pp
index 7552cca534..9135e031cb 100644
--- a/modules/nagios4/manifests/prod/releng/services.pp
+++ b/modules/nagios4/manifests/prod/releng/services.pp
@@ -1171,17 +1171,17 @@ class nagios4::prod::releng::services {
                 default => [
                 ]
             }
         },
         "puppet_catalog" => {
             service_description => "Puppet catalog compilation",
             check_command => 'check_puppet_catalog',
             normal_check_interval => 30,
-            contact_groups => 'infrastructure, build',
+            contact_groups => 'infrastructure, build, sysalerts',
             hostgroups => $nagiosbot ? {
                 'nagios-releng' => [
                     'admin-servers',
                     'nagios-servers'
                 ],
                 default => [
                     'nagios-servers'
                 ]