Bug 1505216
Opened 7 years ago
Closed 7 years ago
monitor all UPS for temperature
Categories
(Infrastructure & Operations :: MOC: Service Requests, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: van, Assigned: ryanc)
Details
it doesn't look like TPE1's UPS alerted us of the temperature issue in TPE1 today (11/6/2018) before the devices started shutting down. can we make sure all the UPSes are monitoring temperature? or maybe they are, but something is broken so it didn't alert in #sysadmins. i do see UPS ups-red01.df401-1.private.tpe1 on the observium list [1] though.
van> nagios-mdc1: status ups-red01.df401-1.private.tpe1.mozilla.net:*
3:34 PM <nagios-mdc1> van: [networkops] ups-red01.df401-1.private.tpe1.mozilla.net:PING is OK - PING OK - Packet loss = 0%, RTA = 136.26 ms Last Checked: 2018-11-06 23:33:22 UTC
3:34 PM van: [networkops] ups-red01.df401-1.private.tpe1.mozilla.net:UPS Battery Replacement is OK - SNMP OK - Status 1 Last Checked: 2018-11-06 23:32:49 UTC
3:34 PM van: [networkops] ups-red01.df401-1.private.tpe1.mozilla.net:UPS Battery Status is OK - SNMP OK - Status 2 Last Checked: 2018-11-06 23:28:47 UTC
3:34 PM van: [networkops] ups-red01.df401-1.private.tpe1.mozilla.net:UPS Output Status is OK - SNMP OK - Status 2 Last Checked: 2018-11-06 23:31:09 UTC
[1] https://observium1.private.mdc2.mozilla.com/alert_check/alert_test_id=19/
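For what it's worth, none of the four services the bot returned looks like a temperature check. A rough sketch of how that could be confirmed from the bot output is below. The regex and the "temp" substring match are assumptions about how these status lines and service names are formatted, not anything taken from the Nagios configuration itself.

```python
import re

# Assumed line shape: "<host>:<service> is <STATE> - <detail>".
STATUS_RE = re.compile(
    r"(?P<host>[\w.-]+):(?P<service>[^:]+?) is "
    r"(?P<state>OK|WARNING|CRITICAL|UNKNOWN) - "
)


def parse_services(lines):
    """Return {service_name: state} for every line that matches STATUS_RE."""
    services = {}
    for line in lines:
        match = STATUS_RE.search(line)
        if match:
            services[match.group("service")] = match.group("state")
    return services


def has_temperature_check(services):
    """Heuristic: treat any service whose name contains 'temp' as a temperature check."""
    return any("temp" in name.lower() for name in services)


# The four status lines from the IRC log in this comment.
SAMPLE = [
    "van: [networkops] ups-red01.df401-1.private.tpe1.mozilla.net:PING is OK - PING OK - Packet loss = 0%, RTA = 136.26 ms Last Checked: 2018-11-06 23:33:22 UTC",
    "van: [networkops] ups-red01.df401-1.private.tpe1.mozilla.net:UPS Battery Replacement is OK - SNMP OK - Status 1 Last Checked: 2018-11-06 23:32:49 UTC",
    "van: [networkops] ups-red01.df401-1.private.tpe1.mozilla.net:UPS Battery Status is OK - SNMP OK - Status 2 Last Checked: 2018-11-06 23:28:47 UTC",
    "van: [networkops] ups-red01.df401-1.private.tpe1.mozilla.net:UPS Output Status is OK - SNMP OK - Status 2 Last Checked: 2018-11-06 23:31:09 UTC",
]

services = parse_services(SAMPLE)
print(sorted(services))
print("temperature check present:", has_temperature_check(services))
```

The name-based heuristic would miss a temperature check that is named differently, so it is only a quick sanity check.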
Comment 1 • 7 years ago (Reporter)
i thought observium interacted with the irc bots. is it possible to add the temperature/humidity UPS check to nagios so we get the alerts in #sysadmins? thanks!
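To make the request concrete, here is a minimal sketch of the threshold logic such a check could use. It follows the standard Nagios plugin return codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The 35/40 C thresholds and the function name are hypothetical, and the SNMP read itself is left out, so the temperature is simply passed in.

```python
# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate_temperature(celsius, warn=35.0, crit=40.0):
    """Map a temperature reading to a (return_code, message) pair.

    warn and crit are placeholder thresholds, not values used in production.
    """
    if celsius is None:
        # No reading (for example, the SNMP query failed).
        return UNKNOWN, "TEMPERATURE UNKNOWN - no reading available"
    if celsius >= crit:
        state, label = CRITICAL, "CRITICAL"
    elif celsius >= warn:
        state, label = WARNING, "WARNING"
    else:
        state, label = OK, "OK"
    return state, f"TEMPERATURE {label} - {celsius:.1f} C"


code, message = evaluate_temperature(36.5)
print(code, message)
```

A real plugin would print the message and exit with the return code, so Nagios could route the alert to #sysadmins the same way it does for the existing UPS checks.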
Comment 2 • 7 years ago (Assignee)
(In reply to Van Le [:van] from comment #0)
> it doesn't look like TPE1's UPS alerted us of the temperature issue in TPE1
> today (11/6/2018) before the devices started shutting down. can we make sure
> all the UPSes are monitoring temperature? or maybe they are, but something is
> broken so it didn't alert in #sysadmins. i do see UPS
> ups-red01.df401-1.private.tpe1 on the observium list [1] though.
>
> van> nagios-mdc1: status ups-red01.df401-1.private.tpe1.mozilla.net:*
> 3:34 PM
> <nagios-mdc1> van: [networkops]
> ups-red01.df401-1.private.tpe1.mozilla.net:PING is OK - PING OK - Packet
> loss = 0%, RTA = 136.26 ms Last Checked: 2018-11-06 23:33:22 UTC
> 3:34 PM van: [networkops] ups-red01.df401-1.private.tpe1.mozilla.net:UPS
> Battery Replacement is OK - SNMP OK - Status 1 Last Checked: 2018-11-06
> 23:32:49 UTC
> 3:34 PM van: [networkops] ups-red01.df401-1.private.tpe1.mozilla.net:UPS
> Battery Status is OK - SNMP OK - Status 2 Last Checked: 2018-11-06 23:28:47
> UTC
> 3:34 PM van: [networkops] ups-red01.df401-1.private.tpe1.mozilla.net:UPS
> Output Status is OK - SNMP OK - Status 2 Last Checked: 2018-11-06 23:31:09
> UTC
>
> [1] https://observium1.private.mdc2.mozilla.com/alert_check/alert_test_id=19/
Yeah, it did alert. Temperature graph: https://observium1.private.mdc2.mozilla.com/graphs/to=1541548780/device=25/type=device_temperature/from=1541462380/legend=yes/
PagerDuty incident: https://mozilla.pagerduty.com/incidents/PODKQRR
Assignee: nobody → rchilds
Status: NEW → ASSIGNED
Comment 3 • 7 years ago (Assignee)
(In reply to Van Le [:van] from comment #1)
> i thought observium interacted with the irc bots. is it possible to add the
> temperature/humidity UPS check to nagios so we get the alerts in #sysadmins?
> thanks!
There's an IRC bot to query Observium, but it doesn't display alerts. The alerts do go to Slack, which I can invite you and others to. Let me know.
Comment 4 • 7 years ago (Assignee)
Regarding the title of this bug, "monitor all UPS for temperature": we do monitor all UPSes, as long as they're listed in the secrets file we've designated for this host, so that Puppet adds them automatically.
Comment 5 • 7 years ago (Assignee)
(In reply to Ryan C [:ryanc] (UTC-4) from comment #4)
> And in regards to the title of this bug, "monitor all UPS for temperature",
> we do monitor all ups', as long as they're in the secrets file we've
> designated for this host so that Puppet automatically adds them
puppet_secrets/hiera/nodes/observium1.private.mdc2.mozilla.com.yaml
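Since coverage depends on each UPS being listed in that file, a quick comparison against an inventory could catch a missing entry. The sketch below assumes the hostnames have already been extracted into plain lists. The layout of the YAML file isn't shown in this bug, so no parsing is attempted, and `ups-example.invalid` is a made-up hostname for illustration.

```python
def find_unmonitored(inventory, monitored):
    """Return inventory hostnames that do not appear in the monitored list.

    Comparison ignores case and surrounding whitespace.
    """
    monitored_set = {host.strip().lower() for host in monitored}
    return sorted(
        host for host in inventory
        if host.strip().lower() not in monitored_set
    )


# Hypothetical inputs: one hostname from this bug, one invented placeholder.
inventory = [
    "ups-red01.df401-1.private.tpe1.mozilla.net",
    "ups-example.invalid",
]
monitored = ["ups-red01.df401-1.private.tpe1.mozilla.net"]

print(find_unmonitored(inventory, monitored))
```

Any hostname this returns would be a UPS that Puppet has not been told to add to Observium.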
Comment 6 • 7 years ago (Reporter)
thanks for clearing this up :ryanc!
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED