Bug 672145
Opened 14 years ago
Closed 14 years ago
Verify nagios monitoring on dm-ausstage01 & dp-ausstage01
Categories
(Infrastructure & Operations :: RelOps: General, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: nthomas, Assigned: jabba)
Details
Between 3:30am and 5am PDT on the 17th I got cron mail about being out of space on dp-ausstage01.phx.mozilla.com:/opt, but can't see any nagios alerts for the same period. Please check what nagios monitoring is in place for that box and for dm-ausstage01.mozilla.org.
Comment 1•14 years ago
dp-nagios01 is a host in Phoenix that's managed by ops and has a completely different nagios server, which releng doesn't have access to and which does not notify releng of anything, as far as I know. I'm CCing Rob and Justin since they've both been involved with that nagios server.
Assignee: server-ops-releng → arich
Reporter
Comment 2•14 years ago
I had checked the IRC channel where IT nagios reports and didn't see anything there, but I don't know if it reports everything.
Assignee
Comment 3•14 years ago
Yep, that box never got more than a ping check when it got set up:
22:36 < jabba> nagios-sjc1: status dm-ausstage01:*
22:36 < nagios-sjc1> jabba: dm-ausstage01:avg load is OK: OK - load average: 0.05, 0.04, 0.00
22:36 < nagios-sjc1> jabba: dm-ausstage01:disk - /opt is OK: DISK OK - free space: /opt 49809 MB (69% inode=20%):
22:36 < nagios-sjc1> jabba: dm-ausstage01:PING is OK: PING OK - Packet loss = 0%, RTA = 1.76 ms
22:36 < nagios-sjc1> jabba: dm-ausstage01:root partition is OK: DISK OK - free space: / 18911 MB (52% inode=59%):
22:37 < jabba> nagios-phx1: status dp-ausstage01.phx:*
22:37 < nagios-phx1> jabba: dp-ausstage01.phx:PING is OK: PING OK - Packet loss = 0%, RTA = 0.76 ms
I can add monitoring to it, but it will involve me puppetizing it.
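For anyone wanting to spot this kind of gap across more hosts, here is a rough sketch that scans pasted bot output and lists hosts whose only check is PING. The regular expression is an assumption based solely on the reply format visible in the log above, and `services_by_host` and `ping_only_hosts` are hypothetical helper names, not part of nagios or the IRC bot.

```python
import re
from collections import defaultdict

# Assumed reply shape, taken from the pasted log:
#   "22:36 < nagios-sjc1> jabba: dm-ausstage01:PING is OK: PING OK - ..."
#   "10:55 < nagios-phx1> jabba: host:avg load has not yet been checked."
STATUS_LINE = re.compile(
    r"<\s*(?P<bot>nagios-[\w-]+)>\s+\S+:\s+"
    r"(?P<host>[\w.-]+):(?P<service>.+?)\s+"
    r"(?:is\s+\w+:|has not yet been checked)"
)


def services_by_host(log_text):
    """Return {host: set of service names} found in bot replies."""
    services = defaultdict(set)
    for line in log_text.splitlines():
        match = STATUS_LINE.search(line)
        if match:  # lines typed by people (the "status ..." requests) do not match
            services[match.group("host")].add(match.group("service").strip())
    return dict(services)


def ping_only_hosts(log_text):
    """Return the sorted hosts whose only reported service is PING."""
    found = services_by_host(log_text)
    return sorted(host for host, names in found.items() if names == {"PING"})


SAMPLE = """\
22:36 < jabba> nagios-sjc1: status dm-ausstage01:*
22:36 < nagios-sjc1> jabba: dm-ausstage01:avg load is OK: OK - load average: 0.05, 0.04, 0.00
22:36 < nagios-sjc1> jabba: dm-ausstage01:disk - /opt is OK: DISK OK - free space: /opt 49809 MB (69% inode=20%):
22:36 < nagios-sjc1> jabba: dm-ausstage01:PING is OK: PING OK - Packet loss = 0%, RTA = 1.76 ms
22:36 < nagios-sjc1> jabba: dm-ausstage01:root partition is OK: DISK OK - free space: / 18911 MB (52% inode=59%):
22:37 < jabba> nagios-phx1: status dp-ausstage01.phx:*
22:37 < nagios-phx1> jabba: dp-ausstage01.phx:PING is OK: PING OK - Packet loss = 0%, RTA = 0.76 ms
"""
```

Calling `ping_only_hosts(SAMPLE)` on the log from this comment picks out `dp-ausstage01.phx` and leaves `dm-ausstage01` alone, since the latter also has load and disk checks.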
Assignee: arich → jdow
Assignee
Comment 4•14 years ago
I puppetized the host and added basic monitoring to it:
10:55 < jabba> nagios-phx1: status dp-ausstage01.phx:*
10:55 < nagios-phx1> jabba: dp-ausstage01.phx:avg load has not yet been checked.
10:55 < nagios-phx1> jabba: dp-ausstage01.phx:disk - /opt has not yet been checked.
10:55 < nagios-phx1> jabba: dp-ausstage01.phx:PING is OK: PING OK - Packet loss = 0%, RTA = 1.27 ms
10:55 < nagios-phx1> jabba: dp-ausstage01.phx:root partition has not yet been checked.
It's set up the same as dm-ausstage01 (reports to #sysadmins)
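To illustrate what a disk check like the "disk - /opt" service does, below is a minimal sketch in the style of a nagios plugin. It is not the plugin actually deployed on these hosts. The exit codes (0 OK, 1 WARNING, 2 CRITICAL) follow the standard nagios plugin convention; the function name `check_free_space`, the 20%/10% thresholds, and the `usage` parameter are assumptions made for illustration.

```python
import shutil

# Standard nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def check_free_space(path, warn_pct=20.0, crit_pct=10.0, usage=None):
    """Return (exit_code, message) describing free space on `path`.

    warn_pct and crit_pct are hypothetical thresholds on percent free.
    `usage` lets a caller inject an object with .total and .free (in bytes)
    instead of querying the filesystem.
    """
    if usage is None:
        usage = shutil.disk_usage(path)
    free_pct = 100.0 * usage.free / usage.total
    free_mb = usage.free // (1024 * 1024)
    if free_pct < crit_pct:
        code, label = CRITICAL, "CRITICAL"
    elif free_pct < warn_pct:
        code, label = WARNING, "WARNING"
    else:
        code, label = OK, "OK"
    message = f"DISK {label} - free space: {path} {free_mb} MB ({free_pct:.0f}%)"
    return code, message
```

A wrapper script would print the message and exit with the returned code, which is how nagios learns the service state. With a check like this in place for /opt, the out-of-space condition from the original report would have raised an alert rather than only cron mail.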
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Reporter
Comment 5•14 years ago
Thanks!
Updated•12 years ago
Component: Server Operations: RelEng → RelOps
Product: mozilla.org → Infrastructure & Operations