Closed
Bug 1472874
Opened 7 years ago
Closed 7 years ago
monitor c7000s boa and switches in MDC2
Categories
(Infrastructure & Operations :: MOC: Service Requests, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: van, Assigned: ryanc)
please monitor these new hosts
hpswitch1-access.gf135.ops.mdc2.mozilla.net: 10.50.8.17
boa-a1-chassis1.gf135.ops.mdc2.mozilla.com: 10.50.8.32
boa-a2-chassis1.gf135.ops.mdc2.mozilla.com: 10.50.8.33
hpswitch2-access.gf135.ops.mdc2.mozilla.net: 10.50.8.18
boa-a1-chassis2.gf135.ops.mdc2.mozilla.com: 10.50.8.50
boa-a2-chassis2.gf135.ops.mdc2.mozilla.com: 10.50.8.51
Reporter
Updated • 7 years ago
Summary: monitor c7000s in MDC2 → monitor c7000s boa and switches in MDC2
Reporter
Comment 1 • 7 years ago
err, these are not hosts but rather onboard administrators and cisco switches.
Assignee
Updated • 7 years ago
Assignee: nobody → rchilds
Status: NEW → ASSIGNED
Assignee
Comment 2 • 7 years ago
(In reply to Van Le [:van] from comment #0)
> please monitor these new hosts
>
> hpswitch1-access.gf135.ops.mdc2.mozilla.net: 10.50.8.17
> boa-a1-chassis1.gf135.ops.mdc2.mozilla.com: 10.50.8.32
> boa-a2-chassis1.gf135.ops.mdc2.mozilla.com: 10.50.8.33
>
> hpswitch2-access.gf135.ops.mdc2.mozilla.net: 10.50.8.18
> boa-a1-chassis2.gf135.ops.mdc2.mozilla.com: 10.50.8.50
> boa-a2-chassis2.gf135.ops.mdc2.mozilla.com: 10.50.8.51
For the switches, are we only monitoring ping, or is there anything else we should be checking via SNMP, etc.?
Flags: needinfo?(vle)
See Also: → 1470774
Assignee
Comment 3 • 7 years ago
Initial monitoring changes pushed in 9c73a0918d31db5765bece7302c2497cdececa22; will report back.
Reporter
Comment 4 • 7 years ago
these should be monitored the same as our other cisco module switches.
for reference:
switch1.r301-5.ops.scl3.mozilla.net (10.22.8.140)
switch1.r302-1.ops.scl3.mozilla.net (10.22.8.149)
Flags: needinfo?(vle)
Assignee
Comment 5 • 7 years ago
These results are not consistent. Please make sure ACLs are in place and that all BOAs are using the same community string:
05:17:11 <nagios-mdc2> ryanc: [Unknown] boa-a1-chassis1.gf135.ops.mdc2.mozilla.com:HP Blade Chassis is OK - OK - System: 'BladeSystem c7000 Enclosure G2', SN: 'USE951W478', Firmware: '4.22', hardware working fine, 1 blades, 2 i/o modules Last Checked: 2018-07-04 09:08:25 UTC
05:17:11 <nagios-mdc2> ryanc: [Unknown] boa-a1-chassis2.gf135.ops.mdc2.mozilla.com:HP Blade Chassis is CRITICAL - PSU 6 is Failed (generalFailure), input line status: linePowerLoss<br/>Enclosure overall health condition is Degraded Last Checked: 2018-07-04 09:08:31 UTC
05:17:11 <nagios-mdc2> ryanc: [Unknown] boa-a2-chassis1.gf135.ops.mdc2.mozilla.com:HP Blade Chassis is CRITICAL - SNMP CRITICAL: No response from remote host "10.50.8.33" Last Checked: 2018-07-04 09:08:45 UTC
05:17:11 <nagios-mdc2> ryanc: [Unknown] boa-a2-chassis2.gf135.ops.mdc2.mozilla.com:HP Blade Chassis is CRITICAL - SNMP CRITICAL: No response from remote host "10.50.8.51" Last Checked: 2018-07-04 09:16:50 UTC
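To make the inconsistency easier to see, here is a minimal Python sketch that groups hosts from bot lines shaped like the ones above by outcome. The regular expression, the `summarize` helper, and the group labels are assumptions made for illustration; they are not part of the Nagios setup discussed in this bug.

```python
import re
from collections import defaultdict

# Assumed line shape, based only on the bot output quoted above:
#   "... [Unknown] <host>:<service> is <STATE> <detail> Last Checked: <time>"
LINE_RE = re.compile(
    r"\[Unknown\]\s+(?P<host>[\w.-]+):(?P<service>.+?) is "
    r"(?P<state>OK|WARNING|CRITICAL|UNKNOWN)\b"
    r"(?P<detail>.*?)(?:\s+Last Checked:.*)?$"
)


def summarize(lines):
    """Group hostnames by outcome, separating SNMP timeouts from real alerts."""
    groups = defaultdict(list)
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # not a status line we recognise
        if "No response from remote host" in match.group("detail"):
            key = "snmp-no-response"  # hypothetical label for SNMP timeouts
        else:
            key = match.group("state").lower()
        groups[key].append(match.group("host"))
    return dict(groups)


if __name__ == "__main__":
    sample = [
        '05:17:11 <nagios-mdc2> ryanc: [Unknown] '
        'boa-a2-chassis1.gf135.ops.mdc2.mozilla.com:HP Blade Chassis is '
        'CRITICAL - SNMP CRITICAL: No response from remote host "10.50.8.33" '
        'Last Checked: 2018-07-04 09:08:45 UTC',
    ]
    print(summarize(sample))
```

Run against the four lines above, this would put both `boa-a2-*` hosts in the SNMP-timeout group, while the two `boa-a1-*` hosts answer SNMP (one OK, one CRITICAL for the failed PSU).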
Comment 6 • 7 years ago
(IRC) Fri 03:08:32 UTC [9204] [Unknown] boa-a1-chassis2.gf135.ops.mdc2.mozilla.com:HP Agents is CRITICAL: Compaq/HP Agent Check: cpqRackPowerEnclosureCondition (1:degraded) cpqRackPowerSupplyCondition (6:failed) (http://m.mozilla.org/HP+Agents)
(IRC) Fri 03:08:34 UTC [9205] [Unknown] boa-a1-chassis2.gf135.ops.mdc2.mozilla.com:HP Blade Chassis is CRITICAL: PSU 6 is Failed (generalFailure), input line status: linePowerLoss<br/>Enclosure overall health condition is Degraded (http://m.mozilla.org/HP+Blade+Chassis)
(IRC) Fri 03:08:57 UTC [9208] [Unknown] boa-a2-chassis1.gf135.ops.mdc2.mozilla.com:HP Blade Chassis is CRITICAL: SNMP CRITICAL: No response from remote host "10.50.8.33" (http://m.mozilla.org/HP+Blade+Chassis)
Downtimed for 7d while waiting for the fixes requested in comment 5.
Reporter
Comment 7 • 7 years ago
acls are not the issue. since these boards run as an active/standby pair, it appears that only the active boa responds to snmp. please monitor the "boa-a1*" boards via snmp. the secondary (a2) boards can be monitored for ping, since they do respond to that.
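As a rough illustration of the split described above, the following Python sketch maps a hostname to the checks it should receive. The rule table, the `checks_for` helper, and the check labels (`ping`, `snmp-chassis`) are hypothetical names chosen for this example and are not taken from the actual Nagios configuration.

```python
from fnmatch import fnmatchcase

# Hypothetical rule table: hostname glob -> checks to run.
# The a1/a2 split follows the comment above; the check labels are
# illustrative and do not mirror real Nagios service names.
CHECK_RULES = [
    ("boa-a1-*", ["ping", "snmp-chassis"]),  # active board answers SNMP
    ("boa-a2-*", ["ping"]),                  # standby board answers ping only
]


def checks_for(hostname):
    """Return the list of checks for the first rule matching the hostname."""
    for pattern, checks in CHECK_RULES:
        if fnmatchcase(hostname, pattern):
            return list(checks)
    return []  # no rule matched; leave the host unmonitored by this table


if __name__ == "__main__":
    for host in (
        "boa-a1-chassis1.gf135.ops.mdc2.mozilla.com",
        "boa-a2-chassis1.gf135.ops.mdc2.mozilla.com",
    ):
        print(host, checks_for(host))
```

One caveat with a static table like this: it assumes the a1 board stays active. If the pair fails over, the SNMP checks would presumably need to follow the newly active board.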
Reporter
Comment 8 • 7 years ago
>05:17:11 <nagios-mdc2> ryanc: [Unknown] boa-a1-chassis2.gf135.ops.mdc2.mozilla.com:HP Blade Chassis is CRITICAL - PSU 6 is Failed (generalFailure), input line status: linePowerLoss<br/>Enclosure overall health condition is Degraded Last Checked: 2018-07-04 09:08:31 UTC
just an fyi, i have opened bug 1472871 for the bad PSU.
Comment 9 • 7 years ago
Monitoring changes made in commit 59b102d6183fceb02fb55ce5c62a5781c691889b.
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED