Closed Bug 527699 Opened 16 years ago Closed 16 years ago

investigate bm-xserve12 interface flaps

Categories

(mozilla.org Graveyard :: Server Operations, task)

All
Other
task
Not set
minor

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mrz, Assigned: dmoore)

switch port keeps flapping. Either bad hardware or the server keeps rebooting.

08:30 <@nagios> [07] dm-nagios01:SNMP Alerting is WARNING: core1 - 117 GigabitEthernet4/15 ethernetCsmacd Lost Carrier
08:30 <@nagios> [08] dm-nagios01:SNMP Alerting is WARNING: core1 - 117 GigabitEthernet4/15 ethernetCsmacd up
08:31 <@nagios> [09] dm-nagios01:SNMP Alerting is WARNING: core1 - 117 GigabitEthernet4/15 ethernetCsmacd Lost Carrier
08:31 <@nagios> [10] dm-nagios01:SNMP Alerting is WARNING: core1 - 117 GigabitEthernet4/15 ethernetCsmacd up

interface GigabitEthernet4/15
 description bm-xserve12:103.02.9B
 switchport
 switchport access vlan 71
 no ip address
 spanning-tree portfast
punting - does this host reboot often?
Assignee: server-ops → nobody
Component: Server Operations → Release Engineering
QA Contact: mrz → release
bm-xserve12:log cltbld$ grep BOOT_TIME /var/log/system.log
Nov 10 02:33:57 bm-xserve12 bootlog[108]: BOOT_TIME: 1257849215 0
Nov 10 03:56:36 bm-xserve12 bootlog[107]: BOOT_TIME: 1257854174 0
Nov 10 04:51:33 bm-xserve12 bootlog[108]: BOOT_TIME: 1257857470 0
Nov 10 08:31:02 bm-xserve12 bootlog[108]: BOOT_TIME: 1257870639 0
Nov 10 09:29:13 bm-xserve12 bootlog[107]: BOOT_TIME: 1257874131 0
Nov 10 09:48:19 bm-xserve12 bootlog[108]: BOOT_TIME: 1257875276 0
Nov 10 10:20:05 bm-xserve12 bootlog[107]: BOOT_TIME: 1257877183 0
Nov 10 11:01:11 bm-xserve12 bootlog[107]: BOOT_TIME: 1257879649 0
Nov 10 11:25:54 bm-xserve12 bootlog[106]: BOOT_TIME: 1257881130 0
Nov 10 12:37:31 bm-xserve12 bootlog[107]: BOOT_TIME: 1257885429 0
Nov 10 13:08:46 bm-xserve12 bootlog[108]: BOOT_TIME: 1257887303 0

Like the majority of our talos and build machines, this one reboots after every build/test job. Have we tried another switch port?
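For anyone wanting to check whether a given flap lines up with a reboot, here is a rough Python sketch that decodes the BOOT_TIME epoch values from the log above and looks for boots near a flap time. The UTC-8 offset is an assumption inferred from the syslog timestamps, it also assumes the Nagios alert times are in that same local time, and the helper names (boot_times, boots_near) are made up for illustration.

```python
import re
from datetime import datetime, timedelta, timezone

# Three of the BOOT_TIME lines quoted in the comment above, copied verbatim.
LOG = """\
Nov 10 04:51:33 bm-xserve12 bootlog[108]: BOOT_TIME: 1257857470 0
Nov 10 08:31:02 bm-xserve12 bootlog[108]: BOOT_TIME: 1257870639 0
Nov 10 09:29:13 bm-xserve12 bootlog[107]: BOOT_TIME: 1257874131 0
"""

# Assumption: the host logs in UTC-8 (inferred from the syslog timestamps).
LOCAL = timezone(timedelta(hours=-8))
BOOT_RE = re.compile(r"BOOT_TIME: (\d+)")


def boot_times(log_text):
    """Return each BOOT_TIME epoch value as a timezone-aware datetime."""
    return [
        datetime.fromtimestamp(int(m.group(1)), tz=LOCAL)
        for m in BOOT_RE.finditer(log_text)
    ]


def boots_near(boots, flap, window=timedelta(minutes=2)):
    """Return the boots that fall within `window` of a flap timestamp."""
    return [b for b in boots if abs(b - flap) <= window]


boots = boot_times(LOG)
# The first "Lost Carrier" alert in the report was at 08:30; the year comes
# from the epoch values. Assumes Nagios reports in the same local time.
flap = datetime(2009, 11, 10, 8, 30, tzinfo=LOCAL)
for b in boots_near(boots, flap):
    print(b.strftime("%H:%M:%S"))
```

Under those assumptions the 08:30 flap sits within a minute of one of the logged boots, which is consistent with the reboot-after-every-job explanation.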
Another switch port wouldn't change it - if it reboots it'll drop the interface and the switch will send a trap. It's noise. Probably need a way to exclude certain interfaces from sending traps. Nothing else on your side, reassign to server-ops?
Why would bm-xserve09 and bm-xserve12 hit this issue but not the many other "boxes" that are rebooting all the time?
Assignee: nobody → server-ops
Component: Release Engineering → Server Operations
QA Contact: release → mrz
The HP switches don't send traps (they aren't configured to). Anything connected to the Cisco 6509s does.
Assignee: server-ops → dmoore
I'm investigating how to fix this on IT's side, by filtering the traps at either the switch or Nagios level.
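As an illustration of what filtering on the Nagios side could look like, here is a minimal Python sketch that drops link-state alerts for interfaces on an exclusion list. It is not the fix that was actually applied, which this bug does not describe. The regex is fitted only to the alert format quoted in the report, and EXCLUDED, parse_alert and should_suppress are hypothetical names.

```python
import re

# Matches the alert text quoted in this bug, e.g.
# "... WARNING: core1 - 117 GigabitEthernet4/15 ethernetCsmacd Lost Carrier"
ALERT_RE = re.compile(r"WARNING: (\S+) - \d+ (\S+) ethernetCsmacd (.+)$")

# Hypothetical exclusion list: (switch, interface) pairs whose hosts reboot
# by design, so their link-state changes are treated as noise.
EXCLUDED = {("core1", "GigabitEthernet4/15")}


def parse_alert(line):
    """Return (switch, interface, state) or None if the line does not match."""
    m = ALERT_RE.search(line)
    return m.groups() if m else None


def should_suppress(line, excluded=EXCLUDED):
    """True when the alert is a link-state change on an excluded interface."""
    parsed = parse_alert(line)
    if parsed is None:
        return False  # unknown format: never hide it
    switch, interface, _state = parsed
    return (switch, interface) in excluded


alert = ("08:30 <@nagios> [07] dm-nagios01:SNMP Alerting is WARNING: "
         "core1 - 117 GigabitEthernet4/15 ethernetCsmacd Lost Carrier")
print(should_suppress(alert))
```

Unrecognised lines are passed through rather than suppressed, so a change in alert format produces more noise instead of silently hiding real problems.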
Resolved in software by justdave.
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Product: mozilla.org → mozilla.org Graveyard