Closed Bug 528186 Opened 15 years ago Closed 15 years ago

primary FWSM failed, secondary failed to complete failover transition

Categories

(mozilla.org Graveyard :: Server Operations: Projects, task)

x86
macOS
task
Not set
critical

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: bhearsum, Assigned: dmoore)

Details

Nagios is going off about a bunch of things, I can't connect to mpt-vpn in any way, and we're starting to get alarms about build machines.
Things appear to be back now; lowering severity.
Severity: blocker → critical
Assigning to dmoore for investigation. I have already emailed him some output.

Short story:
The Primary FWSM failed, and the Standby did not finish the takeover until I rebooted the Primary (core1).

Looks like the outage lasted about 13 minutes. For RelEng, it would have affected any inter-VLAN traffic (Vlan90 to Vlan71).
Assignee: server-ops → dmoore
Summary: mpt-vpn, bm-vmware03, other things appear dead → primary FWSM failed, secondary failed to complete failover transition
This was a failover failure. The failover sync process (doorbell_poll) crashed on the Active module, so that module began sending incomplete health messages. As a result, the two modules could not successfully negotiate a failover, and both were left in an intermediate state.

There is no pending fix from Cisco.
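
To illustrate the failure mode described above, here is a toy model of the Standby side only. It is not the FWSM's actual logic: the `Health` message, its fields, the `State` names, and the transition rule in `next_state` are all assumptions made up for this sketch. The point is just that a peer that is still talking but sending incomplete health messages can leave the Standby stuck mid-takeover, while a peer that goes silent (for example, after a reboot) lets the takeover finish.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class State(Enum):
    ACTIVE = "active"
    STANDBY = "standby"
    NEGOTIATING = "negotiating"  # intermediate: takeover started but not finished


@dataclass
class Health:
    """Hypothetical health message; None marks a field the sender left out."""
    unit: str
    state: Optional[State]
    sync_ok: Optional[bool]

    def complete(self) -> bool:
        return self.state is not None and self.sync_ok is not None


def next_state(peer_msg: Optional[Health]) -> State:
    """Assumed rule: the Standby picks its state from the peer's latest message."""
    if peer_msg is None:
        # Peer is silent (e.g. rebooting): the takeover can complete.
        return State.ACTIVE
    if not peer_msg.complete():
        # Peer is alive but its report is unusable: the handover cannot finish.
        return State.NEGOTIATING
    if peer_msg.state is State.ACTIVE and peer_msg.sync_ok:
        # Peer is a healthy active unit: remain on standby.
        return State.STANDBY
    # Peer reports that it is not a healthy active unit: take over.
    return State.ACTIVE


def simulate(messages: List[Optional[Health]]) -> List[State]:
    """Return the Standby's state after each message from the Primary."""
    return [next_state(m) for m in messages]


timeline = [
    Health("core1", State.ACTIVE, True),  # normal operation
    Health("core1", State.ACTIVE, None),  # sync process crashed: incomplete message
    Health("core1", None, None),          # still incomplete
    None,                                 # Primary rebooted and went silent
]
states = simulate(timeline)
print([s.value for s in states])
```

In this sketch the Standby stays in `NEGOTIATING` for as long as incomplete messages keep arriving, and only becomes `ACTIVE` once the Primary stops sending, which mirrors the recovery after the reboot of core1.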
Component: Server Operations → Server Operations: Projects
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Product: mozilla.org → mozilla.org Graveyard