Closed
Bug 1078504
Opened 11 years ago
Closed 11 years ago
Network issues involving PHX1
Categories
(Infrastructure & Operations :: MOC: Problems, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: achavez, Unassigned)
References
Details
No description provided.
Comment 1•11 years ago
Tracking network issues in the PHX1 datacenter. Netops is already investigating.
Comment 2•11 years ago
The main problem appears to have been core1.phx1; please see bug 1078077 for more information about the
changes we made there.
We appear to be under a frag attack.
This could be what affected core1.phx1.
I have updated the firewall filters on our border routers
to protect them from this attack.
Now trying to protect the rest of our network equipment.
For people following the bug and not #moc, we're having more issues with core1.phx1.
Comment 4•11 years ago
PHX1 network outage is still ongoing. The network operations team is investigating a hardware failure with one of the core network switches and determining the best course for recovery.
Comment 8•11 years ago
All core switches have had protections added.
We've worked around the core1.phx1 failure.
Hopefully things are better now.
Status: NEW → ASSIGNED
Comment 9•11 years ago
Status as of 10/7 9am EST:
The network is now stable.
The root cause of our problems appears to have been an NTP-based attack against a
number of our core switches.
The firewall filters on the core switches have been updated to protect them from this issue.
However, core1.phx1 also failed during this outage: either a hardware failure brought
the system down, or the NTP attack caused enough problems to crash it.
Then, when core1.phx1 tried to reboot, other hardware problems were exposed (apparently a
bad flash drive), so the operating system could not be loaded.
Replacement hardware is being shipped to us.
I will update this bug once there are further developments.
Comment 10•11 years ago
Changes made for the workaround:
core1.phx1:
  disabled interface vlan unit 5
core2.phx1:
  deactivated vrrp on vlan unit 5
  changed the IP address on the vlan interface from 63.245.217.253/24 to 63.245.217.1/24
  added an l3-interface (vlan.5) to the "vips" vlan
fw1.phx1:
  James changed the priorities and interface watching configuration
These configuration changes will be undone at the appropriate time, after core1.phx1 is
repaired.
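As a quick sanity check on the core2.phx1 address change, the sketch below confirms that the old and new interface addresses sit in the same /24, so hosts on that VLAN would not need a new netmask or route. The two addresses are taken from this comment; the helper name `same_subnet` is hypothetical, and the idea that 63.245.217.1 is the gateway address previously served by VRRP is an assumption, not something stated in this bug.

```python
import ipaddress

# Addresses quoted in the workaround for core2.phx1 (vlan unit 5).
old_iface = ipaddress.ip_interface("63.245.217.253/24")
new_iface = ipaddress.ip_interface("63.245.217.1/24")


def same_subnet(a, b):
    """Return True when both interface addresses belong to the same network."""
    return a.network == b.network


# Both addresses should resolve to the same /24 network.
print(old_iface.network)
print(same_subnet(old_iface, new_iface))

# Assumption (not confirmed in the bug): .1 was the shared VRRP gateway address,
# so assigning it directly to core2.phx1 keeps the gateway reachable for hosts.
print(new_iface.ip in old_iface.network)
```

Because the network is unchanged, only the address answering on the VLAN differs, which is consistent with the change being easy to undo once core1.phx1 is repaired.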
Updated•11 years ago
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED