Closed Bug 720118 Opened 13 years ago Closed 13 years ago

Recover from network outage in SJC1

Categories

(Release Engineering :: General, defect, P1)

x86
All
defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: nthomas, Assigned: nthomas)

References

Details

A firewall died in SJC1 (bug 720105), leaving our infra in a parlous state with 'open' network connections which were actually black holes.
* scheduler masters: restarted
* buildbot-master08: master restarted, update_from_files killed, pulse publisher restarted
* buildbot-master07/09: pulse publisher restarted
* buildbot-master04: master restarted
* buildbot-master19: master restarted, update_from_files killed
* buildbot-master20: master restarted
* by now the connections were expiring by themselves and the systems recovered
Thanks to bear for his help. Deps filed for the IT systems that are out of reach.
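The 'open' connections that were really black holes are the usual half-open TCP case: the path dies without either end seeing a FIN or RST, so an idle socket stays open until something times out. One common mitigation is TCP keepalive, which has the kernel probe idle connections and drop them when the peer stops answering. Nothing in this bug says keepalive was or was not enabled on these services. The following is only a minimal Python sketch; the function name `enable_keepalive` and the timing values are illustrative assumptions, not taken from the buildbot or pulse code.

```python
import socket


def enable_keepalive(sock, idle_s=60, interval_s=10, probes=5):
    """Turn on TCP keepalive so a dead peer is eventually detected.

    idle_s, interval_s and probes are illustrative defaults, not values
    taken from any real deployment.
    """
    # SO_KEEPALIVE is available on all mainstream platforms.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # The tuning options are platform-specific (present on Linux, absent
    # on some other systems), so set only the ones this build exposes.
    tuning = (
        ("TCP_KEEPIDLE", idle_s),       # seconds idle before the first probe
        ("TCP_KEEPINTVL", interval_s),  # seconds between probes
        ("TCP_KEEPCNT", probes),        # failed probes before giving up
    )
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock
```

With settings like these, an idle connection through a dead firewall would be torn down after roughly idle_s + interval_s * probes seconds on platforms that honour all three options, rather than lingering until a restart. Keepalive only covers idle sockets; a connection with unacknowledged data in flight is governed by the retransmission timeout instead.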
Forgot to mention that foopy21 took a holiday too, bug 720113. Unclear if that's related.
Nothing more I can do here. Please reopen if further issues arise.
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Priority: P2 → P1
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering