Closed Bug 831088 Opened 11 years ago Closed 11 years ago

Complete network testing of scl3 HCI

Categories

(Infrastructure & Operations Graveyard :: NetOps, task)

x86
macOS
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dparsons, Assigned: cransom)

:ravi has mentioned that netops needs to finish network testing in scl3 HCI before we bring it to production status. When can this testing be completed and the stamp of "netops production ready" be applied?
Good question. It'll be a topic at the next meeting (Monday).
Assignee: network-operations → cransom
Status: NEW → ASSIGNED
Just to fill in the history: there was no meeting because of the holiday. I talked with Ravi on Friday to run through some test cases.

Things corrected thus far:
source-nat interface removed; we are now sourcing from a dedicated IP
graceful-switchover enabled for core1 to minimize data plane disruption during RE failover


I'm currently debugging BGP+BFD, which will enable faster BGP failover. In its current state it is just making BGP unstable, likely due to a Juniper bug.
The BGP+BFD instability was definitely due to a bug in 11.4r5.5, and in testing I found another one regarding LACP links. Both are fixed in 11.4r6.5, which was rolled out flawlessly on the firewalls with an in-service upgrade. I'll be upgrading core1 in the morning, as it doesn't support in-service upgrades yet and requires both switches to reboot simultaneously. As soon as that is complete and I have verified failover there, this environment will be signed off on completely.
:casey, when you say both switches need to reboot simultaneously, are you talking about the 10Gb switches that the scl3 HCI ESX hosts and NetApp connect to? If so, I will need to shut down the VMs that are running. They're not in production yet, but that doesn't mean we want to risk corrupting their filesystems.
:lerxst and I squared this away tonight. Other than miscellaneous small things like flows (if needed), there should be no major changes to this environment. Leaving this open until Friday to make sure the switches and firewalls are stable on the new code.
I also took system snapshots post-upgrade.
I believe this serves as notice that corp.scl3 has the netops seal of approval. All provisioning work is complete and the environment is stable.
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Product: mozilla.org → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard