e.g.:
https://tbpl.mozilla.org/?rev=8803e8c0ee3e
https://tbpl.mozilla.org/?rev=c0e6e76aafed
https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=124fc1fcd3eb

All main non-try trees closed.
Looks like one of the VPN tunnels to us-east-1 went down at 10:27ET. Failed over to the other tunnel.
Tunnel 1  22.214.171.124  UP    2013-10-19 05:21 EDT  1 BGP ROUTES
Tunnel 2  126.96.36.199   DOWN  2013-10-23 10:27 EDT  IPSEC IS UP
(In reply to Chris AtLee [:catlee] from comment #1)
> Looks like one of the VPN tunnels to us-east-1 went down at 10:27ET. Failed
> over to the other tunnel.

Once we're out of the woods, where do you get to see that failure? I don't see anything on #buildduty. Thanks!
Tunnel 2 looks like it's back up.

Tunnel 1  188.8.131.52    UP  2013-10-19 05:21 EDT  1 BGP ROUTES
Tunnel 2  184.108.40.206  UP  2013-10-23 11:08 EDT  1 BGP ROUTES

Armen, this is from Amazon's VPC dashboard.
@timestamp,@source_host,@message
2013-10-23T14:27:53.000Z,fw1.releng.console.scl3.mozilla.net,%-RPD_BGP_NEIGHBOR_STATE_CHANGED: BGP peer 169.254.255.77 (External AS 7224) changed state from Established to Idle (event HoldTime)

This is the only important event that happened on our infrastructure between 14:20 and 14:40 UTC. The tunnels to AWS flap regularly (a lot more frequently than the IPsec tunnels to any of our offices, for example).
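For anyone cross-checking the firewall log against the VPC dashboard times, here is a rough Python sketch that parses the log row above and converts its UTC timestamp to Eastern time. It assumes EDT can be treated as a fixed UTC-4 offset (true for October 2013); the names `LOG`, `EDT` and `to_edt` are made up for illustration.

```python
import csv
import io
from datetime import datetime, timedelta, timezone

# Assumption: EDT modelled as a fixed UTC-4 offset (valid for October 2013).
EDT = timezone(timedelta(hours=-4), "EDT")

# The log export quoted above, header row included.
LOG = """@timestamp,@source_host,@message
2013-10-23T14:27:53.000Z,fw1.releng.console.scl3.mozilla.net,%-RPD_BGP_NEIGHBOR_STATE_CHANGED: BGP peer 169.254.255.77 (External AS 7224) changed state from Established to Idle (event HoldTime)
"""

def to_edt(stamp):
    """Parse an ISO-style UTC timestamp ending in 'Z' and convert it to EDT."""
    utc = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%fZ").replace(tzinfo=timezone.utc)
    return utc.astimezone(EDT)

rows = list(csv.DictReader(io.StringIO(LOG)))
local = to_edt(rows[0]["@timestamp"])
print(local.strftime("%Y-%m-%d %H:%M %Z"))  # 2013-10-23 10:27 EDT
```

The converted time matches the 10:27 EDT that the dashboard reported for Tunnel 2 going down, so the BGP HoldTime expiry and the tunnel drop appear to be the same event.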
Disconnects haven't occurred since, and the other tree carnage seems to be under control; reopening. Do we have a bug open for increasing the resilience of these VPN tunnels? (I forget)
Severity: blocker → critical
(In reply to Ed Morley [:edmorley UTC+1] from comment #6)
> Do we have a bug open for increasing the resilience of these VPN tunnels? (I
> forget)

From what we can identify so far, nothing more can be done on our side. We did open a new case with Amazon on the issue, and many more details are in Bug 918677. I'm going to close this one out in favor of the investigation going on in there.
Status: NEW → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 918677
sgtm, thank you :-)