Closed
Bug 1079885
Opened 11 years ago
Closed 11 years ago
fw1.tier2.yvr1 is flapping
Categories
(Infrastructure & Operations Graveyard :: NetOps: Office Carrier, task)
x86
macOS
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: Usul, Assigned: dcurado)
References
Details
No description provided.
Comment 1 • 11 years ago (Assignee)
working on this
Assignee: network-operations → dcurado
Status: NEW → ASSIGNED
Comment 2 • 11 years ago (Assignee)
oh yeah, it's terago -- the wireless ISP... (sad clown face)
Customer Name: MZ Canada
Location Code: MZCA2
Customer Experience Coordinator at: 1-866-TeraGo2 (1-866-837-2462)
Last time this happened I called, they "reset their radio", and that fixed it.
I opened ticket 154747.
They asked for a log of all the up/down events so they could look at the
time stamps, because they don't see any issues on their side.
I have done a cut and paste from IRC, and will edit out all the other stuff,
and will mail it to them, noc@terago.ca, with the ticket number in the subject line.
They said they will call back within 2 hours. I'm going to hold them to that.
Comment 3 • 11 years ago (Assignee)
Terago did in fact call me back, and was able to show that their entire network is not
lossy. So
a) I pointed the finger of blame at them without really looking at everything
b) there is a problem between SCL3 and Terago somewhere
I'll see if I can figure out where.
Comment 4 • 11 years ago (Assignee)
This is starting to look like a problem within Global-Crossing, aka Level3.
(So I think James may have this correct already...)
I ran a traceroute from fw1.yvr1.mozilla.net toward admin1.scl3.mozilla.com,
using a source IP of fw1.tier2.mozilla.net, and got this:
dcurado@fw1.ops.yvr1.mozilla.net> traceroute 63.245.214.130 source 68.179.67.73
traceroute to 63.245.214.130 (63.245.214.130) from 68.179.67.73, 30 hops max, 40 byte packets
1 64.213.70.193 (64.213.70.193) 61.652 ms 61.146 ms 65.845 ms
2 159.63.48.157 (159.63.48.157) 62.187 ms 63.063 ms 72.150 ms
3 * * *
4 * * 208.178.58.82 (208.178.58.82) 66.437 ms
5 213.155.131.77 (213.155.131.77) 63.486 ms 69.704 ms 68.693 ms
6 62.115.138.195 (62.115.138.195) 66.892 ms 213.155.135.187 (213.155.135.187) 69.347 ms 213.155.134.103 (213.155.134.103) 66.842 ms
7 62.115.8.162 (62.115.8.162) 65.816 ms 66.259 ms 87.889 ms
8 63.245.219.162 (63.245.219.162) 73.442 ms 66.636 ms 64.511 ms
9 63.245.214.45 (63.245.214.45) 76.333 ms 67.825 ms 79.370 ms
by hop 9, we're inside our network.
The problem appears to start at hop 3, but who is that?
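One way to check where the timeouts begin is to count the `*` probes per hop. Below is a minimal Python sketch, assuming the hop lines are pasted in as text; `probe_loss` and `first_lossy_hop` are hypothetical helper names, not part of any tool. Keep in mind that a `*` only means no reply came back for that probe. Routers that rate-limit or ignore ICMP produce the same output, so this flags where to look rather than proving loss.

```python
import re

# Hops 1-5 copied from the firewall traceroute above.
TRACE = """\
1 64.213.70.193 (64.213.70.193) 61.652 ms 61.146 ms 65.845 ms
2 159.63.48.157 (159.63.48.157) 62.187 ms 63.063 ms 72.150 ms
3 * * *
4 * * 208.178.58.82 (208.178.58.82) 66.437 ms
5 213.155.131.77 (213.155.131.77) 63.486 ms 69.704 ms 68.693 ms
"""


def probe_loss(trace_text):
    """Return {hop_number: (unanswered_probes, total_probes)}."""
    result = {}
    for line in trace_text.splitlines():
        match = re.match(r"\s*(\d+)\s+(.*)", line)
        if not match:
            continue  # skip the header line or blank lines
        hop, rest = int(match.group(1)), match.group(2)
        # A standalone "*" marks a probe that got no reply.
        unanswered = len(re.findall(r"(?<!\S)\*(?!\S)", rest))
        # Every "<number> ms" is one answered probe.
        answered = len(re.findall(r"\d+(?:\.\d+)?\s+ms\b", rest))
        result[hop] = (unanswered, unanswered + answered)
    return result


def first_lossy_hop(trace_text):
    """Return the lowest hop number with an unanswered probe, or None."""
    for hop, (unanswered, _total) in sorted(probe_loss(trace_text).items()):
        if unanswered:
            return hop
    return None


if __name__ == "__main__":
    for hop, (unanswered, total) in sorted(probe_loss(TRACE).items()):
        print(f"hop {hop}: {unanswered}/{total} probes unanswered")
    print("first hop with timeouts:", first_lossy_hop(TRACE))
```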
So I did a traceroute from admin1.yvr1.mozilla.com to admin1.scl3.mozilla.com, and
got this:
traceroute to admin1.scl3.mozilla.com (63.245.214.130), 30 hops max, 60 byte packets
1 fw1.private.yvr1.mozilla.net (10.244.75.1) 1.232 ms 0.833 ms 1.074 ms
2 64.213.70.193 (64.213.70.193) 2.100 ms 3.444 ms 2.986 ms
3 so-2-0-0.ar2.sea1.gblx.net (159.63.48.157) 30.971 ms 31.149 ms 31.001 ms
4 ae5-120G.ar7.DAL2.gblx.net (67.16.166.41) 53.327 ms ae6-120G.ar7.DAL2.gblx.net (67.16.166.49) 47.071 ms 51.629 ms
5 telia-2.csr1.DAL2.gblx.net (208.178.58.82) 53.610 ms 53.247 ms 53.644 ms
6 las-b21-link.telia.net (213.248.80.13) 82.937 ms 83.167 ms 82.871 ms
7 sjo-bb1-link.telia.net (62.115.138.191) 81.045 ms sjo-bb1-link.telia.net (62.115.138.193) 80.684 ms sjo-bb1-link.telia.net (62.115.138.191) 80.556 ms
8 mozilla-ic-155747-sjo-bb1.c.telia.net (62.115.8.162) 76.872 ms 80.452 ms 79.591 ms
9 xe-0-0-1.border2.scl3.mozilla.net (63.245.219.162) 120.943 ms 106.340 ms 101.749 ms
10 v-1127.core2.scl3.mozilla.net (63.245.214.45) 82.089 ms 81.972 ms 82.263 ms
So the issue appears to be the return path from Global Crossing back to Terago.
Of course, we don't actually know that path. We'd have to be sitting inside
of Level3, tracerouting back toward fw1.tier2.yvr1.mozilla.net.
Let me see if I can find a looking glass within their network that allows pings to go out.
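The firewall trace shows bare IPs, while the trace from admin1.yvr1 resolved names for several of the same addresses. A rough Python sketch for lining the two up follows, assuming both outputs are available as text; `names_by_ip` is a hypothetical helper, and only a subset of the hop lines above is pasted in to keep it short.

```python
import re

# Selected hop lines copied from the admin1.yvr1 traceroute above.
NAMED_TRACE = """\
2 64.213.70.193 (64.213.70.193) 2.100 ms 3.444 ms 2.986 ms
3 so-2-0-0.ar2.sea1.gblx.net (159.63.48.157) 30.971 ms 31.149 ms 31.001 ms
5 telia-2.csr1.DAL2.gblx.net (208.178.58.82) 53.610 ms 53.247 ms 53.644 ms
8 mozilla-ic-155747-sjo-bb1.c.telia.net (62.115.8.162) 76.872 ms 80.452 ms 79.591 ms
9 xe-0-0-1.border2.scl3.mozilla.net (63.245.219.162) 120.943 ms 106.340 ms 101.749 ms
10 v-1127.core2.scl3.mozilla.net (63.245.214.45) 82.089 ms 81.972 ms 82.263 ms
"""

# Responding IPs from the firewall traceroute above, hops 1, 2, 4, 5, 7, 8, 9.
FIREWALL_TRACE_IPS = [
    "64.213.70.193",
    "159.63.48.157",
    "208.178.58.82",
    "213.155.131.77",
    "62.115.8.162",
    "63.245.219.162",
    "63.245.214.45",
]


def names_by_ip(trace_text):
    """Map each IP to the hostname traceroute printed for it, if any."""
    names = {}
    pattern = r"(\S+) \((\d+\.\d+\.\d+\.\d+)\)"
    for name, ip in re.findall(pattern, trace_text):
        if name != ip:  # traceroute repeats the IP when there is no name
            names[ip] = name
    return names


if __name__ == "__main__":
    names = names_by_ip(NAMED_TRACE)
    for ip in FIREWALL_TRACE_IPS:
        print(ip, names.get(ip, "(no name in the second trace)"))
```

Run against these lines, the last answering hop before the timeouts (159.63.48.157) maps to a gblx.net router in sea1, and the first hop to answer afterwards (208.178.58.82) to a gblx.net router in DAL2, which fits the silent hop sitting inside Global Crossing.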
Comment 5 • 11 years ago (Assignee)
We switched the Internet access for YVR1 over to our backup link there.
Traffic to the US west coast from YVR no longer goes through Level3, but... there is
still loss, now inside of above.net.
My guess (and it's just that, a guess) is that there is a fiber cut in the US NW that
is impacting multiple providers.
=-(
Comment 6 • 11 years ago (Assignee)
We still saw packet loss today, going through our backup provider.
However, after switching back to our primary provider, the problems appear to be resolved.
I am closing this as resolved. Please re-open if this bug should not be put to bed yet.
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Updated • 3 years ago
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard