Closed Bug 1040297 Opened 6 years ago Closed 5 years ago

Connectivity issues from Verizon FiOS to Mozilla and other sites

Categories

(Infrastructure & Operations :: NetOps, task, major)

RESOLVED FIXED

People

(Reporter: jesup, Assigned: arzhel)

Attachments

(1 file)

ryanvm/mreavy/asuth/myself have investigated the FiOS problems getting to v.mozilla.com and other mozilla addresses. It appears to be a peering issue at the alter.net/teliasonera-gw.customer.alter.net (152.179.50.234) gateway (alter.net is verizon's backhaul unit). Can you apply pressure to telia or mozilla's ISP?

Using mtr, I've seen loss as high as 70% at that gateway. Downloading from box.com (also in San Jose), I see 1/2 second of traffic, then box stops seeing my ACKs (it just periodically retransmits, per Wireshark). After transferring 300-500KB of a 10MB Vidyo download from box.com, the download fails after a few minutes.

All of us are in suburban philly on FiOS 75/35.
From Yammer:

cc'ing Randell Jesup who also reported the problem and probably uses vidyo more than me.

On 07/17/2014 02:36 PM, James Barnell wrote:
> 1.  Where are you connecting to/from?  (Home/Office)

Home.

> 2.  If this is in your home what is your local ISP?

Verizon FIOS, 70 megabits down, 35 megabits up, all computers connected via ethernet to a GigE switch to my (GigE) router (DD-WRT), the router is connected to the FIOS box via ethernet too (not coax).

I have enabled responding to pings on my router's WAN interface. You should be able to ping me, if it's helpful, at pool-173-49-137-94.phlapa.fios.verizon.net

> 3.  Are you connected to any type of VPN while you are experiencing these problems?

No.

> 4.  What operating system do you run on your computer?

I use a Windows 7x64 laptop for vidyo after the linux approach proved too flakey (USB sound subsystem seemed to fall over).  I definitely experienced a hard call drop earlier this morning where none of the other participants were affected, and have been experiencing some flakeyness in terms of connecting to calls over the past few days, although some of that may be related to the machine coming out of suspend and being slow/dumb about re-establishing its DHCP lease, etc.  Specifically, the contact list has been laggy in updating / doesn't want to let me join calls.

I've also been seeing slow downloads/connections from git.mozilla.org and ftp.mozilla.org (of nightlies so it doesn't get bouncered-bounced).  The former I tracerouted and definitely was also in an SCL data-center, didn't look into the latter so much.  In some cases just restarting the ftp.mozilla.org download made things go faster.


> 5.  Could you please send me a traceroute from your workstation to v.mozilla.com
> 6.  If you happen to have an application called mtr the output from that would be terrific as it will show where we may have loss.

$ mtr -w -c 100 v.mozilla.com
Start: Thu Jul 17 14:42:21 2014
HOST: icona                                   Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- 192.168.1.1                              0.0%   100    0.2   0.2   0.1   0.3   0.0
  2.|-- L100.PHLAPA-VFTTP-82.verizon-gni.net     0.0%   100    4.5   9.1   2.7 283.0  29.9
  3.|-- G0-10-2-0.PHLAPA-LCR-21.verizon-gni.net  0.0%   100    7.5   6.8   2.6  15.1   2.1
  4.|-- ae13-0.PHIL-BB-RTR1.verizon-gni.net      0.0%   100    3.3  18.5   2.7 133.0  24.4
  5.|-- 0.xe-3-0-1.XL3.IAD8.ALTER.NET            0.0%   100   10.9  15.9   7.9  68.3  14.3
  6.|-- TenGigE0-4-1-0.GW1.IAD8.ALTER.NET        0.0%   100   11.3  14.1  10.3  30.7   4.4
  7.|-- teliasonera-gw.customer.alter.net       85.0%   100   12.5  11.6  10.6  12.8   0.4
  8.|-- ash-bb3-link.telia.net                   0.0%   100   12.2  12.5   8.0  63.4   6.8
  9.|-- sjo-bb1-link.telia.net                   0.0%   100   72.8  75.9  72.8  95.7   2.8
 10.|-- mozilla-ic-155747-sjo-bb1.c.telia.net    0.0%   100   82.4  82.2  78.1 125.2   5.4
 11.|-- xe-0-0-1.border2.scl3.mozilla.net        0.0%   100   88.7  89.5  87.0 107.4   2.8
 12.|-- v-1027.core1.scl3.mozilla.net            0.0%   100   81.7  78.2  75.1  82.9   1.8
 13.|-- v.mozilla.com                            1.0%   100   88.5  88.9  87.6  90.3   0.4

$ mtr -w -c 300 v.mozilla.com
Start: Thu Jul 17 14:46:27 2014
HOST: icona                                   Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- 192.168.1.1                              0.0%   300    0.3   0.2   0.1   0.3   0.0
  2.|-- L100.PHLAPA-VFTTP-82.verizon-gni.net     0.0%   300    4.7   9.7   2.7  87.2  15.0
  3.|-- G0-10-2-0.PHLAPA-LCR-21.verizon-gni.net  0.0%   300   10.1   6.9   2.8  17.6   2.0
  4.|-- ae13-0.PHIL-BB-RTR1.verizon-gni.net      0.0%   300    3.2  16.7   2.8  86.0  19.9
  5.|-- 0.xe-3-0-1.XL3.IAD8.ALTER.NET            0.0%   300    8.6  14.3   7.8  68.8  10.1
  6.|-- TenGigE0-6-0-0.GW1.IAD8.ALTER.NET        0.0%   300   11.6  14.3  10.3  34.0   4.3
  7.|-- teliasonera-gw.customer.alter.net       45.3%   300   11.8  12.3  10.3  68.4   5.6
  8.|-- ash-bb3-link.telia.net                   0.0%   300   12.2  12.5   7.8  62.9   5.8
  9.|-- sjo-bb1-link.telia.net                   0.0%   300   75.2  75.6  72.8  88.8   1.6
 10.|-- mozilla-ic-155747-sjo-bb1.c.telia.net    0.0%   300   80.7  81.1  77.8 105.2   2.8
 11.|-- xe-0-0-1.border2.scl3.mozilla.net        0.0%   300  101.2  94.8  86.8 138.1   8.4
 12.|-- v-1027.core1.scl3.mozilla.net            0.0%   300   76.7  78.2  75.2  82.7   1.8
 13.|-- v.mozilla.com                            0.3%   300   88.9  89.0  87.6  90.3   0.4

traceroute to v.mozilla.com (63.245.215.252), 30 hops max, 60 byte packets
 1  192.168.1.1 (192.168.1.1)  0.117 ms  0.134 ms  0.178 ms
 2  L100.PHLAPA-VFTTP-82.verizon-gni.net (98.111.140.1)  4.996 ms  4.997 ms  5.002 ms
 3  G0-10-2-0.PHLAPA-LCR-21.verizon-gni.net (130.81.183.76)  7.244 ms  7.262 ms  7.317 ms
 4  ae13-0.PHIL-BB-RTR1.verizon-gni.net (130.81.163.132)  5.166 ms ae0-0.PHIL-BB-RTR1.verizon-gni.net (130.81.209.178)  65.078 ms 130.81.199.18 (130.81.199.18)  65.059 ms
 5  0.xe-6-1-1.XL3.IAD8.ALTER.NET (152.63.3.245)  12.219 ms 0.xe-3-1-0.XL3.IAD8.ALTER.NET (152.63.3.65)  12.237 ms 0.xe-3-0-1.XL3.IAD8.ALTER.NET (152.63.3.61)  12.207 ms
 6  TenGigE0-4-2-0.GW1.IAD8.ALTER.NET (152.63.38.254)  12.284 ms TenGigE0-4-1-0.GW1.IAD8.ALTER.NET (152.63.38.246)  11.694 ms TenGigE0-4-0-0.GW1.IAD8.ALTER.NET (152.63.32.233)  17.630 ms
 7  teliasonera-gw.customer.alter.net (152.179.50.234)  12.487 ms  12.501 ms  12.456 ms
 8  ash-bb4-link.telia.net (80.91.252.98)  12.542 ms ash-bb4-link.telia.net (213.155.133.228)  12.578 ms ash-bb3-link.telia.net (80.91.252.88)  12.416 ms
 9  sjo-bb1-link.telia.net (213.155.130.211)  77.816 ms sjo-bb1-link.telia.net (80.91.248.188)  79.916 ms sjo-bb1-link.telia.net (213.155.135.159)  77.785 ms
10  mozilla-ic-155747-sjo-bb1.c.telia.net (62.115.8.162)  82.479 ms  82.450 ms  82.469 ms
11  xe-0-0-1.border2.scl3.mozilla.net (63.245.219.162)  89.864 ms  89.846 ms  89.764 ms
12  v-1027.core1.scl3.mozilla.net (63.245.214.73)  79.652 ms  79.621 ms  79.594 ms
13  v.mozilla.com (63.245.215.252)  86.859 ms  86.828 ms  86.782 ms


Andrew
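A rough way to read mtr reports like the two in this comment without eyeballing every row is to pull out the hops whose Loss% crosses a threshold and print the loss at the final hop next to them. This is only a sketch, not something that was run for this bug. The function name `mtr_loss_summary` and the 5% default threshold are made up for illustration, and it assumes the `mtr -w` report layout shown here (hop number followed by `|--`).

```shell
#!/bin/sh
# Summarize an `mtr -w` report read from stdin.
# Prints one line per hop whose Loss% is at or above the threshold
# (first argument, percent; the default of 5 is an arbitrary assumption),
# then the loss reported for the last hop, i.e. the destination itself.
mtr_loss_summary() {
    threshold="${1:-5}"
    awk -v t="$threshold" '
        # Hop rows look like "  7.|-- hostname  45.3%  300 ..."
        $1 ~ /^[0-9]+\.[|]--$/ {
            loss = $3
            sub(/%/, "", loss)
            host = $2
            dest_loss = loss
            if (loss + 0 >= t + 0)
                printf "hop %d %s loss=%s%%\n", $1 + 0, host, loss
        }
        END {
            if (host != "")
                printf "destination %s loss=%s%%\n", host, dest_loss
        }
    '
}
```

Usage would be something like `mtr -w -c 300 v.mozilla.com | mtr_loss_summary 5`. For the 300-packet report above it should list hop 7 at 45.3% and the destination at 0.3%. It only puts the two numbers side by side; it says nothing about why they differ.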
Comment from public thread at dslreports:

Hmmmm...I've seen that peering point like that in my traces to Netflix (e.g. teliasonera-gw.customer.alter.net [152.179.21.42]). They use Telia in addition to others. Maybe the Netflix "situation" is spilling over to the sites you're trying to access?
Assignee: network-operations → arzhel
Reply from Telia:

-------------------
Hi Arzhel,

Has your end user been in contact with Verizon regarding this?
This is what we can see from our side, standing on the teliasonera-gw.customer.alter.net.
ash-b2_re0> show route pool-173-49-137-94.phlapa.fios.verizon.net

inet.0: 554582 destinations, 1504427 routes (554487 active, 33 holddown, 23944 hidden)
+ = Active Route, - = Last Active, * = Both

173.49.0.0/16      *[BGP/170] 7w3d 08:00:59, MED 0, localpref 100
                      AS path: 701 I
                    > to 152.179.50.233 via ae4.0

Using the 152.179.50.234 as source;
ash-b2_re0> traceroute pool-173-49-137-94.phlapa.fios.verizon.net source 152.179.50.234
traceroute to pool-173-49-137-94.phlapa.fios.verizon.net (173.49.137.94) from 152.179.50.234, 30 hops max, 40 byte packets
 1  TenGigE0-0-0-10.GW1.IAD8.ALTER.NET (152.179.50.233)  5.969 ms  12.397 ms  11.979 ms
 2  P0-15-0-0.PHLAPA-LCR-22.verizon-gni.net (130.81.199.21)  10.475 ms  10.109 ms  8.254 ms
 3  * * *
 4  * * *
ash-b2_re0> ping pool-173-49-137-94.phlapa.fios.verizon.net source 152.179.50.234 rapid count 1000 size 1472
PING pool-173-49-137-94.phlapa.fios.verizon.net (173.49.137.94): 1472 data bytes
--- pool-173-49-137-94.phlapa.fios.verizon.net ping statistics ---
1000 packets transmitted, 1000 packets received, 0% packet loss
round-trip min/avg/max/stddev = 14.788/19.034/82.797/6.734 ms

I'm unable to replicate the packet loss seen from the MTR.
Does your end user still see this problem? Under which timespan was this previously present?
-------------------

Can you answer those questions?
Attached image mtr.png
mtr is a pain to capture text from... sorry.  Time given is EDT (just before 12:30pm EDT)
I have contacted verizon.  They said "no known problems" and to check back in a day.  Still seeing the problem (though it does disappear at times; generally not for long).
Bumping severity

I've found this also affects downloading nightly updates from Help -> About Nightly and from nightly.mozilla.org (fails at <1MB).

Aurora downloads quickly; my guess is the difference is CDN, so Aurora/Beta/Release users are probably ok (or are shortly after we release).

Perhaps we can push the nightly-latest (whatever nightly.mozilla.org/auto-updates/Help pull from) to the CDN?
Severity: normal → major
I use Aurora on my phone and it stalls out foreboding updates over my home Wi-Fi pretty regularly. It even did today :(
s/foreboding/downloading of course. Silly autocorrect.
Randell et al:  This looks to be symptomatic of the current Verizon dispute with its peers.  We've opened a ticket with Telia although they are batting it back to Verizon.  I'm closing this as it's not something we'll be able to correct.  For some other information here is a link that should show you some of this bickering:
http://blog.level3.com/global-connectivity/verizons-accidental-mea-culpa/
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX
For those using linux, it might be worth tweaking your TCP stack to retransmit more aggressively and (selfishly) ignore router congestion guidance.  This seems to have made a git checkout jump from 10-16KB/s to 30-60KB/s for me, but who knows.

I made a shell script to apply the changes since I'm not sure I want to add them to /etc/sysctl.conf since I'm blindly trying things:

#!/bin/sh

# docs at https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt

# maybe encourage faster retransmits;
# http://whitequark.org/blog/2011/09/12/tweaking-linux-tcp-stack-for-lossy-wireless-networks/
# suggests that's the result.  The docs on this are very limited.
sysctl net.ipv4.tcp_low_latency=1

# retransmit earlier
sysctl net.ipv4.tcp_early_retrans=1

# ignore congestion guidance from routers
sysctl net.ipv4.tcp_ecn=0
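
Since these are blind experiments, it may help to record the current values first so they can be put back without a reboot. A minimal sketch, assuming the usual Linux mapping of sysctl keys onto files under /proc/sys. `snapshot_sysctls` and the `PROC_SYS` override are hypothetical names, not part of the script above.

```shell
#!/bin/sh
# Print "key=value" lines for the given sysctl keys so the current
# settings can be saved before experimenting and restored afterwards.
# PROC_SYS can point at another directory; that override exists only
# to make the function easy to try out without touching the real tree.
snapshot_sysctls() {
    root="${PROC_SYS:-/proc/sys}"
    for key in "$@"; do
        # net.ipv4.tcp_ecn -> net/ipv4/tcp_ecn
        path="$root/$(printf '%s' "$key" | tr . /)"
        if [ -r "$path" ]; then
            printf '%s=%s\n' "$key" "$(cat "$path")"
        else
            # Not every kernel exposes every key; leave a comment line.
            printf '# %s not present on this kernel\n' "$key"
        fi
    done
}
```

Usage might look like `snapshot_sysctls net.ipv4.tcp_low_latency net.ipv4.tcp_early_retrans net.ipv4.tcp_ecn > tcp-before.conf` before running the script, and `sysctl -p tcp-before.conf` (as root) should load the saved values back.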
(In reply to James Barnell from comment #10)
> Randell et al:  This looks to be symptomatic of the current Verizon dispute
> with its peers.  We've opened a ticket with Telia although they are batting
> it back to Verizon.  I'm closing this as it's not something we'll be able to
> correct.  For some other information here is a link that should show you
> some of this bickering:
> http://blog.level3.com/global-connectivity/verizons-accidental-mea-culpa/

Reopening:
a) still happening, regardless of who is at fault
b) Major impact on users of nightly/aurora - from affected parts of the internet, downloads of nightly and aurora are virtually impossible (from nightly.mozilla.org/aurora.mozilla.org).  Typically you get <1MB before the download fails.
c) likely problems with background updates, though those may slip through due to throttling and download restarts.

It has MAJOR impacts on developers, even down to trouble getting bugzilla pages to load quickly, but mostly with downloading tbpl/try builds, loading tbpl/pushlogs, and doing hg repo updates (I had to try for most of a day to get my Win32 machine to update inbound after a long gap - it kept failing and rolling back).
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
I pinged you, but you were off apparently.  Wanted to make sure you know about this issue, and inform anyone else who should know/care.  Affects some reasonable percentage of Verizon FiOS customers - those of us in the philly area, DC, likely many others
Flags: needinfo?(lmandel)
For those who are interested in a workaround and don't have their own already, I cleaned up my linode VM's openvpn configuration and migrated the VM from their Newark data-center (which along with their Atlanta and Fremont data-centers all suffer from the same alter.net -> telia.net route we hit when we access Moz SCL-ish servers) to their Dallas data-center.

It's not a silver bullet since now the first hop out of Verizon for my VM is to level3 (ae16.edge1.washingtondc12.level3.net) who Verizon is also not super-friends with.  But at least if things are sucking, it's an option that can be switched to in the hopes it's better.  Of course after I finished setting this up today, things seemed to be much better, but who knows.  (They were definitely horrible earlier today and last night.)

Anywho, my linode has bandwidth to burn (3 TB) so if affected people would like, I can pretty easily provide a zip/tgz of the keying material and an .ovpn-style config file plus tips on how to make NetworkManager on Ubuntu work.  Notes/reqs are:
- I need to know how many machines to produce keys for since each key is limited to 1 session
- My server config currently pushes itself as the default route and DNS server; that can likely be overridden on the client but I can probably also make the server stop doing that.  That choice stems from living in Canada for a while and trying to outwit hulu where you really want the DNS lookup coming from the US, but it's also handy on sketchy wi-fi/ISPs.
- No doing stuff that gets the man all up in my business.
Tyler, this seems like something your team should be aware of.
Flags: needinfo?(tdowner)
I haven't seen any user feedback on this, but we will keep our eyes out. Thanks!
Flags: needinfo?(tdowner)
I don't have anything to add but will clear the ni on me as I am aware and will be discussing this with a larger group tomorrow.
Flags: needinfo?(lmandel)
Any word?

Right now I can't even update my Aurora release builds most of the time.  I couldn't even successfully download the Aurora stub installer (~440KB).  Let's not talk about bugzilla, hg repo access, tbpl...

If it was just a few devs, that's one thing.  But I believe this is affecting a large number of in-the-field users, especially those using Aurora and Nightly, and community members accessing Mozilla resources (like bugzilla, tbpl, etc).  (We know it's affecting FiOS customers in the Philadelphia and DC areas at least, from talking to other Mozilla employees on this bug.)
Ignoring the larger problem (which is very serious and would be good to work around to any extent possible, don't know if aurora uses bouncer and can use a network Verizon isn't playing hardball with), I do want to comment that my linode VPN is working great for me; downloads of b2g-desktop are back down to ~20 seconds instead of ~20 minutes with a chance of timeout.  I'm still happy to provide openvpn keys/configs to affected MoCo/MoFo employees (and maybe vouched mozillians? I just don't want to be running a VPN service) who want to at least give it a try.  Just email me privately at asuth@mozilla.com and I'll (rather insecurely) email you the zipfiles back.
Please use first the Offices OpenVPN jumphosts: https://intranet.mozilla.org/Office_JumpHost#Common_Files
(use the "Legacy config for offices" file).

Could someone on Verizon give us a temporary shell, so we can run more tests?

Thanks
Hm, yeah, the SFO jump host transitions from Verizon's alter.net to gblx.net which was Global Crossing but now is Level 3.  Fewer hops too, although obviously higher latency if your packet didn't really want to go all the way to California.  But I think it does for most Moz data-centers.  So that should be fine if people want to use that.  (Maybe better than my VPN?  I'm seeing noticeable loss to my VM on one of level3's internal routers with mtr, but it's not impacting pinging my actual VM, so maybe it's an internal misconfiguration on their part?)

 7. 0.ae1.BR1.IAD8.ALTER.NET             0.0%   611   12.9  11.7  10.2  38.8   2.5
 8. te2-3.ar6.DCA3.gblx.net              0.0%   610    7.8  16.7   7.6 232.1  28.4
 9. mozilla.vlan426.asr1.sfo1.gblx.net   0.3%   610   87.5  91.9  85.1 202.9  18.3
10. 146.82.185.164                       1.3%   610   84.9  89.1  80.4 196.1  18.3
Some more data points after getting access to a machine hosted on Jesup's side.

wget http://ftp.mozilla.org/pub/mozilla.org/mozilla/VMs/CentOS5-ReferencePlatform.tar.bz2 -O- > /dev/null
Reaches 12.1KB/s max when I first tested it.
But running the exact same test a few minutes later shows speed around 9.94MB/s (and stays like that for at least a couple hours)

Also maybe just a coincidence but the 2 times I tried to download the same file but replacing http:// with ftp:// the download speed went back to normal (I have to wait for the connection to be degraded to be able to test it again).

While 
wget http://ping.online.net/10000Mo.dat -O- > /dev/null
is always fine: 9.45MB/s

Both downloads are http and use the same transit/peering (Verizon, then Telia)

mozilla@home ~ $ mtr ftp.mozilla.org -w --report -c 100
Start: Thu Aug 21 02:07:04 2014
HOST: home                                    Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- 192.168.1.1                             84.0%   100    0.5   0.5   0.5   0.6   0.0
  2.|-- L100.PHLAPA-VFTTP-82.verizon-gni.net     0.0%   100    5.2  10.4   3.5 436.4  43.4
  3.|-- G0-10-2-1.PHLAPA-LCR-21.verizon-gni.net  0.0%   100    5.7   8.4   5.7  20.4   2.1
  4.|-- ae11-0.PHIL-BB-RTR1.verizon-gni.net      0.0%   100    6.2  21.5   5.9  88.7  22.9
  5.|-- 0.xe-6-1-1.XL3.IAD8.ALTER.NET            0.0%   100   11.6  17.7  11.1  96.0  14.0
  6.|-- TenGigE0-6-0-2.GW1.IAD8.ALTER.NET        0.0%   100   17.0  16.7  11.2  31.1   3.8
  7.|-- teliasonera-gw.customer.alter.net       12.0%   100   12.8  17.5  11.8  47.4   5.9
  8.|-- ash-bb4-link.telia.net                   0.0%   100   12.9  15.8  10.9  73.3  12.5
  9.|-- sjo-bb1-link.telia.net                   0.0%   100   78.8  77.4  75.5  88.2   1.3
 10.|-- mozilla-ic-155747-sjo-bb1.c.telia.net    0.0%   100   76.3  77.4  73.8  98.8   3.0
 11.|-- xe-0-0-1.border2.scl3.mozilla.net        0.0%   100   92.5  92.6  90.9 112.4   2.1
 12.|-- v-1027.core1.scl3.mozilla.net            2.0%   100   90.4  90.7  88.3  95.4   1.4
 13.|-- ftp1-zlb.vips.scl3.mozilla.com           1.0%   100   89.3  87.2  85.8  90.4   0.7

mozilla@home ~ $ mtr ping.online.net -w --report -c 100
Start: Thu Aug 21 02:07:26 2014
HOST: home                                    Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- 192.168.1.1                              1.0%   100    0.5   0.7   0.5   1.0   0.0
  2.|-- L100.PHLAPA-VFTTP-82.verizon-gni.net     0.0%   100    5.0   6.3   3.2 104.7  10.7
  3.|-- G0-10-2-0.PHLAPA-LCR-21.verizon-gni.net  0.0%   100    8.4   8.4   5.8  17.6   1.9
  4.|-- ae13-0.PHIL-BB-RTR1.verizon-gni.net      0.0%   100    5.6  20.0   3.5 118.8  24.0
  5.|-- 0.xe-3-0-1.XL3.IAD8.ALTER.NET            0.0%   100   28.6  15.7  10.9  64.3   9.8
  6.|-- TenGigE0-6-0-3.GW1.IAD8.ALTER.NET        0.0%   100   11.5  16.6  11.2  23.1   3.3
  7.|-- teliasonera-gw.customer.alter.net       32.0%   100   12.0  13.8  11.0  70.2   7.6
  8.|-- ash-bb4-link.telia.net                   0.0%   100   12.8  13.5  11.1  58.9   5.8
  9.|-- prs-bb2-link.telia.net                   0.0%   100   92.8  99.7  90.9 246.7  25.2
 10.|-- prs-b8-link.telia.net                    0.0%   100   93.3  94.0  91.0 115.0   3.3
 11.|-- online-ic-305116-prs-b8.c.telia.net      0.0%   100   98.7 100.3  98.1 111.8   1.7
 12.|-- 45x-s44-2-a9k2.dc3.poneytelecom.eu       0.0%   100  104.5 102.7  98.5 208.4  10.9
 13.|-- ping.online.net                          1.0%   100   91.2  92.2  90.7  93.7   0.6

When performance is bad, there is very high packet loss (84.0%) on the first hop (192.168.1.1), while it's fine when the download goes full speed.
> When performance is bad, there is very high packet loss (84.0%) on the first hop (192.168.1.1), while it's fine when the download goes full speed.

Note that at least three other people on FiOS (2 others in Philly burbs, 1 in DC) working for mozilla have the same problem, so I don't think it's a local LAN problem.  (and it clearly tracks the far-end source/destination)

If the peering point is overloaded (or if someone is traffic-shaping/etc), ftp and UDP may well be in different traffic queues from HTTP/HTTPS. I've noticed that Vidyo/Loop/WebRTC (UDP media traffic) tend to work fine to the affected locations, though Vidyo can have trouble connecting or drop you in mid-call, likely because of a long-poll HTTP(S) failure.

Overall this matches what I see.
(In reply to Arzhel Younsi [:XioNoX] from comment #22)
> When performance is bad, there is very high packet loss (84.0%) on
> the first hop (192.168.1.1), while it's fine when the download goes full
> speed.

Note that I think this just happens if you run 2+ mtr jobs in parallel.  Specifically, if I run one job, things are fine.  Then I run a second job (in a different terminal tab), the first job starts missing all 192.168.1.1 pings.  If I stop the second job, the first job gets all its pings again.  I am using an OpenWRT router (rather than the crappy router Verizon hands out) over all ethernet, no wireless involved.

In agreement with :jesup, things can be intermittent and some type of shaping somewhere would not surprise me.  Specifically, sometimes I get full speed (without VPN) of like ~10MiB/sec downloads, sometimes I'm lucky to get ~15k or ~50k/sec.  Hopefully things are improving as Verizon and Netflix complete their build-out.
wget http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/latest-trunk/firefox-34.0a1.en-US.mac.dmg -O- > /dev/null

two runs, seconds apart.  First: ramped up to 5.5MB/s and 8M transferred and I killed it.  up-arrow, return.  Started around 40KB/s, quickly down below 15.  Stalled out and failed at 3%/3MB transferred.

3 more runs: two start fast, third quickly drops under 20KB/s, and keeps dropping.
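To put numbers on the fast/slow coin flip, here is a small sketch that repeats a short download and labels each attempt. Assumptions: curl is installed (the runs above used wget), the 1 MB/s cut-off and the 10 second cap per attempt are arbitrary, and `classify_rate`, `probe_once` and `probe_many` are names invented here.

```shell
#!/bin/sh
# Label one average transfer rate (bytes/sec) as "fast" or "slow".
# Second argument is the cut-off in bytes/sec (assumed default: 1 MB/s).
classify_rate() {
    awk -v r="$1" -v cut="${2:-1000000}" \
        'BEGIN { print ((r + 0 >= cut + 0) ? "fast" : "slow") }'
}

# Download URL for at most SECS seconds, discard the data, and print
# the average rate in bytes/sec as reported by curl's --write-out.
probe_once() {
    curl -s -o /dev/null --max-time "${2:-10}" -w '%{speed_download}\n' "$1"
}

# Run N probes back to back; print "attempt rate verdict" per line.
probe_many() {
    url="$1"
    n="${2:-5}"
    i=1
    while [ "$i" -le "$n" ]; do
        rate=$(probe_once "$url")
        printf '%d %s %s\n' "$i" "$rate" "$(classify_rate "$rate")"
        i=$((i + 1))
    done
}
```

Running `probe_many "$url" 10` with `url` set to the nightly build link above would give one line per attempt, which makes it easier to see how often a fresh connection lands in the slow state.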
Something I saw yesterday that might be useful or just interesting is Brendan Gregg's perf-tools project (https://github.com/brendangregg/perf-tools) has a "tcpretrans" script, described as:
net/tcpretrans: show TCP retransmits, with address and other details.

example run at:
https://github.com/brendangregg/perf-tools/blob/master/examples/tcpretrans_example.txt

I think inevitably the problem is resulting in retransmits, so I think the utility might be limited.  Although it might make it more obvious if it's flow/route specific versus a comprehensive router sadness.
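A lighter-weight check than tcpretrans, as a sketch: Linux keeps system-wide TCP counters in /proc/net/snmp, so sampling OutSegs and RetransSegs before and after a download gives a crude retransmit percentage. Unlike tcpretrans this is not per-connection, so it can't tell flows apart. `tcp_counters` and `retrans_delta` are hypothetical helper names.

```shell
#!/bin/sh
# Print "OutSegs RetransSegs" from /proc/net/snmp (or the file given
# as the first argument). The file has a "Tcp:" header row naming the
# columns, followed by a "Tcp:" row holding the values.
tcp_counters() {
    awk '
        $1 == "Tcp:" && !have_header {
            for (i = 2; i <= NF; i++) col[$i] = i
            have_header = 1
            next
        }
        $1 == "Tcp:" { print $(col["OutSegs"]), $(col["RetransSegs"]) }
    ' "${1:-/proc/net/snmp}"
}

# Given two samples from tcp_counters (before, after), print how many
# segments were sent and retransmitted in between, and the percentage.
retrans_delta() {
    printf '%s %s\n' "$1" "$2" | awk '{
        sent = $3 - $1
        re = $4 - $2
        pct = (sent > 0) ? 100 * re / sent : 0
        printf "sent=%d retrans=%d (%.1f%%)\n", sent, re, pct
    }'
}
```

For example: `before=$(tcp_counters)`, run the wget, `after=$(tcp_counters)`, then `retrans_delta "$before" "$after"`. Other traffic on the machine is counted too, so it is best done on an otherwise idle box.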
Duplicate of this bug: 1061077
This issue is affecting me as well. For the past couple of weeks, I couldn't pull down any code from hg.mozilla.org. It would fail early on. Downloads from ftp.mozilla.org take forever as well. I had to connect to my university's VPN to get working downloads. Something is seriously up with the peering here.

FWIW I am also using Verizon FiOS with 70mbps down in Virginia, and according to a traceroute, I also have to go through teliasonera-gw.customer.alter.net (152.179.50.234).
The problems that Verizon is creating through its peering policies are getting more and more 
attention.  The people who are reporting the issues in this bug are not alone with their
headaches.
The problem for Mozilla's network engineers is this:
We don't buy anything from Verizon, so we have no basis to call or write them to
complain.  Even if we could do that, Verizon would not lift a finger to address
our complaints, as there is no benefit in it for them.

The people on this bug who do buy service from Verizon, can at least call
them and complain, as you are paying customers.

That said, I saw one engineer on NANOG say that he used the following hack:
 - If their end user on Verizon had a static (or at least stable) IP address
   from Verizon, they could add a static route on their border routers forcing
   the traffic out through a different provider
 - This is not a guarantee that the packets from Mozilla back to our friends
   using Verizon FiOS will avoid the congestion point, but it might
 - Keep in mind that this means your packets *to* Mozilla will still hit that
   congestion point, but there may not be congestion in the transmit direction
 - The data flows will be routed asymmetrically, but that's OK.  A great deal
   of traffic travels asymmetrically.

So, for those of you on this bug that are feeling download pain.  If you'd like to
try an experiment, let me know?  I'll work with you.  We'll figure out your public
IP, and set up a time to do some tests.  We have two other providers other than Telia,
and we can try creating static routes via one, and then the other, to see if it
helps or not.

Thanks -- Dave
Flags: needinfo?(rjesup)
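For illustration only, the idea of pinning one subscriber's return traffic to a different provider looks roughly like a host route. The sketch below prints (it does not run) the equivalent Linux iproute2 command. The real border routers and their configuration syntax are not described in this bug, `host_route_cmd` is an invented name, and the addresses are documentation-range placeholders.

```shell
#!/bin/sh
# Print the iproute2 command that would send traffic for a single IPv4
# address via a specific next hop. Nothing is executed here.
host_route_cmd() {
    printf 'ip route replace %s/32 via %s\n' "$1" "$2"
}

# Placeholder addresses from the RFC 5737 documentation ranges:
# first the end user's public IP, then the alternate provider's next hop.
host_route_cmd 203.0.113.7 192.0.2.1
```

As noted in the comment above, a route like this only steers packets heading back to the user; the user's packets toward Mozilla still take whatever path Verizon chooses.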
(In reply to Dave Curado :dcurado from comment #29)
> The problems that Verizon is creating through its peering policies are
> getting more and more attention.  The people who are reporting the issues
> in this bug are not alone with their headaches.
> The problem for Mozilla's network engineers is this:
> We don't buy anything from Verizon, so we have no basis to call or write
> them to complain.  Even if we could do that, Verizon would not lift a
> finger to address our complaints, as there is no benefit in it for them.
>
> The people on this bug who do buy service from Verizon, can at least call
> them and complain, as you are paying customers.

Can you point us to reports we can reference?

And can you tell us how to apply what pressure we can as customers?  (I can open a new ticket, but they'll start by "we'll monitor your connection" unless I push and can reference something.)  Obviously, I can point here.

Also: since this is affecting us as an organization, can you contact press and see if it makes sense to have press contact verizon, and also consider a press release (or talk to other companies affected, and talk about some joint pressure/press-release)?  Telia may be able to point us to other affected orgs, or press coverage might.


> 
> That said, I saw one engineer on NANOG say that he used the following hack:
>  - If their end user on Verizon had a static (or at least stable) IP address
>    from Verizon, they could add a static route on their border routers
>    forcing the traffic out through a different provider

FiOS IPs are somewhat stable, but not static for non-business subscribers.  (And I don't know for sure if they still give 5 static IPs for business.)

Thanks for updating us
Flags: needinfo?(rjesup)
Here is a thread from the NANOG mailing list that discusses this...
http://mailman.nanog.org/pipermail/nanog/2014-August/069467.html

I don't think you need any help with regards to understanding how to
apply pressure to a service provider. =-) 
You're absolutely correct though -- I suspect they have a canned answer of,
"we'll monitor your connection"... they may monitor your connection, but that's
not where the problem lies.  The problem lies with their connections to other providers.

My feeling is that as a network engineer, contacting the press or making press releases
falls way outside my job description.  Actually, way way outside my job description.
If Mozilla wants to take on Verizon in the press, I think that is something that would
have to be taken up by more qualified parts of our organization (and vetted by legal).

FWIW, Verizon has already received a lot of bad press WRT their lack of performance
with NetFlix.  My guess is that Verizon isn't listening, but I don't actually know that.

If you'd like to try to experiment with having us route traffic from our data centers back
to you via a different provider, let me know your current external IP, and I'd be glad to
set up a time to try it with you.
cc Kev, who has already followed up with one contact at Verizon with no success. He is going to try an alternate route. I am looking for additional contacts at Verizon in parallel.
Okay, so I discovered something really strange. Under Windows 7 64 bit, I have terrible download speeds from ftp.mozilla.org and the Mozilla hg repos. Under Linux Mint 17 64 bit on the same machine, same network, I get normal download speeds.

Could this be an issue related to TCP transmission settings?
That is great info Quentin!

I am not a windows user at all, but I did google around a bit, and found this
somewhat wild thread... (it's quite long)
http://social.technet.microsoft.com/Forums/windows/en-US/4537c7b6-9761-41c5-8b47-0ecb831c8575/why-is-windows-7-so-slow-in-copying-network-files?forum=w7itproperf

There's a lot of noise in the thread, but scanning through it, I did find a few posts
saying, "I fixed it, and here's how..."  Maybe that can help.
(In reply to Quentin Headen [:qheaden] from comment #33)
> Okay, so I discovered something really strange. Under Windows 7 64 bit, I
> have terrible download speeds from ftp.mozilla.org and the Mozilla hg repos.
> Under Linux Mint 17 64 bit on the same machine, same network, I get normal
> download speeds.

Could be.  Note that the problems I've encountered have been intermittent and random-seeming throughout the day.  Unless there's something going on where one OS is using IPv6 and the other isn't and your network uses IPv6 in a way that affects routing (https://blog.mozilla.org/it/2012/05/24/ipv6-is-here/ indicates Moz got IPv6 a while ago, but it sounds like Verizon only supports it if you yell at them and you're lucky; apparently you can test at http://test-ipv6.com/), I'd think it's conceivable that it's just coincidence.  Certain things could correlate, however.  Like if you're booting to Windows when people are getting home from work and firing up Netflix.

One way to gather data, if you're interested, is probably just to try pinging the relevant mozilla servers on each OS.  If you experience similar packet loss at similar times but your downloads only suck on Windows, maybe Linux Mint has much better transmission settings.  In my investigation into settings and playing with my settings, I was able to get better performance, but things were still ridiculously slow, so I don't think that would be it on its own.

For pinging, I'd suggest using http://oss.oetiker.ch/smokeping/.  I've been running it and I can see in my logs that when I switch to my workaround VPN from no-vpn the ping packet loss rate from my home to git.mozilla.org/ftp.mozilla.org goes from 1/20 or 2/20 to 0/20 and sometimes it comes back when I go off the VPN too.  But sometimes things are green either way.
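If setting up smokeping is more than you want, here is a cron-able sketch of the same idea: send a batch of pings, pull the loss percentage out of the summary line, and log it with a timestamp. `extract_loss` and `log_loss` are names invented here, and the parsing assumes the "N% packet loss" wording that common ping implementations print.

```shell
#!/bin/sh
# Read ping output on stdin and print just the loss percentage from
# the summary line, e.g. "5" for "... 19 received, 5% packet loss ...".
extract_loss() {
    sed -n 's/.*[ ,]\([0-9.][0-9.]*\)% packet loss.*/\1/p'
}

# Ping HOST with COUNT packets (default 20, matching the x/20 figures
# mentioned above) and print "UTC-timestamp host loss-percent".
log_loss() {
    host="$1"
    count="${2:-20}"
    loss=$(ping -q -c "$count" "$host" 2>/dev/null | extract_loss)
    printf '%s %s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$host" "${loss:-unknown}"
}
```

Appending the output of `log_loss git.mozilla.org` and `log_loss ftp.mozilla.org` to a file every few minutes, on and off the VPN, would give a simple record to compare against the times when downloads crawl.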
I'll note that tests (on Fedora) by IT from my machine here, and my own tests, show that within a period of seconds individual wgets will be slow or fast, apparently at random.  Note also that the behavior is "sticky" - a single TCP stream is generally either slow or fast; I rarely see a fast-slow-fast/etc type of operation, so the issue is tied to something stateful at the TCP level in the network.

I don't think there's any actual change in behavior here.  Maybe (IPV6, a forgotten VPN-to-office ;-) ) Quentin is avoiding the issue in some way.
Is that still an ongoing issue? I quickly tested it with Randell's test machine and couldn't replicate. But as it's quite random maybe I just got "lucky".
(In reply to Arzhel Younsi [:XioNoX] from comment #37)
> Is that still an ongoing issue? I quickly tested it with Randell's test
> machine and couldn't replicate. But as it's quite random maybe I just got
> "lucky".

Yes.  Downloading builds from ftp.mozilla.org (or Nightly updates from Help->About) is an exercise in "start download, note temp estimate is 2 hrs; hit stop; hit retry; rinse, repeat until I get lucky and the estimate is 10-30 seconds."  Similar problems with Bugzilla, git updates from codeaurora (last I checked), etc.

Ryan, Andrew, bz, etc: I presume you still have the problem too. 

I re-reported to verizon, but have gotten no action (no surprise).
There is a hack we can try, which may help you.
We can not control how your provider sends your packets to us, but we can modify the way
the packets from Mozilla's data centers head back towards you.

I tested this hack, and it improved network performance.
There are a lot of variables that are not in our direct control, so I can't make any
promises.  However, if you'd like to try, I'd be glad to set it up for you.

If you'd like to do that, please send us the IP address you're coming from.
(the outside interface IP of your home router)  There are a bunch of ways
to get this info if you don't already know it.  
I often tell people to use:  http://ifconfig.me

Thanks,
Dave
Flags: needinfo?(rjesup)
71.175.4.200 -- at the moment.  It does tend to be stable.

Alternatively if that doesn't work I can change my main router to an ASUS RT-66N I have waiting here, and flash it with DD-WRT or Tomato and set up a VPN for addresses that are used by Mozilla (etc); either to a commercial VPN or andrew's.
Flags: needinfo?(rjesup)
OK, how do things look now?
Any better?
Thanks,
Dave
I went ahead and signed up for VyprVPN and it's night and day better.
Quick ping to see if there is any improvement from verizon?
Yes, things have improved on my end in the past few months. I guess all those interconnection deals paid off.
Alright, thanks. Going to close that one. Please reopen if needed.
Status: REOPENED → RESOLVED
Closed: 6 years ago → 5 years ago
Resolution: --- → FIXED