Closed Bug 698503 Opened 14 years ago Closed 13 years ago

[Paris Office] Heavy bandwidth usage slows down the entire office network

Categories

(Infrastructure & Operations Graveyard :: NetOps, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: mounir, Assigned: ravi)

I'm currently downloading KUbuntu on my desktop for testing purposes, and my coworkers are complaining that the network is really slow. Indeed, I can't see what I'm typing on IRC through SSH, the ping to google.fr went from 10ms to 500ms, and I have 5 and 10% packet loss on both my laptop (WIFI) and desktop (wired). However, I'm not sure the packet loss happens only during heavy network usage. Unfortunately, it seems that the fiber didn't help us...
Assignee: network-operations → ravi
The CIR for GBLX is set too low. I'm trying to expedite an increase with the carrier.
Any update on this? There were five of us at the office this morning, with no one doing anything that requires high bandwidth, and the ping to google.com still sat at 600ms for a while...
That by itself is not very helpful or even indicative of anything with the new circuit, but it is also very curious. It will be helpful if you can attach output from MTR[1], which can be installed via Homebrew[2] on OSX. The order for the bandwidth increase is in and should hopefully be complete by the top of next week. In the meantime we will investigate whether anything else in the office is silently consuming bandwidth.
[1] http://www.bitwizard.nl/mtr/
[2] http://mxcl.github.com/homebrew/
Also, do we know whether these problems are both in the wired and wireless networks? I've heard stories about the wifi in the Paris office not exactly being great, but maybe that's been fixed long ago...
Both issues (high ping and packet loss) seem to happen on wired connections as well as on wifi.
Next time it happens please note the time in this bug. We're running probes every minute to google.fr and www.mozilla.org. Hopefully we can catch where the issue may be happening.
(In reply to Ravi Pina [:ravi] from comment #7)
> Next time it happens please note the time in this bug. We're running probes
> every minute to google.fr and www.mozilla.org. Hopefully we can catch where
> the issue may be happening.
This was happening a few seconds ago.
The window started at or about 20111124T1116 and ended at or about 20111124T1129. Since we have captured data I can get a ticket opened with the vendor. Sample captures to www.mozilla.org and 8.8.8.8 respectively.

::::::::::::::
20111124T1116
::::::::::::::
HOST: admin1a.private.par1.mozilla.com Loss% Snt Last Avg Best Wrst StDev
1. fw1.private.par1.mozilla.net 0.0% 5 0.6 0.6 0.6 0.6 0.0
2. vlan402.asr1.cdg3.gblx.net 0.0% 5 263.4 279.6 263.4 298.9 16.1
3. MOZILLA.TenGigabitEthernet2-1.ar1.SNV2.gblx.net 0.0% 5 445.5 418.0 404.0 445.5 16.4
4. te-8-2.core2.sjc.mozilla.net 20.0% 5 416.5 428.4 416.5 442.4 10.7
5. moz.org01.generic-zlb.sj.mozilla.com 20.0% 5 441.8 443.8 438.7 455.4 7.9

HOST: admin1a.private.par1.mozilla.com Loss% Snt Last Avg Best Wrst StDev
1. fw1.private.par1.mozilla.net 0.0% 5 0.6 0.6 0.6 0.7 0.0
2. vlan402.asr1.cdg3.gblx.net 0.0% 5 300.7 272.4 250.3 300.7 20.6
3. 72.14.198.169 20.0% 5 274.8 286.9 274.8 303.2 11.8
4. 209.85.251.40 0.0% 5 309.0 292.9 264.0 316.4 20.8
5. 209.85.253.20 0.0% 5 287.2 297.7 287.2 308.7 8.4
6. 209.85.250.167 0.0% 5 331.4 297.8 270.7 331.4 26.9
7. ??? 100.0 5 0.0 0.0 0.0 0.0 0.0

::::::::::::::
20111124T1118
::::::::::::::
HOST: admin1a.private.par1.mozilla.com Loss% Snt Last Avg Best Wrst StDev
1. fw1.private.par1.mozilla.net 0.0% 5 0.6 0.5 0.4 0.6 0.1
2. vlan402.asr1.cdg3.gblx.net 20.0% 5 290.9 298.4 290.9 306.4 6.3
3. MOZILLA.TenGigabitEthernet2-1.ar1.SNV2.gblx.net 0.0% 5 415.6 423.6 415.6 429.4 5.3
4. te-8-2.core2.sjc.mozilla.net 20.0% 5 441.1 440.9 435.3 448.9 5.8
5. moz.org01.generic-zlb.sj.mozilla.com 20.0% 5 445.7 438.8 415.6 448.5 15.5

HOST: admin1a.private.par1.mozilla.com Loss% Snt Last Avg Best Wrst StDev
1. fw1.private.par1.mozilla.net 0.0% 5 0.6 0.6 0.5 0.7 0.0
2. vlan402.asr1.cdg3.gblx.net 0.0% 5 275.8 282.6 273.2 292.0 8.4
3. 72.14.198.169 0.0% 5 308.6 300.9 267.1 314.2 19.1
4. 209.85.251.40 20.0% 5 348.1 299.1 263.5 348.1 36.1
5. 209.85.253.20 0.0% 5 286.1 296.7 271.6 334.4 27.5
6. 209.85.250.161 0.0% 5 356.8 311.0 285.0 356.8 27.5
7. ??? 100.0 5 0.0 0.0 0.0 0.0 0.0

::::::::::::::
20111124T1119
::::::::::::::
HOST: admin1a.private.par1.mozilla.com Loss% Snt Last Avg Best Wrst StDev
1. fw1.private.par1.mozilla.net 0.0% 5 0.6 0.6 0.5 0.6 0.1
2. vlan402.asr1.cdg3.gblx.net 0.0% 5 656.9 457.2 391.7 656.9 112.6
3. MOZILLA.TenGigabitEthernet2-1.ar1.SNV2.gblx.net 0.0% 5 569.9 577.6 565.8 593.5 11.6
4. te-8-2.core2.sjc.mozilla.net 40.0% 5 567.7 560.8 538.5 576.2 19.8
5. moz.org01.generic-zlb.sj.mozilla.com 60.0% 5 607.2 575.6 543.9 607.2 44.7

HOST: admin1a.private.par1.mozilla.com Loss% Snt Last Avg Best Wrst StDev
1. fw1.private.par1.mozilla.net 0.0% 5 0.6 0.7 0.6 0.7 0.0
2. vlan402.asr1.cdg3.gblx.net 20.0% 5 504.5 431.1 388.8 504.5 52.5
3. 72.14.198.169 0.0% 5 482.2 437.8 409.6 482.2 32.6
4. 209.85.250.142 0.0% 5 460.2 422.5 390.9 460.2 29.0
5. 216.239.43.233 0.0% 5 437.9 428.7 420.8 437.9 6.7
6. 209.85.250.165 0.0% 5 442.4 411.5 382.6 442.4 21.8
7. ??? 100.0 5 0.0 0.0 0.0 0.0 0.0

::::::::::::::
20111124T1120
::::::::::::::
HOST: admin1a.private.par1.mozilla.com Loss% Snt Last Avg Best Wrst StDev
1. fw1.private.par1.mozilla.net 0.0% 5 0.5 0.5 0.5 0.6 0.1
2. vlan402.asr1.cdg3.gblx.net 0.0% 5 534.1 525.3 505.0 544.2 17.3
3. MOZILLA.TenGigabitEthernet2-1.ar1.SNV2.gblx.net 20.0% 5 665.2 667.4 661.8 674.2 5.3
4. te-8-2.core2.sjc.mozilla.net 20.0% 5 684.7 679.6 670.0 692.7 11.0
5. moz.org01.generic-zlb.sj.mozilla.com 20.0% 5 715.2 681.5 660.0 715.2 24.1

HOST: admin1a.private.par1.mozilla.com Loss% Snt Last Avg Best Wrst StDev
1. fw1.private.par1.mozilla.net 0.0% 5 0.6 0.6 0.5 0.7 0.1
2. vlan402.asr1.cdg3.gblx.net 0.0% 5 528.5 524.1 510.1 538.9 11.7
3. 72.14.198.169 0.0% 5 561.5 536.3 517.0 561.5 22.3
4. 209.85.250.142 0.0% 5 539.6 548.1 527.4 584.4 21.8
5. 216.239.43.233 0.0% 5 571.9 550.7 485.3 582.8 39.8
6. 209.85.250.163 0.0% 5 550.1 539.9 495.5 561.0 25.7
7. ??? 100.0 5 0.0 0.0 0.0 0.0 0.0

::::::::::::::
20111124T1121
::::::::::::::
HOST: admin1a.private.par1.mozilla.com Loss% Snt Last Avg Best Wrst StDev
1. fw1.private.par1.mozilla.net 0.0% 5 0.6 0.6 0.5 0.7 0.1
2. vlan402.asr1.cdg3.gblx.net 0.0% 5 496.1 484.8 462.5 508.3 20.7
3. MOZILLA.TenGigabitEthernet2-1.ar1.SNV2.gblx.net 20.0% 5 637.1 632.8 618.5 648.6 13.0
4. te-8-2.core2.sjc.mozilla.net 20.0% 5 656.7 631.3 613.1 656.7 18.3
5. moz.org01.generic-zlb.sj.mozilla.com 40.0% 5 631.7 631.6 625.6 637.4 5.9

HOST: admin1a.private.par1.mozilla.com Loss% Snt Last Avg Best Wrst StDev
1. fw1.private.par1.mozilla.net 0.0% 5 0.6 0.6 0.6 0.6 0.0
2. vlan402.asr1.cdg3.gblx.net 0.0% 5 486.5 488.9 462.2 505.9 16.4
3. 72.14.198.169 0.0% 5 473.5 497.7 473.5 530.0 21.1
4. 209.85.251.40 0.0% 5 498.4 537.3 475.0 630.6 63.7
5. 209.85.253.20 0.0% 5 476.6 488.4 472.9 507.9 14.7
6. 209.85.250.161 0.0% 5 509.4 494.7 476.6 519.2 18.5
7. ??? 100.0 5 0.0 0.0 0.0 0.0 0.0
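For anyone triaging report-style captures like the ones in this comment, here is a minimal Python sketch of one way to read them. It is not the tooling behind the probes; the `Hop` type, the regular expression, and the 100 ms threshold are illustrative assumptions. It parses each hop row and returns the first hop whose average latency is elevated.

```python
import re
from typing import Iterable, NamedTuple, Optional


class Hop(NamedTuple):
    """One hop row of an mtr report (illustrative type, not an mtr API)."""
    index: int
    host: str
    loss_pct: float
    sent: int
    last: float
    avg: float
    best: float
    worst: float
    stdev: float


# Matches rows such as
# "2. vlan402.asr1.cdg3.gblx.net 0.0% 5 263.4 279.6 263.4 298.9 16.1".
# The "%" is optional because the "???" rows in the capture omit it.
ROW_RE = re.compile(
    r"^\s*(\d+)\.\s+(\S+)\s+([\d.]+)%?\s+(\d+)"
    r"\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s*$"
)


def parse_hop(line: str) -> Optional[Hop]:
    """Return a Hop for a hop row, or None for headers and separators."""
    m = ROW_RE.match(line)
    if m is None:
        return None
    idx, host, loss, sent, last, avg, best, worst, stdev = m.groups()
    return Hop(int(idx), host, float(loss), int(sent), float(last),
               float(avg), float(best), float(worst), float(stdev))


def first_slow_hop(hops: Iterable[Hop],
                   threshold_ms: float = 100.0) -> Optional[Hop]:
    """First responding hop whose average RTT reaches the assumed threshold."""
    for hop in hops:
        if hop.loss_pct < 100.0 and hop.avg >= threshold_ms:
            return hop
    return None


# Rows copied from the 20111124T1116 capture to www.mozilla.org.
SAMPLE = """\
HOST: admin1a.private.par1.mozilla.com Loss% Snt Last Avg Best Wrst StDev
1. fw1.private.par1.mozilla.net 0.0% 5 0.6 0.6 0.6 0.6 0.0
2. vlan402.asr1.cdg3.gblx.net 0.0% 5 263.4 279.6 263.4 298.9 16.1
3. MOZILLA.TenGigabitEthernet2-1.ar1.SNV2.gblx.net 0.0% 5 445.5 418.0 404.0 445.5 16.4
"""

hops = [h for h in map(parse_hop, SAMPLE.splitlines()) if h is not None]
slow = first_slow_hop(hops)
if slow is not None:
    print(slow.index, slow.host, slow.avg)
```

On the sample rows this selects hop 2, the first carrier hop after the office firewall, which is where the latency jumps in every capture.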
Status: NEW → ASSIGNED
The 10M cap may still be contributing to perceived slowness, but high loss was captured during the window in comment 9. Bug 705242 was opened to track the case with the vendor Level3 (LVLT).
The issue just happened again, at 4:50PM Paris time. ping google.com:
64 bytes from 173.194.67.99: icmp_seq=50 ttl=52 time=398.082 ms
64 bytes from 173.194.67.99: icmp_seq=51 ttl=52 time=318.163 ms
64 bytes from 173.194.67.99: icmp_seq=52 ttl=52 time=339.918 ms
64 bytes from 173.194.67.99: icmp_seq=53 ttl=52 time=364.045 ms
64 bytes from 173.194.67.99: icmp_seq=54 ttl=52 time=386.930 ms
64 bytes from 173.194.67.99: icmp_seq=55 ttl=52 time=410.850 ms
64 bytes from 173.194.67.99: icmp_seq=56 ttl=52 time=10.159 ms
64 bytes from 173.194.67.99: icmp_seq=57 ttl=52 time=9.711 ms
64 bytes from 173.194.67.99: icmp_seq=58 ttl=52 time=9.640 ms
64 bytes from 173.194.67.99: icmp_seq=59 ttl=52 time=9.626 ms
64 bytes from 173.194.67.99: icmp_seq=60 ttl=52 time=11.433 ms
64 bytes from 173.194.67.99: icmp_seq=61 ttl=52 time=9.718 ms
64 bytes from 173.194.67.99: icmp_seq=62 ttl=52 time=9.755 ms
^C
--- google.com ping statistics ---
63 packets transmitted, 60 packets received, 4.8% packet loss
round-trip min/avg/max/stddev = 8.064/174.079/454.501/149.395 ms
MacBook-Air-de-Tristan:~ tristan$
Just happened again, at 7:56PM Paris time (10:56AM Pacific)
MacBook-Air-de-Tristan:~ tristan$ ping google.com
PING google.com (173.194.66.147): 56 data bytes
Request timeout for icmp_seq 0
64 bytes from 173.194.66.147: icmp_seq=1 ttl=52 time=409.117 ms
64 bytes from 173.194.66.147: icmp_seq=2 ttl=52 time=431.927 ms
64 bytes from 173.194.66.147: icmp_seq=3 ttl=52 time=280.641 ms
64 bytes from 173.194.66.147: icmp_seq=4 ttl=52 time=376.679 ms
64 bytes from 173.194.66.147: icmp_seq=5 ttl=52 time=399.380 ms
64 bytes from 173.194.66.147: icmp_seq=6 ttl=52 time=422.269 ms
64 bytes from 173.194.66.147: icmp_seq=7 ttl=52 time=342.761 ms
64 bytes from 173.194.66.147: icmp_seq=8 ttl=52 time=309.102 ms
64 bytes from 173.194.66.147: icmp_seq=9 ttl=52 time=388.605 ms
64 bytes from 173.194.66.147: icmp_seq=10 ttl=52 time=304.855 ms
64 bytes from 173.194.66.147: icmp_seq=11 ttl=52 time=434.135 ms
Request timeout for icmp_seq 12
64 bytes from 173.194.66.147: icmp_seq=13 ttl=52 time=478.975 ms
64 bytes from 173.194.66.147: icmp_seq=14 ttl=52 time=400.752 ms
64 bytes from 173.194.66.147: icmp_seq=15 ttl=52 time=321.203 ms
64 bytes from 173.194.66.147: icmp_seq=16 ttl=52 time=299.682 ms
64 bytes from 173.194.66.147: icmp_seq=17 ttl=52 time=469.187 ms
64 bytes from 173.194.66.147: icmp_seq=18 ttl=52 time=363.828 ms
64 bytes from 173.194.66.147: icmp_seq=19 ttl=52 time=412.463 ms
64 bytes from 173.194.66.147: icmp_seq=20 ttl=52 time=435.318 ms
64 bytes from 173.194.66.147: icmp_seq=21 ttl=52 time=291.620 ms
64 bytes from 173.194.66.147: icmp_seq=22 ttl=52 time=329.747 ms
64 bytes from 173.194.66.147: icmp_seq=23 ttl=52 time=402.978 ms
64 bytes from 173.194.66.147: icmp_seq=24 ttl=52 time=308.441 ms
64 bytes from 173.194.66.147: icmp_seq=25 ttl=52 time=9.579 ms
64 bytes from 173.194.66.147: icmp_seq=26 ttl=52 time=7.882 ms
64 bytes from 173.194.66.147: icmp_seq=27 ttl=52 time=9.689 ms
64 bytes from 173.194.66.147: icmp_seq=28 ttl=52 time=7.818 ms
64 bytes from 173.194.66.147: icmp_seq=29 ttl=52 time=10.972 ms
64 bytes from 173.194.66.147: icmp_seq=30 ttl=52 time=9.557 ms
64 bytes from 173.194.66.147: icmp_seq=31 ttl=52 time=9.437 ms
^C
--- google.com ping statistics ---
32 packets transmitted, 30 packets received, 6.2% packet loss
round-trip min/avg/max/stddev = 7.818/289.287/478.975/162.716 ms
MacBook-Air-de-Tristan:~ tristan$
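The summary average in these ping runs blends two very different regimes (roughly 300 to 480 ms during the event, about 10 ms afterwards). A rough Python sketch of separating them follows. The function name and the 100 ms cut-off are assumptions for illustration, not something used in this bug.

```python
import re
import statistics

# Matches the "time=409.117 ms" field in both the OS X and Linux ping
# output formats pasted in this bug.
RTT_RE = re.compile(r"time=([\d.]+) ms")


def split_rtts(ping_output: str, threshold_ms: float = 100.0):
    """Split reply RTTs into (slow, fast) lists around an assumed threshold."""
    rtts = [float(v) for v in RTT_RE.findall(ping_output)]
    slow = [r for r in rtts if r >= threshold_ms]
    fast = [r for r in rtts if r < threshold_ms]
    return slow, fast


# Four replies copied from the 7:56PM capture, spanning the recovery.
SAMPLE = """\
64 bytes from 173.194.66.147: icmp_seq=23 ttl=52 time=402.978 ms
64 bytes from 173.194.66.147: icmp_seq=24 ttl=52 time=308.441 ms
64 bytes from 173.194.66.147: icmp_seq=25 ttl=52 time=9.579 ms
64 bytes from 173.194.66.147: icmp_seq=26 ttl=52 time=7.882 ms
"""

slow, fast = split_rtts(SAMPLE)
print("slow replies:", len(slow), "median ms:", statistics.median(slow))
print("fast replies:", len(fast), "median ms:", statistics.median(fast))
```

Reporting the two groups separately makes the abrupt recovery between icmp_seq 24 and 25 easier to see than the single min/avg/max line does.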
Bandwidth has been raised to 30M so symptoms directly related to this bug are resolved. Bug 705242 will remain open in case that issue is unrelated.
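To give a sense of scale for the change from 10M to 30M, here is a back-of-the-envelope Python sketch. It assumes "10M" and "30M" mean 10 and 30 Mbit/s, that a single transfer gets the whole link, and it uses a hypothetical 700 MB image size, since the actual download size is not stated in this bug.

```python
def transfer_seconds(size_bytes: int, rate_mbit_per_s: float) -> float:
    """Ideal transfer time, ignoring protocol overhead and other traffic."""
    return size_bytes * 8 / (rate_mbit_per_s * 1_000_000)


# Hypothetical image size; not taken from the bug.
IMAGE_BYTES = 700 * 1_000_000

for rate in (10, 30):
    secs = transfer_seconds(IMAGE_BYTES, rate)
    # 10 Mbit/s gives 560 s (about 9.3 min); 30 Mbit/s gives about 3.1 min.
    print(f"{rate} Mbit/s: {secs:.0f} s ({secs / 60:.1f} min)")
```

Under these assumptions one large download can fill a 10 Mbit/s link for several minutes, which is consistent with the office-wide slowness described in comment 0.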
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Blake is currently cloning the b2g repo and the ping is very bad again:
PING google.com (173.194.34.16) 56(84) bytes of data.
64 bytes from par03s02-in-f16.1e100.net (173.194.34.16): icmp_req=1 ttl=59 time=376 ms
64 bytes from par03s02-in-f16.1e100.net (173.194.34.16): icmp_req=2 ttl=59 time=383 ms
64 bytes from par03s02-in-f16.1e100.net (173.194.34.16): icmp_req=3 ttl=59 time=391 ms
64 bytes from par03s02-in-f16.1e100.net (173.194.34.16): icmp_req=4 ttl=59 time=396 ms
64 bytes from par03s02-in-f16.1e100.net (173.194.34.16): icmp_req=5 ttl=59 time=419 ms
64 bytes from par03s02-in-f16.1e100.net (173.194.34.16): icmp_req=6 ttl=59 time=443 ms
64 bytes from par03s02-in-f16.1e100.net (173.194.34.16): icmp_req=7 ttl=59 time=486 ms
64 bytes from par03s02-in-f16.1e100.net (173.194.34.16): icmp_req=8 ttl=59 time=546 ms
64 bytes from par03s02-in-f16.1e100.net (173.194.34.16): icmp_req=9 ttl=59 time=486 ms
64 bytes from par03s02-in-f16.1e100.net (173.194.34.16): icmp_req=10 ttl=59 time=2.69 ms
64 bytes from par03s02-in-f16.1e100.net (173.194.34.16): icmp_req=11 ttl=59 time=224 ms
64 bytes from par03s02-in-f16.1e100.net (173.194.34.16): icmp_req=12 ttl=59 time=349 ms
64 bytes from par03s02-in-f16.1e100.net (173.194.34.16): icmp_req=13 ttl=59 time=290 ms
64 bytes from par03s02-in-f16.1e100.net (173.194.34.16): icmp_req=14 ttl=59 time=323 ms
64 bytes from par03s02-in-f16.1e100.net (173.194.34.16): icmp_req=16 ttl=59 time=2.75 ms
64 bytes from par03s02-in-f16.1e100.net (173.194.34.16): icmp_req=17 ttl=59 time=266 ms
64 bytes from par03s02-in-f16.1e100.net (173.194.34.16): icmp_req=18 ttl=59 time=255 ms
64 bytes from par03s02-in-f16.1e100.net (173.194.34.16): icmp_req=19 ttl=59 time=259 ms
64 bytes from par03s02-in-f16.1e100.net (173.194.34.16): icmp_req=20 ttl=59 time=280 ms
64 bytes from par03s02-in-f16.1e100.net (173.194.34.16): icmp_req=21 ttl=59 time=302 ms
64 bytes from par03s02-in-f16.1e100.net (173.194.34.16): icmp_req=22 ttl=59 time=327 ms
^C
--- google.com ping statistics ---
22 packets transmitted, 21 received, 4% packet loss, time 21023ms
rtt min/avg/max/mdev = 2.690/324.592/546.100/133.476 ms
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
ICMP is not a suitable test in this case unless it is accompanied by slowness to all other connections in the office.
(In reply to Ravi Pina [:ravi] from comment #15)
> ICMP is not a suitable test in this case unless it is accompanied by
> slowness to all other connections in the office.
I'm not sure I understand what you meant.
Routers can de-prioritize ICMP which makes ping not such a great debugging tool. Might you have mtr installed?
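Since ICMP may be de-prioritized, one complementary check is to time a TCP handshake instead, which is closer to what ssh and web traffic experience. The following is a minimal Python sketch using only the standard library; the function name and the 3 second timeout are illustrative assumptions, and this is not a replacement for mtr.

```python
import socket
import time


def tcp_connect_ms(host: str, port: int, timeout: float = 3.0) -> float:
    """Time one TCP connection setup to host:port, in milliseconds.

    If `host` is a name, the measurement also includes DNS resolution.
    Raises OSError (including timeouts) if the connection fails.
    """
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        elapsed = time.perf_counter() - start
    return elapsed * 1000.0
```

For example, calling `tcp_connect_ms("www.mozilla.org", 443)` a few times during a slow period and again afterwards would show whether TCP setup time rises along with the ICMP round-trip times.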
(In reply to matthew zeier [:mrz] from comment #17)
> Routers can de-prioritize ICMP which makes ping not such a great debugging
> tool.
I used ping to show some data. I'm actually noticing those issues while using ssh. Basically, remote connections are not usable in such situations because of the latency.
> Might you have mtr installed?
I could install it.
That's good info. Which hosts are you ssh'd to?
(In reply to matthew zeier [:mrz] from comment #19)
> Which hosts are you ssh'd to?
I'm ssh'ing to oldworld.fr, which is a personal server based in France. I normally have no particular connection issue with it (it has 100M/100M bandwidth). I believe other people are ssh'ing to other servers.
After my visit to the Paris office I learned that the issues have subsided to the point where they're no longer a problem. The office is moving soon and we will ensure that the new location has suitable bandwidth, infrastructure, and comms room.
Status: REOPENED → RESOLVED
Closed: 14 years ago → 13 years ago
Resolution: --- → INCOMPLETE
Product: mozilla.org → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard