Closed
Bug 972446
Opened 10 years ago
Closed 10 years ago
Seeing timeouts when fetching from tooltool
Categories
(Infrastructure & Operations Graveyard :: NetOps, task)
Tracking
(Not tracked)
RESOLVED
WORKSFORME
People
(Reporter: armenzg, Assigned: adam)
Details
https://tbpl.mozilla.org/php/getParsedLog.php?id=34624410&tree=Mozilla-Inbound

Running locally on the machine, I can see that we can only transfer at 100-200 KB/s, and the files are going to take longer than 300 seconds to download.

Fetching...
retry: Calling <function run_with_timeout at 0x7fd3530a0410> with args: (['/tools/tooltool.py', '--url', 'http://runtime-binaries.pvt.build.mozilla.org/tooltool', '--overwrite', '-m', 'mobile/android/config/tooltool-manifests/android-armv6/releng.manifest', 'fetch'], 300, None, None, False, True), kwargs: {}, attempt #1
Executing: ['/tools/tooltool.py', '--url', 'http://runtime-binaries.pvt.build.mozilla.org/tooltool', '--overwrite', '-m', 'mobile/android/config/tooltool-manifests/android-armv6/releng.manifest', 'fetch']
WARNING: Timeout (300) exceeded, killing process 2286
Traceback (most recent call last):
  File "/tools/tooltool.py", line 835, in <module>
    main()
  File "/tools/tooltool.py", line 832, in main
    exit(0 if process_command(options, args) else 1)
  File "/tools/tooltool.py", line 727, in process_command
    cache_folder=options['cache_folder'])
  File "/tools/tooltool.py", line 508, in fetch_files
    temp_file_name = fetch_file(base_urls, f)
  File "/tools/tooltool.py", line 425, in fetch_file
    indata = f.read(grabchunk)
  File "/tools/python27/lib/python2.7/socket.py", line 380, in read
    data = self._sock.recv(left)
  File "/tools/python27/lib/python2.7/httplib.py", line 561, in read
    s = self.fp.read(amt)
  File "/tools/python27/lib/python2.7/socket.py", line 380, in read
    data = self._sock.recv(left)
KeyboardInterrupt
retry: Failed, sleeping 30 seconds before retrying
retry: Calling <function run_with_timeout at 0x7fd3530a0410> with args: (['/tools/tooltool.py', '--url', 'http://runtime-binaries.pvt.build.mozilla.org/tooltool', '--overwrite', '-m', 'mobile/android/config/tooltool-manifests/android-armv6/releng.manifest', 'fetch'], 300, None, None, False, True), kwargs: {}, attempt #2
Executing: ['/tools/tooltool.py', '--url', 'http://runtime-binaries.pvt.build.mozilla.org/tooltool', '--overwrite', '-m', 'mobile/android/config/tooltool-manifests/android-armv6/releng.manifest', 'fetch']
WARNING: Timeout (300) exceeded, killing process 2320
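To make the timeout arithmetic concrete, here is a rough sketch of how much data can arrive before the 300-second limit at the observed 100-200 KB/s. The function names are hypothetical, 1 KB is assumed to mean 1024 bytes, and the actual sizes of the files in the manifest are not given in this bug, so the 50 MiB figure in the usage lines is purely illustrative.

```python
def max_transfer_bytes(rate_kb_per_s: float, timeout_s: float) -> int:
    """Upper bound on bytes received before the timeout fires.

    Assumes a constant transfer rate and 1 KB = 1024 bytes.
    """
    return int(rate_kb_per_s * 1024 * timeout_s)


def exceeds_timeout(size_bytes: int, rate_kb_per_s: float,
                    timeout_s: float = 300) -> bool:
    """True if a file of size_bytes cannot finish within timeout_s."""
    return size_bytes > max_transfer_bytes(rate_kb_per_s, timeout_s)


if __name__ == "__main__":
    # Observed rates from the description; 300 s is the timeout in the log.
    for rate in (100, 200):
        limit_mib = max_transfer_bytes(rate, 300) / (1024 * 1024)
        print(f"{rate} KB/s -> at most {limit_mib:.1f} MiB in 300 s")

    # Hypothetical 50 MiB file, not a size taken from the manifest.
    print(exceeds_timeout(50 * 1024 * 1024, 100))
```

Under these assumptions, any single fetch larger than roughly 30-60 MiB would be killed on every attempt, so retrying with the same timeout would not help.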
Comment 1•10 years ago
All trees closed as of 09:24 due to this.
Reporter
Updated•10 years ago
Assignee: nobody → network-operations
Severity: normal → critical
Component: General Automation → NetOps
Product: Release Engineering → Infrastructure & Operations
QA Contact: catlee → adam
Version: unspecified → other
Comment 2•10 years ago
From #aws:
09:27 < XioNoX> found it, usw2
09:28 < XioNoX> according to smokeping, something with USW2 isn't very happy right now
09:28 < XioNoX> http://netops2.private.scl3.mozilla.com/smokeping/sm.cgi?target=Datacenters.RELENG-SCL3.nagios1-releng-usw2
09:28 < XioNoX> some packet loss on that link, even if latency is fine
09:30 < XioNoX> traffic is not too high, SPUs aren't overloaded
09:58 < hwine> XioNoX: should we open a case with AWS? trees closed
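As a rough illustration of why packet loss alone can depress transfer rates even when latency looks fine, the sketch below uses the well-known Mathis approximation for steady-state TCP throughput, rate ≈ (MSS / RTT) · (C / √p) with C = √(3/2). This is not something the participants computed; the function name is hypothetical, and the MSS, round-trip time, and loss rate in the usage lines are assumed values, since the bug does not record the measured ones.

```python
import math


def mathis_throughput_bytes_per_s(mss_bytes: float, rtt_s: float,
                                  loss_rate: float) -> float:
    """Approximate upper bound on TCP throughput under random loss.

    Uses the Mathis et al. approximation; it is only a rough model and
    ignores timeouts, window limits, and bursty loss.
    """
    if rtt_s <= 0:
        raise ValueError("rtt_s must be positive")
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in (0, 1]")
    c = math.sqrt(3.0 / 2.0)
    return (mss_bytes / rtt_s) * (c / math.sqrt(loss_rate))


if __name__ == "__main__":
    # Assumed values for illustration only: 1460-byte MSS, 50 ms RTT.
    for loss in (0.001, 0.01, 0.05):
        rate = mathis_throughput_bytes_per_s(1460, 0.05, loss)
        print(f"loss {loss:.1%}: about {rate / 1024:.0f} KB/s")
```

In this model, throughput falls with the square root of the loss rate at a fixed round-trip time, which is consistent with a link that pings normally but transfers slowly.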
Comment 3•10 years ago
We switched over to the other tunnel and things seem stable. Trees reopened.
Comment 4•10 years ago
Lowering severity since switching tunnels resolved the issue.
Severity: critical → normal
Reporter
Comment 5•10 years ago
What was the root cause? Thanks!
Updated•10 years ago
QA Contact: adam → jbarnell
Comment 6•10 years ago
Since switching to the second AWS tunnel solved the issue, I'd guess the problem was on one of Amazon's endpoints. Did anyone open a ticket with them?
Assignee
Updated•10 years ago
Assignee: network-operations → adam
Assignee
Comment 7•10 years ago
Closing because we have other bugs tracking the packet loss issues with AWS.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WORKSFORME
Updated•2 years ago
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard