Closed Bug 1062910 Opened 10 years ago Closed 10 years ago

Frequent proxxy timeouts on usw2

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task)

task
Not set
blocker

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: RyanVM, Unassigned)

References

Details

All trees closed.

https://tbpl.mozilla.org/php/getParsedLog.php?id=47404097&tree=B2g-Inbound

07:26:05     INFO - #####
07:26:05     INFO - ##### Running download-and-extract step.
07:26:05     INFO - #####
07:26:05     INFO - Running pre-action listener: _resource_record_pre_action
07:26:05     INFO - Running main action method: download_and_extract
07:26:05     INFO - Replacing url http://ftp.mozilla.org/pub/mozilla.org/b2g/tinderbox-builds/b2g-inbound-linux32_gecko/1409838460/en-US/b2g-35.0a1.en-US.linux-i686.tests.zip -> https://ftp-ssl.mozilla.org/pub/mozilla.org/b2g/tinderbox-builds/b2g-inbound-linux32_gecko/1409838460/en-US/b2g-35.0a1.en-US.linux-i686.tests.zip
07:26:05     INFO - Replacing url http://ftp.mozilla.org/pub/mozilla.org/b2g/tinderbox-builds/b2g-inbound-linux32_gecko/1409838460/en-US/b2g-35.0a1.en-US.linux-i686.tar.bz2 -> https://ftp-ssl.mozilla.org/pub/mozilla.org/b2g/tinderbox-builds/b2g-inbound-linux32_gecko/1409838460/en-US/b2g-35.0a1.en-US.linux-i686.tar.bz2
07:26:05     INFO - proxxy config: {'regions': ['.use1.', '.usw2.'], 'instances': ['proxxy.srv.releng.use1.mozilla.com', 'proxxy.srv.releng.usw2.mozilla.com'], 'urls': [('http://ftp.mozilla.org', 'ftp.mozilla.org'), ('https://ftp.mozilla.org', 'ftp.mozilla.org'), ('https://ftp-ssl.mozilla.org', 'ftp.mozilla.org'), ('http://pvtbuilds.pvt.build.mozilla.org', 'pvtbuilds.mozilla.org'), ('http://runtime-binaries.pvt.build.mozilla.org', 'runtime-binaries.pvt.build.mozilla.org')]}
07:26:05     INFO - https://ftp-ssl.mozilla.org/pub/mozilla.org/b2g/tinderbox-builds/b2g-inbound-linux32_gecko/1409838460/en-US/b2g-35.0a1.en-US.linux-i686.tests.zip matches https://ftp-ssl.mozilla.org
07:26:05     INFO - URL Candidate: http://ftp.mozilla.org.proxxy.srv.releng.usw2.mozilla.com/pub/mozilla.org/b2g/tinderbox-builds/b2g-inbound-linux32_gecko/1409838460/en-US/b2g-35.0a1.en-US.linux-i686.tests.zip
07:26:05     INFO - trying http://ftp.mozilla.org.proxxy.srv.releng.usw2.mozilla.com/pub/mozilla.org/b2g/tinderbox-builds/b2g-inbound-linux32_gecko/1409838460/en-US/b2g-35.0a1.en-US.linux-i686.tests.zip
07:26:05     INFO - mkdir: /builds/slave/test/build
07:26:05     INFO - Downloading http://ftp.mozilla.org.proxxy.srv.releng.usw2.mozilla.com/pub/mozilla.org/b2g/tinderbox-builds/b2g-inbound-linux32_gecko/1409838460/en-US/b2g-35.0a1.en-US.linux-i686.tests.zip to /builds/slave/test/build/b2g-35.0a1.en-US.linux-i686.tests.zip
07:26:05     INFO - retry: Calling <bound method B2GDesktopTest._download_file of <__main__.B2GDesktopTest object at 0x8b8156c>> with args: ('http://ftp.mozilla.org.proxxy.srv.releng.usw2.mozilla.com/pub/mozilla.org/b2g/tinderbox-builds/b2g-inbound-linux32_gecko/1409838460/en-US/b2g-35.0a1.en-US.linux-i686.tests.zip', '/builds/slave/test/build/b2g-35.0a1.en-US.linux-i686.tests.zip'), kwargs: {}, attempt #1
07:26:35  WARNING - Timed out accessing http://ftp.mozilla.org.proxxy.srv.releng.usw2.mozilla.com/pub/mozilla.org/b2g/tinderbox-builds/b2g-inbound-linux32_gecko/1409838460/en-US/b2g-35.0a1.en-US.linux-i686.tests.zip: timed out
07:26:35     INFO - retry: Failed, sleeping 30 seconds before retrying
07:27:05     INFO - retry: Calling <bound method B2GDesktopTest._download_file of <__main__.B2GDesktopTest object at 0x8b8156c>> with args: ('http://ftp.mozilla.org.proxxy.srv.releng.usw2.mozilla.com/pub/mozilla.org/b2g/tinderbox-builds/b2g-inbound-linux32_gecko/1409838460/en-US/b2g-35.0a1.en-US.linux-i686.tests.zip', '/builds/slave/test/build/b2g-35.0a1.en-US.linux-i686.tests.zip'), kwargs: {}, attempt #2
07:27:36  WARNING - Timed out accessing http://ftp.mozilla.org.proxxy.srv.releng.usw2.mozilla.com/pub/mozilla.org/b2g/tinderbox-builds/b2g-inbound-linux32_gecko/1409838460/en-US/b2g-35.0a1.en-US.linux-i686.tests.zip: timed out
07:27:36     INFO - retry: Failed, sleeping 60 seconds before retrying
07:28:36     INFO - retry: Calling <bound method B2GDesktopTest._download_file of <__main__.B2GDesktopTest object at 0x8b8156c>> with args: ('http://ftp.mozilla.org.proxxy.srv.releng.usw2.mozilla.com/pub/mozilla.org/b2g/tinderbox-builds/b2g-inbound-linux32_gecko/1409838460/en-US/b2g-35.0a1.en-US.linux-i686.tests.zip', '/builds/slave/test/build/b2g-35.0a1.en-US.linux-i686.tests.zip'), kwargs: {}, attempt #3
07:29:06  WARNING - Timed out accessing http://ftp.mozilla.org.proxxy.srv.releng.usw2.mozilla.com/pub/mozilla.org/b2g/tinderbox-builds/b2g-inbound-linux32_gecko/1409838460/en-US/b2g-35.0a1.en-US.linux-i686.tests.zip: timed out
07:29:06    FATAL - Can't download from http://ftp.mozilla.org.proxxy.srv.releng.usw2.mozilla.com/pub/mozilla.org/b2g/tinderbox-builds/b2g-inbound-linux32_gecko/1409838460/en-US/b2g-35.0a1.en-US.linux-i686.tests.zip to /builds/slave/test/build/b2g-35.0a1.en-US.linux-i686.tests.zip!
07:29:06    FATAL - Caught exception: timed out
07:29:06    FATAL - Caught exception: timed out
07:29:06    FATAL - Caught exception: timed out
07:29:06    FATAL - Running post_fatal callback...
07:29:06    FATAL - Exiting -1
Here's a raw log entry from affected proxxy instance:

{ "@timestamp": "2014-09-04T14:29:14+00:00", "@fields": { "remote_addr": "10.132.158.137", "host": "ftp.mozilla.org.proxxy.srv.releng.usw2.mozilla.com", "upstream_cache_status": "MISS", "upstream_response_time": "189.864", "body_bytes_sent": "142219886", "request_time": "189.868", "status": "200", "request": "GET /pub/mozilla.org/b2g/tinderbox-builds/b2g-inbound-linux32_gecko/1409838460/en-US/b2g-35.0a1.en-US.linux-i686.tests.zip HTTP/1.1", "request_method": "GET", "http_referrer": "-", "http_user_agent": "Python-urllib/2.7" } }

It looks like it took more than 3 minutes to download the file from ftp.m.o.
Some concurrent requests for the same file were blocked and eventually failed like this:

{ "@timestamp": "2014-09-04T14:30:15+00:00", "@fields": { "remote_addr": "10.132.57.192", "host": "ftp.mozilla.org.proxxy.srv.releng.usw2.mozilla.com", "upstream_cache_status": "HIT", "upstream_response_time": "-", "body_bytes_sent": "79159", "request_time": "249.153", "status": "200", "request": "GET /pub/mozilla.org/b2g/tinderbox-builds/b2g-inbound-linux32_gecko/1409838460/en-US/b2g-35.0a1.en-US.linux-i686.tests.zip HTTP/1.1", "request_method": "GET", "http_referrer": "-", "http_user_agent": "Python-urllib/2.7"} }

(the client aborted before downloading the whole file, thus body_bytes_sent is so small)
Blocks: 1014659
(In reply to Chris AtLee [:catlee] from comment #2)
> Backed out http://hg.mozilla.org/build/mozharness/rev/279414b34e2c and
> pushed to production

resolving per that
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Product: Release Engineering → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard
You need to log in before you can comment on or make changes to this bug.