Closed Bug 1307656 Opened 3 years ago Closed 3 years ago

Intermittent-infra httplib.BadStatusLine from archiver_client.py

Categories

(Release Engineering :: General, defect)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: philor, Assigned: nthomas)

Details

(Keywords: intermittent-failure)

Attachments

(1 file)

Though with no parseable error message, you'll never know how often it happens.

https://treeherder.mozilla.org/logviewer.html#?job_id=4488745&repo=autoland#L45-L105

 2016-10-04 17:47:23,134 Getting archive location from https://api.pub.build.mozilla.org/archiver/hgmo/integration/autoland/e912950f0968?&preferred_region=us-west-2&suffix=tar.gz&subdir=testing/mozharness
 Traceback (most recent call last):
   File "/tools/checkouts/build-tools/buildfarm/utils/archiver_client.py", line 313, in <module>
     main()
   File "/tools/checkouts/build-tools/buildfarm/utils/archiver_client.py", line 307, in main
     archiver(api_url=api_url, config_key=config, options=options)
   File "/tools/checkouts/build-tools/buildfarm/utils/archiver_client.py", line 194, in archiver
     response = get_url_response(api_url, options)
   File "/tools/checkouts/build-tools/buildfarm/utils/archiver_client.py", line 127, in get_url_response
     response = urllib2.urlopen(api_url, timeout=60)
   File "/usr/lib64/python2.6/urllib2.py", line 126, in urlopen
     return _opener.open(url, data, timeout)
   File "/usr/lib64/python2.6/urllib2.py", line 391, in open
     response = self._open(req, data)
   File "/usr/lib64/python2.6/urllib2.py", line 409, in _open
     '_open', req)
   File "/usr/lib64/python2.6/urllib2.py", line 369, in _call_chain
     result = func(*args)
   File "/usr/lib64/python2.6/urllib2.py", line 1198, in https_open
     return self.do_open(httplib.HTTPSConnection, req)
   File "/usr/lib64/python2.6/urllib2.py", line 1163, in do_open
     r = h.getresponse()
   File "/usr/lib64/python2.6/httplib.py", line 990, in getresponse
     response.begin()
   File "/usr/lib64/python2.6/httplib.py", line 391, in begin
     version, status, reason = self._read_status()
   File "/usr/lib64/python2.6/httplib.py", line 355, in _read_status
     raise BadStatusLine(line)
httplib.BadStatusLine
Summary: Intermittent httplib.BadStatusLine from archiver_client.py → Intermittent-infra httplib.BadStatusLine from archiver_client.py
jlund, do you have time to look into this? Looks like there's a try/except which doesn't catch BadStatusLine at http://hg.mozilla.org/build/tools/file/default/buildfarm/utils/archiver_client.py#l125, but do we have any logs that would tell us if the API end is sick/needs moar oomph?
Flags: needinfo?(jlund)
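For context on why an existing try/except can miss this error: BadStatusLine derives from HTTPException, not from the urllib error types, so a handler that names only URLError or HTTPError lets it through. The sketch below uses the Python 3 module names (http.client and urllib.error; the traceback above is Python 2.6, where they are httplib and urllib2). The helper catches_bad_status_line is hypothetical, and what the handler in archiver_client.py actually catches is an assumption here, not something confirmed in this bug.

```python
import http.client
import urllib.error


def catches_bad_status_line(handled):
    """Return True if `except handled:` would catch a BadStatusLine.

    `handled` is an exception class or a tuple of classes, exactly as it
    would appear in an except clause. This helper is illustrative only.
    """
    try:
        # Simulate the server closing the connection without a status line.
        raise http.client.BadStatusLine("")
    except handled:
        return True
    except http.client.HTTPException:
        # Fell through the caller's handler; only the base class caught it.
        return False


# A handler written only for urllib errors misses it:
#   catches_bad_status_line(urllib.error.URLError)      -> False
# Adding the http.client base class catches it:
#   catches_bad_status_line(http.client.HTTPException)  -> True
```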
Logs for api.pub.build.mozilla.org indicate 504 responses for several /archiver requests at 14:46 Pacific today, on both backends. Several other services had a similar issue for a couple of minutes. The error log has a bunch of this, which presumably explains the malformed response that the Python script doesn't like:

[Sun Oct 09 14:47:25 2016] [error] [client 63.245.214.82] Timeout when reading response headers from daemon process 'relengapi': /data/www/relengapi/relengapi.wsgi

Nothing much in newrelic, except for a spike in 'Request Queuing' up to 12s at the time of interest. Looks like all the httpd and relengapi (wsgi) processes have recycled since then.
Flags: needinfo?(jlund)
I can't find any logs for the wsgi process, so let's retry if we get an unexpected response. The server fixed itself up this time, and with backoff in the retry we hope to give it a chance to do so again.
Assignee: nobody → nthomas
Attachment #8799323 - Flags: review?(bugspam.Callek)
Attachment #8799323 - Flags: review?(bugspam.Callek) → review+
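A minimal sketch of the retry-with-backoff idea described above, not the contents of the attached patch. It is written with Python 3 module names (the script in the traceback runs on Python 2.6 with urllib2 and httplib). The function name get_url_response_with_retry, the attempts and base_delay defaults, and the injectable opener and sleep parameters are all assumptions made for illustration; only the 60-second timeout comes from the traceback.

```python
import http.client
import time
import urllib.error
import urllib.request


def get_url_response_with_retry(api_url, attempts=5, base_delay=2.0,
                                opener=urllib.request.urlopen,
                                sleep=time.sleep):
    """Fetch api_url, retrying with exponential backoff on transient errors.

    Hypothetical helper: the name and defaults are illustrative. `opener`
    and `sleep` are injectable so the loop can be exercised without a
    network connection or real delays.
    """
    # HTTPException covers BadStatusLine; URLError covers connection
    # failures and, through its HTTPError subclass, responses such as 504.
    retriable = (http.client.HTTPException, urllib.error.URLError)
    for attempt in range(1, attempts + 1):
        try:
            return opener(api_url, timeout=60)
        except retriable as exc:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # Double the wait each time so a struggling server can recover.
            delay = base_delay * 2 ** (attempt - 1)
            print("Attempt %d failed (%r); retrying in %.0fs"
                  % (attempt, exc, delay))
            sleep(delay)
```

As written this also retries permanent HTTP errors such as 404, since HTTPError is a URLError; a real client would probably fail fast on those and retry only 5xx responses and malformed replies.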
Sheriffs, please reopen if the fix doesn't work as expected.
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Component: General Automation → General