Closed Bug 1203176 Opened 9 years ago Closed 3 years ago

client libraries should handle end-points that return binary-blobs

Categories

(Taskcluster :: Services, defect, P5)

defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: bhearsum, Unassigned)

References

Details

I was fiddling around with this today and noticed that the client hangs when running this method. I let it run for ~10min, and it spiked my cpu, and never returned. I got the following stack after ^C:
----> 1 resp = index.findArtifactFromTask("buildbot.revisions.00008b664c22.mozilla-inbound.win32", "public/build/firefox-42.0a1.en-US.win32.installer.exe")

/home/bhearsum/.virtualenvs/bbb/lib/python2.7/site-packages/taskcluster/client.py in apiCall(self, *args, **kwargs)
    453       def addApiCall(e):
    454         def apiCall(self, *args, **kwargs):
--> 455           return self._makeApiCall(e, *args, **kwargs)
    456         return apiCall
    457 

/home/bhearsum/.virtualenvs/bbb/lib/python2.7/site-packages/taskcluster/client.py in _makeApiCall(self, entry, *args, **kwargs)
    230     log.debug('Route is: %s', route)
    231 
--> 232     return self._makeHttpRequest(entry['method'], route, payload)
    233 
    234   def _processArgs(self, entry, *args, **kwargs):

/home/bhearsum/.virtualenvs/bbb/lib/python2.7/site-packages/taskcluster/client.py in _makeHttpRequest(self, method, route, payload)
    372       log.debug('Making attempt %d', retry)
    373       try:
--> 374         response = utils.makeSingleHttpRequest(method, url, payload, headers)
    375       except requests.exceptions.RequestException as rerr:
    376         if retry < retries:

/home/bhearsum/.virtualenvs/bbb/lib/python2.7/site-packages/taskcluster/utils.py in makeSingleHttpRequest(method, url, payload, headers)
    188   log.debug('Received HTTP Status:  %s' % response.status_code)
    189   log.debug('Received HTTP Headers: %s' % str(response.headers))
--> 190   log.debug('Received HTTP Payload: %s (limit 1024 char)' % str(response.text)[:1024])
    191 
    192   return response

/home/bhearsum/.virtualenvs/bbb/lib/python2.7/site-packages/requests/models.pyc in text(self)
    780         # Fallback to auto-detected encoding.
    781         if self.encoding is None:
--> 782             encoding = self.apparent_encoding
    783 
    784         # Decode unicode from given encoding.

/home/bhearsum/.virtualenvs/bbb/lib/python2.7/site-packages/requests/models.pyc in apparent_encoding(self)
    655     def apparent_encoding(self):
    656         """The apparent encoding, provided by the chardet library"""
--> 657         return chardet.detect(self.content)['encoding']
    658 
    659     def iter_content(self, chunk_size=1, decode_unicode=False):

/home/bhearsum/.virtualenvs/bbb/lib/python2.7/site-packages/requests/packages/chardet/__init__.pyc in detect(aBuf)
     28     u = universaldetector.UniversalDetector()
     29     u.reset()
---> 30     u.feed(aBuf)
     31     u.close()
     32     return u.result

/home/bhearsum/.virtualenvs/bbb/lib/python2.7/site-packages/requests/packages/chardet/universaldetector.pyc in feed(self, aBuf)
    126                                          Latin1Prober()]
    127             for prober in self._mCharSetProbers:
--> 128                 if prober.feed(aBuf) == constants.eFoundIt:
    129                     self.result = {'encoding': prober.get_charset_name(),
    130                                    'confidence': prober.get_confidence()}

/home/bhearsum/.virtualenvs/bbb/lib/python2.7/site-packages/requests/packages/chardet/charsetgroupprober.pyc in feed(self, aBuf)
     62             if not prober.active:
     63                 continue
---> 64             st = prober.feed(aBuf)
     65             if not st:
     66                 continue

/home/bhearsum/.virtualenvs/bbb/lib/python2.7/site-packages/requests/packages/chardet/sbcharsetprober.pyc in feed(self, aBuf)
     75             return self.get_state()
     76         for c in aBuf:
---> 77             order = self._mModel['charToOrderMap'][wrap_ord(c)]
     78             if order < SYMBOL_CAT_ORDER:
     79                 self._mTotalChar += 1



Making a request to the equivalent URL with curl worked fine:
  curl -L https://index.taskcluster.net/v1/task/buildbot.revisions.00008b664c22.mozilla-inbound.win32/artifacts/public/build/firefox-42.0a1.en-US.win32.installer.exe > /tmp/blah
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    29  100    29    0     0     37      0 --:--:-- --:--:-- --:--:--    36
100    29  100    29    0     0     17      0  0:00:01  0:00:01 --:--:--     0
100 40.4M  100 40.4M    0     0  2066k      0  0:00:20  0:00:20 --:--:-- 2653k
Blocks: 1142872
Pete - can you have a look?
Flags: needinfo?(pmoore)
302 jhford :)
Flags: needinfo?(pmoore) → needinfo?(jhford)
This endpoint is returning a streaming result.  We need to fix bug 1184209 before the client can handle it properly.
Depends on: 1184209
Flags: needinfo?(jhford)
I think I found a workaround for this, so it's not blocking me anymore.
No longer blocks: 1142872
Summary: python client hangs in findArtifactFromTask → client libraries should handle end-points that return binary-blobs
Found in triage.

This is still valid. Subsets of this could be good-first-bugs.
The tricky bit here will be downloading those blobs.  I don't think we want to rely on the HTTP implementation we use for API methods to do bulk downloads.

So maybe we just shouldn't create API methods for such functions, and suggest that users use buildUrl or buildSignedUrl for them and then download with a different client?  John, what do you think
Flags: needinfo?(jhford)
My understanding was that we were adding support for `output: 'blob'` to lib-api, which would mark in the API that the client should use binary object downloading flows.  Once we have support for that it should be possible to have binary object handling.  With that support, we should be able to special case the support for binary objects.  It should be possible for python (especially when using requests library) to handle binary downloads.

That said, this is most relevant for downloading artifacts.  We should consider using the lib-artifact cli tool or javascript/go clients for downloading artifacts, since this is a complex procedure to get right if verification of artifacts is desired.  We've had lots of bugs in places where people have built a url then used curl, it'd be nice if we could avoid that whole class of bugs.
Flags: needinfo?(jhford)
OK, so it sounds like the best choice at the moment is to just remove the methods themselves (since they will fail) and keep the URL-building methods.  John, is that something you could mentor?
Flags: needinfo?(jhford)
I can do that.  The rough approach that I'd suggest is when a method is called which has output: blob that we through a new exception that derives from TaskclusterFailure.
Flags: needinfo?(jhford)
Priority: -- → P5
Component: Client Libraries → Services

We've fixed this so that all REST endpoints return JSON. Some of them also return a 3xx and a Location header, but the client libraries ignore those.

Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.