Closed
Bug 1230610
Opened 10 years ago
Closed 10 years ago
Add a user agent to all HTTP requests Treeherder makes to both itself and other services
Categories
(Tree Management :: Treeherder: Data Ingestion, defect, P2)
Tree Management
Treeherder: Data Ingestion
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: emorley, Assigned: emorley)
References
Details
Attachments
(1 file)
In bug 1230222 we're going to encourage third party consumers of our API to Do The Right Thing and set an informative user agent when interacting with our API.
However Treeherder doesn't even do so itself for all requests at the moment (apart from those that use treeherder-client) -- we should do so.
| Assignee | ||
Comment 1•10 years ago
|
||
Attachment #8696144 -
Flags: review?(wlachance)
| Assignee | ||
Comment 2•10 years ago
|
||
In the future having make_request() will also give us a central location to add things like optional retry support, so we retry requests to eg AWS or hg.m.o rather than having to retry the whole task.
Comment 3•10 years ago
|
||
Comment on attachment 8696144 [details] [review]
Set a User-Agent for HTTP requests made by Treeherder
LGTM except for one nit regarding default parameters. r+ with that addressed.
Attachment #8696144 -
Flags: review?(wlachance) → review+
Comment 4•10 years ago
|
||
Commits pushed to master at https://github.com/mozilla/treeherder
https://github.com/mozilla/treeherder/commit/429feea9f77db872c2bf08d64e5492a3854249b6
Bug 1230610 - Rename get_remote_content to fetch_json
Since it's returning json specifically, and in the future we'll have
another function in that file for fetching text content too.
https://github.com/mozilla/treeherder/commit/35fe3b400cf1078f046b22e3600cfabeb979145c
Bug 1230610 - Switch Hg pushlog ingestion to using fetch_json()
To save reinventing the wheel, particularly once we start adding a
custom user agent to the request headers.
Later it might be worth moving logging into `fetch_json()` too, however
for now the various consumers of `fetch_json()` have different
interpretations of whether an `HTTPError` is really an error, which
means there isn't one size fits all, so let's leave that to another bug.
https://github.com/mozilla/treeherder/commit/b8195bce064db331f77da64fbb06a8b055fc396f
Bug 1230610 - Factor out the reusable parts of fetch_json()
Since later we'll be wanting to make POST requests too, as well as
return both text and the untouched response object.
https://github.com/mozilla/treeherder/commit/438c39100d2ef29ff96ebc67bb39e4b6782f7d01
Bug 1230610 - Switch to using make_request() in MozlogArtifactBuilder
https://github.com/mozilla/treeherder/commit/c2033e140ffbbca773cc5833865640be16438a7e
Bug 1230610 - Switch to using make_request() in ElasticsearchDocRequest
I've not added a `post_json()` to treeherder.etl.common since the return
value isn't used here (and I have no idea if the response is valid json)
and this code will be removed once we have autoclassification and stop
mirroring to ElasticSearch.
https://github.com/mozilla/treeherder/commit/dd03defabafd7fc9583f12c6f82761bade219931
Bug 1230610 - Give the store_error_summary arguments explicit names
https://github.com/mozilla/treeherder/commit/6ee48ba05086bdf3a480839b6fb4386fcc7e3de0
Bug 1230610 - Make store_error_summary return early if no log text
https://github.com/mozilla/treeherder/commit/ad934650f42ad020cd26a09f8907b901c63d0d20
Bug 1230610 - Add a fetch_text() and switch store_error_summary to it
https://github.com/mozilla/treeherder/commit/d3a5c2e9c39f4582d6a9aa62fa6e0de6a75cbc5f
Bug 1230610 - Rename TREEHERDER_REQUESTS_TIMEOUT to REQUESTS_TIMEOUT
https://github.com/mozilla/treeherder/commit/52ec478109b7000a006335a0a3279c17a0b730f8
Bug 1230610 - Set a User-Agent for HTTP requests made by Treeherder
We're encouraging consumers of our API to set an informative User Agent,
so it's only fair that we do the same for when we make requests to other
sites too. (And `make_request()` is also used for requests we make to
our own API, so this will actually benefit us as well.)
Since Treeherder doesn't have a version per-se (and adding the git
revision wouldn't really add any value), I've instead used the site host
name, to help differentiate traffic from stage/prod/heroku/locally. eg:
treeherder/treeherder.mozilla.org
treeherder/treeherder.allizom.org
treeherder/local.treeherder.mozilla.org
This seemed preferable to hardcoding nicknames (eg "stage") based on
hostname, but I'm open to alternative suggestions.
I've not migrated any more of the existing urllib2 users over to the
requests library in this PR since bug 1165356 covers that work, and
between mocking for tests & handling streaming support for log parsing,
it's too much scope creep. However I didn't want to block setting a UA
on that bug, so we'll just have to deal with the additional duplication
for now.
| Assignee | ||
Updated•10 years ago
|
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•