Closed Bug 1230610 Opened 10 years ago Closed 10 years ago

Add a user agent to all HTTP requests Treeherder makes to both itself and other services

Categories

(Tree Management :: Treeherder: Data Ingestion, defect, P2)

defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: emorley, Assigned: emorley)

References

Details

Attachments

(1 file)

In bug 1230222 we're going to encourage third party consumers of our API to Do The Right Thing and set an informative user agent when interacting with our API. However Treeherder doesn't even do so itself for all requests at the moment (apart from those that use treeherder-client) -- we should do so.
Attachment #8696144 - Flags: review?(wlachance)
Blocks: 1165356
In the future having make_request() will also give us a central location to add things like optional retry support, so we retry requests to eg AWS or hg.m.o rather than having to retry the whole task.
Comment on attachment 8696144 [details] [review] Set a User-Agent for HTTP requests made by Treeherder LGTM except for one nit regarding default parameters. r+ with that addressed.
Attachment #8696144 - Flags: review?(wlachance) → review+
Commits pushed to master at https://github.com/mozilla/treeherder https://github.com/mozilla/treeherder/commit/429feea9f77db872c2bf08d64e5492a3854249b6 Bug 1230610 - Rename get_remote_content to fetch_json Since it's returning json specifically, and in the future we'll have another function in that file for fetching text content too. https://github.com/mozilla/treeherder/commit/35fe3b400cf1078f046b22e3600cfabeb979145c Bug 1230610 - Switch Hg pushlog ingestion to using fetch_json() To save reinventing the wheel, particularly once we start adding a custom user agent to the request headers. Later it might be worth moving logging into `fetch_json()` too, however for now the various consumers of `fetch_json()` have different interpretations of whether an `HTTPError` is really an error, which means there isn't one size fits all, so let's leave that to another bug. https://github.com/mozilla/treeherder/commit/b8195bce064db331f77da64fbb06a8b055fc396f Bug 1230610 - Factor out the reusable parts of fetch_json() Since later we'll be wanting to make POST requests too, as well as return both text and the untouched response object. https://github.com/mozilla/treeherder/commit/438c39100d2ef29ff96ebc67bb39e4b6782f7d01 Bug 1230610 - Switch to using make_request() in MozlogArtifactBuilder https://github.com/mozilla/treeherder/commit/c2033e140ffbbca773cc5833865640be16438a7e Bug 1230610 - Switch to using make_request() in ElasticsearchDocRequest I've not added a `post_json()` to treeherder.etl.common since the return value isn't used here (and I have no idea if the response is valid json) and this code will be removed once we have autoclassification and stop mirroring to ElasticSearch. https://github.com/mozilla/treeherder/commit/dd03defabafd7fc9583f12c6f82761bade219931 Bug 1230610 - Give the store_error_summary arguments explicit names https://github.com/mozilla/treeherder/commit/6ee48ba05086bdf3a480839b6fb4386fcc7e3de0 Bug 1230610 - Make store_error_summary return early if no log text https://github.com/mozilla/treeherder/commit/ad934650f42ad020cd26a09f8907b901c63d0d20 Bug 1230610 - Add a fetch_text() and switch store_error_summary to it https://github.com/mozilla/treeherder/commit/d3a5c2e9c39f4582d6a9aa62fa6e0de6a75cbc5f Bug 1230610 - Rename TREEHERDER_REQUESTS_TIMEOUT to REQUESTS_TIMEOUT https://github.com/mozilla/treeherder/commit/52ec478109b7000a006335a0a3279c17a0b730f8 Bug 1230610 - Set a User-Agent for HTTP requests made by Treeherder We're encouraging consumers of our API to set an informative User Agent, so it's only fair that we do the same for when we make requests to other sites too. (And `make_request()` is also used for requests we make to our own API, so this will actually benefit us as well.) Since Treeherder doesn't have a version per-se (and adding the git revision wouldn't really add any value), I've instead used the site host name, to help differentiate traffic from stage/prod/heroku/locally. eg: treeherder/treeherder.mozilla.org treeherder/treeherder.allizom.org treeherder/local.treeherder.mozilla.org This seemed preferable to hardcoding nicknames (eg "stage") based on hostname, but I'm open to alternative suggestions. I've not migrated any more of the existing urllib2 users over to the requests library in this PR since bug 1165356 covers that work, and between mocking for tests & handling streaming support for log parsing, it's too much scope creep. However I didn't want to block setting a UA on that bug, so we'll just have to deal with the additional duplication for now.
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: