Closed Bug 1524941 Opened 5 years ago Closed 5 years ago

Block more default http library User Agents

Categories

(Tree Management :: Treeherder: API, enhancement, P1)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: emorley, Assigned: emorley)

Attachments

(1 file)

For ~3 years we've asked that consumers of Treeherder's REST API set an appropriate User Agent when making requests to our API:
https://treeherder.readthedocs.io/rest_api.html#user-agents
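
As a rough illustration of what setting an appropriate User Agent looks like on the client side, here is a minimal sketch using only the Python standard library. The endpoint URL, tool name and contact address are hypothetical placeholders, not values taken from the Treeherder docs.

```python
import urllib.request

# Hypothetical values for illustration only: substitute your own tool name,
# version and contact details, and the real API endpoint you need.
EXAMPLE_URL = "https://treeherder.example.invalid/api/"
EXAMPLE_USER_AGENT = "my-ci-dashboard/1.0 (contact: owner@example.invalid)"


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Build a request that identifies the client instead of relying on
    the library default (which would be something like Python-urllib/3.x)."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


request = build_request(EXAMPLE_URL, EXAMPLE_USER_AGENT)
# The request is only constructed here; sending it would be done with
# urllib.request.urlopen(request).
```

A descriptive value of this kind lets the API maintainers identify and contact the owner of a client, which a library default cannot do.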

This has been enforced by blocking the default User Agents of popular HTTP libraries:
https://github.com/mozilla/treeherder/blob/9bc1da2c78f73c273c4a149a7a779f4d88ee7b1c/treeherder/config/settings.py#L247-L256

However, whilst looking at New Relic for something else today, I noticed a few more User Agents that should be blocked:

Go 1.1 package http
Go-http-client/*
node-fetch/*
-> https://github.com/mozilla/firefox-health-backend/issues/53
python-requests/*
Python-urllib/*
reqwest/*
-> https://github.com/jgraham/fetchlogs/issues/5

All but the ones marked with GitHub issues are only used for non-legitimate traffic (e.g. scraping robots.txt or probing for PHP/... vulnerabilities), so blocking them has no impact.
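
For reference, prefix-based blocking of this kind can be sketched as a list of compiled regular expressions, similar in shape to Django's `DISALLOWED_USER_AGENTS` setting. The exact patterns below are assumptions derived from the list above, not the contents of the linked settings.py, and `is_blocked` is a hypothetical helper that only mimics the check outside of Django.

```python
import re

# Assumed patterns, one per User Agent listed in this bug. Anchoring with ^
# means only agents that *start* with the library default are matched.
DISALLOWED_USER_AGENTS = [
    re.compile(r"^Go 1\.1 package http"),
    re.compile(r"^Go-http-client/"),
    re.compile(r"^node-fetch/"),
    re.compile(r"^python-requests/"),
    re.compile(r"^Python-urllib/"),
    re.compile(r"^reqwest/"),
]


def is_blocked(user_agent: str) -> bool:
    """Return True if the User Agent matches any disallowed pattern."""
    return any(pattern.search(user_agent) for pattern in DISALLOWED_USER_AGENTS)


if __name__ == "__main__":
    for agent in ("Go-http-client/1.1", "my-ci-dashboard/1.0"):
        print(agent, "->", "blocked" if is_blocked(agent) else "allowed")
```

Because the patterns are anchored to the start of the string, a client that sets its own descriptive User Agent is not affected, even if it is built on one of these libraries.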

The two Python-related entries in comment 2 are already blocked.

The Go-related ones are only being used to scrape the homepage and robots.txt, so they are safe to block. This was determined using New Relic Insights and the query:

SELECT count(`request.uri`) FROM Transaction FACET `request.uri` WHERE user_agent LIKE 'Go%http%' SINCE 7 DAYS AGO LIMIT 200
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED