Closed
Bug 1230222
Opened 9 years ago
Closed 8 years ago
[Meta] Encourage tools that interact with our API to set informative user agents
Categories
(Tree Management :: Treeherder: API, defect, P3)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: emorley, Assigned: emorley)
References
Details
(Keywords: meta)
There are times when we're looking at New Relic or gunicorn logs and trying to work out where a request originated from. For submissions to us, we now have the hawk client_id to help inform this, however:

1) This doesn't help identify GETs.
2) The client_id is in the auth header, which, unlike the user agent, isn't present in the gunicorn logs or New Relic transaction traces (albeit the latter will be helped by bug 1124278).

treeherder-client uses a user agent of eg: treeherder-pyclient/1.8.0
TreeBot uses eg: TreeBot/0.1

We should try to identify the other tools that don't set a custom UA, and file bugs/open PRs to add one.

There are also places within Treeherder itself where we should be setting a UA but don't (eg the bugscache lookups, which don't use treeherder-client). Plus we should of course do the right thing with requests we make to third party services too (like hg.mozilla.org).
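For tools that don't use treeherder-client, setting a UA is usually a one-line change. A minimal sketch using only the Python standard library is below; the tool name "example-tool/0.1" and the API URL are placeholders made up for illustration, not something any existing tool is known to use.

```python
import urllib.request

# Hypothetical tool name and version, following the "name/version" pattern
# of treeherder-pyclient/1.8.0 and TreeBot/0.1.
USER_AGENT = "example-tool/0.1"


def build_request(url: str) -> urllib.request.Request:
    """Return a Request that carries an identifying User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


# Building the request does not touch the network; pass it to
# urllib.request.urlopen() to actually send it.
# The URL below is an assumed example endpoint.
request = build_request("https://treeherder.mozilla.org/api/")
```

Without this, urllib sends its default "Python-urllib/x.y" UA, which tells us nothing about which tool made the request.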
Assignee
Comment 1•9 years ago
Hi Saptarshi! I don't suppose you could set a custom user agent for the script that was mentioned in bug 1230179 comment 2? It will just allow us to more easily tell where requests are coming from in the case of API deprecation, or when requests are causing too much load etc (examples other tools use are in comment 0 here). Thanks :-)
Flags: needinfo?(sguha)
Comment 2•9 years ago
Absolutely. I've changed everything and my requests ought to have "SaptarshiGuhaTalos/1.0" as the user agent. If you'd like a more canonical string, I can change it easily.
Flags: needinfo?(sguha)
Assignee
Comment 3•9 years ago
That's great - thank you :-)
Assignee
Comment 4•8 years ago
Only use treeherder-client (which sets a UA):
https://github.com/mozilla/mozilla_ci_tools
https://github.com/adusca/try_extender
https://github.com/chmanchester/trigger-bot
https://github.com/mozilla/releasetasks
https://hg.mozilla.org/build/puppet

Already set a UA:
https://github.com/globau/treebot

Browser based, so browser UA + referrer is fine:
https://hg.mozilla.org/hgcustom/version-control-tools/

Have a PR open to add a UA:
https://github.com/mozilla/mozmill-ci
https://github.com/mozilla-raptor/post-to-treeherder
https://github.com/mozilla/autophone
https://github.com/jmaher/alert_manager
https://github.com/mozilla/pulse_actions
https://github.com/sydvicious/mozplatformqa-jenkins
https://github.com/mjzffr/treeherder-submission-example
https://github.com/mozilla/funsize

Left:
treeherder-node (bug 1191403)
Assignee
Comment 5•8 years ago
I keep on finding more - it's amazing how many projects are using our API now!
https://github.com/mnoorenberghe/mozscreenshots
https://github.com/h4writer/arewefastyet
https://github.com/dminor/ouija
https://github.com/klahnakoski/TestLog-ETL
https://github.com/jcranmer/m-c-tools-code-coverage
Assignee
Comment 6•8 years ago
Looking much more useful now (and some of the dependent bugs aren't deployed yet):

90261 treeherder-pyclient/2.0.1
73589 HTTP-Monitor/1.1
60359 ouija
46617 treeherder/treeherder.mozilla.org
15304 treeherder-nodeclient/0.7.0
2926 autophone
2307 SaptarshiGuhaTalos/1.0
817 NewRelicPinger/1.0 (677903)
425 TreeBot/0.1
416 mozscreenshots/0.3.1
410 Pingdom.com_bot_version_1.4_(http://www.pingdom.com/)
133 curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.18 Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2
97 funsize
70 python-requests/2.9.1
26 Twitterbot/1.0
8 Mozilla/6.0 (iPhone; CPU iPhone OS 8_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/8.0 Mobile/10A5376e Safari/8536.25
4 mozplatformqa-jenkins
4 -
1 Safari/11601.4.4 CFNetwork/760.2.6 Darwin/15.3.0 (x86_64)
1 Python-urllib/1.17
1 Goldfire Server
1 Go 1.1 package http
1 Flamingo_SearchEngine (+http://www.flamingosearch.com/bot)

The blank UA entries are:

IP-REDACTED - - [18/Feb/2016:06:31:56 +0000] "POST /api/project/gaia/resultset/ HTTP/1.1" 200 37 "-" "-"
IP-REDACTED - - [18/Feb/2016:08:23:54 +0000] "POST /api/project/gaia/resultset/ HTTP/1.1" 200 37 "-" "-"
IP-REDACTED - - [18/Feb/2016:08:57:40 +0000] "POST /api/project/gaia-master/resultset/ HTTP/1.1" 200 37 "-" "-"
IP-REDACTED - - [18/Feb/2016:11:40:47 +0000] "POST /api/project/gaia/resultset/ HTTP/1.1" 200 37 "-" "-"

...guessing gaia-taskcluster perhaps? (I can't check whether it's been deployed due to a Heroku bug not letting me access the app since it's locked, even though admins are supposed to be able to do so; have filed https://help.heroku.com/tickets/336512).

The curl entries are all to /server-status?auto and are due to the deploy script's drain/undrain feature.

The Python-urllib entry is to /revision.txt?cachescramble=1455818831.65 and is due to whatsdeployed:
https://github.com/peterbe/whatsdeployed/blob/21cdd8350ad074fd0c0573a6a61f611e52695325/app.py#L68
Assignee
Comment 7•8 years ago
Think we're virtually ready to block non-specific UAs (non-browser ones only):

[emorley@treeherder1.webapp.scl3 ~]$ awk -F\" '{print $6}' /var/log/httpd/treeherder.mozilla.org/access_log | grep -v 'Mozilla/' | sort | uniq -c | sort -nr
42058 treeherder-pyclient/2.0.1
33387 treeherder/treeherder.mozilla.org
32401 HTTP-Monitor/1.1
19662 ouija
7965 treeherder-nodeclient/0.7.0
6243 SaptarshiGuhaTalos/1.0
2178 autophone
364 NewRelicPinger/1.0 (677903)
172 Pingdom.com_bot_version_1.4_(http://www.pingdom.com/)
168 TreeBot/0.1
32 mozscreenshots/0.3.1
18 funsize
9 mozmill-ci
6 -
3 Twitterbot/1.0
1 IrssiUrlLog/0.2
1 Flamingo_SearchEngine (+http://www.flamingosearch.com/bot)
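For anyone who'd rather do the same tally in Python, here is a rough equivalent of the awk/grep/sort/uniq pipeline. It's only a sketch: it assumes combined-format access log lines where the UA is the sixth double-quote-delimited field, and the function name and the second sample line are made up for illustration.

```python
from collections import Counter


def count_user_agents(lines):
    """Tally non-browser user agents from combined-format access log lines."""
    counts = Counter()
    for line in lines:
        fields = line.split('"')
        if len(fields) < 6:
            # Not a combined-format line. Unlike awk, which would print an
            # empty field here, this sketch skips the line entirely.
            continue
        user_agent = fields[5]  # awk's $6 with -F\" is index 5 here
        if "Mozilla/" in user_agent:
            continue  # same filter as grep -v 'Mozilla/'
        counts[user_agent] += 1
    return counts


# Illustrative input only: one blank-UA line in the style of comment 6,
# plus one made-up line carrying a custom UA.
sample = [
    'IP-REDACTED - - [18/Feb/2016:06:31:56 +0000] "POST /api/project/gaia/resultset/ HTTP/1.1" 200 37 "-" "-"',
    'IP-REDACTED - - [18/Feb/2016:06:32:10 +0000] "GET /api/ HTTP/1.1" 200 12 "-" "TreeBot/0.1"',
]
for user_agent, hits in count_user_agents(sample).most_common():
    print(hits, user_agent)
```

most_common() returns entries sorted by descending count, which plays the role of the final sort -nr.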
Assignee
Comment 8•8 years ago
Latest:

[emorley@treeherder1.webapp.scl3 ~]$ awk -F\" '{print $6}' /var/log/httpd/treeherder.mozilla.org/access_log | grep -v 'Mozilla/' | sort | uniq -c | sort -nr
46894 treeherder/treeherder.mozilla.org
41783 ouija
41686 HTTP-Monitor/1.1
36042 treeherder-pyclient/2.1.0
11579 treeherder-nodeclient/0.7.0
6517 SaptarshiGuhaTalos/1.0
2684 autophone
1975 treeherder-pyclient/2.0.1
1115 Go-http-client/1.1
473 NewRelicPinger/1.0 (677903)
228 Pingdom.com_bot_version_1.4_(http://www.pingdom.com/)
223 TreeBot/0.1
206 mozmill-ci
178 curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.18 Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2
167 funsize
117 mozscreenshots/0.3.1
26 Opera/9.80 (X11; Linux x86_64; Edition Linux Mint) Presto/2.12.388 Version/12.16
24 mozplatformqa-jenkins
20 python-requests/2.9.1
7 Twitterbot/1.0
6 wpt-fetchlogs
6 ltx71 - (http://ltx71.com/)
1 Scrapy/1.0.5 (+http://scrapy.org)
1 -

And for stage:

[emorley@treeherder1.stage.webapp.scl3 ~]$ awk -F\" '{print $6}' /var/log/httpd/treeherder.allizom.org/access_log | grep -v 'Mozilla/' | sort | uniq -c | sort -nr
45639 treeherder/treeherder.allizom.org
41563 HTTP-Monitor/1.1
29195 treeherder-pyclient/2.1.0
10699 treeherder-nodeclient/0.7.0
1003 treeherder-pyclient/2.0.1
439 NewRelicPinger/1.0 (677903)
187 mozplatformqa-jenkins
111 arewefastyet
100 mozmill-ci
89 treeherder-pyclient/1.8.0
5 autophone
1 ltx71 - (http://ltx71.com/)

The Go UAs were of the form:
GET /api/project/try/artifact/100032679/

The libcurl ones are for server-status and so are not affected by DRF blacklisting:
/server-status

The python-requests ones:
//api/project/mozilla-aurora/jobs/?job_guid=79d27713-76c6-4aaa-a86c-c143851b2745
//api/project/mozilla-aurora/resultset/?revision=ca6ab5be342e
Assignee
Comment 9•8 years ago
On prod, the only remaining UA that matches the blacklist is:
python-requests/2.9.1
...which I believe to be leftover machines that didn't get the fix from bug 1248277 deployed.

On stage there was just:

[12/May/2016:12:14:56 +0000] "GET /revision.txt?cachescramble=1463055296.49 HTTP/1.0" 200 41 "-" "Python-urllib/1.17"
-> what's deployed, have filed: https://github.com/peterbe/whatsdeployed/issues/13

[12/May/2016:06:07:36 +0000] "GET /a2billing/ HTTP/1.1" 400 26 "-" "python-requests/2.9.1"
-> Some spam / someone scanning for exploitable frameworks or similar.
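To illustrate the kind of check being discussed, here is a rough sketch of matching a UA against a list of library-default prefixes. The actual blacklist isn't shown in this bug, so the prefixes and the function name below are assumptions picked from the generic UAs seen in the logs above, not the real Treeherder implementation.

```python
# Assumed prefixes: default UAs of generic HTTP libraries that appeared in
# the access logs. The actual blacklist used by Treeherder is not shown in
# this bug and may differ.
GENERIC_UA_PREFIXES = ("python-requests/", "Python-urllib/", "Go-http-client/")


def is_generic_user_agent(user_agent: str) -> bool:
    """Return True if the UA starts with one of the assumed generic prefixes."""
    # str.startswith accepts a tuple and matches if any prefix matches.
    return user_agent.strip().startswith(GENERIC_UA_PREFIXES)
```

With these assumed prefixes, "python-requests/2.9.1" would match, while tool-specific UAs such as "treeherder-pyclient/2.1.0" or "mozmill-ci" would not.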
Assignee
Updated•8 years ago
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED