Prod returned 85,000 301 redirects in the last 7 days

RESOLVED FIXED

Status

Product: Tree Management
Component: Treeherder: Infrastructure
Priority: P2
Severity: normal
Status: RESOLVED FIXED
Reported: 2 years ago
Modified: 2 years ago

People

(Reporter: emorley, Assigned: emorley)

(Assignee)

Description

2 years ago
According to New Relic Insights, prod has 85K 301 redirects in the last 7 days:
https://insights.newrelic.com/accounts/677903/explorer?eventType=Transaction&timerange=week&filters=%255B%257B%2522key%2522%253A%2522response%252estatus%2522%252C%2522value%2522%253A%2522301%2522%257D%255D&facet=appName

Unfortunately, unlike the APM section, Insights doesn't let us see the URL or user agent, so we have to resort to the logs instead. I've filed a ticket with New Relic asking them to add support:
https://support.newrelic.com/tickets/172627/edit

Taking just the gunicorn logs from th-prod-web1, it looks like there are several sources:

   4018 GET /api/project/mozilla-central/performance/data?interval=172800&signatures=REMOVED HTTP/1.1" 301 - "-" "libcurl/7.19.7 r-curl/0.8 httr/1.0.0.9000"
   3865 GET /api/project/mozilla-central/performance/data?interval=518400&signatures=REMOVED HTTP/1.1" 301 - "-" "libcurl/7.19.7 r-curl/0.8 httr/1.0.0.9000"
   2079 GET /api/project/mozilla-central/performance/data?interval=1036800&signatures=REMOVED HTTP/1.1" 301 - "-" "libcurl/7.19.7 r-curl/0.8 httr/1.0.0.9000"
   2039 GET /api/project/mozilla-central/performance/data?interval=259200&signatures=REMOVED HTTP/1.1" 301 - "-" "libcurl/7.19.7 r-curl/0.8 httr/1.0.0.9000"
   2022 GET /api/project/mozilla-central/performance/data?interval=1296000&signatures=REMOVED HTTP/1.1" 301 - "-" "libcurl/7.19.7 r-curl/0.8 httr/1.0.0.9000"
   1905 GET /api/project/try/jobs/NNNNNNN HTTP/1.1" 301 - "-" "Python-urllib/2.7"
   1407 GET /api/project/bmo-master/resultset HTTP/1.1" 301 - "-" "TreeBot/0.1"
    375 GET /api/project/mozilla-inbound/jobs/NNNNNNN HTTP/1.1" 301 - "-" "Python-urllib/2.7"
    344 GET /api/project/mozilla-aurora/resultset?revision=REMOVED HTTP/1.1" 301 - "-" "python-requests/2.4.3 CPython/2.7.3 Linux/2.6.32-504.3.3.el6.x86_64"
    276 GET /api/project/fx-team/jobs/NNNNNNN HTTP/1.1" 301 - "-" "Python-urllib/2.7"
    221 GET /api/project/mozilla-central/resultset?revision=REMOVED HTTP/1.1" 301 - "-" "python-requests/2.4.3 CPython/2.7.3 Linux/2.6.32-504.3.3.el6.x86_64"
     46 GET /api/project/mozilla-central/resultset/NNNN HTTP/1.1" 301 - "-" "libcurl/7.19.7 r-curl/0.8 httr/1.0.0.9000"
     45 GET /api/project/bmo-master/resultset/NNN/status HTTP/1.1" 301 - "-" "TreeBot/0.1"
     23 GET /api/project/b2g-inbound/jobs/NNNNNNN HTTP/1.1" 301 - "-" "Python-urllib/2.7"
     22 GET /api/project/mozilla-central/jobs/NNNNNNN HTTP/1.1" 301 - "-" "Python-urllib/2.7"
      8 GET /api/project/mozilla-aurora/jobs/NNNNNNN HTTP/1.1" 301 - "-" "Python-urllib/2.7"
      3 GET /api/project/mozilla-central/resultset/NNNN HTTP/1.1" 301 - "-" "libcurl/7.43.0 r-curl/0.9.4 httr/1.0.0"
      2 GET /api/project/mozilla-beta/jobs/NNNNNNN HTTP/1.1" 301 - "-" "Python-urllib/2.7"
      1 GET /api/project/mozilla-central/resultset/NNNN HTTP/1.1" 301 - "-" "libcurl/7.19.7 r-curl/0.9.3 httr/1.0.0"
      1 GET /api/project/mozilla-central/performance/signatures?interval=518400 HTTP/1.1" 301 - "-" "libcurl/7.19.7 r-curl/0.8 httr/1.0.0.9000"
      1 GET /api/project/ash/resultset?revision=REMOVED HTTP/1.1" 301 - "-" "python-requests/2.4.3 CPython/2.7.3 Linux/2.6.32-504.3.3.el6.x86_64"
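
A tally like the one above could be produced along these lines. This is only a rough sketch, not the command actually used for this bug: the regular expression assumes the access-log layout quoted elsewhere in this bug, the ID-collapsing rule is a guess at how the NNNNNNN placeholders were made, and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Assumed log layout (as quoted in this bug):
# host - - [timestamp] "METHOD path HTTP/x.y" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def tally_redirects(lines):
    """Count 301 responses grouped by (method, normalised path, user agent)."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None or match.group("status") != "301":
            continue
        # Collapse purely numeric path segments so /jobs/123 and /jobs/456
        # are counted as the same source.
        path = re.sub(r"/\d+(?=/|\?|$)", "/N", match.group("path"))
        counts[(match.group("method"), path, match.group("agent"))] += 1
    return counts


# Made-up sample lines, for illustration only (not real log data).
sample = [
    '127.0.0.1 - - [16/Dec/2015:13:05:15 -0800] "GET /api/project/try/jobs/1234567 HTTP/1.1" 301 - "-" "Python-urllib/2.7"',
    '127.0.0.1 - - [16/Dec/2015:13:05:16 -0800] "GET /api/project/try/jobs/7654321 HTTP/1.1" 301 - "-" "Python-urllib/2.7"',
    '127.0.0.1 - - [16/Dec/2015:13:05:17 -0800] "GET /api/project/try/jobs/1/ HTTP/1.1" 200 512 "-" "Python-urllib/2.7"',
]

for (method, path, agent), count in tally_redirects(sample).most_common():
    print(f"{count:7d} {method} {path} {agent!r}")
```

Grouping by user agent as well as path is what makes it possible to ask specific people about specific tools, as in the questions that follow.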

Will, are the libcurl entries from a tool you have locally?
Flags: needinfo?(wlachance)
Not from me. Saptarshi, are you using curl to fetch the data? If so, you could make your script marginally more efficient (and reduce load for treeherder) by changing:

/api/project/mozilla-central/performance/data?params

to:

/api/project/mozilla-central/performance/data/?params

(same goes for any other endpoints you're using)
Flags: needinfo?(wlachance) → needinfo?(sguha)
Yes, I have a daily cron job which extracts the raw data and stores it.
Sorry for the mistake.
Flags: needinfo?(sguha)
Ok, if you could fix it as suggested that would be good. No big deal. :)
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → WORKSFORME
Done.
(Assignee)

Comment 5

2 years ago
Leaving open for the other instances.

I've fixed one in:
https://github.com/globau/treebot/pull/1
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
(Assignee)

Comment 6

2 years ago
Some of the other instances may be:
https://github.com/mozilla/funsize/blob/66717d862f8b7e429c172f1719530a697a72a5b5/funsize/utils.py#L35
and
https://github.com/mozilla/autophone/blob/44f7029f481dc9f38eb8aa70c6019a35b902ad5b/autophonepulsemonitor.py#L432

I think after fixing these it may be worth coming up with a list of use-cases where people are not using TreeherderClient and adding support for them to it, and then persuading people to use the client to avoid issues like this (and people not setting a user agent etc). If we move to a model where people need credentials to avoid a GET rate limit (those credentials would likely be self-serve, no admin approval needed) it will be much easier for them if they are using the client.
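
For scripts that have not yet moved to TreeherderClient, the two problems mentioned here (the missing trailing slash and the missing user agent) can both be avoided with the standard library alone. The following is a minimal sketch and does not use or represent the TreeherderClient API; the helper name `build_request` and the tool name `example-tool` are hypothetical.

```python
from urllib.request import Request


def build_request(url, tool_name, version):
    """Build (but do not send) a GET request that identifies its caller."""
    return Request(url, headers={"User-Agent": f"{tool_name}/{version}"})


req = build_request(
    # The path ends with a slash, so the server has nothing to redirect.
    "https://treeherder.mozilla.org/api/project/mozilla-central/resultset/",
    "example-tool",  # hypothetical tool name
    "1.0",
)
# urllib stores header names capitalised as "User-agent".
print(req.get_header("User-agent"))
```

A descriptive user agent makes entries in the gunicorn logs traceable to a tool and owner, which the generic "Python-urllib/2.7" entries are not.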
(Assignee)

Updated

2 years ago
Depends on: 1230383
(Assignee)

Updated

2 years ago
Depends on: 1230402
(Assignee)

Comment 7

2 years ago
Think there's one more - don't suppose you could add the trailing slash to this too? 

127.0.0.1 - - [16/Dec/2015:13:05:15 -0800] "GET /api/project/mozilla-central/performance/signatures?interval=691200 HTTP/1.1" 301 - "-" "SaptarshiGuhaTalos/1.0"

Thanks :-)
Flags: needinfo?(sguha)
Like this?
https://treeherder.mozilla.org/api/project/{{branch}}/performance/signatures?interval={{interval}}/
Flags: needinfo?(sguha)
(Assignee)

Comment 9

2 years ago
Before the query string (all paths end with a trailing slash), so like:
https://treeherder.mozilla.org/api/project/{{branch}}/performance/signatures/?interval={{interval}}
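
To make the placement concrete, here is a small sketch of a helper that puts the slash at the end of the path rather than at the end of the whole URL. The function name `with_trailing_slash` is made up for illustration; it relies only on Python's standard `urllib.parse`.

```python
from urllib.parse import urlsplit, urlunsplit


def with_trailing_slash(url):
    """Return url with a slash at the end of its path, before any query string."""
    parts = urlsplit(url)
    path = parts.path if parts.path.endswith("/") else parts.path + "/"
    return urlunsplit(parts._replace(path=path))


print(with_trailing_slash(
    "https://treeherder.mozilla.org/api/project/mozilla-central/performance/signatures?interval=691200"
))
```

Splitting the URL first means the query string is left untouched, and a URL that already has the slash is returned unchanged.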
Fixed. Thanks
(Assignee)

Updated

2 years ago
Depends on: 1234233
(Assignee)

Comment 11

2 years ago
I'm now seeing:

127.0.0.1 - - [20/Dec/2015:19:12:54 -0800] "GET /api/project/mozilla-central/resultset/5279 HTTP/1.1" 301 - "-" "SaptarshiGuhaTalos/1.0"

I think the best bet is just to disable APPEND_SLASH so that these requests 404 instead, which makes the issue more obvious; I've filed bug 1234233 for that :-)
Oh! So this should be GET /api/project/mozilla-central/resultset/5279/ (with a trailing slash)?
Okay, fixed.
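
For reference, APPEND_SLASH is a standard Django setting, and disabling it is a one-line change. The fragment below is only a sketch of what that could look like; where Treeherder's settings actually live, and the exact change made in bug 1234233, are not shown here.

```python
# Django settings module (fragment; file location is an assumption).
#
# With APPEND_SLASH = True (Django's default) and CommonMiddleware enabled,
# a request for a slashless URL that matches no URL pattern is redirected to
# the same URL with a slash appended, if that URL would match.
# Setting it to False turns the redirect off, so the slashless request
# returns a 404 instead.
APPEND_SLASH = False
```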
(Assignee)

Comment 13

2 years ago
Yup, thank you :-)
(Assignee)

Updated

2 years ago
Depends on: 1236894

Comment 14

2 years ago
Commit pushed to master at https://github.com/mozilla/treeherder

https://github.com/mozilla/treeherder/commit/39d572d952a5ad435548fcc0dfadd28d16a15e97
Bug 1230179 - Docs: Fix the links to Swagger so they don't 301 redirect

Avoids this redirect seen in prod gunicorn logs:

[05/Jan/2016:05:55:42 -0800] "GET /docs HTTP/1.1" 301 -
"http://treeherder.readthedocs.org/retrieving_data.html"
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:46.0) Gecko/20100101 Firefox/46.0"
(Assignee)

Comment 15

2 years ago
There are now only 10 events on prod in the last 7 days, rather than the 85,000 previously:
https://insights.newrelic.com/accounts/677903/explorer?eventType=Transaction&timerange=week&filters=%255B%257B%2522key%2522%253A%2522response%252estatus%2522%252C%2522value%2522%253A%2522301%2522%257D%252C%257B%2522key%2522%253A%2522appName%2522%252C%2522value%2522%253A%2522treeherder-prod%2522%257D%255D&facet=appName

And bug 1234233 will prevent us from regressing in the future.
Status: REOPENED → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → FIXED