Frequent update requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://aus4-admin.mozilla.org/api/v2/releases/Firefox-mozilla-central-nightly-20201216214834
Categories
(Release Engineering :: Release Automation, defect, P5)
Tracking
(Not tracked)
People
(Reporter: intermittent-bug-filer, Assigned: oremj)
References
(Blocks 1 open bug, Regression)
Details
(Keywords: intermittent-failure, Whiteboard: [stockwell disable-recommended])
Filed by: ncsoregi [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=324762052&repo=mozilla-central
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/FfmTZluBSKKW9NGBJBKvzw/runs/2/artifacts/public/logs/live_backing.log
```
2020-12-16 23:36:08,563 - balrogclient.api - DEBUG - REQUEST STATS: {"timestamp": 1608161768.5636039, "method": "GET", "url": "https://aus4-admin.mozilla.org/api/v2/releases/Firefox-mozilla-central-nightly-20201216214834", "status_code": 403, "elapsed_secs": 0.146679}
2020-12-16 23:36:08,563 - redo - DEBUG - retry: Caught exception:
Traceback (most recent call last):
  File "/app/lib/python3.8/site-packages/redo/__init__.py", line 170, in retry
    return action(*args, **kwargs)
  File "/app/lib/python3.8/site-packages/balrogscript/script.py", line 94, in <lambda>
    retry(lambda: submitter.run(**release), jitter=5, sleeptime=10, max_sleeptime=30, attempts=10)
  File "/app/lib/python3.8/site-packages/balrogscript/submitter/cli.py", line 404, in run
    return NightlySubmitterBase.run(self, *args, schemaVersion=4, **kwargs)
  File "/app/lib/python3.8/site-packages/balrogscript/submitter/cli.py", line 214, in run
    return self.run_backend2(
  File "/app/lib/python3.8/site-packages/balrogscript/submitter/cli.py", line 346, in run_backend2
    existing_release = balrog_request(session, "get", url)
  File "/app/lib/python3.8/site-packages/balrogclient/api.py", line 107, in balrog_request
    resp.raise_for_status()
  File "/app/lib/python3.8/site-packages/requests/models.py", line 943, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://aus4-admin.mozilla.org/api/v2/releases/Firefox-mozilla-central-nightly-20201216214834
2020-12-16 23:36:08,568 - redo - DEBUG - sleeping for 11.57s (attempt 1/10)
2020-12-16 23:36:20,139 - redo - DEBUG - attempt 2/10
```
Comment 1•5 years ago
Looking at https://firefox-ci-tc.services.mozilla.com/provisioners/scriptworker-k8s/worker-types/gecko-3-balrog , it seems like there are good workers and bad workers (the failures seem to be isolated to a handful of workers).
Could we have bad credentials? Or could we be creating workers outside of an IP allowlist?
Comment 2•5 years ago
jmaher
aki: how do you tell the workers, it seems to always be unique ID on that link
aki
jmaher: if you click on a link under worker id, then you'll see the history for that worker. they're spot instances so they don't last forever, but they last 1+ tasks. i don't see any workers with both green and red tasks; i've only seen all green or all red
jmaher
oh, I see
aki
if i sort by task started, then i see there is a batch of green between batches of red, which tells me it might not be a server hiccup
could be a server hiccup, but i'm currently guessing bad workers
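The "good workers vs. bad workers" check described in this exchange can be sketched in a few lines. This is only an illustration: the worker IDs and task outcomes below are made up, and in practice the data would come from the worker-type page linked in comment 1. The idea is that if every worker is either all green or all red, with none mixed, the failures follow the worker rather than the time of the request.

```python
from collections import defaultdict


def classify_workers(tasks):
    """Label each worker by the outcomes of the tasks it ran.

    `tasks` is an iterable of (worker_id, succeeded) pairs.
    """
    outcomes = defaultdict(set)
    for worker_id, succeeded in tasks:
        outcomes[worker_id].add(succeeded)

    labels = {}
    for worker_id, seen in outcomes.items():
        if seen == {True}:
            labels[worker_id] = "all green"
        elif seen == {False}:
            labels[worker_id] = "all red"
        else:
            labels[worker_id] = "mixed"
    return labels


# Hypothetical task history, not taken from the real worker pool.
tasks = [
    ("worker-a", True),
    ("worker-a", True),
    ("worker-b", False),
    ("worker-b", False),
    ("worker-c", True),
]
labels = classify_workers(tasks)
for worker_id, label in sorted(labels.items()):
    print(worker_id, label)
```

A "mixed" worker would point toward a transient server-side problem; none showing up is what suggested bad workers here.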
Comment 4•5 years ago
I wonder if https://bugzilla.mozilla.org/show_bug.cgi?id=1681129 is related - it was just fixed in the last 24h, and made changes to the whitelists.
Comment 5•5 years ago
:bhearsum was right: it was a side effect of bug 1681129 being applied. We believe the additional scriptworker IPs were added and applied from a temporary branch that was never merged to master, so when bug 1681129 was applied it overwrote those "temporary" changes and started causing 403s for approximately 4/5 of the requests from the scriptworker pool. I created a PR to add the other scriptworker IPs and applied it, which has resolved this bug, but there may be other missing IPs we still need to add.
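As a rough illustration of why an incomplete allowlist shows up as 403s from only some workers, here is a minimal sketch. Everything in it is assumed: the addresses are reserved documentation ranges, not the real scriptworker IPs, and the bug does not say how the allowlist is actually enforced. It only models "source IP not in the list means 403".

```python
import ipaddress


def is_allowed(source_ip, allowlist):
    """Return True if source_ip falls inside any CIDR block in allowlist."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in allowlist)


def status_for(source_ip, allowlist):
    """Model the admin endpoint: 200 for allowed sources, 403 otherwise."""
    return 200 if is_allowed(source_ip, allowlist) else 403


# Hypothetical egress IPs for five workers (documentation ranges only).
worker_ips = [
    "192.0.2.10",
    "198.51.100.11",
    "198.51.100.12",
    "203.0.113.13",
    "203.0.113.14",
]

# Assumed state after the overwrite: only the original range remains.
allowlist_after_overwrite = ["192.0.2.0/24"]
# Assumed state after the fix: the missing ranges are added back.
allowlist_after_fix = ["192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24"]

before = [status_for(ip, allowlist_after_overwrite) for ip in worker_ips]
after = [status_for(ip, allowlist_after_fix) for ip in worker_ips]
print(before)
print(after)
```

Because each worker keeps the same source address for its lifetime, a given worker either always passes or always fails, which matches the all-green or all-red pattern noted in comment 2.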
Comment 10•5 years ago
jbuck or oremj - do we need a similar change for balrog (and shipit?) stage? I just hit this with balrogscript stage scriptworkers trying to talk to balrog stage admin: https://firefoxci.taskcluster-artifacts.net/JCAUz6pRTpCGBhBgKRCZpQ/0/public/logs/live_backing.log
Comment 11•5 years ago
Fixed for balrog stage admin.
Comment 12•5 years ago
Also updated shipit api dev.
Comment 13•5 years ago
(In reply to Jeremy Orem [:oremj] from comment #11)
> Fixed for balrog stage admin.
Working well again, thanks!