Closed Bug 1846477 Opened 2 years ago Closed 2 years ago

shipit: Cannot kick off a release of Firefox for Android: JSONDecodeError('Expecting value: line 1 column 1 (char 0)')

Categories

(Release Engineering :: Applications: Shipit, defect, P1)

Tracking

(firefox116 blocking fixed, firefox117 blocking fixed, firefox118 fixed)

RESOLVED FIXED
Tracking Status
firefox116 blocking fixed
firefox117 blocking fixed
firefox118 --- fixed

People

(Reporter: jlorenzo, Assigned: jlorenzo)

Details

(Keywords: regression)

Attachments

(2 files)

12 hours ago, :RyanVM was not able to start a release of Firefox for Android on shipit. Firefox for Android is hosted on GitHub[1], so shipit uses a GitHub API endpoint[2] to fetch the list of available branches. Here's the error we've been getting since yesterday[3]:

Uncaught exception:
  File "/builds/worker/poetry_cache/virtualenvs/shipit-api-GPaIsmRp-py3.9/lib/python3.9/site-packages/flask/app.py", line 2529, in wsgi_app
    response = self.full_dispatch_request()
  File "/builds/worker/poetry_cache/virtualenvs/shipit-api-GPaIsmRp-py3.9/lib/python3.9/site-packages/flask/app.py", line 1825, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/builds/worker/poetry_cache/virtualenvs/shipit-api-GPaIsmRp-py3.9/lib/python3.9/site-packages/flask_cors/extension.py", line 176, in wrapped_function
    return cors_after_request(app.make_response(f(*args, **kwargs)))
  File "/builds/worker/poetry_cache/virtualenvs/shipit-api-GPaIsmRp-py3.9/lib/python3.9/site-packages/flask/app.py", line 1823, in full_dispatch_request
    rv = self.dispatch_request()
  File "/builds/worker/poetry_cache/virtualenvs/shipit-api-GPaIsmRp-py3.9/lib/python3.9/site-packages/flask/app.py", line 1799, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "/builds/worker/poetry_cache/virtualenvs/shipit-api-GPaIsmRp-py3.9/lib/python3.9/site-packages/connexion/decorators/decorator.py", line 68, in wrapper
    response = function(request)
  File "/builds/worker/poetry_cache/virtualenvs/shipit-api-GPaIsmRp-py3.9/lib/python3.9/site-packages/connexion/decorators/uri_parsing.py", line 149, in wrapper
    response = function(request)
  File "/builds/worker/poetry_cache/virtualenvs/shipit-api-GPaIsmRp-py3.9/lib/python3.9/site-packages/connexion/decorators/validation.py", line 399, in wrapper
    return function(request)
  File "/builds/worker/poetry_cache/virtualenvs/shipit-api-GPaIsmRp-py3.9/lib/python3.9/site-packages/connexion/decorators/response.py", line 112, in wrapper
    response = function(request)
  File "/builds/worker/poetry_cache/virtualenvs/shipit-api-GPaIsmRp-py3.9/lib/python3.9/site-packages/connexion/decorators/parameter.py", line 120, in wrapper
    return function(**kwargs)
  File "/app/src/src/shipit_api/admin/github.py", line 135, in list_github_branches
    content = query_api(query)
  File "/app/src/src/shipit_api/admin/github.py", line 90, in query_api
    j = req.json()
  File "/builds/worker/poetry_cache/virtualenvs/shipit-api-GPaIsmRp-py3.9/lib/python3.9/site-packages/requests/models.py", line 975, in json
    raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
<class 'requests.exceptions.JSONDecodeError'>
JSONDecodeError('Expecting value: line 1 column 1 (char 0)')

This API call hasn't worked on the production instance since at least yesterday. The problem is specific to production: I was not able to reproduce it on the staging instance or locally with exactly the same code as the one currently in production. I don't think the token is the problem either; otherwise, an HTTP error would have been reported. Here, we got a 200 response whose payload might have been truncated.
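For illustration only, here is a minimal sketch of how a 200 response with an unusable body could be reported more clearly than a bare JSONDecodeError. It is not the code in shipit's github.py: the function name parse_json_body, the exception UnexpectedPayloadError, and the max_preview parameter are all hypothetical. The sketch takes the status code and body text as plain arguments so it runs with the standard library alone. Note that "Expecting value: line 1 column 1 (char 0)" is the message Python's json module produces when the body is empty or starts with a non-JSON character.

```python
import json


class UnexpectedPayloadError(ValueError):
    """Hypothetical error type: the response body is not the JSON we expected."""


def parse_json_body(status_code, body, max_preview=200):
    """Decode a response body, keeping a preview of it in any error raised.

    Hypothetical helper: status_code and body stand in for the status
    and text of an HTTP response object.
    """
    preview = body[:max_preview]
    if status_code != 200:
        raise UnexpectedPayloadError(f"HTTP {status_code}: {preview!r}")
    if not body.strip():
        # An empty body is what yields "Expecting value: line 1 column 1 (char 0)".
        raise UnexpectedPayloadError("HTTP 200 with an empty body")
    try:
        return json.loads(body)
    except json.JSONDecodeError as exc:
        raise UnexpectedPayloadError(
            f"HTTP 200 but body is not valid JSON "
            f"({exc.msg} at char {exc.pos}): {preview!r}"
        ) from exc
```

With something along these lines, the Sentry report would show the first characters of the payload, which helps tell an empty body apart from an HTML error page or a truncated document.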

Were there any recent changes in the way HTTP requests are handled in the production environment? More precisely, in the shipitapi-prod project?

Marking this bug as P1/S1 because it prevents new versions of Firefox for Android from being shipped.

[1] https://github.com/mozilla-mobile/firefox-android
[2] https://github.com/mozilla-releng/shipit/blob/91fefdf09c84c157015c73d794a247582c0095a9/api/src/shipit_api/admin/github.py#L90
[3] https://mozilla.sentry.io/issues/4356896111/?alert_rule_id=14011861&alert_type=issue&project=6262522&referrer=slack

My first thought was a problem with the API token, but I logged into moz-releng-automation yesterday and none of the access tokens had an expiry.

We could trigger relpro actions manually as a workaround.

I thought about manually triggering relpro too 🙂 I'm not sure we want to go down that path yet. If we were to do it, then shipit and product-details would become outdated. That could cause issues for all consumers of product-details.

Another data point: it was working prior to the product-details deploy, but was broken afterward. None of the code changes look suspicious, but maybe the deploy picked something up on the cloud ops side?

Keywords: regression

We unstuck the Firefox for Android release by calling the ShipIt API manually, but the root issue remains.

Downgrading severity to S2 thanks to the workaround mentioned in comment 5.

Severity: S1 → S2
Priority: P1 → --
Priority: -- → P1

This turned out to be an error caused by an update to one of our dependencies. We were running poetry install before copying the poetry.lock file into the container, so we were getting a different dependency tree on every poetry install.
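As a rough illustration of the ordering problem described above, the sketch below scans Dockerfile text and reports whether a COPY (or ADD) line naming poetry.lock appears before the first RUN line that calls poetry install. This is an assumption-laden check, not shipit's actual build tooling: the function name lockfile_copied_before_install is hypothetical, and the scan is purely textual, so it will not recognize a broad instruction such as "COPY . /app" that brings the lock file in without naming it.

```python
import re

COPY_RE = re.compile(r"^(COPY|ADD)\b", re.IGNORECASE)
RUN_RE = re.compile(r"^RUN\b", re.IGNORECASE)


def lockfile_copied_before_install(dockerfile_text):
    """Return True if poetry.lock is copied before the first poetry install.

    Hypothetical helper. Returns True when there is no poetry install
    step at all, since there is then nothing to check.
    """
    copy_line = None
    install_line = None
    for number, raw in enumerate(dockerfile_text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if copy_line is None and COPY_RE.match(line) and "poetry.lock" in line:
            copy_line = number
        if install_line is None and RUN_RE.match(line) and "poetry install" in line:
            install_line = number
    if install_line is None:
        return True
    return copy_line is not None and copy_line < install_line
```

A check of this kind could run in CI so that a later reshuffle of the Dockerfile does not silently reintroduce unpinned installs.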

Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED

Great job :bhearsum, :ahal, :gabriel, and :jbuck for finding the root cause and fixing it! Special thanks to :jbuck, who helped RelEng out even though it turned out to be an application issue.

Component: Operations: Releng → Applications: Shipit
Product: Cloud Services → Release Engineering
QA Contact: gabriel