crash when matrix notification times out
Categories
(Release Engineering :: Applications: Shipit, defect, P3)
Tracking
(Not tracked)
People
(Reporter: jlorenzo, Unassigned)
Details
From github: https://github.com/mozilla-releng/shipit/issues/1307.
https://mozilla.sentry.io/issues/4045049635/ shows a case where the call to taskcluster's notify service takes a while and gunicorn ends up killing us with SIGABRT and a [CRITICAL] WORKER TIMEOUT error message.

SystemExit: 1
  File "flask/app.py", line 2548, in __call__
    return self.wsgi_app(environ, start_response)
  File "flask/app.py", line 2525, in wsgi_app
    response = self.full_dispatch_request()
  File "flask/app.py", line 1820, in full_dispatch_request
    rv = self.dispatch_request()
  File "flask/app.py", line 1796, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "connexion/decorators/decorator.py", line 68, in wrapper
    response = function(request)
  File "connexion/decorators/uri_parsing.py", line 149, in wrapper
    response = function(request)
  File "connexion/decorators/validation.py", line 196, in wrapper
    response = function(request)
  File "connexion/decorators/validation.py", line 399, in wrapper
    return function(request)
  File "connexion/decorators/response.py", line 112, in wrapper
    response = function(request)
  File "connexion/decorators/parameter.py", line 120, in wrapper
    return function(**kwargs)
  File "/usr/local/lib/python3.9/site-packages/shipit_api/admin/xpi.py", line 184, in phase_signoff
    notify_via_matrix("xpi", f"Phase {phase} of {release.name} signed off by {users_email}")
  File "/usr/local/lib/python3.9/site-packages/shipit_api/admin/api.py", line 51, in notify_via_matrix
    notify.matrix({"roomId": room_id, "body": f"{owners}: {message}"})
  File "taskcluster/generated/notify.py", line 101, in matrix
    return self._makeApiCall(self.funcinfo["matrix"], *args, **kwargs)
  File "taskcluster/client.py", line 269, in _makeApiCall
    response = self._makeHttpRequest(entry['method'], _route, payload)
  File "taskcluster/client.py", line 490, in _makeHttpRequest
    response = utils.makeSingleHttpRequest(method, url, payload, headers)
  File "taskcluster/utils.py", line 292, in makeSingleHttpRequest
    response = obj.request(method.upper(), url, data=payload, headers=headers, allow_redirects=False)
  File "requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File "requests/sessions.py", line 587, in request
    resp = self.send(prep, **send_kwargs)
  File "requests/sessions.py", line 701, in send
    r = adapter.send(request, **kwargs)
  File "requests/adapters.py", line 489, in send
    resp = conn.urlopen(
  File "urllib3/connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
  File "urllib3/connectionpool.py", line 449, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
    # Permission is hereby granted, free of charge, to any person obtaining a copy
  File "urllib3/connectionpool.py", line 444, in _make_request
    httplib_response = conn.getresponse()
  File "http/client.py", line 1377, in getresponse
    response.begin()
  File "http/client.py", line 320, in begin
    version, status, reason = self._read_status()
  File "http/client.py", line 281, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "socket.py", line 704, in readinto
    return self._sock.recv_into(b)
  File "ssl.py", line 1242, in recv_into
    return self.read(nbytes, buffer)
  File "ssl.py", line 1100, in read
    return self._sslobj.read(len, buffer)
  File "gunicorn/workers/base.py", line 203, in handle_abort
    sys.exit(1)
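The traceback shows the worker blocked in a socket read on the notify call (the `obj.request(...)` frame passes no `timeout`) until gunicorn aborted it. One possible mitigation, sketched here and not taken from the shipit code base, is to treat the notification as best effort and stop waiting after a deadline shorter than gunicorn's worker timeout. `call_with_deadline` is a hypothetical helper name, and the deadline passed to it would be an assumption to tune.

```python
import logging
import threading

logger = logging.getLogger(__name__)


def call_with_deadline(func, timeout_seconds, *args, **kwargs):
    """Run func(*args, **kwargs) in a daemon thread and wait at most timeout_seconds.

    Returns True if func finished in time without raising, False otherwise.
    Hypothetical helper: the call is not cancelled on timeout, it is only
    abandoned so that the request handler can return.
    """
    outcome = {}

    def runner():
        try:
            func(*args, **kwargs)
            outcome["ok"] = True
        except Exception:
            # A failed notification should not fail the signoff request.
            logger.exception("best-effort call failed")
            outcome["ok"] = False

    thread = threading.Thread(target=runner, daemon=True)
    thread.start()
    thread.join(timeout_seconds)
    if thread.is_alive():
        logger.warning("best-effort call still running after %.1fs, giving up", timeout_seconds)
        return False
    return outcome.get("ok", False)
```

In `phase_signoff` this could wrap the existing call, for example `call_with_deadline(notify_via_matrix, 10, "xpi", message)`, where 10 seconds is an illustrative value. The trade-off is that a slow notification is dropped from the request's point of view instead of crashing the worker.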
Change performed by the Move to Bugzilla add-on.
Comment 1 (Reporter) • 28 days ago
From github: https://github.com/mozilla-releng/shipit/issues/1307#issuecomment-1488504305
:jcristau said:
https://mozilla.sentry.io/issues/3943638203/ is a similar issue when calling queue.listLatestArtifacts, and https://mozilla.sentry.io/issues/3869583111/ in queue.listTaskGroup.
Comment 2 (Reporter) • 28 days ago
From github: https://github.com/mozilla-releng/shipit/issues/1307#issuecomment-1488814296
:jcristau said:
Possibly relevant:
https://docs.gunicorn.org/en/stable/settings.html#timeout
https://docs.gunicorn.org/en/stable/settings.html#worker-abort
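For reference, a minimal sketch of how those two settings might look in a `gunicorn.conf.py`. The values are assumptions for illustration, not shipit's actual configuration. `timeout` and the `worker_abort(worker)` server hook are the gunicorn settings linked above; the hook runs when a worker receives SIGABRT, which generally happens on timeout.

```python
# gunicorn.conf.py (illustrative values, not the deployed configuration)
import traceback

# Seconds a worker may stay silent before the arbiter kills it.
# gunicorn's default is 30; 60 here is an assumption.
timeout = 60


def worker_abort(worker):
    """Log where the worker was stuck when gunicorn aborted it."""
    stack = "".join(traceback.format_stack())
    worker.log.warning("worker %s aborted, current stack:\n%s", worker.pid, stack)
```

Raising `timeout` only hides slow upstream calls; the hook at least records which call was blocking when the abort happened.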