Bug 1185098
Opened 10 years ago
Closed 10 years ago
Convince celery to re-try failed connections
Categories
(Infrastructure & Operations :: RelOps: General, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: dustin, Assigned: dustin)
Attachments
(1 file: image/png, 266.81 KB)
Jordan's seeing
----
2015-07-17 12:09:40,499 [relengapi.blueprints.archiver] Creating new celery task and task tracker for: try-9ae2bc693a00.tar.gz_testing_mozharness
2015-07-17 12:09:40,576 [relengapi.blueprints.archiver] checking status of task id try-9ae2bc693a00.tar.gz_testing_mozharness: current state PENDING
...
...
2015-07-17 12:09:49,586 [relengapi.blueprints.archiver] generating GET URL to try-2a9d42bfc513.tar.gz/testing/mozharness, expires in 300s
2015-07-17 12:09:49,650 [relengapi.app] Exception on /archiver/status/try-9ae2bc693a00.tar.gz_testing_mozharness [GET]
Traceback (most recent call last):
  File "/data/www/relengapi/virtualenv/lib/python2.7/site-packages/flask/app.py", line 1475, in full_dispatch_request
    rv = self.dispatch_request()
  File "/data/www/relengapi/virtualenv/lib/python2.7/site-packages/flask/app.py", line 1461, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/data/www/relengapi/virtualenv/lib/python2.7/site-packages/newrelic-2.46.0.37/newrelic/hooks/framework_flask.py", line 40, in _nr_wrapper_handler_
    return wrapped(*args, **kwargs)
  File "/data/www/relengapi/virtualenv/lib/python2.7/site-packages/relengapi/lib/api.py", line 103, in replacement
    result = wrapped(*args, **kwargs)
  File "/data/www/relengapi/virtualenv/lib/python2.7/site-packages/relengapi/blueprints/archiver/__init__.py", line 80, in task_status
    log.info("checking status of task id {}: current state {}".format(task_id, task.state))
  File "/data/www/relengapi/virtualenv/lib/python2.7/site-packages/celery/result.py", line 398, in state
    return self._get_task_meta()['status']
  File "/data/www/relengapi/virtualenv/lib/python2.7/site-packages/celery/result.py", line 341, in _get_task_meta
    return self._maybe_set_cache(self.backend.get_task_meta(self.id))
  File "/data/www/relengapi/virtualenv/lib/python2.7/site-packages/celery/backends/amqp.py", line 163, in get_task_meta
    binding.declare()
  File "/data/www/relengapi/virtualenv/lib/python2.7/site-packages/kombu/entity.py", line 504, in declare
    self.exchange.declare(nowait)
  File "/data/www/relengapi/virtualenv/lib/python2.7/site-packages/kombu/entity.py", line 166, in declare
    nowait=nowait, passive=passive,
  File "/data/www/relengapi/virtualenv/lib/python2.7/site-packages/amqp/channel.py", line 613, in exchange_declare
    self._send_method((40, 10), args)
  File "/data/www/relengapi/virtualenv/lib/python2.7/site-packages/amqp/abstract_channel.py", line 56, in _send_method
    self.channel_id, method_sig, args, content,
  File "/data/www/relengapi/virtualenv/lib/python2.7/site-packages/amqp/method_framing.py", line 221, in write_method
    write_frame(1, channel, payload)
  File "/data/www/relengapi/virtualenv/lib/python2.7/site-packages/amqp/transport.py", line 182, in write_frame
    frame_type, channel, size, payload, 0xce,
  File "/usr/lib/python2.7/socket.py", line 224, in meth
    return getattr(self._sock,name)(*args)
error: [Errno 32] Broken pipe
----
which, ideally, celery would just automatically retry.
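For illustration only, here is a rough sketch of what "automatically retry" could look like at the call site, assuming the failure surfaces as a socket error with EPIPE or ECONNRESET as in the traceback above. The helper name `call_with_retry` and its parameters are hypothetical, not part of Celery or relengapi, and a real fix would also need the wrapped call to re-establish the broken connection rather than reuse it.

```python
import errno
import socket
import time


def call_with_retry(func, attempts=3, delay=0.5, sleep=time.sleep):
    """Call func() and retry when it fails with a broken-connection error.

    Hypothetical helper: only EPIPE and ECONNRESET are treated as
    retryable; any other error, or the last failed attempt, is re-raised.
    """
    retryable = (errno.EPIPE, errno.ECONNRESET)
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except socket.error as exc:
            if exc.errno not in retryable or attempt == attempts:
                raise
            # Linear backoff before the next attempt.
            sleep(delay * attempt)
```

In the view from the traceback, this might wrap the state lookup as `call_with_retry(lambda: task.state)`, again assuming that the backend reconnects on the next access.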
Comment 1•10 years ago
The screenshot covers the last 3 hours.
It looks like we hit the error a few times, which results in the entire rev's list of builders failing.
I have updated the archiver client to fall back on getting the archive from hg.mozilla.org directly, so this doesn't cause bustage in production.
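The fallback described in this comment could be sketched roughly as follows. This is not the actual archiver client code: `fetch_with_fallback` and its arguments are hypothetical, and the two callables stand in for "ask the archiver" and "download from hg.mozilla.org directly".

```python
def fetch_with_fallback(primary, fallback, log=None):
    """Return primary(); if it raises, log the error and return fallback().

    Hypothetical helper: `primary` and `fallback` are zero-argument
    callables that each return the archive (or a URL to it).
    """
    try:
        return primary()
    except Exception as exc:
        if log is not None:
            log("primary source failed (%s); using fallback" % exc)
        return fallback()
```

Catching every exception is a deliberate simplification here; a production client would likely narrow this to network and HTTP errors.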
Comment 2•10 years ago
Celery seems to recommend not using RabbitMQ as a result backend :)
http://celery.readthedocs.org/en/latest/configuration.html#amqp-backend-settings
Maybe we should try Redis or a database.
Comment 3•10 years ago
As a bonus, making the backend a database may mean that I can remove my Tracker table.
No longer blocks: 1184722
Comment 4•10 years ago (Assignee)
Hm, I thought I commented on this a week or so ago. I don't think there's anything to do here: Celery *does* retry its AMQP frontend connections, and we're no longer using it on the backend. So, fixed by virtue of using the MySQL backend.
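As a hedged illustration of that setup, a Celery configuration that keeps AMQP as the broker but stores results in MySQL might look something like the following. The setting names are the old-style uppercase ones from Celery 3.1-era releases; the credentials, host, and database name are placeholders, not the values relengapi actually uses.

```python
# Broker (task queue) stays on AMQP. Placeholder URL.
BROKER_URL = "amqp://guest:guest@localhost:5672//"

# Result backend moves to a database via the SQLAlchemy-based "db+" scheme.
# Placeholder credentials and database name.
CELERY_RESULT_BACKEND = "db+mysql://relengapi:secret@localhost/relengapi"

# A Redis result backend, the other option raised in this bug, would
# instead use a URL of the form "redis://localhost:6379/0".
```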
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED