Closed
Bug 1339288
Opened 7 years ago
Closed 6 years ago
Consider using Celery acks_late to prevent loss of tasks if worker crashes
Categories
(Tree Management :: Treeherder: Infrastructure, defect, P2)
Tree Management
Treeherder: Infrastructure
Tracking
(Not tracked)
RESOLVED
WONTFIX
People
(Reporter: emorley, Unassigned)
Details
Turns out that if the worker crashes we can lose tasks. Whilst that should be rare, I wonder if we can also hit this case when Heroku terminates dynos due to deploys or the 24-hourly dyno restart.

"""
Should I use retry or acks_late?

Answer: Depends. It's not necessarily one or the other, you may want to use both. Task.retry is used to retry tasks, notably for expected errors that are catchable with the try: block. The AMQP transaction is not used for these errors: if the task raises an exception it is still acknowledged!

The acks_late setting would be used when you need the task to be executed again if the worker (for some reason) crashes mid-execution. It's important to note that the worker is not known to crash, and if it does it is usually an unrecoverable error that requires human intervention (bug in the worker, or task code).

...

So use retry for Python errors, and if your task is idempotent combine that with acks_late if that level of reliability is required.
"""

See: http://docs.celeryproject.org/en/3.1/faq.html#faq-acks-late-vs-retry
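To illustrate the difference the FAQ describes, here is a minimal sketch in plain Python (a simulation of the acknowledgement semantics, not Celery's actual implementation — the function and variable names are illustrative only). With early ack (the default), the message is acknowledged on receipt, so a crash mid-task loses it; with acks_late, the ack happens only after the task completes, so a crashed task stays queued for redelivery:

```python
# Illustrative simulation of early vs. late acknowledgement (not Celery API).
from collections import deque

def run_worker(queue, handler, acks_late):
    """Consume one task from the queue.

    With acks_late=True the task is only removed (acknowledged) after the
    handler finishes, so a crash mid-execution leaves it on the queue.
    """
    if acks_late:
        task = queue[0]          # peek: message stays queued until acked
        handler(task)            # may raise, simulating a worker crash
        queue.popleft()          # ack only after successful execution
    else:
        task = queue.popleft()   # ack immediately on receipt (default)
        handler(task)

def crashing_handler(task):
    raise RuntimeError("worker died mid-task")

early = deque(["send-email"])
try:
    run_worker(early, crashing_handler, acks_late=False)
except RuntimeError:
    pass
print(len(early))  # 0 -- task was acked before the crash, so it is lost

late = deque(["send-email"])
try:
    run_worker(late, crashing_handler, acks_late=True)
except RuntimeError:
    pass
print(len(late))   # 1 -- task still queued, will be redelivered
```

Note the idempotency caveat from the FAQ still applies: with late acks a task that crashed partway through will run again from the start, so it must be safe to execute more than once.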
Reporter
Updated 7 years ago
Component: Treeherder → Treeherder: Infrastructure
Reporter
Comment 1 • 6 years ago
We don't ever see worker crashes, so this isn't worth the effort.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX