Closed Bug 1124269 Opened 10 years ago Closed 10 years ago

Shorten 600s retry timeouts

Categories

(Tree Management :: Treeherder: Data Ingestion, defect, P2)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: emorley, Assigned: emorley)

Attachments

(1 file)

The 600s retry timeouts drastically lengthen the time to ingest whenever there is a transient hiccup, e.g. in log parsing, and probably in other tasks too (pushlog, ...).
I can find three instances of retries:

log parsing (600s): https://github.com/mozilla/treeherder-service/blob/6bf711fb4a50e1bd88751b8a1d87b2ae6f789c16/treeherder/log_parser/tasks.py#L95
bugzilla submission (immediate retry afaict): https://github.com/mozilla/treeherder-service/blob/b16eff0d2b2c862f4130d306a66d13a9344a9fe0/treeherder/etl/tasks/tbpl_tasks.py#L40
elasticsearch submission (immediate retry afaict): https://github.com/mozilla/treeherder-service/blob/b16eff0d2b2c862f4130d306a66d13a9344a9fe0/treeherder/etl/tasks/tbpl_tasks.py#L23

We don't retry for pushlog ingestion, since it's on a schedule anyway.

We should probably add a countdown value for the two that don't have one, and shorten the 600s one for log parsing. I think we should use an exponential back-off to preserve much of the protection, whilst keeping the "just needs one retry" common case responsive.
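For illustration, an exponential back-off of the kind proposed here might be computed roughly as follows. This is only a sketch: the function name, the 60s base delay and the 600s cap are assumptions made for the example, not values taken from the treeherder-service code.

```python
def exponential_countdown(retries, base=60, cap=600):
    """Return the delay in seconds before the next retry.

    `retries` is the number of retries already attempted (0 for the
    first retry). `base` and `cap` are hypothetical values chosen for
    this example only.
    """
    # Double the delay on each retry, but never wait longer than `cap`.
    return min(cap, base * 2 ** retries)


if __name__ == "__main__":
    for retries in range(6):
        print(retries, exponential_countdown(retries))
```

With these assumed values a single transient failure is retried after one minute, while a task that keeps failing backs off towards the original 10 minute delay.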
Assignee: nobody → emorley
Status: NEW → ASSIGNED
Priority: -- → P2
Attachment #8552589 - Flags: review?(mdoglio)
Comment on attachment 8552589
Add smarter celery retry times

Build failing :(
Attachment #8552589 - Flags: review?(mdoglio) → review-
Comment on attachment 8552589
Add smarter celery retry times

Travis is green now :-)
Attachment #8552589 - Flags: review- → review?(mdoglio)
No longer blocks: 1084493
Attachment #8552589 - Flags: review?(mdoglio) → review+
Commits pushed to master at https://github.com/mozilla/treeherder-service

https://github.com/mozilla/treeherder-service/commit/a09c4053cbbeae03a94b0ed61354bb413ebdd5c7
Bug 1124269 - Retry parse-log tasks sooner
Currently parse-log tasks retry after 10 minutes. With this change, the first retry is now after 1 minute, then the time for each subsequent retry lengthens by a further minute each time.

https://github.com/mozilla/treeherder-service/commit/56fda4107536010465911a07ebc51cb907ced52c
Bug 1124269 - Adjust delay before retrying failure classification tasks
With this change, the first retry is now after 1 minute, then the time for each subsequent retry lengthens by a further minute each time.
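The schedule described in these commit messages (1 minute, then a further minute each time) is a linear back-off. A minimal sketch of how such a delay could be computed is below; the helper name and `step` parameter are hypothetical and the actual commits may be written differently.

```python
def linear_countdown(retries, step=60):
    """Return the delay in seconds before the next retry.

    `retries` is the number of retries already attempted, so the first
    retry waits `step` seconds, the second 2 * `step`, and so on.
    The name and signature are assumptions for this example.
    """
    return (retries + 1) * step


# Inside a bound Celery task, the value would typically be passed to
# retry(), along the lines of:
#     raise self.retry(exc=exc, countdown=linear_countdown(self.request.retries))

if __name__ == "__main__":
    for retries in range(5):
        print(retries, linear_countdown(retries))
```

Compared with a fixed 600s countdown, this keeps the common "just needs one retry" case down to a one minute wait.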
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED