Bug 1307782
Opened 8 years ago
Closed 8 years ago
Raise the Celery task time_limit for the buildbot ingestion tasks
Categories
(Tree Management :: Treeherder: Data Ingestion, defect, P1)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: emorley, Assigned: camd)
During the prod Heroku migration, when ingestion was resumed on Heroku, the first couple of builds-4hr task runs failed with:

Oct 05 14:03:00 treeherder-prod app/worker_buildapi_4hr.1: TimeLimitExceeded: TimeLimitExceeded(180,)
(https://papertrailapp.com/systems/treeherder-prod/events?centered_on_id=720310299303125003)

This is because, with an empty memcached, the builds-4hr ingestion can't skip previously seen jobs, so each run takes longer. Combined with high load from the simultaneous Pulse jobs ingestion catch-up, this pushed the task past the 180s time limit currently set here:
https://github.com/mozilla/treeherder/blob/f7e2c5cd423244d2963055fd2603e650ada845c3/treeherder/etl/tasks/buildapi_tasks.py#L31

It also appears that Celery timeouts are not reported in New Relic. We should raise the time limit and find out why the timeouts aren't caught by the NR agent.
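To illustrate why an empty cache lengthens a run, here is a minimal sketch of the "skip previously seen jobs" idea. It is not Treeherder's actual ingestion code: the `ingest` function, the job dictionaries, and the use of a Python `set` in place of memcached are all assumptions made for illustration.

```python
def ingest(jobs, seen_cache):
    """Process only jobs whose id is not in the cache; return how many were processed."""
    processed = 0
    for job in jobs:
        job_id = job["id"]
        if job_id in seen_cache:
            # Already handled on an earlier run, so skip the expensive work.
            continue
        # The expensive per-job work (parsing, database writes) would happen here.
        seen_cache.add(job_id)
        processed += 1
    return processed


# Hypothetical feed of 1000 jobs, standing in for one builds-4hr download.
jobs = [{"id": n} for n in range(1000)]

warm_cache = set(range(990))  # most jobs were seen on earlier runs
cold_cache = set()            # empty cache, as right after the migration

warm = ingest(jobs, warm_cache)
cold = ingest(jobs, cold_cache)
print(warm, cold)
```

With the warm cache only the 10 unseen jobs are processed, whereas the cold cache forces all 1000 through the expensive path, which is the situation where a fixed time limit is most likely to be hit.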
Comment 1 (Reporter) • 8 years ago
The above issue was worked around during the Heroku migration by making the worker_buildapi_4hr dyno type use a P-M dyno instead of a P2.
Comment 2 (Reporter) • 8 years ago
builds-4hr ingestion is still timing out occasionally: https://papertrailapp.com/systems/treeherder-prod/events?q=TimeLimitExceeded+program%3Aworker_buildapi
Priority: P2 → P1
Comment 3 (Reporter) • 8 years ago
Plus fetch-allthethings: https://papertrailapp.com/systems/treeherder-prod/events?centered_on_id=721067034422829071

Cameron, I don't suppose you could raise all the time limits in buildapi_tasks.py for me, land that on master and then cherry-pick just that commit to the production branch? I have to head out shortly.
Flags: needinfo?(cdawson)
Comment 4 • 8 years ago
Commit pushed to master at https://github.com/mozilla/treeherder

https://github.com/mozilla/treeherder/commit/c88a9fb3fa5fd4d60dc6d3438fee5c65c9e91dae
Bug 1307782 - Raise the Celery task time_limit for the buildbot ingestion tasks
Comment 5 (Assignee) • 8 years ago
Yep, working on that now. Wasn't sure what numbers would be best, but tried 10 for each, and 15 for fetch-allthethings.
Comment 7 (Reporter) • 8 years ago
Many thanks! :-)
Assignee: nobody → cdawson
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED