Closed Bug 1340132 Opened 8 years ago Closed 6 years ago

Stop using --maxtasksperchild with celery now datasource isn't leaking

Tracking

(Not tracked)

Status:

RESOLVED FIXED

People

(Reporter: emorley, Assigned: emorley)

References

Details

Attachments

(1 file)

[treeherder] mozilla:celery-no-maxtasksperchild > mozilla:master 8 years ago GitHub Autolander Bot 47 bytes, text/x-github-pull-request	camd : review+	Details \| Review

Ed Morley [:emorley]

Assignee

Description

•

8 years ago

Bug 1312007 found datasource was leaking for web requests. This is also presumably why the background tasks used to leak too, and so why we had had to use the `--maxtasksperchild` option for celery workers until now. However since we're no longer using datasource, we should try removing the option since: 1) It will improve celery worker performance (some of our workers are using a --maxtasksperchild value of 20, which is really inefficient when some tasks only take a second or two each) 2) It reduces the number of background tasks that don't end up reported to New Relic (since celery forcibly kills the worker, see https://docs.newrelic.com/docs/agents/python-agent/back-end-services/python-agent-celery) If testing shows us to still have small leaks we could always still use the option but at a much higher threshold.

Ed Morley [:emorley]

Assignee

Updated

•

8 years ago

Blocks: 1059325

Ed Morley [:emorley]

Assignee

Comment 1

•

8 years ago

Testing for 30 minutes on stage showed: * pushlog: unchanged (previously maxtasksperchild=500) * buildapi_pending: unchanged (previously maxtasksperchild=20) * buildapi_running: unchanged (previously maxtasksperchild=20) * buildapi_4hr: unchanged (previously maxtasksperchild=20) * store_pulse_jobs: unchanged RAM usage, lower dyno load (previously maxtasksperchild=20) * store_pulse_resultsets: unchanged (previously maxtasksperchild=20) * default: unchanged (previously maxtasksperchild=50) * hp: unchanged (previously maxtasksperchild=50) * log_parser: increase in RAM usage, would need to investigate (previously maxtasksperchild=50) 30 minutes isn't long enough for a proper test, but at first glance it seems that most workers are unaffected by leaks any more, store_pulse_jobs performs much better when not having to restart every 20 jobs, and the log parser may need to be excluded initially, until further investigated.

GitHub Autolander Bot

Comment 2

•

8 years ago

Attached file [treeherder] mozilla:celery-no-maxtasksperchild > mozilla:master — Details

Ed Morley [:emorley]

Assignee

Comment 3

•

8 years ago

Comment on attachment 8864662 [details] [review] [treeherder] mozilla:celery-no-maxtasksperchild > mozilla:master This removes `--maxtasksperchild` usage from all tasks apart from the logparser, since everything but that seemed fine with it when previously tested (comment 1).

Attachment #8864662 - Flags: review?(cdawson)

Cameron Dawson [:camd]

Comment 4

•

8 years ago

Comment on attachment 8864662 [details] [review] [treeherder] mozilla:celery-no-maxtasksperchild > mozilla:master Awesome!

Attachment #8864662 - Flags: review?(cdawson) → review+

Treeherder GitHub Bugbot

Comment 5

•

8 years ago

Commit pushed to master at https://github.com/mozilla/treeherder https://github.com/mozilla/treeherder/commit/f6e320aa92e2af7acf788a665c9111d36bddf02c Bug 1340132 - Reduce usage of --maxtasksperchild with celery Now we're no longer using datasource, the memory leaks previously seen have gone. As such there is no need to make the celery worker processes restart so frequently (they will now only be restarted at the daily Heroku dyno restart), which will improve performance. In addition, this will reduce the number of transactions that don't get reported to New Relic (when the maxtasksperchild threshold is reached, the worker is forcibly killed before it can submit the New Relic payload). The log parser tasks have been left unchanged for now, since they appear to still be leaking.

Ed Morley [:emorley]

Assignee

Updated

•

7 years ago

Component: Treeherder → Treeherder: Infrastructure

Ed Morley [:emorley]

Assignee

Updated

•

6 years ago

Depends on: 1513506

Ed Morley [:emorley]

Assignee

Comment 6

•

6 years ago

I've split out the remaining log parser part to bug 1513506.

Status: NEW → RESOLVED

Closed: 6 years ago

Resolution: --- → FIXED

You need to log in before you can comment on or make changes to this bug.

Bugzilla

Stop using --maxtasksperchild with celery now datasource isn't leaking

Categories

(Tree Management :: Treeherder: Infrastructure, defect, P2)

Tracking

(Not tracked)

People

(Reporter: emorley, Assigned: emorley)

References

Details

Crash Data

Security

(public)

User Story

Attachments

(1 file)

Description

Updated

Comment 1

Comment 2

Comment 3

Comment 4

Comment 5

Updated

Updated

Comment 6

Attachment

General

Description

File Name

Content Type