Closed Bug 1428031 Opened 7 years ago Closed 6 years ago

Intermittent deployment "Deadlock found when trying to get lock" during fixture loading

Categories

(Tree Management :: Treeherder: Infrastructure, defect, P1)

defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: emorley, Assigned: emorley)

Details

Attachments

(2 files)

Twice in a row whilst trying to deploy to production (the third deploy succeeded), the deploy failed during the release phase step, in the `manage.py loaddata` management command, with: `model.FailureClassification(pk=1): (1213, 'Deadlock found when trying to get lock; try restarting transaction')` ...whilst trying to load `fixtures/failure_classification.json`. This is presumably due to contention with other queries running at the time. The annoying thing about this is that the fixture file hadn't actually changed, so nothing needed updating. Perhaps we should refactor our use of fixtures to only update things that have changed? Log excerpt: -----> PRE-DEPLOY: Loading initial data... ... Traceback (most recent call last): ... File "/app/treeherder/model/management/commands/load_initial_data.py", line 14, in handle 'performance_bug_templates') File "/app/.heroku/python/lib/python2.7/site-packages/django/core/management/__init__.py", line 131, in call_command return command.execute(*args, **defaults) File "/app/.heroku/python/lib/python2.7/site-packages/django/core/management/base.py", line 330, in execute output = self.handle(*args, **options) File "/app/.heroku/python/lib/python2.7/site-packages/newrelic/api/function_trace.py", line 107, in literal_wrapper return wrapped(*args, **kwargs) File "/app/.heroku/python/lib/python2.7/site-packages/django/core/management/commands/loaddata.py", line 69, in handle self.loaddata(fixture_labels) File "/app/.heroku/python/lib/python2.7/site-packages/django/core/management/commands/loaddata.py", line 109, in loaddata self.load_label(fixture_label) File "/app/.heroku/python/lib/python2.7/site-packages/django/core/management/commands/loaddata.py", line 175, in load_label obj.save(using=self.using) File "/app/.heroku/python/lib/python2.7/site-packages/django/core/serializers/base.py", line 205, in save models.Model.save_base(self.object, using=using, raw=True, **kwargs) File "/app/.heroku/python/lib/python2.7/site-packages/django/db/models/base.py", line 838, in save_base updated = self._save_table(raw, cls, force_insert, force_update, using, update_fields) File "/app/.heroku/python/lib/python2.7/site-packages/django/db/models/base.py", line 905, in _save_table forced_update) File "/app/.heroku/python/lib/python2.7/site-packages/django/db/models/base.py", line 955, in _do_update return filtered._update(values) > 0 File "/app/.heroku/python/lib/python2.7/site-packages/django/db/models/query.py", line 664, in _update return query.get_compiler(self.db).execute_sql(CURSOR) File "/app/.heroku/python/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 1204, in execute_sql cursor = super(SQLUpdateCompiler, self).execute_sql(result_type) File "/app/.heroku/python/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 899, in execute_sql raise original_exception django.db.utils.OperationalError: Problem installing fixture '/app/treeherder/model/fixtures/failure_classification.json': Could not load model.FailureClassification(pk=1): (1213, 'Deadlock found when trying to get lock; try restarting transaction') Original logs: https://dashboard.heroku.com/apps/treeherder-prod/activity/releases/63347985-98cb-4191-a383-570702583a81 https://dashboard.heroku.com/apps/treeherder-prod/activity/releases/62e50b90-d23b-4c50-bd22-f8d760669bc2
Priority: P2 → P1
Assignee: nobody → emorley
So I had an initial look into this and we seem to just be using the native Django fixtures, so not sure what we can do to fix this? It seems like this is something that Django should handle better, and we should file upstream?
Assignee: emorley → nobody
Django describes[1] fixtures like so on their topic page: > It’s sometimes useful to pre-populate your database with hard-coded data when you’re first setting up an app. and your comment earlier Ed about our use of them: > the fixture file hadn't actually changed, so nothing needed updating makes a really good point that we're misusing them. Having looked at all the fixtures we're loading I think most are candidates for conversion to Field.choices [2]. There are a couple that need some more thought given to them (although there's one that's templating so should probably be in a template) but at least reducing the number we load will hopefull reduce the frequency of deadlocks as a start. [1]: https://docs.djangoproject.com/en/2.0/howto/initial-data/ [2]: https://docs.djangoproject.com/en/2.0/ref/models/fields/#choices
Yeah good ideas :-) Happened again today (the frequency has dropped though): https://dashboard.heroku.com/apps/treeherder-stage/activity/releases/1849
This is still happening multiple times a week (and/or day), eg: https://dashboard.heroku.com/apps/treeherder-stage/activity/releases/2080
Yeah, I just hit it again on stage yesterday, fwiw.
Commit pushed to master at https://github.com/mozilla/treeherder https://github.com/mozilla/treeherder/commit/165ef157470eb877d3191bcedb1529a8e95d2d29 Bug 1428031 - Heroku: Retry load_initial_data up to 5 times (#4186) To work around the intermittent MySQL deadlocks encountered during a small percentage of deployments. This will ensure deployments don't cause email spam to app collaborators and Heroku admins, as well as saving us from having to manually retry the release each time it occurs. The `IGNORE_PREDEPLOY_ERRORS` option has also been removed, since it's mostly redundant with `SKIP_PREDEPLOY`.
Assignee: nobody → emorley
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Commit pushed to master at https://github.com/mozilla/treeherder https://github.com/mozilla/treeherder/commit/47419fde6c841a440f85cd341c612ce5495afc75 Bug 1428031 - Heroku: Increase max load_initial_data retries to 10 (#4203) Since there was still a deploy failure after five attempts here: https://dashboard.heroku.com/apps/treeherder-prototype/activity/releases/2847

Found this ticket by accident, while reading the pre_deploy script. Had the feeling we should revisit it, given the following comment from it:

# TODO: Look into this again when using newer MySQL and Django 2.x.
Flags: needinfo?(armenzg)
Flags: needinfo?(armenzg)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: