Closed
Bug 1428031
Opened 7 years ago
Closed 6 years ago
Intermittent deployment "Deadlock found when trying to get lock" during fixture loading
Categories
(Tree Management :: Treeherder: Infrastructure, defect, P1)
Tree Management
Treeherder: Infrastructure
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: emorley, Assigned: emorley)
Details
Attachments
(2 files)
Twice in a row whilst trying to deploy to production (the third deploy succeeded), the deploy failed during the release phase step, in the `manage.py loaddata` management command, with:
`model.FailureClassification(pk=1): (1213, 'Deadlock found when trying to get lock; try restarting transaction')`
...whilst trying to load `fixtures/failure_classification.json`.
This is presumably due to contention with other queries running at the time.
The annoying thing about this is that the fixture file hadn't actually changed, so nothing needed updating. Perhaps we should refactor our use of fixtures to only update things that have changed?
Log excerpt:
-----> PRE-DEPLOY: Loading initial data...
...
Traceback (most recent call last):
...
File "/app/treeherder/model/management/commands/load_initial_data.py", line 14, in handle
'performance_bug_templates')
File "/app/.heroku/python/lib/python2.7/site-packages/django/core/management/__init__.py", line 131, in call_command
return command.execute(*args, **defaults)
File "/app/.heroku/python/lib/python2.7/site-packages/django/core/management/base.py", line 330, in execute
output = self.handle(*args, **options)
File "/app/.heroku/python/lib/python2.7/site-packages/newrelic/api/function_trace.py", line 107, in literal_wrapper
return wrapped(*args, **kwargs)
File "/app/.heroku/python/lib/python2.7/site-packages/django/core/management/commands/loaddata.py", line 69, in handle
self.loaddata(fixture_labels)
File "/app/.heroku/python/lib/python2.7/site-packages/django/core/management/commands/loaddata.py", line 109, in loaddata
self.load_label(fixture_label)
File "/app/.heroku/python/lib/python2.7/site-packages/django/core/management/commands/loaddata.py", line 175, in load_label
obj.save(using=self.using)
File "/app/.heroku/python/lib/python2.7/site-packages/django/core/serializers/base.py", line 205, in save
models.Model.save_base(self.object, using=using, raw=True, **kwargs)
File "/app/.heroku/python/lib/python2.7/site-packages/django/db/models/base.py", line 838, in save_base
updated = self._save_table(raw, cls, force_insert, force_update, using, update_fields)
File "/app/.heroku/python/lib/python2.7/site-packages/django/db/models/base.py", line 905, in _save_table
forced_update)
File "/app/.heroku/python/lib/python2.7/site-packages/django/db/models/base.py", line 955, in _do_update
return filtered._update(values) > 0
File "/app/.heroku/python/lib/python2.7/site-packages/django/db/models/query.py", line 664, in _update
return query.get_compiler(self.db).execute_sql(CURSOR)
File "/app/.heroku/python/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 1204, in execute_sql
cursor = super(SQLUpdateCompiler, self).execute_sql(result_type)
File "/app/.heroku/python/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 899, in execute_sql
raise original_exception
django.db.utils.OperationalError: Problem installing fixture '/app/treeherder/model/fixtures/failure_classification.json': Could not load model.FailureClassification(pk=1): (1213, 'Deadlock found when trying to get lock; try restarting transaction')
Original logs:
https://dashboard.heroku.com/apps/treeherder-prod/activity/releases/63347985-98cb-4191-a383-570702583a81
https://dashboard.heroku.com/apps/treeherder-prod/activity/releases/62e50b90-d23b-4c50-bd22-f8d760669bc2
Assignee | ||
Comment 1•7 years ago
|
||
And again, but on stage:
https://dashboard.heroku.com/apps/treeherder-stage/activity/releases/1534
Assignee | ||
Comment 2•7 years ago
|
||
And again on stage:
https://dashboard.heroku.com/apps/treeherder-stage/activity/releases/1542
Assignee | ||
Comment 3•7 years ago
|
||
Assignee | ||
Comment 4•7 years ago
|
||
Assignee | ||
Comment 5•7 years ago
|
||
Assignee | ||
Comment 6•7 years ago
|
||
Assignee | ||
Comment 7•7 years ago
|
||
Assignee | ||
Comment 8•7 years ago
|
||
Assignee | ||
Comment 9•7 years ago
|
||
Assignee | ||
Comment 10•7 years ago
|
||
This issue is starting to get annoying.
https://dashboard.heroku.com/apps/treeherder-prod/activity/releases/281
Assignee | ||
Updated•7 years ago
|
Priority: P2 → P1
Assignee | ||
Comment 11•7 years ago
|
||
Assignee | ||
Updated•7 years ago
|
Assignee: nobody → emorley
Assignee | ||
Comment 12•7 years ago
|
||
Assignee | ||
Comment 13•7 years ago
|
||
Assignee | ||
Comment 14•7 years ago
|
||
Assignee | ||
Comment 15•7 years ago
|
||
Assignee | ||
Comment 16•7 years ago
|
||
Assignee | ||
Comment 17•7 years ago
|
||
Assignee | ||
Comment 18•7 years ago
|
||
So I had an initial look into this and we seem to just be using the native Django fixtures, so not sure what we can do to fix this? It seems like this is something that Django should handle better, and we should file upstream?
Assignee | ||
Updated•7 years ago
|
Assignee: emorley → nobody
Comment 19•7 years ago
|
||
Django describes[1] fixtures like so on their topic page:
> It’s sometimes useful to pre-populate your database with hard-coded data when you’re first setting up an app.
and your comment earlier Ed about our use of them:
> the fixture file hadn't actually changed, so nothing needed updating
makes a really good point that we're misusing them.
Having looked at all the fixtures we're loading I think most are candidates for conversion to Field.choices [2]. There are a couple that need some more thought given to them (although there's one that's templating so should probably be in a template) but at least reducing the number we load will hopefull reduce the frequency of deadlocks as a start.
[1]: https://docs.djangoproject.com/en/2.0/howto/initial-data/
[2]: https://docs.djangoproject.com/en/2.0/ref/models/fields/#choices
Assignee | ||
Comment 20•6 years ago
|
||
Yeah good ideas :-)
Happened again today (the frequency has dropped though):
https://dashboard.heroku.com/apps/treeherder-stage/activity/releases/1849
Assignee | ||
Comment 21•6 years ago
|
||
Assignee | ||
Comment 22•6 years ago
|
||
https://dashboard.heroku.com/apps/treeherder-prototype/activity/releases/2371
https://dashboard.heroku.com/apps/treeherder-prototype/activity/releases/2372
https://dashboard.heroku.com/apps/treeherder-prototype/activity/releases/2373
https://dashboard.heroku.com/apps/treeherder-stage/activity/releases/1880
Comment 23•6 years ago
|
||
Assignee | ||
Comment 24•6 years ago
|
||
This is still happening multiple times a week (and/or day), eg:
https://dashboard.heroku.com/apps/treeherder-stage/activity/releases/2080
Comment 25•6 years ago
|
||
Yeah, I just hit it again on stage yesterday, fwiw.
Comment 26•6 years ago
|
||
Comment 27•6 years ago
|
||
Commit pushed to master at https://github.com/mozilla/treeherder
https://github.com/mozilla/treeherder/commit/165ef157470eb877d3191bcedb1529a8e95d2d29
Bug 1428031 - Heroku: Retry load_initial_data up to 5 times (#4186)
To work around the intermittent MySQL deadlocks encountered during
a small percentage of deployments. This will ensure deployments don't
cause email spam to app collaborators and Heroku admins, as well as
saving us from having to manually retry the release each time it
occurs.
The `IGNORE_PREDEPLOY_ERRORS` option has also been removed, since
it's mostly redundant with `SKIP_PREDEPLOY`.
Assignee | ||
Updated•6 years ago
|
Assignee: nobody → emorley
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 28•6 years ago
|
||
Seems to be working well - example retry:
https://dashboard.heroku.com/apps/treeherder-stage/activity/releases/c26b7df1-b400-4519-86cc-bbf5988fe29d
Comment 29•6 years ago
|
||
Comment 30•6 years ago
|
||
Commit pushed to master at https://github.com/mozilla/treeherder
https://github.com/mozilla/treeherder/commit/47419fde6c841a440f85cd341c612ce5495afc75
Bug 1428031 - Heroku: Increase max load_initial_data retries to 10 (#4203)
Since there was still a deploy failure after five attempts here:
https://dashboard.heroku.com/apps/treeherder-prototype/activity/releases/2847
Comment 31•5 years ago
•
|
||
Found this ticket by accident, while reading the pre_deploy
script. Had the feeling we should revisit it, given the following comment from it:
# TODO: Look into this again when using newer MySQL and Django 2.x.
Flags: needinfo?(armenzg)
Updated•5 years ago
|
Flags: needinfo?(armenzg)
You need to log in
before you can comment on or make changes to this bug.
Description
•