Closed Bug 1276867 Opened 8 years ago Closed 8 years ago

Errors deploying to stage: "django.db.utils.OperationalError: (1060, "Duplicate column name 'job_log_id'")"

Categories

(Tree Management :: Treeherder: Infrastructure, defect, P1)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: emorley, Assigned: jgraham)

The latest stage deploy failed:
http://treeherderadm.private.scl3.mozilla.com/chief/treeherder.stage/logs/stage.1464683573

[2016-05-31 08:34:28] [localhost] failed: cd /data/treeherder-stage/src/treeherder.allizom.org/treeherder-service && source /data/treeherder-stage/src/treeherder.allizom.org/treeherder-env.sh && python2.7 manage.py migrate --noinput (1.473s)
[localhost] out: Operations to perform:
[localhost] out: Synchronize unmigrated apps: log_parser, django_browserid, corsheaders, autoclassify, staticfiles, messages, rest_framework_swagger, runserver_nostatic, webapp, rest_framework, embed, hawkrest, etl
[localhost] out: Apply all migrations: perf, sessions, admin, sites, auth, contenttypes, credentials, model
[localhost] out: Synchronizing apps without migrations:
[localhost] out: Creating tables...
[localhost] out: Running deferred SQL...
[localhost] out: Installing custom SQL...
[localhost] out: Running migrations:
[localhost] out: Rendering model states... DONE
[localhost] out: Applying model.0026_failure_line_job_log_id...
[localhost] err: Traceback (most recent call last):
...
[localhost] err: django.db.utils.OperationalError: (1060, "Duplicate column name 'job_log_id'")

This migration was added in:
https://github.com/mozilla/treeherder/commit/f4a5ccd403ae5937288095c1569cd3785e11c9d9

It looks like the previous deploy started running the migration but timed out:
http://treeherderadm.private.scl3.mozilla.com/chief/treeherder.stage/logs/stage.1464374130

My guess is that the migration command finished running on the DB after that, but since the migration never completed on the Django side, the table that records which migrations have been applied (django_migrations) was never updated.
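
To illustrate the suspected failure mode, here is a minimal sketch that uses an in-memory SQLite database as a stand-in for the real MySQL one. The table names (failure_line, applied_migrations) and the run_migration helper are hypothetical and only mimic the shape of the problem; this is not Django's actual migration code.

```python
import sqlite3

# Stand-in for the real database. isolation_level=None means autocommit,
# so the schema change is committed as soon as it runs.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE failure_line (id INTEGER PRIMARY KEY)")
# Hypothetical bookkeeping table, playing the role of django_migrations.
conn.execute("CREATE TABLE applied_migrations (name TEXT PRIMARY KEY)")

MIGRATION = "0026_failure_line_job_log_id"


def run_migration(conn, client_times_out=False):
    """Apply the schema change, then record it, unless the client gives up first."""
    conn.execute("ALTER TABLE failure_line ADD COLUMN job_log_id INTEGER")
    if client_times_out:
        # The column now exists, but the migration is never recorded.
        return
    conn.execute("INSERT INTO applied_migrations (name) VALUES (?)", (MIGRATION,))


# First deploy: the schema change lands, the bookkeeping does not.
run_migration(conn, client_times_out=True)

# Second deploy: the migration still looks unapplied, so it is attempted again.
try:
    run_migration(conn)
    second_attempt_error = None
except sqlite3.OperationalError as exc:
    second_attempt_error = str(exc)

recorded_count = conn.execute("SELECT COUNT(*) FROM applied_migrations").fetchone()[0]
print(second_attempt_error, recorded_count)
```

The second attempt fails on the already-present column while the bookkeeping table is still empty, which matches the state stage appears to be in.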

We need to:
1) Ensure that both actions here have been run:
https://github.com/mozilla/treeherder/blob/master/treeherder/model/migrations/0026_failure_line_job_log_id.py
2) Run a migrate --fake, so that Django records the migration as applied without running it again.
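
The two steps above could be sketched roughly as follows, again against an in-memory SQLite stand-in rather than the real MySQL database. The table names, the applied_migrations bookkeeping table, and the helper functions are all hypothetical. The sketch only checks a single column, whereas the real migration has more than one action and each would need its own check.

```python
import sqlite3

MIGRATION = "0026_failure_line_job_log_id"


def column_exists(conn, table, column):
    """Return True if `table` already has `column` (SQLite-specific check)."""
    rows = conn.execute("PRAGMA table_info({})".format(table)).fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return any(row[1] == column for row in rows)


def is_recorded(conn, name):
    """Return True if the migration is listed in the bookkeeping table."""
    row = conn.execute(
        "SELECT 1 FROM applied_migrations WHERE name = ?", (name,)
    ).fetchone()
    return row is not None


def next_step(conn):
    """Decide what to do, based on the schema and the bookkeeping table."""
    if is_recorded(conn, MIGRATION):
        return "nothing to do"
    if column_exists(conn, "failure_line", "job_log_id"):
        # Schema change is present but unrecorded: only the record is missing.
        return "fake"
    return "migrate"


# Recreate the half-applied state: the column exists, the record does not.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE failure_line (id INTEGER PRIMARY KEY, job_log_id INTEGER)")
conn.execute("CREATE TABLE applied_migrations (name TEXT PRIMARY KEY)")
before = next_step(conn)

# Faking the migration amounts to writing the missing record and nothing else.
conn.execute("INSERT INTO applied_migrations (name) VALUES (?)", (MIGRATION,))
after = next_step(conn)

print(before, after)
```

The point of checking first is that faking a migration whose schema changes are only partly present would leave the DB and Django's view of it out of sync in the other direction.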

We should also be aware that big schema changes may need to be performed manually rather than relying on migrations. (At least until we're on Heroku, where I believe we'll have slightly more control over our timeouts.)

James, please could you take a look at this urgently? :-)
Flags: needinfo?(james)

Database is now in the right state; it just needs the --fake migration to run.
Flags: needinfo?(james)

Done.

[emorley@treeherderadm.private.scl3 ~]$ cd /data/treeherder-stage/src/treeherder.allizom.org/treeherder-service
[emorley@treeherderadm.private.scl3 treeherder-service]$ sudo git fetch --quiet origin stage
[emorley@treeherderadm.private.scl3 treeherder-service]$ sudo git reset --hard FETCH_HEAD
HEAD is now at 9f96d18 Bug 1276864 - Disable the unused alder repository
[emorley@treeherderadm.private.scl3 treeherder-service]$ sudo find . -type f -name "*.pyc" -delete
[emorley@treeherderadm.private.scl3 treeherder-service]$ source ../treeherder-env.sh

[emorley@treeherderadm.private.scl3 treeherder-service]$ ../venv/bin/python2.7 ./manage.py migrate --fake
Operations to perform:
  Synchronize unmigrated apps: log_parser, django_browserid, corsheaders, autoclassify, staticfiles, messages, rest_framework_swagger, runserver_nostatic, webapp, rest_framework, embed, hawkrest, etl
  Apply all migrations: perf, sessions, admin, sites, auth, contenttypes, credentials, model
Synchronizing apps without migrations:
  Creating tables...
    Running deferred SQL...
  Installing custom SQL...
Running migrations:
  Rendering model states... DONE
  Applying model.0026_failure_line_job_log_id... FAKED
The following content types are stale and need to be deleted:

    model | device
    model | machinenote

Any objects related to these content types by a foreign key will also
be deleted. Are you sure you want to delete these content types?
If you're unsure, answer 'no'.

    Type 'yes' to continue, or 'no' to cancel: yes

[emorley@treeherderadm.private.scl3 treeherder-service]$ ../venv/bin/python2.7 ./manage.py migrate
Operations to perform:
  Synchronize unmigrated apps: log_parser, django_browserid, corsheaders, autoclassify, staticfiles, messages, rest_framework_swagger, runserver_nostatic, webapp, rest_framework, embed, hawkrest, etl
  Apply all migrations: perf, sessions, admin, sites, auth, contenttypes, credentials, model
Synchronizing apps without migrations:
  Creating tables...
    Running deferred SQL...
  Installing custom SQL...
Running migrations:
  No migrations to apply.

The stage deploy has now succeeded. However, we'll likely have the same problem again when deploying to prod.

Perhaps we should run the migration manually in advance?

Prod was handled in bug 1277889.
Blocks: 1233164
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED