Closed Bug 1293718 Opened 8 years ago Closed 7 years ago

Synchronize production schema with Django's view of schema.

Categories

(developer.mozilla.org Graveyard :: General, defect)

All
Other
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jwhitlock, Assigned: jwhitlock)

References

Details

(Keywords: in-triage, Whiteboard: [specification][type:bug])

Attachments

(7 files)

What did you do?
================
1. Create a fresh local database
2. Run mysqldump --no-data to get the database schema
3. Do the same against production
4. Compare the results

What happened?
==============
There are indexes, including unique indexes, in the fresh local database that are not in the production database.  There are small differences in the data columns between the two databases.

What should have happened?
==========================
The indexes may have different names, but they should be present in both databases.  Columns should be identical.

Is there anything else we should know?
======================================
The effect will be subtle difference in behavior between a freshly created database (like during tests) and the production database, leading to annoying, unreproducible bugs.

Differences are determined by visually examining a diff of a fresh development schema versus the production schema, both generated with "mysqldump --no-data".

Schema differences in indexes (not counting index names):

account_emailaddress.user_id
 - Prod has a KEY index on user_id, not in dev database
auth_group_permissions
 - Prod has a KEY index on group_id, not in dev database.
 - Prod does not have dev's CONSTRAINT on permission_id to auth_permission.id
 - Prod does not have dev's CONSTRAINT on group_id to auth_group.id
auth_permission
 - Prod has a KEY index on content_type_id not in dev
 - Prod does not have dev's CONSTRAINT on content_type_id to django_content_type.id
auth_user_groups
 - Prod has a KEY index on group_id not in dev
 - Prod does not have dev's CONSTRAINT on group_id to auth_group.id
 - Prod does not have dev's CONSTRAINT on user_id to auth_user.id
auth_user_user_permissions
 - Prod has a KEY index on permission_id not in dev
 - Prod does not have dev's CONSTRAINT on permission_id to auth_permission.id
 - Prod does not have dev's CONSTRAINT on user_id to auth_user.id
celery_taskmeta
 - Prod does not have KEY index on hidden like dev
celery_tasksetmeta
 - Prod does not have KEY index on hidden like dev
constance_config
 - Prod has an (extra) UNIQUE KEY on key(100)
django_admin_log
 - Prod lacks dev's CONSTRAINT of content_type_id to django_content_type.id
 - Prod lacks dev's CONSTRAINT of user_id to auth_user.id
feeder_bundle_feeds
 - Prod has a KEY index on bundle_id
 - Prod does not have dev's CONSTRAINT on bundle_id to feeder_bundle.id
 - Prod does not have dev's CONSTRAINT on feed_id to feeder_feed.id
feeder_entry
 - Prod does not have dev's CONSTRAINT on feeder_feed.id
socialaccount_socialapp_sites
 - Prod has KEY index on socialapp_id, not on dev
socialaccount_socialtoken
 - Prod has KEY index on app_id, not on dev
taggit_tag
 - Prod does not have UNIQUE KEY index on name like dev
taggit_taggeditem
 - Prod has KEY index on content_type_id, not on dev
tidings_watch
 - Prod does not have dev's KEY index on event_type
 - Prod does not have dev's KEY index on object_id
 - Prod does not have dev's KEY index on email
 - Prod does not have dev's KEY index on is_active
waffle_flag_groups
 - Prod has KEY index on flag_id
 - Prod does not have CONSTRAINT on flag_id to waffle_flag.id
 - Prod does not have CONSTRAINT on group_id to auth_group.id
waffle_flag_users
 - Prod has KEY index on flag_id
 - Prod does not have CONSTRAINT on flag_id to waffle_flag.id
 - Prod does not have CONSTRAINT on group_id to auth_group.id
wiki_document
 - Prod has KEY index on (locale, id). dev has KEY index on locale
 - Prod does not have CONSTRAINT on current_revision_id to wiki_revision.id
 - Prod does not have CONSTRAINT on parent_topic_id to wiki_document.id
 - Prod does not have CONSTRAINT on parent_id to wiki_document.id
wiki_documenttag
 - Prod does not have UNIQUE KEY on name
wiki_documentzone
 - Prod has KEY index on document_id (as well as UNIQUE KEY index)
 - Prod does not have KEY index on url_root
wiki_editortoolbar
 - Prod does not have CONSTRAINT on creator_id to auth_user.id
wiki_localizationtag
 - Prod does not have UNIQUE KEY on name
wiki_reviewtag
 - Prod does not have UNIQUE KEY on name
wiki_reviewtaggedrevision
 - Prod does not have CONSTRAINT on content_object_id to wiki_revision.id
 - Prod does not have CONSTRAINT on tag_id to wiki_reviewtag.id
wiki_revision
 - Prod does not have KEY index on document_id
 - Prod does not have CONSTRAINT on based_on_id to wiki_revision.id
 - Prod does not have CONSTRAINT on creator_id to auth_user.id
 - Prod does not have CONSTRAINT on document_id to wiki_document.id
wiki_taggeddocument
 - Prod does not have CONSTRAINT on content_object_id to wiki_document.id
 - Prod does not have CONSTRAINT on tag_id to wiki_documenttag.id


There are some schema differences in columns:

attachments_attachment.current_revision_id: Prod allows NULL, dev does not
djcelery_taskstate: Prod has DEFAULT NULL, dev has no default
search_filter.shortcut: Prod has no default, dev has DEFAULT NULL
search_filtergroup.slug: Prod has no default, dev has DEFAULT NULL
tidings_watch.email: varchar(75) in prod, varchar(254) in dev
wiki_document.render_max_age: dev has DEFAULT NULL, prod does not
wiki_document.render_expires: dev has DEFAULT NULL, prod does not
wiki_document.current_revision_id: prod has DEFAULT NULL, dev does not
wiki_document.parent_id: prod has DEFAULT NULL, dev does not
wiki_documentspamattempt.reviewed: prod has DEFAULT NULL, dev does not
wiki_documentspamattempt.reviewer_id: prod has DEFAULT NULL, dev does not
wiki_documentzone.url_root: dev has DEFAULT NULL, prod does not


Production has many NOT NULL columns that take defaults:

attachments_attachmentrevision.created: 2012-07-02
attachments_attachmentrevision.is_approved: '1'
attachments_attachmentrevision.is_mindtouch_migration: '0'
djcelery_crontabsschedule.minute: '*'
djcelery_crontabsschedule.hour: '*'
djcelery_crontabsschedule.day_of_week: '*'
djcelery_periodictask.enabled: '1'
djcelery_periodictask.total_run_count: '0'
djcelery_periodictasks.ident: '1'
djcelery_taskstate.hidden: '0'
userban.date: '2013-05-28'
userban.is_active: '1'
waffle_flag.superusers: '1'
waffle_flag.staff: '0'
waffle_flag.authenticated: '0'
waffle_flag.rollout: '0'
waffle_switch.active: '0'
wiki_document.is_template: '0'
wiki_document.is_localizable: '0'
wiki_document.locale: 'en-US'
wiki_editortoolbar.default: '0'
wiki_revision.created: '2011-12-21'
wiki_revision.is_approved: '0'

There are several columns where production has the type as
datetime, and development as datetime(6). The difference [1] is that
datetime(6) has a fractional second precision of 6 decimal places (microsecond
precision), and omitting it gives a precision of no fractional seconds:

account_emailconfirmation.created
account_emailconfirmation.sent
attachments_attachment.modified
authkeys_key.created
authkeys_keyaction.created
celery_taskmeta.date_done
celery_tasksetmeta.date_done
core_ipban.created
core_ipban.deleted
django_admin_log.action_time
django_migratons.applied
django_session.expire_date
djcelery_periodictask.expires
djcelery_periodictask.last_run_at
djcelery_periodictask.date_changed
djcelery_periodictasks.last_update
djcelery_taskstate.tstamp
djcelery_taskstate.eta
djcelery_taskstate.expires
djcelery_workerstate.last_heartbeat
feeder_entry.last_published
feeder_entry.created
feeder_entry.updated
feeder_feed.last_modified
feeder_feed.created
feeder_feed.updated
search_index.created_at
search_outdatedobject.created_at
socialaccount_socialaccount.last_login
socialaccount_socialaccount.date_joined
socialaccount_socialtoken.expires_at
waffle_flag.created
waffle_flag.modified
waffle_sample.created
waffle_sample.modified
waffle_switch.created
waffle_switch.modified
wiki_document.render_scheduled_at
wiki_document.render_started_at
wiki_document.last_rendered_at
wiki_document.render_expires
wiki_document.modified
wiki_documentdeletionlog.timestamp
wiki_documentspamattempt.created
wiki_documentspamattempt.reviewed
wiki_revision.created

So, why are there difference?

Some of this is probably left over from the mindtouch migration, which probably
involved some manual steps.

Some of this may be bugs in South migrations.

Some of this probably happened during the mega-upgrade for Django 1.7 (PR 3073
[1]), which included switching from South for migrations to Django 1.7 native
migrations.  Some of the indexes were added in 3rd party packages like
django-taggit, and appeared in the Django 1.7 initial migration, without
actually being applied to the production database.

Some of this may have been manual attempts to fix production issues that were
not reflected in code.

There's enough issues that it should probably be fixed with some hand-written
migrations targeted at production.

[1] https://dev.mysql.com/doc/refman/5.7/en/fractional-seconds.html
[2] https://github.com/mozilla/kuma/pull/3073
The production database schema
A fresh developer database schema. Change Django settings for a fresh database, run SQL "CREATE DATABASE <dbname> CHARACTER SET utf8;", and then run "./manage.py migrate" to create the schema.
Depends on: 1293749
Another source of differences: MyISAM doesn't support FOREIGN KEY and REFERENCES clauses (foreign key constraints), so these would have been omitted if the tables were initially MyISAM and later converted to InnoDB.

MySQL allows "upgrading" column types, such as datetime to datetime(6). Django does not do this automatically, it must be done manually:

https://docs.djangoproject.com/en/1.10/ref/databases/#fractional-seconds-support-for-time-and-datetime-fields

The production database is MySQL 5.6, which does not support renaming keys / indexes. On this engine, a ADD / DROP (or DROP / ADD) must be used to rename.

I'm working on a SQL file for a one-time fix of 90% of these issues. Others, like adding UNIQUE INDEX for taggit names, will require data cleanup as well as SQL.
Assignee: nobody → jwhitlock
Query the database to find records that violate foreign key constraints, and select * the records found in the production database.
Delete the records that violate foreign key constraints (or would, if they were present)
Update the schema for production, including:

1) Dropping defaults no longer used
2) Upgrading datetime to datetime(6) (date + time + microseconds)
3) Rearranging columns to match the order in a fresh database
4) Adding "NOT NULL" where missing
5) Renaming indexes to match a fresh database
6) Adding missing indexes and foreign key constraints
7) Other schema changes to match production to a fresh database constructed by Django.

This only include schema changes that took less than 10 seconds to execute locally. It does not include adding a unique index to the tagging name fields.
Same as fix03_*.sql, but works on the wiki_document and wiki_revision tables, which take several minutes to update.
Attached file test_schema.sh
A test script used to:

1) Load a production database dump into a test database
2) Run fix01_show_constraint_violations.sql to find constraint violations
3) Run fix02_delete_constraint_violations.sql to delete constraint violations
4) Run fix01_show_constraint_violations.sql again to show no remaining violations
5) Run fix03_schema_fast_tables.sql to execute the quick schema changes
6) Run fix04_schema_slow_tables.sql to execute the slow schema changes

This is not what should be run against the production database, but does show how to run the SQL scripts and capture the output for debugging.
I've created and uploaded some SQL scripts that deal with most of the schema differences.  They are also on GitHub:

https://gist.github.com/jwhitlock/ae48e7ca4ad418c5549e778c41cfe9bd

* fix01_show_constraint_violations.sql queries for records that would violate foreign key constraints, due to manually deleted data
* fix02_delete_constraint_violations.sql deletes the violators
* fix03_schema_fast_tables.sql updates the schema, and the queries run in less than 10 seconds each locally
* fix04_schema_slow_tables.sql finishes updating the schema, but the queries can take minutes (15 - 25 minutes total)

It is relatively safe to run fix02 and fix03 against the live database. They may lock the table during the update, but it should execute quickly enough for a user to only see a delay in their request.  The second query will cause many, if not all, requests to time out, and should be done during scheduled maintenance.

This does not address the missing unique indexes, which will require more intense data cleanup to remove duplicates.
Depends on: 1299227
After the schema updates in bug 1299227, the only significant schema differences are the unique indexes on the name columns of tag tables.  These four are tracked in bug 1293749.
Verified that production schema is now the same as the development schema, with the exception of tidings_watch.email, which has the (correct) type of varchar(254) in production, and the (incorrect) type of varchar(75) in development.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Product: developer.mozilla.org → developer.mozilla.org Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: