Add missing UNIQUE KEY indexes in production

RESOLVED FIXED

Status

--
enhancement
RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: jwhitlock, Assigned: jwhitlock)

Tracking

({in-triage})

Details

(Whiteboard: [specification][type:feature])

(Assignee)

Description

2 years ago
What problem would this feature solve?
======================================
There are several missing UNIQUE KEY indexes in production:

taggit_tag.name
wiki_documenttag.name
wiki_localizationtag.name
wiki_reviewtag.name

The django-taggit library has had a unique index on these fields since 0.10.0 (2013-17-08), and the correct detection of duplicate tags probably depends on it, leading to bugs like bug 1271552


Who has this problem?
=====================
All contributors to MDN

How do you know that the users identified above have this problem?
==================================================================
Users continue to have issues when adding tags to documents, when tags that differ only in capital letters are added to documents.

How are the users identified above solving this problem now?
============================================================
Admins manually remove the duplicate tags as they are brought to their attention.

Do you have any suggestions for solving the problem? Please explain in detail.
==============================================================================
1. Investigate the duplicate tag situation in the database
2. Remove the existing duplicate tags, manually or in a migration
3. Add the UNIQUE KEY index in production, manually or in a migration.

A migration is preferred, but care will be needed to ensure it works on a freshly created database as well.

Is there anything else we should know?
======================================
It is possible there were migrations that added these constraints, but that failed due to existing duplicates.  Data and schema issues like these may impact moving to PostgreSQL (bug 1159930).
(Assignee)

Updated

2 years ago
Blocks: 1293718, 1271552
(Assignee)

Updated

2 years ago
Assignee: nobody → jwhitlock
(Assignee)

Updated

2 years ago
Depends on: 1239756
(Assignee)

Comment 1

2 years ago
These are the SQL statements to add the unique indexes:

ALTER TABLE `taggit_tag` ADD UNIQUE KEY `name` (`name`);
ALTER TABLE `wiki_documenttag` ADD UNIQUE KEY `name` (`name`);
ALTER TABLE `wiki_localizationtag` ADD UNIQUE KEY `name` (`name`);
ALTER TABLE `wiki_reviewtag` ADD UNIQUE KEY `name` (`name`);

If there are existing duplicates, an error is returned, such as:

ERROR 1062 (23000): Duplicate entry 'profile:interest:CSS' for key 'name'

If there are no duplicates, it is successful:

Query OK, 0 rows affected (0.17 sec)
Records: 0  Duplicates: 0  Warnings: 0

If the index already exists, this is also an error:

ERROR 1061 (42000): Duplicate key name 'name'

On staging, I was able to successfully add the indexes for the tables wiki_localizationtag.name and wiki_reviewtag.name on staging (which are silly and to be removed in bug 1160178). The other two tables fail due to duplicate tags.

Many tags are unused, and unused tags should be deleted before attempting to merge duplicate tags.
(Assignee)

Comment 2

2 years ago
Removed unused tags, and merged duplicate tags, on stage and production, using this script:

https://gist.github.com/jwhitlock/f636a8d9424f3f344f0f1b6cdbb95ae7

Applied the UNIQUE KEY in staging and production.
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → FIXED
(Assignee)

Updated

2 years ago
See Also: → bug 679193
You need to log in before you can comment on or make changes to this bug.