Closed Bug 1302869 Opened 8 years ago Closed 8 years ago

Remove redundant indexes

Categories

(Tree Management :: Treeherder, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: emorley, Assigned: emorley)

References

Details

Attachments

(1 file)

eg: > SELECT * FROM sys.schema_redundant_indexes ... ******************** 5. row ********************* table_schema: treeherder table_name: bug_job_map redundant_index_name: bug_job_map_d697ea38 redundant_index_columns: job_id redundant_index_non_unique: 1 dominant_index_name: bug_job_map_job_id_700648fc2e943c6b_uniq dominant_index_columns: job_id,bug_id dominant_index_non_unique: 0 subpart_exists: 0 sql_drop_index: ALTER TABLE `treeherder`.`bug_job_map` DROP INDEX `bug_job_map_d697ea38` ******************** 6. row ********************* table_schema: treeherder table_name: build_platform redundant_index_name: build_platform_os_name_3e460886b60cf2ce_uniq redundant_index_columns: os_name,platform,architecture redundant_index_non_unique: 0 dominant_index_name: build_platform_os_name_2b1ee6b83c271ea_uniq dominant_index_columns: os_name,platform,architecture dominant_index_non_unique: 0 subpart_exists: 0 sql_drop_index: ALTER TABLE `treeherder`.`build_platform` DROP INDEX `build_platform_os_name_3e460886b60cf2ce_uniq`
Install the mysql helper scripts using: 1) vagrant destroy -f && vagrant up && vagrant ssh 2) cd ~/ 3) git clone https://github.com/mysql/mysql-sys 4) cd mysql-sys 5) mysql -u root -p < ./sys_56.sql Then against the Vagrant DB: > SELECT CONCAT(table_schema, '.', table_name) AS 'table', dominant_index_columns AS dominant_index, redundant_index_columns AS redundant_index FROM sys.schema_redundant_indexes ******************** 1. row ********************* table: treeherder.bug_job_map dominant_index: job_id,bug_id redundant_index: job_id ******************** 2. row ********************* table: treeherder.build_platform dominant_index: os_name,platform,architecture redundant_index: os_name ******************** 3. row ********************* table: treeherder.failure_line dominant_index: job_log_id,line redundant_index: job_log_id ******************** 4. row ********************* table: treeherder.failure_line dominant_index: signature,test,created redundant_index: signature ******************** 5. row ********************* table: treeherder.failure_match dominant_index: failure_line_id,classified_failure_id,matcher_id redundant_index: failure_line_id ******************** 6. row ********************* table: treeherder.job_group dominant_index: name,symbol redundant_index: name ******************** 7. row ********************* table: treeherder.job_log dominant_index: job_id,name,url redundant_index: job_id ******************** 8. row ********************* table: treeherder.job_type dominant_index: name,symbol redundant_index: name ******************** 9. row ********************* table: treeherder.machine dominant_index: name redundant_index: name ******************** 10. row ********************* table: treeherder.machine_platform dominant_index: os_name,platform,architecture redundant_index: os_name ******************** 11. row ********************* table: treeherder.option_collection dominant_index: option_collection_hash,option_id redundant_index: option_collection_hash ******************** 12. row ********************* table: treeherder.performance_alert dominant_index: summary_id,series_signature_id redundant_index: summary_id ******************** 13. row ********************* table: treeherder.performance_datum dominant_index: repository_id,job_id,result_set_id,signature_id,push_timestamp redundant_index: repository_id,job_id ******************** 14. row ********************* table: treeherder.performance_signature dominant_index: repository_id,framework_id,signature_hash redundant_index: repository_id ******************** 15. row ********************* table: treeherder.repository dominant_index: name redundant_index: name ******************** 16. row ********************* table: treeherder.text_log_error dominant_index: step_id,line_number redundant_index: step_id ******************** 17. row ********************* table: treeherder.text_log_step dominant_index: job_id,started_line_number,finished_line_number redundant_index: job_id
Some are not our fault, eg Django creates redundant indexes for a FlexibleForeignKey even if there is a unique_together: class TextLogError(models.Model): id = BigAutoField(primary_key=True) step = FlexibleForeignKey(TextLogStep, related_name='errors') line = models.TextField() line_number = models.PositiveIntegerField() class Meta: db_table = "text_log_error" unique_together = ('step', 'line_number') Others are definitely our fault. eg specifying both unique=True and db_index=True. For the Django at fault ones, we should still file upstream bugs.
(In reply to Ed Morley [:emorley] from comment #2) > Some are not our fault, eg Django creates redundant indexes See: https://groups.google.com/forum/#!topic/django-developers/3ywugkcaxqs https://code.djangoproject.com/ticket/24082
Assignee: nobody → emorley
Depends on: 1303763
(In reply to Ed Morley [:emorley] from comment #2) > Some are not our fault, eg Django creates redundant indexes for a FlexibleForeignKey More specifically: * Django 1.10 and earlier creates a redundant index for ForeignKeys that overlap with a unique_together (see example in https://emorley.pastebin.mozilla.org/8933048). * This appears to be fixed in Django master. * The workaround is to add an explicit db_index=False to the ForeignKey(). * However this workaround causes a suboptimal migration if applied to existing tables, since it not only drops the redundant index, but actually removes the foreign key entirely before adding an identical one. This is bad since dropping an index is an online DDL operation, but adding a constraint is not (causes the table to be locked). For this I've filed: https://code.djangoproject.com/ticket/27558 Discussion at: https://groups.google.com/forum/#!topic/django-developers/3ywugkcaxqs The takeaway lesson here is that when we're performing migrations on our larger tables we probably should run a `./manage.py model <migration_name>` to show the SQL being run to understand how much of a performance implication it will really have.
A `./manage.py sqlmigrate <model> <migration_name>` even.
(In reply to Ed Morley [:emorley] from comment #4) > For this I've filed: > https://code.djangoproject.com/ticket/27558 I've opened some Django PRs to fix that ticket: https://github.com/django/django/pull/7647 https://github.com/django/django/pull/7648
(In reply to Ed Morley [:emorley] from comment #6) > I've opened some Django PRs to fix that ticket: My PRs were actually merged in time to make the Django 1.10.4 release. Once we're on Django 1.10 we'll have the fix for this :-) https://docs.djangoproject.com/en/1.10/releases/1.10.4/
Comment on attachment 8825211 [details] [review] [treeherder] mozilla:rm-redundant-indexes > mozilla:master This fixes most of the cases. The others will ether be fixed as part of migrations file squashing (bug 1320136; since there is cruft in 0001_inital...) or else by later PRs in this bug.
Attachment #8825211 - Flags: review?(wlachance)
Comment on attachment 8825211 [details] [review] [treeherder] mozilla:rm-redundant-indexes > mozilla:master Awesome!
Attachment #8825211 - Flags: review?(wlachance) → review+
Commit pushed to master at https://github.com/mozilla/treeherder https://github.com/mozilla/treeherder/commit/97825a259ba6a8b9852b76f3da0e3c943b7c352d Bug 1302869 - Remove redundant MySQL indexes In cases where one index overlaps with the start of another (larger) composite index, the first index is redundant and can be removed. Found using the sys.schema_redundant_indexes helper: table: treeherder.build_platform dominant_index: os_name,platform,architecture redundant_index: os_name table: treeherder.job_group dominant_index: name,symbol redundant_index: name table: treeherder.job_type dominant_index: name,symbol redundant_index: name table: treeherder.machine_platform dominant_index: os_name,platform,architecture redundant_index: os_name table: treeherder.option_collection dominant_index: option_collection_hash,option_id redundant_index: option_collection_hash table: treeherder.performance_datum dominant_index: repository_id,job_id redundant_index: repository_id table: treeherder.performance_datum dominant_index: repository_id,job_id,push_id,signature_id redundant_index: repository_id,job_id
Depends on: 1320136
After bug 1320136, there's only one redundant index returned by the `sys.schema_redundant_indexes` helper. > SELECT * FROM sys.schema_redundant_indexes ******************** 1. row ********************* table_schema: treeherder table_name: failure_line redundant_index_name: failure_line_signature_idx redundant_index_columns: signature redundant_index_non_unique: 1 dominant_index_name: failure_line_signature_test_idx dominant_index_columns: signature,test,created dominant_index_non_unique: 1 subpart_exists: 1 sql_drop_index: ALTER TABLE `treeherder`.`failure_line` DROP INDEX `failure_line_signature_idx` 1 rows in set However the indexes in question are: signature(50) signature(25),test(50),created ...so I'm presuming the non-composite index still has use, since it's longer than the `signature` part of the composite index. I would file an issue against the mysql-sys helpers, but the GitHub repo has issues disabled and so bugs have to be filed against the Oracle mysql.com tracker, which is an utter waste of time given previous experience.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Decided to handle updating dev/stage/prod here rather than piecemeal in the table by table deps of bug 1303763. Statements generated from: SELECT table_name, redundant_index_columns, dominant_index_columns, sql_drop_index FROM sys.schema_redundant_indexes WHERE subpart_exists != 1 Run against dev/stage: ALTER TABLE `treeherder`.`bug_job_map` DROP INDEX `bug_job_map_d697ea38`; ALTER TABLE `treeherder`.`commit` DROP INDEX `commit_e13f0cea`; ALTER TABLE `treeherder`.`failure_line` DROP INDEX `failure_line_48d13415`; ALTER TABLE `treeherder`.`failure_match` DROP INDEX `failure_match_4b7d144b`; ALTER TABLE `treeherder`.`job_exclusion` DROP INDEX `idx_name`; ALTER TABLE `treeherder`.`job_log` DROP INDEX `job_log_d697ea38`; ALTER TABLE `treeherder`.`job` DROP INDEX `job_084f3fc9`; ALTER TABLE `treeherder`.`machine` DROP INDEX `idx_name`; ALTER TABLE `treeherder`.`performance_alert_summary` DROP INDEX `performance_alert_summary_81ee1a31`; ALTER TABLE `treeherder`.`performance_alert` DROP INDEX `performance_alert_548b2096`; ALTER TABLE `treeherder`.`performance_datum` DROP INDEX `performance_datum_81ee1a31`; ALTER TABLE `treeherder`.`performance_signature` DROP INDEX `performance_signature_81ee1a31`; ALTER TABLE `treeherder`.`text_log_error` DROP INDEX `text_log_error_bef491d2`; ALTER TABLE `treeherder`.`text_log_step` DROP INDEX `text_log_step_d697ea38`; Will wait a bit before doing the same on prod.
Blocks: 1303763
No longer depends on: 1303763
No change in stage DB perf; have applied against prod too.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: