Closed
Bug 1302869
Opened 8 years ago
Closed 8 years ago
Remove redundant indexes
Categories
(Tree Management :: Treeherder, defect)
Tree Management
Treeherder
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: emorley, Assigned: emorley)
References
Details
Attachments
(1 file)
eg:
> SELECT * FROM sys.schema_redundant_indexes
...
******************** 5. row *********************
table_schema: treeherder
table_name: bug_job_map
redundant_index_name: bug_job_map_d697ea38
redundant_index_columns: job_id
redundant_index_non_unique: 1
dominant_index_name: bug_job_map_job_id_700648fc2e943c6b_uniq
dominant_index_columns: job_id,bug_id
dominant_index_non_unique: 0
subpart_exists: 0
sql_drop_index: ALTER TABLE `treeherder`.`bug_job_map` DROP INDEX `bug_job_map_d697ea38`
******************** 6. row *********************
table_schema: treeherder
table_name: build_platform
redundant_index_name: build_platform_os_name_3e460886b60cf2ce_uniq
redundant_index_columns: os_name,platform,architecture
redundant_index_non_unique: 0
dominant_index_name: build_platform_os_name_2b1ee6b83c271ea_uniq
dominant_index_columns: os_name,platform,architecture
dominant_index_non_unique: 0
subpart_exists: 0
sql_drop_index: ALTER TABLE `treeherder`.`build_platform` DROP INDEX `build_platform_os_name_3e460886b60cf2ce_uniq`
Assignee | ||
Comment 1•8 years ago
|
||
Install the mysql helper scripts using:
1) vagrant destroy -f && vagrant up && vagrant ssh
2) cd ~/
3) git clone https://github.com/mysql/mysql-sys
4) cd mysql-sys
5) mysql -u root -p < ./sys_56.sql
Then against the Vagrant DB:
> SELECT CONCAT(table_schema, '.', table_name) AS 'table', dominant_index_columns AS dominant_index, redundant_index_columns AS redundant_index FROM sys.schema_redundant_indexes
******************** 1. row *********************
table: treeherder.bug_job_map
dominant_index: job_id,bug_id
redundant_index: job_id
******************** 2. row *********************
table: treeherder.build_platform
dominant_index: os_name,platform,architecture
redundant_index: os_name
******************** 3. row *********************
table: treeherder.failure_line
dominant_index: job_log_id,line
redundant_index: job_log_id
******************** 4. row *********************
table: treeherder.failure_line
dominant_index: signature,test,created
redundant_index: signature
******************** 5. row *********************
table: treeherder.failure_match
dominant_index: failure_line_id,classified_failure_id,matcher_id
redundant_index: failure_line_id
******************** 6. row *********************
table: treeherder.job_group
dominant_index: name,symbol
redundant_index: name
******************** 7. row *********************
table: treeherder.job_log
dominant_index: job_id,name,url
redundant_index: job_id
******************** 8. row *********************
table: treeherder.job_type
dominant_index: name,symbol
redundant_index: name
******************** 9. row *********************
table: treeherder.machine
dominant_index: name
redundant_index: name
******************** 10. row *********************
table: treeherder.machine_platform
dominant_index: os_name,platform,architecture
redundant_index: os_name
******************** 11. row *********************
table: treeherder.option_collection
dominant_index: option_collection_hash,option_id
redundant_index: option_collection_hash
******************** 12. row *********************
table: treeherder.performance_alert
dominant_index: summary_id,series_signature_id
redundant_index: summary_id
******************** 13. row *********************
table: treeherder.performance_datum
dominant_index: repository_id,job_id,result_set_id,signature_id,push_timestamp
redundant_index: repository_id,job_id
******************** 14. row *********************
table: treeherder.performance_signature
dominant_index: repository_id,framework_id,signature_hash
redundant_index: repository_id
******************** 15. row *********************
table: treeherder.repository
dominant_index: name
redundant_index: name
******************** 16. row *********************
table: treeherder.text_log_error
dominant_index: step_id,line_number
redundant_index: step_id
******************** 17. row *********************
table: treeherder.text_log_step
dominant_index: job_id,started_line_number,finished_line_number
redundant_index: job_id
Assignee | ||
Comment 2•8 years ago
|
||
Some are not our fault, eg Django creates redundant indexes for a FlexibleForeignKey even if there is a unique_together:
class TextLogError(models.Model):
id = BigAutoField(primary_key=True)
step = FlexibleForeignKey(TextLogStep, related_name='errors')
line = models.TextField()
line_number = models.PositiveIntegerField()
class Meta:
db_table = "text_log_error"
unique_together = ('step', 'line_number')
Others are definitely our fault. eg specifying both unique=True and db_index=True.
For the Django at fault ones, we should still file upstream bugs.
Assignee | ||
Comment 3•8 years ago
|
||
(In reply to Ed Morley [:emorley] from comment #2)
> Some are not our fault, eg Django creates redundant indexes
See:
https://groups.google.com/forum/#!topic/django-developers/3ywugkcaxqs
https://code.djangoproject.com/ticket/24082
Assignee | ||
Comment 4•8 years ago
|
||
(In reply to Ed Morley [:emorley] from comment #2)
> Some are not our fault, eg Django creates redundant indexes for a FlexibleForeignKey
More specifically:
* Django 1.10 and earlier creates a redundant index for ForeignKeys that overlap with a unique_together (see example in https://emorley.pastebin.mozilla.org/8933048).
* This appears to be fixed in Django master.
* The workaround is to add an explicit db_index=False to the ForeignKey().
* However this workaround causes a suboptimal migration if applied to existing tables, since it not only drops the redundant index, but actually removes the foreign key entirely before adding an identical one. This is bad since dropping an index is an online DDL operation, but adding a constraint is not (causes the table to be locked).
For this I've filed:
https://code.djangoproject.com/ticket/27558
Discussion at:
https://groups.google.com/forum/#!topic/django-developers/3ywugkcaxqs
The takeaway lesson here is that when we're performing migrations on our larger tables we probably should run a `./manage.py model <migration_name>` to show the SQL being run to understand how much of a performance implication it will really have.
Assignee | ||
Comment 5•8 years ago
|
||
A `./manage.py sqlmigrate <model> <migration_name>` even.
Assignee | ||
Comment 6•8 years ago
|
||
(In reply to Ed Morley [:emorley] from comment #4)
> For this I've filed:
> https://code.djangoproject.com/ticket/27558
I've opened some Django PRs to fix that ticket:
https://github.com/django/django/pull/7647
https://github.com/django/django/pull/7648
Assignee | ||
Comment 7•8 years ago
|
||
(In reply to Ed Morley [:emorley] from comment #6)
> I've opened some Django PRs to fix that ticket:
My PRs were actually merged in time to make the Django 1.10.4 release.
Once we're on Django 1.10 we'll have the fix for this :-)
https://docs.djangoproject.com/en/1.10/releases/1.10.4/
Comment 8•8 years ago
|
||
Assignee | ||
Comment 9•8 years ago
|
||
Comment on attachment 8825211 [details] [review]
[treeherder] mozilla:rm-redundant-indexes > mozilla:master
This fixes most of the cases. The others will ether be fixed as part of migrations file squashing (bug 1320136; since there is cruft in 0001_inital...) or else by later PRs in this bug.
Attachment #8825211 -
Flags: review?(wlachance)
Comment 10•8 years ago
|
||
Comment on attachment 8825211 [details] [review]
[treeherder] mozilla:rm-redundant-indexes > mozilla:master
Awesome!
Attachment #8825211 -
Flags: review?(wlachance) → review+
Comment 11•8 years ago
|
||
Commit pushed to master at https://github.com/mozilla/treeherder
https://github.com/mozilla/treeherder/commit/97825a259ba6a8b9852b76f3da0e3c943b7c352d
Bug 1302869 - Remove redundant MySQL indexes
In cases where one index overlaps with the start of another (larger)
composite index, the first index is redundant and can be removed.
Found using the sys.schema_redundant_indexes helper:
table: treeherder.build_platform
dominant_index: os_name,platform,architecture
redundant_index: os_name
table: treeherder.job_group
dominant_index: name,symbol
redundant_index: name
table: treeherder.job_type
dominant_index: name,symbol
redundant_index: name
table: treeherder.machine_platform
dominant_index: os_name,platform,architecture
redundant_index: os_name
table: treeherder.option_collection
dominant_index: option_collection_hash,option_id
redundant_index: option_collection_hash
table: treeherder.performance_datum
dominant_index: repository_id,job_id
redundant_index: repository_id
table: treeherder.performance_datum
dominant_index: repository_id,job_id,push_id,signature_id
redundant_index: repository_id,job_id
Assignee | ||
Comment 12•8 years ago
|
||
After bug 1320136, there's only one redundant index returned by the `sys.schema_redundant_indexes` helper.
> SELECT * FROM sys.schema_redundant_indexes
******************** 1. row *********************
table_schema: treeherder
table_name: failure_line
redundant_index_name: failure_line_signature_idx
redundant_index_columns: signature
redundant_index_non_unique: 1
dominant_index_name: failure_line_signature_test_idx
dominant_index_columns: signature,test,created
dominant_index_non_unique: 1
subpart_exists: 1
sql_drop_index: ALTER TABLE `treeherder`.`failure_line` DROP INDEX `failure_line_signature_idx`
1 rows in set
However the indexes in question are:
signature(50)
signature(25),test(50),created
...so I'm presuming the non-composite index still has use, since it's longer than the `signature` part of the composite index.
I would file an issue against the mysql-sys helpers, but the GitHub repo has issues disabled and so bugs have to be filed against the Oracle mysql.com tracker, which is an utter waste of time given previous experience.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 13•8 years ago
|
||
Decided to handle updating dev/stage/prod here rather than piecemeal in the table by table deps of bug 1303763.
Statements generated from:
SELECT table_name, redundant_index_columns, dominant_index_columns, sql_drop_index FROM sys.schema_redundant_indexes WHERE subpart_exists != 1
Run against dev/stage:
ALTER TABLE `treeherder`.`bug_job_map` DROP INDEX `bug_job_map_d697ea38`;
ALTER TABLE `treeherder`.`commit` DROP INDEX `commit_e13f0cea`;
ALTER TABLE `treeherder`.`failure_line` DROP INDEX `failure_line_48d13415`;
ALTER TABLE `treeherder`.`failure_match` DROP INDEX `failure_match_4b7d144b`;
ALTER TABLE `treeherder`.`job_exclusion` DROP INDEX `idx_name`;
ALTER TABLE `treeherder`.`job_log` DROP INDEX `job_log_d697ea38`;
ALTER TABLE `treeherder`.`job` DROP INDEX `job_084f3fc9`;
ALTER TABLE `treeherder`.`machine` DROP INDEX `idx_name`;
ALTER TABLE `treeherder`.`performance_alert_summary` DROP INDEX `performance_alert_summary_81ee1a31`;
ALTER TABLE `treeherder`.`performance_alert` DROP INDEX `performance_alert_548b2096`;
ALTER TABLE `treeherder`.`performance_datum` DROP INDEX `performance_datum_81ee1a31`;
ALTER TABLE `treeherder`.`performance_signature` DROP INDEX `performance_signature_81ee1a31`;
ALTER TABLE `treeherder`.`text_log_error` DROP INDEX `text_log_error_bef491d2`;
ALTER TABLE `treeherder`.`text_log_step` DROP INDEX `text_log_step_d697ea38`;
Will wait a bit before doing the same on prod.
Assignee | ||
Comment 14•8 years ago
|
||
No change in stage DB perf; have applied against prod too.
You need to log in
before you can comment on or make changes to this bug.
Description
•