Closed
Bug 992900
Opened 11 years ago
Closed 11 years ago
[shipping] slow queries in signoff view for more pushes
Categories
(Webtools Graveyard :: Elmo, defect)
Webtools Graveyard
Elmo
Tracking
(Not tracked)
RESOLVED
WORKSFORME
People
(Reporter: Pike, Unassigned)
References
Details
(Keywords: perf, regression)
We have introduced some new slow queries, it'd be nice if we could get rid of those, as they're causing multi-second load times.
From mysql-slow-log:
# Time: 140407 5:57:31
# User@Host: l10nuser[l10nuser] @ l10n-dashboard1.webapp.scl3.mozilla.com [10.22.81.129]
# Query_time: 2.702376 Lock_time: 0.000000 Rows_sent: 1 Rows_examined: 1724956
SET timestamp=1396875451;
SELECT COUNT(DISTINCT `life_push`.`id`) FROM `life_push` INNER JOIN `life_push_changesets` ON (`life_push`.`id` = `life_push_changesets`.`push_id`) INNER JOIN `life_changeset` ON (`life_push_changesets`.`changeset_id` = `life_changeset`.`id`) WHERE (`life_changeset`.`branch_id` = 1 AND ((`life_push`.`push_date` >= '2014-03-17 21:50:00' AND `life_push`.`repository_id` = 907 ) OR (`life_push`.`push_date` <= '2014-03-17 21:50:00' AND `life_push`.`push_date` >= '2014-02-03 22:45:00' AND `life_push`.`repository_id` = 803 ) OR (`life_push`.`push_date` <= '2014-02-03 22:45:00' AND `life_push`.`push_date` >= '2013-12-09 19:30:00' AND `life_push`.`repository_id` = 58 )) AND `life_push`.`push_date` < '2014-04-06 05:27:26' );
# Time: 140407 5:57:34
# User@Host: l10nuser[l10nuser] @ l10n-dashboard1.webapp.scl3.mozilla.com [10.22.81.129]
# Query_time: 2.644111 Lock_time: 0.000000 Rows_sent: 10 Rows_examined: 1724977
SET timestamp=1396875454;
SELECT DISTINCT `life_push`.`id`, `life_push`.`push_date` FROM `life_push` INNER JOIN `life_push_changesets` ON (`life_push`.`id` = `life_push_changesets`.`push_id`) INNER JOIN `life_changeset` ON (`life_push_changesets`.`changeset_id` = `life_changeset`.`id`) WHERE (`life_changeset`.`branch_id` = 1 AND ((`life_push`.`push_date` >= '2014-03-17 21:50:00' AND `life_push`.`repository_id` = 907 ) OR (`life_push`.`push_date` <= '2014-03-17 21:50:00' AND `life_push`.`push_date` >= '2014-02-03 22:45:00' AND `life_push`.`repository_id` = 803 ) OR (`life_push`.`push_date` <= '2014-02-03 22:45:00' AND `life_push`.`push_date` >= '2013-12-09 19:30:00' AND `life_push`.`repository_id` = 58 )) AND `life_push`.`push_date` < '2014-04-06 05:27:26' ) ORDER BY `life_push`.`push_date` DESC LIMIT 10;
These got introduced in bug 658656, one of the two is in
https://github.com/mozilla/elmo/blob/master/apps/shipping/views/signoff.py#L180
Comment 1•11 years ago
|
||
Before we attack this in django, can we perhaps try to figure out if we're using indexes to maximum benefit. I tried running EXPLAIN on the queries but I don't really understand the output.
Perhaps we can needinfo on Sheeri.
Comment 2•11 years ago
|
||
Sheeri,
Can you help us checking that we have sufficient use of indexes on the query above (there's two of them but one is just the count of another).
We have some ideas on the django side but first we thought we'd check that the queries are as fast as can be. Sometimes an index can increase the speed by a massive difference.
The prod DB is: l10n_site and it's running on bm-l10n-db
Flags: needinfo?(scabral)
Comment 3•11 years ago
|
||
This is what I get on my local mysql 5.6.16
https://gist.github.com/peterbe/10023990
Comment 4•11 years ago
|
||
Table definitions:
mysql> show create table life_push\G
*************************** 1. row ***************************
Table: life_push
Create Table: CREATE TABLE `life_push` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`repository_id` int(11) NOT NULL,
`user` varchar(200) NOT NULL,
`push_date` datetime NOT NULL,
`push_id` int(10) unsigned NOT NULL,
PRIMARY KEY (`id`),
KEY `life_push_repository_id` (`repository_id`),
KEY `life_push_user` (`user`),
KEY `life_push_push_date` (`push_date`),
CONSTRAINT `repository_id_refs_id_20591d59` FOREIGN KEY (`repository_id`) REFERENCES `life_repository` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=165767 DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
mysql> show create table life_changeset\G
*************************** 1. row ***************************
Table: life_changeset
Create Table: CREATE TABLE `life_changeset` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`revision` varchar(40) NOT NULL,
`user` varchar(200) NOT NULL,
`description` longtext,
`branch_id` int(11) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `revision` (`revision`),
KEY `life_changeset_user` (`user`),
KEY `life_changeset_branch_id` (`branch_id`),
CONSTRAINT `branch_id_refs_id_2146ec18` FOREIGN KEY (`branch_id`) REFERENCES `life_branch` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=381470 DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
Flags: needinfo?(scabral)
Comment 5•11 years ago
|
||
Both queries examine almost 200,000 rows in the life_changeset table, and they both use the life_changeset_branch_id key, on branch_id - both queries look for branch_id=1.
305,536 entries have branch_id=1. But there are over 800 branch ids in general, so the index isn't a bad one. I've tested with several other indexes (for example, branch_id plus id) and I don't see anything easy to optimize there. The real problem is that whenever we're searching for branch_id=1, there are tons of different entries, and the life_changeset table only uses branch_id and id, and I tried those indexes.
Comment 6•11 years ago
|
||
So basically the joining on changesets all with branch_id=1 is what's causing the excessive strain.
See:
https://gist.github.com/peterbe/10411869
So about 25% of the life_changeset records have branch_id that is NOT 1.
I wonder if we could instead select all from life_push and somehow "post process" in django and filter out what's from the wrong changesets. Ideas, Pike?
| Reporter | ||
Comment 7•11 years ago
|
||
Yeah, something like that. The non-default-branch landings are actually pretty common on the beta branch, so the logic should assume that it's gotta ignore a few.
I wonder if we could just get a bunch of push IDs, and then kick off a second query which of those have changesets on the default branch. That might make us transfer less data back and forth, not sure.
| Reporter | ||
Comment 8•11 years ago
|
||
Here's one weird thing I found:
Just the join makes the query slow, not actually selecting anything from it.
mysql> SELECT COUNT(DISTINCT `life_push`.`id`) FROM `life_push` INNER JOIN `life_push_changesets` ON (`life_push`.`id` = `life_push_changesets`.`push_id`) INNER JOIN `life_changeset` ON (`life_push_changesets`.`changeset_id` = `life_changeset`.`id`) WHERE (((`life_push`.`push_date` <= '2014-03-17 21:50:00' AND `life_push`.`push_date` >= '2014-02-03 22:45:00' AND `life_push`.`repository_id` = 829 ) OR (`life_push`.`push_date` <= '2014-02-03 22:45:00' AND `life_push`.`push_date` >= '2013-12-09 19:30:00' AND `life_push`.`repository_id` = 83 ) OR (`life_push`.`push_date` >= '2014-03-17 21:50:00' AND `life_push`.`repository_id` = 934 )));
+----------------------------------+
| COUNT(DISTINCT `life_push`.`id`) |
+----------------------------------+
| 13 |
+----------------------------------+
1 row in set (3.18 sec)
mysql> SELECT COUNT(DISTINCT `life_push`.`id`) FROM `life_push` WHERE (((`life_push`.`push_date` <= '2014-03-17 21:50:00' AND `life_push`.`push_date` >= '2014-02-03 22:45:00' AND `life_push`.`repository_id` = 829 ) OR (`life_push`.`push_date` <= '2014-02-03 22:45:00' AND `life_push`.`push_date` >= '2013-12-09 19:30:00' AND `life_push`.`repository_id` = 83 ) OR (`life_push`.`push_date` >= '2014-03-17 21:50:00' AND `life_push`.`repository_id` = 934 )) );
+----------------------------------+
| COUNT(DISTINCT `life_push`.`id`) |
+----------------------------------+
| 13 |
+----------------------------------+
1 row in set (0.23 sec)
The other thing of interest, I did dive in to actually spot how we regressed this, and found that we had the slow query before. The regression bit is that we're exercising it twice now, instead of once.
| Reporter | ||
Comment 9•11 years ago
|
||
I think that branches and pushes are really essential in elmo, which might satisfy to add some redundancy in the data.
Would it help to add a ManyToMany between Branch and Push?
Right now, it's Push <- MM -> Changeset -> Branch, and we could add a copy as Push <- MM -> Branch.
Comment 10•11 years ago
|
||
(In reply to Axel Hecht [:Pike] from comment #8)
> Here's one weird thing I found:
>
> Just the join makes the query slow, not actually selecting anything from it.
>
> mysql> SELECT COUNT(DISTINCT `life_push`.`id`) FROM `life_push` INNER JOIN
> `life_push_changesets` ON (`life_push`.`id` =
> `life_push_changesets`.`push_id`) INNER JOIN `life_changeset` ON
> (`life_push_changesets`.`changeset_id` = `life_changeset`.`id`) WHERE
> (((`life_push`.`push_date` <= '2014-03-17 21:50:00' AND
> `life_push`.`push_date` >= '2014-02-03 22:45:00' AND
> `life_push`.`repository_id` = 829 ) OR (`life_push`.`push_date` <=
> '2014-02-03 22:45:00' AND `life_push`.`push_date` >= '2013-12-09 19:30:00'
> AND `life_push`.`repository_id` = 83 ) OR (`life_push`.`push_date` >=
> '2014-03-17 21:50:00' AND `life_push`.`repository_id` = 934 )));
> +----------------------------------+
> | COUNT(DISTINCT `life_push`.`id`) |
> +----------------------------------+
> | 13 |
> +----------------------------------+
> 1 row in set (3.18 sec)
>
> mysql> SELECT COUNT(DISTINCT `life_push`.`id`) FROM `life_push` WHERE
> (((`life_push`.`push_date` <= '2014-03-17 21:50:00' AND
> `life_push`.`push_date` >= '2014-02-03 22:45:00' AND
> `life_push`.`repository_id` = 829 ) OR (`life_push`.`push_date` <=
> '2014-02-03 22:45:00' AND `life_push`.`push_date` >= '2013-12-09 19:30:00'
> AND `life_push`.`repository_id` = 83 ) OR (`life_push`.`push_date` >=
> '2014-03-17 21:50:00' AND `life_push`.`repository_id` = 934 )) );
> +----------------------------------+
> | COUNT(DISTINCT `life_push`.`id`) |
> +----------------------------------+
> | 13 |
> +----------------------------------+
> 1 row in set (0.23 sec)
>
Wow!
I remember digging deep into similar optimization "problems" in postgres. The preconditions were quite different I suspect. The tl;dr of that was that postgres is very smart in figuring out what of a join won't be needed unless you join on non-uniques.
>
> The other thing of interest, I did dive in to actually spot how we regressed
> this, and found that we had the slow query before. The regression bit is
> that we're exercising it twice now, instead of once.
That's funny! The Django ORM is so darn handy you're tempted to make very inefficient queries.
Comment 11•11 years ago
|
||
(In reply to Axel Hecht [:Pike] from comment #9)
> I think that branches and pushes are really essential in elmo, which might
> satisfy to add some redundancy in the data.
>
> Would it help to add a ManyToMany between Branch and Push?
>
> Right now, it's Push <- MM -> Changeset -> Branch, and we could add a copy
> as Push <- MM -> Branch.
That's what I thought too!
Redundant but would definitely solve all of this.
so something like:
class Push(models.Model):
"""stores context of who pushed what when
Fields:
repository -- repository changesets were pushed to
user -- person who did the push
push_date -- date and time of the push
push_id -- unique id of the push
"""
objects = PushManager()
repository = models.ForeignKey(Repository)
changesets = models.ManyToManyField(Changeset, related_name="pushes")
+ branches = models.ManyToManyField(Branch, related_name="pushes")
user = models.CharField(max_length=200, db_index=True)
push_date = models.DateTimeField('date of push', db_index=True)
push_id = models.PositiveIntegerField(default=0)
So a push can definitely span multiple branches? Somehow that tastes strange but I'm no hg expert.
Comment 12•11 years ago
|
||
Adding Brandon because he's pretty awesome at puzzles like this.
| Reporter | ||
Comment 13•11 years ago
|
||
Data design info: both hg push and git push affect multiple branches and/or bookmarks. Though branches are rather different things in hg and git.
What's not explicit in this bug, but the reason why we made it blocked by bug 915732, we don't see this query as being slow in our local testing with 5.6, so we'd like to see if an updated mysql would "just fix" this.
| Reporter | ||
Comment 14•11 years ago
|
||
Resolving WORKSFORME, now that we're on the new dbs.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → WORKSFORME
Updated•5 years ago
|
Product: Webtools → Webtools Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•