Closed
Bug 1025735
Opened 10 years ago
Closed 10 years ago
Timestamp-sorted queries are "using filesort" rather than the index
Categories
(Cloud Services Graveyard :: Server: Sync, defect)
Cloud Services Graveyard
Server: Sync
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: rfkelly, Assigned: rfkelly)
References
Details
(Whiteboard: [qa?])
Attachments
(1 file, 1 obsolete file)
1.38 KB,
patch
|
telliott
:
review+
|
Details | Diff | Splinter Review |
On sync-2.stage, when I ask mysql to explain a query from its slow-query log: MariaDB [weave0]> explain SELECT bso.userid, bso.collection, bso.id, bso.sortindex, bso.modified, bso.payload, bso.payload_size, bso.ttl FROM bso WHERE bso.userid = 960137 AND bso.collection = 10 AND bso.modified > 1400907104000 AND bso.ttl > 1401232339 ORDER BY bso.modified DESC, bso.id DESC; +------+-------------+-------+------+-----------------------------------------+---------+---------+-------------+------+-----------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+-----------------------------------------+---------+---------+-------------+------+-----------------------------+ | 1 | SIMPLE | bso | ref | PRIMARY,bso_ttl_idx,bso_usr_col_mod_idx | PRIMARY | 8 | const,const | 1 | Using where; Using filesort | +------+-------------+-------+------+-----------------------------------------+---------+---------+-------------+------+-----------------------------+ We really want this query to be using the bso_user_col_mod index, and we really *don't* want it to be "using filesort". Bleh. The cause is the secondary "bso.id DESC" in the ORDER BY clause, as shown by: MariaDB [weave0]> explain SELECT bso.userid, bso.collection, bso.id, bso.sortindex, bso.modified, bso.payload, bso.payload_size, bso.ttl FROM bso WHERE bso.userid = 960137 AND bso.collection = 10 AND bso.modified > 1400907104000 AND bso.ttl > 1401232339 ORDER BY bso.modified DESC; +------+-------------+-------+------+-----------------------------------------+---------------------+---------+-------------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+-----------------------------------------+---------------------+---------+-------------+------+-------------+ | 1 | SIMPLE | bso | ref | PRIMARY,bso_ttl_idx,bso_usr_col_mod_idx | bso_usr_col_mod_idx | 8 | const,const | 1 | Using where | +------+-------------+-------+------+-----------------------------------------+---------------------+---------+-------------+------+-------------+ 1 row in set (0.00 sec) That's much nicer. I suspect this was a side-effect of a recent mysql engine upgrade or switch or something. AFAICT this *used* to work in our sync1.1 setup, per https://bugzilla.mozilla.org/show_bug.cgi?id=735102#c14 I *think* the right thing to do here is to explicitly add the "id" column on to the index, reproducing what MySQL apparently used to do implicitly: MariaDB [test]> drop index bso_usr_col_mod_idx on bso;Query OK, 0 rows affected (0.03 sec) Records: 0 Duplicates: 0 Warnings: 0 MariaDB [test]> create index bso_usr_col_mod_idx on bso (userid, collection, modified, id);Query OK, 0 rows affected (0.03 sec) Records: 0 Duplicates: 0 Warnings: 0 MariaDB [test]> explain SELECT bso.userid, bso.collection, bso.id, bso.sortindex, bso.modified, bso.payload, bso.payload_size, bso.ttl FROM bso WHERE bso.userid = 960137 AND bso.collection = 10 AND bso.modified > 1400907104000 AND bso.ttl > 1401232339 ORDER BY bso.modified DESC, id DESC; +------+-------------+-------+------+-----------------------------------------+---------------------+---------+-------------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+------+-----------------------------------------+---------------------+---------+-------------+------+-------------+ | 1 | SIMPLE | bso | ref | PRIMARY,bso_ttl_idx,bso_usr_col_mod_idx | bso_usr_col_mod_idx | 8 | const,const | 1 | Using where | +------+-------------+-------+------+-----------------------------------------+---------------------+---------+-------------+------+-------------+ 1 row in set (0.00 sec) We could also just remove "id" from the ORDER BY clause. It's there to ensure that limit/offset always produces a consistent ordering, but clients are not using offset yet anyway. I discovered this because I need it for the new offset-driven pagination thing in 1007987 and we would need to keep it for that patch to work correctly.
Assignee | ||
Comment 1•10 years ago
|
||
Patch to include id in app-created table index. For prod the corresponding patch would be https://github.com/mozilla-services/puppet-config/pull/603
Assignee: nobody → rfkelly
Attachment #8440496 -
Flags: review?(telliott)
Comment 2•10 years ago
|
||
I have concerns here. You're gaining a nice speedup in the read, but it's at the cost of an additional index, which is going to hurt the write performance and overall disk usage. It may not be much of a win. I think step 1, at least is to simply remove the sort-on-id and see if that alone obviates the need for internal paging. I also *suspect* that the id being part of the pk means that it'll consistently return values correctly in those corner cases where the timestamp crosses the pagination barrier (or produce inconsistencies infrequently enough that we don't worry about it). (Also, minor nit: that last query at the end points to id rather than bso.id)
Comment 3•10 years ago
|
||
Sorry, s/additional/longer/
Updated•10 years ago
|
Whiteboard: [qa?]
Assignee | ||
Comment 4•10 years ago
|
||
OK. Here's a version that doesn't use the id for ordering, that we can experiment with.
Attachment #8440998 -
Flags: review?(telliott)
Comment 5•10 years ago
|
||
Merged https://github.com/mozilla-services/puppet-config/pull/603 Rolling change out to Stage.
Comment 6•10 years ago
|
||
Comment on attachment 8440998 [details] [diff] [review] sync15-no-order-by-id.diff Review of attachment 8440998 [details] [diff] [review]: ----------------------------------------------------------------- Worth a shot.
Attachment #8440998 -
Flags: review?(telliott) → review+
Comment 7•10 years ago
|
||
Specifically, this was rolled out to Sync-1 and Sync-2, but not Sync-3 or Sync-4. I ran 30min load tests on Sync-1 and Sync-3 and saw no appreciable difference other than disk I/O (as shown in StackDriver. :rfkelly will continue to look at these results...
Status: NEW → ASSIGNED
Assignee | ||
Comment 8•10 years ago
|
||
I've been running some more write-heavy tests, and see basically the same as :jbonacci reports above. Sync-1 (which has the extra column) showed around 1M/s more disk IO than Sync-3 (which does not) but it also had an average RPS of 52 versus 47 which might account for some of it.
Assignee | ||
Comment 9•10 years ago
|
||
And unsurprisingly, the overall db size is bigger on sync-1 than sync-3 - by several GB after all our loadtest runs today.
Comment 10•10 years ago
|
||
Sync Server 1.5.6 has been rolled out to stage. The extra column has been removed.
Comment 11•10 years ago
|
||
(In reply to Bob Micheletto [:bobm] from comment #10) > Sync Server 1.5.6 has been rolled out to stage. The extra column has been > removed. Correction: The id index has been removed.
Assignee | ||
Comment 12•10 years ago
|
||
Comment on attachment 8440496 [details] [diff] [review] sync15-include-id-in-timestamp-index.diff Obsoleting the previous patch that we decided not to run with
Attachment #8440496 -
Attachment is obsolete: true
Attachment #8440496 -
Flags: review?(telliott)
Assignee | ||
Comment 13•10 years ago
|
||
This went out in the 1.5.6 deploy (Bug 1025735) so we'll see what kind of effect it has. To verify, we should grab a SELECT query or two from the prod slow-query logs, run EXPLAIN on them, and check that they are indeed using the correct index. I'm going to close this out, we can re-visit the issue of ordering/offset-management when (if?) we get around to rolling out Bug 1007987 or similar.
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Comment 14•10 years ago
|
||
I see your Resolved this. Excellent. But since you said this, we have to do it before I can Verify the ticket: "This went out in the 1.5.6 deploy (Bug 1025735) so we'll see what kind of effect it has. To verify, we should grab a SELECT query or two from the prod slow-query logs, run EXPLAIN on them, and check that they are indeed using the correct index." :-)
Comment 16•10 years ago
|
||
MariaDB [weave0]> EXPLAIN SELECT bso.userid, bso.collection, bso.id, bso.sortindex, bso.modified, bso.payload, bso.payload_size, bso.ttl -> FROM bso -> WHERE bso.userid = XXXXXXX AND bso.collection = 7 AND bso.modified > 1403222235690 AND bso.ttl > 1403297438 ORDER BY bso.sortindex DESC -> LIMIT 5001; +------+-------------+-------+-------+-----------------------------------------+---------------------+---------+------+------+----------------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+-------+-----------------------------------------+---------------------+---------+------+------+----------------------------------------------------+ | 1 | SIMPLE | bso | range | PRIMARY,bso_ttl_idx,bso_usr_col_mod_idx | bso_usr_col_mod_idx | 16 | NULL | 17 | Using index condition; Using where; Using filesort | +------+-------------+-------+-------+-----------------------------------------+---------------------+---------+------+------+----------------------------------------------------+ 1 row in set (0.01 sec)
Flags: needinfo?(bobm)
Assignee | ||
Comment 17•10 years ago
|
||
> ORDER BY bso.sortindex DESC
Yep, unfortunately the sortindex ones will still have to use a filesort. The good news is that it's using the bso_usr_col_mod_idx for selecting the actual rows, rather than doing a full scan of all the user's items.
Bob, are there any "ORDER BY bso.modified" queries in the slow log, and if so can you please also EXPLAIN one of them? It should *not* be using filesort.
Flags: needinfo?(bobm)
Comment 18•10 years ago
|
||
(In reply to Ryan Kelly [:rfkelly] from comment #17) > Bob, are there any "ORDER BY bso.modified" queries in the slow log, and if > so can you please also EXPLAIN one of them? It should *not* be using > filesort. MariaDB [weave0]> EXPLAIN SELECT bso.userid, bso.collection, bso.id, bso.sortindex, bso.modified, bso.payload, bso.payload_size, bso.ttl FROM bso WHERE bso.userid = XXXXXXX AND bso.collection = 4 AND bso.modified > 1403218499710 AND bso.ttl > 1403545706 ORDER BY bso.modified DESC LIMIT 5001; +------+-------------+-------+-------+-----------------------------------------+---------------------+---------+------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+-------+-----------------------------------------+---------------------+---------+------+------+-------------+ | 1 | SIMPLE | bso | range | PRIMARY,bso_ttl_idx,bso_usr_col_mod_idx | bso_usr_col_mod_idx | 16 | NULL | 179 | Using where | +------+-------------+-------+-------+-----------------------------------------+---------------------+---------+------+------+-------------+ 1 row in set (0.32 sec)
Flags: needinfo?(bobm)
Assignee | ||
Comment 19•10 years ago
|
||
:thumbsup:
Updated•1 year ago
|
Product: Cloud Services → Cloud Services Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•