Closed
Bug 824987
Opened 12 years ago
Closed 12 years ago
query analysis for bugzilla query checksum 281345723352719199
Categories
(Data & BI Services Team :: DB: MySQL, task)
Tracking
(Not tracked)
RESOLVED
DUPLICATE
of bug 825263
People
(Reporter: scabral, Assigned: mreid)
References
Details
(Whiteboard: change from subquery to join)
SELECT bug_id
, UNIX_TIMESTAMP(a.creation_ts)*1000 AS modified_ts
, login_name AS modified_by
, UNIX_TIMESTAMP(a.creation_ts)*1000 AS created_ts
, login_name AS created_by
, ispatch AS 'attachments.ispatch'
, isobsolete AS 'attachments.isobsolete'
, isprivate AS 'attachments.isprivate'
, attach_id
FROM
attachments a
JOIN profiles p ON a.submitter_id = p.userid
WHERE bug_id IN (
SELECT bug_id FROM bugs WHERE
UNIX_TIMESTAMP(delta_ts)*1000 > 1346050769000
AND true
)
ORDER BY
bug_id,
attach_id,
a.creation_ts
Reporter | ||
Comment 1•12 years ago
|
||
Here's the EXPLAIN on MySQL 5.1 (current running bugzilla). It's not great, because of the dependent subquery on the bugs table and the resulting full table scan on the attachments table.
*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table: a
type: ALL
possible_keys: attachments_submitter_id_idx
key: NULL
key_len: NULL
ref: NULL
rows: 679625
Extra: Using where; Using filesort
*************************** 2. row ***************************
id: 1
select_type: PRIMARY
table: p
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 3
ref: bugs.a.submitter_id
rows: 1
Extra:
*************************** 3. row ***************************
id: 2
select_type: DEPENDENT SUBQUERY
table: bugs
type: unique_subquery
possible_keys: PRIMARY
key: PRIMARY
key_len: 3
ref: func
rows: 1
Extra: Using where
3 rows in set (0.00 sec)
Reporter | ||
Comment 2•12 years ago
|
||
In MariaDB 5.5, however, the query optimization is much better - the same query's plan still has the full table scan on attachments, but only a primary key lookup on the bugs table, not a dependent subquery.
*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table: a
type: ALL
possible_keys: attachments_bug_id_idx,attachments_submitter_id_idx
key: NULL
key_len: NULL
ref: NULL
rows: 689913
Extra: Using filesort
*************************** 2. row ***************************
id: 1
select_type: PRIMARY
table: p
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 3
ref: bugs.a.submitter_id
rows: 1
Extra:
*************************** 3. row ***************************
id: 1
select_type: PRIMARY
table: bugs
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 3
ref: bugs.a.bug_id
rows: 1
Extra: Using where
Reporter | ||
Comment 3•12 years ago
|
||
Running it on MySQL 5.1 3 times in a row, I'm getting results in 3.58, 3.71 and 3.61 seconds.
Running it on MariaDB 5.5 3 times in a row, I'm getting results in 3.34, 3.32 and 3.31 seconds.
So it's a bit faster on MariaDB 5.5, but not tons faster.
Here are the FROM/WHERE clauses:
FROM attachments a JOIN profiles p ON a.submitter_id = p.userid
WHERE bug_id IN (SELECT bug_id FROM bugs WHERE UNIX_TIMESTAMP(delta_ts)*1000 > 1346050769000 AND true) ORDER BY bug_id, attach_id, a.creation_ts\G
And from the fingerprint:
from attachments a join profiles p on a.submitter_id = p.userid where bug_id in ( select bug_id from bugs where unix_timestamp(delta_ts)*? > ? and true ) order by bug_id, attach_id, a.creation_ts
From what I can see, this could easily be turned into a JOIN - there's no grouping or other filtering that would prevent this from being a JOIN like this:
FROM attachments a
JOIN profiles p ON a.submitter_id = p.userid
JOIN bugs ON a.bug_id = bugs.bug_id
WHERE UNIX_TIMESTAMP(delta_ts)*1000 > 1346050769000 ORDER BY a.bug_id, attach_id, a.creation_ts\G
SELECT a.bug_id, UNIX_TIMESTAMP(a.creation_ts)*1000 AS modified_ts, login_name AS modified_by, UNIX_TIMESTAMP(a.creation_ts)*1000 AS created_ts, login_name AS created_by, ispatch AS 'attachments.ispatch', isobsolete AS 'attachments.isobsolete', isprivate AS 'attachments.isprivate', attach_id
FROM attachments a
JOIN profiles p ON a.submitter_id = p.userid
JOIN bugs ON a.bug_id = bugs.bug_id
WHERE UNIX_TIMESTAMP(delta_ts)*1000 > 1346050769000 ORDER BY a.bug_id, attach_id, a.creation_ts\G
This results in:
3.14, 3.18, 3.19 seconds on MySQL 5.1
3.30, 3.45, 3.31 seconds on MariaDB 5.5
So not much of a difference on MariaDB 5.5, but definitely a difference on MySQL 5.1.
Is there any way you can change this from a subquery to a JOIN? There's almost a half-second difference on the current running MySQL.
Reporter | ||
Updated•12 years ago
|
![]() |
||
Comment 4•12 years ago
|
||
I don't know where this code comes from, but it doesn't exist upstream. Must come from a bmo extension.
Reporter | ||
Updated•12 years ago
|
Whiteboard: change from subquery to join
Reporter | ||
Updated•12 years ago
|
Assignee: server-ops-database → scabral
Comment 5•12 years ago
|
||
That Unix time (1346050769000) is Mon, 27 Aug 2012 06:59:29 GMT. What date are the logs being analysed?
I wonder if this is a notification extension of some sort.
Gerv
Reporter | ||
Comment 6•12 years ago
|
||
Nope, it's just that we were using the logs from then:
https://bugzilla.mozilla.org/show_bug.cgi?id=783545#c10
(In reply to Frédéric Buclin from comment #4)
> I don't know where this code comes from, but it doesn't exist upstream. Must
> come from a bmo extension.
this isn't generated by bugzilla/bmo code.
my bet would be this is metrics polling the db for changes.
(In reply to Gervase Markham [:gerv] from comment #5)
> That Unix time (1346050769000) is Mon, 27 Aug 2012 06:59:29 GMT. What date
> are the logs being analysed?
well spotted! hopefully these are old logs, because that time span would cover a lot of bugs.
(In reply to Byron Jones ‹:glob› (away until 2nd Jan) from comment #7)
> well spotted! hopefully these are old logs, because that time span would
> cover a lot of bugs.
d'oh somehow i missed sheeri's response to this in comment 6 :)
Comment 9•12 years ago
|
||
These are queries from Metrics. The subquery was originally written with the idea of avoiding a large read on the big activities table and was carried over to all the others. If a join is better across the board, we can definitely change the queries around.
Mark, can you please work with Sheeri on that?
Assignee: scabral → mreid
Assignee | ||
Comment 10•12 years ago
|
||
Sure. I've actually got a development version that uses JOINs instead of subqueries since it sped things up for local testing.
Sheeri, I got good results using something like this - can you confirm that it gives the desired speedup for you too?
SELECT a.bug_id, UNIX_TIMESTAMP(a.creation_ts)*1000 AS modified_ts, login_name AS modified_by, UNIX_TIMESTAMP(a.creation_ts)*1000 AS created_ts, login_name AS created_by, ispatch AS 'attachments.ispatch', isobsolete AS 'attachments.isobsolete', isprivate AS 'attachments.isprivate', attach_id FROM attachments a INNER JOIN (SELECT bug_id FROM bugs WHERE UNIX_TIMESTAMP(delta_ts)*1000 > 1346050769000) q ON q.bug_id = a.bug_id JOIN profiles p ON a.submitter_id = p.userid ORDER BY bug_id, attach_id, a.creation_ts
Reporter | ||
Comment 11•12 years ago
|
||
You do need to specify which "bug_id" in the ORDER BY. The EXPLAIN for this is:
*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table: <derived2>
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 78827
Extra: Using temporary; Using filesort
*************************** 2. row ***************************
id: 1
select_type: PRIMARY
table: a
type: ref
possible_keys: attachments_bug_id_idx,attachments_submitter_id_idx
key: attachments_bug_id_idx
key_len: 3
ref: q.bug_id
rows: 1
Extra:
*************************** 3. row ***************************
id: 1
select_type: PRIMARY
table: p
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 3
ref: bugs.a.submitter_id
rows: 1
Extra:
*************************** 4. row ***************************
id: 2
select_type: DERIVED
table: bugs
type: index
possible_keys: NULL
key: bugs_delta_ts_idx
key_len: 8
ref: NULL
rows: 840702
Extra: Using where; Using index
4 rows in set (0.81 sec)
There's still a subquery there. Testing it out, it's:
MySQL 5.1 - 4.86, 4.63, 4.68 seconds
MariaDB 5.5 - 3.88, 3.83, 3.84 seconds
So the query I offered in comment 3 is still much faster, in both MySQL 5.1 and MariaDB 5.5.
Assignee | ||
Comment 12•12 years ago
|
||
When I try to run the JOIN without a subquery on my dev machine, it doesn't seem to want to use an index. Here's the explain I get:
mysql> explain SELECT a.bug_id, UNIX_TIMESTAMP(a.creation_ts)*1000 AS modified_ts,
login_name AS modified_by, UNIX_TIMESTAMP(a.creation_ts)*1000 AS created_ts,
login_name AS created_by, ispatch AS 'attachments.ispatch',
isobsolete AS 'attachments.isobsolete', isprivate AS 'attachments.isprivate',
attach_id FROM attachments a JOIN profiles p ON a.submitter_id = p.userid
JOIN bugs ON a.bug_id = bugs.bug_id WHERE UNIX_TIMESTAMP(delta_ts)*1000 > 1330450269000 ORDER BY a.bug_id, attach_id, a.creation_ts\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: a
type: ALL
possible_keys: attachments_bug_id_idx,attachments_submitter_id_idx
key: NULL
key_len: NULL
ref: NULL
rows: 582654
Extra: Using filesort
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: p
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 3
ref: bugs.a.submitter_id
rows: 1
Extra:
*************************** 3. row ***************************
id: 1
select_type: SIMPLE
table: bugs
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 3
ref: bugs.a.bug_id
rows: 1
Extra: Using where
3 rows in set (0.01 sec)
The query ends up taking >10 seconds to run, while the join-on-subquery version runs in about 1 second (using the same plan as you saw in comment 11).
Do you know how I can get it to choose the attachments_bug_id_idx index?
Comment 13•12 years ago
|
||
This is the same general problem as https://bugzilla.mozilla.org/show_bug.cgi?id=825263, solution already implemented in new version
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → DUPLICATE
Updated•11 years ago
|
Product: mozilla.org → Data & BI Services Team
You need to log in
before you can comment on or make changes to this bug.
Description
•