Turn on slow query logging for graph server

RESOLVED FIXED

Status

RESOLVED FIXED
10 years ago
4 years ago

People

(Reporter: rdoherty, Assigned: xb95)

Tracking

Details

(Reporter)

Description

10 years ago
Can we get aggregated slow query logs from MySQL on the graph server? We're in the process of improving performance and Schrep noticed this query taking forever:

SELECT id, machine, test, test_type, date, extra_data, branch FROM dataset_info WHERE type = 'discrete' AND test_type != 'baseline' and (date >= 0)

We'd like to get a list of other queries we that are slow too.
Assignee: server-ops → mark
Component: Server Operations: Web Content Push → Server Operations
(Assignee)

Comment 1

10 years ago
Well, that particular query sucks because there is no index that it can use.  And adding an index would be futile:

mysql> select type, count(*) from dataset_info group by 1;
+------------+----------+
| type       | count(*) |
+------------+----------+
| continuous |      615 |
| discrete   |  1428379 |
+------------+----------+
2 rows in set (22.10 sec)

mysql> select test_type, count(*) from dataset_info group by 1;
+-----------+----------+
| test_type | count(*) |
+-----------+----------+
| perf      |  1429011 |
+-----------+----------+
1 row in set (15.23 sec)

mysql> select min(date) from dataset_info;
+-----------+
| min(date) |
+-----------+
|         0 |
+-----------+
1 row in set (1.27 sec)

So in effect, this query is the same as 'SELECT * FROM dataset_info'.  Which is pretty incredibly lame, given there are 1.5 million rows.  Why are you doing this to begin with?  It seems broken.  :)

Anyway, I've got slow query logs for the slave (mrdb06) from February.  The master (mrdb05) is not currently logging queries.  Can you check the graph server and see if the queries you're interested in are sent to the slave, or if they're master only?

Turning on slow query log requires a database downtime, so the soonest I can do this is next Tuesday, as that's the next window.  But if you only need the slave, I can get that pretty quick.  Let me know.
Status: NEW → ASSIGNED
(Reporter)

Comment 2

10 years ago
Not sure why that query is written that way, will have to dig in.

I don't know what db(s) the frontend is connecting to. The connection info is in server/graphsdb.py.
(Reporter)

Comment 3

10 years ago
I've done some work and would like a query benchmarked in production:

SELECT id, machine, test, test_type, MAX(date), extra_data, branch FROM dataset_info WHERE type = "discrete" and test_type != "baseline" and date >= 1216675965 GROUP BY machine, branch, test


This query will give us the necessary data while avoiding selecting 1.5 million rows. Thanks!
(Reporter)

Updated

10 years ago
Blocks: 438716
(Reporter)

Comment 4

10 years ago
Closing, Mark ran the new query, which was a lot faster. Will reopen the bug if necessary.
Status: ASSIGNED → RESOLVED
Last Resolved: 10 years ago
Resolution: --- → FIXED
Product: mozilla.org → mozilla.org Graveyard
You need to log in before you can comment on or make changes to this bug.