Closed
Bug 1424412
Opened 8 years ago
Closed 8 years ago
Confirm that query times aren't ridiculous in client_count_daily
Categories
(Data Platform and Tools Graveyard :: Datasets: Client Count, enhancement, P1)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: relud, Assigned: relud)
The new incremental data set for client count uses submission_date, but still includes subsession_start_date for backward compatibility. We need to confirm that query times aren't ridiculous now that we are no longer partitioning by subsession_start_date.
Updated • 8 years ago (Assignee)
Points: 2 → 1
Comment 1 • 8 years ago (Assignee)
Some initial testing shows that query times are about double in client_count_daily (26-27 min) as compared to client_count (about 8 min) when using this query https://sql.telemetry.mozilla.org/queries/81/source#128
That said, I believe this is a result of having daily rollups instead of a 6 month rollup, rather than of the choice of date column: using submission_date (27 min) was only slightly slower than using activity_date (26 min).
For client_count_daily I used this where clause to limit it to the same time span as the client_count table:
> WHERE submission_date > '20170619' AND submission_date < '20171219'
Comment 2 • 8 years ago (Assignee)
Correction: 13 minutes for client_count, not 8.
Query times are about double in client_count_daily (26-27 min) as compared to client_count (13 min).
Comment 4 • 8 years ago (Assignee)
(times updated because I ran the queries again)
Simple count comparison:
(3 min) https://sql.telemetry.mozilla.org/queries/50006/source Client Count
(8 min) https://sql.telemetry.mozilla.org/queries/50007/source Client Count Daily
Complex count comparison:
(11 min) https://sql.telemetry.mozilla.org/queries/81/source#128 Firefox ER
(21 min) https://sql.telemetry.mozilla.org/queries/50005/source Firefox ER with client_count_daily
(23 min) https://sql.telemetry.mozilla.org/queries/50004/source Firefox ER with client_count_daily and submission_date
You may notice the graphs look a little odd in that last one. That's because activity_date is formatted like '%Y-%m-%d' and submission_date is formatted like '%Y%m%d', so the two columns aren't actually interchangeable right now. I think this will need fixing in another bug.
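To illustrate the format mismatch described above, here is a minimal Python sketch that normalizes a submission_date string ('%Y%m%d') into the activity_date layout ('%Y-%m-%d'). The function name `to_activity_format` is hypothetical and is not part of any existing query or job; an actual fix would more likely live in the dataset or in the SQL itself.

```python
from datetime import datetime


def to_activity_format(submission_date: str) -> str:
    """Convert a 'YYYYMMDD' string to 'YYYY-MM-DD'.

    Hypothetical helper for illustration only; raises ValueError
    if the input does not match the '%Y%m%d' layout.
    """
    parsed = datetime.strptime(submission_date, "%Y%m%d")
    return parsed.strftime("%Y-%m-%d")


# Example using one of the bounds from the WHERE clause in comment 1.
print(to_activity_format("20170619"))
```

Once both columns share a layout, values from the two tables can be compared or plotted on the same axis without the oddities seen in the last graph.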
Flags: needinfo?(dthorn)
Comment 5 • 8 years ago (Assignee)
:frank do you consider those query time increases acceptable?
Status: NEW → ASSIGNED
Flags: needinfo?(fbertsch)
Comment 6 • 8 years ago
Double seems fine, and expected. If we get https://github.com/mozilla/redash/issues/35 fixed, these submission_date queries could be updated incrementally, which would be much faster.
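As a quick check on "double", the slowdown factors implied by the timings in comment 4 can be worked out directly. This is only a sketch: the labels are chosen here for illustration, and the numbers are the single-run timings reported above, so they carry whatever run-to-run noise those queries had.

```python
# (label, client_count minutes, client_count_daily minutes), copied from comment 4.
# Labels are illustrative and do not correspond to query names.
timings = [
    ("simple count", 3, 8),
    ("Firefox ER, activity_date", 11, 21),
    ("Firefox ER, submission_date", 11, 23),
]


def slowdown(baseline_min: float, daily_min: float) -> float:
    """Return how many times longer the daily-table query took."""
    return daily_min / baseline_min


ratios = {label: round(slowdown(base, daily), 2) for label, base, daily in timings}

for label, ratio in ratios.items():
    print(f"{label}: {ratio}x")
```

On these numbers the Firefox ER query is close to 2x on either date column, while the simple count is somewhat above 2x.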
Flags: needinfo?(fbertsch)
Updated • 8 years ago (Assignee)
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Updated • 5 years ago
Product: Data Platform and Tools → Data Platform and Tools Graveyard