Closed Bug 1424412 Opened 8 years ago Closed 8 years ago

Confirm that query times aren't ridiculous in client_count_daily

Categories

(Data Platform and Tools Graveyard :: Datasets: Client Count, enhancement, P1)

x86
macOS
enhancement

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: relud, Assigned: relud)

Details

The new incremental dataset for client count uses submission_date, but still includes subsession_start_date for backward compatibility. We need to confirm that query times aren't ridiculous now that we are no longer partitioning by subsession_start_date.
Points: 2 → 1
Some initial testing shows that query times are about double in client_count_daily (26-27 min) compared to client_count (about 8 min) when using this query: https://sql.telemetry.mozilla.org/queries/81/source#128 That said, I believe this is a result of having daily rollups instead of a 6-month rollup, because using submission_date (27 min) was slower than using activity_date (26 min). For client_count_daily I used this WHERE clause to limit it to the same time span as the client_count table:
> WHERE submission_date > '20170619' AND submission_date < '20171219'
Correction: 13 minutes for client_count, not 8. Query times are about double in client_count_daily (26-27 min) compared to client_count (13 min).
:relud, can you link to the queries you ran?
Flags: needinfo?(dthorn)
(Times updated because I ran the queries again.)
Simple count comparison:
(3 min) https://sql.telemetry.mozilla.org/queries/50006/source Client Count
(8 min) https://sql.telemetry.mozilla.org/queries/50007/source Client Count Daily
Complex count comparison:
(11 min) https://sql.telemetry.mozilla.org/queries/81/source#128 Firefox ER
(21 min) https://sql.telemetry.mozilla.org/queries/50005/source Firefox ER with client_count_daily
(23 min) https://sql.telemetry.mozilla.org/queries/50004/source Firefox ER with client_count_daily and submission_date
You may notice the graphs look a little odd in that last one. That's because activity_date is formatted like '%Y-%m-%d' and submission_date is formatted like '%Y%m%d', so they aren't actually interchangeable right now. I think this will need fixing in another bug.
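To illustrate the format mismatch described above, here is a minimal Python sketch that converts between the two date string formats. The helper names are hypothetical and not part of any dataset or job; only the two format strings come from this bug. It also shows why the strings can't be compared directly: a dashed date sorts before a compact date of the same year, because '-' sorts before any digit.

```python
from datetime import datetime

ACTIVITY_FMT = "%Y-%m-%d"   # activity_date format, as stated in this bug
SUBMISSION_FMT = "%Y%m%d"   # submission_date format, as stated in this bug


def submission_to_activity(value: str) -> str:
    """Convert a '%Y%m%d' string to '%Y-%m-%d' (hypothetical helper)."""
    return datetime.strptime(value, SUBMISSION_FMT).strftime(ACTIVITY_FMT)


def activity_to_submission(value: str) -> str:
    """Convert a '%Y-%m-%d' string to '%Y%m%d' (hypothetical helper)."""
    return datetime.strptime(value, ACTIVITY_FMT).strftime(SUBMISSION_FMT)


if __name__ == "__main__":
    print(submission_to_activity("20170619"))   # 2017-06-19
    print(activity_to_submission("2017-12-19"))  # 20171219
    # Plain string comparison across the two formats is misleading:
    # July 1 is after June 19, yet the dashed string compares as smaller.
    print("2017-07-01" > "20170619")             # False
```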
Flags: needinfo?(dthorn)
:frank do you consider those query time increases acceptable?
Status: NEW → ASSIGNED
Flags: needinfo?(fbertsch)
Double seems fine, and expected. If we get https://github.com/mozilla/redash/issues/35 fixed, these submission_date queries will be updated incrementally, which would be much faster.
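As a rough illustration of why incremental updates over daily rollups would be faster, the Python sketch below caches a per-day result and only computes days it has not seen before. This is an assumption-laden toy, not how Redash implements or plans to implement the linked issue; `incremental_counts`, `cache`, and `count_for_day` are all hypothetical names.

```python
from typing import Callable, Dict, List


def incremental_counts(
    days: List[str],
    cache: Dict[str, int],
    count_for_day: Callable[[str], int],
) -> Dict[str, int]:
    """Return a count per day, computing only days missing from the cache.

    `count_for_day` stands in for an expensive per-day query
    (hypothetical; in practice this would hit the dataset).
    """
    for day in days:
        if day not in cache:
            cache[day] = count_for_day(day)
    return {day: cache[day] for day in days}


if __name__ == "__main__":
    computed = []

    def fake_count(day: str) -> int:
        # Record which days were actually computed; return a dummy value.
        computed.append(day)
        return 0

    cache: Dict[str, int] = {}
    incremental_counts(["20171217", "20171218"], cache, fake_count)
    # Second run adds one new day; only that day is computed.
    incremental_counts(["20171217", "20171218", "20171219"], cache, fake_count)
    print(computed)  # ['20171217', '20171218', '20171219']
```

Each refresh then costs roughly one new day of work rather than a rescan of the whole time span.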
Flags: needinfo?(fbertsch)
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Product: Data Platform and Tools → Data Platform and Tools Graveyard