Closed
Bug 1354640
Opened 7 years ago
Closed 7 years ago
Unable to run query
Categories
(Data Platform and Tools :: General, defect)
Data Platform and Tools
General
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: benjamin, Unassigned)
References
()
Details
Attachments
(1 file)
5.45 KB,
text/plain
|
Details |
I'm trying to run a query to graph the distribution of Flash versions over time: https://sql.telemetry.mozilla.org/queries/4012/source This query will not run. Here are some of the error messages I have received: "Error running query: Could not communicate with the remote task. The node may have crashed or be under too much load. This is probably a transient issue, so please retry your query in a few minutes." "Error running query: Should not have nextUri if failed" I'm not sure whether this is a transient operational issue, or a problem with the query itself. I don't *think* that this query ought to be unusually expensive, but I've been wrong before. Text of the query, for reference: WITH clients AS ( SELECT SUBSTR(s.subsession_start_date, 1, 10) AS activity_date, s.subsession_length, p.active_plugins.version AS flash_version FROM longitudinal l CROSS JOIN UNNEST( l.subsession_start_date, l.subsession_length, l.active_plugins ) AS s( subsession_start_date, subsession_length, active_plugins ) CROSS JOIN UNNEST(s.active_plugins) AS p(active_plugins) WHERE normalized_channel = 'release' AND os = 'Windows_NT' AND p.active_plugins.name = 'Shockwave Flash' AND s.subsession_start_date > '2016-09-01' ) SELECT activity_date, CASE WHEN COALESCE(TRY_CAST(SPLIT_PART(flash_version, '.', 1) AS TINYINT), 0) BETWEEN 20 AND 63 THEN flash_version ELSE '0:Older' END AS flash_version, SUM(subsession_length) AS usage FROM clients GROUP BY activity_date, flash_version ORDER BY activity_date DESC, flash_version DESC
Reporter | ||
Comment 1•7 years ago
|
||
An update here: robotblake said he kicked the cluster on Friday, but this query continues to time out/not run properly. Could I get somebody to either look at the query and tell me if I'm doing something really dumb from a query-performance perspective, or check server logs to see what's failing?
Flags: needinfo?(bimsland)
Reporter | ||
Comment 2•7 years ago
|
||
Attached is the query plan for this according to EXPLAIN. I'm not experienced at reading presto query plans, but that there are two RemoteExchange nodes is a bit troubling because I'd really expect you'd only need to do a slave aggregate and a single gather/collect.
Comment 3•7 years ago
|
||
I believe this query was causing the presto cluster to fail due to resource constraints. The same query runs OK against the longitudinal table accessible via Athena per: https://sql.telemetry.mozilla.org/queries/4416/source I've filed Bug 1361803 to ensure that the longitudinal table is kept up to date via the Athena data source. Benjamin, is this OK as a workaround?
Flags: needinfo?(benjamin)
Reporter | ||
Comment 4•7 years ago
|
||
Can you describe what Athena is? As long as the data is accessible I just need guidance about the right way to access it.
Flags: needinfo?(benjamin)
Comment 5•7 years ago
|
||
Athena is an Amazon hosted version of Presto + Hive, it's got some limitations but generally is faster and more stable than the Presto data source.
Flags: needinfo?(bimsland)
Comment 6•7 years ago
|
||
Athena uses the exact same underlying data as our Presto cluster. You can use it by switching the Data Source drop-down from "Presto" to "Athena". You also need to refer to the table as "telemetry.longitudinal", but otherwise it should be the same.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Assignee | ||
Updated•1 year ago
|
Component: Redash (STMO) → General
You need to log in
before you can comment on or make changes to this bug.
Description
•