Closed
Bug 1354640
Opened 8 years ago
Closed 8 years ago
Unable to run query
Categories
(Data Platform and Tools :: General, defect)
Data Platform and Tools
General
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: benjamin, Unassigned)
References
()
Details
Attachments
(1 file)
|
5.45 KB,
text/plain
|
Details |
I'm trying to run a query to graph the distribution of Flash versions over time:
https://sql.telemetry.mozilla.org/queries/4012/source
This query will not run. Here are some of the error messages I have received:
"Error running query: Could not communicate with the remote task. The node may have crashed or be under too much load. This is probably a transient issue, so please retry your query in a few minutes."
"Error running query: Should not have nextUri if failed"
I'm not sure whether this is a transient operational issue, or a problem with the query itself. I don't *think* that this query ought to be unusually expensive, but I've been wrong before.
Text of the query, for reference:
WITH clients AS (
SELECT
SUBSTR(s.subsession_start_date, 1, 10) AS activity_date,
s.subsession_length,
p.active_plugins.version AS flash_version
FROM
longitudinal l
CROSS JOIN
UNNEST(
l.subsession_start_date,
l.subsession_length,
l.active_plugins
) AS s(
subsession_start_date,
subsession_length,
active_plugins
)
CROSS JOIN
UNNEST(s.active_plugins) AS p(active_plugins)
WHERE
normalized_channel = 'release'
AND os = 'Windows_NT'
AND p.active_plugins.name = 'Shockwave Flash'
AND s.subsession_start_date > '2016-09-01'
)
SELECT
activity_date,
CASE
WHEN COALESCE(TRY_CAST(SPLIT_PART(flash_version, '.', 1) AS TINYINT), 0) BETWEEN 20 AND 63 THEN flash_version
ELSE '0:Older'
END AS flash_version,
SUM(subsession_length) AS usage
FROM clients
GROUP BY activity_date, flash_version
ORDER BY activity_date DESC, flash_version DESC
| Reporter | ||
Comment 1•8 years ago
|
||
An update here: robotblake said he kicked the cluster on Friday, but this query continues to time out/not run properly.
Could I get somebody to either look at the query and tell me if I'm doing something really dumb from a query-performance perspective, or check server logs to see what's failing?
Flags: needinfo?(bimsland)
| Reporter | ||
Comment 2•8 years ago
|
||
Attached is the query plan for this according to EXPLAIN.
I'm not experienced at reading presto query plans, but that there are two RemoteExchange nodes is a bit troubling because I'd really expect you'd only need to do a slave aggregate and a single gather/collect.
Comment 3•8 years ago
|
||
I believe this query was causing the presto cluster to fail due to resource constraints.
The same query runs OK against the longitudinal table accessible via Athena per:
https://sql.telemetry.mozilla.org/queries/4416/source
I've filed Bug 1361803 to ensure that the longitudinal table is kept up to date via the Athena data source.
Benjamin, is this OK as a workaround?
Flags: needinfo?(benjamin)
| Reporter | ||
Comment 4•8 years ago
|
||
Can you describe what Athena is? As long as the data is accessible I just need guidance about the right way to access it.
Flags: needinfo?(benjamin)
Comment 5•8 years ago
|
||
Athena is an Amazon hosted version of Presto + Hive, it's got some limitations but generally is faster and more stable than the Presto data source.
Flags: needinfo?(bimsland)
Comment 6•8 years ago
|
||
Athena uses the exact same underlying data as our Presto cluster. You can use it by switching the Data Source drop-down from "Presto" to "Athena".
You also need to refer to the table as "telemetry.longitudinal", but otherwise it should be the same.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
| Assignee | ||
Updated•2 years ago
|
Component: Redash (STMO) → General
You need to log in
before you can comment on or make changes to this bug.
Description
•