Closed
Bug 1298095
Opened 8 years ago
Closed 7 years ago
crash_aggregates' dimensions['channel'] has an awful lot of 'Other'
Categories
(Data Platform and Tools Graveyard :: Datasets: Crash Aggregates, defect, P2)
Tracking
(Not tracked)
RESOLVED DUPLICATE of bug 1352091
People
(Reporter: chutten, Assigned: mdoglio)
Details
Queries (like [1]) looking at channels in crash_aggregates come up with the usual suspects (aurora, beta, nightly, release) and the usual "whatever" (no, default, government). There's an awful lot of 'Other', though. It's the fifth most popular (after the usual suspects) in terms of rows in the dataset. It's the third most popular (only beaten by release and beta) in terms of usage_khours.

Does it mean ESR? If so, why isn't it part of the actual "esr" row? What does it mean? Is it just some clients messing with us?

[1]: https://sql.telemetry.mozilla.org/queries/1093/source
Updated•8 years ago
Points: --- → 1
Priority: -- → P2
Comment 1•8 years ago
Something funky is going on here. I think maybe one of the source datasets (main pings?) is using normalizedChannel (as computed upstream at [1]), while another (crash pings?) is using the raw channel name. I'm not all that familiar with the CrashAggregateView code, but I did see a reference to normalizedChannel at [2]. Clearly something is using the pre-normalized value.

[1]: https://github.com/mozilla-services/data-pipeline/blob/master/hindsight/modules/fx.lua#L23
[2]: https://github.com/mozilla/telemetry-batch-view/blob/master/src/main/scala/com/mozilla/telemetry/views/CrashAggregateView.scala#L80
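To make the hypothesis above concrete, here is a minimal Python sketch of what happens when one source is bucketed by a normalized channel and the other by the raw channel. Everything in it is an assumption for illustration: `normalize_channel`, `aggregate`, the ping field names, and the sample pings are hypothetical and are not taken from fx.lua or CrashAggregateView.

```python
from collections import defaultdict

# Assumed set of channels the normalizer recognizes (hypothetical).
KNOWN_CHANNELS = frozenset({"release", "beta", "aurora", "nightly"})


def normalize_channel(raw):
    """Hypothetical normalizer: unrecognized channels collapse into 'Other'."""
    return raw if raw in KNOWN_CHANNELS else "Other"


def raw_channel(raw):
    """Pass the channel name through untouched."""
    return raw


def aggregate(main_pings, crash_pings, main_channel, crash_channel):
    """Sum usage hours and count crashes per channel bucket.

    main_channel and crash_channel decide which bucket each ping lands in.
    """
    usage_hours = defaultdict(float)
    crashes = defaultdict(int)
    for ping in main_pings:
        usage_hours[main_channel(ping["channel"])] += ping["usage_hours"]
    for ping in crash_pings:
        crashes[crash_channel(ping["channel"])] += 1
    return dict(usage_hours), dict(crashes)


# Made-up sample data, not real telemetry.
main_pings = [
    {"channel": "release", "usage_hours": 3.0},
    {"channel": "esr", "usage_hours": 2.0},
]
crash_pings = [{"channel": "release"}, {"channel": "esr"}]

# Mixed strategies: esr usage lands in 'Other' while esr crashes stay in 'esr'.
mixed = aggregate(main_pings, crash_pings, normalize_channel, raw_channel)

# One strategy for both sources: usage and crashes share the same buckets.
consistent = aggregate(main_pings, crash_pings, normalize_channel, normalize_channel)
```

With mixed strategies, the usage hours and the crash counts for the same clients end up under different channel names, so no per-channel rate can be computed for them. That would be one way to get a large 'Other' in usage_khours.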
Updated•8 years ago
Flags: needinfo?(mdoglio)
Assignee
Comment 2•8 years ago
Digging into this now
Assignee: nobody → mdoglio
Flags: needinfo?(mdoglio)
Assignee
Comment 3•8 years ago
It looks like we do indeed have tons of esr, at least in the crash pings. See [1] for details. I don't know if all those `default` are expected, but we should probably assign them to a separate bucket as well. Please let me know if you need me to investigate further.

[1]: https://gist.github.com/maurodoglio/6d271cc5849b6655e47ef86d87a1517f
Reporter
Comment 4•8 years ago
So... normalizedChannel has 5 values? {release, beta, aurora, nightly, Other} Could we add esr to it? esr seems to be a perfectly-valid channel we could reason about.
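A minimal Python sketch of the proposal, assuming the normalizer is a simple allow-list with an 'Other' fallback. The function name, the `known` parameter, and the matching rule are hypothetical and may differ from the real upstream logic.

```python
# Assumed current allow-list; everything else becomes 'Other' (hypothetical).
KNOWN_CHANNELS = frozenset({"release", "beta", "aurora", "nightly"})

# Proposed allow-list with esr promoted to a first-class channel.
KNOWN_CHANNELS_WITH_ESR = KNOWN_CHANNELS | {"esr"}


def normalize_channel(raw, known=KNOWN_CHANNELS):
    """Return raw if it is a recognized channel name, otherwise 'Other'."""
    if not isinstance(raw, str):
        return "Other"
    return raw if raw in known else "Other"


before = normalize_channel("esr")                          # falls into 'Other'
after = normalize_channel("esr", KNOWN_CHANNELS_WITH_ESR)  # keeps 'esr'
```

Under this sketch, values such as `default` would still fall into 'Other'. Only esr gets its own bucket.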
Assignee
Comment 5•8 years ago
My only concern about changing the normalizedChannel configuration is that there may be jobs/re:dash queries not expecting the new value. :mreid do you have an opinion on that?
Flags: needinfo?(mreid)
Comment 6•8 years ago
It seems likely that there are things relying on the current behaviour of normalizedChannel, but I think the absence of ESR is a mistake that we should fix. So +1 on adding ESR, even if it impacts other queries. We should also use the same "channel" strategy for both crashes and main pings in the aggregates.
Flags: needinfo?(mreid)
Updated•8 years ago
Summary: crash_aggrregates' dimensions['channel'] has an awful lot of 'Other' → crash_aggregates' dimensions['channel'] has an awful lot of 'Other'
Updated•7 years ago
Component: Metrics: Pipeline → Datasets: Crash Aggregates
Product: Cloud Services → Data Platform and Tools
Reporter
Updated•7 years ago
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → DUPLICATE
Updated•6 years ago
Product: Data Platform and Tools → Data Platform and Tools Graveyard