
[e10s] e10s-addons experiment stability dashboard review

Status: RESOLVED FIXED
Product: Firefox
Component: Extension Compatibility
Reporter: dzeber
Assignee: chutten
Tracking Flags: e10s+

Description • dzeber (Reporter) • a year ago
Please review the dashboard I created for tracking stability metrics related to the e10s add-ons experiment.

https://sql.telemetry.mozilla.org/dashboard/stability-comparison-for-e10s-add-ons-experiment
tracking-e10s: --- → ?
Comment 1 • chutten (Assignee) • a year ago
* What data/release/etc. questions are these plots supposed to inform?
* How often does crash_aggregates update? (If it updates less often than every 24 hours, your refresh cycle can be correspondingly slower.)
* Why is "content_crashes / usage_khours AS content_crash_rate" commented out?
* The query looks solid and is easy to read. It could benefit from more whitespace (especially towards the bottom), but I can still grok it.
* The plots are clear, but you could relabel the chart lines (under "SERIES" in the Visualization Editor you can write your own labels) to something like "e10s" and "not e10s" instead of the obscure cohort labels. The chart titles already make it adequately clear that we're not looking at _all_ e10s/non-e10s profiles.

r+ with my questions answered

Updated • a year ago
tracking-e10s: ? → +
Comment 2 • dzeber (Reporter) • a year ago
(In reply to Chris H-C :chutten from comment #1)
> * What data/release/etc. questions are these plots supposed to inform?

I think the general question is: "What is the effect on stability (crash rates) of enabling e10s for profiles on Beta 49 that have (well-behaved) add-ons?"

> * How often does crash_aggregates update? (If it's slower than 24hrs, then
> so too can your refresh cycle be)

The crash_aggregates dataset is also updated daily.

> * Why is "content_crashes / usage_khours AS content_crash_rate" commented
> out?

The content_crashes count is the combination of content process crashes and crashes on shutdown. Rather than keep the combined rate, I split them out into the two columns after that one:

(content_crashes - content_shutdown_crashes) / usage_khours AS content_crash_rate_noshutdown,
content_shutdown_crashes / usage_khours AS content_shutdown_crash_rate,

I've removed the content_crash_rate column.
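
For context, the relevant part of the SELECT now looks roughly like this. This is a sketch only: the rate columns are the ones quoted above, but the source name cohort_counts and the surrounding shape are placeholders for however the real query rolls up crash_aggregates per cohort and day.

    SELECT
        activity_date,
        cohort,
        (content_crashes - content_shutdown_crashes) / usage_khours
            AS content_crash_rate_noshutdown,
        content_shutdown_crashes / usage_khours
            AS content_shutdown_crash_rate
    FROM cohort_counts  -- placeholder for the per-cohort/per-day rollup of crash_aggregates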

> * The plots are clear, but you could relabel the chart lines (under "SERIES"
> in the Visualization Editor you can write your own labels) to something like
> "e10s" and "not e10s" instead of those obscure cohort labels. The chart
> titles quite adequately explain that we're not looking at _all_ e10s/not
> e10s.

Done.

I've made some further updates to the dashboard:
- Added a description of the cohorts and the metrics
- Added series for the e10s/non-e10s cohorts that don't have add-ons
- Converted the "crash rates by build" graphs to line graphs (since we have enough builds now)
- Made all crash rate graphs full width
- Added cohort stats graphs for the no-add-ons cohorts
- Changed the cutoff for the most recent date to "2 days before today". This was previously "1 day before today", but incomplete data was causing the most recent crash rates to shoot up. (See the filter sketch after this list.)
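
In SQL terms, the cutoff amounts to a filter like the following. Again a sketch, assuming the query runs on Presto and that activity_date is the date column being filtered; the real query may express it differently.

    -- exclude the two most recent days, whose usage hours are still
    -- incomplete and would inflate the crash rates
    WHERE activity_date <= date_add('day', -2, current_date)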
Comment 3 • chutten (Assignee) • a year ago
> - Changed the cutoff for most recent date to "2 days before today". This was previously "1 day before today", but incomplete data was causing the most recent crash rates to shoot up.

No matter what cutoff you choose, there will almost always be users lagging in reporting their usage hours. ddurst's team (including me) will be looking into improving that in interesting ways in the near future.

r+
Status: NEW → RESOLVED
Last Resolved: a year ago
Resolution: --- → FIXED