Bug 1504863 (Closed): Opened 6 years ago, Closed 5 years ago

Find a meaningful presentation for the CHECKERBOARD probe family

Categories

(Data Science :: Investigation, task, P1)

Type: task
Points: 2

Tracking

Status: RESOLVED FIXED
data-science-status: Evaluation & interpretation

People

(Reporter: tdsmith, Assigned: tdsmith)

Details

Brief description of the request:

The distribution of CHECKERBOARD_SEVERITY is strongly bimodal, well modeled as a mixture of two widely separated log-normal components.

Because these populations are widely separated and the proportion of pings we receive from users in each group varies stochastically over time as builds roll out, summary metrics are very unstable for small n.

Further, because the mild and severe peaks are so widely separated, an arithmetic mean of the per-user experience amounts to asking whether a user has ever experienced a severe checkerboarding event, which is not the intent.
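To make that failure mode concrete, here is a minimal sketch with made-up parameters (the real mild and severe modes are only described as widely separated log-normals, so the numbers below are assumptions): even a small severe component pulls the arithmetic mean orders of magnitude above the typical experience.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters only; the real mild/severe modes are just
# described as widely separated log-normal components.
mild = rng.lognormal(mean=3.0, sigma=0.8, size=9_000)
severe = rng.lognormal(mean=10.0, sigma=1.0, size=1_000)
severity = np.concatenate([mild, severe])

# The mean is dominated by the rare severe component, so it mostly
# answers "did this sample include a severe event?" rather than
# describing the typical experience.
print(f"median severity: {np.median(severity):,.0f}")
print(f"mean severity:   {severity.mean():,.0f}")
```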

Because checkerboarding events degrade the user experience, we would like to propose a visualization that accurately reflects WebRender's impact on that experience.

Link to any assets:

WebRender dashboard review: https://bugzilla.mozilla.org/show_bug.cgi?id=1501470
Points: --- → 2
Priority: -- → P3
data-science-status: --- → Modeling
Priority: P3 → P2

Some more iteration in https://dbc-caf9527b-e073.cloud.databricks.com/#notebook/86542/command/86586.

I think it makes sense to essentially treat these like crash rates, since they are individually rare-ish (most user-days have zero events, and we only have one or two days of activity per user for each nightly build) and depend on active use of the browser (so active_ticks models exposure to the risk).
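As a sketch of the crash-rate analogy (the frame, column names, and values below are made up, not the real telemetry schema; active_ticks counts 5-second intervals of user activity and plays the role that usage hours play for crash rates):

```python
import pandas as pd

# Hypothetical per-user-day rows; column names are assumptions.
df = pd.DataFrame({
    "build_id":     ["20190101", "20190101", "20190102", "20190102"],
    "events":       [0, 3, 1, 0],           # checkerboarding events that day
    "active_ticks": [500, 1200, 800, 300],  # exposure term
})

# Pool events and exposure per build, then take the ratio; rare
# per-user counts aggregate into a stable population rate.
per_build = df.groupby("build_id").agg(
    events=("events", "sum"),
    ticks=("active_ticks", "sum"),
)
per_build["rate"] = per_build["events"] / per_build["ticks"]
print(per_build)
```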

Discarding the least severe events (severity < 500) and plotting the population event-count/active_ticks ratio (after truncating each quantity at its 99th percentile per user-day) looks stable over time, and comparable for WR vs. control.
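A sketch of that filtering and truncation step, again with illustrative frames and column names standing in for the real tables:

```python
import pandas as pd

# Illustrative stand-ins for the per-event and per-user-day tables.
events = pd.DataFrame({
    "client_day": ["a", "a", "b", "c", "c", "c"],
    "severity":   [120, 900, 2500, 80, 40_000, 600],
})
exposure = pd.DataFrame(
    {"active_ticks": [1000, 400, 2500]},
    index=pd.Index(["a", "b", "c"], name="client_day"),
)

# Drop the least severe events, then count the rest per user-day.
counts = (
    events[events["severity"] >= 500]
    .groupby("client_day")
    .size()
    .reindex(exposure.index, fill_value=0)
)

# Truncate counts and exposure at their 99th percentiles so a handful
# of extreme user-days cannot dominate the ratio.
counts = counts.clip(upper=counts.quantile(0.99))
ticks = exposure["active_ticks"].clip(upper=exposure["active_ticks"].quantile(0.99))

# Population rate: truncated events per active tick.
print(counts.sum() / ticks.sum())
```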

Plotting active_ticks-scaled "badness" (the sum of log10(severity)) over the population shows identical-looking trends. It would be nice to capture that some events are worse than others, but then we lose the ability to treat the metric as a Poisson process.
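And a sketch of the badness variant under the same assumed schema; the result is a sum of logs rather than an event count, which is why the Poisson treatment no longer applies:

```python
import numpy as np
import pandas as pd

# Same assumed schema as the previous sketch.
events = pd.DataFrame({
    "client_day": ["a", "a", "b"],
    "severity":   [900, 2500, 40_000],
})
active_ticks = pd.Series([1000, 2500], index=["a", "b"])

# Weight each event by log10(severity) so worse events count for more,
# then scale the population total by exposure. This is no longer a
# count, so Poisson-rate machinery no longer fits.
badness = np.log10(events["severity"]).groupby(events["client_day"]).sum()
print(badness.sum() / active_ticks.sum())
```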

Next step is to add this to the WebRender dashboard.

data-science-status: Modeling → Evaluation & interpretation
Priority: P2 → P1

Added this to the 67 release monitoring dashboard. This should go on the continuous monitoring dashboard as well.

Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED