1474484 - [Shield] WebRender V1 Experiment

Reporter

Description

•

7 years ago

Basic description of experiment:
Enable WebRender by default on qualified hardware by setting the gfx.webrender.all.qualified pref to true for the test population.

What are the branches of the study?
Half the experiment population would have WebRender enabled, and half would not. No third branch.

What percentage of users do you want in each branch?
50/50 is fine.

What Channels and locales do you intend to ship to?
Nightly only.

What is your intended go live date and how long will the study run?
July 16, 2 weeks

Are there specific criteria for participants?
Windows 10, Nvidia GPU only.

What is the main effect you are looking for and what data will you use to make these decisions?
- No more than a 5% increase in overall crash reports
- No more than a 5% increase in OOM crash reports
- No more than a 5% increase in shutdown crashes

Telemetry probes:
- CANVAS_WEBGL_SUCCESS - no more than 5% regression in "True" value
- DEVICE_RESET_REASON - no more than 5% regression in number of submissions
- CHECKERBOARD_DURATION - no more than 5% regression in distribution
- CHECKERBOARD_PEAK - no more than 5% regression in distribution
- CHECKERBOARD_SEVERITY - no more than 5% regression in distribution
- CONTENT_LARGE_PAINT_PHASE_WEIGHT - no more than 5% regression in number of submissions
- CONTENT_PAINT_TIME - no more than 5% regression in distribution
- FX_PAGE_LOAD_MS - no more than 5% regression in distribution
- FX_TAB_CLICK_MS - no more than 5% regression in distribution
- COMPOSITE_TIME - no more than 10% regression in distribution
- CONTENT_FRAME_TIME - no more than 10% regression in distribution
- COMPOSITE_FRAME_ROUNDTRIP_TIME - expect to see an improvement here

Who is the owner of the data analysis for this study?
David Bolter, Tim Smith

Who will have access to the data?
David Bolter, Thomas Elin, Kartikaya Gupta, William (Chris) Beard, Tim Smith

Do you plan on surveying users at the end of the study?
No

User facing title of the experiment:
WebRender

User facing description of the experiment:
New generation graphics rendering engine

Link to any relevant google docs / Drive files that describe the project.
[PHD] https://docs.google.com/document/d/1mo76Ub0l5cNIII0oKqoVn25Dop4nLKKf0CVhvK4_Wps/edit
[Release Criteria] https://docs.google.com/document/d/1zs5b-hAXnIxvl_acGUjibSSeT4ftI_3TIqAoePgCtBI
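The probe thresholds above could be checked mechanically along the following lines. This is only a sketch, not the study's analysis code: it assumes each probe has already been reduced to one scalar summary per branch (for example a median), which the description does not specify, and the `Criterion`, `relative_regression` and `passes` names are hypothetical. Only a few of the listed probes are shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    max_regression: float         # allowed relative regression, e.g. 0.05 for 5%
    higher_is_worse: bool = True  # False for probes where a drop is the regression


# Thresholds copied from the description; the scalar-summary reduction is an assumption.
CRITERIA = {
    "CANVAS_WEBGL_SUCCESS": Criterion(0.05, higher_is_worse=False),
    "CONTENT_PAINT_TIME": Criterion(0.05),
    "COMPOSITE_TIME": Criterion(0.10),
    "CONTENT_FRAME_TIME": Criterion(0.10),
}


def relative_regression(control: float, treatment: float, higher_is_worse: bool = True) -> float:
    """Return the relative change treatment vs control, signed so that positive means worse."""
    if control == 0:
        raise ValueError("control summary must be non-zero")
    change = (treatment - control) / control
    return change if higher_is_worse else -change


def passes(probe: str, control: float, treatment: float) -> bool:
    """True if the treatment branch stays within the allowed regression for this probe."""
    criterion = CRITERIA[probe]
    regression = relative_regression(control, treatment, criterion.higher_is_worse)
    return regression <= criterion.max_regression
```

For instance, `passes("COMPOSITE_TIME", 10.0, 12.0)` is false under this sketch, because a 20% increase exceeds the 10% allowance.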

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Updated

•

7 years ago

Blocks: webrender

Andreea Cupsa [:acupsa], Experiments QA

Updated

•

7 years ago

Depends on: 1474583

Andreea Cupsa [:acupsa], Experiments QA

Updated

•

7 years ago

Depends on: 1474294

Andreea Cupsa [:acupsa], Experiments QA

Updated

•

7 years ago

Depends on: 1474595

MarniePW [:marnie]

Updated

•

7 years ago

Summary: WebRender V1 Experiment → [Shield] WebRender V1 Experiment

Ilana

Comment 1

•

7 years ago

Science review: R+

MarniePW [:marnie]

Comment 2

•

7 years ago

Dave, can we get your R+ for the peer review? Thanks.

Flags: needinfo?(dtownsend)

Dave Townsend [:mossop]

Comment 3

•

7 years ago

Peer review: R+

Flags: needinfo?(dtownsend)

Carmen Fat [:cfat] - Ecosystem QA

Comment 4

•

7 years ago

Sign Off for WebRender - (YELLOW)
WebRender Targeted: Firefox Nightly 63.0a1

We have finished testing the WebRender experiment. We have found the following issues:
- Bug 1474583 - [WebRender Shield Study] Higher CPU usage with WebRender enabled on YouTube
- Bug 1474595 - [WebRender shield study] FPS drop with WebRender enabled on webgl demo websites
- Bug 1474294 - [WebRender Shield Study] Specific images entirely coded in HTML & CSS are not correctly rendered with WebRender enabled

QA’s recommendation: YELLOW - SHIP IT, CONDITIONALLY

Reasoning:
- We tested the try build and there is a slight improvement, but overall the results remained the same: enabling WebRender increases the FPS for some of the websites tested but decreases it for others (Bug 1474595, P1).
- Even though CPU usage and battery life are not a priority for V1 (Bug 1474583, also P1), there is still the concern that this could have a negative impact on users.

Testing summary:
- Full Functional test suite: TestRail (https://goo.gl/EpbvZb);
- Verified that the Telemetry probes are correctly sent;
- Tested loading time on Alexa’s topsites, CPU usage, FPS measurements and benchmark with Motion Mark: Testing results (https://goo.gl/2PwRxz).

Tested Platforms:
- Windows 10 x64

Tested Firefox versions:
- Firefox Nightly 63.0a1

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Comment 5

•

7 years ago

Thanks Carmen! From the developer side we still want to ship the experiment. Enabling the experiment will give us more data as to whether the FPS drop is a widespread issue (i.e. affects many users/sites) or restricted to a subset of hardware or pages. Based on QA testing it seems to affect webgl "demo" sites. This class of websites is only going to be a small fraction of the websites visited by users, and so doesn't need to block the experiment.

Darkspirit

Comment 6

•

7 years ago

(In reply to Carmen Fat [:carmenf] - Experiments QA from comment #4)
> Motion Mark: Testing results (https://goo.gl/2PwRxz).

Your Motion Mark numbers look wrong.
Look how awesome it is:
https://docs.google.com/spreadsheets/d/e/2PACX-1vQolBzSivIh_pZlciaAmZECPjoo5O3T_O0esg2bMF0mhgbKDFFyO-h-ueeR3cl4PLYCpvRjKIXHGrUb/pubhtml

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Comment 7

•

7 years ago

(In reply to Jan Andre Ikenmeyer [:darkspirit] from comment #6)
> (In reply to Carmen Fat [:carmenf] - Experiments QA from comment #4)
> > Motion Mark: Testing results (https://goo.gl/2PwRxz).
>
> Your Motion Mark numbers look wrong.
> Look how awesome it is:
> https://docs.google.com/spreadsheets/d/e/2PACX-1vQolBzSivIh_pZlciaAmZECPjoo5O3T_O0esg2bMF0mhgbKDFFyO-h-ueeR3cl4PLYCpvRjKIXHGrUb/pubhtml

It depends on the hardware being used, as well as the prefs. The sheet you linked to has prefs set to disable the performance.now mitigations and to turn on ASAP mode, which produces much better results but is also a non-default configuration.

MarniePW [:marnie]

Comment 8

•

7 years ago

Andreas, can you R+ this experiment from Product's perspective and confirm that you understand the potential risks?

Flags: needinfo?(abovens)

Andreas Bovens [:abovens]

Comment 9

•

7 years ago

Product: R+

Flags: needinfo?(abovens)

Robert (rrayborn, he/him)

Comment 10

•

7 years ago

We're live after resolving some confusion around GPU targeting. The final targeting is:
* Nightly 63+
* Windows 10
* Has an NVidia GPU (can't rely on isActive since that changes dynamically AFAIK; vendorID = '0x10de')
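For readers who want the targeting rule spelled out, here is a rough sketch of the three conditions as a predicate. It is not the recipe's actual filter expression. The `client` dictionary layout and its field names (`channel`, `major_version`, `os`, `gpu_adapters`) are hypothetical; only the vendorID value `0x10de` and the choice to ignore `isActive` come from the comment above.

```python
NVIDIA_VENDOR_ID = "0x10de"  # value quoted in the targeting comment


def has_nvidia_adapter(adapters):
    """True if any listed adapter reports the NVidia vendor ID.

    isActive is deliberately not consulted, since the active adapter can
    change dynamically on dual-GPU machines.
    """
    return any(
        str(adapter.get("vendorID", "")).lower() == NVIDIA_VENDOR_ID
        for adapter in adapters
    )


def is_eligible(client):
    """Check the three targeting conditions against a hypothetical client record."""
    return (
        client.get("channel") == "nightly"
        and client.get("major_version", 0) >= 63
        and client.get("os") == "Windows 10"
        and has_nvidia_adapter(client.get("gpu_adapters", []))
    )
```

Under this rule a laptop whose active adapter is an integrated GPU still qualifies as long as an NVidia adapter is listed, which is why results from dual-GPU machines are hard to attribute.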

Liz Henry (:lizzard) (relman/hg->git project)

Updated

•

7 years ago

status-firefox63: --- → affected

Chris Peterson [:cpeterson]

Updated

•

7 years ago

Depends on: 1477156

Andreas Bovens [:abovens]

Comment 11

•

7 years ago

For those cases where there are two GPUs, is there any way to know (from the results) if the NVidia GPU is being used with Firefox or not?

Darkspirit

Updated

•

7 years ago

Depends on: 1477380

Darkspirit

Comment 12

•

7 years ago

(Thomas Elin [:relaas] from comment #0)
> User facing title of the experiment: WebRender
> User facing description of the experiment: New generation graphics rendering engine

Depends on: 1447499

Robert (rrayborn, he/him)

Comment 13

•

7 years ago

We've fixed our recipe issue for the latest Nightly and relaunched this. It will only target the most recent Nightly and newer, so our fulfillment will be lower.

Regarding the analysis question in comment 11, I don't know the answer. Unfortunately dynamic GPU switching makes things very hard to analyze in an unambiguous way. I am not an expert here though.

Thanks, all.

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Updated

•

7 years ago

Depends on: 1480242

Matt Grimes [:Matt_G]

Comment 14

•

7 years ago

Per Thomas I've ended this study. We can close this bug after Tim has a chance to finish his analysis.

Tim Smith (inactive) 👨‍🔬 [:tdsmith]

Comment 15

•

7 years ago

Thanks, Matt. Here's a first look [1] at the distributions of per-user averages for the probes mentioned in the PHD; apologies for the lack of polish.

Fewer users submitted qualifying telemetry to the treatment arm vs the control arm for reasons I think are unclear [2]. Sample size was not a concern for powering the comparisons we wanted to make, although if the factors that led to fewer users landing in the treatment arm were associated with some biasing factor (hardware age? etc), that could severely distort the results.

Many of the probes showed improvements but there appears to have been a marked regression in COMPOSITE_TIME, and the raw fraction of users experiencing a crash looks higher in the treatment branch (unadjusted for activity).

Some remaining work includes comparing activity metrics between the branches and packaging the report for presentation; this may drag since I'll be at onboarding next week.

[1] https://dbc-caf9527b-e073.cloud.databricks.com/#notebook/26331/command/26332
[2] https://bugzilla.mozilla.org/show_bug.cgi?id=1480242

Tim Smith (inactive) 👨‍🔬 [:tdsmith]

Comment 16

•

7 years ago

Apologies; this is a nicer view that hides the code by default: https://dbc-caf9527b-e073.cloud.databricks.com/#notebook/26331/resultsOnly

Tim Smith (inactive) 👨‍🔬 [:tdsmith]

Comment 17

•

7 years ago

Here's a look at some activity metrics between the branches; usage hours (measured by the activeTicks simpleMeasurement) were ~20% lower and # URIs visited was 11% lower in the treatment branch after filtering down to the users who actually received WebRender:
https://dbc-caf9527b-e073.cloud.databricks.com/#notebook/26915/resultsOnly

Since usage was lower in the treatment branch, the study may have underestimated the fraction of users who would experience a crash and the number of crashes per user with WebRender enabled (vs the case where usage was the same between the branches). Both of those metrics were already higher in the WebRender branch vs the control branch.

I'll close out the bug here; please let me know if I can help with anything else.
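To illustrate why lower usage matters when comparing crash counts, here is a small sketch that normalizes crashes by usage hours. It is not taken from the linked notebooks. The function name is hypothetical, and the numbers in the example are made up purely to show the arithmetic; they are not study results.

```python
def crashes_per_1000_hours(total_crashes, total_usage_hours):
    """Crash rate normalized by activity, so branches with different usage are comparable."""
    if total_usage_hours <= 0:
        raise ValueError("total_usage_hours must be positive")
    return 1000.0 * total_crashes / total_usage_hours


if __name__ == "__main__":
    # Made-up illustration: identical raw crash counts, but 20% fewer usage
    # hours in the treatment branch.
    control_rate = crashes_per_1000_hours(total_crashes=100, total_usage_hours=1000)
    treatment_rate = crashes_per_1000_hours(total_crashes=100, total_usage_hours=800)
    print(f"control:   {control_rate:.1f} crashes per 1000 usage hours")
    print(f"treatment: {treatment_rate:.1f} crashes per 1000 usage hours")
    print(f"ratio:     {treatment_rate / control_rate:.2f}")
```

With equal raw counts and 20% fewer hours, the normalized treatment rate comes out 25% higher, so a raw comparison would understate the gap.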

Status: NEW → RESOLVED

Closed: 7 years ago

Resolution: --- → FIXED