[Shield] Pref Flip Study: WebRender

NEW
Assigned to

Status

enhancement
11 months ago
20 days ago

People

(Reporter: jrmuizel, Assigned: jrmuizel)

Tracking

(Depends on 1 bug, Blocks 1 bug)

unspecified

Firefox Tracking Flags

(firefox64+ fixed, firefox65+ fixed, firefox66+ fixed)

Details

We would like to re-enable the study from https://bugzilla.mozilla.org/show_bug.cgi?id=1474484#c0

We would like an 85%/15% split:
- 85% gfx.webrender.qualified.enabled = true
- 15% gfx.webrender.qualified.enabled = false

This will allow us to track the evolution of telemetry metrics that we expect to move over time and help us ensure that we're improving things and not regressing.

There's some urgency in enabling this, and given that it replicates the previous study, I'm hoping that the previous approvals still apply.
Jeff, is everything the same as the previous study? Looks like you had a 50/50 split last time? And I'm assuming you'd still want Nightly, albeit 64?
Flags: needinfo?(jmuizelaar)
Yes. We had a 50/50 split last time. Other than the split, everything should be the same. And yes, we only want to target Nightly.
Flags: needinfo?(jmuizelaar)
[Tracking Requested - why for this release]:
Summary: [Shield] Pref Flip Study: WebRender → [Shield] Pref Flip Study: WebRender, Nightly 64
OK, if that's the case you'll need to file a Shield PI request and it will need to go through QA again, per Matt.
The pref has already been flipped on in Nightly, so the net effect of turning on this study is just to disable it for 15% of the users. Since we're only targeting Nvidia users, the non-WebRender path is still being tested, so this has a very low risk of introducing any additional problems that would be caught by QA. Further, QA is already doing WebRender on/off testing and no major problems have come up.

If possible, I'd like to avoid the delay of going through QA again.
Flags: needinfo?(mpasciutowood)
(In reply to Jeff Muizelaar [:jrmuizel] from comment #0)
Two questions:
> - 85% gfx.webrender.qualified.enabled = true
> - 15% gfx.webrender.qualified.enabled = false

Will new prefs be introduced or did you mean the following?

85% gfx.webrender.all.qualified;true (already the default since bug 1490742)
15% gfx.webrender.all.qualified;false
- Restart needed

Last time the recipe was configured like this:
>  normandy.channel == 'nightly'
>  && normandy.version >= '63' 
>  && normandy.telemetry.main.environment.build.buildId >= '20180725103029'
>  && normandy.os.windowsVersion == 10.0 
>  && normandy.telemetry.main.environment.system.gfx.adapters[.vendorID == '0x10de'][0]

Would it be possible / make sense to have it like this?
>  normandy.channel == 'nightly'
>  && normandy.version >= '64' 
>  && normandy.os.windowsVersion == 10.0 
>  && normandy.telemetry.main.environment.system.gfx.features.wrQualified.status == 'available'
(In reply to Jan Andre Ikenmeyer [:darkspirit] from comment #6)
> (In reply to Jeff Muizelaar [:jrmuizel] from comment #0)
> Two questions:
> > - 85% gfx.webrender.qualified.enabled = true
> > - 15% gfx.webrender.qualified.enabled = false
> 
> Will new prefs be introduced or did you mean the following?
> 
> 85% gfx.webrender.all.qualified;true (already the default since bug 1490742)
> 15% gfx.webrender.all.qualified;false
> - Restart needed

No new prefs. I forgot the ".all" part that you noticed. 

> 
> Last time the recipe was configured like this:
> >  normandy.channel == 'nightly'
> >  && normandy.version >= '63' 
> >  && normandy.telemetry.main.environment.build.buildId >= '20180725103029'
> >  && normandy.os.windowsVersion == 10.0 
> >  && normandy.telemetry.main.environment.system.gfx.adapters[.vendorID == '0x10de'][0]
> 
> Would it be possible / make sense to have it like this?
> >  normandy.channel == 'nightly'
> >  && normandy.version >= '64' 
> >  && normandy.os.windowsVersion == 10.0 
> >  && normandy.telemetry.main.environment.system.gfx.features.wrQualified.status == 'available'

That makes sense to me.
Matt, can you please weigh in on this? I know you wanted QA again, but please review the above.
Flags: needinfo?(mpasciutowood) → needinfo?(mgrimes)
I think this is fine. It's nightly and we've had this on by default already. As long as Krupa agrees we're good.
Flags: needinfo?(mgrimes) → needinfo?(kraj)
Since this is just for Nightly, I'm fine with the change in the split without going through additional QA.
Flags: needinfo?(kraj)
Jeff, sounds like you're good to go. Will you please still send the Intent to Ship email with the relevant info and dates you'd like to start/stop?
Flags: needinfo?(jmuizelaar)
Done.
Flags: needinfo?(jmuizelaar)
SGTM too for what it's worth.  Thanks Jeff.
Updated per Mythmon's recommendations in the Intent to Ship email:

> What percentage of users do you want in each branch? 85/15.

I read this as 85% of users serving as a tagged control group, with the feature left on, and 15% of users being put into an experimental group with the feature turned off. May I suggest changing this sampling to instead put 15% of users in the control group, and 15% of users in the experimental group? That would then leave 70% of users untagged.

I expect this will make it easier to look at the data later (so we can do direct 1:1 comparisons instead of having to divide by population size). It also reduces the number of users we tag as in an experiment, which is beneficial since the telemetry packets of every tagged user are duplicated on the servers.

If there is a reason to have a large control group, that is also fine. I think this suggestion more accurately reflects the actual goals of the study though.

-Michael Cooper


Jeff Muizelaar <jmuizelaar@mozilla.com> via mozilla.org, 9:18 AM, to Michael, release-drivers:
Yes. This sounds like a good approach to me.
Sorry, we were going to launch just now, but I'm uncertain enough on details to delay the launch (I'll get it approved for a one click launch this weekend if the ambiguities are addressed).

My understanding of the final design is:
* gfx.webrender.all.qualified as the test preference.  This is not explicitly mentioned in this bug, but is clearer in the Intent to Ship.  That said, I need to make sure we're on the same page before moving forward. <- THE BIG ONE
* 50/50 split true/false for 30% of population (since there's no reason to overpower the on/control group)
* I'm changing the branch names to enabled/disabled since variant can imply toggling and I don't want ambiguities there.
Flags: needinfo?(jmuizelaar)
Yes. gfx.webrender.all.qualified is the pref we want to flip.
Yes, a 50/50 split true/false for 30% of the population.
I'm not sure what you mean by branch names, but this sounds ok.
Flags: needinfo?(jmuizelaar)
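The agreed design (30% of the eligible population enrolled, split 50/50 into enabled/disabled) leaves roughly 15% of users in each branch and 70% untagged. A minimal Python sketch of that kind of deterministic sampling follows. It is an illustration only and is not claimed to be Normandy's actual sampling code: the hash construction, the function names `stable_fraction`, `assign_branch` and `branch_shares`, and the `example-recipe` id are all hypothetical.

```python
import hashlib


def stable_fraction(*parts: str) -> float:
    """Map the inputs to a reproducible number in [0, 1).

    Assumption: SHA-256 over the joined inputs is used purely for
    illustration; the real sampling scheme may differ.
    """
    digest = hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()
    # Use the first 12 hex digits (48 bits) as the numerator.
    return int(digest[:12], 16) / 16**12


def assign_branch(user_id: str, recipe_id: str, enrollment: float = 0.30):
    """Return 'enabled', 'disabled', or None for users left untagged."""
    if stable_fraction(user_id, recipe_id) >= enrollment:
        return None  # outside the enrolled sample
    # Second, separately keyed draw for an even split between branches.
    draw = stable_fraction(user_id, recipe_id, "branch")
    return "enabled" if draw < 0.5 else "disabled"


def branch_shares(n_users: int = 20000, recipe_id: str = "example-recipe"):
    """Fraction of simulated users landing in each branch (None = untagged)."""
    counts = {"enabled": 0, "disabled": 0, None: 0}
    for i in range(n_users):
        counts[assign_branch(f"user-{i}", recipe_id)] += 1
    return {branch: count / n_users for branch, count in counts.items()}
```

Because the assignment depends only on the user and recipe identifiers, the same client always lands in the same branch, and equal-sized branches can be compared directly without rescaling by population size.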
Blocks: 1484365
This study is now live, per rrayborn.
We're seeing a problem where the pref is not being flipped for about 25% of the people in the study. Is it possible for us to increase the percentage of users the study applies to, in order to compensate for this?
Depends on: 1495152
For the time being Mythmon is investigating the enrollment issue.

I'll go ahead and bump the sample in the meantime (within an hour) since we need the sample regardless.
(In reply to Rob from comment #19)
> For the time being Mythmon is investigating the enrollment issue.
> 
> I'll go ahead and bump the sample in the mean time (within an hour) since we
> need the sample regardless.

Did this bump happen and what percentage is it at?
Flags: needinfo?(rrayborn)
As far as I can tell the recipe is applied to 45% of the population, with an even split between enabled/disabled.

https://normandy.cdn.mozilla.net/api/v1/recipe/587/
Flags: needinfo?(rrayborn)
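To make the compensation arithmetic concrete, here is a small Python sketch using the figures from this thread (30% original enrollment, about 25% of enrolled users not getting the pref flipped, 45% after the bump). It assumes the failure rate is uniform across both branches and does not change as enrollment grows, which the thread does not confirm; the function names are hypothetical.

```python
def effective_share(enrollment: float, flip_failure_rate: float) -> float:
    """Share of the eligible population whose pref is actually flipped.

    Assumption: failures are spread evenly over enrolled users.
    """
    return enrollment * (1.0 - flip_failure_rate)


def enrollment_needed(target_effective: float, flip_failure_rate: float) -> float:
    """Enrollment required to reach a target effective share."""
    return target_effective / (1.0 - flip_failure_rate)


def per_branch(enrollment: float, n_branches: int = 2) -> float:
    """Share of the eligible population in each equally sized branch."""
    return enrollment / n_branches
```

Under these assumptions, 30% enrollment yields about 22.5% effective coverage, 40% enrollment would restore the intended 30%, and the 45% recipe gives about 33.75% effective coverage, or 22.5% of the eligible population nominally in each branch.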
Marnie, we'd like to have this shield study also apply to Beta 64. What's needed to make that happen?
Flags: needinfo?(mpasciutowood)
I'm hesitant to respond and give you the wrong answer, so let me pull in Matt.
Flags: needinfo?(mpasciutowood) → needinfo?(mgrimes)
FWIW as release manager I support taking this on beta, to keep WR disabled for a subset of the qualified users.
Email was sent to rel-drivers earlier today to request the following changes:

V1 - Nightly 64 with 45% enrollment of Windows 10 Nvidia GPU with
50/50 control to treatment ratio in all locales - Complete

V2 - Beta 64/Nightly 65 with 45% enrollment of Windows 10 Nvidia GPU
with 50/50 control to treatment ratio in all locales


Looks like we already have sign off from Julien so we can make this modification today.
Flags: needinfo?(mgrimes)
This change has been made and is now live.
Decision has been made to end the beta 64 experiment on Nov 30. Please let me know if anything else is needed to make this official.
Email from Jeff sent Nov. 27, 2018 requesting changes:

We are requesting approval to move the WebRender experiment on to Beta 65 and
Nightly 66
V1 - Nightly 64 with 45% enrollment of Windows 10 Nvidia GPU with
50/50 control to treatment ratio in all locales - Complete
V2 - Beta 64/Nightly 65 with 45% enrollment of Windows 10 Nvidia GPU
with 50/50 control to treatment ratio in all locales
V3 - Beta 65/Nightly 66 with 45% enrollment of Windows 10 Nvidia GPU
with 50/50 control to treatment ratio in all locales - Awaiting
approval

The bug tracking this study is
https://bugzilla.mozilla.org/show_bug.cgi?id=1492568. The only
difference between v2 and v3 is the changing of the version numbers to
adapt to the new releases.

-Jeff
Approving V3 on behalf of RelMan.
Summary: [Shield] Pref Flip Study: WebRender, Nightly 64 → [Shield] Pref Flip Study: WebRender
Recipe 587 for v2 has been disabled; prefflip-webrender-v1-2-1492568 has ended.

Recipe 651 for v3 has been published; prefflip-webrender-v1-3-1492568 is now live. This recipe is otherwise identical to the previous with the exception that version filtration has been incremented from >=64 to >=65.
(In reply to Josh Gaunt [:jgaunt] from comment #30)
> Recipe 587 for v2 has been disabled; prefflip-webrender-v1-2-1492568 has
> ended.
> 
> Recipe 651 for v3 has been published; prefflip-webrender-v1-3-1492568 is now
> live. This recipe is otherwise identical to the previous with the exception
> that version filtration has been incremented from >=64 to >=65.

Does this still include Nightly 65? I realize there may have been some confusion because of how the study lines up with the branch dates. We don't want to drop Nightly 65 users until after Nightly 66 has actually started.
Flags: needinfo?(jgaunt)
The new filter expression is

> (
>   normandy.channel in ['nightly','beta']
>   && normandy.version >= '65' 
>   && normandy.os.windowsVersion == 10.0 
>   && normandy.telemetry.main.environment.system.gfx.features.wrQualified.status == 'available'
>   && [normandy.userId, normandy.recipe.id]|stableSample(0.45)
> )

so it includes Nightly 65, yes, but we've lost Beta 64 early.
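As a rough illustration of why Nightly 65 still matches while Beta 64 drops out, the non-sampling clauses of the filter can be mirrored in Python. This is a sketch, not the Normandy evaluator: the flat `client` dictionary and its keys are hypothetical stand-ins for the `normandy.*` fields, and parsing the major version as an integer is a simplification of the string comparison in the real expression.

```python
def matches_filter(client: dict, min_version: int = 65) -> bool:
    """Mirror the channel, version, OS and wrQualified clauses of the filter.

    Assumption: `client` is a simplified, hypothetical view of the
    Normandy context; the stableSample clause is deliberately left out.
    """
    major = int(client["version"].split(".")[0])
    return (
        client["channel"] in ("nightly", "beta")
        and major >= min_version
        and client["windows_version"] == 10.0
        and client["wr_qualified_status"] == "available"
    )


# Hypothetical example clients.
nightly_65 = {
    "channel": "nightly",
    "version": "65.0a1",
    "windows_version": 10.0,
    "wr_qualified_status": "available",
}
beta_64 = {
    "channel": "beta",
    "version": "64.0",
    "windows_version": 10.0,
    "wr_qualified_status": "available",
}
```

With the minimum version at 65, `matches_filter(nightly_65)` is true and `matches_filter(beta_64)` is false; with `min_version=64`, as in the previous recipe, both match.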

Could we have done this without touching the deployed recipe? We're aggregating by build_id so it wasn't important to us to stop receiving data from clients that don't update.
> we've lost Beta 64 early

Jeff says this is fine.
Flags: needinfo?(jgaunt)
Calling this done for 64.
Per :jbonisteel, this study should end on March 25, 2019.

Is this running for beta 66 as well?

Flags: needinfo?(jmuizelaar)

That is the intention

Flags: needinfo?(jmuizelaar)

Hello Folks

WebRender V1 will be riding the trains with Fx67 to Release. Because of this I am requesting we make a few changes to the existing experiment expiration parameters.

1.) Extend Experiment for 3 weeks in Fx66 Release (April 9th 2019). WR will be DISABLED by default in Fx66 Release. The ask is that we ENABLE it for a portion? (@Tim) of the population to gather data as we prepare for Release in Fx67.

NOTE: WebRender in Fx66 is very stable (as indicated in previous experiment data). WR V1 is being held until Fx67 Release to provide additional bake time before going to a full (eligible) release population.

2.) Extend Experiment for 3 weeks in Fx67 Release (June 4th 2019). WR will be ENABLED by default in Fx67 Release. The ask is that we DISABLE it for a Portion? (@Tim) of the population to gather data.

3.) Extend current Nightly/Beta Experiments through 2020 (Jan 31 2020). WebRender V1 is only the first step. The Graphics team will immediately begin working on V2 Implementation to support (AMD, Intel, Mobile, etc.). We need this experiment data to inform our progress.

It's important to note that we are not changing the experiment parameters themselves, merely the expiry dates. Hopefully this will not require a full re-submission of the experiment itself.

@Tim - Please review and provide guidance around the population %'s that we should be targeting.

@Josh - Please review the recipe and let me know if you have any questions!

@Liz/Pascal - As the Owners of Release Fx66 and Fx67 (Respectively) please take a look and let us know if there are any concerns from the Relman team.

Flags: needinfo?(tdsmith)
Flags: needinfo?(pascalc)
Flags: needinfo?(lhenry)
Flags: needinfo?(jgaunt)

Thanks telin; acknowledging the request for a power analysis for a population size estimate for an experiment in release 66/67, and tracking in Bug 1526041.

Depends on: 1526041
Flags: needinfo?(tdsmith)

Cancelling needinfos from release and normandy until the request is clear. The needinfo for data science is still valid.

Flags: needinfo?(pascalc)
Flags: needinfo?(lhenry)
Flags: needinfo?(jgaunt)

Calling this done for 65 as well.

Telin is working on 2 in-progress Experimenter requests. They took it out of the bug and into Experimenter to follow the normal new-study set-up path.
https://experimenter.services.mozilla.com/experiments/webrender-performance-66/
https://experimenter.services.mozilla.com/experiments/webrender-performance-67/

The original study is extended until Jan 30th 2020 https://experimenter.services.mozilla.com/experiments/prefflip-webrender-v1-3-1492568/

Can this be closed out now? I think it is superseded by bug 1526060.

Flags: needinfo?(jmuizelaar)

No, these are separate studies that should run in parallel.

Flags: needinfo?(jmuizelaar)

Most existing users in this study unenrolled because of bug 1553198. :mythmon advises that we'll need to deploy a new recipe with a fresh slug in order to get those users to re-enroll -- only the slug needs to change; the filter expression can be identical (though we can bump up the minimum version numbers for general hygiene if we feel like it).

We can end the current experiment (id 651, slug prefflip-webrender-v1-3-1492568) once we replace it with a new one.

:shell, can we do this in the next experiments change window?

ETA: :mythmon reminded me that we will actually have to change the filter expression in order to only enroll users on builds which have the fix, or else they'll just get unenrolled again, so we'll need to set minimum build_id's, probably?

Flags: needinfo?(sescalante)
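A minimum build_id clause like the one in the original recipe (`buildId >= '20180725103029'`) works because build IDs are fixed-width timestamp strings, so plain string comparison orders them chronologically. The Python sketch below illustrates that check. The threshold value is a hypothetical placeholder, since the build ID that actually carries the fix for bug 1553198 is not stated in this bug.

```python
# Hypothetical placeholder: NOT the real build ID of the fix.
HYPOTHETICAL_MIN_BUILD_ID = "20190601000000"


def is_build_id(value: str) -> bool:
    """Build IDs are 14-digit timestamps (YYYYMMDDHHMMSS)."""
    return len(value) == 14 and value.isdigit()


def has_fix(build_id: str, min_build_id: str = HYPOTHETICAL_MIN_BUILD_ID) -> bool:
    """True if the build is at least as new as the minimum build.

    Lexicographic comparison is only safe because both strings are
    fixed-width and purely numeric, so this is validated first.
    """
    if not (is_build_id(build_id) and is_build_id(min_build_id)):
        raise ValueError("build IDs must be 14-digit timestamp strings")
    return build_id >= min_build_id
```

Clients on builds older than the threshold would be excluded by such a clause, so they would not enroll only to be unenrolled again.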

The experiment was relaunched with the slug prefflip-webrender-v1-3-relaunch-1492568 yesterday.

Flags: needinfo?(sescalante)

What percentage of the FF population has WR enabled? Did it reach 4% yet?

Duplicate of this bug: 1569972
Duplicate of this bug: 1569973