[e10s][telemetry] slow-script dialog appears twice as often in e10s vs non-e10s

Status: RESOLVED WORKSFORME
Product: Firefox
Component: General
Version: 41 Branch
Reporter: vladan
Assignee: Unassigned
Tracking flags: e10s+
Blocks: 1 bug
Whiteboard: [aurora experiment][mixed addons]
Reported 2 years ago, last modified a year ago

(Reporter)

Description

2 years ago
According to Telemetry, the frequency of the slow-script dialog appearing roughly doubled with e10s:
SLOW_SCRIPT_NOTICE_COUNT in http://nbviewer.ipython.org/urls/gist.githubusercontent.com/vitillo/cb6f1304316c1c1a2cbc/raw/e10s%20analysis.ipynb

^ Based on Telemetry collected from Nightly 41 on June 15th, from buildIDs in the range [20150601, 20150616]

Full e10s performance write-up: https://groups.google.com/d/msg/mozilla.dev.platform/elb0Au7qwG0/Yarsx1xrrnwJ

Updated

2 years ago
tracking-e10s: --- → ?

Comment 1

2 years ago
Similar theory here as in bug 1182637, where add-ons overusing CPOWs could be the cause. Any data you can provide on that would be useful.
Flags: needinfo?(vdjeric)
(Reporter)

Comment 2

2 years ago
Redirecting needinfo to rvitillo again
Flags: needinfo?(vdjeric) → needinfo?(rvitillo)
Flags: needinfo?(rvitillo)

Comment 3

2 years ago
Roberto, we still need this information. Why did you cancel the NI?
Flags: needinfo?(rvitillo)
(Reporter)

Comment 4

2 years ago
(In reply to Brad Lassey [:blassey] (use needinfo?) from comment #3)
> Roberto, we still need this information. Why did you cancel the NI?

Roberto commented on the other bug (bug 1182637):

> We don't have enough users not using e10s without add-ons to make any strong claims.
> Soon enough we are probably not going to have enough users to compare e10s and non e10s builds as well.
> We should make sure that at least one quarter of Nightly users still runs non-e10s builds.

Essentially, we can't compare baseline e10s vs non-e10s (users without extensions) on Nightly simply because the non-e10s pool on Nightly is super-depleted and probably very much unlike the e10s population at this point.

We could do a comparison on Aurora since it's a steady 45% e10s, but Roberto is tied up validating the data from the FHR/Telemetry unification (higher org priority?), so sadly he won't have time to help with this in Q3.
(Reporter)

Comment 5

2 years ago
Brad: why is there a sudden drop-off of E10S users on Aurora according to E10S_AUTOSTART, starting with the July 29th Aurora build?
Flags: needinfo?(blassey.bugs)
(Reporter)

Comment 6

2 years ago
I asked Anthony Zhang to test your hypothesis on bug 1182637. I need to sync up with :jimm on interpreting slow-script-notice count, but either way, we're going to need another server-side analysis person for this e10s comparison in Q3
Flags: needinfo?(azhang)

Comment 7

2 years ago
Sorry about that, Brad; I should have linked back to bug 1182637. Ultimately we should aim to run an A/B test where we randomly pick a set of users on Aurora (Beta?) to have e10s enabled (or disabled). I expect some users to revert their settings (and we can track that), but that's the only way to make sure the two populations are not biased. Current Nightly data is not in a usable state, and on Aurora there was a sudden drop-off of e10s users, which will probably make future analyses unreliable or biased.
Flags: needinfo?(rvitillo)

Comment 8

2 years ago
(In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #5)
> Brad: why is there a sudden drop-off of E10S users on Aurora according to
> E10S_AUTOSTART, starting with the July 29th Aurora build?

We landed a bad a11y patch on aurora that triggered the a11y restart prompt for a bunch of users despite not having an a11y client. Those users should come back with 42.
Flags: needinfo?(blassey.bugs)

Comment 9

2 years ago
(In reply to Roberto Agostino Vitillo (:rvitillo) from comment #7)
> Sorry about that Brad, I should have linked back to Bug 1182637. Ultimately
> we should aim to run an A/B test where we randomly pick a set of users on
> Aurora (Beta?) to have e10s enabled (or disabled). I expect some users to
> revert back their settings (and we can track that) but that's the only way
> to make sure the two populations are not biased. Current nightly data is not
> in a usable state and on aurora there was a sudden drop-off of E10S users
> which will probably make future analyses likely to be unreliable/biased.

If you want to A/B on Aurora with random samples, now's your chance with 42 merging! Just target 42 once it's on that channel.
(Reporter)

Comment 10

2 years ago
(In reply to Jim Mathies [:jimm] from comment #9)
> If you want to a/b on aurora with random samples, now's your chance with
> 42 merging! Just target 42 once it's on that channel.

Indeed. So I want to do something like "if clientID % 2 == 0, force e10s pref on, otherwise force it off". Which e10s prefs do I need to flip?
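
A minimal sketch of the "clientID % 2" idea, assuming the client ID is an opaque string (such as a UUID) rather than an integer, so it is hashed before taking the remainder. The function name `ab_branch`, the branch labels, and the experiment name `e10s-aurora-42` are hypothetical and only illustrate the technique; this is not the code that shipped.

```python
import hashlib


def ab_branch(client_id: str, experiment: str = "e10s-aurora-42") -> str:
    """Deterministically map a client ID to one of two experiment branches.

    The experiment name (hypothetical) is mixed into the hash so that
    separate experiments split the same population independently.
    """
    digest = hashlib.sha256(f"{experiment}:{client_id}".encode("utf-8")).digest()
    # Interpret the first 8 bytes as an unsigned integer, then apply the
    # "% 2" rule from the comment: even -> e10s on, odd -> e10s off.
    bucket = int.from_bytes(digest[:8], "big") % 2
    return "e10s-on" if bucket == 0 else "e10s-off"
```

Because the assignment depends only on the ID, a profile stays in the same branch across restarts, and users who later flip the pref by hand can still be identified by comparing their assigned branch with their actual setting.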

Comment 11

2 years ago
(In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #10)
> (In reply to Jim Mathies [:jimm] from comment #9)
> > If you want to a/b on aurora with random samples, now's your chance with
> > 42 merging! Just target 42 once it's on that channel.
> 
> Indeed. So I want to do something like "if clientID % 2 == 0, force e10s
> pref on, otherwise force it off". Which e10s prefs do I need to flip?

Let's file a fresh bug and cc/ni Felipe; he's been handling all of our enabling patch work.

Comment 12

2 years ago
(In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #10)
> (In reply to Jim Mathies [:jimm] from comment #9)
> > If you want to a/b on aurora with random samples, now's your chance with
> > 42 merging! Just target 42 once it's on that channel.
> 
> Indeed. So I want to do something like "if clientID % 2 == 0, force e10s
> pref on, otherwise force it off". Which e10s prefs do I need to flip?

Yeah a fresh bug for this would be nice. To do this you'll need to generate this clientID somehow (or is there already something that you can use?) and do this check in nsAppRunner.cpp mozilla::BrowserTabsRemoteAutostart(). There you can muck with the value of the trialPref boolean to do this A/B testing.

I'd keep the prefs as they are now (i.e., keep browser.tabs.remote.autostart.2 = true), and when it's true, perform the A/B filtering.

It'd be nice to also add a new status entry that says disabledByAB in order to properly see that on telemetry.
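
The decision order suggested above can be modelled roughly as follows. This is a Python sketch of the logic only; the real check would live in C++ in nsAppRunner.cpp. `disabledByAB` is the status name proposed in this comment, while `E10sStatus`, `resolve_e10s`, and the other two status names are hypothetical.

```python
from enum import Enum


class E10sStatus(Enum):
    ENABLED = "enabled"                  # hypothetical name
    DISABLED_BY_PREF = "disabledByPref"  # hypothetical name
    DISABLED_BY_AB = "disabledByAB"      # name proposed in the comment


def resolve_e10s(autostart_pref: bool, in_ab_off_group: bool) -> E10sStatus:
    """Leave the autostart pref untouched and filter only when it is true."""
    if not autostart_pref:
        # The user (or the channel default) already has e10s off.
        return E10sStatus.DISABLED_BY_PREF
    if in_ab_off_group:
        # Pref is true, but the A/B filter forces single-process mode.
        return E10sStatus.DISABLED_BY_AB
    return E10sStatus.ENABLED
```

Reporting a distinct status for the A/B case keeps "turned off by the experiment" separable from "turned off by the user" when the Telemetry data is analysed.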
Flags: needinfo?(vdjeric)

Comment 13

2 years ago
Ref: https://bugzilla.mozilla.org/show_bug.cgi?id=1182637#c6

For Aurora 41 users with buildIDs on July 28th, 2015 who have no extensions installed:

> Median difference in histograms/SLOW_SCRIPT_NOTICE_COUNT per hour is 0.03, (0.56, 0.53).
> The probability of this effect being purely by chance is 0.86.
Flags: needinfo?(vdjeric)
Flags: needinfo?(azhang)

Comment 14

2 years ago
I seem to have cleared the needinfo on :vladan somehow...
Flags: needinfo?(vdjeric)
(Reporter)

Updated

2 years ago
Depends on: 1193089
(Reporter)

Comment 15

2 years ago
(In reply to :Felipe Gomes from comment #12)
> Yeah a fresh bug for this would be nice. 

Filed meta bug 1193089

> To do this you'll need to generate this clientID somehow
> (or is there already something that you can use?) 

Yes, Telemetry already has a clientID tied to a profile

> and do this check in nsAppRunner.cpp mozilla::BrowserTabsRemoteAutostart().
> There you can muck with the value of the trialPref boolean to do this A/B
> testing.
> 
> I'd keep the prefs as they are now (i.e., keep
> browser.tabs.remote.autostart.2 = true), and when it's true, perform the A/B
> filtering.
> 
> It'd be nice to also add a new status entry that says disabledByAB in order
> to properly see that on telemetry.

Good suggestions, thanks
No longer depends on: 1193089
Flags: needinfo?(vdjeric)
(Reporter)

Updated

2 years ago
Depends on: 1193089

Updated

2 years ago
tracking-e10s: ? → +

Comment 16

2 years ago
(In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #0)
> According to Telemetry, the frequency of the slow-script dialog appearing
> roughly doubled with e10s:
> SLOW_SCRIPT_NOTICE_COUNT in
> http://nbviewer.ipython.org/urls/gist.githubusercontent.com/vitillo/
> cb6f1304316c1c1a2cbc/raw/e10s%20analysis.ipynb

Hey Vlad,

This data looks static; I see build dates specified here. Is there some way to re-run this report to get the latest data?
Flags: needinfo?(vladan.bugzilla)
(Reporter)

Comment 17

2 years ago
Roberto & Birunthan re-ran the e10s comparison report on the Aurora A/B experiment population. These are the slow-script findings:

1) For profiles with 0 extensions:

Median difference in payload/histograms/SLOW_SCRIPT_NOTICE_COUNT per hour is 0.09, (e10s=0.66, single-process=0.57). The probability of this effect being purely by chance is 0.47

2) For profiles with at least 1 extension:

Median difference in payload/histograms/SLOW_SCRIPT_NOTICE_COUNT per hour is 0.17, (e10s=0.48, single-process=0.31). The probability of this effect being purely by chance is 0.08

3) For profiles with *only* AdBlock Plus installed:

Median difference in payload/histograms/SLOW_SCRIPT_NOTICE_COUNT per hour is 0.22, (e10s=0.46, single-process=0.23). The probability of this effect being purely by chance is 0.23.

I'll post the analyses and add more conclusions after we're done reviewing the findings.
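
For readers unfamiliar with how a "median difference" and a "probability of this effect being purely by chance" can be derived, here is one common approach, a permutation test, sketched in plain Python. That the linked notebooks use exactly this procedure is an assumption; the function name and parameters are hypothetical.

```python
import random
from statistics import median


def median_diff_permutation_test(e10s, single, n_permutations=2000, seed=0):
    """Return (observed median difference, two-sided permutation p-value).

    e10s and single are per-profile rates, e.g. slow-script notices per hour.
    """
    rng = random.Random(seed)
    observed = median(e10s) - median(single)
    pooled = list(e10s) + list(single)
    n = len(e10s)
    extreme = 0
    for _ in range(n_permutations):
        # Shuffle the group labels: if e10s made no difference, a random
        # split should produce a gap this large reasonably often.
        rng.shuffle(pooled)
        diff = median(pooled[:n]) - median(pooled[n:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (extreme + 1) / (n_permutations + 1)
```

Read this way, a value such as 0.47 means that nearly half of the random relabelings produced a gap at least as large as the observed one, so the observed difference cannot be distinguished from noise.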
Flags: needinfo?(vladan.bugzilla)
(Reporter)

Comment 18

2 years ago
And yes, you can re-run the analysis on Aurora any time you like, via the telemetry-dash.mozilla.org Spark interface (let me know if you want links to docs). We'll also be re-running these analyses for every future experiment we run. The experiment populations are less biased than the current e10s/non-e10s split on Aurora
(Reporter)

Updated

2 years ago
Blocks: 1222849
(Reporter)

Updated

2 years ago
Blocks: 1222894

Updated

2 years ago
Summary: Telemetry: slow-script dialog appears twice as often in e10s vs non-e10s → [e10s][telemetry] slow-script dialog appears twice as often in e10s vs non-e10s

Comment 19

2 years ago
https://github.com/vitillo/e10s_analyses/blob/master/aurora/e10s_experiment.ipynb

35% regression: median difference in payload/histograms/SLOW_SCRIPT_NOTICE_COUNT per hour is -0.15, (0.42, 0.57).

Comment 20

2 years ago
Sorry, looks like the first number is e10s so I think this is actually an improvement?

Vladan, can you clarify?
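
The sign question can be checked with a little arithmetic, assuming the tuple is ordered (e10s, single-process) as it is labelled in comment 17; the tuple in comment 19 is not labelled, so this ordering is an assumption. The helper name below is hypothetical.

```python
def describe_difference(e10s: float, single_process: float) -> dict:
    """Summarize the difference between two medians, e10s minus single-process."""
    diff = e10s - single_process
    return {
        "difference": round(diff, 2),
        # Relative change with single-process as the baseline.
        "relative_to_single_process": round(diff / single_process, 3),
        "e10s_is_lower": diff < 0,
    }


summary = describe_difference(0.42, 0.57)
```

Under that ordering the difference is -0.15, meaning fewer notices per hour with e10s, or roughly 26% below the single-process median. The 35% figure is consistent with dividing 0.15 by 0.42 instead, though that is a guess at how it was obtained.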
Flags: needinfo?(vladan.bugzilla)

Comment 21

2 years ago
Jim, that comparison is not statistically significant.
Flags: needinfo?(vladan.bugzilla)

Comment 22

2 years ago
Hey Roberto, in the original report, we had a regression as such:

Median difference in histograms/SLOW_SCRIPT_NOTICE_COUNT per hour is 0.15, (0.26, 0.11)

in the current report we have:

Median difference in payload/histograms/SLOW_SCRIPT_NOTICE_COUNT per hour is -0.15, (0.42, 0.57)

How is the first significant but the second not?
Flags: needinfo?(rvitillo)

Comment 23

2 years ago
In both cases the probability of the effect being purely by chance is pretty high so we can't say anything for sure.
Flags: needinfo?(rvitillo)

Comment 24

2 years ago
<jimm> rvitillo: fyi I ni'd you again on bug 1182638, had another question about those numbers.
<jimm> I don't understand why the bug was filed
<rvitillo> jimm: In both cases the probability of the effect being purely by chance is pretty high so we can't say anything for sure.
<jimm> ah ok
<rvitillo> we either need more data to be sure or the difference is so small that it doesn’t really matter
<rvitillo> Running an experiment on Beta should help with that
<jimm> ok, I'm going to close that out then. we can refile if we see solid proof of a regression down the road.
<jimm> thanks!
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → WORKSFORME

Updated

2 years ago
Whiteboard: [aurora experiment][mixed addons]

Updated

a year ago
Blocks: 1249978
Blocks: 1251545