Closed Bug 1223780 Opened 5 years ago Closed 4 years ago

[e10s][telemetry] regressions in EVENTLOOP_UI_LAG_EXP_MS (3 metrics: parent, children, parent + children)

Categories

(Firefox :: General, defect, P3)

defect

Tracking

()

RESOLVED FIXED
Tracking Status
e10s + ---
firefox45 --- affected

People

(Reporter: jimm, Assigned: mconley)

References

(Blocks 1 open bug)

Details

https://github.com/vitillo/e10s_analyses/blob/master/aurora/e10s_experiment.ipynb

This is one of the metrics we should not regress.
This is pending new telemetry data broken down by:

1) no addons
2) approved addons list
3) all addons

Bug 1223777 is similar. 

Vladan, can we make a request to get the experiment data broken down like this for telemetry-related regression bugs? Note there are other telemetry-related regression bugs besides this one and bug 1223777; we're still working through e10s-perf.
Flags: needinfo?(vladan.bugzilla)
Sure, whenever we report an e10s regression from Telemetry, we'll provide a separate report for each of those 3 configurations.

Jim: Which approved addons list are you referring to?

Roberto: can you & Birunthan update the e10s Telemetry regression bugs with numbers from the no-addons & at-least-one-addon variants of the analyses?
Flags: needinfo?(vladan.bugzilla) → needinfo?(rvitillo)
(In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #2)
> Sure, whenever we report an e10s regression from Telemetry, we'll provide a
> separate report for each of those 3 configurations.
> 
> Jim: Which approved addons list are you referring to?

We could run the analysis on profiles that have only one (or a combination) of the top add-ons installed, but singling out the effect of individual add-ons in profiles with multiple add-ons installed might prove challenging; it might take some time to come up with a good model.

Analysis for profiles that have no extensions installed: [1]
Analysis for profiles with at least one extension installed: [2]
Analysis for profiles that have only Adblock Plus installed: [3]

[1] https://github.com/vitillo/e10s_analyses/blob/master/aurora/addons/e10s_noext_experiment.ipynb
[2] https://github.com/vitillo/e10s_analyses/blob/master/aurora/addons/e10s_onlyext_experiment.ipynb
[3] https://github.com/vitillo/e10s_analyses/blob/master/aurora/addons/e10s_top_extension_experiment.ipynb
Flags: needinfo?(rvitillo)
(In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #2)
> Sure, whenever we report an e10s regression from Telemetry, we'll provide a
> separate report for each of those 3 configurations.
> 
> Jim: Which approved addons list are you referring to?

here ya go - 

https://wiki.mozilla.org/Electrolysis/Addons

I would hope that most users with add-ons installed have add-ons that fall into this set, since it's based on AMO's most-popular list among other criteria.
Flags: needinfo?(vladan.bugzilla)
(In reply to Roberto Agostino Vitillo (:rvitillo) from comment #3)
> (In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #2)
> > Sure, whenever we report an e10s regression from Telemetry, we'll provide a
> > separate report for each of those 3 configurations.
> > 
> > Jim: Which approved addons list are you referring to?
> 
> We could run the analysis on profiles that have only one (or a combination)
> of the top add-ons installed but single out the effect of individual add-ons
> for profiles with multiple add-ons installed might prove challenging; it
> might take some time to come up with a good model.
> 
> Analysis for profiles that have no extensions installed: [1]
> Analysis for profiles with at least one extensions installed: [2]
> Analysis for profiles that have only Adblock Plus installed: [3]
> 
> [1] https://github.com/vitillo/e10s_analyses/blob/master/aurora/addons/e10s_noext_experiment.ipynb
> [2] https://github.com/vitillo/e10s_analyses/blob/master/aurora/addons/e10s_onlyext_experiment.ipynb
> [3] https://github.com/vitillo/e10s_analyses/blob/master/aurora/addons/e10s_top_extension_experiment.ipynb

The list I posted is the list we care about; Adblock Plus is certainly in it. Is there any way we can cross-match this list with a 'most popular' list from telemetry to see how in sync they are?
Summary: [e10s][telemetry] regressions in EVENTLOOP_UI_LAG_EXP_MS → [e10s][telemetry] regressions in EVENTLOOP_UI_LAG_EXP_MS (parent and children)
Summary: [e10s][telemetry] regressions in EVENTLOOP_UI_LAG_EXP_MS (parent and children) → [e10s][telemetry] regressions in EVENTLOOP_UI_LAG_EXP_MS (3 metrics: parent, children, parent + children)
Assignee: nobody → mconley
Priority: -- → P1
Flags: needinfo?(vladan.bugzilla)
Blocks: 1224276
(In reply to Jim Mathies [:jimm] from comment #5)
> The list I posted is the list we care about. Adblock plus is certainly in
> it. Is there any way can cross match this list with a 'most popular' via
> telemetry to see how in sync they are?

Good suggestion. We should be able to do that easily.
The analyses mentioned in comment 3 have been updated. In particular, the Adblock Plus analysis was replaced with a whitelisted addon experiment.

Profiles that have no extensions installed: http://nbviewer.ipython.org/github/vitillo/e10s_analyses/blob/master/aurora/addons/e10s_no_extension_experiment.ipynb

Profiles that have at least one extension installed: http://nbviewer.ipython.org/github/vitillo/e10s_analyses/blob/master/aurora/addons/e10s_with_extension_experiment.ipynb

Profiles that have at least one and only whitelisted extensions installed: http://nbviewer.ipython.org/github/vitillo/e10s_analyses/blob/master/aurora/addons/e10s_whitelisted_extension_experiment.ipynb
This is tricky, since while I can see that there's greater likelihood of UI lag in the e10s case with no add-ons installed, I have no data on where that lag is coming from.

My first strategy is to look at the evolution of the probe to see if there's an obvious place where the regression was introduced (that's not just when we flipped e10s on by default). Unfortunately, bug 1226337 prevents me from seeing that evolution.
Depends on: 1226337
Unfortunately, I can no longer seem to access http://nbviewer.ipython.org/github/vitillo/e10s_analyses/blob/master/aurora/addons/e10s_no_extension_experiment.ipynb - I'm seeing:

Error 503 backend read error

backend read error
Guru Mediation:

Details: cache-lax1431-LAX 1449347668 2086734044

Varnish cache server

rvitillo, any idea what's up?
Flags: needinfo?(rvitillo)
(In reply to Mike Conley (:mconley) - Needinfo me! from comment #9)
> Unfortunately, I can no longer seem to access
> http://nbviewer.ipython.org/github/vitillo/e10s_analyses/blob/master/aurora/addons/e10s_no_extension_experiment.ipynb - I'm seeing:
> 
> Error 503 backend read error
> 
> backend read error
> Guru Mediation:
> 
> Details: cache-lax1431-LAX 1449347668 2086734044
> 
> Varnish cache server
> 
> rvitillo, any idea what's up?

There must be an issue with nbviewer but you can still access the notebook directly from github, e.g. https://github.com/vitillo/e10s_analyses/blob/master/aurora/addons/e10s_no_extension_experiment.ipynb.
Flags: needinfo?(rvitillo)
I talked with chutten and vladan during the work week, and I think we need more information here.

I _think_ what I want here is a measure of _how often_ (like, perhaps how many times per minute) we exceed a certain threshold on EVENTLOOP_UI_ACTIVITY_EXP_MS, and compare that between e10s and non-e10s.

chutten, am I remembering that correctly? If so, what is the threshold we should care about? Anything > 16ms? Or some other value that is considered our max time to service a system message?
Flags: needinfo?(chutten)
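The per-minute threshold count mconley asks about could be sketched roughly like this. Everything here is illustrative: the bucket boundaries, the sample data, and the `exceedances_per_minute` helper are made up, and this is not the real Telemetry histogram format.

```python
# Hypothetical sketch: count how often EVENTLOOP_UI_ACTIVITY_EXP_MS
# samples exceed a threshold, normalised per minute of session time.
def exceedances_per_minute(histogram, threshold_ms, session_minutes):
    """histogram maps a bucket's lower bound (ms) to its sample count."""
    over = sum(count for bucket_ms, count in histogram.items()
               if bucket_ms > threshold_ms)
    return over / session_minutes

# Made-up histograms for an e10s and a non-e10s session:
e10s = {0: 5000, 1: 3000, 16: 400, 64: 50, 256: 5}
non_e10s = {0: 6000, 1: 2500, 16: 300, 64: 30, 256: 2}

# Compare rates above a 16 ms threshold over a 60-minute session:
rate_e10s = exceedances_per_minute(e10s, 16, 60)
rate_non_e10s = exceedances_per_minute(non_e10s, 16, 60)
```

Whatever threshold chutten recommends (16 ms, 100 ms, or something else) would slot in as `threshold_ms`.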
Generally the threshold of user visibility is around 100ms. If the user sends us a discrete interaction event (click, scroll, key, whatever) and we service it (or appear to service it) within 100ms, the user perceives no jank. If the user sends us a continuous interaction event sequence (like touch scrolling), the threshold is much lower (I can't remember it off-hand).

This is probably why the BHR "hang" threshold is set around 100ms (this is the hang duration after which we collect pseudo-stacks to try to pinpoint repeat offenders). Due to bucketing, it means we only really get the >255ms ones, which might be troublesome.

The problem becomes: what's useful?

For bug 1187069 I am developing an addon that reports when Gecko-main is hanging for > 127ms (that's that bucketing again) according to the threadHangStats measure. On e10s it pretty steadily ticks... but I perceive no jank. So I tested to see if the BHR hang monitor was borked by running a tight requestAnimationFrame loop and reporting when it wasn't being serviced for > 127ms. They line up almost perfectly across a couple-hundred samples (< 1% difference in count). (this is all parent-process measurement)

I installed the addon in an e10s-disabled profile, and it doesn't tick over nearly as quickly.

So the measure appears to be correct, yet as a user I don't really care. It doesn't _feel_ bad.

tl;dr - It depends what you're trying to measure. If you want a measurement of user-visible jank, I don't think we have one. If you want to know if e10s events are being serviced more slowly on the parent process: they are. If you want to know why... well, I'm afraid I don't know.

Hopefully this helps?
Flags: needinfo?(chutten)
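The watchdog idea chutten describes (the add-on uses a requestAnimationFrame loop in JS) boils down to: record a heartbeat each time the event loop services you, and flag any gap between heartbeats that exceeds the hang threshold. A language-agnostic sketch, with the 127 ms threshold taken from the bucketing mentioned above and all names made up:

```python
import time

HANG_THRESHOLD_MS = 127  # matches the >127ms bucketing mentioned above

class HangMonitor:
    """Record gaps between heartbeats that exceed a hang threshold."""
    def __init__(self, threshold_ms=HANG_THRESHOLD_MS):
        self.threshold_ms = threshold_ms
        self.hangs = []   # gap lengths (ms) that exceeded the threshold
        self._last = None
    def heartbeat(self, now=None):
        """Call this from the event loop; `now` is seconds (monotonic)."""
        now = time.monotonic() if now is None else now
        if self._last is not None:
            gap_ms = (now - self._last) * 1000
            if gap_ms > self.threshold_ms:
                self.hangs.append(gap_ms)
        self._last = now
```

As chutten notes, a monitor like this measures whether the loop was busy, not whether the user perceived jank, which is why the e10s parent process can "tick steadily" without feeling bad.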
No longer blocks: 1224276
Depends on: 1187069
I'm able to reproduce this behaviour on my Windows Surface laptop.

What's particularly frustrating is that the behaviour goes away as soon as I start profiling. Lovely.
We fixed the issue that chutten and I were seeing in Mozlando in bug 1234618.

Looking at new experiment data in https://github.com/vitillo/e10s_analyses/blob/master/beta/addons/e10s_without_addons_experiment.ipynb, at first glance it looks like the EVENTLOOP_UI_ACTIVITY_EXP_MS readings are doing better with e10s than without - but vladan wants me to check the >50ms buckets (since those are basically being stomped out by the massive buckets at 0ms and 1ms).

rvitillo - I tried to be awesome like you, and use a Spark instance to run an analysis to get a histogram of 50ms and higher for that probe, but I got nowhere (check the scrollback in #perf if you want the nitty gritty). I'm planning on taking the Telemetry onboarding session at some point, but in the meantime, would it be possible for you to put together the histogram for me, or at least guide me through doing it, since I can't seem to do it on my own?
Flags: needinfo?(rvitillo)
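The re-binning vladan suggests amounts to ignoring the dominant 0-1 ms buckets and looking only at the tail. A minimal sketch, with made-up bucket values and a hypothetical `tail_fraction` helper (not the actual Spark analysis):

```python
# Sketch: what fraction of samples land in buckets of 50 ms and above?
def tail_fraction(histogram, cutoff_ms=50):
    """histogram maps a bucket's lower bound (ms) to its sample count."""
    total = sum(histogram.values())
    tail = sum(c for b, c in histogram.items() if b >= cutoff_ms)
    return tail / total if total else 0.0

# Illustrative data: the huge 0 ms / 1 ms buckets swamp everything else.
sample = {0: 9000, 1: 900, 64: 90, 256: 10}
```

Comparing `tail_fraction` between the e10s and non-e10s cohorts sidesteps the problem of the low buckets stomping out the interesting ones.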
(In reply to Roberto Agostino Vitillo (:rvitillo) from comment #16)
> See http://nbviewer.ipython.org/gist/vitillo/30643f4de8c7fc33c372.

Thanks rvitillo!

vladan - is comment 16 enough to close this bug out for you?
Flags: needinfo?(vladan.bugzilla)
I would wait; EVENTLOOP_UI_ACTIVITY_EXP_MS doesn't report from child processes yet (bug 1235908)
Depends on: 1235908
Flags: needinfo?(vladan.bugzilla)
Just so we're clear, the responsiveness of the _child process_ as reported by Telemetry is part of the e10s release criteria?
Flags: needinfo?(vladan.bugzilla)
(In reply to Mike Conley (:mconley) - Needinfo me! from comment #19)
> Just so we're clear, the responsiveness of the _child process_ as reported
> by Telemetry is part of the e10s release criteria?

IMHO, the right way to think about this is to compare overall browser responsiveness in e10s & non-e10s configurations. For the e10s configuration, overall e10s jank rate = jank_rate_of_parent + jank_rate_of_child
Flags: needinfo?(vladan.bugzilla)
Priority: P1 → P3
Well, here's a run of jankiness as measured by INPUT_EVENT_RESPONSE_MS, which measures time from input event creation (by the OS) to the point when we're done handling the event in the reporting process (the plots are graphing only the values above the 100ms jank threshold): https://gist.github.com/chutten/9b9e29df10e0f7306f99

My casual interpretation is that the numbers are no different. Some are higher, some are lower, with no clear pattern.

This is operating on preliminary data, so things might change after we account for early adopters over-representing in the sample population.
I have now rerun the analysis on the entire beta45-experiment1 dataset and the graphs look little different.

To make triple sure I'll rerun on beta45-experiment2 (Bug 1244187), and I'll reopen this if necessary.
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED