Gather Telemetry about Tab Warming

NEW
Unassigned

Status


enhancement
P3
normal
a year ago
a year ago

People

(Reporter: chutten, Unassigned)

Tracking

Trunk

Firefox Tracking Flags

(firefox59 affected)

Details

(Reporter)

Description

a year ago
When tab warming is on, we notice[1] a nice drop in FX_TAB_SWITCH_TOTAL_E10S_MS. But that's not the whole story.

How many millis do we save people with tab warming?
How many millis do we waste, warming a tab that isn't switched to before we bin the resources?
How many bytes do we waste at that time?

I'm thinking the first is satisfyingly high, the second and third are acceptably low... but we won't know for certain unless we measure it.

So I propose a family of measures for looking at tab warming. I'm not settled on some of the particulars, though:

* Do we want to know the actual times saved? Does it matter that most of the warming is saving us Xms or less? If so, we want a histogram for the millis saved, probably exponential with a high of about 250ms.

* Do we just want the total amount saved? If we just want to know how much time we've saved our users, we could keep a running tally of the millis we spent warming tabs that were switched to before the resources were reclaimed. We could put that in a scalar.

* We only have one kind of warming at the moment (on mouse hover, render layers and upload them to the compositor). We presumably will be extending this to different warming triggers (maybe warming "adjacent" tabs for people who Ctrl+Tab, Ctrl+Shift+Tab) and different warming strategies (maybe just rendering, no upload. Maybe render, upload, and offscreen composite. I dunno, just spitballing). So that means considering an expanding key space for these measures if we want to break down the effect each of these strategies has.

So, categorical histograms, or keyed histograms or scalars.

* <insert other things we might want to examine>
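
To make the histogram option a bit more concrete, here is a minimal Python sketch of how exponential bucket boundaries with a high of 250ms could be laid out. The bucket count of 20, the low of 1ms, and the function name are assumptions for illustration. The layout is a simple geometric spacing and is not guaranteed to match the exact bucketing Telemetry uses.

```python
import math


def exponential_buckets(low, high, n_buckets):
    """Return n_buckets integer lower bounds: 0, low, ..., high."""
    bounds = [0, low]
    current = low
    for index in range(2, n_buckets):
        # Spread the remaining log-distance to `high` evenly over the
        # buckets that still have to be placed.
        log_current = math.log(current)
        log_ratio = (math.log(high) - log_current) / (n_buckets - index)
        candidate = int(math.floor(math.exp(log_current + log_ratio) + 0.5))
        # Bounds are integers, so always advance by at least 1.
        current = candidate if candidate > current else current + 1
        bounds.append(current)
    return bounds


# Hypothetical parameters: 20 buckets covering 1ms to 250ms.
BOUNDS = exponential_buckets(1, 250, 20)
```

With these parameters the low end gets fine-grained buckets and the high end gets coarse ones, which suits a measure where most of the savings are expected to be small.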

And is it enough to have this information from our (non-representative, but very helpful) pre-release populations? Or would having a big number of Hours Our Release Users Have Been Saved From Looking At Slow Tabs be useful for business decisions (and feeling good about ourselves)?

[1]: http://alerts.telemetry.mozilla.org/index.html#/detectors/1/metrics/1288/alerts/?from=2017-05-24&to=2017-05-24
(Reporter)

Comment 1

a year ago
Anything about tab warming strike you as particularly good fodder for instrumentation?
Flags: needinfo?(mconley)
Priority: -- → P3

Comment 2

(In reply to Chris H-C :chutten from comment #0)
> How many millis do we save people with tab warming?

Does the FX_TAB_SWITCH_TOTAL_E10S_MS probe not give us a sense of how much time we're saving people? Or are you thinking we have each client measure roughly how much time they're saving, and report _that_ value?

> How many millis do we waste, warming a tab that isn't switched to before we bin the resources?

Right, so, I think that's a little harder to measure. Rendering and then dropping layers itself has a cost, and then there might be secondary costs down the line as the result of those drops that we hit when we're idle (maybe some CC'ing stuff... I'm mostly speculating here). We should be really precise here about what exactly we mean by time wasted, and what window of time we care about.

So perhaps it's this:

How much time is spent in the TabChild rendering and uploading layers that are ultimately not used?

and

How much time is spent in the compositor receiving and then dropping layers that are ultimately not shown?

> How many bytes do we waste at that time?

I'm not 100% sure how to measure that to be honest. Perhaps our graphics people know how we can measure how much memory these sets of layers take... I would not be surprised, however, if we can only do best guesses based on the size of the layers.

> So I propose a family of measures for looking at tab warming. I'm not
> settled on some of the particulars, though:
> 
> * Do we want to know the actual times saved? Does it matter that most of the
> warming is saving us Xms or less? If so, we want a histogram for the millis
> saved, probably exponential with a high of about 250ms.

Yes, this sounds useful.

> * Do we just want the total amount saved? If we just want to know how much
> time we've saved our users, we could keep a running tally of millis we
> warmed tabs that were switched to before the resources were reclaimed. We
> could put that in a scalar.

I'm not certain how useful that is. That number is likely going to be wildly different depending on browser session length, so it'd need to be normalized during analysis, I guess.

I think it might be useful, ultimately - though we could also perhaps roughly approximate it by summing the buckets from the previously mentioned histogram.
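
A rough Python sketch of what that normalization could look like during analysis. The parameter names (`warming_saved_ms`, `session_length_s`) are hypothetical and do not refer to real probes.

```python
def saved_ms_per_session_hour(warming_saved_ms, session_length_s):
    """Normalize a per-session tally of saved millis by session length."""
    if session_length_s <= 0:
        # No usable session length: skip rather than divide by zero.
        return None
    return warming_saved_ms / (session_length_s / 3600.0)


# Two hypothetical sessions with the same tally but different lengths.
short_session = saved_ms_per_session_hour(600, 1800)    # 30-minute session
long_session = saved_ms_per_session_hour(600, 14400)    # 4-hour session
```

The same raw tally of 600ms means something quite different for a 30-minute session than for a 4-hour one, which is the reason for dividing before comparing clients.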

> 
> * We only have one kind of warming at the moment (on mouse hover render
> layers and upload them to the compositor). We presumably will be extending
> this to different warming triggers (maybe warming "adjacent" tabs for people
> who Ctrl+Tab, Ctrl+Shift+Tab) and different warming strategies (maybe just
> rendering, no upload. Maybe render, upload, and offscreen composite. I
> dunno, just spitballing). So that means considering an expanding key space
> for these measures if we want to break down the effect each of these
> strategies have.

We can perhaps stash the reason we're warming in the async tab switcher when reporting. That might be useful, but yeah - it explodes our measures out a bit. Ultimately, I'm not 100% sure I care enough about comparing the various warming points - like, comparing whether hovering a tab saves 50ms more tab switch time than hovering the tab close button.

> And is it enough to have this information from our (non-representative, but
> very helpful) pre-release populations? Or would having a big number of Hours
> Our Release Users Have Been Saved From Looking At Slow Tabs be useful for
> business decisions (and feeling good about ourselves)?

Sounds like a question for the Marketing or Product team. :) I guess I'm casually interested in the value - not to an extreme degree.

(In reply to Chris H-C :chutten from comment #1)
> Anything about tab warming strike you as particularly good fodder for
> instrumentation?

One thing that might be worth instrumenting is _how_ people are switching tabs, and whether or not those techniques have different tab switching behavioural characteristics.

For example, I'm reasonably sure, based on my familiarity with the code, that a tab switch as the result of the previous tab closing is more likely to show a tab switch spinner than a normal tab switch.

So enumerating and classifying the ways in which people switch tabs, and breaking down our various tab switch performance metrics in that way would be super useful to me, but I'm not sure how practical that'd be (there are lots of ways to switch tabs!)
Flags: needinfo?(mconley)

Comment 3

> How much time is spent in the compositor receiving and then dropping layers that are ultimately not shown?

Maybe there's an easier way to get an approximation like this - how often we warm up a tab that is then not shown within 2-3 seconds.
(Reporter)

Comment 4

a year ago
(In reply to Mike Conley (:mconley) (:⚙️) from comment #2)
> (In reply to Chris H-C :chutten from comment #0)
> > How many millis do we save people with tab warming?
> 
> Does the FX_TAB_SWITCH_TOTAL_E10S_MS probe not give us a sense of how much
> time we're saving people? Or are you thinking we have each client measure
> roughly how much time they're saving, and report _that_ value?

TAB_SWITCH_TOTAL gives us how much time people are spending on tab switches. Measuring a population at different points in time might give us an idea of how a given tab warming strategy helps users... or it might have been a holiday or something :S

Having a "this is the amount of time we spent warming that tab _before_ you tried to switch to it. You're welcome." figure would be both a nice feather for our caps and a nice canary for if something goes wrong.
 
> > How many millis do we waste, warming a tab that isn't switched to before we bin the resources?
> 
> Right, so, I think that's a little harder to measure. Rendering and then
> dropping layers itself has a cost, and then there might be secondary costs
> down the line as the result of those drops that we hit when we're idle
> (maybe some CC'ing stuff... I'm mostly speculating here). We should be
> really precise here about what exactly we mean about time wasted, and what
> window of time we care about.
> 
> So perhaps it's this:
> 
> How much time is spent in the TabChild rendering and uploading layers that
> are ultimately not used?
> 
> and
> 
> How much time is spent in the compositor receiving and then dropping layers
> that are ultimately not shown?

Upon reflection, a time-based measurement might not be useful. What is a couple of millis of computation that was performed off-main on an idle core, anyway? Maybe a count of "tabs warmed that were ultimately not switched to" would do, where "not switched to" means "within interval X" as :gandalf suggests, or "within this session", or "before we throw out the layer data and have to re-render it anyway".

Just a way to give us a heads-up if our accuracy suddenly changes. (Helps with experimentation, too)
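
A minimal Python sketch of the "within interval X" variant, just to pin down what would be counted. The class, its method names, and the 3000ms interval are all hypothetical; the real bookkeeping would live wherever the tab switcher decides to warm and to discard layers.

```python
class WarmingTracker:
    """Counts warmed tabs that were, or were not, switched to in time."""

    def __init__(self, interval_ms):
        self.interval_ms = interval_ms
        self.pending = {}  # tab id -> timestamp (ms) when warming started
        self.hits = 0
        self.misses = 0

    def on_warm(self, tab_id, now_ms):
        self.expire(now_ms)
        # Re-warming an already-pending tab keeps the original start time.
        self.pending.setdefault(tab_id, now_ms)

    def on_switch(self, tab_id, now_ms):
        self.expire(now_ms)
        if self.pending.pop(tab_id, None) is not None:
            self.hits += 1

    def expire(self, now_ms):
        """Count every warming older than the interval as a miss."""
        stale = [tab_id for tab_id, start in self.pending.items()
                 if now_ms - start > self.interval_ms]
        for tab_id in stale:
            del self.pending[tab_id]
            self.misses += 1


tracker = WarmingTracker(interval_ms=3000)
tracker.on_warm("tab-a", 0)
tracker.on_switch("tab-a", 400)   # switched to in time: a hit
tracker.on_warm("tab-b", 1000)
tracker.expire(5000)              # never switched to: a miss
```

The ratio of misses to hits is the "accuracy" figure: a sudden shift in it would be the heads-up, without needing to time anything.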

> > How many bytes do we waste at that time?
> 
> I'm not 100% sure how to measure that to be honest. Perhaps our graphics
> people know how we can measure how much memory these sets of layers take...
> I would not be surprised, however, if we can only do best guesses based on
> the size of the layers.

w x h x 4 bytes? :D
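
Spelled out as a Python sketch, assuming 32-bit pixels and ignoring tiling, padding, and any compositor-side copies (so this is a best guess, not a measurement). The function names and the example layer sizes are hypothetical.

```python
BYTES_PER_PIXEL = 4  # Assumes 32-bit pixels (e.g. RGBA, 8 bits per channel).


def estimate_layer_bytes(width_px, height_px):
    """Best-guess size of one layer: w x h x 4 bytes."""
    return width_px * height_px * BYTES_PER_PIXEL


def estimate_warmed_bytes(layer_sizes):
    """Sum the estimate over (width, height) pairs for a warmed tab."""
    return sum(estimate_layer_bytes(w, h) for w, h in layer_sizes)


# Hypothetical warmed tab: one full-window layer plus one small layer.
one_layer = estimate_layer_bytes(1920, 1080)
whole_tab = estimate_warmed_bytes([(1920, 1080), (300, 250)])
```

For the 1920x1080 layer this comes to 8,294,400 bytes, a bit under 8 MiB.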

> > So I propose a family of measures for looking at tab warming. I'm not
> > settled on some of the particulars, though:
> > 
> > * Do we want to know the actual times saved? Does it matter that most of the
> > warming is saving us Xms or less? If so, we want a histogram for the millis
> > saved, probably exponential with a high of about 250ms.
> 
> Yes, this sounds useful.
> 
> > * Do we just want the total amount saved? If we just want to know how much
> > time we've saved our users, we could keep a running tally of millis we
> > warmed tabs that were switched to before the resources were reclaimed. We
> > could put that in a scalar.
> 
> I'm not certain how useful that is. That number is likely going to be wildly
> different depending on browser session length, so it'd need to be normalized
> during analysis, I guess.
> 
> I think it might be useful, ultimately - though we could also perhaps
> roughly approximate it by summing the buckets from the previously mentioned
> histogram.

We keep the full-resolution sum in the histogram when we send it. So if you accumulate 2, 3, 3, 4 and all it ends up as is a count of 4 in the 1-10 bucket, the histogram's sum is still 12. So fret not about approximation.
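
The same example as a small Python sketch. The `Histogram` class and its bucket bounds are illustrative stand-ins chosen to match the "1-10 bucket" above, not Telemetry's actual implementation.

```python
import bisect


class Histogram:
    def __init__(self, bounds):
        self.bounds = bounds              # sorted lower bounds of each bucket
        self.counts = [0] * len(bounds)
        self.sum = 0

    def accumulate(self, value):
        # Find the last bucket whose lower bound is <= value.
        index = bisect.bisect_right(self.bounds, value) - 1
        self.counts[max(index, 0)] += 1
        self.sum += value                 # exact, independent of bucketing


h = Histogram([0, 1, 10, 100])
for sample in (2, 3, 3, 4):
    h.accumulate(sample)
```

After this, `h.counts` holds a single count of 4 in the bucket starting at 1, while `h.sum` is still 12.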

> > 
> > * We only have one kind of warming at the moment (on mouse hover render
> > layers and upload them to the compositor). We presumably will be extending
> > this to different warming triggers (maybe warming "adjacent" tabs for people
> > who Ctrl+Tab, Ctrl+Shift+Tab) and different warming strategies (maybe just
> > rendering, no upload. Maybe render, upload, and offscreen composite. I
> > dunno, just spitballing). So that means considering an expanding key space
> > for these measures if we want to break down the effect each of these
> > strategies have.
> 
> We can perhaps stash a reason we're warming in the async tab switcher when
> reporting. That might be useful, but yeah - it explodes our measures out a
> bit. Ultimately, I'm not 100% sure I care enough in comparing the various
> warming points - like, comparing that hovering a tab saves 50ms more tab
> switch time than hovering the tab close button.

It wouldn't balloon too badly. Linearly with the number of warming strategies.

> > And is it enough to have this information from our (non-representative, but
> > very helpful) pre-release populations? Or would having a big number of Hours
> > Our Release Users Have Been Saved From Looking At Slow Tabs be useful for
> > business decisions (and feeling good about ourselves)?
> 
> Sounds like a question for the Marketing or Product team. :) I guess I'm
> casually interested in the value - not to an extreme degree.

Fair :)

> (In reply to Chris H-C :chutten from comment #1)
> > Anything about tab warming strike you as particularly good fodder for
> > instrumentation?
> 
> One thing that might be worth instrumenting is _how_ people are switching
> tabs, and whether or not those techniques have different tab switching
> behavioural characteristics.
> 
> For example, I'm reasonably sure based on my familiarity with the code, that
> a tab switch as the result of the previous tab closing is more likely to
> show a tab switch spinner than a normal tab switch.
> 
> So enumerating and classifying the ways in which people switch tabs, and
> breaking down our various tab switch performance metrics in that way would
> be super useful to me, but I'm not sure how practical that'd be (there are
> lots of ways to switch tabs!)

Ah, so a keyed TAB_SWITCH_TOTAL_MS where the keys are things like "tabclosed", "keyboard", "tabpreview", "that-dropdown-arrow-menu-thing", "mouse", "touch" (adding keys as we find all the various ways we let users switch tabs). We could add that today, if we knew anyone who knew things about tab switching.
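
As a rough illustration of the shape of the data, here is a Python sketch of a keyed measure. `KeyedSamples` is a hypothetical stand-in that keeps raw samples per key; a real keyed histogram would bucket them instead. The sample values are made up.

```python
from collections import defaultdict


class KeyedSamples:
    """Minimal stand-in for a keyed histogram: one sample list per key."""

    def __init__(self, name):
        self.name = name
        self.by_key = defaultdict(list)

    def add(self, key, value_ms):
        self.by_key[key].append(value_ms)

    def summary(self):
        return {key: {"count": len(values), "sum": sum(values)}
                for key, values in self.by_key.items()}


# Keys name the way the user switched tabs; values are made-up timings.
tab_switch = KeyedSamples("TAB_SWITCH_TOTAL_MS")
tab_switch.add("mouse", 40)
tab_switch.add("mouse", 55)
tab_switch.add("tabclosed", 180)
report = tab_switch.summary()
```

New keys appear the first time they are used, so the key list can grow as more ways of switching tabs are identified.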

... :mconley, would you happen to know anyone who knows things about tab switching? I can consult on the Telemetry part of things (it is quite literally copy-pasta of FX_TAB_SWITCH_TOTAL_E10S_MS but with a slightly-different name and "keyed": true in there somewhere).

Other action items... Need some gfx person to tell us how hard it is to get memory estimates from layers, I guess. And we could use a TAB_WARMING_MS for all those same keys.
No longer blocks: 1423220