Closed Bug 828097 Opened 11 years ago Closed 11 years ago

Add telemetry probes for tab animation smoothness

Categories

(Firefox :: Tabbed Browser, defect)

x86_64
Windows 7
defect
Not set
normal

Tracking

()

RESOLVED FIXED
Firefox 21

People

(Reporter: avih, Assigned: avih)

References

Details

Attachments

(1 file, 3 obsolete files)

Tab animation smoothness data should be collected.

Smoothness is ultimately measured in frames per second (FPS) of the animation, however, measuring also paint times can be useful for further analysis (especially on faster systems where frames times are not always affected by changes in paint times). Frame intervals and paint times can be measured using the system from bug 826383.

I think we should have 2 main probes (and per probe, record average FPS and average paint times for each animation):

1. A case with minimal variance and controlled variables:
- E.g. newtab page open when other tabs stay put (e.g. only the 2nd or 3rd tab in a window).

2. Any tab animation - to get an overall view of the experience.
- E.g. All tab open/close, regardless if other tabs are shrunk/expanded, regardless of the URI of the tab gaining focus, etc.


Concerns raised by vladan:

A. (only applies to scenario 1) With newtab page opening together with tab animation, first/some intervals/paints would be longer than when tab animation is unaffected by other intensive computations. He suggested to trim the first few frames out of the averages.

B. To get accurate results, frames recording should not include frames which are unrelated to the tab animation. I currently think this should never happen as animation should be rendered on each presented frame, but I'll look into it to make sure.
About the concerns from comment #0:

A. vladan and I agreed to keep all* frame intervals since it would let us observe changes over time due to newtab page changes (which is supposedly the only variable other than the tab animation itself).

B. The patch follows the assumption that such skew doesn't happen. I will verify this before the final r?.

* All intervals except the first, since it represents an interval to a frame which was usually presented relatively long before the recording start (e.g. browser was idle before opening/closing a tab, first interval may be several seconds).
Attachment #701661 - Flags: review?(vdjeric)
Would it make sense to encapsulate some of this logic in a new helper method on TelemetryTimestamps?

It would definitely be nice for the _toolkitTelemetryEnabled property to be on a more global service to avoid having to add preference observers in every client that wants to do non-trivial telemetry data recording.
We can live with telemetry pref changes only affecting new windows. Moved the main handling code into a method for readability.
Attachment #701661 - Attachment is obsolete: true
Attachment #701661 - Flags: review?(vdjeric)
Attachment #702607 - Flags: review?(vdjeric)
(In reply to :Gavin Sharp (away Jan 16-23) from comment #2)
> Would it make sense to encapsulate some of this logic in a new helper method
> on TelemetryTimestamps?
> 
> It would definitely be nice for the _toolkitTelemetryEnabled property to be
> on a more global service to avoid having to add preference observers in
> every client that wants to do non-trivial telemetry data recording.

The pref doesn't need to be read in every client; clients can just record data in Telemetry and if Telemetry is disabled, the calls will be no-ops. Also, we're not supporting Telemetry-enabled status changing without a restart, so there's no need for an observer.

I think Avi is explicitly checking Telemetry state in this patch because he's being extra careful about avoiding measurement overhead when Telemetry is disabled. We already expose Services.telemetry.canRecord for this, so there's no need to check the pref manually.
Comment on attachment 702607 [details] [diff] [review]
Removed telemetry pref observer, readability

>-            if (this.tabContainer._tabAnimationLoggingEnabled) {
>+            // Animation telemetry/logging
>+            if (this.tabContainer._toolkitTelemetryEnabled || this.tabContainer._tabAnimationLoggingEnabled) {
>               aTab._recordingHandle = window.QueryInterface(Ci.nsIInterfaceRequestor)
>                                             .getInterface(Ci.nsIDOMWindowUtils)
>                                             .startFrameTimeRecording();

Will the measurements be correct if I open multiple tabs simultaneously? e.g. session-restore or "open all in tabs" from a bookmark folder

>+                    if (aURI == BROWSER_NEW_TAB_URL && (t._tPos == 1 || t._tPos == 2)) {
>+                      // Indicate newtab page animation where other tabs are unaffected
>+                      //   (for which case, the 2nd or 3rd tabs are good representatives, even if not absolute)
>+                      t._recordingTabOpenPlain = true;

This is for our "controlled" measurement, so would it be possible to check if aURI is "about:newtab" for this measurement? The new tab page's URL is customizable from a pref and a quick Google search shows that addons replace it :(
And if so, do we also want to control for tab-previews being on or off?

>+          this._toolkitTelemetryEnabled = Services.prefs.getBoolPref("toolkit.telemetry.enabled");

You can use Services.telemetry.canRecord

>+              averagePaint /= intervals.length;

Nit: You might as well store intervals.length in a local "frameCount" variable. It's a bit easier to read imho

>+  "FX_TAB_ANIM_OPEN_INTERVALS_MS": {
>+    "kind": "exponential",
>+    "low" : 7,
>+    "high": "500",
>+    "n_buckets": 50,
>+    "description": "Average frame interval during tab open animation of browser.newtab.url, when other tabs are unaffected"
>+  },

Why not use the default "low" value of 0ms? We've had timing issues in the past. Also, I notice FX_TAB_ANIM_OPEN_MS has a high of 3000ms, would it make sense to use the same value in this histogram?

>+  "FX_TAB_ANIM_OPEN_PAINTS_MS": {
>+    "kind": "exponential",
>+    "high": "500",
>+    "n_buckets": 30,
>+    "description": "Average paint duration during tab open animation of browser.newtab.url, when other tabs are unaffected"
>+  },

Nit: PAINTS -> FRAME_PAINT
Attachment #702607 - Flags: review?(vdjeric)
(In reply to Vladan Djeric (:vladan) from comment #5)
> Will the measurements be correct if I open multiple tabs simultaneously?
> e.g. session-restore or "open all in tabs" from a bookmark folder
There is no tab animation on such cases, so no measurements -> OK.

> >+                    if (aURI == BROWSER_NEW_TAB_URL && (t._tPos == 1 || t._tPos 
> 
> This is for our "controlled" measurement, so would it be possible to check
> if aURI is "about:newtab" for this measurement? The new tab page's URL is
> customizable from a pref and a quick Google search shows that addons replace
> it :(
I did consider specifically testing "about:newtab" instead of the configurable pref, but eventually decided against it because:
1. This string can change in the future.
2. This will give us a wider view about our users, including those who use different default newtab page.
3. Hopefully we'll be able to bisect the collected data and differentiate those who use the default page from those who don't. Though I didn't check if we can actually do this with collected telemetry data.

> And if so, do we also want to control for tab-previews being on or off?
Probably no, since leaving it at whatever it is will allow us to measure improvements in overall newtab affect on animation, including when preview is changed to enabled on some future version.

> 
> >+  "FX_TAB_ANIM_OPEN_INTERVALS_MS": {
> >+    "kind": "exponential",
> >+    "low" : 7,
> >+    "high": "500",
> >+    "n_buckets": 50,
> >+    "description": "Average frame interval during tab open animation of browser.newtab.url, when other tabs are unaffected"
> >+  },
> 
> Why not use the default "low" value of 0ms? We've had timing issues in the
> past. Also, I notice FX_TAB_ANIM_OPEN_MS has a high of 3000ms, would it make
> sense to use the same value in this histogram?
Taras wanted a linear histogram to record exact (rounded) ms values around 16-17ms (60fps), which implies a small range (since we don't want hundreds of buckets). I noticed that frame intervals could be quite longer on slow systems (measured up to 100+ms), so I wanted a bigger range for this histogram. Starting in 7 will result in buckets every 1ms around 17, while also allowing bigger range. Starting in 0 will result in 2ms buckets around 17ms, so less accurate where it matters. The reason I chose 7 and not 15 is to account for possibly higher refresh rates (120hz is 8.3ms), when/if our refresh driver is not fixed at 60ms. Starting in 7 and using 30 buckets is still not full resolution around 17, so I chose 50.

So this choice of values provides practically linear histogram with highest resolution where it matters today and for reasonable future higher refresh rates, while still allowing bigger range (to see the big picture), while not using hundreds of buckets.

FX_TAB_ANIM_OPEN_MS accounts for the entire animation duration. Our buckets represent (average) single animation frame. 500ms for a single frame is enough to know we're doing extremely bad already.

Generally speaking, I tried to be mindful and not abuse the system by collecting too much data just for tab animations, while still making the data we do collect useful. We can collect more data for different specific cases, or limit our collection to more specific cases, or aggregate more cases into the same histogram, or tune our buckets for accuracy/range/collection-size, so I tried to balance the need for controllable data vs too much data vs accuracy vs usefulness.

I'm fully open for different segmentation though.
(In reply to Avi Halachmi (:avih) from comment #6)
> (In reply to Vladan Djeric (:vladan) from comment #5)
> > Will the measurements be correct if I open multiple tabs simultaneously?
> > e.g. session-restore or "open all in tabs" from a bookmark folder
> There is no tab animation on such cases, so no measurements -> OK.

Tab animations can and do happen simultaneously.
We decided to leave the buckets selection as is. All other comments were addressed.

(In reply to Dão Gottwald [:dao] from comment #7)
> Tab animations can and do happen simultaneously.
Indeed.

We measure addTab and removeTab which are also animated explicitly.
- Quick open/close/etc sequence which result in overlapping animations are measured properly (one measurement for each distinct animation, even if overlapped).
- On cases where other tabs are animated implicitly (shrunk/expanded - absolutely overlapping the trigger tab animation), only the trigger tab is measured - but the measurement is affected by the extra processing of the other tabs, hence smoothness might degrade, which is exactly what we want to include and measure for "any animation".

A negligible skew which we do collect is at cases where we measure "plain" newtab open without affecting other tabs, but in a quick open/close sequence which causes overlapping animation.
Attachment #702607 - Attachment is obsolete: true
Attachment #704736 - Flags: review?(dao)
Comment on attachment 704736 [details] [diff] [review]
v3 - address review at comment #5

>diff --git a/browser/base/content/tabbrowser.xml b/browser/base/content/tabbrowser.xml

>+      <method name="_handleFramesTelemetry">

>+          if (aTab.hasOwnProperty("_recordingHandle")) {

nit: this should just be:
if (aTab._recordingHandle)

Flagging felipe for review here as well, since he has some experience with telemetry stuff.
Attachment #704736 - Flags: review?(felipc)
(In reply to :Gavin Sharp (use gavin@gavinsharp.com for email) from comment #10)
> nit: this should just be:
> if (aTab._recordingHandle)

Thanks, I also think it's clearer that way but was trying to "do the right thing". I'll change those instances.
Vladan: it looks like that .canRecord does not reflect if the user has opted out of telemetry as originally thought. The only thing I see that sets it is this line (http://mxr.mozilla.org/mozilla-central/source/toolkit/components/telemetry/Telemetry.cpp#944) plus a few tests.

The intention here is to avoid measuring the FPS if it's not gonna be reported by telemetry. What do you suggest us to do: change canRecord to reflect that? Or a new property? Or check the pref?
canRecord does reflect whether a user has opted into telemetry, see:

http://mxr.mozilla.org/mozilla-central/source/toolkit/components/telemetry/TelemetryPing.js#746

The above statement sets the value of mCanRecord based on the toolkit.telemetry.enabled pref
Oh ok, it doesn't change on the fly, it's setup during startup. I see
Comment on attachment 704736 [details] [diff] [review]
v3 - address review at comment #5

Review of attachment 704736 [details] [diff] [review]:
-----------------------------------------------------------------

r=felipe with the change we discussed over IRC to call merge the two instances that call startFrameTimeRecording in a function and name it _handleFramesTelemetryStart and _handleFramesTelemetryEnd
Attachment #704736 - Flags: review?(felipc) → review+
Important to choose platforms...
https://tbpl.mozilla.org/?tree=Try&rev=4cd8abd955e6
https://hg.mozilla.org/mozilla-central/rev/b39ac5994fd7
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 21
Depends on: 920123
Depends on: 1345315
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: