Closed Bug 790752 Opened 8 years ago Closed 7 years ago

Investigate more comprehensive measures of tab switching speed that includes event handler responsiveness

Categories

(Firefox :: Tabbed Browser, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
Firefox 20

People

(Reporter: bellindira, Assigned: bellindira)

References

Details

Attachments

(1 file, 2 obsolete files)

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:15.0) Gecko/20100101 Firefox/15.0
Build ID: 20120824154833




Expected results:

Required telemetry coverage just for time spent under UpdateCurrentBrowser.
Component: Untriaged → Tabbed Browser
OS: Mac OS X → All
Hardware: x86 → All
Version: 15 Branch → unspecified
Comment 17 bug 753127 => Investigate more comprehensive measures of tab switching speed that include painting, event handler responsiveness, etc.

This is a summary of what I've found so far:

User switches tabs by either __selectNewTab (E.g Shortcut CMD + ALT + VK_LEFT) or _selectedTab (clicking on a different tab). After that onselect is triggered and it calls to the updateCurrentBrowser method. 

However, to measure when the page is completed loaded, it is required to add a telemetry to mTabProgressListener and check when nsIWebProgressListener.STATE_STOP in the case that "this.mTabBrowser.mIsBusy" was set to true in updateCurrentBrowser.
We don't want this telemetry to cover page loading time, just the actual switching time (time from event handling to paint of the new tab).
Status: UNCONFIRMED → NEW
Ever confirmed: true
Summary: Add telemetry for time spent under UpdateCurrentBrowser (tabbrowser.xml) → Investigate more comprehensive measures of tab switching speed that include painting, event handler responsiveness, etc.
I am not sure what you mean by time from event handling to paint. So, can you please clarify if what you would like to cover is the time spent between user event (that triggers switch tab) and just before the call to UpdateCurrentBrowser or what it would be?

Because, what I understand is that the tab painting happens in  UpdateCurrentBrowser and if the page has not been upload previously (e.g. open a link on a new tab) it relies on a progress listener to get when it finish to load the whole page (I mean UpdateCurrentBrowser could be executed and reach the end of the method and the page is not completed load until we get nsIWebProgressListener.STATE_STOP). 

And what happens between user event (that triggers switch tab) and just before the call to UpdateCurrentBrowser is just the creation/focus of the "tab template" (e.g. create doc element "tab", focus, set attributes label+value).

Thanks.
I believe Jeff has done this in bug 800031
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 800031
Bug 800031 fixed the end time to include painting, but didn't change the start time - updateCurrentBrowser is called in response to the select event, so it wouldn't have caught dumbness like e.g. bug 743877.
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
(In reply to :Gavin Sharp (use gavin@gavinsharp.com for email) from comment #5)
> Bug 800031 fixed the end time to include painting, but didn't change the
> start time - updateCurrentBrowser is called in response to the select event,
> so it wouldn't have caught dumbness like e.g. bug 743877.

That's bad. Lets fix this. Maybe even reuse the existing _TOTAL histogram
This is what I have found so far. 

Basically, the select event is triggered after:

1) selectedTab property is modified (this.mTabBox.selectedTab = val; tabbrowser.xml)
2) _selectNewTab method is executed (tabbox.xml)

selectedTab can be modified by the following methods:
1) previewTab (User selects a tab from Tab Groups)
2) loadOneTab 
3) loadTabs (The tab selected after a tab is closed)
4) _blurTab (user removes a tab)
5) selectTabAtIndex (user presses CMD + a number)
6) DOMWillOpenModalDialog (User opens a modal dialog)
7) drop (user drops a tab)

There is also a method called _handleKeyEvent however it is called any time a keypress event is received. 

_selectNewTab is called by:
1) mousedown (user clicks on a tab).
2) some keypress combinations (e.g. ALT + CMD + VK_LEFT)

Examples of tab switching flow:
1) user clicks on a tab (a different one of the current selected tab) -> _selectNewTab starts its execution -> updateCurrentBrowser begins its execution as well -> _handleTabSelect -> updateCurrentBrowser ends -> _selectNewTab ends/

2) user presses CMD + 2 (she selects tab #2) -> _selectTabAtIndex -> selectedTab property is set to 2 -> updateCurrentBrowser begins execution -> _handleTabSelect -> updateCurrentBrowser ends.

I think the simplest way to change the start time would be on the setter of selectedTab and at the beginning of _selectNewTab method. So, please let me know what you think about this.
Summary: Investigate more comprehensive measures of tab switching speed that include painting, event handler responsiveness, etc. → Investigate more comprehensive measures of tab switching speed that includes event handler responsiveness
(In reply to Bellindira Castillo [:bellindira] from comment #7)
> I think the simplest way to change the start time would be on the setter of
> selectedTab and at the beginning of _selectNewTab method. So, please let me
> know what you think about this.

I think we're only interested in event handler responsiveness here, rather than in the selectedTab setter.

However, it seems that we'll mostly measure some tiny XBL/DOM/JS overhead on top of the other two tab-switching probes at this point, which doesn't seem overly interesting. A more interesting thing to get data for might be the lag between the user input and event processing (which isn't specific to tab switching).
(In reply to :Gavin Sharp (use gavin@gavinsharp.com for email) from comment #5)
> Bug 800031 fixed the end time to include painting, but didn't change the
> start time - updateCurrentBrowser is called in response to the select event,
> so it wouldn't have caught dumbness like e.g. bug 743877.

I think this is all that's required from this bug...to move up the start of measuring in _selectNewTab/_selectTabAtIndex not sure why responsiveness is in the title.
(In reply to Dão Gottwald [:dao] from comment #8)
> (In reply to Bellindira Castillo [:bellindira] from comment #7)
> > I think the simplest way to change the start time would be on the setter of
> > selectedTab and at the beginning of _selectNewTab method. So, please let me
> > know what you think about this.
> 
> I think we're only interested in event handler responsiveness here, rather
> than in the selectedTab setter.
> 
> However, it seems that we'll mostly measure some tiny XBL/DOM/JS overhead on
> top of the other two tab-switching probes at this point, which doesn't seem
> overly interesting. A more interesting thing to get data for might be the
> lag between the user input and event processing (which isn't specific to tab
> switching).

Yes, I agree, at the end, we'll measure some XBL/DOM/JS tiny overhead which I think it isn't going to add a meaningful time to the current metric. And it will also add more complexity to measure this correctly. I mean:

1)From Bug 753127 Comment 8: adding code to tabbox.xml to measure tab switching speed will be somewhat unreliable. And it is not possible to measure the call of _selectNewTab on tabbrowser.xml.

2)(In reply to Taras Glek (:taras) from comment #9)
> I think this is all that's required from this bug...to move up the start of
> measuring in _selectNewTab/_selectTabAtIndex not sure why responsiveness is
> in the title.

_selectNewTab and selectTabAtIndex doesn't cover all tab switching cases. 
_selectNewTab would cover events like click on a tab or move from tab to tab using CMD + Left/Right arrows.
selectTabAtIndex would cover some keyboard events as CMD + Number.
But they wouldn't cover events like: open or close new tabs (e.g CMD+T, CMD+W, open link in a new tab) which are currently measured on updateCurrentBrowser method (!aForceUpdate).

After analyzing this carefully, I would recommend leaving the start of this telemetry where it currently is, because it would guarantee to capture all tab switching cases in a reliable way.

And if you still want to measure the tiny time on the event handler, it would be better to add a separate telemetry for each case you are interested in.

Please let me know what you think about it.
We don't need to tweak any of the existing probes. It would be useful to add a complementary probe that measures time spent between e.g. a click on a tab and onUpdateCurrentBrowser being called. It doesn't need to cover every single case - ideally just the common ones (clicking on a tab is probably the most important).
Attachment #679722 - Flags: review?(gavin.sharp)
Comment on attachment 679722 [details] [diff] [review]
Added telemetry for tab switch event handlers

Hmm, I see a couple of issues here:
- the loadOneTab probe seems too specific - there are other ways for a tab to be added, and I'm not sure the workload its measuring is significant.
- FX_TAB_SWITCH_CLOSE_TAB_MS seems like a misleading name for measuring what happens under _blurTab - that's a small subset of "tab closing"

The tabbox changes look more like what we wanted, but having the stopwatch start in _selectNewTab is a bit far away from the event handler. And I don't like the ID-checking for browser-specific elements in toolkit code.

One way I see to fix that is to introduce a "stopwatchid" attribute on the "tabs" binding. That means you could set |stopwatchid="FX_TAB_CLICK_MS"| on the tabbrowser-tabs element in browser.xul, and then have the mousedown handler that calls this.parentNode._selectNewTab in tabbox.xml do something like:

let stopwatchid = this.parentNode.getAttribute("stopwatchid");
if (stopwatchid) {
  TelemetryStopwatch.start(stopwatchid);
}

before it calls _selectNewTab, and then have it call finish() at the end of that mousedown handler (since things now all happen synchronously under that handler, AFAICT).

This will end up covering a superset of the existing FX_TAB_SWITCH_UPDATE_MS measurement, but very specifically for "real" tab switches, and only those triggered from clicks. So I think we could probably eventually get rid of FX_TAB_SWITCH_UPDATE_MS in favor of this measurement, but we can keep them both in for a while to see how the numbers compare.

We could do other measurements for tab switches from the keyboard by doing essentially the same thing for advanceSelectedTab (or for all of its callers, if we want more granular data - but that's probably overkill).
Attachment #679722 - Flags: review?(gavin.sharp) → review-
Attached patch Patch v2 (obsolete) — Splinter Review
Here is the measurement of tab switching triggered by mouse click as suggested.
Attachment #679722 - Attachment is obsolete: true
Attachment #680948 - Flags: review?(gavin.sharp)
I'm still skeptical that you're measuring anything interesting here. What kind of numbers do you get for FX_TAB_CLICK_MS?

It's also unclear to me how this probe would ensure stuff doesn't happen async anymore. The most effective way to prevent bug 743877 from regressing is probably peer review.
(In reply to Dão Gottwald [:dao] from comment #15)
> I'm still skeptical that you're measuring anything interesting here. What
> kind of numbers do you get for FX_TAB_CLICK_MS?

As I mentioned in comment 13, we're measuring time spent between click and the tab being selected, which includes time measured by FX_TAB_SWITCH_UPDATE_MS. Why don't you think that is interesting?

> It's also unclear to me how this probe would ensure stuff doesn't happen
> async anymore.

The probe lives directly in the tab selection handling code - if someone was going to introduce a spin of the event loop or any other slowness there, they're more likely to notice the implications of that.

> The most effective way to prevent bug 743877 from regressing is probably peer
> review.

I agree, but I think this can help too. It certainly doesn't hurt, and gives us a more comprehensive measurement than what we currently have.
(In reply to :Gavin Sharp (use gavin@gavinsharp.com for email) from comment #16)
> (In reply to Dão Gottwald [:dao] from comment #15)
> > I'm still skeptical that you're measuring anything interesting here. What
> > kind of numbers do you get for FX_TAB_CLICK_MS?
> 
> As I mentioned in comment 13, we're measuring time spent between click and
> the tab being selected, which includes time measured by
> FX_TAB_SWITCH_UPDATE_MS. Why don't you think that is interesting?

I meant the difference between FX_TAB_CLICK_MS and FX_TAB_SWITCH_UPDATE_MS.

> > It's also unclear to me how this probe would ensure stuff doesn't happen
> > async anymore.
> 
> The probe lives directly in the tab selection handling code - if someone was
> going to introduce a spin of the event loop or any other slowness there,
> they're more likely to notice the implications of that.

You mean because FX_TAB_CLICK_MS would (unexpectedly?) improve?
(In reply to Dão Gottwald [:dao] from comment #17)
> You mean because FX_TAB_CLICK_MS would (unexpectedly?) improve?

That's one possibility. But I meant that someone looking at the mousedown handler's code will see the performance measurement being made, and therefore be made aware of the potential performance implications of refactoring how that works.

But this is getting highly speculative - the main reason for doing this isn't to avoid future versions of bug 743877, it's to have a more comprehensive (and more specific) tab-switching measurement than what we currently have. If someone were to add another "select" event handler, for example, our current telemetry wouldn't capture its impact on tab switching. This one will.
Comment on attachment 680948 [details] [diff] [review]
Patch v2

>diff --git a/toolkit/content/widgets/tabbox.xml b/toolkit/content/widgets/tabbox.xml

>     <handlers>
>       <handler event="mousedown" button="0">

>         if (this != this.parentNode.selectedItem) { // Not selected yet
>+          let stopwatchid = this.parentNode.getAttribute("stopwatchid");
>+          let tmp = {};
>+          Cu.import("resource:///modules/TelemetryStopwatch.jsm", tmp);
>+          let TelemetryStopwatch = tmp.TelemetryStopwatch;
>+
>+          if (stopwatchid) {
>+            TelemetryStopwatch.start(stopwatchid);
>+          }

You should only import the module if the stopwatchid is non-null. But even better, you can just put this module in a field:

<field name="TelemetryStopwatch" readonly="true">
  let tmp = {};
  Cu.import("resource:///modules/TelemetryStopwatch.jsm", tmp);
  tmp.TelemetryStopwatch;
</field>

and then just use this.TelemetryStopwatch.

r=me with that change made.
Attachment #680948 - Flags: review?(gavin.sharp) → review+
Attached patch PatchSplinter Review
Fixed: Put TelemetryStopwatch in a field.
Attachment #680948 - Attachment is obsolete: true
Attachment #686424 - Flags: review?(gavin.sharp)
Attachment #686424 - Flags: review?(gavin.sharp) → review+
Looks like TelemetryStopwatch.jsm is referenced by the wrong URI in this patch (should be "resource:///gre/modules/TelemetryStopwatch.jsm" instead of "resource:///modules/TelemetryStopwatch.jsm"), should I file a followup bug?
Yes please (CC me!)
Depends on: 817948
https://hg.mozilla.org/mozilla-central/rev/82982ebe08cd
Status: REOPENED → RESOLVED
Closed: 8 years ago7 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 20
You need to log in before you can comment on or make changes to this bug.