Closed Bug 1405585 Opened 7 years ago Closed 7 years ago

Add telemetry to track toolbox/tool refresh time on page reload

Categories

(DevTools :: General, enhancement, P2)

enhancement

Tracking

(firefox57 fix-optional, firefox58 fixed)

RESOLVED FIXED
Firefox 58
Tracking Status
firefox57 --- fix-optional
firefox58 --- fixed

People

(Reporter: ochameau, Assigned: ochameau)

References

(Blocks 2 open bugs)

Details

Attachments

(2 files, 1 obsolete file)

When a web developer reload his page via F5 or any other browser UI, and DevTools were already opened, all panels update to the new page content.
This is know to be slow in some cases and we should track what users experience.


Now, knowing when a tool is done updating it quite challenging. We don't know upfront how many scripts the page have and how many we should wait in the debugger.
Same for console message, DOM elements, network requests...
I had to implement such wait in DAMP and the most secure way to listen for panel reload was to wait for the explicit number of content to be loaded in each panel:
http://searchfox.org/mozilla-central/rev/a4702203522745baff21e519035b6c946b7d710d/testing/talos/talos/tests/devtools/addon/content/damp.js#329-335,344-354,380-391

We may be able to have more generic algorithm to listen for panel reload,
but we should not introduce too many unecessary computation just for the telemetry probe...
Also, doing this correctly may require going deep down in each tool to better wait for all asynchronous events. bug 1402846 is a good example on how hard it is to correctly wait for things to be processed in the inspector.
Priority: -- → P2
Ok, so here is a patch just for the inspector as I think that's the easiest tool to support.
I've let some debug code to help review this on an artifact build.
It may not capture 100% of markup view updates, but it seems quite decent.
The debug screenshot shows an almost fully loaded inspector, I was only able to see some minor last minute update made to the breadcrumb.

I would like to then followup with one patch per tool.
Regarding telemetry I bumped the max value to 2 minutes. But given that the time will be relative to webpage size and network speed, I'm wondering if that is enough?
Also given that, we may also want to add some other probes to help filter out slow connections.
Comment on attachment 8919919 [details]
Bug 1405585 - Add telemetry to track inspector refresh time when reload a page.  datareview=francois

https://reviewboard.mozilla.org/r/190848/#review196264

Thanks for the patch! Clearing the review for now as I'd like to understand more what we are measuring here.

Some questions:

We are measuring the reload performance of the inspector even if the tool is in the background. 
Do we want to do that? If the tool is hidden, its performances might be a bit better, especially if we 
start implementing performance improvements for hidden tools.
  
Quickly testing locally I'm getting results between 50 ms and several seconds, with everything in between. 
The variance seems quite big. I can see that the probe is currently including the navigation time of the page.
Are we sure we want to start measuring in onBeforeNavigate rather than in onNewRoot? I feel like measuring
onNewRoot -> markupLoaded would be more relevant to DevTools, but maybe it's easier to say we start measuring
at will-navigate for all the tools?

Related, from an implementation standpoint:
If we stick to will-navigate for all tools, maybe the toolbox should be responsible for this.
On will-navigate, measure the startTime, then wait for the "reload" event from the current tool. Log to telemetry when 
receiving the reload event. This way tools are just responsible for implementing the event, and we are not measuring the time
for hidden tools.
Attachment #8919919 - Flags: review?(jdescottes)
(In reply to Julian Descottes [:jdescottes] from comment #3)
> We are measuring the reload performance of the inspector even if the tool is
> in the background.

Good point, I imagine we don't want that.
Measuring that may help understand why we are slow in some cases. But it should at least be segregated in its own probe.
So let's just not record background panels and come back to that if we have issues understanding the main probe.

> Quickly testing locally I'm getting results between 50 ms and several
> seconds, with everything in between. 
> The variance seems quite big. I can see that the probe is currently
> including the navigation time of the page.

I think we do want to measure from navigation start. As tools may have impact from the very start of this process.
(It will be different for each tool)
But we would obviously need more. Detailed steps during tool updates.
A good example applied to toolbox opening telemetry would be to record empty/blank panel timing.

> Are we sure we want to start measuring in onBeforeNavigate rather than in
> onNewRoot? I feel like measuring
> onNewRoot -> markupLoaded would be more relevant to DevTools, but maybe it's
> easier to say we start measuring
> at will-navigate for all the tools?

I believe such measurement is going to be very useful, but should be complementary.
And be specific to the inspector. I imagine we will have to add others as well.
Like number of DOM Elements/rules/... being displayed. Just to understand where are the slow patterns.
But all this should be complementary to the generic probe which is:
"How much time do webdevs stare at the panels after hitting F5".

> Related, from an implementation standpoint:
> If we stick to will-navigate for all tools, maybe the toolbox should be
> responsible for this.
> On will-navigate, measure the startTime, then wait for the "reload" event
> from the current tool. Log to telemetry when 
> receiving the reload event. This way tools are just responsible for
> implementing the event, and we are not measuring the time
> for hidden tools.

That's a good suggestion! Only toolbox will do the telemetry thing and it would help ensure this probe is really generic to all tools. It will also be easier for DAMP to also track that, and compare it to our hardcoded waiting functions!
We discussed this a bit on IRC. I agree that we'll also need other probes to get more data.
The one concern I have is the difficulty we'll have to interpret the results of this particular one since it also includes the server round-trip time which is outside our control. I'm just scared this component will skew the results and make them hard to interpret.
Not a reason for not shipping it though. The more data we have, the better.
Here is new patches.

This first is the same than before, except that now the telemetry is handled by toolbox and the inspector only emit a "reloaded" event.
The second tracks what Julian suggested. Time between new-root event and full inspector update.


(In reply to Patrick Brosset <:pbro> from comment #5)
> We discussed this a bit on IRC. I agree that we'll also need other probes to
> get more data.
> The one concern I have is the difficulty we'll have to interpret the results
> of this particular one since it also includes the server round-trip time
> which is outside our control. I'm just scared this component will skew the
> results and make them hard to interpret.
> Not a reason for not shipping it though. The more data we have, the better.

My point with this probe is that it will really reflect what developers see.
Even the refined probe, waiting at new-root start will also depend on page load time.
So unless, having only very small scoped measurements, we will always get the noise of loading time.
And also, I would like this probe to be common between tools and for other tools like netmonitor,
we impact the page load immediately, right when navigation starts.

My intuition on this is that if we make significant changes,
this should make page loading itself faster and be visible over telemetry trends.
For example, netmonitor, inspector and debugger are known to slow down website very significantly.
I hope this probe is going to help confirm fixes like bug 1366693, bug 1171482 and bug 1404913.
I'm starting to wonder if we should also log page load itself (between start navigation and DOMContentLoaded/load events).


My plan is to push various probes to Nightly and revisit them after we get some.
May be add more and remove the useless one.
Comment on attachment 8919919 [details]
Bug 1405585 - Add telemetry to track inspector refresh time when reload a page.  datareview=francois

https://reviewboard.mozilla.org/r/190848/#review197718

Looks good! Just one small question before I R+.

::: devtools/client/framework/toolbox.js:1999
(Diff revision 2)
> +      return;
> +    }
> +
> +    let start = this.win.performance.now();
> +    // The panel may still be loading
> +    let panel = await this.getPanelWhenReady(toolId);

Do we actually want to measure this if the panel is not loaded?

First, I'm scared about race conditions here. We are waiting for a "ready" event, 
but we have no guarantee it will be fired before the "reloaded" event.

Second, I don't think it would dramatically hurt the data to discard the measure 
if getPanel(toolId) is null.

What do you think?

::: devtools/client/inspector/inspector.js:896
(Diff revision 2)
>        this.highlighters.restoreShapeState()
>      ]);
>  
>      this.emit("new-root");
> +
> +    yield onExpand;

Let's just add a comment here to explain why we want to wait for this event to fire the reloaded event.
Attachment #8919919 - Flags: review?(jdescottes)
Comment on attachment 8921384 [details]
Bug 1405585 - Add telemetry to watch inspector delay between new-root event and full update.  datareview=francois

https://reviewboard.mozilla.org/r/192412/#review197728

Looks good, will be interesting to compare the two probes.

::: devtools/client/inspector/inspector.js:906
(Diff revision 1)
>      this.emit("reloaded");
> +
> +    // Record the time between new-root event and inspector fully loaded.
> +    if (this._newRootStart) {
> +      // Only log the timing when inspector is in foreground.
> +      if (this.toolbox.currentToolId  == "inspector") {

nit (eslint): 2 spaces

::: toolkit/components/telemetry/Histograms.json:8421
(Diff revision 1)
>      "high": 120000,
>      "n_buckets": 100,
>      "keyed": true,
>      "description": "Time taken (in ms) to update DevTools panel when reloading a page. This is keyed by tool ID being currently opened [inspector, webconsole, jsdebugger, styleeditor, shadereditor, canvasdebugger, performance, memory, netmonitor, storage, webaudioeditor, scratchpad, dom]."
>    },
> +  "DEVTOOLS_INSPECTOR_NEW_ROOT_TO_RELOAD_DELAY_MS": {

should be flagged for datareview?
Attachment #8921384 - Flags: review?(jdescottes) → review+
(In reply to Julian Descottes [:jdescottes] from comment #9)
> ::: devtools/client/framework/toolbox.js:1999
> (Diff revision 2)
> > +      return;
> > +    }
> > +
> > +    let start = this.win.performance.now();
> > +    // The panel may still be loading
> > +    let panel = await this.getPanelWhenReady(toolId);
> 
> Do we actually want to measure this if the panel is not loaded?
> 
> Second, I don't think it would dramatically hurt the data to discard the
> measure 
> if getPanel(toolId) is null.
> 
> What do you think?

Yes, that sounds fine. It may help ignore unexpected scenarios.
Comment on attachment 8919919 [details]
Bug 1405585 - Add telemetry to track inspector refresh time when reload a page.  datareview=francois

Francois, this keyed histogram telemetry records the time it takes to update each DevTools panel during page reload. Could you do the data review of this?
Attachment #8919919 - Flags: review?(francois)
Comment on attachment 8921384 [details]
Bug 1405585 - Add telemetry to watch inspector delay between new-root event and full update.  datareview=francois

This histogram is specific to the Inspector DevTools panel and records the time it takes to update itself during page reload. (Compared to the other patch from this bug, it records a more specific internal step within Inspector update process)
Attachment #8921384 - Flags: review?(francois)
Assignee: nobody → poirot.alex
Comment on attachment 8921893 [details]
Bug 1405585 - Add telemetry to track console refresh time when reload a page.

BTW, here is a test document to better test this patch:
data:text/html,<script>for(let i=0;i<1000;i++)console.log("foo", i)</script>
Comment on attachment 8919919 [details]
Bug 1405585 - Add telemetry to track inspector refresh time when reload a page.  datareview=francois

https://reviewboard.mozilla.org/r/190848/#review198180

Looks good, thanks for addressing the comments!
Attachment #8919919 - Flags: review?(jdescottes) → review+
> My point with this probe is that it will really reflect what developers see.

Adding my perspective for posterity. This probe will be very noisy as it depends on page load performance. There is also no good baseline of page load timings, as users are more likely to refresh on slow development environments; not comparable to loading behavior without tools.

But since we know that we have massive refresh issues, this probably will still work for now. So, LGTM.
(In reply to :Harald Kirschner :digitarald from comment #19)
> > My point with this probe is that it will really reflect what developers see.
> 
> Adding my perspective for posterity. This probe will be very noisy as it
> depends on page load performance. There is also no good baseline of page
> load timings, as users are more likely to refresh on slow development
> environments; not comparable to loading behavior without tools.
> 
> But since we know that we have massive refresh issues, this probably will
> still work for now. So, LGTM.

Do you know if we have any telemetry (or from any other source) data about page reload?
(outside of devtools context)

Depending on how the telemetry results look like, it may be worth looking at page reload time.
Comment on attachment 8919919 [details]
Bug 1405585 - Add telemetry to track inspector refresh time when reload a page.  datareview=francois

https://reviewboard.mozilla.org/r/190848/#review198242

This is Category 1 data.

datareview+
Attachment #8919919 - Flags: review?(francois) → review+
Comment on attachment 8921384 [details]
Bug 1405585 - Add telemetry to watch inspector delay between new-root event and full update.  datareview=francois

https://reviewboard.mozilla.org/r/192412/#review198244

Category 1 data.

datareview+
Attachment #8921384 - Flags: review?(francois) → review+
Comment on attachment 8921893 [details]
Bug 1405585 - Add telemetry to track console refresh time when reload a page.

https://reviewboard.mozilla.org/r/192932/#review198468

::: devtools/client/webconsole/new-console-output/new-console-output-wrapper.js:306
(Diff revision 1)
> +      if (this.throttleCallback) {
> +        this.throttleCallback();
> +        delete this.throttleCallback;
> +      }

since in the end `this.throttleCallback` existence is tied to `newConsoleOutput.throttledDispatchTimeout` , and given we only want to emit the "reloaded" event,  couldn't we check directly for `newConsoleOutput.throttledDispatchTimeout` instead of adding and removing a function ?


It might be a really small thing, but `delete` is known to be slow, so if we could avoid doing this, it would be nice

::: devtools/client/webconsole/new-webconsole.js:333
(Diff revision 1)
> +
> +    if (event == "navigate") {
> +      let timeout = this.newConsoleOutput.throttledDispatchTimeout;
> +      if (timeout) {
> +        this.newConsoleOutput.throttleCallback = () => {
> +          let delay = this.window.performance.now() - this._navigationStart;

`delay` is never used

::: devtools/client/webconsole/new-webconsole.js:335
(Diff revision 1)
> +          // TODO - REMOVE DEBUG CODE
> +          let window = this.window;
> +          let canvas = window.document.createElementNS("http://www.w3.org/1999/xhtml", "html:canvas");
> +          let context = canvas.getContext("2d");
> +          canvas.width = window.innerWidth;
> +          canvas.height = window.innerHeight;
> +          context.drawWindow(window, 0, 0, canvas.width, canvas.height, "rgb(255, 255, 255)");
> +          dump(">> "+canvas.toDataURL()+"\n");
> +          // TODO - REMOVE DEBUG CODE

looks like it should be removed. 
Also, I'm curious, what are you using this for ?

::: devtools/client/webconsole/panel.js:84
(Diff revision 1)
> +        // Pipe 'reloaded' event from NewWebConsoleFrame to WebConsolePanel.
> +        // These events are listened by the Toolbox.
> +        this.hud.ui.on("reloaded", () => {
> +          this.emit("reloaded");
> +        });

can't we directly emit from the panel object (if we can have the reference in the console code), or listen to `hud.ui` in the toolbox ?
Blocks: 1411887
Blocks: 1411888
Blocks: 1411889
I'm going to land the first set of patches against the inspector and keep the console one for bug 1411887.

Green try:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=e3c52172875eb90c36660014ce69667b205fec95&selectedJob=139567380
Attachment #8921893 - Attachment is obsolete: true
Attachment #8921893 - Flags: review?(nchevobbe)
Pushed by apoirot@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/a049959544f2
Add telemetry to track inspector refresh time when reload a page. r=francois,jdescottes datareview=francois
https://hg.mozilla.org/integration/autoland/rev/d2f7f0e9554d
Add telemetry to watch inspector delay between new-root event and full update. r=francois,jdescottes datareview=francois
Backed out for failing browser-chrome's browser/components/extensions/test/browser/browser_ext_devtools_panels_elements.js:

https://hg.mozilla.org/integration/autoland/rev/f076a61f4c51b4a44b72b2dcb89fb769ec3b2878

Push with failures: https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=d2f7f0e9554d248a7c23a082faefedef6f84b023&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=retry&filter-resultStatus=usercancel&filter-resultStatus=runnable
Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=139806641&repo=autoland

[task 2017-10-26T09:48:16.981Z] 09:48:16     INFO - TEST-PASS | browser/components/extensions/test/browser/browser_ext_devtools_panels_elements.js | Got the expected onSelectionChanged once the tab is navigating - 
[task 2017-10-26T09:48:16.981Z] 09:48:16     INFO - TEST-PASS | browser/components/extensions/test/browser/browser_ext_devtools_panels_elements.js | Got the expected onSelectionChanged once the tab has been completely reloaded - 
[task 2017-10-26T09:48:16.982Z] 09:48:16     INFO - Buffered messages finished
[task 2017-10-26T09:48:16.983Z] 09:48:16     INFO - TEST-UNEXPECTED-FAIL | browser/components/extensions/test/browser/browser_ext_devtools_panels_elements.js | A promise chain failed to handle a rejection: TypeError: this.toolbox is null - stack: Inspector.prototype.onMarkupLoaded<@chrome://devtools/content/inspector/inspector.js:910:1
Flags: needinfo?(poirot.alex)
Pushed by apoirot@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/1d05642ebd11
Add telemetry to track inspector refresh time when reload a page. r=francois,jdescottes datareview=francois
https://hg.mozilla.org/integration/autoland/rev/4b056eaa8bb3
Add telemetry to watch inspector delay between new-root event and full update. r=francois,jdescottes datareview=francois
(In reply to Sebastian Hengst [:aryx][:archaeopteryx] (needinfo on intermittent or backout) from comment #26)
> Backed out for failing browser-chrome's
> browser/components/extensions/test/browser/
> browser_ext_devtools_panels_elements.js:

Fix that via https://reviewboard.mozilla.org/r/192412/diff/2-3/
(consider only inspector.js changes, everything else is a rebase)

It looks like browser_ext_devtools_panels_elements.js is imperfect. It close the toolbox while the inspector is still loading. I tried to change its markuploaded listener with an inspector-updated, but its assertion fails if I do so...
Flags: needinfo?(poirot.alex)
Product: Firefox → DevTools
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: