Closed Bug 1365574 Opened 3 years ago Closed 2 years ago

When opening devtools there is some heavy Promise usage in the parent process causing jank

Categories

(DevTools :: General, defect, P2)

defect

Tracking

(firefox59 fixed)

RESOLVED FIXED
Firefox 59
Tracking Status
firefox59 --- fixed

People

(Reporter: smaug, Assigned: ochameau)

References

(Blocks 2 open bugs)

Details

(Keywords: perf, Whiteboard: [qf:p3])

Attachments

(2 files)

Promise handling is synchronous from browser point of view. Could devtools perhaps use requestIdleCallback or even timers to reduce jank.
And perhaps the work callbacks do could be split to smaller pieces?

https://perfht.ml/2rqcrHE
Whiteboard: [qf] → [qf:p3]
Keywords: perf
This bug isn't easily actionnable. It mostly highlights overall slowness/complexity of DevTools.

I've been looking at BHR reports and narrowed the longest hangs down to netmonitor codebase.
Your bug 1403884 was more actionnable and we made some progress on that.

Now, comment 0 highlights much smaller hangs, which I'm not sure I saw in BHR yet (but netmonitor may pollute the data).
Your profile is around toolbox opening. In it I see a lot of module loading, which is synchronous.
I have been reviewing toolbox opening and tried to lazy load all what we can delay loading. There may be new modules that landed in the meantime...

Let's look at one particular frame to see if there anything we can do:
  https://perfht.ml/2Ai7jgF
Here we synchronously call _mountReactComponent (which is slow as it loads many big modules, including react)
and then call setCanRender, which creates the React components and render them.
It all happens here:
  https://searchfox.org/mozilla-central/source/devtools/client/framework/toolbox.js#471,507

Do you suggest adding a setTimeout(0) before calling setCanRender?
I don't know requestIdleCallback, but it looks like it would delay rendering a lot?
We don't want to slow down toolbox opening/rendering either.

While we are discussing about optimizing the threading of toolbox loading, note that I tried doing some smart parallelization between our parent and process codes in bug 1397341, but unfortunately it is quite unreliable.
requestIdleCallback(callback, {timeout: 16}) with some small timeout might be better than setTimeout. That would let us still process user input and not block the next refreshdriver tick and such, but not postpone devtools loading too much.

But then, if it is _mountReactComponent which is slow, that all wouldn't really help much.
Any reason to use 16, rather than anything else?

_mountReactComponent takes 47ms while setCanRender takes 30ms.
So you could easily say that both are slow.
But your conclusion from comment 0 is that we synchronously pile up too many slow operations, which end up probably introducing some hangs. So instead of having a 77ms synchronous step we would have these two, seperated by the idle callback.
profile with current tip:
  https://perfht.ml/2irGLTx
  236ms event processing delay
  mountReact: 80ms
  open/setCanRender: 56ms

With toolbox patch:
  https://perfht.ml/2AqGnLM
  Two event processing delays: 188ms and 79ms
  mountReact: 123ms
  open/setCanRender: 45ms

So if we add a bunch of idle callback at critical places, we may shorten the event loop pauses.
With just this one, it is hard to see any difference. Will it have a real impact if we add many?
Surprisingly it seems to only impact netmonitor:
https://treeherder.mozilla.org/perf.html#/comparesubtest?originalProject=try&originalRevision=473add0ba842b8069a0f0a9a5143e8e337815b3d&newProject=try&newRevision=0d21cc3237044cd25bd3b45b93a26379ef6ea4e5&originalSignature=edaec66500db21d37602c99daa61ac983f21a6ac&newSignature=edaec66500db21d37602c99daa61ac983f21a6ac&showOnlyImportant=1&framework=1
 + 34% on simple.netmonitor.reload (reloading the simple document is slower?!)
 + 12% on simple.netmonitor.requestsFinished (updating the netmonitor during page reload is slower)

I'll push again to see if this is consistent.
Ok, last try push was wonfusing as some significant changes landed on m-c recently.
With a custom base run, we should be more confident about Talos results.
https://treeherder.mozilla.org/perf.html#/comparesubtest?originalProject=try&originalRevision=2b667a68c6ef4a781b5f04675c96fd0a7e705074&newProject=try&newRevision=cee35a8541733c4f8744862bf698ad39b3be9fff&originalSignature=edaec66500db21d37602c99daa61ac983f21a6ac&newSignature=edaec66500db21d37602c99daa61ac983f21a6ac&showOnlyImportant=1&framework=1
It now reports a -2.8% win on simple.netmonitor.open (opening the netmonitor against an empty document is faster)
and possible an even smaller win on complicated.netmonitor.open.

It makes more sense as it impact toolbox load time. The only surprise is that it seems to only impact netmonitor.
And also, it makes it load faster, whereas I would imagine this patch could only make the toolbox load slightly slower because of an eventual delay due to requestIdleCallback.
Attached image idleCallback.png
Here is a screenshot to illustrate this change.
It clearly split the early toolbox loading synchronous code into two chunks.
So that latter should be quite a bit less janky, I assume. But since it is still janky, I don't know which one is better for UX - one long pause or two a bit shorter ones.
It would be nice to have a better FPS metric to optimize for. If something takes an equivalent amount of time but produces more frames, it probably feels better and more responsive for the end user.
My opinion is that it would be fine to take on a bit of complexity in order to split up our chunks of work into when we are idle, especially since DevTools is in the parent process. We don't really have any graphs that I'm aware of that optimize for FPS, but the user perception of jank is probably higher when things are unresponsive, compared to when things take longer.

In the future, as we move to React's new scheduler, this type of async requestIdleCallback work will be built into the renderer for React. I'd be curious to see how this affects this startup performance.
Thanks for the feedback Greg.

Any idea of any platform API to get a live FPS count? I think the old performance panel used to compute FPS.
Or any tool to make efficient screen record? Or any idea of pratice to compare patches?
The old perf panel uses a requestAnimationFrame loop to compute FPS. I don't know how reasonnable this is...
The requestAnimationFrame loop skews the results from what I understand, as it actually schedules frame updates when none are potentially needed. It might make sense to implement something like that for a single content page, but it will skew the results from a real world application. The RefreshDriverTick is marker we would care about, as well as the Composite event from the Compositor thread. but as far as I know we don't have any way to know when it's idle vs it's being affected by jank. This would be worth having some conversations with people on the graphics team to figure out if we can create some kind of metric, but I'm not aware of an easy numeric solution as of yet, other than to guess where things seemed blocked.
Dropped frames could be a metric but is much better suited for ensuring quality animations (vs an interactive interface). Long tasks (bug 1348405), which penalize tasks longer than 50ms, might work better here and can be aggregated as long-tasks-ms/minute or #-of-hangs-during-load. It can be polyfilled as well (while the linked bug is stuck in backlog).
Priority: -- → P2
Comment on attachment 8933351 [details]
Bug 1365574 - Release event loop between react module loading and toolbox react rendering.

I know how to track one specific stack trace on BHR, so I would like to land this during end of year. Then with one or two weeks of data, we can see if such patch had any impact on BHR reports.
Assignee: nobody → poirot.alex
For the record, the BHR reports for toolbox.js were quite stable over a couple of months:
  https://screenshots.firefox.com/SwBariLzmM1IOhkP/localhost
Until Dec 17th/18th, but sometimes there is big spikes in BHR and everything settles again the next day.
Comment on attachment 8933351 [details]
Bug 1365574 - Release event loop between react module loading and toolbox react rendering.

https://reviewboard.mozilla.org/r/204266/#review214870

::: devtools/client/framework/toolbox.js:507
(Diff revision 1)
>  
>        // Start rendering the toolbox toolbar before selecting the tool, as the tools
>        // can take a few hundred milliseconds seconds to start up.
>        // But wait for toolbar buttons to be set before updating this react component.
>        buttonsPromise.then(() => {
> +        this.win.requestIdleCallback(() => {

An extra comment might be needed here.
This function will slowly get to a point where there is more comment than code, but hey :) that's fine, it's a crucial part and we need to make sure future maintainers understand the importance of the order of things here.
Also, the comment could explain why 16ms below.
Attachment #8933351 - Flags: review+
Pushed by apoirot@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/d0d8308550d8
Release event loop between react module loading and toolbox react rendering. r=pbro
https://hg.mozilla.org/mozilla-central/rev/d0d8308550d8
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 59
Product: Firefox → DevTools
You need to log in before you can comment on or make changes to this bug.