[meta] Photon - Make closing tabs fast

NEW
Assigned to

Status

()

enhancement
P3
normal
2 years ago
3 months ago

People

(Reporter: mconley, Assigned: mconley)

Tracking

(Depends on 2 bugs, Blocks 3 bugs, {meta, perf})

unspecified
Points:
---
Dependency tree / graph
Bug Flags:
qe-verify -

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [photon-performance] [gfx-noted] [qf:meta] [fxperf])

Attachments

(1 attachment)

Like, practically instantaneous. Specifics to follow.
Assignee

Updated

2 years ago
Blocks: 1336241
Assignee

Updated

2 years ago
Assignee: nobody → mconley

Comment 1

2 years ago
This is a profile of closing tabs, a whole bunch of them: https://perfht.ml/2m0gYxV

Sorting by JS only, bottoms up, permitUnload takes 4 seconds.  Next up, adjustTabStrip at 300ms.  After that, _handleTabTelemetryEnd at 248, presumably helping with performance?  Oh the irony. :(

We could be doing a lot better at not synchronously flushing layout all the time: https://perfht.ml/2m0orgp
A lot of this overhead seems to be coming from the fact that we scroll the tab bar in JS, which means we can't really do much better.  Markus, is there some way to get some help from the compositor to scroll the tab bar?

There is probably a lot more information to dig out from profiles here.  I stopped looking for more.
Flags: needinfo?(mstange)

Updated

2 years ago
Blocks: UIJank

Updated

2 years ago
Depends on: 1345315

Comment 2

2 years ago
(Note that if we could scroll the tab bar in the compositor, a lot of the JS cost would also presumably go away, such as adjustTabStrip for instance.)
The compositor cannot give us responsive interactive scrolling while the parent process main thread is busy. What it can give us is smooth scroll animations, if the whole animation can be predetermined by the main thread.

In the tab closing case, you're not actually scrolling the tab bar with the mouse wheel or anything. All that's happening there is a JS-initiated smooth scroll animation. Instead of running this animation frame by frame through JS, we could make the scrollbox JavaScript call a function that hands the smooth scroll off to the compositor.
For overflow:auto elements, element.scrollBy(0, 50, { behavior: "smooth" }) already runs the animation on the compositor, if I recall correctly - assuming the element has a display port.

The tab bar currently does not have a display port, because the scrollbox has overflow:hidden. So if we really want to give it a display port, we'll need to play a few tricks there.

One thing to note, though, is that tab bar painting can be expensive. CSS changes to it have frequently been noticed by Talos. So if we add a display port to it, we'll be painting more of it, and that might have a perf impact.
Flags: needinfo?(mstange)

Updated

2 years ago
Depends on: 816784
Whiteboard: [gfx-noted][qf:p1]

Updated

2 years ago
See Also: → 1260036
Whiteboard: [gfx-noted][qf:p1] → [photon] [gfx-noted] [qf:p1]
No longer blocks: UIJank
Depends on: 1354782
Depends on: 1354789
See Also: → 1355426
Depends on: 1355456
Whiteboard: [photon] [gfx-noted] [qf:p1] → [gfx-noted] [qf:p1]
Here is a profile of closing 2 or 3 empty tabs (about:newtab with the tiles disabled) on a slow machine: https://perfht.ml/2oznW1a
Flags: qe-verify-
Keywords: meta
Summary: Make closing tabs fast → [meta] Make closing tabs fast
Whiteboard: [gfx-noted] [qf:p1] → [photon-performance] [gfx-noted] [qf:p1]
Summary: [meta] Make closing tabs fast → [meta] Photon - Make closing tabs fast

Comment 5

2 years ago
In bug 1369140 I'm removing a reflow which was in the path of tab closing previously, but there is still a restyle left there.

I would appreciate if someone can profile tab closing on Nightly after that bug lands in detail a bit to see where the current issues are and what the current state looks like.  I'm specifically interested to know if bug 1369140 managed to make a significant dent in the performance of tab closing.
Depends on: 1369140
(In reply to :Ehsan Akhgari (needinfo please, extremely long backlog) from comment #5)
> In bug 1369140 I'm removing a reflow which was in the path of tab closing
> previously, but there is still a restyle left there.

What you did is very similar to what Enn had started in bug 1356034. It should help for a lot more cases than just tab closing (see the list of bugs blocked by bug 1356034).

> I would appreciate if someone can profile tab closing on Nightly after that
> bug lands in detail a bit to see where the current issues are and what the
> current state looks like.  I'm specifically interested to know if bug
> 1369140 managed to make a significant dent in the performance of tab closing.

It's a more complicated case, but I've done some tab closing profiling in bug 1356153 (and we've already made a lot of progress there).
Assignee

Updated

2 years ago
Whiteboard: [photon-performance] [gfx-noted] [qf:p1] → [photon-performance] [gfx-noted] [qf:meta]

Comment 7

2 years ago
Is there more to do in this bug?  Can we close it now?
(In reply to :Ehsan Akhgari (needinfo please, extremely long backlog) from comment #7)
> Is there more to do in this bug?  Can we close it now?

I have no opinion about the usefulness of this meta bug, but there's definitely more work to do related to tab closing speed. Especially we would like to make improvements related to beforeunload handling:
- when closing multiple tabs, close immediately the tabs that don't have a beforeunload listener
- when closing multiple tabs with beforeunload listeners that are in different content processes, send the messages to the content processes in parallel
- when closing multiple tabs with beforeunload listeners, consider closing tabs from left to right (so that the first tab we close triggers a user-visible change on screen)
- review the time we wait for a beforeunload reply and see if we can make some adjustments to it. Currently if a content process is unresponsible, we'll wait 2s per tab with a beforeunload listener. This can add up to a lot of time when closing hundreds of tabs.
Here is a quick test page to open quickly a mix of tabs without beforeunload listener, with fast (no op) beforeunload listeners, and with slow (3s busy loop) beforeunload listeners.
Keywords: perf
Priority: -- → P3

Comment 10

2 years ago
(In reply to Florian Quèze [:florian] [:flo] from comment #8)
> - review the time we wait for a beforeunload reply and see if we can make
> some adjustments to it. Currently if a content process is unresponsible,
> we'll wait 2s per tab with a beforeunload listener. This can add up to a lot
> of time when closing hundreds of tabs.

Should be 1s, which is parity with chrome, I think? (bug 1361650)
Where did the 2s come from? Are there other timeouts at play here?

We could collapse the request to close multiple tabs into 1 message to each content process, with inner window ids for the toplevel windows we're unloading, I guess, and wait 1s per process rather than per tab, which would be slightly better than the current state, but only when closing windows / multiple tabs...
Flags: needinfo?(florian)
(In reply to :Gijs (no reviews; PTO recovery mode) from comment #10)
> (In reply to Florian Quèze [:florian] [:flo] from comment #8)
> > - review the time we wait for a beforeunload reply and see if we can make
> > some adjustments to it. Currently if a content process is unresponsible,
> > we'll wait 2s per tab with a beforeunload listener. This can add up to a lot
> > of time when closing hundreds of tabs.
> 
> Should be 1s, which is parity with chrome, I think? (bug 1361650)
> Where did the 2s come from? Are there other timeouts at play here?

I can't find a 2s value anymore, so let's assume I was confused at the time I wrote this.

The timer I see currently is http://searchfox.org/mozilla-central/rev/423b2522c48e1d654e30ffc337164d677f934ec3/toolkit/content/widgets/remote-browser.xml#343 (indeed a 1s timer). But I'm not sure it'll actually fire, as I don't see anything keeping a reference to this timer object; isn't there a risk of it being cancelled by the GC without ever firing?

> We could collapse the request to close multiple tabs into 1 message to each
> content process, with inner window ids for the toplevel windows we're
> unloading, I guess, and wait 1s per process rather than per tab, which would
> be slightly better than the current state, but only when closing windows /
> multiple tabs...

This would be much better, it should put a 1s upper bound to the time it takes to close multiple tabs.
Flags: needinfo?(florian)

Comment 12

2 years ago
(In reply to Florian Quèze [:florian] [:flo] from comment #11)
> (In reply to :Gijs (no reviews; PTO recovery mode) from comment #10)
> > (In reply to Florian Quèze [:florian] [:flo] from comment #8)
> > > - review the time we wait for a beforeunload reply and see if we can make
> > > some adjustments to it. Currently if a content process is unresponsible,
> > > we'll wait 2s per tab with a beforeunload listener. This can add up to a lot
> > > of time when closing hundreds of tabs.
> > 
> > Should be 1s, which is parity with chrome, I think? (bug 1361650)
> > Where did the 2s come from? Are there other timeouts at play here?
> 
> I can't find a 2s value anymore, so let's assume I was confused at the time
> I wrote this.
> 
> The timer I see currently is
> http://searchfox.org/mozilla-central/rev/
> 423b2522c48e1d654e30ffc337164d677f934ec3/toolkit/content/widgets/remote-
> browser.xml#343 (indeed a 1s timer). But I'm not sure it'll actually fire,
> as I don't see anything keeping a reference to this timer object; isn't
> there a risk of it being cancelled by the GC without ever firing?

My understanding is that the event loop spinning a few lines down means that as far as the JS engine is concerned, we never leave that block before either we have an answer or the timer has fired. As a result, I wouldn't expect that to happen - the local var is in a function that's still on the stack, so shouldn't get GC'd.

> > We could collapse the request to close multiple tabs into 1 message to each
> > content process, with inner window ids for the toplevel windows we're
> > unloading, I guess, and wait 1s per process rather than per tab, which would
> > be slightly better than the current state, but only when closing windows /
> > multiple tabs...
> 
> This would be much better, it should put a 1s upper bound to the time it
> takes to close multiple tabs.

Well, thinking about this some more... I don't know how we'd deal with this on the child process side. That is, if the parent sent us a message saying "close windows 5, 6 and 7", and we try to run beforeunload in 5, and it hangs - do we not run beforeunload in tabs 6 or 7 as a consequence? That seems... not great. Of course, because the child process' DOM stuff is all on 1 thread, we can't run the beforeunloads for multiple tabs "simultaneously". :-\
(In reply to :Gijs (no reviews; PTO recovery mode) from comment #12)
> That is, if the parent sent us a message saying
> "close windows 5, 6 and 7", and we try to run beforeunload in 5, and it
> hangs - do we not run beforeunload in tabs 6 or 7 as a consequence? That
> seems... not great.

I think that's what we already do to some extent (and yes, it's not great).

Currently if a content page triggers an infinite loop in a beforeunload listener, we will never close it (unless the user clicks the 'X' again on the tab). This is because 'responded' is set to true at http://searchfox.org/mozilla-central/rev/423b2522c48e1d654e30ffc337164d677f934ec3/toolkit/content/widgets/remote-browser.xml#310 causing the timeout to have no effect.

On the other hand, if we send a 'PermitUnload' message to a busy content process that doesn't reply within a second that it is processing the message, we'll kill the tab after one second, without the page having ever received the 'beforeunload' event.

See also the code at http://searchfox.org/mozilla-central/rev/423b2522c48e1d654e30ffc337164d677f934ec3/toolkit/content/browser-child.js#617
Whiteboard: [photon-performance] [gfx-noted] [qf:meta] → [photon-performance] [gfx-noted] [qf:meta] [fxperf]
Depends on: 1438989

Updated

a year ago
Blocks: 1451985
You need to log in before you can comment on or make changes to this bug.