Open Bug 1675680 Opened 4 years ago Updated 6 months ago

[wayland] Move vsyncsource off the main thread

Categories

(Core :: Widget: Gtk, enhancement)

enhancement

Tracking

()

REOPENED

People

(Reporter: rmader, Unassigned)

References

(Blocks 1 open bug)

Details

So it's in line with other vsyncsources and does not get blocked if the main thread is busy. This should be relatively easy as the wl_surface is already used in a threadsafe manner.

I guess it might be relevant to give a quick explanation of why it's currently running on the main thread:

https://bugzilla.mozilla.org/show_bug.cgi?id=1641033

The main reason it was migrated away from the original vsync-thread based impl was that:

  1. The main thread is always going to be the one receiving the message, even if it moves it to a different threads' event queue.
  2. All vsync observers were just dispatching to other threads, including back the main thread (refresh drivers).
  3. Jumping between threads is not free, and adds jitter.

Before:

 Main     Vsync   Composite
----------------------------
   R                         Message received
   D   ->   R                Dispatching to vsync
   R   <-   D   ->    R      Vsync handler dispatches everything
   X                  X      Execution happens back on main/composite threads

D: Dispatch, R: Receive, X: Execute

The receive step is there to show the scheduling cost.

A lot of sending back and forth, and a lot of waiting on scheduling.

After:

 Main     Vsync   Composite
----------------------------
   R                         Message received
   DX  ->             R      Handler dispatches to composite, runs local
                      X      Composite runs

D: Dispatch, R: Receive, X: Execute

Less work, less scheduling, quicker execution.

Thanks Kenny, I was about to PM you and ask for exactly that. So if I understand correctly

  1. The main thread is always going to be the one receiving the message, even if it moves it to a different threads' event queue.

means that if it the main thread is busy or blocked we will not run anyway. Then this indeed doesn't make sense and would only increases the load.

Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → INVALID

(In reply to Robert Mader [:rmader] from comment #2)

Thanks Kenny, I was about to PM you and ask for exactly that. So if I understand correctly

  1. The main thread is always going to be the one receiving the message, even if it moves it to a different threads' event queue.

means that if it the main thread is busy or blocked we will not run anyway.

Yeah, our Wayland display is owned by Gtk/Gdk, which itself is dispatched on the main thread. The Wayland dispatch going on there is responsible for filling all event queues (including those we use on other threads), and dispatching the main event queue. We then have to wake up threads servicing our other event queues so they can call wl_dispatch_pending_queue.

Then this indeed doesn't make sense and would only increases the load.

Indeed. The only benefits to moving the queue dispatch to another thread (directly with alternate event queues, or indirectly by posting runnables), are avoiding possible blocking of the main thread, and to run callbacks on a specific thread/stack (e.g. mesa's internal queues). Neither seemed relevant.

(In reply to Kenny Levinsen :kennylevinsen from comment #3)

The only benefits to moving the queue dispatch to another thread (directly with alternate event queues, or indirectly by posting runnables), are avoiding possible blocking of the main thread, and to run callbacks on a specific thread/stack (e.g. mesa's internal queues). Neither seemed relevant.

The former is somewhat relevant in Firefox, because animations can run without the involvement of the main thread of the parent process. This applies to both animations that run on the compositor (CSS opacity and transform animations, as well as smooth scroll animations), and to animations that run on the main thread of the content processes, e.g. using requestAnimationFrame. And unfortunately, we do have cases in Firefox where the parent process main thread is blocked for noticeable periods of time, for example while devtools are open, or when some types of periodic background tasks run. (We don't want any of those to block the main thread, but these kinds of issues creep in anyway.)
It's arguable that a browser that animation jank is acceptable during times when the browser doesn't respond to user events anyway... but it's not what happens on Windows and macOS in Firefox today.

(In reply to Markus Stange [:mstange] from comment #4)

The former is somewhat relevant in Firefox, because animations can run without the involvement of the main thread of the parent process. This applies to both animations that run on the compositor (CSS opacity and transform animations, as well as smooth scroll animations), and to animations that run on the main thread of the content processes, e.g. using requestAnimationFrame. And unfortunately, we do have cases in Firefox where the parent process main thread is blocked for noticeable periods of time, for example while devtools are open, or when some types of periodic background tasks run. (We don't want any of those to block the main thread, but these kinds of issues creep in anyway.)
It's arguable that a browser that animation jank is acceptable during times when the browser doesn't respond to user events anyway... but it's not what happens on Windows and macOS in Firefox today.

I believe you are describing the inverse scenario of what I mentioned: A situation where the main thread is blocked by something else, starving us from the ability to dispatch.

In this case, there is nothing we can do—we won't receive any frame callbacks/vsync, no matter what thread we are trying to process it in.

The reason for that is that Gtk/Gdk dispatch is done strictly on the main thread. Gdk, owning the Wayland display connection, is responsible for reading messages into all known event queues, and for dispatching the main event queue. So, if we don't service Gdk on the main thread, we won't have anything to process.

(It is possible to read the display from other threads, but doing so takes a lock that would block Gdk in its poll setup until the read completed, so that's not really a solution. Likewise, I imagine moving Gtk to a different thread is also out of scope.)

(In reply to Kenny Levinsen :kennylevinsen from comment #5)

I believe you are describing the inverse scenario of what I mentioned: A situation where the main thread is blocked by something else, starving us from the ability to dispatch.

Yes. Oh, I understand now what potential scenario you were referring to originally: That the act of processing the vsync notification itself creates blockage on the main thread. I agree that that's not something we need to worry about.

Thanks for your input! It does sound like it's not a valuable use of our time to avoid the problems I described on Wayland.

I'm reopening this one as in theory we'd like to get it at some point. Maybe there's some way to make the multithreaded polling smart enough to not block each other somehow :/

Status: RESOLVED → REOPENED
Resolution: INVALID → ---
Blocks: wayland
No longer blocks: 1645528
See Also: → 1743144
You need to log in before you can comment on or make changes to this bug.