Closed Bug 1716049 Opened 3 years ago Closed 2 years ago

GLX/Nvidia: Firefox lags with video playing in background

Categories

(Core :: Graphics: WebRender, defect)

Firefox 89
x86_64
Linux
defect

Tracking


RESOLVED FIXED
Tracking Status
firefox-esr78 --- disabled
firefox-esr91 --- wontfix
firefox91 --- wontfix
firefox92 --- wontfix
firefox93 --- wontfix
firefox96 --- wontfix
firefox97 --- wontfix
firefox98 --- verified disabled
firefox110 --- wontfix
firefox111 --- disabled

People

(Reporter: 0fwv42r5o, Unassigned)

References

(Blocks 2 open bugs)

Details

Attachments

(3 files)

User Agent: Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0

Steps to reproduce:

I'm using Firefox 89.0 on Fedora Linux 33, with Nvidia proprietary video drivers version 465.31. This problem can be worked around for now by disabling WebRender in about:config with "gfx.webrender.force-disabled".

  1. Open a YouTube video on Firefox 89.0
  2. Open a new window with the YouTube video playing in the background

Actual results:

Typing and scrolling in the new window are very slow, with long freezes

Expected results:

There should be no lag

The Bugbug bot thinks this bug should belong to the 'Core::Graphics: WebRender' component, and is moving the bug to that component. Please revert this change in case you think the bot is wrong.

Component: Untriaged → Graphics: WebRender
Product: Firefox → Core

A video of the bug

Can you attach your about:support?

Flags: needinfo?(0fwv42r5o)
Attached file about:support
(In reply to Jeff Muizelaar [:jrmuizel] from comment #3)

Can you attach your about:support?

Can you get profiles using https://profiler.firefox.com/ using the "Firefox Graphics" setting in both configurations? (WebRender enabled and WebRender disabled)?

Summary: Firefox 89 lags on Linux with video playing in background → Firefox 89 lags on Linux with video playing in background (sw-wr)

(In reply to Jeff Muizelaar [:jrmuizel] from comment #5)

Can you get profiles using https://profiler.firefox.com/ using the "Firefox Graphics" setting in both configurations? (WebRender enabled and WebRender disabled)?

WebRender enabled https://share.firefox.dev/3cTENnV
WebRender disabled https://share.firefox.dev/35cnvxP

Flags: needinfo?(0fwv42r5o)

The WebRender enabled profile shows us spending large amounts of time in the nvidia driver with the following stack:

__GI___sched_yield
nv044glcore
nv044glcore
nv003glcore
nv015glcore
nv014glcore
XMaxRequestSize
gldbc3cfnX
XError
glXQueryContext
mozilla::gl::GLContextGLX::MakeCurrentImpl() const
mozilla::gl::GLContext::MakeCurrent(bool) const
mozilla::wr::RendererOGL::Update()
mozilla::wr::RenderThread::UpdateAndRender(mozilla::wr::WrWindowId, mozilla::layers::BaseTransactionId<mozilla::VsyncIdType> const&, mozilla::TimeStamp const&, bool, mozilla::Maybe<mozilla::gfx::IntSizeTyped<mozilla::gfx::UnknownUnits> > const&, mozilla::Maybe<mozilla::wr::ImageFormat> const&, mozilla::Maybe<mozilla::Range<unsigned char> > const&, bool*)
mozilla::wr::RenderThread::HandleFrameOneDoc(mozilla::wr::WrWindowId, bool)
mozilla::detail::RunnableMethodImpl<mozilla::wr::RenderThread*, void (mozilla::wr::RenderThread::*)(mozilla::wr::WrWindowId, bool), true, (mozilla::RunnableKind)0, mozilla::wr::WrWindowId, bool>::Run()
MessageLoop::DoWork()
base::MessagePumpDefault::Run(base::MessagePump::Delegate*)
MessageLoop::Run()
base::Thread::ThreadMain()
(root)

Maybe Arthur can make a guess as to why this would be?

Flags: needinfo?(ahuillet)

This stack doesn't fully make sense to me. The only guess I can offer is that Mozilla might be calling glXQueryContext with an invalid context, which will do a round trip to the X server and return an error.
XMaxRequestSize appears to be called in our driver only as part of glXCreateContext, so I assume a context is being created each frame, which would explain the disastrous performance (?).
Not a very productive guess I fear, but maybe it can give you pointers.

Is there a way to obtain an Apitrace of the commands here, or failing that, debug this locally? I would need to attach gdb to the GL process.
Thanks

Here's our implementation of MakeCurrentImpl: https://searchfox.org/mozilla-central/rev/4c06787a227b9f46ae22b70611f1213891d72e03/gfx/gl/GLContextProviderGLX.cpp#631

It doesn't seem to do anything unusual.

Arthur, are you able to try to reproduce the problem on similar hardware with the same driver?

Summary: Firefox 89 lags on Linux with video playing in background (sw-wr) → Firefox 89 lags on Linux with video playing in background (nvidia binary)

When WebRender is enabled, RenderCompositorOGL and GLContextGLX are used. RenderCompositorOGL creates a GLContextGLX for each window. I wonder if that might be related to the problem. If RenderCompositorEGL is used, only one GLContextEGL is created for all windows.

But X11 EGL is blocked by Bug 1689464. It does not work well with some nvidia binary drivers.

:rmader, can you comment on this bug?

Flags: needinfo?(robert.mader)

Odd. We have seen similar issues in other setups, however I'm not aware of a reason why this should happen on X11 on an always-composited WM (GNOME). Once bug 1646135 lands, the EGL backend should work on NV; let's try again then and see if it fixes the issue.

Flags: needinfo?(robert.mader)
See Also: → 1667165

Some notes from Jeff and me looking into this:

  • A simple way to reproduce the issue is to open a page containing a CSS animation, such as https://developer.mozilla.org/en-US/docs/Web/CSS/animation, in two Firefox windows. Setting the pref "gfx.webrender.all" to true in about:config should ensure that the GPU rendering backend is enabled even if we would have disabled it by default for that configuration.

  • Our rendering setup is rather simple. All GL commands are submitted from the same thread (the Renderer thread, which is not the thread the gtk event loop lives on, in case that's relevant). Each time we render a window, we call glXMakeCurrent, submit the GL commands and finally call glXSwapBuffers.

  • On X11 with all the nvidia hardware I could test, as soon as multiple windows are presenting continuously, the glXMakeCurrent calls start taking a very long time (more than 16 milliseconds on the beefy Xeon processor I am testing on).

  • Nouveau drivers aren't affected.

  • The stack trace is consistently the one from comment 7 on all of the configurations I tested.

  • As soon as only a single window is presenting, glXMakeCurrent goes back to being on the order of 0.005ms.

  • No error is reported to error handlers registered via XSetErrorHandler (see the sketch after this list).

  • To be sure, I disabled all interactions with GLX on the vsync thread and switched to a purely timer-based one. It does not affect the problem at all.

  • We call glXSwapInterval(1) once every frame. Just in case, I changed it to be called once during the initialization of each window instead, but the problem remains.

  • Interestingly, the problem almost goes away if glXSwapInterval is set to 0. By that I mean that glXMakeCurrent takes around 1ms, which is a lot more than I would expect (~0.005ms) but at least leaves enough room for interactive frame rates.

  • In a sibling bug, it is reported that ASAP mode fixes the issue. What's important here is that SwapInterval is set to 0 when ASAP mode is enabled. When enabling ASAP mode and forcing SwapInterval to be set to 1, the problem still happens.

  • Arthur Huillet suggested in comment 8 that the GLX context becomes (partially?) invalid and (some of it is) recreated when we call glXMakeCurrent. Note that textures and other persistent GPU resources aren't lost in the process.
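
For reference, a minimal sketch (not the actual Firefox code) of the error-handler check mentioned in the list above; the handler name and logging are illustrative:

    #include <X11/Xlib.h>
    #include <stdio.h>

    /* Log any X protocol error so a silent failure would show up. */
    static int on_x_error(Display *dpy, XErrorEvent *ev) {
        char text[256];
        XGetErrorText(dpy, ev->error_code, text, sizeof text);
        fprintf(stderr, "X error: %s (request code %d)\n", text, ev->request_code);
        return 0;
    }

    /* Installed once at startup: */
    XSetErrorHandler(on_x_error);

A handler like this stays silent during the slow MakeCurrent calls, which is what the bullet above refers to.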

To me this smells like:

  • we get into a state that the nvidia driver doesn't like, which causes it to reinitialize some state. This initialization is costly, in particular because it issues a synchronous query to something (the X server?).
  • when SwapInterval is not zero, the sync query is not answered until the next vsync (which is unfortunate because we tend to call this early in the frame and end up waiting for a full frame).

The interaction of the two makes the slowdown spectacular.

Things to consider:

We do:

For each window:
    make_current()
    render()
    swap()

In various places online it is suggested that the following might be better when SwapInterval is not zero:

For each window:
    make_current()
    render()
    Flush()

For each window:
    make_current()
    swap()

That requires a bit of coordination between windows that doesn't exist right now, but it should be doable-ish. In reality we would also need to group windows per screen, especially if screens have different refresh rates. It's worth trying, but there is no indication that it will fix this bug. There's a chance that it might mitigate the sync issue by avoiding a call to glXMakeCurrent after the swap, but the "real" issue is probably whatever causes MakeCurrent to issue that sync query in the first place.

We could also accept tearing when multiple windows are presenting. It's kind of gross, but perhaps better than losing hardware acceleration altogether for all Linux users with proprietary nvidia drivers, which is what we'll have to do in the short term if we don't find a solution.

Switching to EGL also appears to fix this. It's probably how we'll get this fixed in the medium/long term.

Arthur, are you able to reproduce the problem with the steps from comment 13?

Flags: needinfo?(ahuillet)

Co-worker pointed me here. Are the contexts involved robust contexts?

glXMakeCurrent() becomes essentially free when it doesn't change any state (making current to the same context and window). It is going to remain quite expensive when switching between contexts or windows though, even if the excessive time is removed. It's not intended to be a fast operation.
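
For illustration, a minimal sketch of that cheap path, assuming a simple caching wrapper (this is not Firefox's actual MakeCurrentImpl, which is linked in an earlier comment):

    #include <GL/glx.h>
    #include <stdbool.h>

    /* Skip the expensive call entirely when the context and drawable are
       already current; only otherwise fall back to glXMakeCurrent(). */
    static bool make_current_cached(Display *dpy, GLXDrawable draw, GLXContext ctx) {
        if (glXGetCurrentContext() == ctx && glXGetCurrentDrawable() == draw)
            return true;  /* essentially free */
        return glXMakeCurrent(dpy, draw, ctx) == True;
    }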

As mentioned before, with EGL we use a shared context and can avoid paying the price for MakeCurrent() - the main blocker on NV for that is a reliable way to get a transparent X visual though, see https://github.com/KhronosGroup/EGL-Registry/pull/124. James: hint hint - do you think you could 1. ack the extension and 2. implement it in the NV binary driver? It's also needed for GTK4.

Sounds to me like the other proposed solution would require quite a bit of work without a guarantee of success. Pushing EGL in turn would also unlock faster WebGL with DMABUF. IMO that's what we should do, and in the meantime disable HW acceleration on NV again. We've done that forever until recently, so it's not a big regression.

Edit: see also bug 1702546
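
A minimal sketch of the shared-context pattern mentioned in this comment, under the assumption of one EGLSurface per window; config, ctx_attribs, surface[] and draw_window() are hypothetical, and setup/error handling are omitted:

    /* One context for all windows: switching windows only changes the
       bound surface, never the context. */
    EGLDisplay dpy = eglGetDisplay(EGL_DEFAULT_DISPLAY);
    EGLContext shared = eglCreateContext(dpy, config, EGL_NO_CONTEXT, ctx_attribs);

    for (int i = 0; i < n_windows; i++) {
        eglMakeCurrent(dpy, surface[i], surface[i], shared);  /* same context each time */
        draw_window(i);
        eglSwapBuffers(dpy, surface[i]);
    }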

Flags: needinfo?(jajones)

Yes, I've acked that extension twice, we'll be implementing it at some point. Approved it again.

In the meantime, I'd like to understand what's happening here. The context switch seems pretty clearly blocked on waiting for a vsync event, and hence my question as to whether robust contexts are used, as that would affect how our driver behaves in this regard.

Flags: needinfo?(jajones)

(In reply to James Jones from comment #17)

Yes, I've acked that extension twice, we'll be implementing it at some point. Approved it again.

In the meantime, I'd like to understand what's happening here. The context switch seems pretty clearly blocked on waiting for a vsync event, and hence my question as to whether robust contexts are used, as that would affect how our driver behaves in this regard.

Thanks! AFAIK we do use robust contexts, but to be sure I'll delegate that question back to Nico.

Flags: needinfo?(nical.bugzilla)

Nical is on PTO this week. And yes, we do use robust contexts. Another thing that may have something to do with this is that we're using multiple contexts (one for each window).

So drawing looks like:

for each frame:
    glXMakeCurrent(window1, ctx1);
    [drawing]
    swap();
    glXMakeCurrent(window2, ctx2);
    [drawing]
    swap();
Flags: needinfo?(nical.bugzilla)

Thanks for confirming, this fits in with my theory. I believe the problem is that when losing current from a robust context, our driver waits for all outstanding work on that context to complete, so we can be sure any potential robustness error events caused by that work are generated and attributed to the correct context. When that work includes a swap that is waiting for the next vertical sync signal, that wait will be blocked on said swap, meaning you can effectively only switch contexts once per vsync.

Not as a solution, but if you have a minute for an experiment, can you confirm disabling robust contexts also works around the issue? If so, that would validate the theory.
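
For reference, a sketch of how a robust GLX context is typically requested through GLX_ARB_create_context_robustness; this is a generic illustration, not Firefox's exact attribute list. fbconfig is assumed to be chosen elsewhere, and glXCreateContextAttribsARB must be fetched via glXGetProcAddress:

    const int attribs[] = {
        GLX_CONTEXT_MAJOR_VERSION_ARB, 3,
        GLX_CONTEXT_MINOR_VERSION_ARB, 2,
        /* the robustness bits under discussion: */
        GLX_CONTEXT_FLAGS_ARB, GLX_CONTEXT_ROBUST_ACCESS_BIT_ARB,
        GLX_CONTEXT_RESET_NOTIFICATION_STRATEGY_ARB, GLX_LOSE_CONTEXT_ON_RESET_ARB,
        None
    };
    GLXContext ctx = glXCreateContextAttribsARB(dpy, fbconfig,
                                                NULL /* no share context */,
                                                True /* direct */, attribs);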

This also means the modified loop someone proposed above:

for each frame
    for each window
        glXMakeCurrent(window, ctx1)
        [drawing]

    for each window
        glXMakeCurrent(window, ctx1)
        swap()

won't work either, as you'll just push the long wait to the second sub-loop. Notably, glXSwapBuffers() doesn't actually require the drawable to be current, unlike eglSwapBuffers(), so you could potentially take the glXMakeCurrent() out of that second loop and work around the issue that way - but don't ask me how the robust context error attribution rules interact with such usage.
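
A sketch of that workaround, assuming hypothetical per-window window[]/ctx[] arrays and a draw_window() helper; the swap loop never calls glXMakeCurrent, so it cannot hit the robust-context wait described above:

    for (int i = 0; i < n_windows; i++) {
        glXMakeCurrent(dpy, window[i], ctx[i]);
        draw_window(i);   /* submit this window's GL commands */
        glFlush();        /* queue the work without blocking */
    }
    for (int i = 0; i < n_windows; i++) {
        glXSwapBuffers(dpy, window[i]);  /* drawable passed explicitly, no MakeCurrent */
    }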


Thanks for chiming in.

Disabling the robustness extension doesn't appear to affect the issue, as far as I can tell.

James, are you able to reproduce the problem locally?

Flags: needinfo?(jajones)
See Also: → 1720634

We were not able to reproduce this internally, but I've filed an issue (You can refer to NVIDIA internal issue 3360950 in future inquiries) and turned it over to our QA team.

Thank you for performing the experiments without robust contexts.

Also for your reference, the effort to implement EGL_EXT_config_select_group in NV drivers is tracked in NVIDIA internal issue 3360948.

Flags: needinfo?(jajones)

It seems that I have a reproduction, following the steps laid out in comment #13. It's not visually very obvious to me, but I can see the CPU usage increase and the animation stutter a little bit.
Is there an easy tool I do not know of to profile Firefox and get the stack traces you've been getting?

With perf I confirm that it looks like MakeCurrent calls glFinish implicitly. Nicolas, in comment #22 you tried to disable the robustness extension - is there a way for me to do so in Firefox, so I can replicate the experiment?

As James explained, we expect !robustness not to trigger an implicit glFinish, so I'd like to find out why it still seems to be happening in order to validate our understanding of the problem.

Thanks

Flags: needinfo?(ahuillet) → needinfo?(nical.bugzilla)

Yep, you can use https://profiler.firefox.com/ to enable the profiler. Use the "Firefox Graphics" preset.

I am experiencing almost the same issue, but the slowdown in the second window is more of an FPS drop instead of large stutters as shown in the video attached earlier.

https://profiler.firefox.com/public/1k92z37j9m233jm5j18kq3k9r2amp137sev24kg

Here's a profile of the laggy second window; the green spikes on the graph under "Renderer" correspond to when I scroll on the otherwise animation-less website.

I believe your issue is different.
At any rate, a potential workaround is to set __GL_HWSTATE_PER_CTX=2 before starting firefox. This will turn on a mechanism that lets us not do implicit glFinish calls on robust contexts. I am curious to hear back on whether it helps - for me it helped locally.
Similarly, setting __GL_yieldFunctionWaitForGpu=5 will use sleep() calls in the spin loop when waiting for the GPU to be done in glFinish, which drastically reduces the CPU usage locally without visibly hurting anything.

I tried playing with __GL_HWSTATE_PER_CTX (tried setting it to 2 and 1) but it didn't affect the problem on my computer. __GL_yieldFunctionWaitForGpu=5 also didn't fix the issue, but if I understand correctly that is expected, since it would lower CPU time but not affect how long we wait in glFinish.

I kicked off a build in which I commented out all mention of GL robustness I could find. It will appear here: https://treeherder.mozilla.org/jobs?repo=try&revision=f90ae474671039b4d02822dab7c437612f37b963
Clicking the "B" should bring up a panel at the bottom of the interface; in that panel, click the "Artifacts" tab and click "target.tar.bz2" to download the build.

I added two about:config prefs that only exist in this build:

  • "gfx.swapinterval" to change the value that is passed to glxSwapInterval (setting it to 0 has a big effect as I mentioned earlier)
  • "gfx.software-vsync" to force a dumb timer based vsync on the vsync thread instead of interacting with GLX there, to make sure that it doesn't interfere with what we do on the render thread (that pref didn't have any effect for me).

The build also dumps the time spent in glXMakeCurrent and glXSwapBuffers at the beginning and end of each frame to stdout.

You can see the stack traces by using the integrated Gecko profiler. To do that, visit https://profiler.firefox.com/ and click the button to enable the profiler. It should add an icon to the UI. Click the arrow next to that icon and select the "Firefox Graphics" preset (to avoid collecting screenshots, which can mess with the rendering performance).

Flags: needinfo?(nical.bugzilla)

(In reply to Nicolas Silva [:nical] from comment #31)

Cool! I tried out the build and it seems to solve my problem perfectly. Changing "gfx.swapinterval" to 0 allows me to play a youtube video without making other browser windows choppy, while a value of 1 makes the windows choppy again.

(In reply to Nicolas Silva [:nical] from comment #31)

Setting gfx.swapinterval to 0 in this build fixes the freezes for me too.

(Coming back from vacation) The fact that turning off robustness makes the problem go away seems to confirm our hypothesis as to the root cause of the problem (that robust contexts need a Finish on LoseCurrent); but the __GL_HWSTATE_PER_CTX setting should work around that nicely (and did on my system).
Since the fix we were considering is to turn that setting on by default... the finding that it's not helping means there's more investigation to be done on our end. It definitely helps for me, but my reproduction was not very noticeable in the first place.
Does anyone have an obviously visible reproduction, not just a slower animation that becomes a bit more janky?
Thanks

The fact that turning off robustness makes the problem go away

Turning off robustness doesn't appear to make the problem go away, as far as I could tell; it was setting the swap interval to zero that did.

I don't have the nvidia hardware with me today, but I suspect that adding more windows running undemanding but continuous animations will accentuate the problem (without moving the cause to heavy GPU load).

If you run the build from https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/Woj4B2OTT62gWZqHvZ-VZw/runs/0/artifacts/public/build/target.tar.bz2 do you see 16ms stalls logged in stdout when trying the simple test case?

Sorry, I misread your message originally. Swap interval set to 0 means turning off vsync, so it makes sense that it would make the problem go away, but it's not very helpful.

I do see SwapBuffers taking 16ms when trying the simple test case, but that is fully expected since you are using vsync (swap interval = 1). This isn't indicative of a bug - by definition of swap interval = 1, SwapBuffers will block until the next vblank.
What could be considered a bug is MakeCurrent waiting until the next vblank because it is implicitly calling glFinish and there are vblank waits in the pipe, but that goes away with __GL_HWSTATE_PER_CTX=2 on my system (because that setting removes the need for the driver to do glFinish, and a glFlush is enough).

Sadly, the feedback from multiple people here seems to suggest that that setting does not make the problem go away, which means I have been looking at a different problem!
Thanks

(Note that that setting will not allow more than 32 contexts with their dedicated HW state, so if you're creating more than 32 contexts in your tests the issue will re-appear.)

For the record, swap interval set to 0 would AFAIK not take away vsync entirely, as our "VsyncSource" does not rely on it. So we'd still issue frames at the correct rate and usually in time, though we might get tearing - which again should not be an issue on composited window managers.

That is, by the way, what we do when using the EGL backend[1].

1: https://searchfox.org/mozilla-central/source/gfx/webrender_bindings/RenderCompositorEGL.cpp#195 (the code was originally written for Wayland but is also used for X11/EGL now).
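
For illustration, a sketch of how the vblank wait is disabled via GLX_EXT_swap_control, which is effectively what the EGL path linked above does (there via eglSwapInterval(dpy, 0)); frame pacing then comes entirely from the VsyncSource:

    PFNGLXSWAPINTERVALEXTPROC pglXSwapIntervalEXT =
        (PFNGLXSWAPINTERVALEXTPROC)glXGetProcAddress(
            (const GLubyte *)"glXSwapIntervalEXT");
    if (pglXSwapIntervalEXT)
        pglXSwapIntervalEXT(dpy, drawable, 0);  /* 0 = don't wait for vblank in SwapBuffers */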

I can't disable WebRender anymore with Firefox 92. The browser is now unusable with multiple windows because of the constant freezing. Unfortunately that means I'm forced to use Chromium for the time being.

(In reply to 0fwv42r5o from comment #41)

I can't disable WebRender anymore with Firefox 92. The browser is now unusable with multiple windows because of the constant freezing. Unfortunately that means I'm forced to use Chromium for the time being.

You can still disable hardware acceleration by setting gfx.webrender.software - that should work around the issue as well. Alternatively, enabling gfx.x11-egl.force-enabled should work on the prop. nvidia driver as well now.

(In reply to Robert Mader [:rmader] from comment #42)


You can still disable hardware acceleration by setting gfx.webrender.software - that should work around the issue as well. Alternatively, enabling gfx.x11-egl.force-enabled should work on the prop. nvidia driver as well now.

Thank you, this workaround solved it.

(In reply to Nicolas Silva [:nical] from comment #13)

We could also accept tearing when multiple windows are presenting. It's kind of gross but perhaps better than losing hardware acceleration altogether for all linux users with proprietary nvidia drivers, which is what we'll have to do in the short term if we don't find a solution.

Could such a fix be backported to release?
https://www.reddit.com/r/firefox/comments/plpmii/firefox_scroll_lags_with_multiple_windows_linux_ii/

Switching to EGL also appears to fix this. It's probably how we'll get this fixed in the medium/long term.

OK, so I did some testing comparing the test build (swap interval set to 0) and the default Firefox 92. On both, there is subtle screen tearing with compositing disabled, while the tearing disappears when I enable a compositor with vsync (as expected). So there doesn't seem to be any additional tearing when setting the swap interval to 0 - doesn't that mean setting the swap interval to 0 fixes this bug outright?

(In reply to Robert Mader [:rmader] from comment #40)

For the record, Swap interval set to 0 would AFAIK not take away Vsync entirely as our "VsyncSource" does not rely on it. So we'd still issue frames at the correct rate and usually in time, though we might get tearing - which again should not be an issue on composited window managers.

What does your VsyncSource do exactly, and can we take it out of the picture as an experiment?
I am observing SwapBuffers taking next to no time with a single window with swap interval=1, and this isn't consistent with what is normally observed (16ms waits in SwapBuffers). Unless you're timing rendering in a certain way, and I suspect this is what VsyncSource is about?

Flags: needinfo?(robert.mader)

(In reply to Arthur Huillet from comment #47)

What does your VsyncSource do exactly, and can we take it out of the picture as an experiment?
I am observing SwapBuffers taking next to no time with a single window with swap interval=1, and this isn't consistent with what is normally observed (16ms waits in SwapBuffers). Unless you're timing rendering in a certain way, and I suspect this is what VsyncSource is about?

So GtkVsyncSource essentially calls WaitVideoSyncSGI[1] in a thread and then notifies the render thread via NotifyVsync(). It thus makes sure we render at the maximal refresh rate, but IIUC it does not guarantee that we never tear. There used to be an option to disable it, but apparently that got removed. I can reintroduce it if that would help with debugging.

For the record, we're also moving on with the EGL effort and trying to enable it by default on recent drivers, see bug 1695933.

1: https://searchfox.org/mozilla-central/source/gfx/thebes/gfxPlatformGtk.cpp#752-778
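
For illustration, a sketch of the wait loop such a GLX_SGI_video_sync source runs (per the searchfox link above); the extension entry points must be fetched with glXGetProcAddress, a GL context must be current on the thread, and NotifyVsync() stands in for the real notification call:

    #include <GL/glx.h>
    #include <GL/glxext.h>

    PFNGLXGETVIDEOSYNCSGIPROC pGetVideoSync =
        (PFNGLXGETVIDEOSYNCSGIPROC)glXGetProcAddress((const GLubyte *)"glXGetVideoSyncSGI");
    PFNGLXWAITVIDEOSYNCSGIPROC pWaitVideoSync =
        (PFNGLXWAITVIDEOSYNCSGIPROC)glXGetProcAddress((const GLubyte *)"glXWaitVideoSyncSGI");

    unsigned int count = 0;
    pGetVideoSync(&count);                      /* current retrace counter */
    for (;;) {
        /* Block until the counter advances: divisor 2, alternating remainder. */
        pWaitVideoSync(2, (int)((count + 1) % 2), &count);
        NotifyVsync();                          /* wake the render thread */
    }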

Flags: needinfo?(robert.mader)

There is also a dumb timer-based vsync source that is just a thread that sends an event every 16ms while we are listening for vsync. I've tried both, and the issue happens the same way with the GLX vsync source described by Robert and with the timer-based one. In the build I linked earlier, the timer-based vsync can be enabled with the pref "gfx.software-vsync".
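
A sketch of such a timer-based source, extended with the kind of period/phase control Arthur asks about further down the thread; notify_vsync() is a hypothetical stand-in for the real notification:

    #include <stdint.h>
    #include <time.h>

    static void software_vsync_loop(uint64_t period_ns, uint64_t phase_ns) {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        uint64_t now = (uint64_t)ts.tv_sec * 1000000000ull + ts.tv_nsec;
        /* Align the first tick to the requested phase within the period. */
        uint64_t next = now - (now % period_ns) + phase_ns + period_ns;
        for (;;) {
            ts.tv_sec  = next / 1000000000ull;
            ts.tv_nsec = next % 1000000000ull;
            clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &ts, NULL);
            notify_vsync();      /* hypothetical: signal the listeners */
            next += period_ns;   /* e.g. 16666667ns for 60Hz */
        }
    }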

I do see SwapBuffers taking 16ms when trying the simple test case, but that is fully expected since you are using vsync (swap interval = 1). This isn't indicative of a bug - by definition of swap interval = 1, SwapBuffers will block until the next vblank.

This is interesting: it means that within a given vsync interval we can't present more than one window from the same thread, if I understand correctly. Does that sound correct?

Well, that is only the case if you're rendering as fast as you can, which is the typical use case in games -- but not here, since you're trying to render only once per frame thanks to the vsync timer logic. I had not paid attention to that, sorry. I think it is key to what may be going on here.

I need to investigate more on my end.
It seems that you're playing timing games with the implementation based on some assumptions that may not be true, but whether guaranteed by the spec or not, there should be a way to make these games work. I'll get back to you when I better understand what we're doing exactly -- I do agree, now, that the 16ms SwapBuffers delays with > 1 window seem fishy, or at least I can't readily explain them.

Thanks

So since bug 1695933 landed, this bug is technically fixed - at least on the >=470 driver series, which we took as the baseline for EGL. AFAIK we could also consider lowering that to 465 (which has been reported to work) if things work out well. Thus I wonder whether it makes sense to keep trying to fix the GLX backend. Instead we could just make sure to only use EGL on Nvidia and carry on.

(In reply to Robert Mader [:rmader] from comment #53)

So since bug 1695933 landed, this bug is technically fixed - at least on the >=470 driver series, which we took as baseline for EGL. AFAIK we could also consider lowering that to 465 (which has been reported to work) if things work out well. Thus I wonder if it makes sense to further try to fix the GLX backend. Instead we could just make sure to only use EGL on Nvidia and carry on.

I tested the latest Firefox Nightly and the issue disappeared. WebRender works with Nvidia and I don't have any freezes with multiple windows.

Depends on: linux-egl
Blocks: 1720902

I see switching to EGL has been considered a fix for bug 1720634 (which was marked a duplicate of this one, even though it's not exactly that), but it isn't a viable fix for that, at least on my machine.

I've updated to Firefox Nightly 2021-09-22 today to test it. The result of my testing shows that it makes my Firefox run at the wrong refresh rate. My display runs at 59.95 Hz and Firefox with this change runs at exactly 60 Hz, causing periodic dropped frames (a 0.05 Hz beat, i.e. roughly one dropped frame every 20 seconds), even with only 1 window animating. I can see this happening in normal scrolling, and see it clearly on vsynctester.com.

It's not properly synchronizing to my refresh rate, unlike whatever Firefox was doing before this change with WebRender. Therefore, I would like to go back to the previous method for now and keep using the layout.frame_rate 0 workaround I described in the description of bug 1720634.

Attached after this comment is my about:support info from Firefox Nightly.

(In reply to lexlexlex from comment #57)

The result of my testing shows that it makes my Firefox run at the wrong refresh rate.

Yes, that's a known issue and may get fixed by bug 1728473, bug 1640779 or by switching to Wayland.

As you said, your configuration needs a workaround either way. You will still be able to stick to your current configuration by setting gfx.x11-egl.force-disabled:true in addition to layout.frame_rate:0. However, as there's currently no perfect solution, a non-native refresh rate is IMO better as the default configuration than windows getting stuck - especially as many users don't have >60Hz displays.

Robert, is there a trick that makes it possible to change the value of the "software vsync" timer you provided in a previous experimental build?
What I am seeing right now is that the frame timing from Firefox is about 16ms apart on my 60Hz monitor, so SwapBuffers should be waiting on the previous frame to be presented for about 0 ms, yet ends up waiting for 16ms, and I do not yet know why.
I'd like to play a bit with the frame timing on Firefox side to see how it affects what I'm observing.

Or... maybe the software vsync (and/or the other one) just gets desynced and ends up pacing at 16ms at exactly the wrong time. It's not very easy to verify, but if I can play with the software vsync period I can probably find out. Or better yet, play with the phase so I can try to align the timer to the real vblank timer.

Would you be able to help with that?
Thanks

Flags: needinfo?(robert.mader)

Thanks, Robert. That makes sense to me. I've added myself as a follower on bug 1640779 after reading the comments explaining the status there. I will continue using my workarounds with gfx.x11-egl.force-disabled true until an EGL vsync implementation is ready.

(In reply to Arthur Huillet from comment #60)

Robert, is there a trick that makes it possible to change the value of the "software vsync" timer you provided in a previous experimental build?

That was nical, but IIUC[1][2] you should be able to set it via layout.frame_rate in about:config.

What I am seeing right now is that the frame timing from Firefox is about 16ms apart on my 60Hz monitor, so SwapBuffers should be waiting on the previous frame to be presented for about 0 ms, yet ends up waiting for 16ms, and I do not yet know why.
I'd like to play a bit with the frame timing on Firefox side to see how it affects what I'm observing.

Or.. maybe the software vsync (and/or the other one) just get desynced and end up pacing at 16ms at exactly the wrong time. It's not very easy to verify but if I can play with the software vsync period I can probably find out. Or better yet play with the phase so I can try to align the timer to the real vblank timer.

The software vsync is AFAIK really just a dumb timer, started at a random time :/ So yes, I'd expect it to be desynced - and I'm not sure if there's any way to influence the phase.

Would you be able to help with that?
Thanks

Thanks for looking into it! By the way, making sure that you also have this on your radar: we'd love to switch to EGL by default, however it's currently blocked by the fact that EGL_GENERATE_RESET_ON_VIDEO_MEMORY_PURGE_NV is not implemented by the driver, see bug 1731172 comment 13. Would love to see that in an upcoming release, if possible :)

1: https://searchfox.org/mozilla-central/source/gfx/thebes/SoftwareVsyncSource.cpp#27
2: https://searchfox.org/mozilla-central/source/gfx/thebes/gfxPlatform.cpp#2811

Flags: needinfo?(robert.mader)

I also kind of distrust the phase of the "hardware" vsync thread. Have you guys ever ensured that it was reliable? What I understand of the design makes me think that it isn't, and that could explain part of the problem here too. I'd personally expect that a software timer with the right phase is likely to be more reliable than a thread trying to get HW vsync timing every frame, because you get into overhead and scheduler woes.

(In reply to Arthur Huillet from comment #63)

I'd personally expect that a software timer with the right phase is likely to be more reliable than a thread trying to get HW vsync timing every frame, because you get into overhead and scheduler woes.

What would be the best way on X11 to detect the correct display refresh rate for this purpose? That also sounds like a better solution for EGL/X11/Nvidia. On EGL/X11/Mesa, Firefox currently mixes GLX_SGI_video_sync with EGL, but on Nvidia that crashes, so it gets a fixed 60 Hz software timer.
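
One plausible answer, as a sketch: compute the refresh rate of the active mode from XRandR mode timings. This is an illustration, not what Firefox does today (error handling omitted; link with -lXrandr):

    #include <X11/extensions/Xrandr.h>

    static double current_refresh_hz(Display *dpy, Window root) {
        XRRScreenResources *res = XRRGetScreenResourcesCurrent(dpy, root);
        double hz = 0.0;
        for (int i = 0; i < res->ncrtc; i++) {
            XRRCrtcInfo *crtc = XRRGetCrtcInfo(dpy, res, res->crtcs[i]);
            for (int m = 0; crtc->mode != None && m < res->nmode; m++) {
                XRRModeInfo *mi = &res->modes[m];
                if (mi->id == crtc->mode && mi->hTotal && mi->vTotal)
                    hz = (double)mi->dotClock / ((double)mi->hTotal * mi->vTotal);
            }
            XRRFreeCrtcInfo(crtc);
        }
        XRRFreeScreenResources(res);
        return hz;  /* e.g. 59.95 rather than a rounded 60 */
    }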

See Also: → 1732365

The sequence described in comment 21 is, I believe, required to get the frame pacing Firefox needs.
This is on top of the robustness issue that has been discussed before.

https://bugzilla.mozilla.org/show_bug.cgi?id=1716049#c21

Gentlemen in needinfo, how easy would it be for you to provide a build that uses that modified sequence instead?
Thanks

Flags: needinfo?(robert.mader)
Flags: needinfo?(nical.bugzilla)
Flags: needinfo?(jmuizelaar)

Clearing my ni here as unfortunately I can't give a good estimate.

I would, however, like to point out again that once drivers with support for EGL_GENERATE_RESET_ON_VIDEO_MEMORY_PURGE_NV ship, we'll hopefully be able to just move over to EGL - and it has been confirmed that this bug does not happen there.

Flags: needinfo?(robert.mader)

I understand that you see EGL as the future path, and it's not necessarily a bad idea. I'd still like to get to the bottom of what's wrong on the GLX side. I do think it could be a driver bug but am still working on proving it.

(In reply to Arthur Huillet from comment #67)

I'd still like to get to the bottom of what's wrong on the GLX side. I do think it could be a driver bug but am still working on proving it.

For the record: I do also suspect it to be a driver bug and my personal take so far is:

  • it blocks because of SwapInterval(1) (and does not block with SwapInterval(0), which we use on EGL)
  • in a composited environment on X11 this should AFAIK never block longer than until the next frame (even if a window is obscured)
  • Mesa does not block in the same situation (more than until the next frame)

Gentlemen in needinfo, how easy would it be for you to provide a build that uses that modified sequence instead?

Sorry for the late answer. I had a look, and unfortunately adding the plumbing for that looks fairly complicated in our current frame scheduling code.

Flags: needinfo?(nical.bugzilla)
See Also: → 1736245

(In reply to Arthur Huillet from comment #67)

I understand that you see EGL as the future path, and it's not necessarily a bad idea. I'd still like to get to the bottom of what's wrong on the GLX side. I do think it could be a driver bug but am still working on proving it.

I can confirm that it's a driver bug, or at least suboptimal behavior on our end. Working on a fix.

Ubuntu 22.04, Gnome X11, Nvidia driver 495.46

Fixed by EGL (bug 1751252, bug 1742994) as long as Force Composition Pipeline is enabled in NVIDIA X Server Settings (bug 1736245):

If Force Composition Pipeline is disabled, EGL is barely better than GLX (bug 1736245):

Status: UNCONFIRMED → NEW
Depends on: 1751252
No longer depends on: linux-egl
Ever confirmed: true
OS: Unspecified → Linux
Hardware: Unspecified → x86_64
Summary: Firefox 89 lags on Linux with video playing in background (nvidia binary) → GLX/Nvidia: Firefox lags with video playing in background
Depends on: 1812102
Status: NEW → RESOLVED
Closed: 2 years ago
Flags: needinfo?(jmuizelaar)
Resolution: --- → FIXED
Duplicate of this bug: 1647166
