[Xwayland] Webrender causes firefox to lag when video is played behind another window
Categories
(Core :: Graphics: WebRender, defect)
Tracking
()
Tracking | Status | |
---|---|---|
firefox78 | --- | disabled |
People
(Reporter: sajdl.vojtech, Assigned: rmader)
References
(Depends on 1 open bug)
Details
Attachments
(3 files, 1 obsolete file)
User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:77.0) Gecko/20100101 Firefox/77.0
Steps to reproduce:
I tried finding a regression range for this bug, but it pointed me to Webrender being enabled on AMD GPUs. Disabling Webrender helps. Each window is open on a different monitor. I'm running ArchLinux, my GPU is an AMD RX 5700XT.
- Open any video and have it playing in one window, have Google open in another
- Put the video window behind another window (e.g. Slack)
- Try typing inside the Google search bar
Actual results:
Browser is unresponsive and typing is really slow
Expected results:
Browser should be responsive
Comment 1•4 years ago
Bugbug thinks this bug should belong to this component, but please revert this change in case of error.
Reporter | ||
Comment 2•4 years ago
Still present after update to Nightly 78. From the looks of it, it is not limited to playing videos: every time a window that is not directly visible needs to draw something, Firefox lags. I noticed it with developer tools too, when refreshing a page and leaving the devtools window in the background.
When reporting I forgot to mention that I'm running Gnome 3.36 on Wayland if that helps.
Reporter | ||
Comment 3•4 years ago
From further testing this doesn't seem to affect Gnome Xorg session, so it's very probably something Wayland related.
Comment 4•4 years ago
Because this bug's Severity is normal and has not been changed, and this bug's priority is -- (none), indicating it has not been previously triaged, the bug's Severity is being updated to -- (default, untriaged).
Comment 5•4 years ago
Can you share the contents of about:support as a text file?
Reporter | ||
Comment 6•4 years ago
Assignee | ||
Comment 7•4 years ago
Can't reproduce here but there are a couple of possible reasons ... let's start with this: could you check whether it still happens if you enable widget.wayland_vsync.enabled?
Reporter | ||
Comment 8•4 years ago
Tried right now, doesn't seem to do much, might have delayed the lagging for a few seconds (might be placebo though). I tested with one of my two monitors disconnected and that actually resolved the issue. It is worth noting that the two monitors each have a different refresh rate. Setting them to the same refresh rate didn't help at all last time I tested, so I omitted that info.
Comment 9•4 years ago
Window Protocol: x11
Desktop Environment: gnome
(Robert Mader [:rmader] from comment #7)
Can't reproduce here but there are a couple of possible reasons ... lets start with this: could you try if it still happens if you enable widget.wayland_vsync.enabled?
Can you try this with env var $ MOZ_ENABLE_WAYLAND=1 path/to/firefox on Gnome Wayland?
Assignee | ||
Comment 10•4 years ago
Oh, it's on X11, not on Wayland - so that option doesn't have any effect. Sorry for the noise then!
Reporter | ||
Comment 11•4 years ago
Window Protocol: x11
Good catch, I thought I had MOZ_ENABLE_WAYLAND=1 in the launch command (and kinda thought it was on by default by now). Turns out I don't have it set on my main PC and that this actually fixed the issue for me.
Using that option causes Firefox to crash when using autoscroll - is that a known issue? I couldn't find much on Bugzilla, but I don't really know how to search here properly.
[GFX1-]: window is null
[GFX1-]: Failed to create EGLSurface
[GFX1-]: We don't have EGLSurface to draw into. Called too early?
[GFX1-]: Compositors might be mixed (5,2)
ExceptionHandler::WaitForContinueSignal waiting for continue signal...
ExceptionHandler::GenerateDump cloned child 27008
ExceptionHandler::SendContinueSignalToChild sent continue signal to child
Exiting due to channel error.
Assignee | ||
Comment 12•4 years ago
Autoscroll works fine here - don't know about an existing issue about it though.
Concerning the issue: we should probably change the title to reflect that it happens on Xwayland (which is a special case as Firefox has a weird vsync implementation that does work better on plain X11 but not on Xwayland) and only in a multi monitor setup.
Finally, we don't know yet if other compositors apart from Gnome are affected - I wouldn't be surprised, given Mutter issues 3 and 1162 (both hopefully resolved in 3.38 as there's very active work on both of them).
https://gitlab.gnome.org/GNOME/mutter/-/issues/3
https://gitlab.gnome.org/GNOME/mutter/-/issues/1162
Comments 13-18 hidden (offtopic)
Comment 19•4 years ago
Filed bug 1638084 for the crashing autoscroll icon.
Comment 20•4 years ago
I can confirm this issue occurs with xwayland only when:
- There are (at least) two windows open
- There is any animation in the first window (video, gif... for example https://en.wikipedia.org/wiki/GIF#/media/File:Rotating_earth_(large).gif)
- The animation happens in the selected tab. If another tab with no animation is selected, there is no lag
- The window where the lag occurs is maximized (the second window). If not, there is no lag.
- If both windows play animation, only the focused window is laggy. You can see that with Gnome when you put both windows on different workspaces (see video attachment). I'm not entirely sure about this one, because once your window lags, focusing another application doesn't stop the lag...
- The animation can happen in the sidebar, in that case it's still the other window that will lag (not the window with the sidebar)
The lag caused by this issue makes the entire UI of Firefox unresponsive, even when you use keyboard shortcuts.
The UI is rendered at around 1 fps, and I don't see any kind of spike in CPU or GPU usage.
Comment 21•4 years ago
Assignee | ||
Comment 22•4 years ago
(In reply to filman230 from comment #20)
I can confirm this issue occurs with xwayland only when:
- There are (at least) two windows open
- There is any animation in the first window (video, gif... for example https://en.wikipedia.org/wiki/GIF#/media/File:Rotating_earth_(large).gif)
- The animation happens in the selected tab. If another tab with no animation is selected, there is no lag
- The window where the lag occurs is maximized (the second window). If not, there is no lag.
- If both windows play animation, only the focused window is laggy. You can see that with Gnome when you put both windows on different workspaces (see video attachment). I'm not entirely sure about this one, because once your window lags, focusing another application doesn't stop the lag...
- The animation can happen in the sidebar, in that case it's still the other window that will lag (not the window with the sidebar)
The lag caused by this issue makes the entire UI of Firefox unresponsive, even when you use keyboard shortcuts.
The UI is rendered at around 1 fps, and I don't see any kind of spike in CPU or GPU usage.
Could you try reproducing this with
- MOZ_X11_EGL=1 env var set
- without the above env var but layers.acceleration.force-enabled enabled (and webrender force-disabled)
on nightly? The reason I'm asking is that there are several different things that could cause the issue and the above options force two of the code paths I would think about first. Sorry for not trying to reproduce myself atm, little time :(
Also, could everyone who sees the issue post their Xorg version (I assume some 1.20 point release)? Thanks!
Comment 23•4 years ago
(In reply to Robert Mader [:rmader] from comment #22)
Could you try reproducing this with
- MOZ_X11_EGL=1 env var set
- without the above env var but layers.acceleration.force-enabled enabled (and webrender force-disabled)
on nightly? The reason I'm asking is that there are several different things that could cause the issue and the above options force two of the code paths I would think about first. Sorry for not trying to reproduce myself atm, little time :(
Also, could everyone who sees the issue post their Xorg version (I assume some 1.20 point release)? Thanks!
xorg-server version is 1.20.9-2, and yes I was (and am) on nightly.
I'm sorry, I don't know why, but now when there are animations in both windows, there is no lag... (I updated nightly though, it was one day old...)
Is having gfx.webrender.all set to false enough to force-disable webrender?
So, here is what I got, this is a bit complicated so I'm not entirely sure if I reported it correctly. There is always an animation playing in the background in another window:
In any case those options improved the situation, if there is an animation in another window, you can browse without problem, the scroll is fluid, the page is responsive...
BUT if you touch any UI element, AND IF there is no animation inside your tab, then your window will lag. If there is an animation (like a gif), it won't lag. If there are animations in both windows, there is no lag caused by the UI.
By "touch any UI element", I mean using the megabar (typing text), or simply hovering the buttons in the megabar (favorite button, pocket, read mode...)
This is where the subtle difference comes to play: (not entirely sure about this)
In any case, clicking on the megabar and typing text will cause your window to lag (not sure if it would stop after some time), but with MOZ_X11_EGL=1, hovering the buttons will cause a lag for a few seconds, while with layers.acceleration.force-enabled, the lag is much longer (never stops?).
I don't know if it's useful, but here is the output in the terminal with MOZ_X11_EGL=1:
Can't find symbol 'eglGetNativeClientBufferANDROID'.
Can't find symbol 'eglQuerySurfacePointerANGLE'.
Can't find symbol 'eglCreateStreamKHR'.
Can't find symbol 'eglDestroyStreamKHR'.
Can't find symbol 'eglQueryStreamKHR'.
Can't find symbol 'eglStreamConsumerGLTextureExternalKHR'.
Can't find symbol 'eglStreamConsumerAcquireKHR'.
Can't find symbol 'eglStreamConsumerReleaseKHR'.
Can't find symbol 'eglStreamConsumerGLTextureExternalAttribsNV'.
Can't find symbol 'eglCreateStreamProducerD3DTextureANGLE'.
Can't find symbol 'eglStreamPostD3DTextureANGLE'
I'm sorry if this is confusing.
Assignee | ||
Comment 24•4 years ago
I think I now know what's happening here. IIUC it's either a bug in Xwayland or in the GLX vsync implementation.
Xwayland has a mechanism to adapt its reported refresh rate to Wayland frame callbacks. If no callbacks arrive, it will throttle to 1Hz (you can simply test this by running glxgears and hiding it behind some other window - it will fall back to 1fps in a Wayland session).
Although the GLX vsync source on Xwayland does not properly detect the higher refresh rates, the throttling mechanism still appears to apply to it. You can reproduce by:
- open FF X11 in Gnome Shell Wayland session
- open https://www.vsynctester.com/
- cover FF with some other window (I used gnome-terminal)
- wait a few seconds
- switch to FF again
The website will report much lower fps for the last seconds, while quickly catching up again.
Now in the case described above, one window is always on a hidden workspace, thus not getting frame callbacks and thus being throttled (as soon as the overview is opened, frame callbacks will get sent to all windows). So apparently the throttling to 1Hz somehow happens despite the fact that there's a window completely visible. Why the effect only gets visible when both windows render I can only speculate about - but my guess is that if one window is idle, its vsync source gets stopped, thus it will not call into glXWaitVideoSync.
To summarize, I think this could be a bug in the glXWaitVideoSync implementation, either in mesa or Xwayland, or in the way we handle the vsync source.
Michel, does this sound reasonable / possible?
Comment 25•4 years ago
(In reply to Robert Mader [:rmader] from comment #24)
Michel, does this sound reasonable / possible?
I'm afraid not.
I'm able to reproduce the problem, but glXWaitVideoSyncSGI seems to be consistently returning after just a fraction of a second.
(Since glXWaitVideoSyncSGI doesn't take a drawable parameter, the Mesa implementation uses the drawable of the current GLX context, which is the root window AFAICT on the GLXVsyncThread. Since there's no Wayland object corresponding to the X11 root window, Xwayland uses a fake ~60 Hz timer for its MSC)
AFAICT the problem is that the Renderer thread of the main Firefox process repeatedly calls glXSwapBuffers for the background window, which in the long term only returns after ~1s (when one of the previous buffer swaps has actually completed).
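To put rough numbers on this, here is a toy model, not Firefox code. It assumes a single thread presents every window in turn and that each present call blocks for a fixed time. `FrameIntervalMs` and `EffectiveFps` are hypothetical helper names, and the 16 ms and 1000 ms figures are only the approximate values mentioned in this thread.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical model: one renderer thread presents every window in turn,
// and each present call blocks for the given number of milliseconds.
// Returns the time between two consecutive presents of the same window.
int FrameIntervalMs(const std::vector<int>& swapBlockMs) {
  for (int ms : swapBlockMs) {
    assert(ms >= 0);  // a blocking time cannot be negative
  }
  return std::accumulate(swapBlockMs.begin(), swapBlockMs.end(), 0);
}

// Frames per second that every window effectively gets under that model.
// Returns 0 when no time is spent at all (e.g. an empty window list).
double EffectiveFps(const std::vector<int>& swapBlockMs) {
  const int intervalMs = FrameIntervalMs(swapBlockMs);
  return intervalMs > 0 ? 1000.0 / intervalMs : 0.0;
}
```

With one visible window (about 16 ms per swap) and one occluded window (about 1000 ms per swap), `FrameIntervalMs({16, 1000})` is 1016, so under these assumptions every window, including the visible one, ends up at just under 1 fps.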
Assignee | ||
Comment 26•4 years ago
(In reply to Michel Dänzer from comment #25)
AFAICT the problem is that the Renderer thread of the main Firefox process repeatedly calls glXSwapBuffers for the background window, which in the long term only returns after ~1s (when one of the previous buffer swaps has actually completed).
Ah that makes sense, thanks!
Assignee | ||
Comment 27•3 years ago
So I think the solution here is to disable the GLX vsync source and use the software 60Hz timer on Xwayland. There's no regression for us here, as Xwayland also serves us with a software 60Hz timer. Also the native Wayland backend now ships a frame-callback-based source, i.e. users needing 144Hz etc. can just switch to that.
Assignee | ||
Comment 28•3 years ago
Xwayland will give us a 60Hz timer for glXWaitVideoSyncSGI anyway,
but an optimization in Xwayland to reduce that to 1Hz if a window is
occluded can cause issues for us in multi-window cases.
In unaffected (i.e. single window) cases this will make us consume
more resources, as rendering will not get throttled to 1Hz anymore
when hidden. The native Wayland backend supports this, however.
Comment 29•3 years ago
Pushed by archaeopteryx@coole-files.de: https://hg.mozilla.org/integration/autoland/rev/5e30c026c632 Do not use GLX vsync source on Xwayland, r=stransky
Comment 30•3 years ago
Backed out as requested by dev.
Backout link: https://hg.mozilla.org/integration/autoland/rev/40bd5bbe396ce4cb1a5bfd86d852307d7e71dc97
Assignee | ||
Comment 31•3 years ago
Thanks. This possibly needs a more comprehensive approach.
Assignee | ||
Comment 32•3 years ago
I increasingly think we'll have to implement bug 1640779 for this. The good news is that after bug 1645528 a lot of per-window infrastructure is now in place and used by default on Wayland.
Comment 33•3 years ago
(In reply to Robert Mader [:rmader] from comment #32)
I increasingly think we'll have to implement bug 1640779 for this.
Seems doubtful that (by itself) would make any difference either, since the issue here isn't directly related to VSync functionality; it's that SwapBuffers blocks, which will be an issue as long as Firefox calls SwapBuffers for multiple windows from the same thread.
Assignee | ||
Comment 34•3 years ago
(In reply to Michel Dänzer from comment #33)
(In reply to Robert Mader [:rmader] from comment #32)
I increasingly think we'll have to implement bug 1640779 for this.
Seems doubtful that (by itself) would make any difference either, since the issue here isn't directly related to VSync functionality; it's that SwapBuffers blocks, which will be an issue as long as Firefox calls SwapBuffers for multiple windows from the same thread.
The idea is the EGL vsync source would throttle down SwapBuffers for those Windows to 1Hz under exactly the same conditions where we hit this issue IIUC.
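The idea above can be illustrated with a small simulation. This is only a sketch under loudly stated assumptions, not how Gecko, Mesa or Xwayland are implemented: a single renderer thread, a swap that blocks until a per-window throttle period has passed since the previous swap, and a per-window vsync period that decides how often the renderer tries to present. `Window` and `Simulate` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Toy model of one window as seen by a single renderer thread.
struct Window {
  std::int64_t throttleMs;     // assumed: one swap may complete per this period
  std::int64_t vsyncPeriodMs;  // how often the renderer tries to present
  std::int64_t nextSwapReady;  // earliest time the next swap can return
  std::int64_t nextVsync;      // next time the renderer wants to present
  int presented;               // number of completed presents

  Window(std::int64_t throttle, std::int64_t vsyncPeriod)
      : throttleMs(throttle), vsyncPeriodMs(vsyncPeriod),
        nextSwapReady(0), nextVsync(0), presented(0) {}
};

// Runs one renderer thread over all windows for durationMs of simulated time.
void Simulate(std::vector<Window>& windows, std::int64_t durationMs) {
  if (windows.empty()) return;
  std::int64_t now = 0;
  while (now < durationMs) {
    bool didPresent = false;
    for (Window& w : windows) {
      assert(w.vsyncPeriodMs > 0);
      if (now < w.nextVsync) continue;  // no vsync tick due for this window
      // The swap blocks the whole thread until the throttle period has passed.
      now = std::max(now, w.nextSwapReady);
      w.nextSwapReady = now + w.throttleMs;
      w.nextVsync = now + w.vsyncPeriodMs;
      ++w.presented;
      didPresent = true;
    }
    if (!didPresent) {
      // Idle: jump to the earliest pending vsync tick.
      std::int64_t next = windows.front().nextVsync;
      for (const Window& w : windows) next = std::min(next, w.nextVsync);
      now = next;
    }
  }
}
```

In this model, an occluded window (throttle 1000 ms) that is driven at a 16 ms vsync period drags the visible window down to a couple of presents per second, while driving the occluded window at a 1000 ms vsync period leaves the visible window at roughly 60 presents per second.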
Comment 36•3 years ago
I ran into this today (using Nightly on Ubuntu 21.04), due to having a Google Doc as the foreground tab in a background window. (Specifically, it was my 1:1 notes doc with my manager) The doc is only 7 US Letter pages long, so this isn't a huge-amounts-of-content-being-painted sort of issue. It may have been due to the blinking cursor (either my own local blinking-cursor or the second blinking-cursor from me-having-the-doc-open-in-another-tab).
I was getting ~1 second paint latency for UI and interacting with other windows. Things went back to normal when I closed the Google Doc. (And I was able to make the issue happen again by opening a new background-window with the same Google Doc again.)
Assignee | ||
Comment 37•3 years ago
Just for the record: strictly speaking this is not a Firefox issue but an Xwayland one - some other applications have been reported to be affected as well, for example Steam. The big question for us is whether it's worth trying to implement the proposed solution from comment 32, or rather just push the native Wayland backend to get production ready (bug 1543600). Given that the solution via EGL would likely not cover e.g. the Nvidia proprietary drivers for a while, I think going native Wayland is the better investment.
So for everyone affected I'd recommend enabling the native Wayland backend via MOZ_ENABLE_WAYLAND=1 (see also https://mastransky.wordpress.com/2020/03/16/wayland-x11-how-to-run-firefox-in-mixed-environment/)
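For anyone unsure which case applies to them, the following is a rough sketch of the distinction discussed in this thread. It assumes the behaviour described here for Firefox builds of that time (X11 backend unless MOZ_ENABLE_WAYLAND=1 is set) and uses XDG_SESSION_TYPE as a stand-in for "am I in a Wayland session". `Backend`, `ClassifyBackend` and `ClassifyFromEnvironment` are hypothetical names. The authoritative answer is still the "Window Protocol" line in about:support.

```cpp
#include <cstdlib>
#include <string>

enum class Backend { PlainX11, Xwayland, NativeWayland };

// Assumption: outside a Wayland session Firefox runs on plain X11; inside
// one it runs through Xwayland unless MOZ_ENABLE_WAYLAND=1 is set.
Backend ClassifyBackend(const std::string& sessionType,
                        const std::string& mozEnableWayland) {
  if (sessionType != "wayland") return Backend::PlainX11;
  return mozEnableWayland == "1" ? Backend::NativeWayland : Backend::Xwayland;
}

// Reads an environment variable, treating "unset" as an empty string.
static std::string EnvOrEmpty(const char* name) {
  const char* value = std::getenv(name);
  return value ? value : "";
}

// Applies the classification to the current process environment.
Backend ClassifyFromEnvironment() {
  return ClassifyBackend(EnvOrEmpty("XDG_SESSION_TYPE"),
                         EnvOrEmpty("MOZ_ENABLE_WAYLAND"));
}
```

Only the `Backend::Xwayland` case corresponds to the setup where this bug was reported.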
Comment 38•3 years ago
(In reply to Robert Mader [:rmader] from comment #37)
Just for the record: strictly speaking this is not a Firefox issue but a Xwayland one
It's the Wayland compositor which stops sending frame callbacks. Xwayland doesn't hold anything back.
- some other applications have been reported to be affected as well, for example steam.
Right, same issue there, calling SwapBuffers for multiple windows on the same thread.
The big question for us is whether it's worth trying to implement the proposed solution from comment 32, or rather just push the native Wayland backend to get production ready (bug 1543600). Given that the solution via EGL would likely not cover e.g. the Nvidia proprietary drivers for a while, I think going native Wayland is the better investment.
Agreed.
Assignee | ||
Comment 40•3 years ago
For some odd reason I have been unable to reproduce this for a while now on recent Gnome + recent Firefox. Can anyone confirm they still see this when running nightly in a Wayland session (but of course with the Firefox X11 backend)?
Comment 41•3 years ago
Behavior of this EGL/Xwayland bug seemed to be similar to GLX/Nvidia bug 1716049.
(Sotaro Ikeda [:sotaro] from bug 1716049 comment #10)
When WebRender is enabled, RenderCompositorOGL and GLContextGLX are used. RenderCompositorOGL creates a GLContextGLX for each window. I wonder if it might be related to the problem. If RenderCompositorEGL is used, only one GLContextEGL is created for all windows.
Then I would assume bug 1684194 has fixed this EGL/Xwayland bug.
Comment 42•3 years ago
But comment 36 was after bug 1684194.
Assignee | ||
Comment 43•3 years ago
(In reply to Darkspirit from comment #42)
But comment 36 was after bug 1684194.
It's not obvious to me if comment 36 was on EGL. Daniel, could you retest recent nightly (which enables EGL by default on recent Mesa) and check if you still see the issue?
And Jan, could you also confirm that you can't reproduce it?
It would be such a relief if this was finally fixed!
Assignee | ||
Comment 47•3 years ago
Jan, can you briefly confirm that you also can't reproduce the issue?
The odd thing is that I also can't reproduce the issue with GLX (gfx.x11-egl.force-disabled). To me this indicates that something must have changed in Xwayland or Mesa, but I'm not aware of anything that could have fixed it. Then again, other apps like Steam were also hit by this bug, so there's a chance that something got fixed somewhere.
In any case, I'm very inclined to reenable HW-WR on Xwayland on recent Mesa - maybe only if EGL is available, increasing the chance that this does not happen (because of bug 1684194).
Comment 48•3 years ago
As discussed on IRC, apparently Firefox sets swap interval 0 now, which avoids the problem.
Comment 49•3 years ago
Debian Testing (Frankendebian with Mesa 21.2.1 from unstable), Gnome (X)Wayland, Intel Iris Graphics 6100 (BDW GT3)
(Michel Dänzer from comment #48)
As discussed on IRC, apparently Firefox sets swap interval 0 now, which avoids the problem.
bug 1515448 set fSwapInterval(0) on EGL/Wayland (visible window froze when the other window was on another workspace & invisible).
bug 1684194 + bug 1713468 + bug 1695933 brought the fix to X11+Xwayland.
Yes, it was bug 1684194 which fixed EGL/Xwayland:
https://hg.mozilla.org/integration/autoland/shortlog/52299c7cbec4
last bad: MOZ_X11_EGL=1 mozregression --repo autoland --launch de1a1b350e9e0fb606cc7f5b709df544af8dd313 --pref gfx.webrender.all:true -a https://www.vsynctester.com/ -a https://www.vsynctester.com/
first good: MOZ_X11_EGL=1 mozregression --repo autoland --launch 52299c7cbec44f2fe75273acdf2aed8e2496931c --pref gfx.webrender.all:true -a https://www.vsynctester.com/ -a https://www.vsynctester.com/
This bug is still reproducible with GLX/Xwayland:
Attached screencast: mozregression --launch 20210913213224 --pref gfx.x11-egl.force-disabled:true gfx.webrender.all:true -a https://www.vsynctester.com/ -a https://www.vsynctester.com/
Nvidia bug 1716049 seems to be similar to this GLX/Xwayland bug.
After submitting this comment I will switch to a regular X11 session and test there.
Yes, GLX/Xwayland can be fixed by setting swap interval to 0 (layout.frame_rate=0):
mozregression --launch 20210913213224 --pref gfx.x11-egl.force-disabled:true gfx.webrender.all:true layout.frame_rate:0 -a https://www.vsynctester.com/ -a https://www.vsynctester.com/
// Many GLX implementations default to blocking until the next
// VBlank when calling glXSwapBuffers. We want to run unthrottled
// in ASAP mode. See bug 1280744.
const bool isASAP = (StaticPrefs::layout_frame_rate() == 0);
mGLX->fSwapInterval(*mDisplay, mDrawable, isASAP ? 0 : 1);
Some assumptions seem to be made based on the layout.frame_rate pref:
https://searchfox.org/mozilla-central/rev/fb7c66cb59ccc282aecfe157b05dc12b1e38753f/gfx/layers/ipc/CompositorBridgeParent.cpp#248
static int32_t CalculateCompositionFrameRate() {
// Used when layout.frame_rate is -1. Needs to be kept in sync with
// DEFAULT_FRAME_RATE in nsRefreshDriver.cpp.
// TODO: This should actually return the vsync rate.
Maybe the layout.frame_rate pref should be left untouched (= -1) and the code should directly be changed to mGLX->fSwapInterval(*mDisplay, mDrawable, 0)?
I don't know. I am not a programmer.
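The swap-interval selection quoted above can be restated as a small pure function, which makes the three interesting pref values easy to compare. This is only a restatement of the quoted snippet for illustration. `SwapIntervalForFrameRate` is a hypothetical name and does not exist in the Firefox source.

```cpp
// Mirrors the quoted logic: layout.frame_rate == 0 means "ASAP mode",
// which maps to swap interval 0 (do not block on VBlank). Any other
// value, including the default -1, maps to swap interval 1.
int SwapIntervalForFrameRate(int layoutFrameRate) {
  const bool isASAP = (layoutFrameRate == 0);
  return isASAP ? 0 : 1;
}
```

So with the default pref value of -1 the GLX path keeps swap interval 1, which is the case that was still reproducible on GLX/Xwayland in the tests above. The alternative proposed in the comment would amount to always returning 0 here, independent of the pref.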
Comment 50•3 years ago
(In reply to Darkspirit from comment #49)
Nvidia bug 1716049 seems to be similar to this GLX/Xwayland bug.
After submitting this comment I will switch to a regular X11 session and test there.
Have moved the second Firefox window
- behind the terminal and
- onto another workspace
and could not reproduce any instant problem on:
- GLX/Gnome X11/Intel
- GLX/KDE X11 without compositing/Intel
If they are affected, reproducing it would take more time or different STR.
Assignee | ||
Comment 51•3 years ago
This bug is still reproducible with GLX/Xwayland
Thanks Jan! Good to know that you can still reproduce the issue on GLX but not on EGL after bug 1684194. I'll take that as a "go" for bug 1730671, but will still wait till shortly before the next beta (as it doesn't affect nightly). Fingers crossed that all of this (enabling EGL by default etc.) sticks!
Comment 52•3 years ago
codepath-is-for-android irrelevant |
(Darkspirit from bug 1730822 comment 7)
WebRender (Software OpenGL)
[...]
apparently does not use RenderCompositorEGL yet.
Yes, SW WR OpenGL is affected by bug 1635186 on EGL/Xwayland while regular WR is not anymore.
MOZ_X11_EGL=1 mozregression --launch 2021-09-15 --pref devtools.chrome.enabled:true gfx.webrender.all:true gfx.webrender.software:true gfx.webrender.software.opengl:true -a https://www.vsynctester.com/ -a https://www.vsynctester.com/
Assignee | ||
Comment 53•3 years ago
This bug essentially depends on bug 788319 - as we unblock more hardware/drivers for EGL, it will automatically solve the issue here as well.
Assignee | ||
Comment 54•2 years ago
Closing as there's nothing to track here any more. We'll only enable HW-WR on Xwayland for setups that are also qualified for EGL, which solves the issue here.