Top half of the browser has a solid color when using WebRender/GLX/Gnome X11 with GTK_CSD=1/proprietary Nvidia
Categories
(Core :: Widget: Gtk, defect)
Tracking
()
People
(Reporter: jan, Unassigned)
References
(Blocks 1 open bug, Regression)
Details
(Keywords: correctness, nightly-community, regression)
Attachments
(8 files)
+++ This bug was initially created as a clone of Bug #1663273 +++
See bug 1663273 comment 80 + comment 81.
Comment 1•5 years ago
I don't know if this is the right place, but I don't want to reopen the other closed tickets.
I was using the beta, and it auto-updated to 87b4 (?). I don't remember what the previous version was (I hadn't updated for some time).
This white block always happened to me whenever I tried to enable webrender, and when I updated, it happened again. I actually had to fall back to stable, since the "disable Title Bar" feature stopped working as well (as in: the tabs are not rendered in place of the title bar).
I tried the Nightly 88.0a1 (2021-03-01) (64-bit) on Elementary OS, RTX 3070, NVidia driver 460.39 and the same bugs happened: White top block, and title bar stopped working.
I'm attaching some files to show it.
Comment 2•5 years ago
Comment 3•5 years ago
Comment 4•5 years ago
Is the GTK_CSD=1 env variable significant here? Do you see it when it's not set?
Comment 5•5 years ago
Okay, I tested in multiple situations:
TL;DR: GTK_CSD=0 solves both bugs: the white box, and the disable title bar feature.
The printenv command shows that GTK_CSD is already set to 1 by default, and MOZ_X11_EGL is also 1.
When I just click on the firefox executable on the file manager, the bug happens, compositor is WebRender.
If I just run ./firefox on the command line, it falls back to Webrender (software).
In the stdout, I see errors like:
[GFX1-]: Failed to create EGLSurface!: 0x3009
[GFX1-]: Failed to create EGLSurface!: 0x3009
[GFX1-]: Failed GL context creation for WebRender: 0
[GFX1-]: FEATURE_FAILURE_WEBRENDER_INITIALIZE_UNSPECIFIED
[GFX1-]: Failed to connect WebRenderBridgeChild.
[GFX1-]: Fallback WR to SW-WR
If I run GTK_CSD=1 ./firefox, nothing changes, of course. Still runs with Webrender (software).
If I run GTK_CSD=0 ./firefox, then it runs with hardware Webrender, and the white box disappears.
I also tried to remove the MOZ_X11_EGL environment variable from my bashrc, these are the results:
If I just run ./firefox on the command line, the bug happens, compositor is Webrender. (It's just like clicking on the executable)
Just to sanity check, I run MOZ_X11_EGL=1 ./firefox and then it uses software webrender.
If I run GTK_CSD=0 ./firefox, no white box, hardware webrender... works fine.
If I run MOZ_X11_EGL=1 GTK_CSD=0 ./firefox it also works fine.
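The combinations tested above can be enumerated mechanically. A minimal sketch, assuming a POSIX shell and an `env` that supports `-u` (GNU coreutils does); `print_launch_matrix` is a hypothetical helper name, and it only prints the commands rather than launching anything:

```shell
#!/bin/sh
# Print (do not run) one launch command per combination of GTK_CSD and
# MOZ_X11_EGL. "unset" means the variable is removed for that run.
# print_launch_matrix is a hypothetical helper, not part of Firefox.
print_launch_matrix() {
    for csd in unset 0 1; do
        for egl in unset 1; do
            opts=""   # env options (-u NAME) must come before assignments
            vars=""   # NAME=VALUE assignments
            if [ "$csd" = unset ]; then
                opts="$opts -u GTK_CSD"
            else
                vars="$vars GTK_CSD=$csd"
            fi
            if [ "$egl" = unset ]; then
                opts="$opts -u MOZ_X11_EGL"
            else
                vars="$vars MOZ_X11_EGL=$egl"
            fi
            echo "env$opts$vars ./firefox"
        done
    done
}

print_launch_matrix
```

Running each printed line in turn, and noting the compositor shown in about:support, covers every case without depending on what bashrc or the distro has already exported.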
Comment 6•5 years ago
So perhaps I didn't read the title of this bug and should have tested it before? I'm sorry. It seems like the bug still happens, though.
Comment 7•5 years ago
I wonder why you have set GTK_CSD and MOZ_X11_EGL. Did you set it by yourself or does that come from distro?
GTK_CSD is obsoleted and we should not use that in Firefox directly - I'll a check from
https://searchfox.org/mozilla-central/rev/f83c67b24fed1d677c5deafe7b31f5656c2656ec/widget/gtk/nsWindow.cpp#8175
(Bug 1209659 may be related)
As for MOZ_X11_EGL, this is experimental feature and it's not finished yet (Bug 1677203).
Thanks.
Comment 8•5 years ago
I do remember messing around with MOZ_X11_EGL, I don't remember exactly why. I entered a rabbit hole of confusion when trying to enable hardware video acceleration, I messed around with a ton of stuff. Eventually I reached MOZ_X11_EGL and tried to do something with it.
As for the GTK_CSD, it's being set by /etc/profile.d/gtk_csd.sh, which I don't ever remember messing with. I think it is a distro setting?
I don't know what GTK_CSD and MOZ_X11_EGL really mean and what are the consequences of enabling or disabling them, but I'll try disabling CSD and removing the EGL variable in bashrc.
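One way to track down where a variable such as GTK_CSD comes from is to search the shell startup files for its name. A rough sketch, assuming `grep` with the common `-l`, `-s` and `-F` options; `find_var_setters` is a hypothetical helper, and which files to search is up to the caller:

```shell
#!/bin/sh
# List which of the given files mention a variable name.
# find_var_setters is a hypothetical helper, not an existing tool.
find_var_setters() {
    var=$1
    shift
    # -l: print matching file names only
    # -s: stay quiet about files that do not exist
    # -F: treat the name as a fixed string, not a regex
    # /dev/null keeps grep from reading stdin when no files are given
    grep -lsF -e "$var" "$@" /dev/null
}
```

For example, `find_var_setters GTK_CSD /etc/profile.d/*.sh ~/.bashrc` would be expected to list /etc/profile.d/gtk_csd.sh on the setup described here.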
Comment 9•5 years ago
Though I'll leave a quick comment here: assuming that GTK_CSD is in fact being set by the distro, I think the upgrade to 87 beta4 did break things for those who have the same distro, other than webrender. I'm not sure that the EGL thing broke anything.
I did not understand what you mean in your comment about that line of code, if it will be removed, modified, or if the CSD just triggers that check.
Comment 10•5 years ago
The titlebar and borders are fixed at Bug 1693460.
Can you check if the WebRender/GLX bug is a regression or not? Please use mozregression tool for it:
https://fedoraproject.org/wiki/How_to_debug_Firefox_problems?rd=Bug_info_Firefox#Use_Mozregression_tool
Thanks.
Comment 11•5 years ago
Hi Martin,
I tried several configurations:
This is my environment:
echo $MOZ_X11_EGL: Variable is unset
echo $GTK_CSD: value is 0
Regressions with mozregression --good 86 --bad 87:
None
Regressions with MOZ_X11_EGL=0 mozregression --good 86 --bad 87
None
Regressions with MOZ_X11_EGL=1 mozregression --good 86 --bad 87
None
Regression with GTK_CSD=1 mozregression --good 86 --bad 87
This test just changes the GTK_CSD variable, leaving MOZ_X11_EGL unset. Title bar and White block observed.
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=8471b70b4df960d3599dcd951f0b05fb4f7bd420&tochange=12744d62ec8944fe64bb028a68bcab2c4665cf7b
Regression with GTK_CSD=1 MOZ_X11_EGL=0 mozregression --good 86 --bad 87
This one only produces the title bar bug
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=3d4360e021fa62a3dfc40df2295038622f7cfa96&tochange=d6388772d4c63331ad4dfebdbaa945364dada2e1
Regression with GTK_CSD=1 MOZ_X11_EGL=1 mozregression --good 86 --bad 87
Exact same results as the above. Seems like the code only checks if MOZ_X11_EGL is set, whether it's 1 or 0 doesn't matter?
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=3d4360e021fa62a3dfc40df2295038622f7cfa96&tochange=d6388772d4c63331ad4dfebdbaa945364dada2e1
With GTK_CSD=0 there are no regressions. Is GLX the thing being replaced by EGL?
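The distinction raised here, between a variable being set at all and a variable having a particular value, can be illustrated in shell. This is only a sketch of the two kinds of check, not a description of what Firefox actually does; `var_is_set` and `var_is_one` are hypothetical helpers:

```shell
#!/bin/sh
# Two different ways a script could interpret an environment variable.
# Both helpers are hypothetical and only illustrate the distinction.

# True when the variable exists at all, even if its value is 0 or empty.
var_is_set() {
    eval "[ -n \"\${$1+x}\" ]"
}

# True only when the variable's value is exactly 1.
var_is_one() {
    eval "[ \"\${$1-}\" = 1 ]"
}
```

With MOZ_X11_EGL=0, `var_is_set MOZ_X11_EGL` succeeds while `var_is_one MOZ_X11_EGL` fails, which is the kind of difference that would make 0 and 1 behave identically under a set-only check.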
Comment 12•5 years ago
(In reply to ricardopieper from comment #11)
Thanks. Let's wait until Bug 1693460 hits the nightly builds.
With GTK_CSD=0 there are no regressions. Is GLX the thing being replaced by EGL?
Not yet, there's a plan to use it for Mesa drivers only.
Comment 13•5 years ago
So can you please re-test with GTK_CSD=1?
Thanks.
Comment 14•5 years ago
Hi, sorry for the delay.
GTK_CSD=1 still results in the top half of the browser having a solid color. I downloaded the latest nightly.
Comment 15•5 years ago
Okay. And is the titlebar bug fixed at least?
Comment 16•5 years ago
The hide title bar feature seems to be working fine though.
Comment 17•5 years ago
Maybe this is irrelevant information, but the MOZ_X11_EGL=1 flag now crashes the browser.
Should I report this elsewhere?
GFX1-: Failed to connect WebRenderBridgeChild.
GFX1-: Failed to create EGLSurface!: 0x3009
GFX1-: Failed to create EGLSurface!: 0x3009
GFX1-: Failed GL context creation for WebRender: 0
Comment 18•5 years ago
(In reply to ricardopieper from comment #17)
Maybe this is irrelevant information, but the MOZ_X11_EGL=1 flag now crashes the browser.
Should I report this elsewhere?
GFX1-: Failed to connect WebRenderBridgeChild.
GFX1-: Failed to create EGLSurface!: 0x3009
GFX1-: Failed to create EGLSurface!: 0x3009
GFX1-: Failed GL context creation for WebRender: 0
EGL is not supposed to run with NVIDIA drivers, we don't enable EGL there so don't use MOZ_X11_EGL.
Comment 19•5 years ago
Let me summarize it please.
So IIRC you get a white square when running latest nightly with GTK_CSD=1 and WebRender enabled, right? (I don't mention MOZ_X11_EGL because MOZ_X11_EGL is not working with NVIDIA cards).
Comment 20•5 years ago
Also please attach content of about:support page, Thanks.
Comment 21•5 years ago
Comment 22•5 years ago
(^ I accidentally posted the json above before posting this text)
Yes, that was correct. GTK_CSD=1 and WebRender enabled. I checked the about:support page before posting those results. It was just showing WebRender, not WebRender (software something)
I would like to post the about:support here just to prove it... but now I'm having a new issue where it doesn't even try to run with WebRender enabled. MOZ_X11_EGL is unset, there is no value there, but I still get this error:
[GFX1-]: glxtest: libEGL initialize failed
[GFX1-]: glxtest: X error, error_code=2, request_code=151, minor_code=3
[GFX1-]: glxtest: process failed (exited with status 1)
[GFX1-]: Failed GL context creation for WebRender: 0
[GFX1-]: FEATURE_FAILURE_WEBRENDER_INITIALIZE_UNSPECIFIED
[GFX1-]: Failed to connect WebRenderBridgeChild.
[GFX1-]: Fallback (SW-)WR to Basic
I'm a bit lost. I attached the raw about:support json.
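Two different fallback messages have shown up in this bug so far ("Fallback WR to SW-WR" and "Fallback (SW-)WR to Basic"), so it can help to check a captured log for them explicitly. A small sketch, assuming the log has been saved or piped as plain text; `classify_wr_fallback` is a hypothetical helper, and the labels it prints are my own shorthand rather than anything Firefox reports:

```shell
#!/bin/sh
# Read gfx log lines on stdin and report which fallback, if any, they show.
# The matched strings are the ones quoted in this bug; what each implies
# about the resulting compositor is an assumption drawn from these comments.
classify_wr_fallback() {
    log=$(cat)
    case $log in
        *"Fallback (SW-)WR to Basic"*) echo "basic" ;;
        *"Fallback WR to SW-WR"*)      echo "software-webrender" ;;
        *)                             echo "no-fallback-logged" ;;
    esac
}
```

Something like `./firefox 2>&1 | classify_wr_fallback` would print a result once the browser exits; the about:support page remains the authoritative place to confirm which compositor is in use.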
Comment 23•5 years ago
This ran in the latest nightly, brand new profile, and I just set the webrender.enabled flag to True on the about:config page.
Comment 24•5 years ago
Also, GTK_CSD=1 or 0 doesn't change the result.
Comment 25•5 years ago
Also, it doesn't seem to be a white square per se... the color seems to be determined by the background color of the page. Just to make sure there is no misunderstanding here.
Reporter | ||
Comment 26•5 years ago
Summary:
- Affected users: Nightly+Early Beta on Gnome X11 with GTK_CSD=1 env var on proprietary Nvidia (bug 1673752 comment 5 enabled WR on proprietary Nvidia 460.32.03 or newer)
- Comment 1 to 25 are the same as comment 0:
(Darkspirit from bug 1663273 comment 80)
Proprietary Nvidia, GTX1060, Debian Testing
Gnome X11 with GTK_CSD=1:
  Basic GLX: fine
  WR GLX: this bug is still present
  SWWR GLX: bug 1674473
  Basic EGL: fine
  WR EGL: fallback, see below
  SWWR EGL: bug 1674473
WR GLX/Gnome X11 with GTK_CSD=1/proprietary Nvidia: top half with solid color
GTK_CSD=1 mozregression --repo try --launch 230f8c44f18b85947446ab2c9ba98dd17380b716 --pref gfx.webrender.all:true -a about:support
WR EGL/Gnome X11 with GTK_CSD=1/proprietary Nvidia: GL context failure:
Gnome X11 with GTK_CSD=1 and MOZ_X11_EGL=1 on proprietary Nvidia:
Almost the same as in comment 48, but WebRender now falls back to Basic instead of OpenGL (bug 1677825):
GTK_CSD=1 MOZ_X11_EGL=1 mozregression --repo try --launch 230f8c44f18b85947446ab2c9ba98dd17380b716 --pref gfx.webrender.all:true -a about:support
Compositing Basic
(#0) Error Failed to create EGLSurface!: 0x3009
(#1) Error Failed to create EGLSurface!: 0x3009
(#2) Error Failed GL context creation for WebRender: 0
(#3) Error FEATURE_FAILTURE_WEBRENDER_INITIALIZE_UNSPECIFIED
(#4) Error Failed to connect WebRenderBridgeChild.
Comment 27•4 years ago
Just wanted to report that I also experience this issue
- for the last few months when WebRender was not yet enabled by default, I tried enabling it after FF updates to check if it is working (it did not).
- since yesterday when FF got updated to 89.0 on elementary OS
setting GTK_CSD=0 as a workaround also works in my case.
specs:
FF 89.0 (and versions before) - troubleshooting info: https://zerobin.net/?d1c2ef50d5602369#CzK+voiAKNcpvzztScxSZq8ijvRrH+XsYdh3ZwcLuYM=
elementary OS 5.1.7 (Built on Ubuntu 18.04.4 LTS)
Linux 5.4.0-74-generic
GTK 3.22.30
Nvidia 2070 Super @ proprietary 460.80
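The GTK_CSD=0 workaround can be applied to a single launch without changing the session-wide setting. A minimal sketch, assuming a POSIX shell; `run_with_csd_off` is a hypothetical wrapper name:

```shell
#!/bin/sh
# Run any command with GTK_CSD forced to 0 for that command only.
# The caller's own environment is left untouched.
# run_with_csd_off is a hypothetical helper, not part of Firefox or GTK.
run_with_csd_off() {
    env GTK_CSD=0 "$@"
}
```

For example, `run_with_csd_off firefox` starts the browser with the override while other applications keep the distro default.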
Comment 28•4 years ago
Reporter | ||
Comment 29•4 years ago
[Tracking Requested - why for this release]:
It seems this was shipped to release.
Elementary OS seems to have GTK_CSD=1 environment variable by default. Proprietary Nvidia users are affected.
Comment 32•4 years ago
Hm, this somehow fell through the cracks and went into release a bit too soon I guess (https://searchfox.org/mozilla-central/source/widget/gtk/GfxInfo.cpp#743-749).
Andrew, I think we need to limit WR NV prop. driver rollout to DEs where this doesn't happen.
Martin, do we really need these different CSD types? Can we do anything to make the DEs support the CSD types that are not affected by this bug?
Comment 34•4 years ago
(In reply to Robert Mader [:rmader] from comment #32)
Martin, do we really need this different CSD types? Can we do anything to make the DEs support the CSD types that are not affected by this bug?
I don't think it's related to CSD, because we use CSD on Elementary OS by default, no matter if GTK_CSD is set or not, see:
https://searchfox.org/mozilla-central/rev/e8904db16ac45bff0ffe65e7289f8d2f00c48c48/widget/gtk/nsWindow.cpp#8700
https://searchfox.org/mozilla-central/rev/e8904db16ac45bff0ffe65e7289f8d2f00c48c48/widget/gtk/nsWindow.cpp#8738
It may be related to disabled titlebar so maybe window configuration / GL window may be wrong or so.
Andy, can you try to enable system titlebar (go to Hamburger menu -> Customize Toolbar -> 'Title bar' check box at left bottom corner) and try again?
Comment 35•4 years ago
Hey Martin,
when not setting GTK_CSD=0 manually and enabling the title bar as you suggested, the issue still exists.
Comment 36•4 years ago
Comment 38•4 years ago
Hi there, I originally reported 1714355, which was marked as a duplicate. I upgraded my drivers again (nVidia prop. 465.27) to check some of the suggestions, and it turned out that the problem does not persist in Firefox 90 DE. I thought that might be helpful information, so I made a screenshot comparing 90 DE with 89 (as a Flatpak).
Reporter | ||
Comment 39•4 years ago
(In reply to nr from comment #38)
problem does not persist in Firefox 90 DE.
Please attach your Firefox 90 DE about:support information.
Comment 40•4 years ago
Comment 41•4 years ago
This is very interesting - maybe there's a bug in the NV prop. driver concerning depth buffers - and in FF90 we stopped using them, see bug 1711490
Comment 42•4 years ago
Andy, Ricardo, can you confirm that the issue is resolved in beta/nightly? That would be great news!
Comment 43•4 years ago
I have the same issue (#1714355) and I can confirm it works in v90b (flatpak: flathub-beta)
Right now for v89, as we all know, the workaround is to disable hardware acceleration
Reporter | ||
Comment 44•4 years ago
Could someone try to find out which commit fixed it? At the end, you get a pushlog URL:
$ pip3 install --user mozregression
$ mozregression --find-fix --bad 2021-03-22 --good 2021-05-31 --pref gfx.webrender.all:true
Comment 45•4 years ago
Hi Robert,
I can confirm that the issue does not exist on 90.0b5 (64-Bit)
Comment 46•4 years ago
6:51.57 INFO: No more integration revisions, bisection finished.
6:51.57 INFO: First good revision: de6dfc676a6877428343a1c0fdbb099fc6b3ebfd
6:51.57 INFO: Last bad revision: 6bce4e61777b33736a0bde8ec3bb88a26a8f430d
6:51.57 INFO: Pushlog:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=6bce4e61777b33736a0bde8ec3bb88a26a8f430d&tochange=de6dfc676a6877428343a1c0fdbb099fc6b3ebfd
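The two revisions can also be pulled back out of a pushlog URL like the one above, which is handy when comparing ranges from several bisections. A small sketch, assuming the URL carries `fromchange` and `tochange` query parameters as the ones in this bug do; `pushlog_range` is a hypothetical helper:

```shell
#!/bin/sh
# Extract the fromchange and tochange revisions from a pushlog URL and
# print them on one line, separated by a space.
# pushlog_range is a hypothetical helper, not part of mozregression.
pushlog_range() {
    url=$1
    from=$(printf '%s\n' "$url" | sed -n 's/.*[?&]fromchange=\([0-9a-f]*\).*/\1/p')
    to=$(printf '%s\n' "$url" | sed -n 's/.*[?&]tochange=\([0-9a-f]*\).*/\1/p')
    echo "$from $to"
}
```

For the --find-fix run above, the first value printed matches the last bad revision and the second matches the first good revision.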
Comment 47•4 years ago
Thanks for confirming, so this was indeed an issue with the depth buffer (bug 1711490 as mentioned in comment 41 just removed it, but we stopped using it in bug 1696905).
Comment 48•4 years ago
Arthur, this is apparently another bug in the nvidia driver. It is now fixed in FF90 by using CPU-side culling instead of a depth buffer, but you may want to have a look nevertheless, in case other applications also run into it.
Comment 49•4 years ago
I have hit this issue myself and was curious to understand the root cause.
What is the bug thought to be exactly? I haven't seen a description of what was wrong with the depth buffer.
Do you maybe have a standalone reproducer or an Apitrace to inspect the GL command stream?
Thanks
Comment 50•4 years ago
I'm very sorry, I don't currently have a device to reproduce this or capture a trace. All I can tell you is that the depth buffer was used to order overlapping tiles when compositing the final image (IIUC), and that it worked correctly on Mesa drivers and other OSs like Android. As the bug apparently only appears in combination with a title bar, chances are that there's something odd happening in the GTK backend that wouldn't happen on other platforms.
Reporter | ||
Comment 51•4 years ago
(I'm not a developer.)
This bug also occurred with "WebRender/GLX/KDE with disabled compositing/proprietary Nvidia" when non-alpha visual and XShape were used (bug 1663273 comment 45). bug 1663273 comment 82 fixed it for KDE back then.
Then I filed this bug for the unfixed "WebRender/GLX/Gnome X11 with GTK_CSD=1 environment variable/proprietary Nvidia" case.
- comment 27: Elementary OS is affected because it sets the GTK_CSD=1 env var by default: https://github.com/elementary/gala/issues/244#issuecomment-438259702
Manually setting GTK_CSD=0 environment variable fixed the problem.
- bug 1696905 comment 5 seems to have fixed this bug according to comment 46.
Comment 52•4 years ago
I tested proprietary Nvidia with WebRender/GLX/Gnome X11 with GTK_CSD on/off on Fedora 34 but I can't reproduce it.
It may be related to Elementary OS.
Reporter | ||
Comment 54•4 years ago
Should Firefox really be left broken for Nvidia/Elementary OS users until the next release? This bug has been closed as duplicate and only 89 is affected. Fixed Firefox 90 will be released in two weeks.