glxtest fails in Firefox 86 if GLES >= 3.0 is not supported
Categories
(Core :: Graphics, defect, P2)
Tracking
()
Release | Tracking | Status |
---|---|---|
firefox-esr78 | --- | unaffected |
firefox86 | --- | wontfix |
firefox87 | --- | fixed |
People
(Reporter: vd.Kraats, Assigned: rmader)
References
(Regression)
Details
(Keywords: regression, regressionwindow-wanted)
Attachments
(4 files, 2 obsolete files)
User Agent: Mozilla/5.0 (X11; Linux i686; rv:86.0) Gecko/20100101 Firefox/86.0
Steps to reproduce:
Normally I run Firefox with MOZ_ENABLE_WAYLAND=1 set in config/environment.d/60_firefox.conf on Linux debian 5.10.0-1-686-pae.
When I start Firefox it looks normal.
Actual results:
However, a web page in the first tab cannot be scrolled with the mouse using the scrollbar on the right. The page in the second tab does not appear at all, yet when closing Firefox it asks for permission to close 2 tabs.
Windows do not appear or do not disappear.
In fact it is completely unusable.
Expected results:
Firefox ESR with wayland and 86.0b2 without wayland are working correctly.
Comment 1•4 years ago
Bugbug thinks this bug should belong to this component, but please revert this change in case of error.
Comment 2•4 years ago
Can you use mozregression tool to find the wrong commit? How-to is here:
https://fedoraproject.org/wiki/How_to_debug_Firefox_problems?rd=Bug_info_Firefox#Use_Mozregression_tool
Thanks.
I have attached a regression file.
The broken Firefox sessions already stop at a pop-up from Google that asks about an advertisement policy.
Comment 4•4 years ago
I'm afraid the push log is too big, so I can't identify the issue from it.
I wonder if it can be a duplicate of Bug 1687212.
btw. Do you see it with latest nightly too?
Thanks.
Comment 5•4 years ago
Please attach content of about:support page.
Thanks.
I attach about:support.
Does not look good.
When starting firefox with wayland I see the warning:
[GFX1-]: glxtest: libGLESv2 glGetString returned null
Nightly has the same problem.
If I only use the laptop-screen I get the errors:
(#0) Error: No GPUs detected via PCI
(#1) Error: glxtest: process failed (received signal 11)
Extra info.
At previous versions, Firefox with Wayland accepted OpenGL ES 2.0:
WebGL 1 Driver Version:
OpenGL ES 2.0 Mesa 20.3.3
Intel Open Source Technology Center -- Mesa DRI Intel(R) 945GM x86/MMX/SSE2
Looking at the source, I have the impression that Beta and Nightly now require a minimum version of ES 3.0; otherwise the error:
glxtest: libGLESv2 glGetString returned null
occurs.
This is strange because Wayland itself supports ES 2.0 and firefox previously did not give problems with ES 2.0.
This error disappears if LIBGL_ALWAYS_SOFTWARE=1 is used.
After clearing the cache, WEBGL1 and WEBGL2 are supported again according to "about:support", but Firefox with Wayland is still not usable.
In that case, Nightly both with and without Wayland still shows the same error messages:
[GFX1-]: More than 1 GPU detected via PCI, cannot deduce vendor
[GFX1-]: PCI candidate 0x8086/0x27a6
[GFX1-]: PCI candidate 0x8086/0x27a2
This can be caused by duplicate bus devices for same VGA controller, which seems to be a known property of 945GM:
lspci -nn
.....
00:02.0 VGA compatible controller [0300]: Intel Corporation Mobile 945GM/GMS, 943/940GML Express Integrated Graphics Controller [8086:27a2] (rev 03)
00:02.1 Display controller [0380]: Intel Corporation Mobile 945GM/GMS/GME, 943/940GML Express Integrated Graphics Controller [8086:27a6] (rev 03)
.....
Jan 29 16:51:29 debian kernel: [ 3.848451] [Firmware Bug]: Duplicate ACPI video bus devices for the same VGA controller, please try module parameter "video.allow_duplicates=1"if the current driver doesn't work.
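To illustrate why two PCI candidates show up, here is a minimal sketch that filters `lspci -nn`-style text for display-class devices (class codes 03xx) and prints their vendor:device IDs. The helper name `list_display_devices` is hypothetical, and treating every 03xx class as a GPU candidate is an assumption made for illustration, not a statement about what glxtest does internally.

```shell
#!/bin/sh
# Hypothetical helper: print vendor:device IDs of display-class PCI devices
# (class codes 03xx) found in `lspci -nn`-style text read from stdin.
list_display_devices() {
  # Keep lines whose bracketed class code starts with 03, then extract
  # the bracketed vendor:device pair from each of them.
  grep -E '\[03[0-9a-f]{2}\]:' |
    sed -E 's/.*\[([0-9a-f]{4}:[0-9a-f]{4})\].*/\1/'
}

# Sample input copied from the lspci output quoted in this comment.
sample='00:02.0 VGA compatible controller [0300]: Intel Corporation Mobile 945GM/GMS, 943/940GML Express Integrated Graphics Controller [8086:27a2] (rev 03)
00:02.1 Display controller [0380]: Intel Corporation Mobile 945GM/GMS/GME, 943/940GML Express Integrated Graphics Controller [8086:27a6] (rev 03)'

printf '%s\n' "$sample" | list_display_devices
```

On a live system the same function could be fed with `lspci -nn | list_display_devices`; more than one line of output would correspond to the "More than 1 GPU detected via PCI" situation.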
Comment 9•4 years ago
Robert, could that come from the multi-gpu patches?
Thanks.
Comment 10•4 years ago (Assignee)
(In reply to vd.Kraats from comment #8)
...
Looking at the source I have the impression that Beta and Nightly now require a minimal version ES 3.0. otherwise the error:
glxtest: libGLESv2 glGetString returned null
occurs.
...
I poked at this a bit and got some weird results.
First of all, using MESA_GLES_VERSION_OVERRIDE=2.0 with the EGL version of glGetString makes it break - it always returns NULL, making us fail with the above message. The GLX version, however, works just fine. This part looks like a Mesa bug.
However, that should not make the whole browser break - it should simply fall back to software rendering. Your about:support indicates that this works fine.
For the record: this probably started after bug 1640053 landed, which makes us use EGL in glxtest instead of GLX. Now while this definitely should get fixed in Mesa so setups like this have a chance of getting WebGL 1 support, what needs further investigation is why things fall apart - I assume some process crashes or so. Will dig deeper.
Comment 11•4 years ago (Assignee)
Just set up an old laptop with similar hardware, showing the same behaviour in about:support. However, it works well apart from WebGL 1 being broken. As this is a Mesa bug it should get fixed there IMO (will give it a try soon).
vd.Kraats: is latest nightly still broken for you, apart from WebGL?
Comment 12•4 years ago (Assignee)
Filed https://gitlab.freedesktop.org/mesa/mesa/-/issues/4283 now. I suppose this should be easy to fix.
Comment 13•4 years ago (Reporter)
Both latest nightly and also latest beta 86.0b9 with wayland are not broken anymore, but still have the WEBGL-issue.
Comment 14•4 years ago (Assignee)
As it would make us fail on GLES 2.0 hardware. We could do much
better here by properly checking GL and GLES context etc. but apparently
we are only really interested in whether we are on GL/GLES 1 hardware.
Therefore keep the test as simple as it is for now.
Also error out if glGetString returns empty values, in order to
fall back to GLX where applicable.
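The rule described above (reject only GL/GLES 1.x, and treat an empty version string as an error) can be sketched in shell. This is not the glxtest code, which is C++; the helper names `gl_major_version` and `is_blocked` are hypothetical, and the sample strings follow the version format quoted earlier in this bug.

```shell
#!/bin/sh
# Hypothetical helper: extract the major version from a GL_VERSION-style
# string such as "OpenGL ES 2.0 Mesa 20.3.3" or "1.4 Mesa 20.3.3".
gl_major_version() {
  # Drop an optional "OpenGL ES " prefix, then keep the digits before the first dot.
  printf '%s\n' "$1" | sed -E 's/^OpenGL ES //; s/^([0-9]+)\..*/\1/'
}

# Return 0 (blocked) for an empty or unparsable string, or a major version below 2.
is_blocked() {
  major=$(gl_major_version "$1")
  case "$major" in
    ''|*[!0-9]*) return 0 ;;  # empty or unparsable: treat as a failure
  esac
  [ "$major" -lt 2 ]
}

# Try a GLES 2.0 string, a GL 1.4 string, and the empty-string case.
for v in "OpenGL ES 2.0 Mesa 20.3.3" "1.4 Mesa 20.3.3" ""; do
  if is_blocked "$v"; then
    echo "blocked: '$v'"
  else
    echo "ok: '$v'"
  fi
done
```

The empty-string branch mirrors the "error out if glGetString returns empty values" part of the description; what happens after such an error (for example a GLX fallback) is outside this sketch.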
Comment 15•4 years ago (Assignee)
(In reply to vd.Kraats from comment #13)
Both latest nightly and also latest beta 86.0b9 with wayland are not broken anymore, but still have the WEBGL-issue.
Great - the WebGL issue should be fixed by the trivial patch above.
Comment 16•4 years ago (Assignee)
- Do not require GLES >= 3.0 any more to succeed, fixing this bug.
- Use GL instead of GLES, matching GLX behavior. This way we can avoid
most regressions related to the EGL switch. One scenario here is older
intel hardware supporting GL 1.4 and GLES 2.0 - we want to continue
blocking this hardware altogether, as e.g. WebGL does not support a
fallback to GLES in this case, resulting in crashes.
- Add more error messages and early returns, making future debugging
easier.
- Remove eglCreatePbufferSurface - it always failed anyway, unnoticed!
After this patch we should always match GLX behavior when setting
different combinations of MESA_GL_VERSION_OVERRIDE and MESA_GLES_VERSION_OVERRIDE
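For anyone wanting to try such combinations by hand, a small sketch that only prints one command line per combination and launches nothing. The version values and the `./firefox` path are illustrative assumptions, and the function name `print_override_matrix` is hypothetical.

```shell
#!/bin/sh
# Print one command line per GL/GLES override combination; nothing is launched.
# The version values and the ./firefox path are illustrative assumptions.
print_override_matrix() {
  for gl in 2.1 3.3; do
    for gles in 2.0 3.0; do
      echo "MESA_GL_VERSION_OVERRIDE=$gl MESA_GLES_VERSION_OVERRIDE=$gles ./firefox"
    done
  done
}

print_override_matrix
```

Each printed line can be pasted into a terminal, and the resulting about:support page compared between the GLX and EGL code paths.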
Comment 17•4 years ago (Assignee)
Repurposing this bug for the glxtest issue.
Comment 18•4 years ago
Comment 19•4 years ago
bugherder
Comment 20•4 years ago (Assignee)
Comment on attachment 9203044 [details]
Bug 1689207 - Some fixes for EGL in glxtest, r=aosmond,stransky
Beta/Release Uplift Approval Request
- User impact if declined: All Linux users, X11 and Wayland, without GLES >= 3.0 support will have no WebGL. This includes many older devices and apparently Debian with proprietary Nvidia drivers.
- Is this code covered by automated tests?: No
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): Only applies to Linux, changes are only in glxtest - in the very worst case, i.e. a crash, certain users wouldn't have WebGL/WR. But this is exactly what the patch aims to avoid and what was tested on several devices.
- String changes made/needed:
Comment 22•4 years ago (Assignee)
- Do not require GLES >= 3.0 any more to succeed, fixing this bug.
- Use GL instead of GLES, matching GLX behavior. This way we can avoid
most regressions related to the EGL switch. One scenario here is older
intel hardware supporting GL 1.4 and GLES 2.0 - we want to continue
blocking this hardware altogether, as e.g. WebGL does not support a
fallback to GLES in this case, resulting in crashes.
- Add more error messages and early returns, making future debugging
easier.
- Downgrade record_error to record_warning in some more places where
we don't want to fail hard (bug 1689707).
- Remove eglCreatePbufferSurface - it always failed anyway, unnoticed!
Simplified backport of D105107
Comment 23•4 years ago
I am keeping the uplift request open in case we have a dot release this cycle and we have some bake time in beta 87 with this patch.
Comment 24•4 years ago (Reporter)
I tested nightly at Debian Bullseye (32 bit).
The error "glxtest: libGLESv2 glGetString returned null" indeed has
disappeared.
The warning "More than 1 GPU detected via PCI, cannot deduce vendor"
is still present, but does no harm, and is possibly due to typical properties
of the 945GM Graphics Controller, which shows 2 bus devices for one
controller.
I do not quite understand how things are supposed to work now, but
WEBGL1 is not automatically activated (also not by clearing cache), but the
error:
"WebglAllowWindowsNativeGl:false restricts context creation on this system"
is shown at about:support.
Activating is possible by setting "webgl.force-enabled" true at about:config.
ES2.0 is used then.
At Firefox ESR 78.7 this is not needed and WEBGL1 is automatically
activated.
Testing Nightly on Ubuntu 18.04.5 gave the same results.
WEBGL1 (and WEBGL2) are both activated if LIBGL_ALWAYS_SOFTWARE=1.
Remarkable but not caused by Firefox is the poor performance of the WEBGL
aquarium-demo at Debian, compared to Ubuntu at the same machine.
Ubuntu uses LLVM 10, Debian LLVM11, both using 3.1 Mesa with version
20.0.8 (Ubuntu) resp. 20.3.4 (Debian).
At Ubuntu LIBGL_ALWAYS_SOFTWARE is 2 to 3 times faster than ES 2.0, at Debian it has the same "speed", but uses double cpu.
Comment 25•4 years ago (Assignee)
(In reply to vd.Kraats from comment #24)
...
I do not quite understand how things are supposed to work now, but
WEBGL1 is not automatically activated (also not by clearing cache), but the
error:
"WebglAllowWindowsNativeGl:false restricts context creation on this system"
is shown at about:support.
Activating is possible by setting "webgl.force-enabled" true at about:config.
ES2.0 is used then.
...
IIUC the device falls into the GL 1.4/GLES 2.0 category, right? While it would be technically possible to allow automatic WebGL activation for these devices, I personally will not work in that direction. The GLX behaviour was always to block GL < 2 devices - allowing GLES 2 devices would need careful checking if fallback paths work across a wide range of old devices and the general rule of thumb has been that GL < 2 drivers tend to be very buggy. What could work is to force-enable GL 2.0 support - AFAIK that was somehow possible.
I'm confused that it should have worked in ESR 78.7. Could you share an about:support from there?
Comment 26•4 years ago (Reporter)
I attached the requested about:support from Firefox ESR at Debian Bullseye
Comment 27•4 years ago (Assignee)
(In reply to vd.Kraats from comment #26)
Created attachment 9204169 [details]
firefox_support_esr
I attached the requested about:support from Firefox ESR at Debian Bullseye
Thanks! So that shows that for some reason mesa fell back to llvmpipe in glxtest, tricking it to pass, and then successfully creating a WebGL context via ES 2.0 on hardware. So that was more of an accident, not intended and not what you should have got on GLX.
I'd love to have this case working, but Jeff Gilbert previously told me that he'd not accept any patches allowing GL 1.x era hardware - too many issues in the past for too little gain. Sorry :/
Comment 28•4 years ago (Reporter)
Indeed GL 1.x should not be used anymore. I regret a working solution is not working anymore automatically, but I can understand your decision, because graphics and drivers are a terrible mess. I will Force WEBGL1 as long as it works and software rendering is slow.
Thanks.
Comment 29•4 years ago
This issue still occurs for me when running Nightly with "MOZ_ENABLE_WAYLAND=1 MOZ_X11_EGL=1 ./firefox" on Xorg. It still works with 86b9. I've got both envs exported globally for convenience reasons and it'd be nice if this continued to work.
Comment 30•4 years ago
Edit: Forgot to mention: Primary GPU is RX 5700 XT (RadeonSI), whereas there's also the IGP of the 6700k available in the system (Iris OGL 4.6, recent mesa git-master).
Comment 31•4 years ago (Assignee)
(In reply to walmartguy from comment #29)
This issue still occurs for me when running Nightly with "MOZ_ENABLE_WAYLAND=1 MOZ_X11_EGL=1 ./firefox" on Xorg. It still works with 86b9. I've got both envs exported globally for convenience reasons and it'd be nice if this continued to work.
What exact issue do you mean? I suppose you're not running into the GLES < 3.0 scenario?
Comment 32•4 years ago
(In reply to Robert Mader [:rmader] from comment #31)
What exact issue do you mean? I suppose you're not running into the GLES < 3.0 scenario?
I probably should have mentioned that (oops): The issue is that Webrender doesn't work at all for me in Nightly with the aforementioned env vars specified. Since my GPUs are well supported by Mesa, I wouldn't think that it's anything about missing feature levels.
Webrender with EGL on Xorg still works normally as long as I don't specify MOZ_ENABLE_WAYLAND=1 at the same time (which worked with 86).
Comment 33•4 years ago
Yet it seems to be related to GPU selection, as I get the "[GFX1-]: More than 1 GPU detected via PCI, cannot deduce vendor" error verbosity.
Comment 34•4 years ago (Assignee)
Ah I see. So on nightly we do not fall back to GLX any more if IsWaylandDisabled() is false (1), which is unconditionally the case when MOZ_ENABLE_WAYLAND=1 is set, regardless of the backend actually used. So the assumption is IsWaylandDisabled() == false -> we are using the Wayland backend.
We could fix that for glxtest, however there are several other places where we rely on the same assumption (2), making me wonder why this worked for you in the first place. It should have had a bunch of odd side effects - maybe their effects were small enough.
Looking at the Fedora /usr/bin/firefox script we have the following:
if ! [ $MOZ_DISABLE_WAYLAND ]; then
  if [ "$XDG_CURRENT_DESKTOP" == "GNOME" ]; then
    export MOZ_ENABLE_WAYLAND=1
  fi
  if false && [ "$XDG_SESSION_TYPE" = "wayland" ]; then
    export MOZ_ENABLE_WAYLAND=1
  fi
fi
I.e. MOZ_ENABLE_WAYLAND=1 will also get set unconditionally, without e.g. checking for $XDG_SESSION_TYPE. That will probably also break hard soon then. I suppose it would make sense to check in firefox IsWaylandDisabled() for $XDG_SESSION_TYPE and, if set, only enable it if its value is wayland.
By the way: we should IMO rename IsWaylandDisabled() to IsWaylandEnabled() - inverted logic is usually not a good idea.
1: https://searchfox.org/mozilla-central/source/toolkit/xre/glxtest.cpp#1202-1206
2: https://searchfox.org/mozilla-central/search?q=IsWaylandDisabled&path=
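The check proposed above can be sketched in shell. This is not Firefox's implementation (that lives in C++); the function name `should_enable_wayland` is hypothetical, and the rule "honor MOZ_ENABLE_WAYLAND=1 only if XDG_SESSION_TYPE is unset or equals wayland" is one reading of the proposal, stated here as an assumption.

```shell
#!/bin/sh
# Hypothetical sketch: decide whether the Wayland backend should be used.
# Arguments: $1 = value of MOZ_ENABLE_WAYLAND, $2 = value of XDG_SESSION_TYPE.
should_enable_wayland() {
  # Wayland must be requested explicitly.
  [ "$1" = "1" ] || return 1
  # If the session type is known, it has to be wayland.
  if [ -n "$2" ] && [ "$2" != "wayland" ]; then
    return 1
  fi
  return 0
}

# Apply the rule to the current environment (unset variables count as empty).
if should_enable_wayland "${MOZ_ENABLE_WAYLAND:-}" "${XDG_SESSION_TYPE:-}"; then
  echo "wayland backend"
else
  echo "x11 backend"
fi
```

With this rule, a globally exported MOZ_ENABLE_WAYLAND=1 would be ignored in a session where XDG_SESSION_TYPE is x11, which is the case reported in comment 29.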
Comment 35•4 years ago (Assignee)
Martin, this will probably be important for Fedora soon. What do you think about the points above?
Comment 36•4 years ago
(In reply to Robert Mader [:rmader] from comment #34)
We could fix that for glxtest, however there are several other places where we rely on the same assumption (2), making me wonder why this worked for you in the first place. It should have had a bunch of odd side effects - maybe their effects were small enough.
Now you gave me an idea: With 86 beta, I've noticed that there was intermittent stutter (can take a minute or two to occur) on vsynctester.com, as opposed to 85 stable.
Well, it very much looks like this was caused by MOZ_ENABLE_WAYLAND=1. I've repeated the test several times and without the variable set, there never was stutter (not counting the first few seconds), while with it set, there always was after 10 - 180 seconds. Very odd indeed. :)
Comment 37•4 years ago
(In reply to Robert Mader [:rmader] from comment #34)
Looking at the Fedora /usr/bin/firefox script we have the following:
if ! [ $MOZ_DISABLE_WAYLAND ]; then
  if [ "$XDG_CURRENT_DESKTOP" == "GNOME" ]; then
    export MOZ_ENABLE_WAYLAND=1
  fi
  if false && [ "$XDG_SESSION_TYPE" = "wayland" ]; then
    export MOZ_ENABLE_WAYLAND=1
  fi
fi
That's a typo, it should be "$XDG_SESSION_TYPE" == "wayland". It was intended for https://bugzilla.redhat.com/show_bug.cgi?id=1922608 but given Kwin/Wayland state I'm going to remove it anyway so we'll use Wayland for Gnome only for now.
I.e. MOZ_ENABLE_WAYLAND=1 will also get set unconditionally, without e.g. checking for $XDG_SESSION_TYPE. That will probably also break hard soon then. I suppose it would make sense to check in firefox IsWaylandDisabled() for $XDG_SESSION_TYPE and, if set, only enable it if its value is wayland.
By the way: we should IMO rename IsWaylandDisabled() to IsWaylandEnabled() - inverted logic is usually not a good idea.
Yes, we can rename IsWaylandDisabled() to IsWaylandEnabled() and also check Wayland availability in IsWaylandEnabled() to make sure that when IsWaylandEnabled() returns true we're really using Wayland.
Comment 38•4 years ago (Assignee)
Martin, following up on this: until we have a check in FF I think the Fedora launch script should get adapted from
if [ "$XDG_CURRENT_DESKTOP" == "GNOME" ]; then
  export MOZ_ENABLE_WAYLAND=1
fi
to something like
if [ "$XDG_CURRENT_DESKTOP" == "GNOME" ] && [ "$XDG_SESSION_TYPE" == "wayland" ]; then
  export MOZ_ENABLE_WAYLAND=1
fi
Otherwise glxtest will fail in X11 sessions once this patch lands. And even without this patch there are likely some subtle differences that could have negative effects such as in https://searchfox.org/mozilla-central/source/dom/ipc/BrowserChild.cpp#2174
Comment 40•4 years ago
I'll update the launch script anyway, Thanks.