Closed Bug 1725148 Opened 4 years ago Closed 4 years ago

WebGL no longer available on Linux

Categories

(Core :: Security: Process Sandboxing, defect)

Firefox 91
x86_64
Linux
defect

Tracking

RESOLVED FIXED
95 Branch
Tracking Status
firefox-esr78 --- unaffected
firefox-esr91 --- fixed
firefox92 --- wontfix
firefox93 --- wontfix
firefox94 --- fixed
firefox95 --- fixed

People

(Reporter: mark-mozilla, Assigned: jld)

References

(Regression)

Details

(Keywords: regression)

Attachments

(7 files)

Attached file ldd-libraries.txt

Regression from Firefox 90.0.2 -> 91.0
WebGL no longer working
Linux, Slackware (-current)

Using Mozilla build (not distribution package):
https://download.mozilla.org/?product=firefox-latest-ssl&os=linux64&lang=en-GB

Run Firefox and create a new profile:
$ /opt/mozilla/firefox-91.0/firefox -ProfileManager

Visit a WebGL page, e.g.:
https://webglsamples.org/blob/blob.html

Error returned:
Status: WebGL creation failed: * tryNativeGL () * Exhausted GL driver options. (FEATURE_FAILURE_WEBGL_EXHAUSTED_DRIVERS)

Console prints this text at the time of page load:
No protocol specified
No protocol specified

Also affects Google JamBoard, which reports no WebGL.

All system libraries seem to be present (see ldd-libraries.txt)

Both JamBoard and the test case above continue to work on this system on Firefox 90.0.2.

Attached file xdpyinfo.txt
Attached file glxinfo.txt

The Bugbug bot thinks this bug should belong to the 'Core::Canvas: WebGL' component, and is moving the bug to that component. Please revert this change in case you think the bot is wrong.

Component: Untriaged → Canvas: WebGL
Product: Firefox → Core

I also have this, running on Debian on ChromeOS. I noticed that in about:support, the desktop environment is unknown. Is it something to do with that?

Same problem here with Firefox 91 (and latest Firefox 92):

Operating System: openSUSE Tumbleweed 20210910
KDE Plasma Version: 5.22.5
KDE Frameworks Version: 5.85.0
Qt Version: 5.15.2
Kernel Version: 5.14.1-1-default (64-bit)
Graphics Platform: X11
Processors: 4 × Intel® Core™ i5-6200U CPU @ 2.30GHz
Memory: 7.6 GiB of RAM
Graphics Processor: Mesa Intel® HD Graphics 520

glxinfo -B
No protocol specified
name of display: :0
display: :0 screen: 0
direct rendering: Yes
Extended renderer info (GLX_MESA_query_renderer):
Vendor: Intel (0x8086)
Device: Mesa Intel(R) HD Graphics 520 (SKL GT2) (0x1916)
Version: 21.2.1
Accelerated: yes
Video memory: 3072MB
Unified memory: yes
Preferred profile: core (0x1)
Max core profile version: 4.6
Max compat profile version: 4.6
Max GLES1 profile version: 1.1
Max GLES[23] profile version: 3.2

I've found I can set MOZ_ENABLE_WAYLAND=1 and launch firefox with that. It then uses wayland instead of xwayland, and webgl is now working. So it seems like this sidesteps the issue on my system.

(In reply to ra_hardy from comment #6)

> I've found I can set MOZ_ENABLE_WAYLAND=1 and launch firefox with that. It then uses wayland instead of xwayland, and webgl is now working. So it seems like this sidesteps the issue on my system.

Since I am not using Wayland on my desktop, this predictably had no effect for me. Just to confirm that this issue was introduced in version 91, I downloaded the vanilla Firefox 90 and 92 builds from the Mozilla website and tried them independently of my installation. The result is the same: everything works with 90 but breaks with 92 (as it did with 91). Starting Firefox with the debug argument (-d gdb) didn't give any additional info. The only error I can see anywhere is in the web console itself, and it is the one stated in the original report above:

Status: WebGL creation failed: * tryNativeGL () * Exhausted GL driver options. (FEATURE_FAILURE_WEBGL_EXHAUSTED_DRIVERS)

So apparently Firefox 91 decided it doesn't like my GL driver anymore for some reason (Mesa? or Intel HD?, unclear ...)

Component: Canvas: WebGL → Widget: Gtk

Can you run pip install --user mozregression, then mozregression --good 90 --bad 91 and see what broke it? Thanks!

Flags: needinfo?(mark-mozilla)

Err, meant to ni? about comment 7

Flags: needinfo?(manfred.kitzbichler)

Everyone who is still affected: Please open about:support, click on "Copy text to clipboard" and paste it here. Thanks!

As I have already mentioned at https://support.mozilla.org/en-US/questions/1347475#answer-1448448 I found that setting webgl.out-of-process to true makes WebGL work but has the rather annoying side effect that the widevine plugin crashes every time, so video streaming has become impossible.

(In reply to manfred.kitzbichler from comment #12)

> As I have already mentioned at https://support.mozilla.org/en-US/questions/1347475#answer-1448448 I found that setting webgl.out-of-process to true makes WebGL work but has the rather annoying side effect that the widevine plugin crashes every time, so video streaming has become impossible.

Actually, I had disabled webgl.out-of-process beforehand to make the about:support information meaningful, and I noticed that Widevine is still crashing, so presumably the two are not connected after all. I think sandboxing might be the problem with Widevine, but that's for another bug report.

(In reply to manfred.kitzbichler from comment 11)

This landed in 91:

c49de061c1fab19f934574e959938c5500d1d080 Jed Davis — Bug 1635451 - Minimize content processes' connections to the X server. r=jgilbert,stransky,nika
c1bd0996764c49174c9169fe5550905ce5dcef88 Jed Davis — Bug 1635451 - Allow GLX to work in headless content processes. r=jgilbert
2974b1ba2beb67be93502376968ff462921d0d5e Jed Davis — Bug 1635451 - Attempt to start WebGL even in headless mode. r=jgilbert

  1. Please open about:config, set gfx.x11-egl.force-enabled to true and restart Firefox.
    Does https://webglsamples.org/aquarium/aquarium.html work with this setting?

  2. Please set gfx.x11-egl.force-enabled back to false, set dom.ipc.avoid-gtk to false, restart Firefox and test again. Does it work now?

Can't try this just now but I did try forcing EGL on before and this only changed the error from tryNativeGL() to FEATURE_FAILURE_NO_DISPLAY instead of FEATURE_FAILURE_WEBGL_EXHAUSTED_DRIVERS if that makes sense.
Will try dom.ipc.avoid-gtk later.

(In reply to manfred.kitzbichler from comment #15)

> Can't try this just now but I did try forcing EGL on before and this only changed the error from tryNativeGL() to FEATURE_FAILURE_NO_DISPLAY instead of FEATURE_FAILURE_WEBGL_EXHAUSTED_DRIVERS if that makes sense.

Firefox Snap gets that EGL error as well because the sandbox blocks EGL in Snap, while GLX works fine there (bug 1732580).
In your case, both EGL and GLX seem to be blocked. The openSUSE packagers might apply their own modifications to Firefox.

> Will try dom.ipc.avoid-gtk later.

Thanks!

Well, I'll be damned. Setting dom.ipc.avoid-gtk=false did the trick!

WebGL works again without enabling webgl.out-of-process. What has GTK got to do with WebGL though?

(In reply to manfred.kitzbichler from comment #18)

> Well, I'll be damned. Setting dom.ipc.avoid-gtk=false did the trick!
>
> WebGL works again without enabling webgl.out-of-process. What has GTK got to do with WebGL though?

It's about avoiding too many X server connections because you get a crash if you have too many.

If you can, please also test if the problem still occurs with latest Nightly.
There are two ways:
a) Download Nightly from https://nightly.mozilla.org, extract it in your Downloads folder and run the firefox binary. It will create its own Firefox profile and won't touch your existing profile.
b) Run this in your terminal to launch a temporary throwaway Nightly:

$ pip3 install --user mozregression
$ mozregression --launch 2021-10-07 -a https://webglsamples.org/aquarium/aquarium.html

Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: regression
Regressed by: semi-headless

Sorry to disappoint, but I tried the pip install and mozregression commands to run the latest nightly and it gave me the familiar:

It does not appear your computer supports WebGL.
Click here for more information.

Status: WebGL creation failed: * tryNativeGL (FEATURE_FAILURE_NO_DISPLAY) * Exhausted GL driver options. (FEATURE_FAILURE_WEBGL_EXHAUSTED_DRIVERS)

Thanks!

Has Regression Range: --- → yes
Flags: needinfo?(mark-mozilla)
Flags: needinfo?(jld)

(In reply to Darkspirit from comment #30)

> (In reply to manfred.kitzbichler from comment #28)
> Ok, yes, the Widevine crash seems only to occur with downstream builds made by alpha distributions:
>
> (Jed Davis [:jld] from comment 12)
>
> If anyone needs a workaround, Mozilla's builds of Firefox should still work.

Ok, so I guess for the time being I'll run my default (openSUSE) Firefox with MOZ_DISABLE_GMP_SANDBOX and dom.ipc.avoid-gtk=false, which solves the two problems I am having. Once v95 is in the repository I can drop MOZ_DISABLE_GMP_SANDBOX and be happy.
In the meantime, is there actually a setting in about:config that I could use instead of having to set the environment variable MOZ_DISABLE_GMP_SANDBOX? I suppose the sandbox level could do the trick, but it seems to affect security more than necessary.

I think I know what's going on here. To those who've experienced this, is the XAUTHORITY environment variable set to anything? And does env XAUTHORITY=$HOME/.Xauthority firefox work?

I can reproduce this by unsetting XAUTHORITY on my regular desktop. I also have a ChromeOS device but it's arm64, so I've confirmed that it doesn't set XAUTHORITY in Crostini but I don't currently have a new enough Firefox build to test there.

If that env var isn't set, Xorg will fall back to $HOME/.Xauthority, but the code that sets up the content process sandbox policy doesn't, because I didn't realize that libXau had that feature when I wrote that code. This should be relatively simple to fix.
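
A minimal sketch of that lookup order, assuming only what is described above: an explicitly set XAUTHORITY wins, otherwise the default is $HOME/.Xauthority. The function name xauthority_path is hypothetical, and this is an illustration rather than the libXau or Firefox code.

```python
import os
import posixpath
from typing import Mapping, Optional


def xauthority_path(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the Xauthority file a client would look for (illustrative only)."""
    # An explicitly set XAUTHORITY takes precedence.
    explicit = env.get("XAUTHORITY")
    if explicit is not None:
        return explicit
    # Otherwise fall back to the default location under $HOME.
    home = env.get("HOME")
    if home is None:
        return None  # no home directory, so there is nothing to fall back to
    return posixpath.join(home, ".Xauthority")
```

Calling xauthority_path() with no arguments reports the file for the current environment, which is a quick way to check which of the two cases a given desktop session is in.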

Assignee: nobody → jld
Flags: needinfo?(jld)

(In reply to Jed Davis [:jld] ⟨⏰|UTC-6⟩ ⟦he/him⟧ from comment #32)

> I think I know what's going on here. To those who've experienced this, is the XAUTHORITY environment variable set to anything? And does env XAUTHORITY=$HOME/.Xauthority firefox work?
>
> I can reproduce this by unsetting XAUTHORITY on my regular desktop. I also have a ChromeOS device but it's arm64, so I've confirmed that it doesn't set XAUTHORITY in Crostini but I don't currently have a new enough Firefox build to test there.
>
> If that env var isn't set, Xorg will fall back to $HOME/.Xauthority, but the code that sets up the content process sandbox policy doesn't, because I didn't realize that libXau had that feature when I wrote that code. This should be relatively simple to fix.

I am afraid starting with "XAUTHORITY=$HOME/.Xauthority firefox" didn't help. Widevine crashes and I am getting the same error messages in the terminal as above, i.e. lots of these:

Sandbox: attempt to open unexpected file /usr/lib64/x86_64/libdl.so.2
Sandbox: seccomp sandbox violation: pid 21703, tid 21703, syscall 262, args 4294967196 140732782308944 140732782309120 0 4294967295 140732782308944.

However, I did assume that this bug has been fixed already in the latest nightly by adding some more tests for library paths. I have to test this with the version that comes with my distribution though, once it has come through to the Tumbleweed repository. The Mozilla compiled version somehow always works (libraries compiled in I suppose).

(In reply to manfred.kitzbichler from comment #33)

Comment 32 was not a reply to comment 31. This bug is about WebGL: does "XAUTHORITY=$HOME/.Xauthority firefox" fix WebGL for you?

(If you want to have Widevine fixed earlier, you could ask openSUSE to backport bug 1725828; otherwise you could use https://beta.mozilla.org or https://nightly.mozilla.org. Once you have used your Firefox profile with a newer version, you usually can't go back to an older one, but you could switch back from Beta 94 to Stable 94 if you really wanted.)

Yes, that you cannot go back to an older version of Firefox and keep your profile is quite annoying. Frankly, that's why I am here, because otherwise I'd just have gone back to v90 and waited until all these issues sort themselves out.

Anyway, I enabled dom.ipc.avoid-gtk again, restarted, had the usual WebGL failure, closed, restarted with "XAUTHORITY=$HOME/.Xauthority firefox" and lo and behold WebGL works again!

So, do we want to "avoid GTK", and is this the best solution?

(In reply to Jed Davis [:jld] ⟨⏰|UTC-6⟩ ⟦he/him⟧ from comment #32)

> I think I know what's going on here. To those who've experienced this, is the XAUTHORITY environment variable set to anything? And does env XAUTHORITY=$HOME/.Xauthority firefox work?

On my (Intel) Chromebook, XAUTHORITY is not set. And yes, after reverting the avoid-gtk setting to its original value, setting XAUTHORITY and launching Firefox fixes the WebGL issue.

Has STR: --- → yes
Component: Widget: Gtk → Security: Process Sandboxing

If the XAUTHORITY env var is unset, libXau will fall back to
$HOME/.Xauthority, but our content sandbox policy didn't handle that
case when it needs to allow access to that file; this patch corrects
that oversight.

This broke WebGL as of bug 1635451, because we no longer eagerly connect
to the X server before sandbox startup, only as needed for WebGL.

Usually the XAUTHORITY env var is set even if the file is in its
default location, but some environments (including but not limited to
the Linux VMs on Chrome OS) do not set it.
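
The difference the patch makes can be illustrated with a toy model. This is only a sketch under the assumptions stated in the description above (the old policy considered just the XAUTHORITY variable, the fix also covers the default location); the function names are hypothetical and none of this is the actual sandbox code.

```python
import posixpath
from typing import Mapping, Set


def xauth_read_rules_before(env: Mapping[str, str]) -> Set[str]:
    """Toy model of the old behaviour: only an explicit XAUTHORITY is allowed."""
    path = env.get("XAUTHORITY")
    return {path} if path is not None else set()


def xauth_read_rules_after(env: Mapping[str, str]) -> Set[str]:
    """Toy model of the fix: also allow the default $HOME/.Xauthority."""
    path = env.get("XAUTHORITY")
    if path is None and "HOME" in env:
        # Same fallback the X client library applies when the variable is unset.
        path = posixpath.join(env["HOME"], ".Xauthority")
    return {path} if path is not None else set()
```

With XAUTHORITY unset, the first function returns an empty set, so a later attempt to read the default file would be refused, while the second returns the default path. With XAUTHORITY set, both return the same single path, which matches the claim that nothing changes outside the broken case.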

Pushed by jedavis@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/c797311411f7 Fix how we find the Xauthority file for sandbox policies. r=gcp
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 95 Branch

About the pref name: it was originally dom.ipc.semi-headless, which was confusing even to Gecko developers, then avoid-x11, then avoid-gtk when I realized we might want it for Wayland someday (no hard limit like Xorg, but the compositor can run out of fds). Also, GTK has some other unwanted side effects like consuming a noticeable amount of memory and CPU time to initialize.

I'm assuming we'll want to backport this to Beta for Fx94 and to ESR91, but it would be great if we could get someone to verify on a Nightly build that this is working for them now.

If it's of any help, I tested this with mozregression --launch 2021-10-15 -a https://webglsamples.org/aquarium/aquarium.html and it seemed to be working.

The patch landed in nightly and beta is affected.
:jld, is this bug important enough to require an uplift?
If not please set status_beta to wontfix.

For more information, please visit auto_nag documentation.

Flags: needinfo?(jld)

Comment on attachment 9245032 [details]
Bug 1725148 - Fix how we find the Xauthority file for sandbox policies.

ESR Uplift Approval Request

  • If this is not a sec:{high,crit} bug, please state case for ESR consideration: Regression (90 to 91) in an important feature in some cases, and low risk.
  • User impact if declined: WebGL is broken in some configurations, including the Linux environment on Chrome OS; this is a regression in 91.
  • Fix Landed on Version: 95
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): This patch doesn't change anything except in the case that's currently broken, and it just adds a rule to the sandbox policy to allow reading one specific file, so it shouldn't have side-effects outside of WebGL.
  • String or UUID changes made by this patch: none

Beta/Release Uplift Approval Request

  • User impact if declined: WebGL is broken in some configurations, including the Linux environment on Chrome OS; this is a regression in 91.
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: Yes
  • If yes, steps to reproduce: On most Linux distributions: env -u XAUTHORITY ./firefox https://get.webgl.org (assuming you're in the directory with the unpacked Firefox; adjust the path as needed, substitute ./mach run if it's a local build, etc.)

Expected results: without the patch the page loads but it shows an error message about WebGL being unavailable; with the patch there is a spinning cube.

  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): This patch doesn't change anything except in the case that's currently broken, and it just adds a rule to the sandbox policy to allow reading one specific file, so it shouldn't have side-effects outside of WebGL.
  • String changes made/needed: none
Flags: needinfo?(jld)
Attachment #9245032 - Flags: approval-mozilla-esr91?
Attachment #9245032 - Flags: approval-mozilla-beta?
Flags: qe-verify+
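
The reproduction step in the request above (env -u XAUTHORITY ./firefox https://get.webgl.org) can also be scripted. A minimal Python sketch, assuming a ./firefox binary in the current directory as in those steps; the helper names env_without_xauthority and launch_without_xauthority are hypothetical.

```python
import os
import subprocess
from typing import Dict, List, Mapping


def env_without_xauthority(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return a copy of env with XAUTHORITY removed, like `env -u XAUTHORITY`."""
    return {key: value for key, value in env.items() if key != "XAUTHORITY"}


def launch_without_xauthority(argv: List[str]) -> subprocess.Popen:
    """Start the given command with XAUTHORITY unset in its environment."""
    return subprocess.Popen(argv, env=env_without_xauthority())
```

For example, launch_without_xauthority(["./firefox", "https://get.webgl.org"]) should start Firefox the same way as the env -u command; adjust the path as needed.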

Comment on attachment 9245032 [details]
Bug 1725148 - Fix how we find the Xauthority file for sandbox policies.

Approved for 94.0b9 and 91.3esr.

Attachment #9245032 - Flags: approval-mozilla-esr91?
Attachment #9245032 - Flags: approval-mozilla-esr91+
Attachment #9245032 - Flags: approval-mozilla-beta?
Attachment #9245032 - Flags: approval-mozilla-beta+
QA Whiteboard: [qa-triaged]

I can still reproduce the issue using the steps from comment 0 and Firefox 94.0b9, latest Nightly 95.0a1 and Firefox 91.3 esr on Ubuntu 18.04 x64.

Should I file another bug or can you reopen this one?

Flags: needinfo?(jld)

(In reply to Oana Botisan, Desktop Release QA from comment #48)

> I can still reproduce the issue using the steps from comment 0 and Firefox 94.0b9, latest Nightly 95.0a1 and Firefox 91.3 esr on Ubuntu 18.04 x64.

I looked into this and I believe Ubuntu 18.04 is unaffected: its default display manager doesn't create the .Xauthority file at all, and uses si:localuser to grant access by uid instead. Specifically, I can't reproduce this bug (using my STR from comment #44) on that OS even using 93.0, which is supposed to be affected/wontfix. Can you give some more detail about how you reproduced it?

Flags: needinfo?(jld) → needinfo?(oana.botisan)

The steps I used:

  1. Open Firefox with a new profile using the "-ProfileManager" command-line option.
  2. Visit a WebGL page, e.g. https://webglsamples.org/blob/blob.html

However, I retested everything and can no longer reproduce the issue either.

Flags: needinfo?(oana.botisan)