Closed Bug 1455498 Opened 6 years ago Closed 4 years ago

WebGL doesn't work on Linux if drivers are loaded through LD_LIBRARY_PATH

Categories

(Core :: Security: Process Sandboxing, defect, P3)

59 Branch
x86_64
Linux
defect

Tracking

()

RESOLVED FIXED
mozilla78
Tracking Status
firefox-esr68 --- wontfix
firefox75 --- wontfix
firefox76 --- wontfix
firefox77 --- wontfix
firefox78 --- fixed

People

(Reporter: ian, Assigned: gcp, NeedInfo)

Details

(Keywords: regression, Whiteboard: sb+)

Attachments

(3 files)

Attached file WebGL Report.pdf
User Agent: Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:59.0) Gecko/20100101 Firefox/59.0
Build ID: 20180327091415

Steps to reproduce:

Visit http://webglreport.com


Actual results:

Reports This browser supports WebGL 2, but it is disabled or unavailable.
or This browser supports WebGL 1, but it is disabled or unavailable.


Expected results:

Visiting http://webglreport.com with Chrome reports This browser supports WebGL 2.
(Screenshot Attached)

Graphics section of about:config attached.
WebGL has worked in firefox on this machine in the past, but I am uncertain exactly when it stopped working. For certain, v58 also didn't work on the otherwise identical system.
Can you reproduce the issue in a brand new profile?
https://support.mozilla.org/kb/profile-manager-create-and-remove-firefox-profiles

If yes, could you try to find the regression range? Bug 1292697 (Intel GPU) says the release version still works, which at the time was 48. So you could try
mozregression --good 48 --bad 58
https://mozilla.github.io/mozregression/documentation/usage.html
Has Regression Range: --- → irrelevant
Has STR: --- → yes
Component: Untriaged → Canvas: WebGL
Flags: needinfo?(ian)
OS: Unspecified → Linux
Product: Firefox → Core
Hardware: Unspecified → x86_64
Attachment #8969525 - Attachment description: Graphics section of about:config → Graphics section of about:support
Yes, the problem exists with a brand new profile.

I am having trouble doing the regression test. I tried to install mozregression and there is a mismatch between libssl and libcrypt versions (have 1.1.1, require 1.0.0). I tried to work around but no go. I tried building mozregression from source and ran into a different set of problems.

Fixing this has to be harder than doing a binary search using manually downloaded firefox versions.

I'll let you know how I go.
Flags: needinfo?(ian)
Version 56.0.2 is the latest version which works. Version 57.0 fails.
I'm sorry to hear you're having trouble with mozregression. I'm afraid 56.0.2 to 57.0 is quite a large regression range. If build IDs aren't platform-specific, then this should be the push log, which contains 46 instances of "WebGL". So that's not terribly specific.
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=51ffb9283f0c7c00e08eb8c39b33fbee218c370d&tochange=47f7b6c64265bc7bdd22eef7ab71abc97cf3f8bf

Looking for resolved WebGL Linux bugs between 2017-07-29 and 2017-09-23 yields only 3 results. The last affected Nvidia GPUs, so it seems like a good suspect. But on the other hand, the search might be overly restrictive and it might be none of these. 
Bug 1330433
Bug 1384718
Bug 1385715

If you could do a manual bisection, that would help. Once you've found the last good build and the first bad build, enter about:buildconfig into the location bar, paste the build IDs into a pushlog link like the one above, then post it in a comment here. Starting points:

Last 56.0a1 nightly build
https://ftp.mozilla.org/pub/firefox/nightly/2017/08/2017-08-01-10-03-11-mozilla-central/

Last 57.0a1 nightly build
https://ftp.mozilla.org/pub/firefox/nightly/2017/09/2017-09-21-10-01-41-mozilla-central/
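The manual bisection suggested above is an ordinary binary search over nightly build dates. A minimal sketch, assuming a hypothetical `is_good` callback that stands for "download that day's nightly, open webglreport.com, and report whether WebGL works" (the function name and the callback are illustrative only, not part of mozregression):

```python
from datetime import date, timedelta
from typing import Callable, Tuple


def bisect_nightlies(good: date, bad: date,
                     is_good: Callable[[date], bool]) -> Tuple[date, date]:
    """Narrow a (good, bad) pair of nightly dates down to adjacent days.

    `is_good` is a hypothetical callback: it returns True when the nightly
    built on that day still works, False when it shows the bug.
    """
    if good >= bad:
        raise ValueError("the good build must predate the bad build")
    while (bad - good).days > 1:
        # Test the build halfway between the known-good and known-bad days.
        mid = good + timedelta(days=(bad - good).days // 2)
        if is_good(mid):
            good = mid
        else:
            bad = mid
    return good, bad
```

Each round halves the range, so a span of a few months needs only about seven manual downloads.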
Has Regression Range: irrelevant → no
OK, narrowed it down to From Build ID 20170724100304 to Build ID 20170725144053

This would be a pushlog of https://hg.mozilla.org/mozilla-central/pushloghtml?startdate=20170724100304&enddate=20170725144053
but it doesn't make sense. That link has "Changes pushed after 2018-04-22 00:10:03, before 2018-04-22 10:25:53", which doesn't seem to be the correct date range which should be 24 July 2017 to 25 July 2017.
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=5928d905c0bc0b28f5488b236444c7d7991cf8d4&tochange=07484bfdb96bc7297c404e377eea93f1d8ca4442

No matches for WebGL in the summary of any bugs in that range. Bug 1382185 has 2 patches having to do with WebGL. :nical, do you think it could be the cause? If not, any hints on where to look?

(In reply to Ian Dall from comment #7)
> OK, narrowed it down to From Build ID 20170724100304 to Build ID 20170725144053

Thank you for the update.

> This would be a pushlog of

The date-based pushlog link shows no results. It works if you use the changeset-based one. Like I said in comment 6, you can get the changeset ID (which is what I meant rather than build ID) from about:buildconfig rather than about:support. They're also listed in TXT files in the same folder where you download the Nightly build in question.
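For reference, the changeset-based link can be assembled mechanically from the two changeset IDs. A small sketch, assuming only the `fromchange` and `tochange` query parameters already visible in the links in this bug (the helper name is made up for illustration):

```python
from urllib.parse import urlencode

# Base URL taken from the pushlog links posted in this bug.
PUSHLOG = "https://hg.mozilla.org/mozilla-central/pushloghtml"


def pushlog_url(last_good: str, first_bad: str) -> str:
    """Build a changeset-based pushlog link from two changeset hashes."""
    query = urlencode({"fromchange": last_good, "tochange": first_bad})
    return PUSHLOG + "?" + query
```

Passing the two hashes read from about:buildconfig for the last good and first bad builds should yield a link of the same shape as the working one above.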
Has Regression Range: no → yes
Flags: needinfo?(nical.bugzilla)
I noticed I'm in CC here, so I guess it may be worth a try: go to about:config, set security.sandbox.content.level to 0, and restart the browser. (If this doesn't help, don't forget to change it back.)

(That said, the regression range had no sandboxing related changes)
Brilliant! Setting security.sandbox.content.level to 0 works. What exactly does security.sandbox.content.level 3 prevent?
In fact lowering security.sandbox.content.level to 2 works OK. So what is the difference between level 2 and level 3?
Got it. I have my nvidia libraries in a non-standard place (so I can use nvidia proprietary drivers and non nvidia graphics drivers with the same system image) with LD_LIBRARY_PATH set appropriately. Setting security.sandbox.content.read_path_whitelist to '/opt/nvidia/' and security.sandbox.content.level to 3 works.

Thanks for your help. I guess this is not a bug, though would it be possible to automatically allow reads from the paths in LD_LIBRARY_PATH?
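Until something like that exists, the pref value for the workaround can be derived from the environment by hand. A rough sketch, assuming the pref accepts a comma-separated list and that a trailing slash marks a directory, as in the '/opt/nvidia/' value above (both are assumptions about the pref format, and the function name is illustrative):

```python
def whitelist_from_ld_library_path(value: str) -> str:
    """Turn a colon-separated LD_LIBRARY_PATH into a candidate pref value.

    Assumption: security.sandbox.content.read_path_whitelist takes
    comma-separated paths, with a trailing slash on directories.
    Empty entries and duplicates are dropped.
    """
    dirs = []
    for entry in value.split(":"):
        if not entry:
            continue
        # Normalise to exactly one trailing slash.
        path = entry.rstrip("/") + "/"
        if path not in dirs:
            dirs.append(path)
    return ",".join(dirs)
```

The result would be pasted into security.sandbox.content.read_path_whitelist in about:config, for example after calling the function with `os.environ.get("LD_LIBRARY_PATH", "")`.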
Summary: webgl doesn't work on linux → WebGL doesn't work on Linux if drivers are loaded through LD_LIBRARY_PATH
Whiteboard: sb?
We try to find all relevant libraries by parsing /etc/ld.so.conf and friends, but didn't consider LD_LIBRARY_PATH overrides. I'm perhaps slightly surprised that works, but if it does, I think we can support it.
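To illustrate the idea (this is not the Firefox implementation, which lives in the sandbox broker code), here is a simplified sketch that collects directories from an ld.so.conf-style file and puts the LD_LIBRARY_PATH entries in front. It assumes one directory per line, `#` comments, and `include <glob>` lines, and ignores every other directive; all function names are hypothetical:

```python
import glob
import os
from typing import List, Optional, Set


def ld_so_conf_dirs(conf: str = "/etc/ld.so.conf",
                    _seen: Optional[Set[str]] = None) -> List[str]:
    """Collect library directories from an ld.so.conf-style file.

    Simplified: handles '#' comments and 'include <glob>' lines only.
    Relative include patterns are resolved against the including file.
    """
    seen = _seen if _seen is not None else set()
    real = os.path.realpath(conf)
    if real in seen:  # guard against include cycles
        return []
    seen.add(real)
    try:
        with open(conf, encoding="utf-8", errors="replace") as fh:
            lines = fh.readlines()
    except OSError:
        return []
    dirs: List[str] = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split(None, 1)
        if parts[0] == "include" and len(parts) == 2:
            pattern = parts[1]
            if not os.path.isabs(pattern):
                pattern = os.path.join(os.path.dirname(conf), pattern)
            for sub in sorted(glob.glob(pattern)):
                dirs.extend(ld_so_conf_dirs(sub, seen))
        else:
            dirs.append(line)
    return dirs


def library_dirs(env_value: str, conf: str = "/etc/ld.so.conf") -> List[str]:
    """LD_LIBRARY_PATH entries first, then ld.so.conf entries, de-duplicated."""
    out: List[str] = []
    for d in [e for e in env_value.split(":") if e] + ld_so_conf_dirs(conf):
        if d not in out:
            out.append(d)
    return out
```

Putting the environment entries first mirrors the fact that LD_LIBRARY_PATH overrides the system configuration, although for a read whitelist the order should not matter.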
Flags: needinfo?(nical.bugzilla)
Component: Canvas: WebGL → Security: Process Sandboxing
Assignee: nobody → gpascutto
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Priority: -- → P1
Whiteboard: sb? → sb+
Moving to p3 because no activity for at least 24 weeks.
Priority: P1 → P3

Um, I just ran into this again (because my config was reset). I had assumed this was fixed based on comment #13 but it is still there in 66.02

See Also: → 1608558

> I had assumed this was fixed based on comment #13 but it is still there in 66.02

That comment meant to imply that it's possible to do the work to make this work automatically, not that it was already done (this bug would've been RESOLVED FIXED if so). I didn't see any other report aside from yours, so more urgent bugs took priority.

It looks like fixing this would clean up bug 1608558 though.

It took a long time to find that this was the problem. Several websites just crashed the tab for me, and I kept sending crash reports through about:crashes (I think this is the relevant one: https://crash-stats.mozilla.org/signature/?product=Firefox&signature=swrast_dri.so%400x9a645a).

The fix in Comment #11 fixed it for me too. One thing to note is that when that setting is on 3 or 4, the webglreport site reports vmware as the vendor/renderer. When on 2, it returns the correct values of X.Org and AMD Turks. Since I didn't have this problem before and I installed VMware some time after, I think it might have something to do with it.

Pushed by gpascutto@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/4311edb105ef
Whitelist directories passed in LD_LIBRARY_PATH. r=jld
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla78

The patch landed in nightly and beta is affected.
:gcp, is this bug important enough to require an uplift?
If not please set status_beta to wontfix.

For more information, please visit auto_nag documentation.

Flags: needinfo?(gpascutto)

While this is reasonably low risk, we've lived with this bug as long as we've had sandboxing, so let's just let this ride the trains. (And there are good workarounds by setting the whitelist pref, too...)

Flags: needinfo?(gpascutto)
Flags: qe-verify+

I tried to reproduce this issue using an old build on an Ubuntu 18.04 machine I have here, but without success, so I can't verify that this is fixed. chriseilander, would you mind helping with a quick verification on the Firefox 78.0b8 beta build? You can download it from here: https://www.mozilla.org/en-US/firefox/channel/desktop/

Flags: qe-verify+ → needinfo?(chriseilander)

Unfortunately I updated to Kubuntu 20.04, and now even when I set security.sandbox.content.level back to 4 and use an older version of Firefox, I am unable to recreate the bug. If I can find the time this weekend, I will try some other things to recreate it and try to verify whether the 78.0b8 beta has it fixed.
