Closed Bug 1434711 Opened 6 years ago Closed 6 years ago

WebGL causes a crash with the AMDGPU-PRO video driver

Categories

(Core :: Security: Process Sandboxing, defect, P1)

58 Branch
x86_64
Linux
defect

Tracking

RESOLVED FIXED
mozilla61
Tracking Status
firefox59 + wontfix
firefox60 --- fixed
firefox61 --- fixed

People

(Reporter: emmeci, Assigned: gcp)

References

Details

(Whiteboard: gfx-noted)

Crash Data

Attachments

(1 file)

User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:58.0) Gecko/20100101 Firefox/58.0
Build ID: 20180131010234

Steps to reproduce:

On some websites (Facebook with the chat window popup open, Google Maps, and some smaller sites) the tab crashes during load. This started right after the update to Firefox 58.

The driver installed for the GPU is AMDGPU-PRO 17.50-511655.


Actual results:


This issue generated a crash report, which you can find here (https://crash-stats.mozilla.com/report/index/33dda5a8-8ea5-468c-a15a-052b80180127). I tested with the browser in safe mode, with a new profile, and even with NoScript enabled to be sure it isn't a script on the web page, and the crash still happens.

Launching the browser from Bash and reproducing the crash gives this output:

ivan@yang:~$ firefox
MESA-LOADER: could not get parent device
/opt/amdgpu/share/libdrm/amdgpu.ids: Permesso negato [Permission denied]
[Parent 6363, Gecko_IOThread] WARNING: pipe error (174): Connessione interrotta dal corrispondente [Connection reset by peer]: file /build/firefox-ox8TnP/firefox-58.0.1+build1/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 353
[Parent 6363, Gecko_IOThread] WARNING: pipe error (47): Connessione interrotta dal corrispondente [Connection reset by peer]: file /build/firefox-ox8TnP/firefox-58.0.1+build1/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 353
[Parent 6363, Gecko_IOThread] WARNING: pipe error (46): Connessione interrotta dal corrispondente [Connection reset by peer]: file /build/firefox-ox8TnP/firefox-58.0.1+build1/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 353

###!!! [Parent][MessageChannel] Error: (msgtype=0x150084,name=PBrowser::Msg_Destroy) Channel error: cannot send/recv


###!!! [Parent][MessageChannel] Error: (msgtype=0x150084,name=PBrowser::Msg_Destroy) Channel error: cannot send/recv
Component: Untriaged → Graphics
Product: Firefox → Core
OS: Unspecified → Linux
Hardware: Unspecified → x86_64
Just an update: even http://www.poste.it, the website of Poste Italiane (the postal service for Italy, associated with the Universal Postal Union), which also offers banking services, has the same problem.

So I think it would be good to raise this bug's priority.
Severity: normal → major
Crash Signature: https://crash-stats.mozilla.com/report/index/33dda5a8-8ea5-468c-a15a-052b80180127
Crash Signature: https://crash-stats.mozilla.com/report/index/33dda5a8-8ea5-468c-a15a-052b80180127 → https://crash-stats.mozilla.com/signature/?product=Firefox&signature=amdgpu_dri.so%400x192904d
Crash Signature: https://crash-stats.mozilla.com/signature/?product=Firefox&signature=amdgpu_dri.so%400x192904d → amdgpu_dri.so%400x192904d
I am having the same problem going to various sites, specifically https://www.udemy.com/
Crash signature: https://crash-stats.mozilla.com/report/index/58265950-7016-4d43-994b-6dd960180208
Seeing this a lot too (e.g. https://crash-stats.mozilla.com/report/index/e3da8434-47e7-45b1-b5a3-0d2570180210).  Trivially reproducible when accessing https://www.reddit.com/ in private mode (but not in normal mode).  Additionally, if I have reddit.com open in non-private mode and then open it in private mode, both tabs crash, as did a tab I had open to crash-stats.mozilla.org.  Many other tabs were unaffected (so yay for process isolation?)
Also, for the record, still happening in 58.0.2 despite the temptingly labelled fix of "Blocklisted graphics drivers related to off main thread painting crashes"
I can confirm this issue. Crashing sites: www.bonprix.cz, www.irozhlas.cz, and www.reddit.com (in a private window).

crash report: https://crash-stats.mozilla.com/report/index/72fe20c8-4fb8-4564-9c71-8375c0180211
about:support https://hastebin.com/ojuwikidab.json

$ uname -roms
Linux 4.13.0-32-generic x86_64 GNU/Linux
$ lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 17.10
Release:	17.10
Codename:	artful
$ lspci -v
...
01:05.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] RS690M [Radeon Xpress 1200/1250/1270] (prog-if 00 [VGA controller])
	Subsystem: Hewlett-Packard Company RS690M [Radeon Xpress 1200/1250/1270]
	Flags: bus master, fast devsel, latency 64, IRQ 16, NUMA node 0
	Memory at c0000000 (64-bit, prefetchable) [size=128M]
	Memory at d0100000 (64-bit, non-prefetchable) [size=64K]
	I/O ports at 4000 [size=256]
	Memory at d0200000 (32-bit, non-prefetchable) [size=1M]
	[virtual] Expansion ROM at 000c0000 [disabled] [size=128K]
	Capabilities: <access denied>
	Kernel driver in use: radeon
	Kernel modules: radeon
...

Is there at least some temporary workaround? It's honestly a pretty major issue, since the problem happened on a PC that isn't mine.
I confirm the issue still persists on 58.0.2; the "Blocklisted graphics drivers related to off main thread painting crashes" and "Tab crash during printing" fixes are probably not related to this bug.
Severity: major → critical
This also happens on Ubuntu 16.04 x86_64 with Firefox Developer Edition 59.0b9 (64-bit). The page where I get this error is:

https://html5test.com

Latest crash report I have is:

https://crash-stats.mozilla.com/report/index/9134a977-a201-45d7-b1d4-9f8440180215
The following are the crash reports in this bug. Where a crash report has a valid Gecko stack, the stack contains gl::GLXLibrary::EnsureInitialized() and WebGLContext::CreateAndInitGLWith(). The crash seems to be related to WebGL initialization.

--------------------------------

https://crash-stats.mozilla.com/report/index/33dda5a8-8ea5-468c-a15a-052b80180127
https://crash-stats.mozilla.com/report/index/58265950-7016-4d43-994b-6dd960180208
https://crash-stats.mozilla.com/report/index/e3da8434-47e7-45b1-b5a3-0d2570180210
https://crash-stats.mozilla.com/report/index/72fe20c8-4fb8-4564-9c71-8375c0180211
https://crash-stats.mozilla.com/report/index/9134a977-a201-45d7-b1d4-9f8440180215
:jgilbert, can you comment on this bug?
Flags: needinfo?(jgilbert)
(In reply to code from comment #7)
> This also happens on Ubuntu 16.04 x86_64 with Firefox Developer Edition
> 59.0b9 (64-bit). The page where I get this error is:
> 
> https://html5test.com
> 
> Latest crash report I have is:
> 
> https://crash-stats.mozilla.com/report/index/9134a977-a201-45d7-b1d4-9f8440180215

Additional info I've gathered:

1) In safe mode the bug does not happen, but in normal mode it still happens even with all add-ons disabled.

2) I was able to work around it by following the advice in this thread: https://support.mozilla.org/eu/questions/1204598

Basically I had to disable WebGL in the advanced config and it worked again, but that means no WebGL support.
Yes, disabling WebGL resolves the issue, but I don't think the problem is in AMDGPU-PRO itself; I think it is in how Firefox interacts with the driver for WebGL. On Chromium I don't experience any crash.
Hi,

    This is happening to me too. Even if I install the latest amdgpu-pro driver, the problem still persists.

This does not happen with Google Chrome at all.

OS: Ubuntu 16.04
Firefox: 58.0.1

Thanks.
I could also reproduce this issue using Fx 59.0b11 on Ubuntu 16.04 LTS x64 with an AMD graphics card (Radeon (TM) RX 480 Graphics).
I can confirm it is a WebGL-related issue. Marking this bug accordingly!
Status: UNCONFIRMED → NEW
Ever confirmed: true
Hi,

    Any news about this bug? I switched from Chrome to Firefox when this new Firefox arrived, but now it is almost impossible to use (for example, Firefox crashes even with Facebook Chat and on many other websites).

Thanks.
Disable WebGL in about:config as a workaround.
Yeah, I know... but no WebGL support then... Thanks anyways :)
Assignee: nobody → jgilbert
Flags: needinfo?(jgilbert)
Priority: -- → P1
Whiteboard: gfx-noted
GLXLibrary.h:119 comes up at least twice where we do have crash stacks; that line is the glXQueryVersion call:
https://dxr.mozilla.org/mozilla-central/rev/6ff60a083701d08c52702daf50f28e8f46ae3a1c/gfx/gl/GLXLibrary.h#119
But glXQueryVersion has been around since v1.0, and GLLibraryLoader::LoadSymbols returns false if any symbol failed to load. This means we may be getting a non-null but invalid glXQueryVersion pointer.
(In reply to Cristian Comorasu [:CristiComo] from comment #13)
> I could also reproduce this issue using Fx 59.0b11 on Ubuntu 16.04 LTS x64
> with an AMD graphics card (Radeon (TM) RX 480 Graphics).
> I can confirm it is a WebGL-related issue. Marking this bug accordingly!

Can you give me an about:support for that machine? I have an R9 390 on Arch that's working fine.
Flags: needinfo?(cristian.comorasu)
I also have an RX 480. Here is my about:support:

NOTE: of course I've disabled "uBlock origin" (or any other extension) before reproducing the bug:


Application Basics
------------------

Name: Firefox
Version: 58.0.1
Build ID: 20180205104257
User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:58.0) Gecko/20100101 Firefox/58.0
OS: Linux 4.13.0-36-generic
Multiprocess Windows: 1/1 (Enabled by default)
Web Content Processes: 2/4
Stylo: content = true (enabled by default), chrome = false (disabled by default)
Google Key: Found
Mozilla Location Service Key: Missing
Safe Mode: false

Crash Reports for the Last 3 Days
---------------------------------

All Crash Reports

Firefox Features
----------------

Name: Activity Stream
Version: 2018.01.04.0062-4997c81d
ID: activity-stream@mozilla.org

Name: Application Update Service Helper
Version: 2.0
ID: aushelper@mozilla.org

Name: Firefox Screenshots
Version: 25.0.0
ID: screenshots@mozilla.org

Name: Follow-on Search Telemetry
Version: 0.9.6
ID: followonsearch@mozilla.com

Name: Form Autofill
Version: 1.0
ID: formautofill@mozilla.org

Name: Photon onboarding
Version: 1.0
ID: onboarding@mozilla.org

Name: Pocket
Version: 1.0.5
ID: firefox@getpocket.com

Name: Shield Recipe Client
Version: 76.1
ID: shield-recipe-client@mozilla.org

Name: Web Compat
Version: 1.1
ID: webcompat@mozilla.org

Extensions
----------

Name: uBlock Origin
Version: 1.15.10
Enabled: true
ID: uBlock0@raymondhill.net

Graphics
--------

Features
Compositing: Basic
Asynchronous Pan/Zoom: wheel input enabled; scrollbar drag enabled; keyboard enabled; autoscroll enabled
WebGL 1 Driver WSI Info: -
WebGL 1 Driver Renderer: WebGL is currently disabled.
WebGL 1 Driver Version: -
WebGL 1 Driver Extensions: -
WebGL 1 Extensions: -
WebGL 2 Driver WSI Info: -
WebGL 2 Driver Renderer: WebGL is currently disabled.
WebGL 2 Driver Version: -
WebGL 2 Driver Extensions: -
WebGL 2 Extensions: -
GPU #1
Active: Yes
Description: ATI Technologies Inc. -- AMD Radeon (TM) RX 480 Graphics
Vendor ID: ATI Technologies Inc.
Device ID: AMD Radeon (TM) RX 480 Graphics
Driver Version: 4.5.13505 Compatibility Profile Context 17.50.2.13

Diagnostics
AzureCanvasAccelerated: 0
AzureCanvasBackend: skia
AzureContentBackend: skia
AzureFallbackCanvasBackend: none
CairoUseXRender: 0
Decision Log
HW_COMPOSITING:
blocked by default: Acceleration blocked by platform
OPENGL_COMPOSITING:
unavailable by default: Hardware compositing is disabled
WEBRENDER:
opt-in by default: WebRender is an opt-in feature
unavailable by runtime: Build doesn't include WebRender
OMTP:
disabled by default: Disabled by default




Media
-----

Audio Backend: pulse
Max Channels: 2
Preferred Channel Layout: stereo
Preferred Sample Rate: 44100
Output Devices
Name: Group
HD-Audio Generic Estéreo analógico: /devices/pci0000:00/0000:00:08.1/0000:12:00.3/sound/card1
HDA ATI HDMI Digital Stereo (HDMI 4): /devices/pci0000:00/0000:00:03.1/0000:09:00.1/sound/card0
Input Devices
Name: Group
Monitor of HD-Audio Generic Estéreo analógico: /devices/pci0000:00/0000:00:08.1/0000:12:00.3/sound/card1
HD-Audio Generic Estéreo analógico: /devices/pci0000:00/0000:00:08.1/0000:12:00.3/sound/card1
Monitor of HDA ATI HDMI Digital Stereo (HDMI 4): /devices/pci0000:00/0000:00:03.1/0000:09:00.1/sound/card0

Important Modified Preferences
------------------------------

accessibility.typeaheadfind.flashBar: 0
browser.cache.disk.capacity: 358400
browser.cache.disk.filesystem_reported: 1
browser.cache.disk.smart_size.first_run: false
browser.cache.frecency_experiment: 2
browser.download.useDownloadDir: false
browser.places.smartBookmarksVersion: 8
browser.sessionstore.upgradeBackup.latestBuildID: 20180205104257
browser.startup.homepage: www.google.com.ar
browser.startup.homepage_override.buildID: 20180205104257
browser.startup.homepage_override.mstone: 58.0.1
browser.urlbar.timesBeforeHidingSuggestionsHint: 0
dom.push.userAgentID: 68e0ba0d75d644a3bc8ed7caae1918a1
extensions.lastAppVersion: 58.0.1
font.internaluseonly.changed: true
media.gmp-gmpopenh264.abi: x86_64-gcc3
media.gmp-gmpopenh264.lastUpdate: 1518532845
media.gmp-gmpopenh264.version: 1.7.1
media.gmp-manager.buildID: 20180205104257
media.gmp-manager.lastCheck: 1520857613
media.gmp.storage.version.observed: 1
media.webrtc.debug.log_file: /tmp/WebRTC.log
network.cookie.prefsMigrated: true
network.dns.disablePrefetch: true
network.http.speculative-parallel-limit: 0
network.predictor.cleaned-up: true
network.predictor.enabled: false
network.prefetch-next: false
places.database.lastMaintenance: 1520447837
places.history.expiration.transient_current_max_pages: 123155
plugin.disable_full_page_plugin_for_types: application/pdf
print.print_bgcolor: false
print.print_bgimages: false
print.print_duplex: 0
print.print_evenpages: true
print.print_in_color: true
print.print_margin_bottom: 0.5
print.print_margin_left: 0.5
print.print_margin_right: 0.5
print.print_margin_top: 0.5
print.print_oddpages: true
print.print_orientation: 0
print.print_page_delay: 50
print.print_paper_data: 0
print.print_paper_height: 11,69
print.print_paper_name: iso_a4
print.print_paper_size_unit: 0
print.print_paper_width: 8,27
print.print_scaling: 1,00
print.print_shrink_to_fit: true
print.print_to_file: false
print.print_unwriteable_margin_bottom: 56
print.print_unwriteable_margin_left: 25
print.print_unwriteable_margin_right: 25
print.print_unwriteable_margin_top: 25
services.sync.declinedEngines:
services.sync.engine.prefs.modified: false
services.sync.lastPing: 1520944956
services.sync.lastSync: Tue Mar 13 2018 20:45:11 GMT-0300 (-03)
storage.vacuum.last.index: 1
storage.vacuum.last.places.sqlite: 1518533567
webgl.disabled: true

Important Locked Preferences
----------------------------

Places Database
---------------

JavaScript
----------

Incremental GC: true

Accessibility
-------------

Activated: false
Prevent Accessibility: 0

Library Versions
----------------

NSPR
Expected minimum version: 4.17
Version in use: 4.17

NSS
Expected minimum version: 3.34.1
Version in use: 3.34.1

NSSSMIME
Expected minimum version: 3.34.1
Version in use: 3.34.1

NSSSSL
Expected minimum version: 3.34.1
Version in use: 3.34.1

NSSUTIL
Expected minimum version: 3.34.1
Version in use: 3.34.1

Experimental Features
---------------------

Sandbox
-------

Seccomp-BPF (System Call Filtering): true
Seccomp Thread Synchronization: true
User Namespaces: true
Content Process Sandboxing: true
Media Plugin Sandboxing: true
Content Process Sandbox Level: 3
Effective Content Process Sandbox Level: 3

Rejected System Calls
---------------------

Internationalization & Localization
-----------------------------------

Application Settings
Requested Locales: ["en-US"]
Available Locales: ["es-MX","es-ES","es-CL","es-AR","en-ZA","en-GB","en-US"]
App Locales: ["en-US","en-ZA","en-GB"]
Regional Preferences: ["en-US"]
Default Locale: "en-US"
Operating System
System Locales: ["en-US"]
Regional Preferences: ["en-US"]
Does WebGL also crash Nightly? Just try using about:support without disabling WebGL on Nightly.
Flags: needinfo?(gustavo.diaz)
This might indicate a sandboxing issue?:
> /opt/amdgpu/share/libdrm/amdgpu.ids: Permesso negato [Permission denied]

Try security.sandbox.content.level:0 and restart the browser.
Also it would be good to try about:support with WebGL not disabled, since that's not content process, and so shouldn't be sandboxed.
Hello!
My about:support can be found accessing the following link: https://pastebin.com/ATRqR9Dh .
Flags: needinfo?(cristian.comorasu)
Ok, since about:support works fine, I definitely think this is sandboxing related.
Please try security.sandbox.content.level:0, restart the browser, and try WebGL again. (You can use http://jdashg.github.io/misc/webgl/ccw-point.html)
Flags: needinfo?(cristian.comorasu)
Hi,

   I can confirm that after setting security.sandbox.content.level to 0 (and enabling WebGL again), I haven't had any more crashes so far (I also tested the suggested URL).
Flags: needinfo?(gustavo.diaz)
I tested with Fx 61.0a1 (build ID: 20180316100132), on Ubuntu 16.04 LTS with the security.sandbox.content.level:0 and I can confirm the issue is not reproducible.
Flags: needinfo?(cristian.comorasu)
See Also: → 1438215
This is indeed bug 1438215.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → DUPLICATE
(In reply to Jeff Gilbert [:jgilbert] from comment #27)
> This is indeed bug 1438215.

I'm not so sure about that — as far as I know, the only crashes we found as part of bug 1438215 required editing the sandbox policy to allow opening files in /dev/ati; without such a change, WebGL initialization failed but didn't crash.

Also, the crash reports here are… not what I expected.  These have amdgpu_dri.so in the stack, but they have the same "ATI Technologies Inc." vendor string as fglrx and the same version string formatting, which suggests they may be equivalent to fglrx in other ways as well and raises some other questions about bug 1438215:

https://crash-stats.mozilla.com/report/index/33dda5a8-8ea5-468c-a15a-052b80180127
https://crash-stats.mozilla.com/report/index/58265950-7016-4d43-994b-6dd960180208
https://crash-stats.mozilla.com/report/index/e3da8434-47e7-45b1-b5a3-0d2570180210
https://crash-stats.mozilla.com/report/index/9134a977-a201-45d7-b1d4-9f8440180215

This one, by contrast, is r300_dri.so and has a vendor of "X.Org R300 Project" so it may not be related to any fglrx/amdgpu-specific problems:

https://crash-stats.mozilla.com/report/index/72fe20c8-4fb8-4564-9c71-8375c0180211

See also bug 1438394 — without that patch, fglrx/amdgpu might crash on Nightly (depending on distribution; Ubuntu wasn't affected).
Some of the MESA-GL warnings are also related to Bug 1416016.

As far as I understand, there's AMDGPU (open source) and AMDGPU-PRO (proprietary). It would not surprise me if AMDGPU-PRO is the fglrx stuff rebranded, and suffering from the same problems.
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
I'll take this and investigate, seems sandbox related at this point.
Assignee: jgilbert → gpascutto
Component: Graphics → Security: Process Sandboxing
Summary: Cerrtain webpage crash at loading probably related to AMDGPU VIDEO driver with a SIGSEGV → Certain webpage crash at loading probably related to AMDGPU VIDEO driver with a SIGSEGV
Unless I'm mistaken, most of the reporters in this bug indeed were using the AMDGPU-PRO driver, not the open source one (which is Mesa based?).
Summary: Certain webpage crash at loading probably related to AMDGPU VIDEO driver with a SIGSEGV → WebGL causes a crash with the AMDGPU-PRO video driver
I acquired a modern AMD card that is supported by AMDGPU and did some testing:

The open source AMDGPU driver appears to fully work, including WebGL at the highest sandboxing level.

I didn't manage to get the AMDGPU-PRO driver working even in Ubuntu 16.04.3, the only release it claims to support. Installing the driver made me unable to log into the desktop, either by crashing it or by making all input devices inoperative.

I managed to get it working in 17.10, at least by running the ./amdgpu-install package (there is also a "pro" version of that installer with "legacy OpenGL" and OpenCL drivers, which I'll try next).

The normal driver at least does not identify itself as being ATI Technologies:
Description	X.Org -- Radeon RX 560 Series (AMD POLARIS11 / DRM 3.20.0 / 4.13.0-37-generic, LLVM 5.0.1)
Vendor ID	X.Org
Device ID	Radeon RX 560 Series (AMD POLARIS11 / DRM 3.20.0 / 4.13.0-37-generic, LLVM 5.0.1)
Driver Version	3.0 Mesa 17.2.4-AMD_17.50
This driver appears to work correctly with sandboxing, which is perhaps not so surprising as it's Mesa based.

I'll try the "legacy OpenGL" (--pro) driver next.
I managed to get the "Pro / legacy OpenGL" driver installed via xubuntu 16.04 (which had a working desktop after installing it, probably not relying on working OpenGL like GNOME does?).

GPU #1
Active	Yes
Description	ATI Technologies Inc. -- Radeon RX 560 Series
Vendor ID	ATI Technologies Inc.
Device ID	Radeon RX 560 Series
Driver Version	4.5.13505 Compatibility Profile Context 17.50.2.13

Reproduces:

https://crash-stats.mozilla.com/report/index/984c6431-fed1-497d-b6e9-235380180321
The first problem is getting the device via /sys:

Sandbox: Recording mapping /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card0 -> /sys/dev/char/226:0
Sandbox: SandboxBroker: denied op=stat rflags=400000 perms=0 path=/sys for pid=17766
Sandbox: Failed errno -13 op 2 flags 0400000 path /sys
Sandbox: SandboxBroker: denied op=stat rflags=400000 perms=0 path=/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card0 for pid=17766
Sandbox: Failed errno -13 op 2 flags 0400000 path /sys/dev/char/../../devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card0
Sandbox: SandboxBroker: denied op=readlink rflags=0 perms=0 path=/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm for pid=17766
Sandbox: Failed errno -13 op 10 flags 00 path /sys/dev/char/../../devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm
Sandbox: SandboxBroker: denied op=readlink rflags=0 perms=0 path=/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0 for pid=17766
Sandbox: Failed errno -13 op 10 flags 00 path /sys/dev/char/../../devices/pci0000:00/0000:00:01.0/0000:01:00.0
Sandbox: SandboxBroker: denied op=readlink rflags=0 perms=0 path=/sys/devices/pci0000:00/0000:00:01.0 for pid=17766
Sandbox: Failed errno -13 op 10 flags 00 path /sys/dev/char/../../devices/pci0000:00/0000:00:01.0
Sandbox: SandboxBroker: denied op=readlink rflags=0 perms=0 path=/sys/devices/pci0000:00 for pid=17766
Sandbox: Failed errno -13 op 10 flags 00 path /sys/dev/char/../../devices/pci0000:00
Sandbox: SandboxBroker: denied op=readlink rflags=0 perms=0 path=/sys/devices for pid=17766
Sandbox: Failed errno -13 op 10 flags 00 path /sys/dev/char/../../devices
Sandbox: SandboxBroker: denied op=readlink rflags=0 perms=0 path=/sys for pid=17766
Sandbox: Failed errno -13 op 10 flags 00 path /sys/dev/char/../..
Sandbox: SandboxBroker: denied op=readlink rflags=0 perms=0 path=/sys/dev for pid=17766
Sandbox: Failed errno -13 op 10 flags 00 path /sys/dev/char/..
Sandbox: SandboxBroker: denied op=readlink rflags=0 perms=0 path=/sys/dev/char for pid=17766
Sandbox: Failed errno -13 op 10 flags 00 path /sys/dev/char
Sandbox: SandboxBroker: denied op=readlink rflags=0 perms=0 path=/sys/dev for pid=17766
Sandbox: Failed errno -13 op 10 flags 00 path /sys/dev
MESA-LOADER: could not get parent device
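
To see at a glance which paths the broker refused in a log like the one above, the denial lines can be tallied with a short script. This is only a sketch: the regular expression is inferred from the log lines quoted in this comment, and the function name denied_paths is made up for illustration; neither is part of Firefox.

```python
import re
from collections import Counter

# Pattern inferred from lines such as:
# Sandbox: SandboxBroker: denied op=stat rflags=400000 perms=0 path=/sys for pid=17766
DENIED = re.compile(
    r"SandboxBroker: denied op=(?P<op>\w+) rflags=(?P<rflags>[0-9a-f]+) "
    r"perms=(?P<perms>\d+) path=(?P<path>\S+) for pid=(?P<pid>\d+)"
)


def denied_paths(log_text):
    """Return a Counter mapping (op, path) to the number of denials seen."""
    counts = Counter()
    for line in log_text.splitlines():
        match = DENIED.search(line)
        if match:
            counts[(match.group("op"), match.group("path"))] += 1
    return counts


# A few lines in the same shape as the log above (shortened for the example).
sample = """\
Sandbox: SandboxBroker: denied op=stat rflags=400000 perms=0 path=/sys for pid=17766
Sandbox: Failed errno -13 op 2 flags 0400000 path /sys
Sandbox: SandboxBroker: denied op=readlink rflags=0 perms=0 path=/sys/dev for pid=17766
Sandbox: SandboxBroker: denied op=readlink rflags=0 perms=0 path=/sys/dev for pid=17766
"""

for (op, path), n in sorted(denied_paths(sample).items()):
    print(f"{n} x {op} {path}")
```

The "Failed errno" lines are skipped on purpose, since they repeat the denial with the unresolved path.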

Whitelisting /sys bypasses this, so it's another case of making sure the Mesa loader can find its device through locations that keep changing between kernel/Mesa versions.

Fixing that leads to this crash:

#01: ???[/home/morbo/hg/firefox/obj-x86_64-pc-linux-gnu/dist/bin/libxul.so +0x65f5e0d]
#02: ???[/home/morbo/hg/firefox/obj-x86_64-pc-linux-gnu/dist/bin/libxul.so +0x65675ce]
#03: ???[/lib/x86_64-linux-gnu/libpthread.so.0 +0x11390]
#04: amdgpu_get_marketing_name[/opt/amdgpu/lib/x86_64-linux-gnu/libdrm_amdgpu.so.1 +0x7994]
#05: ???[/usr/lib/x86_64-linux-gnu/dri/amdgpu_dri.so +0x25b181d]
#06: ???[/usr/lib/x86_64-linux-gnu/dri/amdgpu_dri.so +0x25b1f61]
#07: ???[/usr/lib/x86_64-linux-gnu/dri/amdgpu_dri.so +0x25b62a0]
#08: ???[/usr/lib/x86_64-linux-gnu/dri/amdgpu_dri.so +0x192d698]
#09: ???[/usr/lib/x86_64-linux-gnu/dri/amdgpu_dri.so +0x18d8af2]
++DOCSHELL 0x7f260babe000 == 7 [pid = 18396] [id = {e41a681c-aa7f-4d3e-93fe-9c5f4bb4037a}]
++DOMWINDOW == 17 (0x7f260badd000) [pid = 18396] [serial = 17] [outer = (nil)]
#10: ???[/usr/lib/x86_64-linux-gnu/dri/amdgpu_dri.so +0x540a71]
#11: ???[/usr/lib/x86_64-linux-gnu/dri/amdgpu_dri.so +0x1463a67]
#12: ???[/usr/lib/x86_64-linux-gnu/dri/amdgpu_dri.so +0x1463d37]
#13: ???[/usr/lib/x86_64-linux-gnu/dri/amdgpu_dri.so +0x1478a2c]
#14: ???[/usr/lib/x86_64-linux-gnu/dri/amdgpu_dri.so +0x1478fa6]
#15: ???[/usr/lib/x86_64-linux-gnu/dri/amdgpu_dri.so +0x2574340]
#16: ???[/usr/lib/x86_64-linux-gnu/dri/amdgpu_dri.so +0x25744e4]
#17: ???[/opt/amdgpu-pro/lib/x86_64-linux-gnu/libGL.so.1 +0x7975b]
#18: ???[/opt/amdgpu-pro/lib/x86_64-linux-gnu/libGL.so.1 +0x4af35]
#19: glXQueryVersion[/opt/amdgpu-pro/lib/x86_64-linux-gnu/libGL.so.1 +0x475fd]

Always nice to crash because "get_marketing_name" can't open its file, AMD. I suspect whitelisting "/opt/amdgpu/share/libdrm/amdgpu.ids" will fix this. As the driver installs to a fixed location, we might be able to hardcode that path if we detect an AMD GPU. (We're SOL if it's elsewhere, but then how does the driver find it?)

So as a temporary workaround, you can whitelist /sys/ and /opt/amdgpu/ i.e.:
set security.sandbox.content.read_path_whitelist to "/sys/,/opt/amdgpu/" in about:config.
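
As a rough illustration of what that pref value would cover, here is a small sketch. It assumes the value is a comma-separated list in which an entry ending in "/" stands for a whole directory tree and any other entry for a single file; that reading, and the helper names parse_whitelist and is_readable, are assumptions for this example and not the broker's actual code.

```python
import posixpath


def parse_whitelist(pref_value):
    """Split the comma-separated pref value into non-empty entries."""
    return [entry for entry in pref_value.split(",") if entry]


def is_readable(path, entries):
    """Check a path against the entries.

    Assumed semantics: an entry ending in '/' covers the directory and
    everything below it; any other entry covers only that exact path.
    """
    path = posixpath.normpath(path)
    for entry in entries:
        if entry.endswith("/"):
            root = entry.rstrip("/")
            if path == root or path.startswith(root + "/"):
                return True
        elif path == entry:
            return True
    return False


entries = parse_whitelist("/sys/,/opt/amdgpu/")
for candidate in (
    "/opt/amdgpu/share/libdrm/amdgpu.ids",
    "/sys/dev/char/226:0",
    "/opt/amdgpu-pro/lib/x86_64-linux-gnu/libGL.so.1",
):
    print(candidate, is_readable(candidate, entries))
```

Under this reading, "/opt/amdgpu/" does not extend to the sibling directory /opt/amdgpu-pro, because the match is on whole path components.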

Now the good news: with these fixes WebGL appears to work, even with full sandboxing. So that's a major improvement over fglrx at least.
Sandbox: Recording mapping /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card0 -> /sys/dev/char/226:0
Sandbox: SandboxBroker: denied op=stat rflags=400000 perms=0 path=/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card0 for pid=17766

Given that we see the mapping, the broker should be able to reverse the lookup to /sys/dev/char/226:0 and that should pass because "/sys/dev/char/226:" is a whitelisted prefix. So it's a bit strange this fails.
The actual path that is requested is "/sys/dev/char/../../devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card0"

This isn't resolved before doing symlink inversion.
/sys/dev/char/../../devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card0

We can realpath this to

/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card0

Invert the symlink to

/sys/dev/char/226:0

And now try SymlinkPermissions on it.

This works and we accumulate the read permission on "/sys/dev/char/226:0". But at this point we resolve that link, end up in /sys/dev/char/, and then notice we are trying to back out of the symlink target path, which causes SymlinkPermissions to give a permission denial.
The Mesa <= 12 codepath in the sandbox policy already tries to handle this. Replacing the two directory entries it whitelists by the entire dir (which looks like it should be safe, given the perms on the 226: device for newer cards) does not work though.
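
The effect of the ordering (resolve the ".." components first, then invert the symlink) can be sketched as follows. This is a simplified model, not the broker code: posixpath.normpath stands in for realpath and is purely lexical, the mapping and prefix are copied from the log and comments above, the function name lookup is hypothetical, and the later denial inside SymlinkPermissions is not modelled.

```python
import posixpath

# Mapping recorded by the broker in the log: link target -> link path.
SYMLINK_MAP = {
    "/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card0": "/sys/dev/char/226:0",
}

# Whitelisted prefix mentioned earlier in this bug.
ALLOWED_PREFIXES = ("/sys/dev/char/226:",)


def lookup(requested, normalize_first):
    """Return True if the request ends up under a whitelisted prefix.

    normalize_first=False models inverting the symlink on the raw path;
    normalize_first=True models collapsing '..' before the inversion.
    """
    path = posixpath.normpath(requested) if normalize_first else requested
    path = SYMLINK_MAP.get(path, path)  # invert the symlink if we know it
    return any(path.startswith(prefix) for prefix in ALLOWED_PREFIXES)


requested = "/sys/dev/char/../../devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card0"
print(lookup(requested, normalize_first=False))  # raw path is not in the map
print(lookup(requested, normalize_first=True))   # normalized path maps to 226:0
```

A real implementation would have to use an actual realpath, since lexical normalization gives wrong answers when a component before ".." is itself a symlink.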
I'm having problems getting this to work when I whitelist anything more restrictive than all of /sys. If it's whitelisted, the loading sequence looks like this:

Sandbox: Lookup before convrel=/sys/dev/char/226:0
Sandbox: First lookup after convrel=/sys/dev/char/226:0 (perms 3)
Sandbox: Recording mapping /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card0 -> /sys/dev/char/226:0
Sandbox: Lookup before convrel=/sys
Sandbox: First lookup after convrel=/sys (perms 3)
Sandbox: Lookup before convrel=/sys/dev
Sandbox: First lookup after convrel=/sys/dev (perms 3)
Sandbox: Lookup before convrel=/sys/dev/char
Sandbox: First lookup after convrel=/sys/dev/char (perms 3)
Sandbox: Lookup before convrel=/sys/devices
Sandbox: First lookup after convrel=/sys/devices (perms 3)
Sandbox: Lookup before convrel=/sys/devices/pci0000:00
Sandbox: First lookup after convrel=/sys/devices/pci0000:00 (perms 3)
Sandbox: Lookup before convrel=/sys/devices/pci0000:00/0000:00:01.0
Sandbox: First lookup after convrel=/sys/devices/pci0000:00/0000:00:01.0 (perms 3)
Sandbox: Lookup before convrel=/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0
Sandbox: First lookup after convrel=/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0 (perms 3)
Sandbox: Lookup before convrel=/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm
Sandbox: First lookup after convrel=/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm (perms 3)
Sandbox: Lookup before convrel=/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card0
Sandbox: First lookup after convrel=/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card0 (perms 3)
Sandbox: Lookup before convrel=/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card0/uevent
Sandbox: First lookup after convrel=/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card0/uevent (perms 3)
Sandbox: Lookup before convrel=/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm
Sandbox: First lookup after convrel=/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm (perms 3)
Sandbox: Failed errno -22 op readlink flags 00 path /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm
Sandbox: Lookup before convrel=/sys
Sandbox: First lookup after convrel=/sys (perms 3)
Sandbox: Lookup before convrel=/sys/devices
Sandbox: First lookup after convrel=/sys/devices (perms 3)
Sandbox: Lookup before convrel=/sys/devices/pci0000:00
Sandbox: First lookup after convrel=/sys/devices/pci0000:00 (perms 3)
Sandbox: Lookup before convrel=/sys/devices/pci0000:00/0000:00:01.0
Sandbox: First lookup after convrel=/sys/devices/pci0000:00/0000:00:01.0 (perms 3)
Sandbox: Lookup before convrel=/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0
Sandbox: First lookup after convrel=/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0 (perms 3)
Sandbox: Lookup before convrel=/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm
Sandbox: First lookup after convrel=/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm (perms 3)
Sandbox: Lookup before convrel=/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/uevent
Sandbox: First lookup after convrel=/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/uevent (perms 3)
Sandbox: Failed errno -2 op access flags 00 path /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/uevent
Sandbox: Lookup before convrel=/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0
Sandbox: First lookup after convrel=/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0 (perms 3)
Sandbox: Failed errno -22 op readlink flags 00 path /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0
Sandbox: Lookup before convrel=/sys
Sandbox: First lookup after convrel=/sys (perms 3)
Sandbox: Lookup before convrel=/sys/devices
Sandbox: First lookup after convrel=/sys/devices (perms 3)
Sandbox: Lookup before convrel=/sys/devices/pci0000:00
Sandbox: First lookup after convrel=/sys/devices/pci0000:00 (perms 3)
Sandbox: Lookup before convrel=/sys/devices/pci0000:00/0000:00:01.0
Sandbox: First lookup after convrel=/sys/devices/pci0000:00/0000:00:01.0 (perms 3)
Sandbox: Lookup before convrel=/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0
Sandbox: First lookup after convrel=/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0 (perms 3)
Sandbox: Lookup before convrel=/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/uevent
Sandbox: First lookup after convrel=/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/uevent (perms 3)
Sandbox: Lookup before convrel=/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/uevent
Sandbox: First lookup after convrel=/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/uevent (perms 3)
Sandbox: Lookup before convrel=/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/subsystem
Sandbox: First lookup after convrel=/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/subsystem (perms 3)
Sandbox: Recording mapping /sys/bus/pci -> /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/subsystem
Giving permission to almost (but not entirely) everything looks like this:

Sandbox: Lookup before convrel=/sys/dev/char/226:0
Sandbox: First lookup after convrel=/sys/dev/char/226:0 (perms 3)
Sandbox: Recording mapping /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card0 -> /sys/dev/char/226:0
Sandbox: Lookup before convrel=/sys
Sandbox: First lookup after convrel=/sys (perms 0)
Sandbox: SandboxBroker: denied op=stat rflags=400000 perms=0 path=/sys for pid=4991
Sandbox: Failed errno -13 op stat flags 0400000 path /sys
Sandbox: Lookup before convrel=/sys/dev/char/../../devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card0
Sandbox: First lookup after convrel=/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card0 (perms 3)
Sandbox: Lookup before convrel=/sys/dev/char/../../devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm
Sandbox: First lookup after convrel=/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm (perms 3)
Sandbox: Failed errno -22 op readlink flags 00 path /sys/dev/char/../../devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm
Sandbox: Lookup before convrel=/sys
Sandbox: First lookup after convrel=/sys (perms 0)
Sandbox: SandboxBroker: denied op=stat rflags=400000 perms=0 path=/sys for pid=4991
Sandbox: Failed errno -13 op stat flags 0400000 path /sys
Sandbox: Lookup before convrel=/sys/dev/char/../../devices/pci0000:00/0000:00:01.0/0000:01:00.0
Sandbox: First lookup after convrel=/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0 (perms 3)
Sandbox: Failed errno -22 op readlink flags 00 path /sys/dev/char/../../devices/pci0000:00/0000:00:01.0/0000:01:00.0
Sandbox: Lookup before convrel=/sys
Sandbox: First lookup after convrel=/sys (perms 0)
Sandbox: SandboxBroker: denied op=stat rflags=400000 perms=0 path=/sys for pid=4991
Sandbox: Failed errno -13 op stat flags 0400000 path /sys
Sandbox: Lookup before convrel=/sys/dev/char/../../devices/pci0000:00/0000:00:01.0
Sandbox: First lookup after convrel=/sys/devices/pci0000:00/0000:00:01.0 (perms 3)
Sandbox: Failed errno -22 op readlink flags 00 path /sys/dev/char/../../devices/pci0000:00/0000:00:01.0
Sandbox: Lookup before convrel=/sys
Sandbox: First lookup after convrel=/sys (perms 0)
Sandbox: SandboxBroker: denied op=stat rflags=400000 perms=0 path=/sys for pid=4991
Sandbox: Failed errno -13 op stat flags 0400000 path /sys
Sandbox: Lookup before convrel=/sys/dev/char/../../devices/pci0000:00
Sandbox: First lookup after convrel=/sys/devices/pci0000:00 (perms 3)
Sandbox: Failed errno -22 op readlink flags 00 path /sys/dev/char/../../devices/pci0000:00
Sandbox: Lookup before convrel=/sys
Sandbox: First lookup after convrel=/sys (perms 0)
Sandbox: SandboxBroker: denied op=stat rflags=400000 perms=0 path=/sys for pid=4991
Sandbox: Failed errno -13 op stat flags 0400000 path /sys
Sandbox: Lookup before convrel=/sys/dev/char/../../devices
Sandbox: First lookup after convrel=/sys/devices (perms 0)
Sandbox: SandboxBroker: denied op=readlink rflags=0 perms=0 path=/sys/devices for pid=4991
Sandbox: Failed errno -13 op readlink flags 00 path /sys/dev/char/../../devices
Sandbox: Lookup before convrel=/sys/dev/char/../..
Sandbox: First lookup after convrel=/sys (perms 0)
Sandbox: SandboxBroker: denied op=readlink rflags=0 perms=0 path=/sys for pid=4991
Sandbox: Failed errno -13 op readlink flags 00 path /sys/dev/char/../..
Sandbox: Lookup before convrel=/sys/dev/char/..
Sandbox: First lookup after convrel=/sys/dev (perms 0)
Sandbox: SandboxBroker: denied op=readlink rflags=0 perms=0 path=/sys/dev for pid=4991
Sandbox: Failed errno -13 op readlink flags 00 path /sys/dev/char/..
Sandbox: Lookup before convrel=/sys/dev/char
Sandbox: First lookup after convrel=/sys/dev/char (perms 0)
Sandbox: SandboxBroker: denied op=readlink rflags=0 perms=0 path=/sys/dev/char for pid=4991
Sandbox: Failed errno -13 op readlink flags 00 path /sys/dev/char
Sandbox: Lookup before convrel=/sys/dev
Sandbox: First lookup after convrel=/sys/dev (perms 0)
Sandbox: SandboxBroker: denied op=readlink rflags=0 perms=0 path=/sys/dev for pid=4991
Sandbox: Failed errno -13 op readlink flags 00 path /sys/dev
MESA-LOADER: could not get parent device

Note how:
1) It keeps trying to stat /sys
2) It keeps trying to readlink things that aren't links

Maybe it would work if it could stat things just to figure out what the links are? Let's see if we can add permission only for that...
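
For anyone reading the log: the errno values match the standard Linux meanings. -2 is ENOENT, -13 is EACCES (the broker denial), and -22 is EINVAL, which is what readlink returns for a path that exists but is not a symbolic link. The sketch below is a standalone Python illustration of that readlink behaviour, not code from the Firefox sandbox; the helper names readlink_errno and demo are made up for this example, and the temporary file names only imitate the sysfs entries.

```python
import errno
import os
import tempfile


def readlink_errno(path):
    """Return 0 if readlink succeeds, else the errno it failed with."""
    try:
        os.readlink(path)
        return 0
    except OSError as exc:
        return exc.errno


def demo():
    """Try readlink on a real directory, a symlink to it, and a missing path."""
    with tempfile.TemporaryDirectory() as tmp:
        real_dir = os.path.join(tmp, "device")   # stands in for a sysfs directory
        link = os.path.join(tmp, "subsystem")    # stands in for a sysfs symlink
        missing = os.path.join(tmp, "uevent")    # never created
        os.mkdir(real_dir)
        os.symlink(real_dir, link)
        return {
            "directory": readlink_errno(real_dir),
            "symlink": readlink_errno(link),
            "missing": readlink_errno(missing),
        }


if __name__ == "__main__":
    for kind, err in demo().items():
        # errno 0 is not in errno.errorcode, so it is reported as OK
        print(f"{kind}: {errno.errorcode.get(err, 'OK')}")
```

So the -22 failures on paths with perms 3 are likely harmless probing by the driver, while the -13 failures on /sys, /sys/dev and /sys/devices are the actual denials.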
Comment on attachment 8964009 [details]
Bug 1434711 - WebGL causes a crash with the AMDGPU-PRO video driver.

https://reviewboard.mozilla.org/r/232830/#review238992
Attachment #8964009 - Flags: review?(jld) → review+
Pushed by gpascutto@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/6b6efca52e56
WebGL causes a crash with the AMDGPU-PRO video driver. r=jld
https://hg.mozilla.org/mozilla-central/rev/6b6efca52e56
Status: REOPENED → RESOLVED
Closed: 6 years ago → 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla61
Is that something that we want to uplift to 60?
Flags: needinfo?(gpascutto)
Comment on attachment 8964009 [details]
Bug 1434711 - WebGL causes a crash with the AMDGPU-PRO video driver.

Approval Request Comment
[Feature/Bug causing the regression]: Bug 1289718 or thereabouts
[User impact if declined]: WebGL will crash tabs when using AMDGPU-PRO driver.
[Is this code covered by automated tests?]: No
[Has the fix been verified in Nightly?]: Landed a few days ago
[Needs manual test from QE? If yes, steps to reproduce]: Not needed but possible: install AMDGPU-PRO, try any site with WebGL.
[List of other uplifts needed for the feature/fix]: None
[Is the change risky?]: Low risk. 
[Why is the change risky/not risky?]: Extra permissions if a certain driver is detected.
[String changes made/needed]: N/A
Flags: needinfo?(gpascutto)
Attachment #8964009 - Flags: approval-mozilla-beta?
Comment on attachment 8964009 [details]
Bug 1434711 - WebGL causes a crash with the AMDGPU-PRO video driver.

Fix WebGL crashes with the AMDGPU-PRO driver. Approved for 60.0b11.
Attachment #8964009 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
We can keep this on the radar for 59 but since there is a workaround, I think it is best to wait for this to be fixed in 60.
Fx60 will ship next week with this fix included.
I don't think this issue has actually been fixed, even though my Firefox version is already past 60 (60.3.0esr).
However, adjusting the sandbox security level (security.sandbox.content.level) does enable WebGL acceleration.

Tested on: multiple Korean websites and an explicit WebGL acceleration test page (http://webglsamples.org/aquarium/aquarium.html)

Here is my current environment for Firefox/WebGL (AMD Radeon Pro WX 5100, AMDGPU-PRO driver):

"name": "Firefox",
"osVersion": "Linux 3.10.0-862.el7.x86_64",
"version": "60.3.0esr",
"buildID": "20181025072128",
"userAgent": "Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/60.0",
"safeMode": false,
"supportURL": "https://support.mozilla.org/1/firefox/60.3.0/Linux/en-US/",
"numTotalWindows": 2,
"numRemoteWindows": 2,
"remoteAutoStart": true,
"currentContentProcesses": 3,
"maxContentProcesses": 4,
"autoStartStatus": 1,
"styloBuild": true,
"styloDefault": true,
"styloResult": true,
"styloChromeDefault": true,
"styloChromeResult": true,
"policiesStatus": 0,
"keyGoogleFound": true,
"keyMozillaFound": true