Allow more syscalls for nvidia-vaapi-driver, possibly behind a pref
Categories
(Core :: Security: Process Sandboxing, enhancement, P2)
Tracking
()
People
(Reporter: rmader, Unassigned)
References
(Blocks 2 open bugs)
Attachments
(1 file: 412.16 KB, text/x-log)
There's a new VAAPI wrapper implementation for Nvidia: https://github.com/elFarto/nvidia-vaapi-driver#firefox
It says it needs the syscalls 41, 49, 50 and 332 (socket, bind, listen, statx).
Jed, does that look sensible to you? It also says: "This is not recommended for general use as it reduces security" and indeed sounds somewhat dangerous.
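For reference, the numbers quoted above match the x86_64 Linux syscall table; other architectures assign different numbers to the same calls. A minimal lookup sketch (the `X86_64_SYSCALLS` table and `describe` helper are illustrative names, not part of Firefox or the driver):

```python
# Syscall numbers from the report, as assigned on x86_64 Linux.
# Other architectures use different numbers for the same calls.
X86_64_SYSCALLS = {
    41: "socket",
    49: "bind",
    50: "listen",
    332: "statx",
}


def describe(numbers):
    """Map syscall numbers to names, marking any number not in the table."""
    return [X86_64_SYSCALLS.get(n, f"unknown({n})") for n in numbers]


if __name__ == "__main__":
    print(describe([41, 49, 50, 332]))
```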
Comment 1•4 years ago
I'll need to find out more about what's going on here. I really don't want to allow sockets if there's any way to avoid it; I see that connect isn't in the list, so it's possible that this isn't trivially broken, but with datagram sockets sendmsg can send to any destination even with an unconnected socket and that's also something I've tried to avoid allowing where possible. If it's a case of Nvidia blobs that try to use sockets but degrade gracefully if socket returns an error, then that's fine; we already have issues like that in content processes. We definitely won't allow statx with arbitrary arguments; currently we'll reject it with ENOSYS and ideally the caller would fall back to an older syscall which will be replaced with a file broker request, so I need to find out why that's not working. (Supporting statx directly in the broker would be possible, but it would be a nontrivial amount of complexity.)
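The statx behavior described here relies on the caller treating ENOSYS as "not implemented, try the older call". A rough sketch of that caller-side pattern follows; `stat_with_fallback` and `sandboxed_statx` are hypothetical names for illustration, and the sandbox's ENOSYS rejection is simulated rather than produced by a real seccomp filter:

```python
import errno
import os


def stat_with_fallback(path, statx_call, stat_call=os.stat):
    """Try the newer call first; fall back only if it is reported as unimplemented."""
    try:
        return statx_call(path)
    except OSError as exc:
        if exc.errno != errno.ENOSYS:
            raise  # any other failure is a real error, not a cue to fall back
        return stat_call(path)


def sandboxed_statx(path):
    # Stand-in for a statx call that the sandbox policy rejects with ENOSYS.
    raise OSError(errno.ENOSYS, os.strerror(errno.ENOSYS), path)


if __name__ == "__main__":
    result = stat_with_fallback(".", sandboxed_statx)
    print(oct(result.st_mode))
```

A caller that treats every statx failure as fatal, instead of checking for ENOSYS specifically, would never reach the older syscall that the file broker can service.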
Comment 2•4 years ago
Background:
At the moment, the project's readme suggests setting security.sandbox.content.level to 0.
jrmuizel recommended setting MOZ_DISABLE_RDD_SANDBOX=1 rather than disabling the content process sandbox:
even running in the rdd with the sandbox completely disabled wouldn't be the worst option in the world
definitely better than disabling the sandbox in the content processes
So we recommended media.rdd-ffmpeg.enabled=true + MOZ_DISABLE_RDD_SANDBOX=1 in this issue:
https://github.com/elFarto/nvidia-vaapi-driver/issues/6#issuecomment-1005630454
If the required syscalls could be behind a security.sandbox.rdd.nvidia-highly-experimental-vaapi pref, it would allow users of that project to not disable any sandbox.
Updated•4 years ago
Updated•3 years ago
Updated•1 year ago
MOZ_DISABLE_RDD_SANDBOX=1 (and, well, setting media.hardware-video-decoding.force-enabled, because bug 1752494 blocked every single nvidia across the board) is all that is required today.
This is the driver log when you adjust those variables:
376322.287508944 [1100815-1100878] ../src/vabackend.c:2187 __vaDriverInit_1_0 Initialising NVIDIA VA-API Driver: 31
376322.287518143 [1100815-1100878] ../src/vabackend.c:2196 __vaDriverInit_1_0 Now have 0 (0 max) instances
376322.287523143 [1100815-1100878] ../src/vabackend.c:2222 __vaDriverInit_1_0 Selecting Direct backend
376322.295299379 [1100815-1100878] ../src/direct/nv-driver.c: 267 init_nvdriver Initing nvdriver...
376322.295330388 [1100815-1100878] ../src/direct/nv-driver.c: 285 init_nvdriver NVIDIA kernel driver version: 550.90.07, major version: 550, minor version: 90
376322.295338142 [1100815-1100878] ../src/direct/nv-driver.c: 292 init_nvdriver Got dev info: 100 1 2 6
376322.449570794 [1100815-1100878] ../src/vabackend.c:1445 nvQueryImageFormats In nvQueryImageFormats
376322.594721510 [1100815-1100878] ../src/vabackend.c: 674 nvCreateConfig got profile: 6 with 0 attributes
376322.594778409 [1100815-1100878] ../src/vabackend.c:1801 nvQuerySurfaceAttributes with 4 (8) (nil) 0
And this is the log out of the box:
377533.729002022 [1107094-1107164] ../src/vabackend.c: 155 init CUDA ERROR 'OS call failed or operation not supported on this OS' (304)
377533.729025017 [1107094-1107164] ../src/vabackend.c:2174 __vaDriverInit_1_0 Initialising NVIDIA VA-API Driver: 31
377533.729027517 [1107094-1107164] ../src/vabackend.c:2183 __vaDriverInit_1_0 Now have 0 (0 max) instances
377533.729029675 [1107094-1107164] ../src/vabackend.c:2209 __vaDriverInit_1_0 Selecting Direct backend
377533.742221673 [1107094-1107164] ../src/direct/nv-driver.c: 267 init_nvdriver Initing nvdriver...
377533.742372975 [1107094-1107164] ../src/direct/nv-driver.c: 189 nv_get_versions nv_check_version failed: -1 25
[GFX1-]: VideoBridgeParent receives IPC close with reason=AbnormalShutdown
[Child 1106989, MediaDecoderStateMachine #1] WARNING: Decoder=74378b6f5c00 Decode error: NS_ERROR_DOM_MEDIA_FATAL_ERR (0x806e0005) - auto mozilla::MediaChangeMonitor::CreateDecoderAndInit(MediaRawData *)::(anonymous class)::operator()(const MediaResult &) const: Unable to create decoder: file /usr/src/debug/firefox/firefox-128.0/dom/media/MediaDecoderStateMachineBase.cpp:167
https://github.com/elFarto/nvidia-vaapi-driver/blob/v0.0.12/src/vabackend.c#L168
Long story short, CUDA cannot be initialized (not sure whether the "move ffmpeg to gpu process" idea advanced in bug 1683808 could help?)
Comment 5•1 year ago
(In reply to mirh from comment #4)
(In reply to mirh from comment #104)
See also bug 1748460 probably
yes, i pinged the people in charge of that over matrix but got no answer
If you have the hardware, you can now run with the profiler to collect sandbox information; this landed a few weeks ago in Nightly: https://firefox-source-docs.mozilla.org/tools/profiler/sandbox.html#recoding-sandbox-violations
With up-to-date information from the profiler we can likely work on the sandbox holes more easily.
After countless struggles... (the documentation doesn't tell you that the settings pre-filled in the button/toolbar don't actually work with MOZ_PROFILER_STARTUP, which requires MOZ_PROFILER_STARTUP_FEATURES to be set with the *perftools-presets-debug details)
The great majority of threads only have this to report in the marker table (tens of thousands of times in just half a minute):
SandboxBrokerClient - SandboxBrokerClient id 20409 op open rflags 0 path /proc/13360/statm path2 (empty) pid 13360
SandboxBrokerClient - SandboxBrokerClient id 20410 op open rflags 591872 path /proc/13360/task path2 (empty) pid 13360
Then, in one of the youtube sometimes you'll have this:
SandboxBrokerClient - SandboxBrokerClient id 20413 op readlink rflags 0 path /proc/self/exe path2 (empty) pid 13360
This is eventually followed by an incredible number of library read attempts (same parameters as the statm one): linux-vdso.so.1, ./libmozsandbox.so, /usr/lib/libdl.so.2 and /usr/lib/libstdc++.so.6 are just the first few, and the list goes on and on.
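The rflags value 591872 in the /proc/13360/task marker can be decoded if one assumes it carries open(2) flags as defined on x86_64 Linux (that interpretation is an assumption, not something stated in this bug). A small sketch with the relevant flag bits hard-coded, so the result does not depend on the machine running it:

```python
# open(2) flag bits as defined on x86_64 Linux (hard-coded on purpose;
# other architectures may use different values).
FLAG_BITS = {
    0o4000: "O_NONBLOCK",
    0o200000: "O_DIRECTORY",
    0o2000000: "O_CLOEXEC",
}
ACCESS_MODES = {0: "O_RDONLY", 1: "O_WRONLY", 2: "O_RDWR"}


def decode_open_flags(flags):
    """Split an open(2) flags integer into symbolic names."""
    names = [ACCESS_MODES.get(flags & 0o3, "invalid access mode")]
    remaining = flags & ~0o3
    for bit, name in FLAG_BITS.items():
        if remaining & bit:
            names.append(name)
            remaining &= ~bit
    if remaining:
        # Bits this small table does not know about are reported as hex.
        names.append(hex(remaining))
    return names


if __name__ == "__main__":
    print(decode_open_flags(591872))
```

Under that assumption the value reads as a read-only, close-on-exec directory open, which would be consistent with something enumerating the process's threads under /proc.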
Comment 7•1 year ago
You don't need to do it using MOZ_PROFILER_STARTUP, and please share the generated profile
Comment 8•1 year ago
So can you generate a profile and share it? There is no need for profiler startup: as long as you start the profiler on a fresh profile instance, loading e.g. a YouTube page and playing a few seconds of video should be enough to kick off the RDD process and capture the required information. Once you have it, share it here, and we can iterate like we did on bug 1903688.
Comment 10•1 year ago
Unfortunately, I don't see any RDD process here. Was it present in about:processes? Its name should be "Remote Data Decoder" (it would be localized if you use a non-English build).
Comment 11•1 year ago
No it's not there.
Though maybe this has something to do with its absence
https://crash-stats.mozilla.org/report/index/ee582eb5-392d-43a0-8c69-250a50240715
Comment 12•1 year ago
(In reply to mirh from comment #11)
No it's not there.
Though maybe this has something to do with its absence
https://crash-stats.mozilla.org/report/index/ee582eb5-392d-43a0-8c69-250a50240715
Unfortunately, that is hard to act on. Could sandboxing be triggering a bug in Nvidia's code? We would then be misled, because we lose the profiler info from the crashing process.
Unfortunately, we need to fall back to MOZ_SANDBOX_LOGGING=1 MOZ_SANDBOX_RDD_LOGGING=1 firefox 2>&1 | tee sandbox.log to investigate.
Comment 13•1 year ago
And this is from coredumpctl
>>> bt
#0 0x00007ff9a4b75e0b in __GI_____strtol_l_internal (nptr=nptr@entry=0x0, endptr=endptr@entry=0x0, base=base@entry=10, group=group@entry=0,
bin_cst=bin_cst@entry=true, loc=0x7ff9a4d0c3c0 <_nl_global_locale>) at ../stdlib/strtol_l.c:304
#1 0x00007ff9a4b75dbc in __GI___isoc23_strtol (nptr=nptr@entry=0x0, endptr=endptr@entry=0x0, base=base@entry=10) at ../stdlib/strtol.c:126
#2 0x00007ff98f76dda2 in atoi (__nptr=0x0) at /usr/include/stdlib.h:483
#3 init_nvdriver (context=context@entry=0x7ff99409fc00, drmFd=27) at ../src/direct/nv-driver.c:283
#4 0x00007ff98f76d551 in direct_initExporter (drv=0x7ff99409fb30) at ../src/direct/direct-export-buf.c:100
#5 0x00007ff98f77423c in __vaDriverInit_1_0 (ctx=0x7ff9a49303e0) at ../src/vabackend.c:2246
#6 0x00007ff99d3521a3 in vaInitialize () from /usr/lib/libva.so.2
#7 0x00007ff999d4ecb2 in mozilla::FFmpegVideoDecoder<46465650>::CreateVAAPIDeviceContext() () from /nightly-root/firefox/libxul.so
#8 0x00007ff999d4e4af in mozilla::FFmpegVideoDecoder<46465650>::InitVAAPIDecoder() () from /nightly-root/firefox/libxul.so
#9 0x00007ff999d4868c in mozilla::FFmpegVideoDecoder<46465650>::Init() () from /nightly-root/firefox/libxul.so
#10 0x00007ff999d20360 in mozilla::detail::ProxyFunctionRunnable<mozilla::MediaDataDecoderProxy::Init()::$_0, mozilla::MozPromise<mozilla::TrackInfo::TrackType, mozilla::MediaResult, true> >::Run() () from /nightly-root/firefox/libxul.so
#11 0x00007ff996a8c015 in mozilla::TaskQueue::Runner::Run() () from /nightly-root/firefox/libxul.so
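Frames #2 and #3 show atoi being called with a null pointer from init_nvdriver, which is undefined behavior in C and crashes here inside strtol. The driver's actual parsing code is not shown in this bug, so the following is only a hypothetical sketch (in Python, with an invented `parse_driver_version` helper) of the kind of guard that would turn a missing version string into a recoverable error instead of a crash:

```python
def parse_driver_version(version):
    """Return (major, minor), or None when the version string is missing or malformed."""
    if not version:
        # Covers both None (no string obtained at all) and the empty string.
        return None
    parts = version.split(".")
    if len(parts) < 2 or not parts[0].isdigit() or not parts[1].isdigit():
        return None
    return int(parts[0]), int(parts[1])


if __name__ == "__main__":
    print(parse_driver_version("550.90.07"))
    print(parse_driver_version(None))
```

With the version string from the working log ("550.90.07") this yields major 550 and minor 90, matching what the driver printed there.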
Comment 14•1 year ago
(This is one of the reasons the profiler is more comfortable.) Can you confirm PID 119874 was the RDD process?
Comment 15•1 year ago
So far I don't see any https://searchfox.org/mozilla-central/rev/8c6edfe25c094e032a27722ef30f69555f556bf8/security/sandbox/linux/Sandbox.cpp#156-161 but there are several file denials close to attempts to load CUDA on e.g. 119874 and a few others.
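The linked Sandbox.cpp lines are where seccomp violations get logged. As a rough aid for scanning a sandbox.log for them, the sketch below tallies violations per (pid, syscall). It assumes, from memory rather than from anything in this bug, that each such line contains text of the form "seccomp sandbox violation: pid N, tid N, syscall N, args ..."; the sample lines and the `count_violations` helper are made up for illustration:

```python
import re
from collections import Counter

# Assumed message shape; adjust the pattern if the real log differs.
VIOLATION = re.compile(
    r"seccomp sandbox violation: pid (\d+), tid (\d+), syscall (\d+)"
)


def count_violations(lines):
    """Count seccomp violation lines, keyed by (pid, syscall number)."""
    counts = Counter()
    for line in lines:
        match = VIOLATION.search(line)
        if match:
            pid, syscall = int(match.group(1)), int(match.group(3))
            counts[(pid, syscall)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "Sandbox: seccomp sandbox violation: pid 119874, tid 119900, syscall 41, args 1 2 0 0 0 0.",
        "Sandbox: seccomp sandbox violation: pid 119874, tid 119900, syscall 41, args 1 2 0 0 0 0.",
        "some unrelated log line",
    ]
    for (pid, syscall), n in count_violations(sample).items():
        print(f"pid {pid}: syscall {syscall} x{n}")
```

To run it against a real capture, pass the open log file instead of the sample list, e.g. `count_violations(open("sandbox.log"))`. An empty result would agree with the observation above that no such violations appear.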
Comment 16•1 year ago
That's the one that returned the above gdb trace, yes.