Closed Bug 1619585 Opened 4 years ago Closed 2 years ago

VAAPI video playback leads to sandbox violations (Temporary solution: Uninstall iHD driver (intel-media-driver) and install i965 driver (i965-va-driver == intel-vaapi-driver))

Categories

(Core :: Audio/Video: Playback, defect, P5)

x86_64
Linux
defect

RESOLVED DUPLICATE of bug 1698778
Tracking Status
firefox-esr68 --- unaffected
firefox-esr78 --- disabled
firefox-esr91 --- disabled
firefox73 --- unaffected
firefox74 --- unaffected
firefox75 --- disabled
firefox92 --- disabled
firefox93 --- disabled
firefox94 --- disabled

People

(Reporter: kubrick, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: crash, nightly-community)

Attachments

(3 files)

Trying to play https://www.youtube.com/watch?v=Bey4XXJAqS8 using the avc1 format leads to the process violating the sandbox.

bp-060a3520-3b75-4416-b053-9b2530200303
Mar 03 13:52:23 xps16 firefox-nightly.desktop[177020]: Sandbox: seccomp sandbox violation: pid 177020, tid 177065, syscall 64, args 1140872792 1 1974 139969028948256 0 139968664480984. Killing process.
Mar 03 13:52:23 xps16 systemd[1]: Started Process Core Dump (PID 177088/UID 0).
Mar 03 13:52:24 xps16 systemd-coredump[177089]: Removed old coredump core.Web\x20Content.1000.ec83f4ebbaea4df5b57e3a10e23bf298.176823.1583239907000000000000.lz4.
Mar 03 13:52:24 xps16 systemd-coredump[177089]: Process 177020 (Web Content) of user 1000 dumped core.

                                            Stack trace of thread 177065:
                                            #0  0x00007f4d14696f8d syscall (libc.so.6 + 0xf9f8d)
                                            #1  0x00007f4d0a88381d n/a (/home/francois/bin/firefox-nightly/libxul.so + 0x8f381d)
                                            #2  0x00007f4d14b39667 n/a (/home/francois/bin/firefox-nightly/libmozsandbox.so + 0xc667)
                                            #3  0x00007f4d14ad2800 __restore_rt (libpthread.so.0 + 0x14800)
                                            #4  0x00007f4d1469e20b semget (libc.so.6 + 0x10120b)
                                            #5  0x00007f4ce746e09b n/a (iHD_drv_video.so + 0x46809b)
                                            #6  0x00007f4ce746f42b n/a (iHD_drv_video.so + 0x46942b)
                                            #7  0x00007f4ce74566b8 n/a (iHD_drv_video.so + 0x4506b8)
                                            #8  0x00007f4cfcf6e288 n/a (libva.so.2 + 0x12288)
                                            #9  0x00007f4cfcf71cbf vaInitialize (libva.so.2 + 0x15cbf)
                                            #10 0x00007f4cfb8e537e n/a (libavutil.so.56 + 0x3037e)
                                            #11 0x00007f4cfb8e5515 n/a (libavutil.so.56 + 0x30515)
                                            #12 0x00007f4cfb8dda43 av_hwdevice_ctx_create (libavutil.so.56 + 0x28a43)
                                            #13 0x00007f4d0c242fb5 n/a (/home/francois/bin/firefox-nightly/libxul.so + 0x22b2fb5)
                                            #14 0x00007f4d0c24317d n/a (/home/francois/bin/firefox-nightly/libxul.so + 0x22b317d)
                                            #15 0x00007f4d0c2436ca n/a (/home/francois/bin/firefox-nightly/libxul.so + 0x22b36ca)
                                            #16 0x00007f4d0c2140b9 n/a (/home/francois/bin/firefox-nightly/libxul.so + 0x22840b9)
                                            #17 0x00007f4d0e4c8496 n/a (/home/francois/bin/firefox-nightly/libxul.so + 0x4538496)
                                            #18 0x00007f4d0e4d081e n/a (/home/francois/bin/firefox-nightly/libxul.so + 0x454081e)
                                            #19 0x00007f4d0e4d0e8d n/a (/home/francois/bin/firefox-nightly/libxul.so + 0x4540e8d)
                                            #20 0x00007f4d0de0ff25 n/a (/home/francois/bin/firefox-nightly/libxul.so + 0x3e7ff25)
                                            #21 0x00007f4d0de11866 n/a (/home/francois/bin/firefox-nightly/libxul.so + 0x3e81866)
                                            #22 0x00007f4d0de4368d n/a (/home/francois/bin/firefox-nightly/libxul.so + 0x3eb368d)
                                            #23 0x00007f4d0e69fdef n/a (/home/francois/bin/firefox-nightly/libxul.so + 0x470fdef)
                                            #24 0x00007f4d0e4cbb3e n/a (/home/francois/bin/firefox-nightly/libxul.so + 0x453bb3e)
                                            #25 0x00007f4d145804de n/a (/home/francois/bin/firefox-nightly/libnspr4.so + 0x254de)
                                            #26 0x00007f4d14ac746f start_thread (libpthread.so.0 + 0x946f)
                                            #27 0x00007f4d1469c3d3 __clone (libc.so.6 + 0xff3d3)

libva-intel-driver-hybrid 2.4.0-1
Intel Corporation Iris Graphics 540 (rev 0a)
gnome 3.34.4
linux 5.5.7

Setting security.sandbox.content.level to 0 instead of 4 works around the issue and "it works" (although playback is horrible but that's another issue)

avc1 is not accelerated by va-api.

Well, when I got it to play (i.e. not crash) by disabling sandboxing, that is the format that was playing.

I had media.peerconnection.video.vp9_enabled set to false by the way.

Can you run it with MOZ_LOG="PlatformDecoderModule:5" and attach the log here?
Thanks.

Flags: needinfo?(kubrick)

(In reply to Martin Stránský [:stransky] from comment #2)

Only H.264 is supported now.

avc1 is not accelerated by va-api.

avc1 is h.264 though.

Confirmed with Nightly 20200303095030, KDE, Wayland, Debian Testing, Macbook Pro.
https://addons.mozilla.org/firefox/addon/enhanced-h264ify/
bp-bf189985-b0b2-468e-b54c-92d400200303
bp-98df9a4b-a984-420a-931f-9fe7d0200303
bp-5594921a-6a78-4cc8-baa9-b79420200303
It works with security.sandbox.content.level;0.

GDK_SCALE=2 GDK_DPI_SCALE=2 GDK_BACKEND=wayland MOZ_LOG="PlatformDecoderModule:5" ./firefox

[Child 9839: MediaPDecoder #1]: D/PlatformDecoderModule Initialising VA-API FFmpeg decoder
[Child 9839: MediaPDecoder #3]: D/PlatformDecoderModule AudioTrimmer[0x7fa8fbed1c00] ::operator(): sample[-114693,-68254] no trimming information
[Child 9839: MediaPDecoder #3]: D/PlatformDecoderModule AudioTrimmer[0x7fa8fbed1c00] ::HandleDecodedResult: sample[-114693,-68254] (decoded[-114693,-68254] no trimming needed
[AVHWDeviceContext @ 0x7fa8f57ce6c0] Trying to use DRM render node for device 0.
[AVHWDeviceContext @ 0x7fa8f57ce6c0] libva: VA-API version 1.6.0
[AVHWDeviceContext @ 0x7fa8f57ce6c0] libva: Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
[AVHWDeviceContext @ 0x7fa8f57ce6c0] libva: Found init function __vaDriverInit_1_6
Sandbox: seccomp sandbox violation: pid 9839, tid 9880, syscall 64, args 1140872792 1 1974 140363982438688 140363982438432 0.  Killing process.
Status: UNCONFIRMED → NEW
Crash Signature: [@ semget ]
Ever confirmed: true
Attached file vaapi.log.zst

This is the log with

GDK_BACKEND=wayland MOZ_ENABLE_WAYLAND=1 MOZ_LOG="PlatformDecoderModule:5"

Flags: needinfo?(kubrick)
Component: Audio/Video: Playback → Security: Process Sandboxing

Isn't VAAPI for Wayland supposed to be disabled by default?

with so many regressions being lodged, it seems to me that it's not.

Flags: needinfo?(stransky)

(In reply to Jean-Yves Avenard [:jya] from comment #9)

Isn't VAAPI for Wayland supposed to be disabled by default?

with so many regressions being lodged, it seems to me that it's not.

It's probably just because a lot of people are interested in testing; it seems to be behind the widget.wayland-dmabuf-vaapi.enabled config.

These are willingly encountered problems (opt-in by pref), please don't backout. WebRender is only enabled on Nightly, Wayland backend is enabled by env var and VAAPI by pref.

The driver does in fact appear to be using SysV IPC; here, for example. This is… not great; SysV IPC's security model is based only on uid/gid and as far as I know there's no way to use a broker to restrict access as we do with regular file accesses (by passing fds with SCM_RIGHTS).

We might be able to get away without completely allowing SysV IPC if the driver only ever uses fixed hard-coded key_t values, because those are just integers and can be filtered by seccomp-bpf. (For example, 1140872792 seen in comment #7 is the value defined in the driver source as #define DUAL_VDBOX_KEY ('D'<<24|'V'<<8|'X'<<0).) This can even work on 32-bit x86 before kernel 4.3, because ipc(2)'s arguments are shifted by one, not passed in memory as for socketcall(2). However, it means we can't use CLONE_NEWIPC. And of course it will break if the driver ever changes that implementation detail.
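
The arithmetic behind that key value, and the rest of the semget call reported by the sandbox, can be checked with a short Python sketch. IPC_CREAT and IPC_EXCL use the values from the Linux headers. Reading the first three logged args as semget(key, nsems, semflg) is an assumption based on the syscall number, and ALLOWED_KEYS and key_allowed are hypothetical names that only illustrate comparing the key against fixed integers; they are not actual sandbox code.

```python
# Flag values from <linux/ipc.h> (octal, as in the kernel headers).
IPC_CREAT = 0o1000
IPC_EXCL = 0o2000


def dual_vdbox_key():
    """Evaluate ('D'<<24 | 'V'<<8 | 'X'<<0) the way the C macro would."""
    return (ord("D") << 24) | (ord("V") << 8) | ord("X")


def decode_semget(args):
    """Interpret the first three logged args as semget(key, nsems, semflg)."""
    key, nsems, semflg = args[:3]
    return {
        "key": key,
        "nsems": nsems,
        "create": bool(semflg & IPC_CREAT),
        "exclusive": bool(semflg & IPC_EXCL),
        "mode": semflg & 0o777,
    }


# Hypothetical allowlist: the idea is only that key_t values are plain
# integers, so a filter can compare the first argument against fixed values.
ALLOWED_KEYS = {dual_vdbox_key()}


def key_allowed(args):
    """Return True if the key argument is one of the fixed, known keys."""
    return args[0] in ALLOWED_KEYS


# First three args from the violation logged in comment #7.
logged_args = [1140872792, 1, 1974]
call = decode_semget(logged_args)
print(hex(call["key"]), call["nsems"], oct(call["mode"]),
      call["create"], call["exclusive"], key_allowed(logged_args))
```

Under those assumptions the logged call corresponds to semget(DUAL_VDBOX_KEY, 1, IPC_CREAT | IPC_EXCL | 0666), which is consistent with the driver using a fixed key.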

Something else I'd like to find out is what options we have for moving the use of this driver into another process, like the RDD or GPU process, so that the sandbox exception isn't exposed to the process that's executing untrusted JS.

There's also the usual question of what Chromium does; they seem to have code for VA-API as a compile-time option, but not enabled in Chrome.

(In reply to Jed Davis [:jld] ⟨⏰|UTC-7⟩ ⟦he/him⟧ from comment #12)

The driver does in fact appear to be using SysV IPC; here, for example. This is… not great; SysV IPC's security model is based only on uid/gid and as far as I know there's no way to use a broker to restrict access as we do with regular file accesses (by passing fds with SCM_RIGHTS).

As mentioned in the bug where this feature was added.
This code must run in the GPU or RDD process. It shouldn't run in the content process as it does now.

So I don't believe fixing this bug as-is is a high priority, nor that it is a sandboxing issue.

Can this be distro-specific? I don't see the "seccomp sandbox violation" crash on Fedora.

(In reply to Jean-Yves Avenard [:jya] from comment #9)

Isn't VAAPI for Wayland supposed to be disabled by default?
with so many regressions being lodged, it seems to me that it's not.

Yes, it's definitely disabled by default.

Flags: needinfo?(stransky)

(In reply to Jed Davis [:jld] ⟨⏰|UTC-7⟩ ⟦he/him⟧ from comment #12)

There's also the usual question of what Chromium does; they seem to have code for VA-API as a compile-time option, but not enabled in Chrome.

Yes, Chrome/Chromium does not build it by default, only some distros enable it in custom builds.

Depends on: 1595994

This code must run in the GPU or RDD process. It shouldn't run in the content process as it does now.
So I don't believe fixing this bug as-is is a high priority, nor that it is a sandboxing issue.

I don't want to pass the hot potato/play ping-pong here, but as we both agree there's nothing actionable here on the sandboxing side, I'm moving this back to the AV component.

If this goes into the RDD process, that sandbox will need to have SysV restrictions lifted, but as per conversation with jld we can probably live with that.

Component: Security: Process Sandboxing → Audio/Video: Playback

If this goes into the RDD process, that sandbox will need to have SysV restrictions lifted, but as per conversation with jld we can probably live with that.

This bug is for that change.

For reference: Chromium has a compile-time option for VA-API, off by default, enabled in some downstream builds. From the layout of their codebase it looks like they run it in their GPU process, which seems to allow unrestricted use of SysV IPC; Chromium's renderer processes, in contrast, do not allow it. (Interestingly, they seem to use only seccomp-bpf filtering of the syscalls and don't unshare the IPC namespace.)

Allowing SysV IPC in the RDD process isn't ideal, but if the alternative is running ffmpeg in the GPU process: the GPU process currently doesn't have a sandbox on Linux, and its eventual sandbox would be weaker than for RDD, and the data immediately accessible to the RDD process would only be related to media decoding, not all of the browser's compositing. So, the RDD process seems to be the better place for this.

I'm just a user, and quite desperate to get "chrome equivalent" video playback on Firefox but, my 2 cents on this are:

  • Chrome currently achieves much smoother video playback with lower CPU usage than FF without VA-API (I can watch 4K YT video on Chrome, I can't on FF)
  • while VA-API saves CPU usage, it doesn't save much power as load increases on the GPU
  • I would rather not compromise security for performance

So maybe VA-API is not the solution here and software decoding with dmabuf will be enough eventually.

(In reply to Francois Guerraz from comment #20)

I'm just a user, and quite desperate to get "chrome equivalent" video playback on Firefox but, my 2 cents on this are:

  • Chrome currently achieves much smoother video playback with lower CPU usage than FF without VA-API (I can watch 4K YT video on Chrome, I can't on FF)
  • while VA-API saves CPU usage, it doesn't save much power as load increases on the GPU
  • I would rather not compromise security for performance

Please file a new bug about it; this one is about sandboxing. You may need to use the Gecko Profiler and other techniques to pin down the problem.

(In reply to Martin Stránský [:stransky] from comment #21)

Please file a new bug about it; this one is about sandboxing. You may need to use the Gecko Profiler and other techniques to pin down the problem.

No, my comment is very much about sandboxing. You don't need to change the sandbox if you ditch VA-API support, which I am arguing is probably the right thing to do.
I know you spent a lot of time on this so this is probably not nice to hear but I'm questioning the usefulness of this feature especially in the light of it breaking the security model, which is the topic of this bug report.

Again, I'm just a user, feel free to ignore :-)

(In reply to Francois Guerraz from comment #22)

(In reply to Martin Stránský [:stransky] from comment #21)

Please file a new bug about it; this one is about sandboxing. You may need to use the Gecko Profiler and other techniques to pin down the problem.

No, my comment is very much about sandboxing. You don't need to change the sandbox if you ditch VA-API support, which I am arguing is probably the right thing to do.

I know you spent a lot of time on this so this is probably not nice to hear but I'm questioning the usefulness of this feature especially in the light of it breaking the security model, which is the topic of this bug report.

VA-API support will not be officially released without sandboxing, of course.

Again, I'm just a user, feel free to ignore :-)

Martin is suggesting to open a new bug about improving video playback when VA-API is not available, as this bug is just about fixing the sandbox problem with VA-API.

(In reply to Francois Guerraz from comment #22)

(In reply to Martin Stránský [:stransky] from comment #21)

Please file a new bug about it; this one is about sandboxing. You may need to use the Gecko Profiler and other techniques to pin down the problem.
No, my comment is very much about sandboxing. You don't need to change the sandbox if you ditch VA-API support, which I am arguing is probably the right thing to do.
I know you spent a lot of time on this so this is probably not nice to hear but I'm questioning the usefulness of this feature especially in the light of it breaking the security model, which is the topic of this bug report.

Again, I'm just a user, feel free to ignore :-)

I think your problem depends on your actual HW/SW configuration. I did tests with Chrome/Chromium/Firefox some time ago and all of them use the same %CPU, at least on my PC, which is a 4-year-old Lenovo laptop. I also tested 4K playback (https://openalt.cz/2019/slides/martin-stransky-firefox-stav-a-budoucnost-nejen-na-linuxu.pdf - sorry, the slides are in Czech as it was a talk at a local conference).

When both Firefox and Chrome use ffmpeg there isn't any reason why Chrome should be faster and Firefox slower, as video playback is quite a simple task: you just get frames from ffmpeg and paint them. I have also seen poor playback performance on my HW when, for instance, the sound driver was broken/stuck and so on, so I think it's something to fix for your particular setup.

(In reply to Marco Castelluccio [:marco] from comment #23)

Martin is suggesting to open a new bug about improving video playback when VA-API is not available, as this bug is just about fixing the sandbox problem with VA-API.

It may be related to an actual vaapi driver version. I was told that there are two drivers available, intel-vaapi-driver and intel-media-driver.

I use intel-media-driver and there isn't any sandboxing issue, which is also why I didn't hit this during development. Also, intel-media-driver has some patent issues (I was told), so it's not available by default but only from third-party repos.

intel-vaapi-driver is the older one. I didn't test it as it's not available in Fedora, but I expect the reports here come from it.

Confirmed! The crash only occurs with the new iHD driver and not with the old i965 driver:

GDK_BACKEND=wayland LIBVA_DRIVER_NAME=i965 ./firefox works and does not crash.
https://wiki.archlinux.org/index.php/Hardware_video_acceleration#Configuring_VA-API

My drivers:

$ ls /usr/lib/x86_64-linux-gnu/dri/*_drv_video.so
/usr/lib/x86_64-linux-gnu/dri/i965_drv_video.so  /usr/lib/x86_64-linux-gnu/dri/nouveau_drv_video.so  /usr/lib/x86_64-linux-gnu/dri/radeonsi_drv_video.so
/usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so   /usr/lib/x86_64-linux-gnu/dri/r600_drv_video.so

I have both installed:

iHD is used by default for my "Mesa DRI Intel(R) Iris 6100 (Broadwell GT3)":

$ vainfo
libva info: VA-API version 1.6.0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
libva info: Found init function __vaDriverInit_1_6
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.6 (libva 2.6.0)
vainfo: Driver version: Intel iHD driver - 19.4.0
vainfo: Supported profile and entrypoints
      VAProfileMPEG2Simple            : VAEntrypointVLD
      VAProfileMPEG2Main              : VAEntrypointVLD
      VAProfileH264Main               : VAEntrypointVLD
      VAProfileH264High               : VAEntrypointVLD
      VAProfileJPEGBaseline           : VAEntrypointVLD
      VAProfileH264ConstrainedBaseline: VAEntrypointVLD
      VAProfileVP8Version0_3          : VAEntrypointVLD

$ LIBVA_DRIVER_NAME=i965 vainfo
libva info: VA-API version 1.6.0
libva info: User environment variable requested driver 'i965'
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/i965_drv_video.so
libva info: Found init function __vaDriverInit_1_5
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.6 (libva 2.6.0)
vainfo: Driver version: Intel i965 driver for Intel(R) Broadwell - 2.4.0
vainfo: Supported profile and entrypoints
      VAProfileMPEG2Simple            : VAEntrypointVLD
      VAProfileMPEG2Simple            : VAEntrypointEncSlice
      VAProfileMPEG2Main              : VAEntrypointVLD
      VAProfileMPEG2Main              : VAEntrypointEncSlice
      VAProfileH264ConstrainedBaseline: VAEntrypointVLD
      VAProfileH264ConstrainedBaseline: VAEntrypointEncSlice
      VAProfileH264Main               : VAEntrypointVLD
      VAProfileH264Main               : VAEntrypointEncSlice
      VAProfileH264High               : VAEntrypointVLD
      VAProfileH264High               : VAEntrypointEncSlice
      VAProfileH264MultiviewHigh      : VAEntrypointVLD
      VAProfileH264StereoHigh         : VAEntrypointVLD
      VAProfileVC1Simple              : VAEntrypointVLD
      VAProfileVC1Main                : VAEntrypointVLD
      VAProfileVC1Advanced            : VAEntrypointVLD
      VAProfileJPEGBaseline           : VAEntrypointVLD
      VAProfileVP8Version0_3          : VAEntrypointVLD
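
To tell quickly which of the two drivers libva ends up loading, the vainfo text output shown above can be parsed. This is a minimal Python sketch that relies only on the "Trying to open" and "Driver version" lines as they appear in this comment; the function names are made up for illustration, and taking the last "Trying to open" line as the loaded driver is an assumption.

```python
import re


def vainfo_driver(output):
    """Return (driver_path, driver_version) parsed from vainfo text output.

    Either element is None if the corresponding line is missing.
    """
    # Assumption: if libva tries several drivers, the last one tried
    # is the one that was actually loaded.
    opened = re.findall(r"Trying to open (\S+_drv_video\.so)", output)
    version = re.search(r"^vainfo: Driver version: (.+)$", output, re.MULTILINE)
    return (
        opened[-1] if opened else None,
        version.group(1).strip() if version else None,
    )


def uses_ihd(output):
    """Return True if the loaded driver file is iHD_drv_video.so."""
    path, _ = vainfo_driver(output)
    return path is not None and path.endswith("/iHD_drv_video.so")


# Header lines copied from the default vainfo run in this comment.
SAMPLE = """\
libva info: VA-API version 1.6.0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
libva info: Found init function __vaDriverInit_1_6
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.6 (libva 2.6.0)
vainfo: Driver version: Intel iHD driver - 19.4.0
"""

path, version = vainfo_driver(SAMPLE)
print(path, "|", version, "| iHD in use:", uses_ihd(SAMPLE))
```

Running the same check on the output of LIBVA_DRIVER_NAME=i965 vainfo would report the i965 driver file instead.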

I observe the same behaviour as [:darkspirit] on Arch.

I'm not getting the crash with iHD_drv_video.so on Fedora 31/32.

$ vainfo
libva info: VA-API version 1.6.0
libva info: Trying to open /usr/lib64/dri/iHD_drv_video.so
libva info: Found init function __vaDriverInit_1_6
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.6 (libva 2.6.1)
vainfo: Driver version: Intel iHD driver - 1.0.0

(In reply to Martin Stránský [:stransky] from comment #28)

1.0.0

That's the non-free variant of the iHD driver: https://packages.debian.org/testing/intel-media-va-driver-non-free
Installing that package removes intel-media-va-driver; it has the same sandbox issue for me, but supports encoding even with my old Broadwell.
I assume these patent issues you mentioned could be about encoding with older devices.

$ vainfo
libva info: VA-API version 1.6.0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
libva info: Found init function __vaDriverInit_1_6
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.6 (libva 2.6.0)
vainfo: Driver version: Intel iHD driver - 1.0.0
vainfo: Supported profile and entrypoints
      VAProfileNone                   : VAEntrypointVideoProc
      VAProfileNone                   : VAEntrypointStats
      VAProfileMPEG2Simple            : VAEntrypointVLD
      VAProfileMPEG2Simple            : VAEntrypointEncSlice
      VAProfileMPEG2Main              : VAEntrypointVLD
      VAProfileMPEG2Main              : VAEntrypointEncSlice
      VAProfileH264Main               : VAEntrypointVLD
      VAProfileH264Main               : VAEntrypointEncSlice
      VAProfileH264Main               : VAEntrypointFEI
      VAProfileH264High               : VAEntrypointVLD
      VAProfileH264High               : VAEntrypointEncSlice
      VAProfileH264High               : VAEntrypointFEI
      VAProfileVC1Simple              : VAEntrypointVLD
      VAProfileVC1Main                : VAEntrypointVLD
      VAProfileVC1Advanced            : VAEntrypointVLD
      VAProfileJPEGBaseline           : VAEntrypointVLD
      VAProfileH264ConstrainedBaseline: VAEntrypointVLD
      VAProfileH264ConstrainedBaseline: VAEntrypointEncSlice
      VAProfileH264ConstrainedBaseline: VAEntrypointFEI
      VAProfileVP8Version0_3          : VAEntrypointVLD

https://github.com/intel/media-driver

  • Full Feature Build is default driver build, which supports all feature by hardware accelerator and close source shaders(media kernel binaries). Ubuntu intel-media-va-driver-non-free package is generated from this build type.
  • Free Kernel Build, enables fully open source shaders(media kernels) and hardware features but the features would be limited. Ubuntu intel-media-va-driver package is generated from this build type.

while VA-API saves CPU usage, it doesn't save much power as load increases on the GPU

Do you have any proof of that? I didn't measure with Firefox as it's using too much CPU anyway, but a while ago I ran a test using the mpv player. On a Skylake CPU/iGPU software decoding used 2.5x more power (31 W) than hardware decoding (25 W) as a delta over idle (21 W), measured at the power plug. That was with a 1080p H.264 video. So it seems to me that hardware acceleration can be a huge win.
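
The 2.5x figure follows from comparing each reading as a delta over idle rather than as an absolute value. A small Python sketch of that arithmetic, using only the three wall-plug readings quoted above (the variable and function names are illustrative):

```python
IDLE_W = 21.0      # system idle, measured at the power plug
SOFTWARE_W = 31.0  # software decoding of the 1080p H.264 video
HARDWARE_W = 25.0  # hardware decoding of the same video


def delta_over_idle(measured_w, idle_w=IDLE_W):
    """Power attributable to playback: the reading minus the idle baseline."""
    return measured_w - idle_w


software_delta = delta_over_idle(SOFTWARE_W)  # 10 W above idle
hardware_delta = delta_over_idle(HARDWARE_W)  # 4 W above idle
ratio = software_delta / hardware_delta

print(f"software: +{software_delta} W, hardware: +{hardware_delta} W, "
      f"ratio: {ratio}x")
```

Note that the ratio of the absolute readings (31 W against 25 W) is much smaller; the 2.5x applies only to the extra power drawn by playback itself.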

I would rather not compromise security for performance

You will likely be able to disable VA-API, but I don't think allowing some extra SysV IPC calls is that much of a security issue.

I observe the same behaviour as [:darkspirit] on Arch.

I haven't run into any crashes with intel-media-driver 19.4.0.r-1 on Arch Linux.

I think I have had the same issue for many days, but I am using an AMD GPU (RX 570). The tab crashes when the browser tries to play a video.

[Parent 19816: Main Thread]: D/WidgetWayland moz_gtk_widget_get_wl_surface [0x7fb470d1da60] wl_surface 0x7fb4703631f0 ID 43
Sandbox: seccomp sandbox violation: pid 20199, tid 20305, syscall 312, args 20199 20199 0 40 58 140523005886528. Killing process.

[Parent 19816: Main Thread]: D/WidgetWayland moz_gtk_widget_get_wl_surface [0x7fb470d1da60] wl_surface 0x7fb4703631f0 ID 43

Some crash reports:
https://crash-stats.mozilla.org/report/index/4013caef-3c7a-47a4-b84f-6ca410200303
https://crash-stats.mozilla.org/report/index/a4269751-5465-4ce7-b187-3d9910200319

I tried with a clean profile and the same happens. I am using the latest Nightly on Arch.

vainfo:

vainfo: VA-API version: 1.6 (libva 2.6.0)
vainfo: Driver version: Mesa Gallium driver 20.1.0-devel for Radeon RX 570 Series (POLARIS10, DRM 3.36.0, 5.5.10-zen1-1-zen, LLVM 11.0.0)
vainfo: Supported profile and entrypoints
VAProfileMPEG2Simple : VAEntrypointVLD
VAProfileMPEG2Main : VAEntrypointVLD
VAProfileVC1Simple : VAEntrypointVLD
VAProfileVC1Main : VAEntrypointVLD
VAProfileVC1Advanced : VAEntrypointVLD
VAProfileH264ConstrainedBaseline: VAEntrypointVLD
VAProfileH264ConstrainedBaseline: VAEntrypointEncSlice
VAProfileH264Main : VAEntrypointVLD
VAProfileH264Main : VAEntrypointEncSlice
VAProfileH264High : VAEntrypointVLD
VAProfileH264High : VAEntrypointEncSlice
VAProfileHEVCMain : VAEntrypointVLD
VAProfileHEVCMain : VAEntrypointEncSlice
VAProfileHEVCMain10 : VAEntrypointVLD
VAProfileJPEGBaseline : VAEntrypointVLD
VAProfileNone : VAEntrypointVideoProc

Two logs:
GDK_BACKEND=wayland MOZ_LOG="WidgetWayland:5" MOZ_LOG="PlatformDecoderModule:5" firefox-nightly

link to log:
https://pastebin.com/jd0NtXXH

GDK_BACKEND=wayland MOZ_LOG="WidgetWayland:5" MOZ_LOG="PlatformDecoderModule:5" LD_DEBUG=libs firefox-nightly

link to log:
https://pastebin.com/d7JWw6pY

Archlinux, mesa-git (20.1.0). RX 570.

This is going to be fixed by Bug 1595994

(In reply to Leonardo from comment #31)

I think I have had the same issue for many days, but I am using an AMD GPU (RX 570). The tab crashes when the browser tries to play a video.

[Parent 19816: Main Thread]: D/WidgetWayland moz_gtk_widget_get_wl_surface [0x7fb470d1da60] wl_surface 0x7fb4703631f0 ID 43
Sandbox: seccomp sandbox violation: pid 20199, tid 20305, syscall 312, args 20199 20199 0 40 58 140523005886528. Killing process.

Translation: kcmp(getpid(), getpid(), KCMP_FILE, 40, 58) (the last two arguments are fds).

So we'll also have to modify the sandbox policy to allow kcmp, restricted to the process's own pid.
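
For reference, the translation above can be reproduced mechanically from the log line. The following Python sketch parses a "seccomp sandbox violation" line and evaluates an "own pid only" rule for kcmp. The syscall table covers only the two x86_64 numbers seen in this bug, and kcmp_allowed is a hypothetical predicate illustrating the restriction described here, not the actual sandbox policy code.

```python
import re

# Matches the violation format seen in the logs in this bug.
VIOLATION_RE = re.compile(
    r"seccomp sandbox violation: pid (\d+), tid (\d+), "
    r"syscall (\d+), args ([0-9 ]+)"
)

# x86_64 syscall numbers; only the two that appear in this bug.
X86_64_SYSCALLS = {64: "semget", 312: "kcmp"}

KCMP_FILE = 0  # first value of enum kcmp_type in <linux/kcmp.h>


def parse_violation(line):
    """Extract pid, tid, syscall name and integer args from a log line."""
    m = VIOLATION_RE.search(line)
    if m is None:
        return None
    nr = int(m.group(3))
    return {
        "pid": int(m.group(1)),
        "tid": int(m.group(2)),
        "syscall": X86_64_SYSCALLS.get(nr, str(nr)),
        "args": [int(a) for a in m.group(4).split()],
    }


def kcmp_allowed(own_pid, args):
    """Hypothetical rule: allow kcmp only when both pids are the caller's own."""
    pid1, pid2 = args[0], args[1]
    return pid1 == own_pid and pid2 == own_pid


LINE = ("Sandbox: seccomp sandbox violation: pid 20199, tid 20305, "
        "syscall 312, args 20199 20199 0 40 58 140523005886528. "
        "Killing process.")

v = parse_violation(LINE)
pid1, pid2, kind, fd1, fd2 = v["args"][:5]
print(v["syscall"], pid1, pid2,
      "KCMP_FILE" if kind == KCMP_FILE else kind, fd1, fd2,
      "allowed:", kcmp_allowed(v["pid"], v["args"]))
```

A real seccomp-bpf filter cannot call getpid() while it runs, so the pid to compare against would presumably have to be fixed when the filter is built; that detail is outside this sketch.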

(In reply to Jed Davis [:jld] ⟨⏰|UTC-7⟩ ⟦he/him⟧ from comment #34)

(In reply to Leonardo from comment #31)

[Parent 19816: Main Thread]: D/WidgetWayland moz_gtk_widget_get_wl_surface [0x7fb470d1da60] wl_surface 0x7fb4703631f0 ID 43
Sandbox: seccomp sandbox violation: pid 20199, tid 20305, syscall 312, args 20199 20199 0 40 58 140523005886528. Killing process.

Translation: kcmp(getpid(), getpid(), KCMP_FILE, 40, 58) (the last two arguments are fds).

So we'll also have to modify the sandbox policy to allow kcmp, restricted to the process's own pid.

FWIW, Mesa really only needs KCMP_FILE, in case that helps.

(In reply to Leonardo from comment #31)

I tried with a clean profile and the same happens.

But VA-API support is opt-in by pref (comment #11), so a clean profile wouldn't be trying to use it, I don't think?

In any case, the reason Mesa uses kcmp doesn't seem to be inherently specific to VA-API, so let's move that problem to bug 1624743 and I'll fix that for content processes.

Depends on: 1624743

(In reply to Jed Davis [:jld] ⟨⏰|UTC-6⟩ ⟦he/him⟧ from comment #36)

(In reply to Leonardo from comment #31)

I tried with a clean profile and the same happens.

But VA-API support is opt-in by pref (comment #11), so a clean profile wouldn't be trying to use it, I don't think?

I used a clean profile to avoid any conflicts with my settings, extensions, etc. The only settings manually enabled were the VA-API related switches.

Crash Signature: [@ semget ] → [@ semget ] [@ arena_dalloc | replace_free | mozilla::FFmpegVideoDecoder<T>::InitVAAPIDecoder]
See Also: → 1647283

I am using the iHD driver, and have media.ffvpx.enabled set to False. I do NOT experience any crash upon playing videos. I am not sure if the GPU is actually being used to decode.

Below is the initial output from the log for the PlatformDecoderModule:

[Child 12785: MediaPDecoder #1]: D/PlatformDecoderModule Initialising VA-API FFmpeg decoder
[RDD 13175: MediaPDecoder #1]: D/PlatformDecoderModule OpusDataDecoder[0x7fef721670b0] ::ProcessDecode: Opus decoder skipping 312 of 960 frames
[AVHWDeviceContext @ 0x7f40e1c82600] Format 0x41524742 -> bgra.
[AVHWDeviceContext @ 0x7f40e1c82600] Format 0x42475241 -> argb.
[AVHWDeviceContext @ 0x7f40e1c82600] Format 0x41424752 -> rgba.
[AVHWDeviceContext @ 0x7f40e1c82600] Format 0x52474241 -> abgr.
[AVHWDeviceContext @ 0x7f40e1c82600] Format 0x58524742 -> bgr0.
[AVHWDeviceContext @ 0x7f40e1c82600] Format 0x42475258 -> 0rgb.
[AVHWDeviceContext @ 0x7f40e1c82600] Format 0x58424752 -> rgb0.
[AVHWDeviceContext @ 0x7f40e1c82600] Format 0x52474258 -> 0bgr.
[AVHWDeviceContext @ 0x7f40e1c82600] Format 0x30335241 -> unknown.
[AVHWDeviceContext @ 0x7f40e1c82600] Format 0x30334241 -> unknown.
[AVHWDeviceContext @ 0x7f40e1c82600] Format 0x30335258 -> unknown.
[AVHWDeviceContext @ 0x7f40e1c82600] Format 0x30334258 -> unknown.
[AVHWDeviceContext @ 0x7f40e1c82600] Format 0x36314752 -> unknown.
[AVHWDeviceContext @ 0x7f40e1c82600] Format 0x50424752 -> unknown.
[AVHWDeviceContext @ 0x7f40e1c82600] Format 0x50524742 -> unknown.
[AVHWDeviceContext @ 0x7f40e1c82600] Format 0x56555941 -> unknown.
[AVHWDeviceContext @ 0x7f40e1c82600] Format 0x30303859 -> gray.
[AVHWDeviceContext @ 0x7f40e1c82600] Format 0x3231564e -> nv12.
[AVHWDeviceContext @ 0x7f40e1c82600] Format 0x3132564e -> unknown.
[AVHWDeviceContext @ 0x7f40e1c82600] Format 0x32595559 -> yuyv422.
[AVHWDeviceContext @ 0x7f40e1c82600] Format 0x59565955 -> uyvy422.
[AVHWDeviceContext @ 0x7f40e1c82600] Format 0x32315659 -> yuv420p.
[AVHWDeviceContext @ 0x7f40e1c82600] Format 0x30323449 -> yuv420p.
[AVHWDeviceContext @ 0x7f40e1c82600] Format 0x50313134 -> yuv411p.
[AVHWDeviceContext @ 0x7f40e1c82600] Format 0x48323234 -> yuv422p.
[AVHWDeviceContext @ 0x7f40e1c82600] Format 0x56323234 -> yuv440p.
[AVHWDeviceContext @ 0x7f40e1c82600] Format 0x50343434 -> yuv444p.
[AVHWDeviceContext @ 0x7f40e1c82600] Format 0x33434d49 -> unknown.
[AVHWDeviceContext @ 0x7f40e1c82600] Format 0x30313050 -> p010le.
[AVHWDeviceContext @ 0x7f40e1c82600] VAAPI driver: Intel iHD driver for Intel(R) Gen Graphics - 20.1.1 ().
[AVHWDeviceContext @ 0x7f40e1c82600] Driver not found in known nonstandard list, using standard behaviour.
[Child 12785: MediaPDecoder #1]: D/PlatformDecoderModule VA-API FFmpeg init successful
[Child 12785: MediaPDecoder #2]: D/PlatformDecoderModule AudioTrimmer[0x7f40e1ce9660] ::HandleDecodedResult: sample[0,21000] (decoded[0,13500] no trimming needed
[Child 12785: MediaPDecoder #1]: D/PlatformDecoderModule Choosing FFmpeg pixel format for VA-API video decoding.
[Child 12785: MediaPDecoder #1]: D/PlatformDecoderModule Requesting pixel format VAAPI_VLD
[vp9 @ 0x7f40ea34b000] Format vaapi_vld chosen by get_format().
[vp9 @ 0x7f40ea34b000] Format vaapi_vld requires hwaccel initialisation.
[vp9 @ 0x7f40ea34b000] Considering format 0x3231564e -> nv12.
[vp9 @ 0x7f40ea34b000] Picked nv12 (0x3231564e) as best match for yuv420p.
[AVHWFramesContext @ 0x7f40d8566f80] Created surface 0.
[AVHWFramesContext @ 0x7f40d8566f80] Direct mapping possible.
[AVHWFramesContext @ 0x7f40d8566f80] Created surface 0x1.
[AVHWFramesContext @ 0x7f40d8566f80] Created surface 0x2.
[AVHWFramesContext @ 0x7f40d8566f80] Created surface 0x3.
[AVHWFramesContext @ 0x7f40d8566f80] Created surface 0x4.
[AVHWFramesContext @ 0x7f40d8566f80] Created surface 0x5.
[AVHWFramesContext @ 0x7f40d8566f80] Created surface 0x6.
[AVHWFramesContext @ 0x7f40d8566f80] Created surface 0x7.
[AVHWFramesContext @ 0x7f40d8566f80] Created surface 0x8.
[AVHWFramesContext @ 0x7f40d8566f80] Created surface 0x9.
[AVHWFramesContext @ 0x7f40d8566f80] Created surface 0xa.
[AVHWFramesContext @ 0x7f40d8566f80] Created surface 0xb.
[AVHWFramesContext @ 0x7f40d8566f80] Created surface 0xc.
[AVHWFramesContext @ 0x7f40d8566f80] Created surface 0xd.
[AVHWFramesContext @ 0x7f40d8566f80] Created surface 0xe.
[AVHWFramesContext @ 0x7f40d8566f80] Created surface 0xf.
[AVHWFramesContext @ 0x7f40d8566f80] Created surface 0x10.
[AVHWFramesContext @ 0x7f40d8566f80] Created surface 0x11.
[vp9 @ 0x7f40ea34b000] Considering format 0x3231564e -> nv12.
[vp9 @ 0x7f40ea34b000] Picked nv12 (0x3231564e) as best match for yuv420p.
[vp9 @ 0x7f40ea34b000] Decode context initialised: 0x1b/0x10000000.
[vp9 @ 0x7f40ea34b000] Param buffer (type 0, 92 bytes) is 0.
[vp9 @ 0x7f40ea34b000] Slice 0 param buffer (316 bytes) is 0x1.
[vp9 @ 0x7f40ea34b000] Slice 0 data buffer (303136 bytes) is 0x2.
[vp9 @ 0x7f40ea34b000] Decode to surface 0x11.
[Child 12785: MediaPDecoder #1]: D/PlatformDecoderModule Got one VAAPI frame output with pts=0 dts=0 duration=33000 opaque=-9223372036854775808
[Child 12785: MediaPDecoder #1]: D/PlatformDecoderModule Created dmabuf UID = 1 HW surface 11
[Child 12785: MediaPDecoder #1]: D/PlatformDecoderModule VAAPIFrameHolder is adding dmabuf surface UID = 1
[Child 12785: MediaPDecoder #2]: D/PlatformDecoderModule VAAPIFrameHolder is releasing dmabuf surface UID = 1
[Child 12807: Main Thread]: D/PlatformDecoderModule Sandbox decoder rejects requested type
[Child 12807: Main Thread]: D/PlatformDecoderModule Sandbox decoder rejects requested type
[Child 12807: Main Thread]: D/PlatformDecoderModule Sandbox decoder rejects requested type
[Child 12807: Main Thread]: D/PlatformDecoderModule Sandbox decoder rejects requested type
[Child 12807: Main Thread]: D/PlatformDecoderModule Sandbox decoder rejects requested type
[Child 12807: MediaPlayback #1]: D/PlatformDecoderModule Sandbox decoder rejects requested type
[Child 12807: MediaPlayback #1]: D/PlatformDecoderModule Sandbox decoder rejects requested type
[Child 12807: MediaPDecoder #1]: D/PlatformDecoderModule Initialising FFmpeg decoder.
[Child 12807: MediaPDecoder #1]: D/PlatformDecoderModule FFmpeg init successful.
[Child 12807: MediaPDecoder #1]: D/PlatformDecoderModule AudioTrimmer[0x7f9ad04476a0] ::operator(): sample[0,24000] no trimming information
[Child 12807: MediaPDecoder #1]: D/PlatformDecoderModule AudioTrimmer[0x7f9ad04476a0] ::HandleDecodedResult: sample[0,24000] (decoded[0,24000] no trimming needed
[Child 12807: Main Thread]: D/PlatformDecoderModule Sandbox decoder rejects requested type
[Child 12807: Main Thread]: D/PlatformDecoderModule Sandbox decoder rejects requested type
[Child 12807: MediaPlayback #1]: D/PlatformDecoderModule Sandbox decoder rejects requested type
[Child 12807: MediaPlayback #2]: D/PlatformDecoderModule Sandbox decoder rejects requested type
[Child 12807: MediaPDecoder #3]: D/PlatformDecoderModule Initialising FFmpeg decoder.
[Child 12807: MediaPDecoder #3]: D/PlatformDecoderModule FFmpeg init successful.
[Child 12807: MediaPDecoder #1]: D/PlatformDecoderModule AudioTrimmer[0x7f9ad0447ba0] ::operator(): sample[0,24000] no trimming information
[Child 12807: MediaPDecoder #1]: D/PlatformDecoderModule AudioTrimmer[0x7f9ad0447ba0] ::HandleDecodedResult: sample[0,24000] (decoded[0,24000] no trimming needed
[Child 12807: Main Thread]: D/PlatformDecoderModule Sandbox decoder rejects requested type
[Child 12807: Main Thread]: D/PlatformDecoderModule Sandbox decoder rejects requested type
[Child 12807: MediaPlayback #1]: D/PlatformDecoderModule Sandbox decoder rejects requested type
[Child 12807: MediaPlayback #2]: D/PlatformDecoderModule Sandbox decoder rejects requested type
[Child 12807: MediaPDecoder #2]: D/PlatformDecoderModule Initialising FFmpeg decoder.
[Child 12807: MediaPDecoder #2]: D/PlatformDecoderModule FFmpeg init successful.
[Child 12807: MediaPDecoder #3]: D/PlatformDecoderModule AudioTrimmer[0x7f9ad0447ba0] ::operator(): sample[0,24000] no trimming information
[Child 12807: MediaPDecoder #3]: D/PlatformDecoderModule AudioTrimmer[0x7f9ad0447ba0] ::HandleDecodedResult: sample[0,24000] (decoded[0,24000] no trimming needed
[Parent 12689: MediaTelemetry]: D/PlatformDecoderModule Sandbox decoder rejects requested type
[Parent 12689: MediaTelemetry]: D/PlatformDecoderModule Sandbox decoder rejects requested type

###!!! [Parent][MessageChannel] Error: (msgtype=0x6A0008,name=PMessagePort::Msg___delete__) Channel closing: too late to send/recv, messages will be lost


###!!! [Parent][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost


###!!! [Parent][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost


###!!! [Parent][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost


###!!! [Parent][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost

Below is what is repeatedly output once I start playing a video from YouTube:

[vp9 @ 0x7f40e1cd8800] Param buffer (type 0, 92 bytes) is 0x2.
[vp9 @ 0x7f40e1cd8800] Slice 0 param buffer (316 bytes) is 0x1.
[vp9 @ 0x7f40e1cd8800] Slice 0 data buffer (251 bytes) is 0.
[vp9 @ 0x7f40e1cd8800] Decode to surface 0xe.
[Child 12785: MediaPDecoder #4]: D/PlatformDecoderModule VAAPIFrameHolder is releasing dmabuf surface UID = 328
[Child 12785: MediaPDecoder #4]: D/PlatformDecoderModule Got one VAAPI frame output with pts=11066000 dts=11066000 duration=34000 opaque=-9223372036854775808
[Child 12785: MediaPDecoder #5]: D/PlatformDecoderModule AudioTrimmer[0x7f40d38e5560] ::operator(): sample[12961000,12981000] no trimming information
[Child 12785: MediaPDecoder #4]: D/PlatformDecoderModule Created dmabuf UID = 334 HW surface e
[Child 12785: MediaPDecoder #4]: D/PlatformDecoderModule VAAPIFrameHolder is adding dmabuf surface UID = 334
[Child 12785: MediaPDecoder #3]: D/PlatformDecoderModule AudioTrimmer[0x7f40d38e5560] ::HandleDecodedResult: sample[12961000,12981000] (decoded[12954500,12974500] no trimming needed

The following is the output from vainfo:

vainfo: VA-API version: 1.8 (libva 2.7.1)
vainfo: Driver version: Intel iHD driver for Intel(R) Gen Graphics - 20.1.1 ()
vainfo: Supported profile and entrypoints
      VAProfileNone                   : VAEntrypointVideoProc
      VAProfileNone                   : VAEntrypointStats
      VAProfileMPEG2Simple            : VAEntrypointVLD
      VAProfileMPEG2Simple            : VAEntrypointEncSlice
      VAProfileMPEG2Main              : VAEntrypointVLD
      VAProfileMPEG2Main              : VAEntrypointEncSlice
      VAProfileH264Main               : VAEntrypointVLD
      VAProfileH264Main               : VAEntrypointEncSlice
      VAProfileH264Main               : VAEntrypointFEI
      VAProfileH264Main               : VAEntrypointEncSliceLP
      VAProfileH264High               : VAEntrypointVLD
      VAProfileH264High               : VAEntrypointEncSlice
      VAProfileH264High               : VAEntrypointFEI
      VAProfileH264High               : VAEntrypointEncSliceLP
      VAProfileVC1Simple              : VAEntrypointVLD
      VAProfileVC1Main                : VAEntrypointVLD
      VAProfileVC1Advanced            : VAEntrypointVLD
      VAProfileJPEGBaseline           : VAEntrypointVLD
      VAProfileJPEGBaseline           : VAEntrypointEncPicture
      VAProfileH264ConstrainedBaseline: VAEntrypointVLD
      VAProfileH264ConstrainedBaseline: VAEntrypointEncSlice
      VAProfileH264ConstrainedBaseline: VAEntrypointFEI
      VAProfileH264ConstrainedBaseline: VAEntrypointEncSliceLP
      VAProfileVP8Version0_3          : VAEntrypointVLD
      VAProfileVP8Version0_3          : VAEntrypointEncSlice
      VAProfileHEVCMain               : VAEntrypointVLD
      VAProfileHEVCMain               : VAEntrypointEncSlice
      VAProfileHEVCMain               : VAEntrypointFEI
      VAProfileHEVCMain10             : VAEntrypointVLD
      VAProfileHEVCMain10             : VAEntrypointEncSlice
      VAProfileVP9Profile0            : VAEntrypointVLD
      VAProfileVP9Profile2            : VAEntrypointVLD

I wanted to report a finding similar to comment #40, except that in my case it is a hardware-decoded avc1 video, not vp9 (which also works, but was not reported as crashing in #1). Playback of the video doesn't lead to sandbox violations, I think. At least Firefox is not crashing and the video is displayed normally. security.sandbox.content.level is left untouched (4 by default).

Running firefox 78.0.1 on an up-to-date arch system. Log of MOZ_ENABLE_WAYLAND=1 MOZ_LOG="PlatformDecoderModule:5" LIBVA_DRIVER_NAME=iHD firefox for the video from #1 available on pastebin. Video playback was restricted to avc1 via enhanced-h264ify.

Blocks: 1652958

(In reply to Jed Davis [:jld] ⟨⏰|UTC-6⟩ ⟦he/him⟧ from comment #12)

The driver does in fact appear to be using SysV IPC; here, for example. This is… not great; SysV IPC's security model is based only on uid/gid and as far as I know there's no way to use a broker to restrict access as we do with regular file accesses (by passing fds with SCM_RIGHTS).

We might be able to get away without completely allowing SysV IPC if the driver only ever uses fixed hard-coded key_t values, because those are just integers and can be filtered by seccomp-bpf. (For example, 1140872792 seen in comment #7 is the value defined in the driver source as #define DUAL_VDBOX_KEY ('D'<<24|'V'<<8|'X'<<0).) This can even work on 32-bit x86 before kernel 4.3, because ipc(2)'s arguments are shifted by one, not passed in memory as for socketcall(2). However, it means we can't use CLONE_NEWIPC. And of course it will break if the driver ever changes that implementation detail.
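As a quick check of the arithmetic in the paragraph above, here is a minimal sketch that recomputes the key from the quoted macro. Only the macro and the logged value come from this bug; the variable names are made up for illustration.

```python
# Recompute the key from the driver macro quoted above:
#   #define DUAL_VDBOX_KEY ('D'<<24|'V'<<8|'X'<<0)
DUAL_VDBOX_KEY = (ord("D") << 24) | (ord("V") << 8) | (ord("X") << 0)

# Value quoted above as the first argument of the blocked syscall.
LOGGED_KEY = 1140872792

print(hex(DUAL_VDBOX_KEY))           # 0x44005658
print(DUAL_VDBOX_KEY == LOGGED_KEY)  # True
```

Because the key is a plain integer constant, a filter only has to compare the first syscall argument against it, which is the kind of check seccomp-bpf can do.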

Something else I'd like to find out is what options we have for moving the use of this driver into another process, like the RDD or GPU process, so that the sandbox exception isn't exposed to the process that's executing untrusted JS.

There's also the usual question of what Chromium does; they seem to have code for VA-API as a compile-time option, but not enabled in Chrome.

The IPC is used for Gen9 VDBox balancing: the driver needs to record the task for each VDBox across processes. We now plan to change the logic to random dispatch and will remove these IPC calls.

You can run intel_gpu_top as root to check how the GPU is utilized.

Summary: [Wayland][VA-API] libva video playback leads to sandbox violations → VAAPI video playback leads to sandbox violations (Temporary solution: Uninstall iHD driver (intel-media-driver) and install i965 driver (i965-va-driver == intel-vaapi-driver))
Crash Signature: [@ semget ] [@ arena_dalloc | replace_free | mozilla::FFmpegVideoDecoder<T>::InitVAAPIDecoder] → [@ semget ] [@ arena_dalloc | replace_free | mozilla::FFmpegVideoDecoder<T>::InitVAAPIDecoder] [@ @0x0 | iHD_drv_video.so@0x42d3ba]

So what is the actual problem? Could a summary be added to the original description? Can this be worked around in the Intel driver?

(In reply to Paul Menzel from comment #61)

So what is the actual problem? Could a summary be added to the original description? Can this be worked around in the Intel driver?

Summary:

  • iHD sometimes uses a syscall that Firefox, for good reason, does not want to allow in the content process sandbox. An Intel developer promised to change the iHD driver.
  • The Arch Linux wiki should unambiguously state that disabling the content process sandbox means disabling the protection against attackers. Don't ever disable the content process sandbox.
  • VAAPI won't be moved to the RDD process (it will keep its tight sandbox), but to the GPU process. Then the current syscall might not even matter.
  • A Wayland GPU process (based on dmabuf) will be implemented in the FF94 timeframe. So far there is only an X11 GPU process that has not been shipped yet due to a minor bug.
  • As a workaround, most users can temporarily switch from the iHD VAAPI driver to the i965 VAAPI driver.
  • Intel Iris Xe seems to be supported only by the iHD driver, which means its users have to wait until the Wayland GPU process has been implemented and VAAPI has been moved to the GPU process.

(Francois Guerraz from comment #0)

syscall 64

(Jed Davis [:jld] ⟨⏰|UTC-6⟩ ⟦he/him⟧ from comment #12)

The driver does in fact appear to be using SysV IPC; here, for example. This is… not great; SysV IPC's security model is based only on uid/gid and as far as I know there's no way to use a broker to restrict access as we do with regular file accesses (by passing fds with SCM_RIGHTS).

We might be able to get away without completely allowing SysV IPC if the driver only ever uses fixed hard-coded key_t values, because those are just integers and can be filtered by seccomp-bpf. (For example, 1140872792 seen in comment #7 is the value defined in the driver source as #define DUAL_VDBOX_KEY ('D'<<24|'V'<<8|'X'<<0).) This can even work on 32-bit x86 before kernel 4.3, because ipc(2)'s arguments are shifted by one, not passed in memory as for socketcall(2). However, it means we can't use CLONE_NEWIPC. And of course it will break if the driver ever changes that implementation detail.

Something else I'd like to find out is what options we have for moving the use of this driver into another process, like the RDD or GPU process, so that the sandbox exception isn't exposed to the process that's executing untrusted JS.

(Jed Davis [:jld] ⟨⏰|UTC-6⟩ ⟦he/him⟧ from comment #19)

For reference: Chromium has a compile-time option for VA-API, off by default, enabled in some downstream builds. From the layout of their codebase it looks like they run it in their GPU process, which seems to allow unrestricted use of SysV IPC; Chromium's renderer processes, in contrast, do not allow it. (Interestingly, they seem to use only seccomp-bpf filtering of the syscalls and don't unshare the IPC namespace.)

Allowing SysV IPC in the RDD process isn't ideal, but if the alternative is running ffmpeg in the GPU process: the GPU process currently doesn't have a sandbox on Linux, and its eventual sandbox would be weaker than for RDD, and the data immediately accessible to the RDD process would only be related to media decoding, not all of the browser's compositing. So, the RDD process seems to be the better place for this.

Intel developer:
(Xinfeng Zhang from comment #46)

The IPC is used for Gen9 VDBox balancing: the driver needs to record the task for each VDBox across processes. We now plan to change the logic to random dispatch and will remove these IPC calls.

(Martin Stránský [:stransky] (ni? me) from bug 1610199 comment #75)

This bug is blocked by Bug 1683808 (and similar). The correct way forward here is to implement the GPU process on Wayland (by dmabuf EGL framebuffer) and then implement VAAPI in the GPU process, as we actually can't use VAAPI in the RDD process.

(Esokrarkose from bug 1683808 comment #23)

Since I have the new Intel Iris Xe, I can't make use of the older i965 driver, thus I'm forced to the new iHD driver.

For the record, I've long been able to successfully use VAAPI with the iHD/iris driver (on Skylake / Thinkpad T460p) without disabling the sandbox, just with media.ffmpeg.vaapi.enabled and Wayland backend. That is on the Fedora 34 stock build as well as local and try builds. However, that is only for h264/avc1, not VP9 (using the enhanced-h264ify extension for youtube) - the hardware doesn't support it.

Have we already confirmed that at least on iHD the sandbox permission violation only happens with VP9? Or do other people run into it with h264/avc1 as well? (Comment 0 refers to h264 but not the iHD driver).

This wouldn't change the fact that we should move VAAPI out of the content process. But if it is the case, maybe the driver can be enhanced/fixed as well.

(In reply to Robert Mader [:rmader] from comment #63)

Have we already confirmed that at least on iHD the sandbox permission violation only happens with VP9? Or do other people run into it with h264/avc1 as well?

The codec doesn't seem to matter. Macbook Pro, Intel Iris Graphics 6100 (BDW GT3):
iHD + h264: bp-6b81fddc-8c9b-46cd-b555-b191d0210907
iHD + vp8: bp-fb03b6b1-f20d-44cd-bb66-4558a0210907

(Comment 0 refers to h264 but not the iHD driver).

It mentions "libva-intel-driver-hybrid", but there is "iHD_drv_video.so" in the stack trace.

But if it is the case, maybe the driver can be enhanced/fixed as well.

comment 46

So it seems to have changed a bit. At the moment, it crashes for me with SIGSYS 0x0000000000000040 (=sendfile).

To summarize bug 1698778: the plan as I understand it is to move the VA-API usage into the RDD process and adjust its sandbox as needed. I have a patch (attachment 9245843 [details]; it's currently kind of hacky) that works for radeon… and sometimes iHD, despite not handling SysV IPC yet, because whether intel-media-driver uses it seems to depend on the presence of a hardware feature; in particular, the Intel GPU I have on hand (PCI 8086:5917) apparently doesn't have it, so it works.

(Also, 0x40 is semget on amd64; 40 decimal is sendfile.)
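To make the hex/decimal mix-up concrete, a small sketch; the two table entries are the x86_64 syscall numbers named in the comment above, not a full syscall table.

```python
# The two x86_64 syscall numbers discussed above (not a complete table).
X86_64_SYSCALLS = {40: "sendfile", 64: "semget"}

# Value shown next to SIGSYS in the crash report.
reported = 0x0000000000000040

print(reported)                   # 64
print(X86_64_SYSCALLS[reported])  # semget
print(X86_64_SYSCALLS[40])        # sendfile
```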

I made a Try run with patches that should handle the Intel SysV IPC usage (direct link to the x86_64 build), and which flips the necessary prefs to use VA-API in the RDD process by default for ease of testing, but I don't have hardware to test it. If I understand correctly, this needs an 11th generation i5 or i7 (or i9?), using the integrated GPU. It would help if someone with the hardware in question could test it: unpack the build, play a suitable video (this VP8/VP9 demo page seems to work for intel), and use lsof or similar on the RDD process (the one with rdd as the last argument) to check for /dmabuf: file descriptors.

Try build works here, checked that /dmabuf: FDs are in /proc/${RDD_PID}/fd & semget syscall [that was crashing for me in https://bugzilla.mozilla.org/show_bug.cgi?id=1729355] is used via strace.
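For anyone who prefers a script over lsof, the same check can be sketched by reading the fd symlinks under /proc. This is only an illustration: the function name and the proc_root parameter are made up here, and the RDD process ID has to be looked up separately (the process with rdd as its last argument).

```python
import os

def dmabuf_fds(pid, proc_root="/proc"):
    """Return sorted (fd, target) pairs whose link target contains '/dmabuf:'."""
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    found = []
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # the fd was closed while we were iterating
        if "/dmabuf:" in target:
            found.append((int(name), target))
    return sorted(found)
```

Calling dmabuf_fds(rdd_pid) while a video is playing should return a non-empty list if the decoder is exporting dmabuf surfaces from the RDD process. Reading another process's fd directory requires running as the same user or as root.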

@jld: works for me too with no crash on Gen 11 (XPS 9300). I can see the Video engine being used in intel_gpu_top.

However, I would recommend against enabling it by default. Despite the hype, software rendering these days runs very smoothly and is as power efficient (GPUs need power too!): in my case, playing a vp9 YouTube video at 4k used 10W with or without GPU acceleration. On the other hand you're going to get tons of drivers compatibility issues (there is a reason why it's not default in Chrome) and apart from the embedded world, I don't think it's worth it.

@jld: Great work, works for me on 11th gen (Dell XPS 9310). When can we expect these fixes to land?

@Francois Guerraz First of all the 9300 is 10th gen, or do you have the 9310? Secondly I have to disagree with your claims regarding power consumption. Using the nightly build of @jld I draw between 6 and 8 W while on the stable firefox without vaapi I draw between 16 and 18 W, please see the screenshots here for proof:

Firefox nightly build of @jld: https://imgur.com/nW91IY5
Firefox without working vaapi: https://imgur.com/FPJkxXN

Both screenshots were taken after letting the video play for some time without other interaction.

@Esokrarkose: blame Intel for the confusion: Gen 10 CPUs (Ice Lake) have "Gen11" GPUs...

If you indeed see a big drop in usage on your platform then that's good :) On my computer it's a bit more of a mixed bag

https://imgur.com/a/xrHLgv5

but I do see improvements on some videos indeed like 4k@60fps.

(In reply to Francois Guerraz from comment #69)

On the other hand you're going to get tons of drivers compatibility issues (there is a reason why it's not default in Chrome)

Doesn't Chrome use pixmaps for VAAPI and isn't that the reason why VAAPI is X11/Xwayland-only in Chrome?
Texture-from-pixmap has been deprecated long ago in Firefox because of problems.

EGL+Dmabuf is already shipping on Fedora/Ubuntu Wayland.
Firefox 94 enabled EGL+Dmabuf for X11/XWayland on Mesa >=21. It could be lowered.

We know that the deprecated Intel DDX driver (Driver "intel" in custom Xorg config) caused problems which is why it's not qualified anymore for hardware WebRender.
It's used if $ xrandr --listproviders contains "Intel" instead of the default and recommended "modesetting".
Some users manually choose this driver to use a legacy in-driver compositor to prevent tearing. It's recommended to use a regular X11 compositor instead. If the performance of uncomposited X11 is desired with a compositor, Wayland is recommended.

bug 1669189: VAAPI crashes with libva-vdpau.
Dmabuf VAAPI likely won't work on Nvidia because the deprecated (and removed from Debian) VAAPI-via-VDPAU library does not seem to support vaExportSurfaceHandle.
Firefox should enable VAAPI by default only for radeonsi, iHD and i965:

$ vainfo
libva info: VA-API version 1.13.0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
libva info: Found init function __vaDriverInit_1_13
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.13 (libva 2.12.0)
vainfo: Driver version: Intel iHD driver for Intel(R) Gen Graphics - 21.4.0 ()
[...]

$ ls /usr/lib/x86_64-linux-gnu/dri/*drv*
/usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
/usr/lib/x86_64-linux-gnu/dri/nouveau_drv_video.so
/usr/lib/x86_64-linux-gnu/dri/r600_drv_video.so
/usr/lib/x86_64-linux-gnu/dri/radeonsi_drv_video.so
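
A rough sketch of how such an allowlist check could look when driven by vainfo output. This is not Firefox's actual logic: the function name is hypothetical, the allowlist is just the three drivers named above, and it assumes the first "Trying to open" line names the driver that ends up being loaded.

```python
import re

# Drivers named in the proposal above; not taken from Firefox source.
ALLOWED_DRIVERS = {"radeonsi", "iHD", "i965"}

def opened_driver(vainfo_output):
    """Extract <name> from a 'Trying to open .../<name>_drv_video.so' line."""
    match = re.search(r"Trying to open \S*/(\w+)_drv_video\.so", vainfo_output)
    return match.group(1) if match else None

sample = """libva info: VA-API version 1.13.0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
libva info: Found init function __vaDriverInit_1_13
libva info: va_openDriver() returns 0
"""

driver = opened_driver(sample)
print(driver, driver in ALLOWED_DRIVERS)  # iHD True
```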

and apart from the embedded world, I don't think it's worth it.

Being able to play 4k videos fluently vs. not.
Blowing fans vs. not.
Try build works as expected: bug 1698778 comment 22

Merging this with bug 1698778 so that I have one bug number to put on the patches.

Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → DUPLICATE

@jld https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/TOjLuUSjRPCPZvOuifAJYA/runs/1/artifacts/public/build/target.tar.bz2 worked, but with Firefox 95 on Ubuntu 21.04 I can't make it work. I tried replicating all flags without any success. Could you help?

This PR https://github.com/intel/media-driver/pull/1293 removes the IPC calls in intel-media-driver.
The background of the IPC usage is:
On KBL, CML... platforms, there are 2 VDBoxes in most configurations. Each VDBox can handle both encode and decode,
so if both VDBoxes are used, performance can be twice that of using only 1 VDBox.
Because the KMD does not provide an interface to query "busy/idle" on these platforms, we use IPC calls to exchange information about the previous task, namely which VDBox it used, so that the second session is dispatched to the other one. We call this IPC-based task balancing.
So the first session (decoder or encoder) is dispatched to vdbox0, the second to vdbox1, the third to vdbox0 ...

The side effect of this patch:
With this patch there are no IPC calls, and we dispatch tasks to the VDBoxes randomly. This means that vdbox0 might always be used ..., and some perf issues could not be reproduced easily. ...
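
The difference between the two dispatch strategies described above can be illustrated with a toy model. This is a sketch only, not driver code; the function names are invented here.

```python
import itertools
import random

NUM_VDBOX = 2  # two VDBoxes, as on the platforms described above

def ipc_balanced_dispatch(num_sessions):
    """Alternate sessions across VDBoxes, which the shared IPC state made possible."""
    boxes = itertools.cycle(range(NUM_VDBOX))
    return [next(boxes) for _ in range(num_sessions)]

def random_dispatch(num_sessions, rng):
    """Pick a VDBox per session with no state shared between processes."""
    return [rng.randrange(NUM_VDBOX) for _ in range(num_sessions)]

print(ipc_balanced_dispatch(4))             # [0, 1, 0, 1]
print(random_dispatch(4, random.Random()))  # varies; may land on one VDBox only
```

With only a handful of sessions, the random variant can easily put all of them on the same VDBox, which is the side effect mentioned above.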

To be more precise: with Firefox 95 I do not see the crash, but I simply get high CPU usage, which indicates that VA-API is not used at all, and I am wondering how I could make it work, i.e. which flag combinations are needed. It works successfully with the build of comment 67.

I get the following with Firefox 95 when rdd is enabled:

[Child 86733: MediaPDecoder #3]: D/PlatformDecoderModule Sandbox RDD decoder supports requested type
[RDD 86935: Main Thread]: D/PlatformDecoderModule PDMInitializer, Init PDMs in RDD process
[RDD 86935: Main Thread]: D/PlatformDecoderModule VA-API FFmpeg is disabled by platform
[RDD 86935: Main Thread]: D/PlatformDecoderModule VA-API FFmpeg is disabled by platform
[RDD 86935: Main Thread]: D/PlatformDecoderModule Agnostic decoder rejects requested type
[RDD 86935: Main Thread]: D/PlatformDecoderModule Agnostic decoder supports requested type
[RDD 86935: Main Thread]: D/PlatformDecoderModule Agnostic decoder supports requested type
[RDD 86935: Main Thread]: D/PlatformDecoderModule Agnostic decoder supports requested type
[RDD 86935: Main Thread]: D/PlatformDecoderModule Agnostic decoder supports requested type
[RDD 86935: MediaSupervisor #1]: D/PlatformDecoderModule Agnostic decoder supports requested type
[RDD 86935: MediaSupervisor #1]: D/PlatformDecoderModule Agnostic decoder supports requested type
[RDD 86935: MediaSupervisor #1]: D/PlatformDecoderModule VA-API is disabled by pref.
[RDD 86935: MediaPDecoder #1]: D/PlatformDecoderModule Initialising FFmpeg decoder.
[Child 86733: MediaPDecoder #3]: D/PlatformDecoderModule AudioTrimmer[7f87cfc529c0] ::PrepareTrimmers: sample[0,21000] no trimming information
[RDD 86935: MediaPDecoder #2]: D/PlatformDecoderModule OpusDataDecoder[7f6e36f900c0] ::Decode: Opus decoder skipping 312 of 960 frames
[RDD 86935: MediaPDecoder #1]: D/PlatformDecoderModule FFmpeg init successful.
[Child 86733: MediaPDecoder #1]: D/PlatformDecoderModule AudioTrimmer[7f87cfc529c0] ::HandleDecodedResult: sample[0,21000] (decoded[0,13500] no trimming needed
[RDD 86935: MediaPDecoder #2]: D/PlatformDecoderModule Choosing FFmpeg pixel format for video decoding.
[RDD 86935: MediaPDecoder #2]: D/PlatformDecoderModule Requesting pixel format YUV420P.
[vp9 @ 0x7f6e2be5d100] Format yuv420p chosen by get_format().

When I disable rdd's ffmpeg, I get the seccomp issues again. This is marked as closed, how do I get this to work?
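
For reading logs like the one above, a small sketch that scans for the PlatformDecoderModule messages seen in this bug. The marker strings are copied from the logs in this thread; the function name and the wording of the findings are made up, and other Firefox versions may log different messages.

```python
# Marker strings copied from the logs in this bug; descriptions are my own.
MARKERS = {
    "Requesting pixel format VAAPI_VLD": "hardware decoding requested",
    "Requesting pixel format YUV420P": "software decoding requested",
    "VA-API FFmpeg is disabled by platform": "VA-API disabled by platform",
    "VA-API is disabled by pref": "VA-API disabled by pref",
}

def summarize(log_text):
    """Return the set of findings whose marker string occurs in the log."""
    return {finding for marker, finding in MARKERS.items() if marker in log_text}

sample = """[RDD 86935: Main Thread]: D/PlatformDecoderModule VA-API FFmpeg is disabled by platform
[RDD 86935: MediaSupervisor #1]: D/PlatformDecoderModule VA-API is disabled by pref.
[RDD 86935: MediaPDecoder #2]: D/PlatformDecoderModule Requesting pixel format YUV420P.
"""

for finding in sorted(summarize(sample)):
    print(finding)
```

For the Firefox 95 log above this reports software decoding together with both "disabled" messages, which matches the high CPU usage described.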

Flags: needinfo?(jld)

Use firefox-96. This one is closed as duplicate of https://bugzilla.mozilla.org/show_bug.cgi?id=1698778 & you can see Milestone: 96 Branch there.

(Esokrarkose from bug 1683808 comment #23)

Since I have the new Intel Iris Xe, I can't make use of the older i965 driver, thus I'm forced to the new iHD driver.

(In reply to Esokrarkose from comment #77)

When I disable rdd's ffmpeg, I get the seccomp issues again. This is marked as closed, how do I get this to work?

Download https://nightly.mozilla.org, set media.ffmpeg.vaapi.enabled to true and restart Nightly.

Flags: needinfo?(jld)

Intel developer:
(Xinfeng Zhang from comment #75)

This PR https://github.com/intel/media-driver/pull/1293 removes the IPC calls in intel-media-driver.
The background of the IPC usage is:
On KBL, CML... platforms, there are 2 VDBoxes in most configurations. Each VDBox can handle both encode and decode,
so if both VDBoxes are used, performance can be twice that of using only 1 VDBox.
Because the KMD does not provide an interface to query "busy/idle" on these platforms, we use IPC calls to exchange information about the previous task, namely which VDBox it used, so that the second session is dispatched to the other one. We call this IPC-based task balancing.
So the first session (decoder or encoder) is dispatched to vdbox0, the second to vdbox1, the third to vdbox0 ...

The side effect of this patch:
With this patch there are no IPC calls, and we dispatch tasks to the VDBoxes randomly. This means that vdbox0 might always be used ..., and some perf issues could not be reproduced easily. ...

bug 1698778 comment 33 has allowed SysV IPC in Firefox's media process sandbox.
The Intel iHD VAAPI driver no longer crashes Firefox as of 96 when the media process is used (=default).
(If decoding in the media process has been manually disabled, iHD VAAPI would still crash in the content process, but that can and should be fixed by setting manually changed configuration options back to their default.)
bug 1743926 tracks some general edge cases.

Jed, is removal of IPC in the Intel VAAPI driver still required given the regressions it would cause?

Flags: needinfo?(jld)

@Darkspirit Thanks I can confirm it works as expected with Firefox 96 from Flathub beta.

Thanks. Actually, we found that i915 already has a ping-pong mechanism:

commit a8ebba75b358f9c912cbcba0c14a2072e7280b2f
Author: Zhao Yakui <yakui.zhao@intel.com>
Date: Thu Apr 17 10:37:40 2014 +0800

drm/i915: Use the coarse ping-pong mechanism based on drm fd to dispatch the BSD command on BDW GT3

The BDW GT3 has two independent BSD rings, which can be used to process the
video commands. To be simpler, it is transparent to user-space driver/middle.
Instead the kernel driver will decide which ring is to dispatch the BSD video
command.

As every BSD ring is powerful, it is enough to dispatch the BSD video command
based on the drm fd. In such case it can play back video stream while encoding
another video stream. The coarse ping-pong mechanism is used to determine
which BSD ring is used to dispatch the BSD video command.

We will refer to this one for Gen9 and remove the IPC, even though it no longer blocks Firefox.
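
As I read the commit message above, the idea is that each DRM fd gets pinned to one BSD ring, with new fds alternating between the two rings. A simplified model of that idea follows; it is not the kernel code, and the class and attribute names are invented for illustration.

```python
class PingPongDispatcher:
    """Simplified model: each DRM fd is pinned to one BSD ring on first use."""

    def __init__(self, num_rings=2):
        self.num_rings = num_rings
        self.next_ring = 0     # ring that the next previously unseen fd will get
        self.ring_for_fd = {}  # drm fd -> assigned ring

    def dispatch(self, drm_fd):
        """Return the ring for this fd, assigning one on first use."""
        if drm_fd not in self.ring_for_fd:
            self.ring_for_fd[drm_fd] = self.next_ring
            self.next_ring = (self.next_ring + 1) % self.num_rings
        return self.ring_for_fd[drm_fd]

d = PingPongDispatcher()
print([d.dispatch(fd) for fd in (10, 11, 10, 12)])  # [0, 1, 0, 0]
```

Since the state lives in the kernel rather than in each user-space process, no SysV IPC is needed to keep the two rings balanced.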

Flags: needinfo?(jld)

(In reply to Esokrarkose from comment #81)

@Darkspirit Thanks I can confirm it works as expected with Firefox 96 from Flathub beta.

Please don't forget to manually set media.rdd-ffmpeg.enabled to true (bug 1744037) when "early beta" period has ended (I don't know which Beta version will be the first non-early beta, it seems to change over time), so that H264 VAAPI stays in the media process, otherwise you might run into this bug again (now bug 1743647).
