Closed Bug 1616185 Opened 6 years ago Closed 6 years ago

[Wayland] Implement h.264 VA-API decode by ffmpeg

Status: RESOLVED FIXED

Milestone: mozilla75

Tracking Flags: firefox75 (tracking: ---, status: fixed)

People (Reporter: stransky, Assigned: stransky)

References (Blocks 1 open bug)

Attachments (5 files)

Bug 1616185 [Wayland] Implement VA-API decode in FFmpegDataDecoder, r?jya (6 years ago, Martin Stránský [:stransky] (ni? me), 47 bytes, text/x-phabricator-request)
Bug 1616185 [Wayland] Implement VA-API decode in FFmpegVideoDecoder, r?jya (6 years ago, Martin Stránský [:stransky] (ni? me), 47 bytes, text/x-phabricator-request)
Bug 1616185 [Wayland] Build VA-API support for ffmpeg58 and Wayland only, r?jya (6 years ago, Martin Stránský [:stransky] (ni? me), 47 bytes, text/x-phabricator-request)
Bug 1616185 [Wayland] Load library symbols for VA-API r?jya (6 years ago, Martin Stránský [:stransky] (ni? me), 47 bytes, text/x-phabricator-request)
video playing flickering/crash with green frame (5 years ago, dontdieych, 1.42 MB, text/plain)

Martin Stránský [:stransky] (ni? me)

Assignee

Description

•

6 years ago

Implement VA-API decode by ffmpeg on Wayland.

Martin Stránský [:stransky] (ni? me)

Assignee

Comment 1

•

6 years ago

Attached file Bug 1616185 [Wayland] Implement VA-API decode in FFmpegDataDecoder, r?jya — Details

Phabricator Automation

Updated

•

6 years ago

Assignee: nobody → stransky

Status: NEW → ASSIGNED

Martin Stránský [:stransky] (ni? me)

Assignee

Comment 2

•

6 years ago

Attached file Bug 1616185 [Wayland] Implement VA-API decode in FFmpegVideoDecoder, r?jya — Details

Implement a VA-API decoder on top of FFmpegDataDecoder.
Implement the VAAPIFrameHolder class to hold a decoded h264 image that is used by the GL backend.
We need to keep a reference to the frame because ffmpeg tends to reuse it for other video frames.

Depends on D63132

Martin Stránský [:stransky] (ni? me)

Assignee

Comment 3

•

6 years ago

Attached file Bug 1616185 [Wayland] Build VA-API support for ffmpeg58 and Wayland only, r?jya — Details

Depends on D63133

Jean-Yves Avenard [:jya]

Comment 4

•

6 years ago

Do you intend to complete things so that there's no readbacks?

When we last played with this approach, performance was consistently worse than with software decoders

Flags: needinfo?(stransky)

Martin Stránský [:stransky] (ni? me)

Assignee

Comment 5

•

6 years ago

•

Edited

(In reply to Jean-Yves Avenard [:jya] from comment #4)

Do you intend to complete things so that there's no readbacks?

When we last played with this approach, performance was consistently worse than with software decoders

I'm not sure what you mean by "no readbacks". Do you mean that we should copy every frame, right after creation, from the vasurface DRM buffer to a new GL texture and then render from it? I can check that.

Right now I use direct rendering from the va surface and I see a big performance improvement over SW decoding. I checked mpv and it uses the same method (draw directly from vasurface), and it's the fastest rendering/playback I've seen so far.

There's also a difference depending on whether the video is played by WebRender or the GL compositor: the GL compositor seems to work better, and I see rendering artifacts in WebRender. Right now I'd go for direct rendering & GL compositor, which works for me.

I can implement the vasurface -> GL texture copy in WaylandDMABUFSurface, where it can be optionally enabled, so there may not be much difference in the patches I submitted (we just don't need to keep a reference to DRM buffers in this case).

Flags: needinfo?(stransky)

Robert Mader [:rmader]

Comment 6

•

6 years ago

Concerning Webrender, note bug 1579235, especially https://bugzilla.mozilla.org/show_bug.cgi?id=1579235#c8

Darkspirit

Updated

•

6 years ago

Updated

•

6 years ago

Depends on: 1616590

Martin Stránský [:stransky] (ni? me)

Assignee

Comment 7

•

6 years ago

(In reply to Jean-Yves Avenard [:jya] from comment #4)

Do you intend to complete things so that there's no readbacks?

When we last played with this approach, performance was consistently worse than with software decoders

I tested it today with the GL compositor. I tested playback of 2K / Full HD and 720p clips on a 4K display on Intel 630 / Fedora 31 / Wayland.

With VAAPI I have constant 4-5% CPU utilization (on a 6-core CPU with HT on), which means one core is active and running at about 50% no matter which clip is played.

With SW decode + GL rendering I have 9% CPU usage for the 720p clip and 12-15% CPU usage for the Full HD/2K clips, which means one core is at about 100% and another one at 20-30%.
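
As a rough cross-check of these figures, the overall percentages can be converted into "busy core" equivalents. This is only a sketch: it assumes the percentages are averaged over all 12 logical CPUs (6 cores with HT), which the comment does not state explicitly, and the function name is made up for illustration.

```python
def busy_cores(overall_percent: float, logical_cpus: int = 12) -> float:
    """Convert system-wide CPU utilization into fully-busy-core equivalents.

    Assumes overall_percent is averaged over all logical CPUs
    (6 cores x 2 hardware threads = 12 in the setup described above).
    """
    return overall_percent / 100.0 * logical_cpus


# Figures quoted in the comment: (label, low %, high %).
measurements = [
    ("VA-API, any clip", 4, 5),
    ("SW decode, 720p", 9, 9),
    ("SW decode, Full HD/2K", 12, 15),
]

for label, low, high in measurements:
    print(f"{label}: {busy_cores(low):.2f} to {busy_cores(high):.2f} cores busy")
```

Under that assumption, 4-5% corresponds to roughly half of one core, in line with the "about 50%" estimate in the comment.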

Martin Stránský [:stransky] (ni? me)

Assignee

Comment 8

•

6 years ago

For reference mpv --hwdec=vaapi gives me about 2% cpu utilization no matter which clip is played.

Martin Stránský [:stransky] (ni? me)

Assignee

Updated

•

6 years ago

No longer depends on: 1616590

Jean-Yves Avenard [:jya]

Comment 9

•

6 years ago

(In reply to Martin Stránský [:stransky] from comment #5)

(In reply to Jean-Yves Avenard [:jya] from comment #4)

Do you intend to complete things so that there's no readbacks?

When we last played with this approach, performance was consistently worse than with software decoders

I'm not sure what you mean by "no readbacks". Do you mean that we should copy every frame, right after creation, from the vasurface DRM buffer to a new GL texture and then render from it? I can check that.

No, I mean supporting VA-OpenGL surfaces in the compositor and painting them directly. No readback into a software buffer or DMA mapping.

This will require a much more extensive change, as you need to share the HW context with the compositor and have native support for VA-GL images.

In the meantime, I'd want this to be behind a pref that is disabled by default.

Martin Stránský [:stransky] (ni? me)

Assignee

Comment 10

•

6 years ago

(In reply to Jean-Yves Avenard [:jya] from comment #9)

No, I mean supporting VA-OpenGL surfaces in the compositor and painting them directly. No readback into a software buffer or DMA mapping.

I see. AFAIK a VA-OpenGL surface can be used on GLX only.

On Wayland we use dmabuf to share the HW context, and it's implemented by WaylandDMABUFSurfaces - Bug 1572697.

So yes, under Wayland we can render directly from a VASurface. This is also the reason why the VAAPIFrameHolder class is used in the patch: it holds a reference to the HW buffer for as long as it's used by the Gecko compositor. Without the reference, VASurfaces are reused by ffmpeg and video playback is garbled.

This will require a much more extensive change, as you need to share the HW context with the compositor and have native support for VA-GL images.

Yes, that was worked on in Bug 1572697. We already have that implemented for the WebRender/GL compositor, and WebGL can also use it (Bug 1586696).

In the meantime, I'd want this to be behind a pref that is disabled by default.

It's off by default; Bug 1616680 has the platform changes needed to enable it under a preference. I'd need to update the patch here for it.

Martin Stránský [:stransky] (ni? me)

Assignee

Comment 12

•

6 years ago

Attached file Bug 1616185 [Wayland] Load library symbols for VA-API r?jya — Details

Load and bind symbols from libva and libav needed for HW accelerated video decode.

Depends on D63134
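
For readers who want to see whether their system FFmpeg libraries export the hardware-decode entry points at all, here is a small sketch using Python's ctypes. It is not how Firefox binds these symbols (the patch does that in C++); the helper names are hypothetical, and the symbol list is only the subset of FFmpeg functions that appears in the "Couldn't load function" log lines later in this bug.

```python
import ctypes
import ctypes.util

# FFmpeg hardware-decode entry points, grouped by the library that exports them.
HW_DECODE_SYMBOLS = {
    "avcodec": ["avcodec_get_hw_config"],
    "avutil": [
        "av_hwdevice_ctx_create",
        "av_hwframe_transfer_get_formats",
        "av_hwdevice_ctx_create_derived",
    ],
}


def missing_symbols(lib, names):
    """Return the names that `lib` does not export.

    ctypes raises AttributeError for an unknown symbol, so hasattr() is enough.
    """
    return [name for name in names if not hasattr(lib, name)]


def check_library(short_name, names):
    """Return the missing symbols, or None if the library cannot be loaded."""
    path = ctypes.util.find_library(short_name)
    if path is None:
        return None
    try:
        lib = ctypes.CDLL(path)
    except OSError:
        return None
    return missing_symbols(lib, names)


if __name__ == "__main__":
    for short_name, names in HW_DECODE_SYMBOLS.items():
        result = check_library(short_name, names)
        if result is None:
            print(f"lib{short_name}: not found or not loadable")
        else:
            print(f"lib{short_name}: missing {result or 'nothing'}")
```

An older FFmpeg that predates these functions would report them as missing here, which is one possible explanation for the load failures in the log, though not the only one.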

C.M.Chang[:chunmin]

Updated

•

6 years ago

Priority: -- → P3

Jean-Yves Avenard [:jya]

Comment 13

•

6 years ago

Thank you for this contribution. Quite nice.

I'm a bit unfamiliar with the mapping of the GPU image to be used later by the Wayland code. Does that perform a readback, or is the mapping handled as it would be with a GL surface handle?
And how does this work in a multi-process environment?
Decoding is currently done in the content process (but not for much longer), while compositing is in the GPU process.
Once bug 1595994 lands, decoding will be done in the RDD (remote data decoder) process.

I'm unfamiliar with how this is done; I haven't done work with the latest vaapi version. The last time I implemented a vaapi decoder was almost 10 years ago (in mythtv), and things have changed since.

Ultimately, on Windows we had to implement a GPU process and run the decoding there, because HW decoders and drivers have proven to be very crashy. So to enable this we will have to wait on bug 1595994.

Martin Stránský [:stransky] (ni? me)

Assignee

Comment 14

•

6 years ago

•

Edited

(In reply to Jean-Yves Avenard [:jya] from comment #13)

I'm a bit unfamiliar with the mapping of the GPU image to be used later by the Wayland code. Does that perform a readback, or is the mapping handled as it would be with a GL surface handle?

That was the most difficult part of the work. FFmpeg-decoded VASurfaces are GEM objects which can be mapped as a dmabuf object (an fd in user space), so there isn't any copy here. We use the EXT_image_dma_buf_import extension to map the dmabuf fd as an EGLImage without a copy. The issue is that the VASurface/GEM to dmabuf mapping isn't exact: the dmabuf object lives until the fd is closed, but the underlying GEM object can change.

So the problem is that the VASurface/GEM behind the same dmabuf fd gets altered, because VASurfaces are reused by the va-api HW decoder. That's the reason for the frame holder class here: it keeps the VASurface/GEM mapped to the exact dmabuf object for as long as the dmabuf/EGLImage on top of it is used by the Gecko compositor.

So yes, we do direct rendering from va-api decoded frames in Gecko without any copy.
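
The reuse problem and the role of a frame holder can be illustrated with a toy model. This is a sketch only: `SurfacePool` and `FrameHolder` are hypothetical names, not the VAAPIFrameHolder class from the patch, and the "first free surface" policy is an assumption rather than ffmpeg's actual allocation strategy.

```python
class SurfacePool:
    """Toy stand-in for the decoder's pool of reusable surfaces."""

    def __init__(self, size):
        self.contents = [None] * size  # frame number currently stored in each surface
        self.refs = [0] * size         # number of holders pinning each surface

    def decode(self, frame_no):
        """Write a frame into the first surface nobody holds; return its index."""
        for idx, count in enumerate(self.refs):
            if count == 0:
                self.contents[idx] = frame_no
                return idx
        raise RuntimeError("no free surface: all are still held")


class FrameHolder:
    """Pins one surface until the compositor is done with it."""

    def __init__(self, pool, idx):
        self.pool, self.idx = pool, idx
        pool.refs[idx] += 1

    def release(self):
        self.pool.refs[self.idx] -= 1


# Without a holder, frame 2 lands in the surface the compositor still shows.
unpinned = SurfacePool(size=2)
shown = unpinned.decode(1)
unpinned.decode(2)
overwritten = unpinned.contents[shown] != 1

# With a holder, the decoder has to pick a different surface for frame 2.
pinned = SurfacePool(size=2)
shown = pinned.decode(1)
holder = FrameHolder(pinned, shown)
other = pinned.decode(2)
preserved = pinned.contents[shown] == 1 and other != shown
holder.release()  # the compositor is finished, so the surface may be reused

print("overwritten without holder:", overwritten)
print("preserved with holder:", preserved)
```

In the model, dropping the holder too early reproduces the failure mode described above: the surface being displayed is rewritten with a later frame.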

I'm not sure what you mean by 'GL surface handle'. We use an EGLImage, which is an abstraction over GPU memory and can be mapped as a texture/framebuffer, so it's quite versatile.

And how does this work in a multi-process environment?

The VASurfaces/GEM mapped as dmabuf can be shared as an fd or an EGLImage. I use the fd, and SurfaceDescriptorDMABuf is used for sharing.

Decoding is currently done in the content process (but not for much longer), while compositing is in the GPU process.
Once bug 1595994 lands, decoding will be done in the RDD (remote data decoder) process.

That's not a problem. But Wayland does not use the GPU process.

I'm unfamiliar with how this is done; I haven't done work with the latest vaapi version. The last time I implemented a vaapi decoder was almost 10 years ago (in mythtv), and things have changed since.

Ultimately, on Windows we had to implement a GPU process and run the decoding there, because HW decoders and drivers have proven to be very crashy. So to enable this we will have to wait on bug 1595994.

I guess there's no difference from the Wayland POV. It does not matter which process does the decoding, as the results are always shared by SurfaceDescriptorDMABuf.

Thanks.

Pulsebot

Comment 15

•

6 years ago

Pushed by nbeleuzu@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/e9ba11d2516b [Wayland] Implement VA-API decode in FFmpegDataDecoder, r=jya

Martin Stránský [:stransky] (ni? me)

Assignee

Comment 16

•

6 years ago

Try for all four patches: https://treeherder.mozilla.org/#/jobs?repo=try&revision=90296ee316295b3b548dd3cfa9c68684972da633

Martin Stránský [:stransky] (ni? me)

Assignee

Updated

•

6 years ago

Summary: [Wayland] Implement VA-API decode by ffmpeg → [Wayland] Implement h.264 VA-API decode by ffmpeg

Pulsebot

Comment 18

•

6 years ago

Pushed by nbeleuzu@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/da012bb39f2b [Wayland] Implement VA-API decode in FFmpegVideoDecoder, r=jya https://hg.mozilla.org/integration/autoland/rev/e0f4279ea250 [Wayland] Build VA-API support for ffmpeg58 and Wayland only, r=jya https://hg.mozilla.org/integration/autoland/rev/5ebaa08b1816 [Wayland] Load library symbols for VA-API r=jya

Bogdan Tara[:bogdan_tara | bogdant]

Comment 19

•

6 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/e9ba11d2516b
https://hg.mozilla.org/mozilla-central/rev/da012bb39f2b
https://hg.mozilla.org/mozilla-central/rev/e0f4279ea250
https://hg.mozilla.org/mozilla-central/rev/5ebaa08b1816

Status: ASSIGNED → RESOLVED

Closed: 6 years ago

status-firefox75: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla75

Calixte Denizet (:calixte)

Updated

•

6 years ago

Regressions: 1619544

Comment hidden (obsolete)

I tried VA-API decoding with Firefox 75, following the steps at https://wiki.archlinux.org/index.php/Firefox#Hardware_video_acceleration, but it doesn't seem to work. I still get the errors below:

linux19@linux19-Inspiron-15-5578:~/Downloads/firefox$ MOZ_LOG="PlatformDecoderModule:5" MOZ_ENABLE_WAYLAND=1 ./firefox
[Child 10951: Main Thread]: D/PlatformDecoderModule Couldn't load function avcodec_get_hw_config
[Child 10951: Main Thread]: D/PlatformDecoderModule Couldn't load function av_hwdevice_ctx_create
[Child 10951: Main Thread]: D/PlatformDecoderModule Couldn't load function av_hwframe_transfer_get_formats
[Child 10951: Main Thread]: D/PlatformDecoderModule Couldn't load function av_hwdevice_ctx_create_derived
[Child 10951: Unnamed thread 0x7fc10b288280]: D/PlatformDecoderModule Sandbox decoder rejects requested type
[Child 10951: MediaPDecoder #2]: D/PlatformDecoderModule Initialising VA-API FFmpeg decoder
[Child 10951: MediaPDecoder #2]: D/PlatformDecoderModule VA-API FFmpeg is disabled by platform
[Child 10951: MediaPDecoder #2]: D/PlatformDecoderModule Initialising FFmpeg decoder.
[h264 @ 0x7fc10b00a800] nal_unit_type: 7(SPS), nal_ref_idc: 3
[h264 @ 0x7fc10b00a800] nal_unit_type: 8(PPS), nal_ref_idc: 3
[Child 10951: MediaPDecoder #2]: D/PlatformDecoderModule FFmpeg init successful.
[h264 @ 0x7fc10b00a800] nal_unit_type: 7(SPS), nal_ref_idc: 3
[h264 @ 0x7fc10b00a800] nal_unit_type: 8(PPS), nal_ref_idc: 3
[h264 @ 0x7fc10b00a800] nal_unit_type: 5(IDR), nal_ref_idc: 3
[Child 10951: MediaPDecoder #2]: D/PlatformDecoderModule Choosing FFmpeg pixel format for video decoding.
[Child 10951: MediaPDecoder #2]: D/PlatformDecoderModule Requesting pixel format YUV420P.
[h264 @ 0x7fc10b00a800] Format yuv420p chosen by get_format().

VA-API and driver information from my system.

root@linux19-Inspiron-15-5578:/home/linux19# vainfo
libva info: VA-API version 1.5.0
libva info: va_getDriverName() returns 0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/i965_drv_video.so
libva info: Found init function __vaDriverInit_1_4
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.5 (libva 2.5.0)
vainfo: Driver version: Intel i965 driver for Intel(R) Kaby Lake - 2.3.0
vainfo: Supported profile and entrypoints
VAProfileMPEG2Simple : VAEntrypointVLD
VAProfileMPEG2Simple : VAEntrypointEncSlice
VAProfileMPEG2Main : VAEntrypointVLD
VAProfileMPEG2Main : VAEntrypointEncSlice
VAProfileH264ConstrainedBaseline: VAEntrypointVLD
VAProfileH264ConstrainedBaseline: VAEntrypointEncSlice
VAProfileH264ConstrainedBaseline: VAEntrypointEncSliceLP
VAProfileH264Main : VAEntrypointVLD
VAProfileH264Main : VAEntrypointEncSlice
VAProfileH264Main : VAEntrypointEncSliceLP
VAProfileH264High : VAEntrypointVLD
VAProfileH264High : VAEntrypointEncSlice
VAProfileH264High : VAEntrypointEncSliceLP
VAProfileH264MultiviewHigh : VAEntrypointVLD
VAProfileH264MultiviewHigh : VAEntrypointEncSlice
VAProfileH264StereoHigh : VAEntrypointVLD
VAProfileH264StereoHigh : VAEntrypointEncSlice
VAProfileVC1Simple : VAEntrypointVLD
VAProfileVC1Main : VAEntrypointVLD
VAProfileVC1Advanced : VAEntrypointVLD
VAProfileNone : VAEntrypointVideoProc
VAProfileJPEGBaseline : VAEntrypointVLD
VAProfileJPEGBaseline : VAEntrypointEncPicture
VAProfileVP8Version0_3 : VAEntrypointVLD
VAProfileVP8Version0_3 : VAEntrypointEncSlice
VAProfileHEVCMain : VAEntrypointVLD
VAProfileHEVCMain : VAEntrypointEncSlice
VAProfileHEVCMain10 : VAEntrypointVLD
VAProfileHEVCMain10 : VAEntrypointEncSlice
VAProfileVP9Profile0 : VAEntrypointVLD
VAProfileVP9Profile0 : VAEntrypointEncSlice
VAProfileVP9Profile2 : VAEntrypointVLD
root@linux19-Inspiron-15-5578:/home/linux19#

Please let me know if any more information is required. Kindly help us narrow down the issue.
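
When reading a MOZ_LOG="PlatformDecoderModule:5" log like the one in this comment, the key questions are whether the FFmpeg hardware-decode symbols loaded and whether the VA-API path was taken. The sketch below pulls those facts out of the log text. The message strings are copied from the log above and may be worded differently in other Firefox versions; the function name is made up for illustration.

```python
import re


def summarize_decoder_log(log_text):
    """Extract the VA-API relevant facts from PlatformDecoderModule log text."""
    missing = re.findall(r"Couldn't load function (\w+)", log_text)
    pixel_format = re.search(r"Requesting pixel format (\w+)", log_text)
    return {
        "missing_symbols": missing,
        "vaapi_disabled": "VA-API FFmpeg is disabled by platform" in log_text,
        "pixel_format": pixel_format.group(1) if pixel_format else None,
    }


sample = """\
[Child 10951: Main Thread]: D/PlatformDecoderModule Couldn't load function avcodec_get_hw_config
[Child 10951: MediaPDecoder #2]: D/PlatformDecoderModule VA-API FFmpeg is disabled by platform
[Child 10951: MediaPDecoder #2]: D/PlatformDecoderModule Requesting pixel format YUV420P.
"""

summary = summarize_decoder_log(sample)
print(summary)
```

For the log in this comment, such a summary would show four missing symbols, VA-API disabled by the platform, and a fallback to the YUV420P software pixel format.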

Martin Stránský [:stransky] (ni? me)

Assignee

Comment 22

•

6 years ago

Please try latest nightly and open a new bug if it's broken for you there. Firefox 75 does not have va-api support finished.

Comment hidden (obsolete)

dontdieych

Comment 25

•

5 years ago

System:    Host: a Kernel: 5.7.9-1-MANJARO x86_64 bits: 64 compiler: gcc v: 10.1.0 Desktop: KDE Plasma 5.19.3 
           Distro: Manjaro Linux 
Machine:   Type: Laptop System: Dell product: XPS 15 9560 v: N/A serial: <filter> 
           Mobo: Dell model: 0YH90J v: A04 serial: <filter> UEFI: Dell v: 1.19.2 date: 05/22/2020 
CPU:       Topology: Quad Core model: Intel Core i7-7700HQ bits: 64 type: MT MCP arch: Kaby Lake rev: 9 L2 cache: 6144 KiB 
           flags: avx avx2 lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx bogomips: 44817 
           Speed: 1000 MHz min/max: 800/2800 MHz Core speeds (MHz): 1: 1000 2: 1000 3: 1000 4: 1000 5: 1000 6: 1001 7: 1000 
           8: 1000 
Graphics:  Device-1: Intel HD Graphics 630 vendor: Dell driver: i915 v: kernel bus ID: 00:02.0 
           Device-2: NVIDIA GP107M [GeForce GTX 1050 Mobile] driver: N/A bus ID: 01:00.0 
           Display: x11 server: X.Org 1.20.8 driver: modesetting resolution: 3840x2160~60Hz 
           OpenGL: renderer: Mesa Intel HD Graphics 630 (KBL GT2) v: 4.6 Mesa 20.1.3 direct render: Yes 
Audio:     Device-1: Intel CM238 HD Audio vendor: Dell driver: snd_hda_intel v: kernel bus ID: 00:1f.3 
           Sound Server: ALSA v: k5.7.9-1-MANJARO

Nightly 80.0a1 (2020-07-24)

env MOZ_X11_EGL=1 MOZ_LOG="PlatformDecoderModule:5" firefox-nightly

Since today (yesterday?), video playback is quite broken. It plays, but from time to time it shows an all-green frame and flickers. Sometimes it jumps back to the first frame (00:00) and then plays again.

I'll attach logs.
Both a normal mp4 file and YouTube video are affected.

dontdieych

Comment 26

•

5 years ago

Attached file video playing flickering/crash with green frame — Details

Martin Stránský [:stransky] (ni? me)

Assignee

Comment 27

•

5 years ago

(In reply to dontdieych from comment #25)

Since today (yesterday?), video playback is quite broken. It plays, but from time to time it shows an all-green frame and flickers. Sometimes it jumps back to the first frame (00:00) and then plays again.

Please try latest nightly, should be fixed now. File a new bug if you still see it.
Thanks.
