Closed Bug 1644610 Opened 4 years ago Closed 4 years ago

[X11] Implement ffmpeg/VAAPI video playback, pixmap-based image sharing, basic compositor

Categories

(Core :: Audio/Video: Playback, enhancement, P5)

Unspecified
Linux
enhancement

Tracking

()

RESOLVED WONTFIX

People

(Reporter: ibragimovrinat, Assigned: ibragimovrinat)

Details

Attachments

(3 files)

+++ This bug was initially created as a clone of Bug #1619523 +++

Implementation of ffmpeg/VA-API decoding for X11 backend.
This bug focuses on X server pixmap approach for inter-process image sharing for presentation.

Unlike shared memory chunks that are mapped to the process memory,
X Pixmaps are stored on X server side. Their representation is an opaque
number, that can not be used by a TextureClient side to peek into
TextureHost side address space.

Depends on D79010

Add a presentation path using X Pixmaps. Decoder renders a decoded image
into an X Pixmap, which is then passed to the rendering side. Although
this approach can be used for basic, opengl and WebRender, the path is
currently enabled only for the basic compositor. Once required bits of
X Pixmap-based TextureHosts are implemented for opengl and WebRender
compositors, they can also utilize this path.

Depends on D79011

Assignee: nobody → ibragimovrinat

I think my ability to push try builds is suspended. I'll try to figure out how to enable it again, and try afterwards.
It would be nice if anybody triggers a try build in case I fail.

FFMPEG_LOG("VA-API/X11 needs Basic compositor with XRender enabled");

XRender was deprecated by bug 1180942. This bug would introduce maintenance burden and only profits the proprietary Nvidia driver, even though Nvidia plans to increase its contributions to Mesa/Nouveau. Please don't crush this seedling by taking away the pressure.
With required sandboxing improvements (bug 1129492 comment 24), past pixmap problems and the desire for code sharing in mind, please consider wontfixing this in favor of bug 1619523. bug 1629788 optimizes software decoding. We had this discussion between bug 1619523 comment 3 and bug 1619523 comment 11. DMABUF/X11 examples were provided in bug 1580166 comment 5. stransky then confirmed that he has started the refactor to make the existing DMABUF VAAPI/WebGL code path usable on X11 as well.

Flags: needinfo?(jyavenard)

I don't see the point of having VA decoding with the basic compositor.
WebRender is the focus now and the basic compositor will now only be used on machines with the HW blacklisted.

There's no gain in performance as the GPU readback eats all the gain made by the HW decoder.

So just yet another video path for no real gain.

Flags: needinfo?(jyavenard)

(Jean-Yves Avenard [:jya] from comment #6)

I don't see the point of having VA decoding with the basic compositor.
WebRender is the focus now and the basic compositor will now only be used on machines with the HW blacklisted.

There's no gain in performance as the GPU readback eats all the gain made by the HW decoder.

So just yet another video path for no real gain.

And bug 1619523 could be used with Basic compositor as well, if that's desired at all.

Please let's focus on bug 1614523 while bug 788319 and bug 1580166 are on their way. Thanks

Status: UNCONFIRMED → RESOLVED
Closed: 4 years ago
Resolution: --- → WONTFIX

To be fair I think it could work if the XRender is enabled and X11 pixmaps are used. But that's very buggy scenario as it highly depends on X driver implementation and the code path is disabled by default. We use XShm for Basic compositor on X11 now.

(In reply to Jean-Yves Avenard [:jya] from comment #6)

I don't see the point of having VA decoding with the basic compositor.
WebRender is the focus now and the basic compositor will now only be used on machines with the HW blacklisted.

There's no gain in performance as the GPU readback eats all the gain made by the HW decoder.

The X11 pixmaps are not readbacked if X11 driver implementation is correct - AFAIK it stays on GPU. But it really depends on driver implementation and as I said it's usually very buggy.

IIRC The deprecated XRender pref may even use the old Cairo backend instead of Skia. X11 pixmaps would prevent sandbox hardening (bug 1129492 comment 24), but DMABUF/DRI3 makes it possible to gain these performance improvements.

(In reply to Jan Andre Ikenmeyer [:darkspirit] from comment #5)

XRender was deprecated by bug 1180942.

Hardware video decoding is way-way more buggier.
Also, XRender is used by almost everybody who uses GTK applications on X11 for more that a decade now. Take an application and trace it if you don't believe me.
So it's not like it's a show-stopper.

This bug would introduce maintenance burden and only profits the proprietary Nvidia driver

Although I can't say for sure will it introduce maintenance burden or not, you can't too. It's not that simple.
The part about NVIDIA is completely not true. Although I must say that VDPAU support can be added the same way, so it could benefit machine with NVIDIA drivers. Currently there are no such patches, though.

Please don't crush this seedling by taking away the pressure.

Not related at all.

With required sandboxing improvements (bug 1129492 comment 24),

As was said before in Bug 1619523, the bug you are mentioning is about removing display connection from the content process. That means video decoding will be moved out of content processes. The patches do not prevent that, so this argument doesn't apply.

past pixmap problems

This is not for NVIDIA, as their drivers do not even support VA-API.

and the desire for code sharing in mind

Yep, me too. That's why these patches mostly reuse what's already in the code base. No additional IPC, no IPC schema changes. Added code is mostly wrappers to connect pieces. And implementation of underlying mechanisms for those wrappers is also already in the code.

please consider wontfixing this in favor of bug 1619523.

Why did you close this bug? I can't see any technical reasons. I'm explaining reasons behind my decisions. Please do the same.
You just doesn't like it and then decide to shut me up? Please stop doing that.

bug 1629788 optimizes software decoding.

It may save a bit of CPU time by converting YUV to RGB in GPU, but still looks like more like "why not" to me. From my experience, most of the CPU consumption in CPU-based decoding and compositing comes from scaling.

We had this discussion between bug 1619523 comment 3 and bug 1619523 comment 11.

Yes we had. Your part of conversation was not technical though. I commented almost everything and still wait for the technical reasons.

DMABUF/X11 examples were provided in bug 1580166 comment 5.

Have you read the code you were linking to? It makes X Pixmap from dmabuf. Or dmabuf from X Pixmap.
It's an interesting technical exercise to implement that, but it won't move solution from X Pixmaps.

stransky then confirmed that he has started the refactor to make the existing DMABUF VAAPI/WebGL code path usable on X11 as well.

No, his comment doesn't say that. It was clearly mentioning EGL. So, his solution will work for EGL + WebRender only. Mine will support Basic compositor and maybe OpenGL and WebRender, on GLX.

And bug 1619523 could be used with Basic compositor as well, if that's desired at all.

To make that possible you'll need to implement what's already in my patch. And also more code over that.
If you still think you are right, please explain how you are planning to composite dmabuf's in Basic compositor.

(In reply to Jean-Yves Avenard [:jya] from comment #6)

There's no gain in performance as the GPU readback eats all the gain made by the HW decoder.

Basic, no accel, jumps from 3.4 W to 4.0 W, with about 3.5 W most of the time
Basic, with accel, jumps from 2.5 W to 3.1 W, with about 2.7 W most of the time.

There is a clear power consumption benefit.

Also, not for the sake of debates, but for comparison:
WebRender, no accel, jumps from 4.4 W to 4.7 W, with about 4.5 W most of the time.

WebRender is the focus now and the basic compositor will now only be used on machines with the HW blacklisted.

I'm implementing this for Basic because there are issues in OpenGL- and WebRender- compositors support for X Pixmaps. I'm still working on those two, but mainly on WebRender part. The main part is OK, but there are issues with shutdowns that I want to solve before posting the patch for review here. I've already posted links in Bug 1619523 for anybody interested.

(In reply to Rinat from comment #11)

As was said before in Bug 1619523, the bug you are mentioning is about removing display connection from the content process. That means video decoding will be moved out of content processes. The patches do not prevent that, so this argument doesn't apply.

My understanding is that X11 connections could be prevented in the RDD process, too.

DMABUF/X11 examples were provided in bug 1580166 comment 5.

Have you read the code you were linking to?

I wanted to find and provide example code on avoiding permission errors with GBM on X11 because nobody else did.
My understanding is that GBM wants to be utilized in the content or RDD process and the dmabuf be used via EGL in the GPU process.
The DRI3 dmabuf-to/from-pixmap part seemed unrelated for the moment.
pcwalton's DRI2 example states:
"This is a demonstration of how to use cross-process texture sharing using DRI/DRM directly on Linux. The client process (drm-client), which is intended to have lower privileges, does not need an X connection; you can see the resources that it is using with strace."

It may save a bit of CPU time by converting YUV to RGB in GPU, but still looks like more like "why not" to me.

Just for parity with Windows and soon MacOS.

If you still think you are right, please explain how you are planning to composite dmabuf's in Basic compositor.

No, I don't claim that, I don't know. I do not understand why the fallback path (Basic compositor) should get such a feature when we aim for WebRender with DMABUF and EGL. My impression was that you insisted on reintroducing an old code path that doesn't reduce X11 connections (as far as possible) without helping on getting DMABUF done earlier.

My understanding is that X11 connections could be prevented in the RDD process, too.

As far as I understand that's why DXVA2-based decoders are most likely will go to the GPU process. VA-API decoders can be moved there too.

does not need an X connection

You need to interact with X at some point anyway. Also, the whole codebase is modular enough. That means once one finds a way to reduce number of X connections on the decoder side, they can easily update the code. And no, just your words are enough. You clearly do not understand amount of work required to get a working implementation.

My impression was that you insisted on reintroducing an old code path that doesn't reduce X11 connections (as far as possible) without helping on getting DMABUF done earlier.

I want a working feature, not a "shiny new thing" that will supposedly solve issues sometime in the future.
What you see as a salvation requires either EGL or a ton of tricky code integration. EGL is still delayed, even though there is a patch. And tricky GLX integration is... tricky.
DMABUF is fine for Wayland, because with current Wayland compositors you either have EGL or have nothing at all.

Currently code in my patches is for Basic compositor because on Basic compositor it has no issues that I'm aware of. That's why I post it. I'll post WebRender support too once it's ready.

It may save a bit of CPU time by converting YUV to RGB in GPU, but still looks like more like "why not" to me.

Just for parity with Windows and soon MacOS.

So it is "why not".

I do not understand why the fallback path (Basic compositor) should get such a feature when we aim for WebRender with DMABUF and EGL.

Really sounds like a jealousy to me. You hate that backend that you don't like will have the feature while the backend which you like will not? Step up, make the necessary coding work to bring it to the working state. The only thing you are currently doing is smothering my efforts using administrative resources available to you.

I've considered various ways. The way you are trying to push, while is may be technically advanced, requires a lot of efforts. It's like a castle in the sky. It's beautiful and shiny. But it is still in the sky. The way I went allows me to have a working feature in observable future. Now, to be specific. The main plumbing is already in the code base. Why not to use it? What's wrong with having a feature while a more advanced feature is on the way? How on earth that prevents DMABUF support from being implemented? You really think I'll jump into bugs and close them with WONTFIX just as you do?

I think Rinat makes a compelling argument for decreased power use with their proposed patch, along with the idea that they are proposing a solution that gets us somewhere for more users today than in the future. That is a pragmatic argument, and as long as Rinat is willing to support it (and Mozilla is willing to accept it), I don't see why we ought to refuse the contribution.

I'll reopen just because it is not clear to me that those arguments have been rejected by maintainers.

Status: RESOLVED → REOPENED
Ever confirmed: true
Resolution: WONTFIX → ---

(In reply to Rinat from comment #14)

I've considered various ways. The way you are trying to push, while is may be technically advanced, requires a lot of efforts. It's like a castle in the sky. It's beautiful and shiny. But it is still in the sky. The way I went allows me to have a working feature in observable future. Now, to be specific. The main plumbing is already in the code base. Why not to use it? What's wrong with having a feature while a more advanced feature is on the way? How on earth that prevents DMABUF support from being implemented? You really think I'll jump into bugs and close them with WONTFIX just as you do?

Please calm down. The bug was wontfixed by Jean-Yves who is Mozilla media developer, not by Jan. Also I understand the patch can work for some subset of users/hardware although the idea to use Basic compositor this way is a bit radical.

I think the best approach here is to merge some common parts (loading of X11 libraries, config etc) which can be shared with EGL dmabuf implementation and maybe polish the rest or ship them as downstream patch. Many features like Wayland/Gtk3 was tested at distro level as feature for power users first and then merged upstream for wider audience.

To add also; on all other platforms when the basic compositor is used; we explicitly disable all use of HW.

The basic compositor is our fallback when something is broken with the GPU somewhere.

We've had thousands of crashes over the year due to bad hardware or drivers. I can't think of a case where a linux drivers would be of a higher quality than its windows equivalent ; just due to the unfortunate priority given to linux over windows development.

So doing differently on Linux over all other platforms, and force using the HW decoder with the basic compositor goes against that.

We would be introducing instability in a code path designed to be slower for sure, but stable.

(In reply to Jean-Yves Avenard [:jya] from comment #17)

To add also; on all other platforms when the basic compositor is used; we explicitly disable all use of HW.
So doing differently on Linux over all other platforms, and force using the HW decoder with the basic compositor goes against that.
We would be introducing instability in a code path designed to be slower for sure, but stable.

In the current patch version a user has to explicitly enable a setting about VA-API decoding and XRender support, both disabled in default configuration. Otherwise code won't even try to initialize hardware decoder. So I believe this will not introduce additional instability.

Looks like I failed to clearly explain what I was saying. There are two classes implementing compositors that are both called "basic": BasicCompositor and X11BasicCompositor. What you are talking about is most probably BasicCompositor. It's software-only. What I was talking about is X11BasicCompositor, which is used only when gfx.xrender.enabled is true. It uses GPU for compositing via XRender API.

And no, this whole approach not limited to X11 Basic compositor, it can be extended to OpenGL and WebRender. OpenGL compositor can process X11 Pixmaps if gfx.use-glx-texture-from-pixmap setting is enabled, but for some reason fails to drop unused TextureHosts if a tab with video goes to background. That needs to be fixed, but I haven't got into that yet. Also, I don't want to encourage to enable gfx.use-glx-texture-from-pixmap because it currently crashes WebRender (bug 1561976) until I address that too. The patch for the WebRender that I'm working on, still has issues on process shutdown, but the main functionality is there, and I see a power consumption decrease (from ~4.5 W to ~3.6W) with it enabled. I plan to fix that and then allow WebRender (and probably OpenGL) for the VA-API decoding too.

If I'm not misunderstanding something horribly, current summary is:

  • The fallback path (Basic compositor) won't use hardware, therefore this bug is wontfixed. (comment 6, comment 17)
  • Basic compositor will be replaced with Software WebRender (bug 1601053) at some point.
  • XRender (X11BasicCompositor; bug 1180942) and X11 pixmaps are deprecated (comment 8).
  • DMABUF / DRI allows to avoid further X11 connections. (bug 1129492 comment 24)
  • The previous OpenGL compositor is deprecated. (bug 1558202 comment 1)
  • WebRender/X11 (bug 1614523) is going to share RedHat-maintained Wayland improvements (bug 1586696, bug 1616185, bug 1629788) by bug 1619523 and bug 788319 to allow the removal of obsolete code paths at some point.
  • Another (then duplicate) video path is not desired. (comment 6)
  • Some common parts (loading of X11 libraries, config etc) can be merged and be used by the DMABUF implementation. (comment 16)
  • Until bug 1619523 is ready, it's possible to test DMABUF/VAAPI without logging out of X11: Install the Weston Wayland compositor: $ sudo apt install weston. If ons runs $ weston one can launch Firefox with Wayland enabled inside the Weston window and VAAPI works.
    (Tested with: MOZ_LOG="PlatformDecoderModule:5" LIBVA_DRIVER_NAME=i965 MOZ_ENABLE_WAYLAND=1 mozregression --launch 2020-06-10 --pref gfx.webrender.all:true widget.wayland-dmabuf-vaapi.enabled:true -a https://bug1619882.bmoattachments.org/attachment.cgi?id=9149605 -P stdout)
Status: REOPENED → RESOLVED
Closed: 4 years ago4 years ago
Resolution: --- → WONTFIX

As of today, almost no Linux users are using WebRender 1. I don't even know where Wayland sits in this picture. Nothing seems planned in the midterm to enable WebRender for Linux users 2. So, we won't be seeing hardware acceleration before a long time. I am not using Firefox to watch videos since a long time since it's either battery hungry (regular videos) or just unable to decode correctly (4k60p). The proposed patch seems quite minimal and can be tossed once there is a workable alternative.

(In reply to Vincent Bernat from comment #20)

As of today, almost no Linux users are using WebRender [1].
Nothing seems planned in the midterm to enable WebRender for Linux users [2].
So, we won't be seeing hardware acceleration before a long time.

Thanks for the question. WebRender is still disabled by default, but that will change. Linux is more driven by the community. Apart from some edge cases you should not notice any bugs with WebRender. I'm using it for 3 years now. Please open about:config, set gfx.webrender.all to true and restart Firefox. If you want to see improvements (like VAAPI on X11) sooner, you are welcome to join the Nightly community: https://nightly.mozilla.org

I don't even know where Wayland sits in this picture.

Firefox' Wayland backend is enabled by default on Fedora. You just need to turn on widget.wayland-dmabuf-vaapi.enabled there.
On other distributions, log in to Gnome Wayland, for example, and start Firefox with the MOZ_ENABLE_WAYLAND=1 environment variable.
Enabling the Wayland backend by default (upstream) depends on bug 1578640 and - currently - Flash removal in October.

I am not using Firefox to watch videos since a long time since it's either battery hungry (regular videos) or just unable to decode correctly (4k60p).

Yes, partly it's awful. bug 1619523 will solve this.
And bug 1629788 is about slightly optimized software decoding.

The proposed patch seems quite minimal and can be tossed once there is a workable alternative.

Sorry. This bug insists on using a deprecated code path for the non-accelerated Basic compositor. This was implemented after being asked for not doing so and for helping with DMABUF instead. We all were CC'd on this bug to create social pressure. Deprecated X11 pixmaps are not tied to WebRender: bug 1616185 is tied to WebRender and bug 1619523 will enable it on X11. Parts of this patch can be used for it. Thanks for all contributions to get bug 1619523 done soon. Please coordinate there.

I know this is a bug tracker, so I am not trying to start a debate. I'll just to answer two of your points in case you are interested by the provided input. But feel free to discard. I have already tried WebRender in the past, but did get back to not enabling it because of the awful performance with videos (3 months ago, it was twice the CPU usage compared to basic compositor, on the stable release of Firefox). I'll try to enable it again. As for Wayland, there are many showstoppers to use it. In my case, I would need to switch to another window manager and I wouldn't be able to use the videoconferencing tool we use at work (Zoom). VAAPI support in Firefox was a big motivation for me to look again recently.

(In reply to Jan Andre Ikenmeyer [:darkspirit] from comment #5)

FFMPEG_LOG("VA-API/X11 needs Basic compositor with XRender enabled");

XRender was deprecated by bug 1180942. This bug would introduce maintenance burden and only profits the proprietary Nvidia driver, even though Nvidia plans to increase its contributions to Mesa/Nouveau. Please don't crush this seedling by taking away the pressure.
With required sandboxing improvements (bug 1129492 comment 24), past pixmap problems and the desire for code sharing in mind, please consider wontfixing this in favor of bug 1619523. bug 1629788 optimizes software decoding. We had this discussion between bug 1619523 comment 3 and bug 1619523 comment 11. DMABUF/X11 examples were provided in bug 1580166 comment 5. stransky then confirmed that he has started the refactor to make the existing DMABUF VAAPI/WebGL code path usable on X11 as well.

:jrmuizel, can you comment about XRender part?

Flags: needinfo?(jmuizelaar)

What is the correct forum to discuss these things? Is there a mailing list?

To me it looks like, the power usage aspect is important, and that until now, it wasn’t considered. (I created bug 1645298 1 to integrate this into QA.)

Where can statistics be found, how often there are still graphics driver errors, so the non-accelerated compositor is used?

Additionally, probably everybody knows, but we have to be clear, that if Wayland is used or not, is dependent on the used window manager, and not so much of the distribution.

Several enterprise distributions and other installations are going to use X over Wayland, even if the window manager would support Wayland in newer upstream releases, for several more years. They should use current Firefox releases though.

Having these changes proposed here – configurable, and up to the distribution to enable by default – would also give a good way to compare against the other proposed solutions.

Flags: needinfo?(jmuizelaar)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: