Closed Bug 1460499 Opened 7 years ago Closed 5 years ago

Use DirectComposition for hardware decoded video on Windows

Categories

(Core :: Graphics: WebRender, enhancement, P3)

Tracking

()

RESOLVED FIXED
82 Branch
Tracking Status
firefox82 --- fixed

People

(Reporter: sotaro, Assigned: sotaro)

References

(Blocks 2 open bugs)

Details

(Whiteboard: wr-planning)

Attachments

(1 file, 13 obsolete files)

47 bytes, text/x-phabricator-request
We want a way to directly composite video frames if possible with DirectComposition.
We don't currently have any evidence we need to do this for the MVP
Priority: P2 → P3
Blocks: stage-wr-next
No longer blocks: stage-wr-trains
Blocks: 1423324
Depends on: 1546823
Whiteboard: [wr-q3][wr-july]
Whiteboard: [wr-q3][wr-july] → [wr-q3][wr-sept]

Chromium uses "Direct3D 11 Video Interfaces" for handling video overlays. To use them, the texture data needs to be in one of the formats listed below. That rules out software decoded video for video overlays for now, since software decoded video normally uses YUV420 (YV12).

  • DXGI_FORMAT_NV12
  • DXGI_FORMAT_P010
  • DXGI_FORMAT_P016

https://cs.chromium.org/chromium/src/ui/gl/swap_chain_presenter.cc?l=942

The patch renders video to a DCompositionVisual when the video is decoded to a DXGITextureHostD3D11 with NV12 format. I confirmed this by visiting http://www.html5videoplayer.net/html5video/mp4-h-264-video-test/. While a video was playing, it was also rendered to the dc layer.

  • We could add support for software decoded video, but the video would need to be in NV12 format, and we do not know yet how it would perform.
  • Video rendering with IDXGIDecodeSwapChain seems to provide better performance than D3D11VideoProcessor, but it seems that we would need to modify the decoding side code to use it.
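
To make the format gap in the first bullet concrete, here is a minimal sketch of repacking a YV12 frame into NV12. It is not Gecko code: the helper name `yv12_to_nv12` is hypothetical, and it assumes a tightly packed buffer with no row padding (real decoder output usually has per-plane strides).

```python
def yv12_to_nv12(frame: bytes, width: int, height: int) -> bytes:
    """Repack a tightly packed YV12 frame into NV12 (hypothetical helper).

    YV12 stores three planes: Y, then V, then U (chroma subsampled 2x2).
    NV12 stores the same Y plane followed by one plane of interleaved U,V pairs.
    """
    if width % 2 or height % 2:
        raise ValueError("4:2:0 frames need even width and height")
    y_size = width * height
    chroma_size = (width // 2) * (height // 2)
    if len(frame) != y_size + 2 * chroma_size:
        raise ValueError("buffer size does not match width and height")

    y_plane = frame[:y_size]
    v_plane = frame[y_size:y_size + chroma_size]
    u_plane = frame[y_size + chroma_size:]

    uv_plane = bytearray(2 * chroma_size)
    uv_plane[0::2] = u_plane  # even bytes: U samples
    uv_plane[1::2] = v_plane  # odd bytes: V samples
    return y_plane + bytes(uv_plane)


# 4x2 frame: 8 luma bytes, then 2 V bytes, then 2 U bytes.
yv12 = bytes([1, 2, 3, 4, 5, 6, 7, 8, 20, 21, 10, 11])
nv12 = yv12_to_nv12(yv12, 4, 2)
print(list(nv12))
```

The repack touches every chroma byte of every frame on the CPU, which is one reason the performance of a software decoded path is an open question.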

I roughly measured video playback performance with Attachment 9090712 [details] [diff]. The following YouTube video was used for H.264 video playback. The majority of recent YouTube videos do not use H.264.
https://www.youtube.com/watch?v=L5WFSSFuufI

With the patch, both GPU and CPU usage were lower than with plain WebRender. GPU usage was also lower than with "Direct3D 11 (Advanced Layers)".

Compositing: WebRender

  • cpu: 20%
  • gpu: 60%

Compositing: WebRender with video overlay (Attachment 9090712 [details] [diff])

  • cpu: 5%-10%
  • gpu: 7%

Compositing: Direct3D 11 (Advanced Layers)

  • cpu: 5%
  • gpu: 16%-17%
Assignee: nobody → sotaro.ikeda.g
Depends on: 1581307
Depends on: 1582011
Depends on: 1582371
Whiteboard: [wr-q3][wr-sept] → [wr-q41]
Depends on: 1583432
Depends on: 1583482
Summary: Use DirectComposition for video on Window → Use DirectComposition for hardware decoded video on Window
Attachment #9097295 - Attachment is obsolete: true

Attachment 9097505 [details] [diff] sets the video dc layer under the "default swap chain dc layer", and the default swap chain renders a transparent hole for the video dc layer. This is because a video frame could have a UI element over the video dc layer.

After a discussion, we are not choosing the above option for the video dc layer for now. Instead, the video frame and the UI elements above the video are split into different dc layers. This does not need a transparent hole. There is a risk that it increases the number of dc layers.

Depends on: 1579235

For my own benefit, I'm going to describe what I think this patch does:
It creates a DC visual for each video. These visuals are ordered below the visual for the main swap chain. In WebRender's rendering (which goes into the main swap chain), video images are replaced with ClearRect items. Video visuals get their content as follows: Their content is a SEQUENTIAL_FLIP swap chain that has been created for the visual. It uses an RGB color space. Video is converted from NV12 and copied into the RGB swap chain using ID3D11VideoContext::VideoProcessorBlt, with the input being the ID3D11Texture2D from the video frame's RenderDXGITextureHostOGL, and the output being the ID3D11Texture2D back buffer of the swap chain. The patch hints that IDXGIDecodeSwapChain might be a better solution but doesn't use it.
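
The layering described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not the patch itself: surfaces are small grids of labels, `None` stands for a transparent pixel, and the names `paint` and `composite` are hypothetical. The ClearRect item is painted at the video's position in the display list, so UI items that come later still draw over the hole.

```python
def paint(width, height, items):
    """Paint display items in order onto a transparent surface.

    Each item is (x, y, w, h, value). A value of None models a ClearRect.
    """
    surface = [[None] * width for _ in range(height)]
    for x, y, w, h, value in items:
        for row in range(y, y + h):
            for col in range(x, x + w):
                surface[row][col] = value
    return surface


def composite(layers):
    """Combine layers listed bottom to top; the topmost opaque pixel wins."""
    height, width = len(layers[0]), len(layers[0][0])
    result = [[None] * width for _ in range(height)]
    for layer in layers:
        for row in range(height):
            for col in range(width):
                if layer[row][col] is not None:
                    result[row][col] = layer[row][col]
    return result


# Video visual: sits below the main swap chain visual.
video_visual = paint(4, 2, [(1, 0, 2, 2, "video")])

# Main swap chain: page background, ClearRect where the video is, then UI on top.
main_visual = paint(4, 2, [
    (0, 0, 4, 2, "page"),
    (1, 0, 2, 2, None),   # ClearRect replacing the video image
    (2, 1, 1, 1, "ui"),   # UI element drawn over the video
])

for row in composite([video_visual, main_visual]):
    print(row)
```

In this model the video shows through the cleared pixels while the UI element still covers part of it, which is the behavior the ordering of the visuals is meant to give.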

No longer depends on: 1546823
Depends on: 1600539
Depends on: 1601531

Bug 1595994 might affect this bug.

See Also: → 1595994
Blocks: 1601297
See Also: → 1623530
Summary: Use DirectComposition for hardware decoded video on Window → Use DirectComposition for hardware decoded video on Windows
Depends on: 1623638
Attachment #9097505 - Attachment is obsolete: true
Attached file Bug 1460499 - wip patch (obsolete) —
Whiteboard: [wr-q41] → wr-planning
Attachment #9135039 - Attachment is obsolete: true
Blocks: 1663585
Pushed by sikeda.birchill@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/9c88255c845c Use DirectComposition for hardware decoded video on Windows r=nical
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → 82 Branch
Regressions: 1663894

Sotaro, is it correct that this never got enabled? If so why?

Flags: needinfo?(sotaro.ikeda.g)

FWIW, I tried out this patch on an Intel HD 530 and overall usage seemed lower. One thing that was interesting is that it causes us to spend time in the "Video Processing" block of Task Manager. Neither Chrome nor Firefox normally spends any time there.

(In reply to Jeff Muizelaar [:jrmuizel] from comment #24)

Sotaro, is it correct that this never got enabled? If so why?

It was because I saw frame drops while following the STR of Bug 1667500. I noticed this just before trying to enable it. The current implementation uses an RGB SwapChain, which did not seem to provide good performance. We might need to use a YUV SwapChain for performance.

When I created the current implementation, I did not have a Win PC that supported NV12 SwapChain. NV12 SwapChain support was not common. My Win PCs only supported YUY2 SwapChain, and Gecko did not have code to handle YUY2 video. So I stopped thinking about YUV SwapChain at that time.

I have gotten 2 new Win PCs since then. Both support NV12 SwapChain. NV12 SwapChain support seems to have become more common. It is better to look into NV12 SwapChain support for video now.
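
The trade-off described in this comment can be sketched as a small selection rule. This is a hypothetical illustration, not Gecko code: the function name, the string format labels, and the `can_handle_yuy2` flag are all assumptions, and in a real implementation the set of supported formats would come from querying the driver.

```python
def choose_video_swap_chain_format(overlay_formats, can_handle_yuy2=False):
    """Pick a swap chain format for a video visual (hypothetical helper).

    overlay_formats: formats the hardware reports as usable for swap chains,
    given here as plain strings such as "NV12" or "YUY2".
    """
    if "NV12" in overlay_formats:
        return "NV12"  # matches the hardware decoder output, no conversion
    if can_handle_yuy2 and "YUY2" in overlay_formats:
        return "YUY2"  # only useful if the compositor has a YUY2 path
    return "RGB"       # fallback: convert each frame into an RGB swap chain


print(choose_video_swap_chain_format({"YUY2"}))          # older PCs in this comment
print(choose_video_swap_chain_format({"NV12", "YUY2"}))  # newer PCs in this comment
```

With `can_handle_yuy2` left at `False`, a machine that only supports YUY2 falls back to RGB, which corresponds to the situation that led to the current RGB SwapChain implementation.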

Flags: needinfo?(sotaro.ikeda.g)

Created Bug 1722447 for YUV SwapChain support for hardware decoded video.

(In reply to Jeff Muizelaar [:jrmuizel] from comment #25)

FWIW, I tried out this patch on an Intel HD 530 and overall usage seemed lower. One thing that was interesting is that it causes us to spend time in the "Video Processing" block of Task Manager. Neither Chrome nor Firefox normally spends any time there.

The current implementation does not support YUV SwapChain. That might affect performance.

:jrmuizel, can you provide the chrome://gpu output from the Chrome browser?

Flags: needinfo?(jmuizelaar)

chrome://gpu does not show support for overlays on this machine, I believe because DirectComposition is blocked. Running Chrome with --disable-gpu-driver-bug-workarounds shows even better performance and does seem to use overlays.

Flags: needinfo?(jmuizelaar)

Or if not overlays at least a separate swapchain for video.

Regressions: 1765214