Closed Bug 1059066 Opened 5 years ago Closed 5 years ago

Reduce copies in AppleVTDecoder

Categories

(Core :: Audio/Video, defect)

x86
macOS
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla35

People

(Reporter: rillian, Assigned: jya)

References

Details

Attachments

(2 files, 6 obsolete files)

AppleVTDecoder calls VideoToolbox routines to decode h.264 video. These return CVImageBufferRefs.

Currently we assert that this is a CVPixelBuffer subclass, and then AppleVTDecoder::OutputFrame() constructs a VideoData object with a copy of the YUV data and submits it to the playback queue.

Instead we'd like to construct a wrapper around the CVImageBufferRef and queue that to (a) avoid the copy and (b) avoid moving the data out of GPU memory in the case of hardware decoding. Note that we must retain and release the ref since the VideoToolbox container may also use it as a reference frame.
Jean-Yves do you know how CVImageBuffer works?

Can a CVPixelBuffer point to GPU memory? What happens when I lock it?

Do we have to do something special to get an OpenGL buffer from VTDecompressionSessionDecodeFrame() or other hardware decoder output without copying to main memory?
Assignee: nobody → giles
Flags: needinfo?(jyavenard)
Where is the copy done?

It doesn't appear to copy the data, but instead VideoData::YCbCrBuffer contains pointers to the original CVImageBufferRefs content.

Which in itself would be a problem as you can't call CVPixelBufferUnlockBaseAddress until that buffer has been displayed.

I wonder if this explains the crash I'm seeing regularly (1049269)
AppleVTDecoder::OutputFrame -> VideoData::Create(aImage=null) -> VideoData::SetVideoDataToImage(aCopyData=true)
What kinetik said. We call VideoData::Create() with a nullptr for the Image* argument, which asks it to copy the YCbCrBuffer data.
cpearce pointed out https://code.google.com/p/chromium/codesearch#chromium/src/content/common/gpu/media/vt_video_decode_accelerator.cc&sq=package:chromium as a helpful reference. They seem to use CVPixelBuffer as well.

We'll have to talk to someone in #gfx about how to pass the the handle through. There's a MacIOSurface; I don't know if that's what we need or not.
I was planning to focus on a) first. As doing b) properly is complex. 
I don't believe there's a way to detect if a buffer returned is GPU based, and it will depend on the GPU itself. AMD, intel and Nvidia all having a different behaviour.

I had such discussion in the intel vaapi mailing list a few weeks ago on that matter, and in the end, the benefit in keeping the buffer in GPU didn't overcome the additional overhead in doing so.

The other issue, is that on intel architecture properly handling uswc required use of SSE4 accelerated code.
Flags: needinfo?(jyavenard)
Add various IOSurface related methods to MacIOSurface. The rules regarding the ownership of the IOSurface is changed to ensure the IOSurface isn't destroyed during the lifetime of the MacIOSurface. In order to provide compatibility with the Video Toolbox APIs, the usage count is also increased by 1
Attachment #8482631 - Flags: review?(matt.woodrow)
Assignee: giles → jyavenard
Status: NEW → ASSIGNED
Use IOSurface based images. This reduces the amount of copy and allow the backend to keep the images in GPU as required. The compositor currently only handles MacIOSurfaceImage in RGBA32 format. The default format output by the VideoToolbox decoder is 2 planes YUV (NV12). On my machine at least (ATI 6970M), requesting RGBA32 output reduces a lot of the speed benefits. It seems that the conversion is done in CPU.
Attachment #8482635 - Flags: review?(giles)
See Also: → 1061525
Should bug 1061525 be implemented, the CPU usage could be almost halved during playback of an h264 stream
Depends on: 1061525
Add assert to make sure buffer is an IOSurface
Attachment #8482644 - Flags: review?(giles)
Attachment #8482635 - Attachment is obsolete: true
Attachment #8482635 - Flags: review?(giles)
Comment on attachment 8482644 [details] [diff] [review]
Avoid picture frame copy and keep picture in GPU by using IOSurface objects

Review of attachment 8482644 [details] [diff] [review]:
-----------------------------------------------------------------

That nicely simplifies the code! Thanks for taking this on. r=me with comments addressed.

::: content/media/fmp4/apple/AppleVTDecoder.cpp
@@ +18,5 @@
>  #include "prlog.h"
>  #include "MediaData.h"
>  #include "VideoUtils.h"
> +#include "mozilla/ArrayUtils.h"
> +#include "MacIOSurfaceImage.h"

I failed to keep these in alphabetical order. At least don't make it worse. :)

@@ +198,5 @@
>      return;
>    }
>    if (flags & kVTDecodeInfo_FrameDropped) {
>      NS_WARNING("  ...frame dropped...");
> +    return;

Please fix this in a separate commit.

@@ +263,5 @@
>    nsAutoPtr<VideoData> data;
> +  data = VideoData::CreateFromImage(
> +    info, mImageContainer, aFrameRef->byte_offset,
> +    aFrameRef->composition_timestamp, aFrameRef->duration, image.forget(),
> +    aFrameRef->is_sync_point, aFrameRef->decode_timestamp, visible);

Please keep this one argument per line to match local style.

@@ +407,5 @@
> +  const void* surfaceKeys[] = {};
> +  const void* surfaceValues[] = {};
> +  AutoCFRelease<CFDictionaryRef> IOSurfaceProperties = CFDictionaryCreate(
> +    NULL, surfaceKeys, surfaceValues, 0, &kCFTypeDictionaryKeyCallBacks,
> +    &kCFTypeDictionaryValueCallBacks);

You can just pass NULL for the keys and values arguments when the dictionary is empty.

@@ +409,5 @@
> +  AutoCFRelease<CFDictionaryRef> IOSurfaceProperties = CFDictionaryCreate(
> +    NULL, surfaceKeys, surfaceValues, 0, &kCFTypeDictionaryKeyCallBacks,
> +    &kCFTypeDictionaryValueCallBacks);
> +
> +  SInt32 PixelFormatTypeValue = kCVPixelFormatType_32BGRA;

You say our compositor only handles RGBA32. Should this be kCVPixelFormatType_32RGBA instead?

@@ +417,5 @@
> +  const void* outputKeys[] = { kCVPixelBufferIOSurfacePropertiesKey,
> +                               kCVPixelBufferPixelFormatTypeKey,
> +                               kCVPixelBufferOpenGLCompatibilityKey };
> +  const void* outputValues[] = { IOSurfaceProperties, PixelFormatTypeNumber,
> +                                 kCFBooleanTrue };

One per line here too, please.

@@ +427,5 @@
> +
> +  rv = VTDecompressionSessionCreate(NULL,          // Allocator.
> +                                    mFormat, spec, // Video decoder selection.
> +                                    outputConfiguration, // Output video format.
> +                                    &cb, &mSession);

And here.
Attachment #8482644 - Flags: review?(giles) → review+
Comment on attachment 8482631 [details] [diff] [review]
Add various IOSurface related methods to MacIOSurface wrapper

Review of attachment 8482631 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/2d/MacIOSurface.h
@@ +71,5 @@
>    static mozilla::TemporaryRef<MacIOSurface> CreateIOSurface(int aWidth, int aHeight,
>                                                               double aContentsScaleFactor = 1.0,
>                                                               bool aHasAlpha = true);
>    static void ReleaseIOSurface(MacIOSurface *aIOSurface);
> +  // LookupSurface() takes ownership of the provided IOSurface.

Holds a reference rather than takes ownership.

@@ +76,5 @@
>    static mozilla::TemporaryRef<MacIOSurface> LookupSurface(IOSurfaceID aSurfaceID,
>                                                             double aContentsScaleFactor = 1.0,
>                                                             bool aHasAlpha = true);
>  
> +  // The constructed MacIOSurface doesn't take ownership of aIOSurfacePtr.

This isn't really true any more, we hold a ref.
Attachment #8482631 - Flags: review?(matt.woodrow) → review+
(In reply to Ralph Giles (:rillian) from comment #11)
> Please keep this one argument per line to match local style.

I blame mach clang-format !!

> You say our compositor only handles RGBA32. Should this be
> kCVPixelFormatType_32RGBA instead?

sorry, I meant to say : the compositor only handles BGRA
Attachment #8482644 - Attachment is obsolete: true
Attachment #8482631 - Attachment is obsolete: true
Comment on attachment 8483297 [details] [diff] [review]
Avoid picture frame copy and keep picture in GPU by using IOSurface objects

Review of attachment 8483297 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/media/fmp4/apple/AppleVTDecoder.cpp
@@ +406,3 @@
>                                &kCFTypeDictionaryKeyCallBacks,
>                                &kCFTypeDictionaryValueCallBacks);
> +  AppleUtils::SetCFDict(spec, "EnableHardwareAcceleratedVideoDecoder", true);

You'll need to update this for the change in bug 1055694.
Attachment #8483297 - Attachment is obsolete: true
Attachment #8483898 - Attachment is obsolete: true
Depends on: 1062596
Attachment #8484719 - Attachment is obsolete: true
Blocks: 1062654
Keywords: checkin-needed
hi, 

the first patch worked as expected but the second page failed to apply:

patching file content/media/fmp4/apple/AppleVTDecoder.cpp
Hunk #3 FAILED at 396
1 out of 3 hunks FAILED -- saving rejects to file content/media/fmp4/apple/AppleVTDecoder.cpp.rej
patch failed, unable to continue (try -v)
patch failed, rejects left in working dir
errors during apply, please fix and refresh 1059066_use_iosurface_images.patch

could you take a look  thanks!
Flags: needinfo?(jyavenard)
Keywords: checkin-needed
Did you apply bug 1062596 first?

This bug depends on it.

A real pain I know :(
Flags: needinfo?(jyavenard)
oh yeah that worked! :) sorry missed the dependency when looking at the patch. All good now and checkedin!
Great thanks.
https://hg.mozilla.org/mozilla-central/rev/00a8eb0b6ee3
https://hg.mozilla.org/mozilla-central/rev/65cb5f911c75
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla35
Depends on: 1064847
You need to log in before you can comment on or make changes to this bug.