1287379 - Rework how the Apple VT decoder is using InputExhausted()

Assignee

Description

•

9 years ago

The Apple VT decoder outputs frame as soon as one is decoded. We can reduce the rate at which the decoder is fed. The only time InputExhausted is to be used is when decoding the first frame.

Jean-Yves Avenard [:jya]

Assignee

Comment 1

•

9 years ago

Attached file Bug 1287379: Rework Apple VT use of InputExhausted. — Details

The only time we need to use InputExhausted is for the initial video decoding or when a frame is dropped. Review commit: https://reviewboard.mozilla.org/r/64848/diff/#index_header See other reviews: https://reviewboard.mozilla.org/r/64848/

Attachment #8771893 - Flags: review?(cpearce)

Chris Pearce [:cpearce (Not reading bugmail)]

Comment 2

•

9 years ago

https://reviewboard.mozilla.org/r/64848/#review62052 I want to make sure I understand how this works before r+ing it, so I'm leaving the status r? for now. ::: dom/media/platforms/apple/AppleVTDecoder.cpp:292 (Diff revision 1) > - mCallback->InputExhausted(); > - } > - MOZ_ASSERT(mQueuedSamples); > - mQueuedSamples--; > - > if (!aImage) { As I understand it, the decoder may need a number of inputs before it can produce an output. The InputExhausted callback is to signal to the MediaFormatReader that we're in one of those situations and that the decoder requires more input, in excess of the "decode ahead" that the MediaFormatReader usually maintains. I'd like to understand what's causing InputExhausted to be called in the "decoder-requries-more-input" case. I'm guessing that if the decoder doesn't have enough input to generate output, that it calls the callback with a null 'image' argument? And that's what causes you (here) to call the InputExhausted callback? If that's the case, the comment "Image was dropped by decoder" is misleading, and should be changed so that it's obvious that calling the InputExhausted callback is critical to the decoder being fed input in a timely fashion. The commit message needs to change too. If the platform decoder doesn't call the PlatformCallback with a null image when it's not had enough input to produce an output, then (now that you're not calling InputExhausted in ProcessDecode()) I don't see where else you're calling InputExhausted that would enable you to break out of the "decoder-requries-more-input" state. ::: dom/media/platforms/apple/AppleVTDecoder.cpp:412 (Diff revision 1) > MonitorAutoLock mon(mMonitor); > mReorderQueue.Push(data); > while (mReorderQueue.Length() > mMaxRefFrames) { > mCallback->Output(mReorderQueue.Pop().get()); > } > + if (mReorderQueue.Length() < mMaxRefFrames) { Doesn't this InputExhausted callback override the MediaFormatReader's "decode ahead" behaviour, and cause the decoder to be fed enough input (providing it's producing output, otherwise we don't get to this code path) that enables it to potentially buffer up to mMaxRefFrames in its reorder queue? I'm not saying this is wrong or necessarily bad, I'm just making sure I understand how this works.

Jean-Yves Avenard [:jya]

Assignee

Comment 3

•

9 years ago

https://reviewboard.mozilla.org/r/64848/#review62052 > As I understand it, the decoder may need a number of inputs before it can produce an output. The InputExhausted callback is to signal to the MediaFormatReader that we're in one of those situations and that the decoder requires more input, in excess of the "decode ahead" that the MediaFormatReader usually maintains. > > I'd like to understand what's causing InputExhausted to be called in the "decoder-requries-more-input" case. > > I'm guessing that if the decoder doesn't have enough input to generate output, that it calls the callback with a null 'image' argument? And that's what causes you (here) to call the InputExhausted callback? > > If that's the case, the comment "Image was dropped by decoder" is misleading, and should be changed so that it's obvious that calling the InputExhausted callback is critical to the decoder being fed input in a timely fashion. > > The commit message needs to change too. > > If the platform decoder doesn't call the PlatformCallback with a null image when it's not had enough input to produce an output, then (now that you're not calling InputExhausted in ProcessDecode()) I don't see where else you're calling InputExhausted that would enable you to break out of the "decoder-requries-more-input" state. Let's ignore the decode ahead for now, because it's unrelated, it just happens to use a similar code path. The Apple VT decoder doesn't work like the WMF one. It outputs frame in decode order; it doesn't buffer frames for an insane amount of time (like WMF keeping over 30 frames). The first frame fed to the decoder will always be a keyframe. The decoder will immediately output that frame decoded. For one call to VTDecompressionSessionDecodeFrame, there will be a call to the callback. If a frame can't be decoded, or isn't to be displayed, then the callback will still be called with a NULL value. Not that this wasn't the case with VDA, VDA had operations to drain/flush: they no longer exist with VT because of this 1:1 ratio, at worse you just need to wait for the callback which is done using the VTDecompressionSessionWaitForAsynchronousFrames call (FWIW, Chrome and FFmpeg always call VTDecompressionSessionWaitForAsynchronousFrames after feeding the decoder a compressed frame) The MediaFormatReader calls Input expecting by default that a call to Output will shortly follow. If we respect the ratio of Input/Output as 1:1, there's no need for InputExhausted. As it is today, many of the MediaDataDecoder do not handle InputExhausted that way, they call it all the time as soon as they're done processing an input. This is unecessary and add unwarranted complexity: I intend to correct this, starting with the VT (because it's the easiest to do) In order to make sure that the frames are returned in presentation order, we must wait until a specific number of frames have been decoded before we can return one. So to get the first output, we must have decoded a minimum of mMaxRefFrames frames. this is the only time InputExhausted will now be used. For the rare case where the VT decoder will drop a frame, then we also need to call InputExhausted because otherwise we would break that ratio of 1:1 between Input/Output. We use InputExhausted as "decoder-requries-more-input", not because the VT decoder itself need more input (it doesn't) but because the VT MediaDataDecoder does need more input. Does this clarify the situation? I believe the implementation I provided does the right thing, and coincidentally is ten times easier than whatever was there before (which with hindsight have no clue how we got into that mess) > Doesn't this InputExhausted callback override the MediaFormatReader's "decode ahead" behaviour, and cause the decoder to be fed enough input (providing it's producing output, otherwise we don't get to this code path) that enables it to potentially buffer up to mMaxRefFrames in its reorder queue? > > I'm not saying this is wrong or necessarily bad, I'm just making sure I understand how this works. We always want to buffer up mMaxRefFrames. mMaxRefFrames is the size of the h264 "sliding window" ; it identifies how many frames depends on one another at the most. This is the minimal size required to guarantee that you can reordre the frames in PTS order. It's value is typically 2 (though we set it at 4 just to be safe). For the most common scenario, at any given time, there are always frames in the queue (until the decoder is drained) The VT decoder output frames in decode order at a ratio of 1:1. We can't ensure the frames can be properly reordered until we've seen "sliding window"'s size frames. Which is why we will call InputExhausted until that many frames have been seen.

Chris Pearce [:cpearce (Not reading bugmail)]

Comment 4

•

9 years ago

Comment on attachment 8771893 [details] Bug 1287379: Rework Apple VT use of InputExhausted. https://reviewboard.mozilla.org/r/64848/#review62116 Thanks for the explanation.

Attachment #8771893 - Flags: review?(cpearce) → review+

Jean-Yves Avenard [:jya]

Assignee

Comment 5

•

9 years ago

Comment on attachment 8771893 [details] Bug 1287379: Rework Apple VT use of InputExhausted. Review request updated; see interdiff: https://reviewboard.mozilla.org/r/64848/diff/1-2/

Anthony Jones (:ajones, :kentuckyfriedtakahe, :k17e)

Updated

•

9 years ago

Priority: -- → P2

Anthony Jones (:ajones, :kentuckyfriedtakahe, :k17e)

Updated

•

9 years ago

Priority: P2 → P3

Jean-Yves Avenard [:jya]

Assignee

Updated

•

9 years ago

Depends on: 1297265

Jean-Yves Avenard [:jya]

Assignee

Comment 6

•

9 years ago

Moving this patch in bug 1297265

Status: NEW → RESOLVED

Closed: 9 years ago

No longer depends on: 1297265

Resolution: --- → DUPLICATE

Bugzilla

Rework how the Apple VT decoder is using InputExhausted()

Categories

(Core :: Audio/Video: Playback, defect, P3)

Tracking

()

People

(Reporter: jya, Assigned: jya)

References

Details

Crash Data

Security

(public)

User Story

Attachments

(1 file)

Description

Comment 1

Comment 2

Comment 3

Comment 4

Comment 5

Updated

Updated

Updated

Comment 6

Attachment

General

Description

File Name

Content Type