[mp4_demuxer] Expose AVCC data and make Annex B optional

RESOLVED FIXED in mozilla33

Status

()

Core
Audio/Video
RESOLVED FIXED
4 years ago
3 years ago

People

(Reporter: rillian, Assigned: rillian)

Tracking

Trunk
mozilla33
x86
Mac OS X
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 7 obsolete attachments)

Apple's VideoToolbox decoder wants AVCC data so it can apply the Annex B transformation itself. Therefore the PlatformDecoderModule need a way to retrieve that through our stagefright wrapper and ask the demuxer not to mark the samples as Annex B.

This is complicated by the fact that libstagefright writes the Annex B-indicating 0x00000001 at the start of each sample, but doesn't actually write the sps, etc. table on keyframes.
(Assignee)

Comment 1

4 years ago
Created attachment 8432875 [details] [diff] [review]
avcc.patch

Patch from :ajones to fix this.
Assignee: nobody → giles
(Assignee)

Updated

4 years ago
Blocks: 941296
Comment on attachment 8432875 [details] [diff] [review]
avcc.patch

Review of attachment 8432875 [details] [diff] [review]:
-----------------------------------------------------------------

I think the PDM needs a way to signal whether it wants AVCC or Annex-B, and we need to be able to direct the demuxer to output said format.
(Assignee)

Comment 3

4 years ago
Created attachment 8434518 [details] [diff] [review]
conditional annex b v2
Attachment #8434518 - Flags: review?(cpearce)
(Assignee)

Comment 4

4 years ago
New patch porting my conditional stuff from the old decoder. I didn't update the libstagefright patch generation; there seems to be a missing tag file indicating the last imported revision to diff against.

I made avcc video the default, so for WMF we have to request Annex B explicitly. Apple requires avcc and ffmpeg accepts both. We could flip the default and have only Apple request avcc. That would let us drop the changes to ffmpeg, but this way we avoid the memmove in MP4Sample::Prepend().

Anthony had an idea of avoiding the copy altogether by pre-allocating the necessary space in the sample buffer, but I'm not sure what he meant in detail unless it's adding lots of internal complexity to MP4Sample so it can move its 'data' member around.
(Assignee)

Comment 5

4 years ago
https://tbpl.mozilla.org/?tree=Try&rev=8c9eca91f65e
(In reply to Ralph Giles (:rillian) from comment #4)
> Anthony had an idea of avoiding the copy altogether by pre-allocating the
> necessary space in the sample buffer, but I'm not sure what he meant in
> detail unless it's adding lots of internal complexity to MP4Sample so it can
> move its 'data' member around.

stagefright::MediaBuffer has a range_offset() and a range_length() field. The range_offset() is always 0. If we change it so that the buffer is written with non-zero offset then that would give us space at the beginning of the buffer to prepend into.
Comment on attachment 8434518 [details] [diff] [review]
conditional annex b v2

Review of attachment 8434518 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/media/fmp4/MP4Reader.cpp
@@ +174,5 @@
>    PlatformDecoderModule::Init();
>    mDemuxer = new MP4Demuxer(new MP4Stream(mDecoder->GetResource()));
> +#ifdef XP_WIN
> +  // Windows Media Foundation requires AnnexB samples.
> +  mDemuxer->PrepareAnnexB(true);

Once we're decoding with Gecko Media Plugins (OpenH264, or an EME plugin) this may not be correct; it may vary between PDMs on the same platform. You really need to ask the PlatformDecoderModule directly.

So I think we should add a new virtual method to PlatformDecoderModule:

enum H264Format {
  kAnnexB,
  kAVCC,
  kInvalid
};

virtual H264Format RequiredH264Format() = 0;

Pass the required H264Format to MP4Reader's call to MP4Demuxer::DemuxVideoSample(), just call mPlatform->RequiredH264Format() every time.

@@ +175,5 @@
>    mDemuxer = new MP4Demuxer(new MP4Stream(mDecoder->GetResource()));
> +#ifdef XP_WIN
> +  // Windows Media Foundation requires AnnexB samples.
> +  mDemuxer->PrepareAnnexB(true);
> +  // Apple requires plain samples, and ffmpeg doesn't care.

"plain" is not the right word here.
Attachment #8434518 - Flags: review?(cpearce) → review-
(Assignee)

Comment 8

4 years ago
Created attachment 8437322 [details] [diff] [review]
conditional annex b v3

Patch implementing this as a method on the PlaformDecoderModule. Kind of cumbersome this way, because of who has the VideoDecoderConfig object with the Annex B data, and who knows what the MediaDataDecoder actually wants.

I think it might be cleaner for the demuxer to return AVCC data and let the MediaDataDecoder call a utility function to prepend the data if it needs too.
Attachment #8432875 - Attachment is obsolete: true
Attachment #8434518 - Attachment is obsolete: true
(In reply to Ralph Giles (:rillian) from comment #8)
> I think it might be cleaner for the demuxer to return AVCC data and let the
> MediaDataDecoder call a utility function to prepend the data if it needs too.

OK, do it.
Created attachment 8442892 [details] [diff] [review]
conditional annex b v4

This patch moves Annex B preparation to a utility function. The demuxer now makes available both the raw AVCC data and the parsed AnnexB header through the VideoDecoderConfig, and decoders are expected to call the utility function to convert samples to Annex B if they need them.

The ffmpeg decoder is adapted to use its internal AVCC parsing.

The WMF decoder is adapted to convert the samples its given to Annex B.
Attachment #8437322 - Attachment is obsolete: true
Created attachment 8443108 [details] [diff] [review]
conditional annex b v5

Updated patch. This one actually builds on windows. I can't tell if it works; my vm connection may be corrupting the output video.

I considered adding a WMFVideoDataDecoder subclass so I could explicitly hold a reference to the VideoDecoderConfig and code application to video samples that way, but this seemed like less code.
Attachment #8442892 - Attachment is obsolete: true
Attachment #8443108 - Flags: review?(cpearce)
I pushed to try. Chris, could you please check the resulting build on your windows machine?

https://tbpl.mozilla.org/?tree=Try&rev=685f2b3b0487
Comment on attachment 8443108 [details] [diff] [review]
conditional annex b v5

Review of attachment 8443108 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/media/fmp4/wmf/WMFMediaDataDecoder.cpp
@@ +23,3 @@
>  namespace mozilla {
>  
>  WMFMediaDataDecoder::WMFMediaDataDecoder(WMFOutputSource* aSource,

Let's keep WMFMediaDataDecoder independent of the media type its decoding.

Pass the VideoDecoderConfig into the WMFVideoOutputSource constructor, and add a Input(mp4_demuxer::MP4Sample* aSample) method to WMFOutputSource which is called by WMFMediaDataDecoder::ProcessDecode() in the place where it calls mDecoder->Input().

The implemention on WMFAudioOutputSource can just pass the call through to the MFTDecoder, but the WMFVideoOutputSource needs to call the function to do the conversion.

::: media/libstagefright/binding/AnnexB.cpp
@@ +22,5 @@
> +  MOZ_ASSERT(aSample->data);
> +  MOZ_ASSERT(aSample->size < ArrayLength(kAnnexBDelimiter));
> +  // Overwrite the NAL length with the Annex B separator.
> +  memcpy(aSample->data, kAnnexBDelimiter, ArrayLength(kAnnexBDelimiter));
> +  // Prepend the Annex B header with SPS and PSP tables.

You only need to prepend the SPS and PPS (spelling mistake; it's not "PSP") infront of *keyframes*.
Attachment #8443108 - Flags: review?(cpearce) → review-
Created attachment 8443329 [details] [diff] [review]
conditional annex b v6

> Pass the VideoDecoderConfig into the WMFVideoOutputSource constructor, and
> add a Input(mp4_demuxer::MP4Sample* aSample) method to WMFOutputSource which
> is called by WMFMediaDataDecoder::ProcessDecode() in the place where it calls
> mDecoder->Input().

Yes, that's better.

> You only need to prepend the SPS and PPS (spelling mistake; it's not "PSP")
> in front of *keyframes*.

Thanks. Fixed.

https://tbpl.mozilla.org/?tree=Try&rev=bb4fd403a89a
Attachment #8443108 - Attachment is obsolete: true
Attachment #8443329 - Flags: review?(cpearce)
Created attachment 8443330 [details] [diff] [review]
annexbv5v6.diff

Interdiff with changes relative to the last patch.
Comment on attachment 8443329 [details] [diff] [review]
conditional annex b v6

Review of attachment 8443329 [details] [diff] [review]:
-----------------------------------------------------------------

Nice.

::: content/media/fmp4/wmf/WMFVideoOutputSource.cpp
@@ +151,5 @@
> +  // Forward sample data to the decoder.
> +  const uint8_t* data = reinterpret_cast<const uint8_t*>(aSample->data);
> +  uint32_t length = aSample->size;
> +  HRESULT hr = mDecoder->Input(data, length, aSample->composition_timestamp);
> +  if (FAILED(hr)) {

We're warning in WMFMediaDataDecoder::ProcessDecode, so we don't really need to warn here as well.
Attachment #8443329 - Flags: review?(cpearce) → review+
Created attachment 8443774 [details] [diff] [review]
conditional annex b v7

Thanks for the Saturday review, cpearce!

Remove redundant warning text, and fix up commit message. Carrying forward r=cpeace.
Attachment #8443329 - Attachment is obsolete: true
Attachment #8443330 - Attachment is obsolete: true
https://hg.mozilla.org/integration/mozilla-inbound/rev/ccdc4a98cc11
https://hg.mozilla.org/mozilla-central/rev/ccdc4a98cc11
Status: NEW → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla33
(In reply to Ralph Giles (:rillian) from comment #11)
> Created attachment 8443108 [details] [diff] [review]
> conditional annex b v5
> 
> Updated patch. This one actually builds on windows. I can't tell if it
> works; my vm connection may be corrupting the output video.

Turns your patch doesn't work on Windows.
Comment on attachment 8443774 [details] [diff] [review]
conditional annex b v7

Review of attachment 8443774 [details] [diff] [review]:
-----------------------------------------------------------------

::: media/libstagefright/binding/AnnexB.cpp
@@ +21,5 @@
> +  MOZ_ASSERT(aSample);
> +  MOZ_ASSERT(aSample->data);
> +  MOZ_ASSERT(aSample->size >= ArrayLength(kAnnexBDelimiter));
> +  // Overwrite the NAL length with the Annex B separator.
> +  memcpy(aSample->data, kAnnexBDelimiter, ArrayLength(kAnnexBDelimiter));

Here we need:

  uint8_t* d = aSample->data;
  uint8_t* end = d + aSample->size;
  while (d + 4 < end) {
    uint32_t nalLen = (((uint32_t)d[0]) << 24) +
                      (((uint32_t)d[1]) << 16) +
                      (((uint32_t)d[2]) << 8) +
                         d[3];
    // Overwrite the NAL length with the Annex B separator.
    memcpy(d, kAnnexBDelimiter, ArrayLength(kAnnexBDelimiter));
    d += 4 + nalLen;
  }

That fixes it for me. I'll write a patch for this tomorrow.

::: media/libstagefright/frameworks/av/media/libstagefright/MPEG4Extractor.cpp
@@ -3301,5 @@
>                  }
>  
>                  CHECK(dstOffset + 4 <= mBuffer->size());
> -
> -                dstData[dstOffset++] = 0;

I think this is part of the problem, the demuxer can return multiple NALs in the same output sample, so we need to loop over all the NAL units and re-set the start codes in AnnexB::ConvertSample().
Ah, that makes sense. Thanks for investigating. I had assumed the problems were with prepending the AVCC data to non-keyframes.
Another issue here is that the stagefright demuxer assumes the sizeof the NAL length field is always 4, and that's not always the case.
Not by spec. Do you have files which the old code fails for too?

P.S. We should move this to a new bug.
Follow up in bug 1029824.
You need to log in before you can comment on or make changes to this bug.