Closed Bug 886196 Opened 11 years ago Closed 11 years ago

Support fragmented MP4 segments

Categories

(Core :: Audio/Video, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla28

People

(Reporter: kinetik, Assigned: cpearce)

References

(Blocks 1 open bug)

Details

(Keywords: feature, Whiteboard: [Shumway:P1][qa-])

Attachments

(6 files, 1 obsolete file)

Support streaming playback of fMP4 segments in regular HTML media elements. This currently depends on support in the backend, but ideally we need to support this even if the decoder backend doesn't. Handling this will probably take the form of transforming the incoming container data to something acceptable for the decoder backends. We may be able to use Blink's existing (f)MP4 parser: http://src.chromium.org/viewvc/chrome/trunk/src/media/mp4/
Depends on: 863033
That's also needed for MP3 streaming -- MSE does not provide a method to accept raw MP3 stream. The MP3 frames has to be re-packaged into fMP4.
Blocks: shumway
Depends on: 908503
Whiteboard: [shumway]
Whiteboard: [shumway] → [Shumway:P1]
Assignee: nobody → cpearce
Configure options to add flag for enabling/disabling fragmented MP4 parser. Also adds a pref to enable it too. The flag is only enabled if Windows Media Foundation support is enabled, since I have only hooked up a WMF decoder to the parser.
Attachment #827201 - Flags: review?(mh+mozilla)
This imports a fork of Chromium's fMP4 parser into the tree. Most of the files are unchanged, except that I flattened the includes, and all the required includes are selfcontained, this was so that I could build and test the demuxer in a stand alone program without Chromium or Gecko. The files that I are trivial imports, i.e. the files that aren't upstream code so need review are: * The build files. * Streams.h * basictypes.h (this provides most of the Chromium macros required by the other code). * mp4_demuxer.h/cc - this is code mostly copied from the original mp4_stream_parser.h/cc, but modified to accept a Stream abstraction, rather than taking in uint8 arrays. mp4_demuxer exposes the interface that client code uses. There are some todos here and there, mostly inherited from Chromium, but some for me, I'll get to them later. Chromium's fMP4 parser hard-codes the MSE MPEG4 constraints, and we've inherited them. I'll need to add generic MP4 demuxing code later for use in regular <video>.
Attachment #827205 - Flags: review?(kinetik)
Comment on attachment 827201 [details] [diff] [review] Patch 1 v1: Configure changes to enable fmp4 parser Review of attachment 827201 [details] [diff] [review]: ----------------------------------------------------------------- ::: configure.in @@ +8651,5 @@ > AC_SUBST(MOZ_OPUS) > AC_SUBST(MOZ_WEBM) > AC_SUBST(MOZ_DASH) > AC_SUBST(MOZ_WMF) > +AC_SUBST(MOZ_FMP4) Are you going to use MOZ_FMP4 in moz.build or Makefile.in? If not, please don't add the AC_SUBST.
Attachment #827201 - Flags: review?(mh+mozilla) → review+
High level reader abstraction, client of the mp4_demxuer::MP4Demxuer from the previous patch. Implements all the gecko stuff required for a MediaDecoderReader, and provides an abstraction layer for platform decoders to implement. Assumes that the demuxer can't seek, as the fmp4 demuxer can't.
Attachment #827213 - Flags: review?(kinetik)
(In reply to Mike Hommey [:glandium] from comment #4) > Comment on attachment 827201 [details] [diff] [review] > Patch 1 v1: Configure changes to enable fmp4 parser > > Review of attachment 827201 [details] [diff] [review]: > ----------------------------------------------------------------- > > ::: configure.in > @@ +8651,5 @@ > > AC_SUBST(MOZ_OPUS) > > AC_SUBST(MOZ_WEBM) > > AC_SUBST(MOZ_DASH) > > AC_SUBST(MOZ_WMF) > > +AC_SUBST(MOZ_FMP4) > > Are you going to use MOZ_FMP4 in moz.build or Makefile.in? If not, please > don't add the AC_SUBST. Yes, I use MOZ_FMP4 in a moz.build (in my patch 2).
Allow MP4Reader to be created by DecoderTraits, but only if a pref is set (default to not set). This is so that I can test the parser, and once non-fragmented is implemented, I'll remove the pref and make it creatable by default.
Attachment #827215 - Flags: review?(kinetik)
(In reply to Chris Pearce (:cpearce) from comment #3) > * basictypes.h (this provides most of the Chromium macros required by the > other code). We already have 3 different copies of basictypes.h in the tree. Can we avoid adding a fourth, with many possibilities of code wanting one and getting another?
I exported the headers into a directory to reduce the chance of collisions. I can rename it if you're really concerned, it's not really Chromium's basictypes.h, it's the macros that I needed pulled from the various headers in which they're defined. Chromium's unmodified basictypes.h pulls in half of Chromium, and I wanted to be able to build the demuxer stand alone quickly, so that it's easier to develop, test, and fuzz. So I modified this header to include only what I need.
Implements a platform decoder module to plug into the fragmented MP4 parser utilizing Windows Media Foundation. This should work on Vista and up. The interface that this implements and the calling context are defined in Patch 3. * Creates a MFTDecoder class to instantiate and run an IMFTransform decoder for driving the WMF H.264 and AAC decoders. * Implements decoders using the MFTDecoder for H.264 and AAC. Much of these decoders' code is imported from the WMFReader. * Supports DXVA and software decoding.
Attachment #827221 - Flags: review?(paul)
Are you drive-by fixing Bug 908503 here, or is there stuff left to do for Bug 908503? Does this bug qualify for the "feature" keyword, or is it impossible to use this feature without MSE?
I am only adding support for the MSE use case, so I'm using this bug. Bug 908503 refers to the <video> case, which requires a lot more work.
Attachment #827215 - Flags: review?(kinetik) → review+
This needs fuzzing!
Flags: sec-review?(cdiehl)
Christoph, I'm not sure how your fuzzer works these days but the demuxer can be built in a stand alone executable, so that may make your task of fuzzing easier.
Comment on attachment 827215 [details] [diff] [review] Patch 4 v1: Enable MP4Reader to be created by DecoderTraits Review of attachment 827215 [details] [diff] [review]: ----------------------------------------------------------------- ::: content/media/DecoderTraits.cpp @@ +591,5 @@ > +/* static */ > +already_AddRefed<MediaDecoder> > +DecoderTraits::CreateDecoder(const nsACString& aType, MediaDecoderOwner* aOwner) > +{ > + nsRefPtr<MediaDecoder> decoder(InstantiateDecoder(aType)); Note to self: I need to pass in aOwner here, else the MOZ_OMX_DECODER case in InstantiateDecoder fails to compile above.
Comment on attachment 827213 [details] [diff] [review] Patch 3 v1: MP4Reader Review of attachment 827213 [details] [diff] [review]: ----------------------------------------------------------------- ::: content/media/fmp4/MP4Decoder.cpp @@ +38,5 @@ > + return true; > + } > + > + // H.264 + AAC in MP4. > + static char const *const H264Codecs[] = { Not for right now, but we need to look at centralizing this, this is the fourth copy of a very similar table. Also, lowercase h? ::: content/media/fmp4/MP4Reader.cpp @@ +80,5 @@ > +}; > + > +MP4Reader::MP4Reader(AbstractMediaDecoder* aDecoder) > + : MediaDecoderReader(aDecoder), > + mLayersBackendType(mozilla::layers::LAYERS_NONE), Drop mozilla:: @@ +103,5 @@ > + mPlatform = PlatformDecoderModule::Create(); > + NS_ENSURE_TRUE(mPlatform, NS_ERROR_FAILURE); > + > + if (IsVideoContentType(mDecoder->GetResource()->GetContentType())) { > + // Extract the layer manager backend type so that platform decoders Trailing space. @@ +129,5 @@ > +{ > + bool ok = mDemuxer->Init(); > + NS_ENSURE_TRUE(ok, NS_ERROR_FAILURE); > + > + aInfo->mAudio.mHasAudio = mHasAudio = mDemuxer->HasAudio(); Isn't the usual pattern here to initialize the reader's mInfo member, then copy the result to *aInfo at the end? Makes for slightly better readability, too. @@ +202,5 @@ > +MP4Sample* > +MP4Reader::PopSample(TrackType aTrack) > +{ > + // Unfortunately the demuxer outputs in the order samples appear in the > + // media, not on a per stream basis. To cache the samples we get from To = We? @@ +234,5 @@ > + > + // Loop until we hit a return condition; we produce samples, or hit an error. > + while (true) { > + DecoderStatus status = decoder->Output(aOutData); > + if (status == DECODE_STATUS_OK) { Use a switch() on status so the compiler can warn about unhandled DecoderStatus if the enum is modified later. @@ +238,5 @@ > + if (status == DECODE_STATUS_OK) { > + MOZ_ASSERT(aOutData); > + return true; > + } > + // |decodedOutput| should only be non-null in success case. aOutData? @@ +282,5 @@ > +MP4Reader::DecodeAudioData() > +{ > + MOZ_ASSERT(mHasAudio); > + MOZ_ASSERT(mPlatform); > + MOZ_ASSERT(mAudioDecoder); To keep it one line: MOZ_ASSERT(mHasAudio && mPlatform && mAudioDecoder); @@ +295,5 @@ > + return ok; > +} > + > +bool > +MP4Reader::SkipVideoDemuxToNextKeyFrame(int64_t aTimeThreshold, uint32_t& parsed) s/Video//, because audio is skipped too, right? @@ +332,5 @@ > + AbstractMediaDecoder::AutoNotifyDecoded autoNotify(mDecoder, parsed, decoded); > + > + MOZ_ASSERT(mHasVideo); > + MOZ_ASSERT(mPlatform); > + MOZ_ASSERT(mVideoDecoder); To keep it one line: MOZ_ASSERT(mHasVideo && mPlatform && mVideoDecoder); and move it to the first line of the function. @@ +371,5 @@ > +{ > + if (!mDemuxer->CanSeek()) { > + return NS_ERROR_FAILURE; > + } > + return NS_OK; Return NS_ERROR_NOT_YET_IMPLEMENTED here for now? ::: content/media/fmp4/MP4Reader.h @@ +7,5 @@ > +#if !defined(MP4Reader_h_) > +#define MP4Reader_h_ > + > +#include "MediaDecoderReader.h" > +#include "mozilla/RefPtr.h" Unused? @@ +9,5 @@ > + > +#include "MediaDecoderReader.h" > +#include "mozilla/RefPtr.h" > +#include "nsAutoPtr.h" > +#include "nsRect.h" Unused? @@ +33,5 @@ > + MP4Reader(AbstractMediaDecoder* aDecoder); > + > + virtual ~MP4Reader(); > + > + nsresult Init(MediaDecoderReader* aCloneDonor) MOZ_OVERRIDE; virtual @@ +77,5 @@ > + > + MP4SampleQueue mCompressedAudioQueue; > + MP4SampleQueue mCompressedVideoQueue; > + > + mozilla::layers::LayersBackend mLayersBackendType; Drop mozilla::, this code is already inside that namespace. ::: content/media/fmp4/PlatformDecoderModule.h @@ +15,5 @@ > +namespace layers { > +class ImageContainer; > +} > + > +typedef int64_t Microseconds; I think I'd prefer either ticking with int64_t or adding real type safety for time handling. @@ +30,5 @@ > + virtual ~MediaDataDecoder() {}; > + > + virtual nsresult Shutdown() = 0; > + > + // Returns true if future decodes may produce output, for false for = or? @@ +50,5 @@ > +public: > + > + // Creates the appropriate PlatformDecoderModule for this platform. > + // Caller is responsible for deleting this instance. It's safe to have > + // multiple PlatformDecoderModule's alive at the same time. No ' @@ +51,5 @@ > + > + // Creates the appropriate PlatformDecoderModule for this platform. > + // Caller is responsible for deleting this instance. It's safe to have > + // multiple PlatformDecoderModule's alive at the same time. > + // There is one PlatformDecoderModule's created per media decoder. No 's @@ +60,5 @@ > + > + // TODO: Parameters for codec type. > + // Decode thread. > + virtual MediaDataDecoder* CreateVideoDecoder(mozilla::layers::LayersBackend aLayersBackend, > + mozilla::layers::ImageContainer* aImageContainer) = 0; Drop mozilla:: on these lines. ::: content/media/fmp4/demuxer/basictypes.h @@ +13,5 @@ > +PRLogModuleInfo* GetDemuxerLog(); > +#define LOG(...) PR_LOG(GetDemuxerLog(), PR_LOG_DEBUG, (__VA_ARGS__)) > +#else > +#define LOG(...) > +#endif Should the changes in this file be in a different patch?
Attachment #827213 - Flags: review?(kinetik) → review+
Comment on attachment 827205 [details] [diff] [review] Patch 2 v1: Import chromium's fmp4 demuxer into the tree Review of attachment 827205 [details] [diff] [review]: ----------------------------------------------------------------- Since this is a code import, I'm going to ride on the back of Chromium's review process and Christoph's fuzzing and more or less rubber stamp this.
Attachment #827205 - Flags: review?(kinetik) → review+
(In reply to Matthew Gregan [:kinetik] from comment #16) > Comment on attachment 827213 [details] [diff] [review] > Patch 3 v1: MP4Reader > > Review of attachment 827213 [details] [diff] [review]: > ----------------------------------------------------------------- > @@ +234,5 @@ > > + > > + // Loop until we hit a return condition; we produce samples, or hit an error. > > + while (true) { > > + DecoderStatus status = decoder->Output(aOutData); > > + if (status == DECODE_STATUS_OK) { > > Use a switch() on status so the compiler can warn about unhandled > DecoderStatus if the enum is modified later. mmmm some of the cases shouldn't be returned by the methods, for example Output() (and Flush()) should not return DECODE_STATUS_NOT_ACCEPTING, as that only makes sense for the Input() function. Shall I NS_WARNING(...);return false; on those unexpected cases? > @@ +295,5 @@ > > + return ok; > > +} > > + > > +bool > > +MP4Reader::SkipVideoDemuxToNextKeyFrame(int64_t aTimeThreshold, uint32_t& parsed) > > s/Video//, because audio is skipped too, right? No, PopSample(kVideo) will only Pop the video samples, the compressed audio samples in between the current time and the threshold will be enqueued in mCompressedAudioQueue, they won't be discarded. > ::: content/media/fmp4/PlatformDecoderModule.h > @@ +15,5 @@ > > +namespace layers { > > +class ImageContainer; > > +} > > + > > +typedef int64_t Microseconds; > > I think I'd prefer either ticking with int64_t or adding real type safety > for time handling. I too would like real type safety for time units. I was mostly intending this as a way to annotate what unit we're measuring time in, so that I don't feel like I should append "Usecs" to every variable/parameter name, and as a place holder for when we get real type safety. > ::: content/media/fmp4/demuxer/basictypes.h > @@ +13,5 @@ > > +PRLogModuleInfo* GetDemuxerLog(); > > +#define LOG(...) PR_LOG(GetDemuxerLog(), PR_LOG_DEBUG, (__VA_ARGS__)) > > +#else > > +#define LOG(...) > > +#endif > > Should the changes in this file be in a different patch? This is deliberate, as this is a deviation from my upstream repository of the parser, which is imported in the previous patch.
(In reply to Chris Pearce (:cpearce) from comment #18) > mmmm some of the cases shouldn't be returned by the methods, for example > Output() (and Flush()) should not return DECODE_STATUS_NOT_ACCEPTING, as > that only makes sense for the Input() function. Shall I > NS_WARNING(...);return false; on those unexpected cases? Ah, it might not end up being cleaner then, I'm fine with leaving it as is. > No, PopSample(kVideo) will only Pop the video samples, the compressed audio > samples in between the current time and the threshold will be enqueued in > mCompressedAudioQueue, they won't be discarded. Gotcha. In that case, do we need to worry about how much audio accumulates if the keyframes are pathologically far apart? Not sure if that's even possible with MP4, just thinking of the badly muxed WebM files with a single keyframe at the start of the media.
(In reply to Matthew Gregan [:kinetik] from comment #19) > Gotcha. In that case, do we need to worry about how much audio accumulates > if the keyframes are pathologically far apart? Yes, that case can happen. We are at least only buffering compressed data, so we'd need to be demuxing a very large file for this to cause a serious problem. I was planning on following up by refactoring the parser code to calculate the offset of the next sample in each stream and read samples as needed rather than in clusters and buffering them. Chromium's demuxer parses samples in clumps and buffers them because it receives input as uint8 buffers as they come in from MSE, rather than having input cached on disk like we do.
(In reply to Chris Pearce (:cpearce) from comment #10) > Created attachment 827221 [details] [diff] [review] > Patch 6 v1: WMF platform decoder for fmp4 Is there a missing Patch 5?
(In reply to Daniel Veditz [:dveditz] from comment #21) > (In reply to Chris Pearce (:cpearce) from comment #10) > > Created attachment 827221 [details] [diff] [review] > > Patch 6 v1: WMF platform decoder for fmp4 > > Is there a missing Patch 5? No, the patch labelled "patch 6" is the last patch, I mislabeled patch 5 as patch 6. There was another patch in my queue, but I landed that in a separate bug.
Attachment #827221 - Attachment description: Patch 6 v1: WMF platform decoder for fmp4 → Patch 5 v1: WMF platform decoder for fmp4
cdiehl is going to fuzz the demuxer, but he can't do that on Mac since there's no PlatformDecoderModule for MacOSX yet, and there likely won't be one for a while. So here's a PlatformDecoderModule that accepts input and just outputs blank green video frames and audio samples containing an A4 tone, of the duration implied in the input data. Disabled by default, to enable, set the pref media.fragmented-mp4.use-blank-decoder to true. The fmp4 parser will be fuzzed using this PlatformDecoderModule. We should re-fuzz when we get real PlatformDecoderModule's that actually decode on the platforms that we can fuzz too.
Attachment #832461 - Flags: review?(kinetik)
Attachment #832461 - Flags: review?(kinetik) → review+
Attachment #827205 - Attachment description: Patch 2 v2: Import chromium's fmp4 demuxer into the tree → Patch 2 v1: Import chromium's fmp4 demuxer into the tree
Unbitrotten, I also changed the LOG macro to DVLOG, so that I avoided some name collisions on some platforms.
Attachment #8333646 - Flags: review+
Attachment #827205 - Attachment is obsolete: true
Comment on attachment 827221 [details] [diff] [review] Patch 5 v1: WMF platform decoder for fmp4 Review of attachment 827221 [details] [diff] [review]: ----------------------------------------------------------------- Looks good, apart from the do { continue; } while(false); bit. r+ with that fixed. ::: content/media/fmp4/wmf/MFTDecoder.cpp @@ +36,5 @@ > + // Create the IMFTransform to do the decoding. > + HRESULT hr; > + hr = CoCreateInstance(aMFTClsID, > + NULL, > + CLSCTX_ALL, // CLSCTX_INPROC_SERVER ? Yes, I'd expect this to be CLSCTX_INPROC_SERVER. @@ +216,5 @@ > + output.pEvents = nullptr; > + } > + > + if (hr == MF_E_TRANSFORM_STREAM_CHANGE) { > + // Type change, probably geometric apperature change. s/apperature/aperature/ @@ +239,5 @@ > + > + *aOutput = output.pSample; // AddRefs > + if (mMFTProvidesOutputSamples) { > + // If the MFT is providing samples, we must release the sample here. > + // Typically the only the H.264 MFT provides samples when using DXVA, s/the only the/only the/, I think ::: content/media/fmp4/wmf/MFTDecoder.h @@ +52,5 @@ > + uint32_t aDataSize, > + int64_t aTimestampUsecs); > + > + // Retrieves output from the MFT. Call this once Input() returns > + // MF_E_NOTACCEPTING. Some MFTs with hardware accelerationg (the H.264 s/accelerationg/acceleration/ @@ +65,5 @@ > + // due to lack of input. > + // - S_OK if an output frame is produced. > + HRESULT Output(RefPtr<IMFSample>* aOutput); > + > + // Sends a flushes message to the MFT. This causes it to discard all s/flushes/flush/ ? ::: content/media/fmp4/wmf/WMFAudioDecoder.cpp @@ +32,5 @@ > +} > + > +WMFAudioDecoder::~WMFAudioDecoder() > +{ > +} Why define this if it's empty? @@ +45,5 @@ > + // Contains the portion of the HEAACWAVEINFO structure that appears > + // after the WAVEFORMATEX structure (that is, after the wfx member). > + // This is followed by the AudioSpecificConfig() data, as defined > + // by ISO/IEC 14496-3. > + // ... The formatting of this part of this comment block is confusing. It looks like stuff is missing, but apparently, not. @@ +221,5 @@ > + MOZ_ASSERT(numSamples >= 0); > + if (numFrames == 0) { > + // All data from this chunk stripped, loop back and try to output the next > + // frame, if possible. > + continue; This does not work: > #include <stdio.h> > > int main () { > int i = 0; > do { > i++; > continue; > } while (false); > > printf("%d\n", i); > return 0; > } Prints: > 1 and exits. ::: content/media/fmp4/wmf/WMFAudioDecoder.h @@ +32,5 @@ > + Microseconds aDTS, > + Microseconds aPTS, > + int64_t aOffsetInStream); > + > + // Blocks until decoded sample is produced by the deoder. s/decoded sample/a decoded sample/ s/deoder/decoder/ ::: content/media/fmp4/wmf/WMFVideoDecoder.cpp @@ +84,5 @@ > + > + if (useDxva) { > + RefPtr<IMFAttributes> attr(mDecoder->GetAttributes()); > + > + UINT32 aware = 0; Maybe use a more descriptive name? @@ +358,5 @@ > + if (SUCCEEDED(hr)) { > + NS_ENSURE_TRUE(sample, DECODE_STATUS_ERROR); > + break; > + } > + if (hr == MF_E_TRANSFORM_NEED_MORE_INPUT) { Is this a tab after the == ? Should be a single space. ::: content/media/fmp4/wmf/WMFVideoDecoder.h @@ +35,5 @@ > + Microseconds aDTS, > + Microseconds aPTS, > + int64_t aOffsetInStream) MOZ_OVERRIDE; > + > + // Blocks until decoded sample is produced by the deoder. Same typos here. @@ +63,5 @@ > + // This is used to approximate the decoder's position in the media resource. > + int64_t mLastStreamOffset; > + > + nsAutoPtr<MFTDecoder> mDecoder; > + RefPtr<mozilla::layers::ImageContainer> mImageContainer; Drop mozilla:: as you are already inside the namespace, here.
Attachment #827221 - Flags: review?(paul) → review+
Thanks for review! Now I just need the tree to open again... (In reply to Paul Adenot (:padenot) from comment #25) > Comment on attachment 827221 [details] [diff] [review] > Patch 5 v1: WMF platform decoder for fmp4 > @@ +45,5 @@ > > + // Contains the portion of the HEAACWAVEINFO structure that appears > > + // after the WAVEFORMATEX structure (that is, after the wfx member). > > + // This is followed by the AudioSpecificConfig() data, as defined > > + // by ISO/IEC 14496-3. > > + // ... > > The formatting of this part of this comment block is confusing. It looks > like stuff is missing, but apparently, not. I'll clean it up a little. > @@ +221,5 @@ > > + MOZ_ASSERT(numSamples >= 0); > > + if (numFrames == 0) { > > + // All data from this chunk stripped, loop back and try to output the next > > + // frame, if possible. > > + continue; > > This does not work: > > > #include <stdio.h> > > > > int main () { > > int i = 0; > > do { > > i++; > > continue; > > } while (false); > > > > printf("%d\n", i); > > return 0; > > } > > Prints: > > > 1 > > and exits. I'll be damned. I'll stick the loop into a sub-function.
Depends on: 941296
Depends on: 941298
Depends on: 941302
Depends on: 941301
Depends on: 941931
Depends on: 943721
Depends on: 944088
Depends on: 945085
Depends on: 946027
From the discussions with kinetik and padenot on IRC: there is no need for manual testing here. Adding [qa-] for now.
Whiteboard: [Shumway:P1] → [Shumway:P1][qa-]
Depends on: 1071563
No longer depends on: 1071563
Flags: sec-review?(cdiehl)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: