Closed Bug 1075332 Opened 10 years ago Closed 10 years ago

MSE content using fragmented mp4 do not play on OSX

Categories

(Core :: Audio/Video, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla36

People

(Reporter: jya, Assigned: jya)

References

Details

Attachments

(1 file, 1 obsolete file)

Example: http://people.mozilla.org/~jyavenard/tests/mse_mp4/paper.html This simple page loads two mp4 segments. Video should start automatically and play, yet either it doesn't play at all, or it plays about 3s worth. Chrome plays content just fine http://people.mozilla.org/~jyavenard/tests/mse_mp4/paperfull.html is similar but plays a single mp4 fragment (made of the two fragments in above URL)
Playing directly the mp4 file doesn't yield much better result... so it could be just the mp4 related code only, and not be MSE per-say http://people.mozilla.org/~jyavenard/tests/mse_mp4/facebook-paper.mp4
That has definitely regressed. It is probably my patch in bug 1056485. I suggest you just find a regression commit by checking out each revision in |hg log media/libstagefright/| my first guess would be bug 1056485.
facebook-paper.mp4 listed above isn't the same as the original (http://people.mozilla.org/~cpearce/video/facebook-paper.mp4) it has been truncated to the first 10s only. The original plays fine Reverting the various changes on stagefright changes do not appear to make consistent improvements.
This happens to be a problem I can only reproduce using the mac VideoToolbox decoder. Using FFmpeg it plays fine...
Summary: MSE content using fragmented mp4 do not play → MSE content using fragmented mp4 do not play on OSX
AAC decoder refactor. I believe much simpler approach than before. Behaves in effect in a similar fashion as the FFmpeg audio decoder. I'm not happy with the data type used for mOutputData, I'm hoping there's another nsObject I'm yet to discover that would allow to act like an array, but directly pass and forget the pointer to the allocated data (similar to nsAutoPtr. The discontinuity check is very basic: we calculate what the next packet's timestamp was supposed to be and if +/-5us : reset the timestamps. In case a seek happened, Flush() or Drain() would have been called anyway
Attachment #8518777 - Flags: review?(giles)
Comment on attachment 8518777 [details] [diff] [review] Refactor mac audio decoder. Properly calculate timestamps Review of attachment 8518777 [details] [diff] [review]: ----------------------------------------------------------------- r=me with comments addressed. ::: dom/media/fmp4/apple/AppleATDecoder.cpp @@ +237,5 @@ > { > // Pick a multiple of the frame size close to a power of two > // for efficient allocation. > const uint32_t MAX_AUDIO_FRAMES = 128; > + const uint32_t decodedSize = MAX_AUDIO_FRAMES * mConfig.channel_count; Call this decodedValues. 'Size' usually implies bytes. @@ -366,2 @@ > NS_ERROR("Failed to apply ADTS header"); > - mCallback->Error(); Shouldn't you still call mCallback->Error() here? If ConvertSample() fails once it will always fail, so we'll never get to the 'if (mLastErrror)' block at the end of the function. @@ +347,5 @@ > } > + > + bool discontinuity = mNextAudioTimestamp < 0 || > + mNextAudioTimestamp < (aSample->composition_timestamp - 5) || > + mNextAudioTimestamp > (aSample->composition_timestamp + 5); This can overflow. How about: const Microseconds fuzz = 5; bool discontinuity = mNextAudioTimestamp < fuzz || mNextAudioTimestamp > (std::numeric_limits<Microseconds>::max() - fuzz) || mNextAudioTimestamp < (aSample->composition_timestamp - fuzz) || mNextAudioTimestamp > (aSample->composition_timestamp + fuzz); Or use a CheckedInt. @@ +363,5 @@ > aSample->size, > aSample->data, > flags); > + > + if (mOutputData.Length() > 0) { You can also say, if (!mOutputData.IsEmpty()) if you like that better. No speed difference. @@ +366,5 @@ > + > + if (mOutputData.Length() > 0) { > + int rate = mOutputFormat.mSampleRate; > + int channels = mOutputFormat.mChannelsPerFrame; > + UInt32 numFrames = mOutputData.Length() / channels; nsTArray::Length() returns a size_t. @@ +367,5 @@ > + if (mOutputData.Length() > 0) { > + int rate = mOutputFormat.mSampleRate; > + int channels = mOutputFormat.mChannelsPerFrame; > + UInt32 numFrames = mOutputData.Length() / channels; > + int64_t duration = FramesToUsecs(numFrames, rate).value(); Please check for overflow here, too. Cribbing from EMEAudioDecoder.cpp: auto duration = FramesToUsecs(numFrames, rate); if (!duration.isValid()) { NS_ERROR("Invalid count of accumulated audio samples"); mCallback->Error(); return; } And pass duration.value() to the AudioData ctor below. @@ +380,5 @@ > + data(new AudioDataValue[mOutputData.Length()]); > + memcpy(data.get(), > + &mOutputData[0], > + mOutputData.Length() * sizeof(AudioDataValue)); > + mOutputData.Clear(); Yuck. I don't have any better suggestions though, other than using PodCopy() so you don't need the sizeof. You could save an allocation by making mOutputData an nsTArray<nsAutoArrayPtr<AudioDataValue>> but you'll need a more complicated flattening step at the end, which I don't think is a win.
Attachment #8518777 - Flags: review?(giles) → review+
(In reply to Ralph Giles (:rillian) from comment #6) > > // Pick a multiple of the frame size close to a power of two > > // for efficient allocation. > > const uint32_t MAX_AUDIO_FRAMES = 128; > > + const uint32_t decodedSize = MAX_AUDIO_FRAMES * mConfig.channel_count; > > Call this decodedValues. 'Size' usually implies bytes. decodedSamples maybe then. As that what those are I only re-used the name you had there (which was in bytes earlier) > > @@ -366,2 @@ > > NS_ERROR("Failed to apply ADTS header"); > > - mCallback->Error(); > > Shouldn't you still call mCallback->Error() here? If ConvertSample() fails > once it will always fail, so we'll never get to the 'if (mLastErrror)' block > at the end of the function. yes, you're right. that bit is wrong > > @@ +347,5 @@ > > } > > + > > + bool discontinuity = mNextAudioTimestamp < 0 || > > + mNextAudioTimestamp < (aSample->composition_timestamp - 5) || > > + mNextAudioTimestamp > (aSample->composition_timestamp + 5); > > This can overflow. How about: if this overflow there, it will overflow right after anyway when we add the duration played to mNextAudioTimestamp. It will also overflow in MP4Reader when it computes the next timestamp. Also mNextAudioTimestamp is signed. > > const Microseconds fuzz = 5; > bool discontinuity = mNextAudioTimestamp < fuzz || > mNextAudioTimestamp > (std::numeric_limits<Microseconds>::max() - fuzz) || > mNextAudioTimestamp < (aSample->composition_timestamp - fuzz) || > mNextAudioTimestamp > (aSample->composition_timestamp + fuzz); I can do this, but I really don't see the point. Would have to explicitly check for mNextAudioTimestamp being < 0 as it indicates that it has never been initialised. > @@ +366,5 @@ > > + > > + if (mOutputData.Length() > 0) { > > + int rate = mOutputFormat.mSampleRate; > > + int channels = mOutputFormat.mChannelsPerFrame; > > + UInt32 numFrames = mOutputData.Length() / channels; > > nsTArray::Length() returns a size_t. ok. > Yuck. I don't have any better suggestions though, other than using PodCopy() > so you don't need the sizeof. I thought you would come up with a fancy nsContainer that magically let the container "forget" its content and pass it to someone else which will free it later. > > You could save an allocation by making mOutputData an > nsTArray<nsAutoArrayPtr<AudioDataValue>> but you'll need a more complicated > flattening step at the end, which I don't think is a win. that looks heaps more complicated to me !
Attachment #8518777 - Attachment is obsolete: true
Keywords: checkin-needed
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla36
QA Whiteboard: [good first verify]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: