Closed Bug 566245 Opened 14 years ago Closed 14 years ago

Merge nsWebM decoder to mozilla-central

Categories

(Core :: Audio/Video, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
Tracking Status
blocking2.0 --- alpha5+

People

(Reporter: roc, Assigned: cajbir)

References

()

Details

Attachments

(3 files, 9 obsolete files)

Need to extract the patches from mozilla-webmedia, clean them up, attach them here and get reviews.
Depends on: 566241
Depends on: 566498
Depends on: 566501
Group: mozilla-corporation-confidential
This is the first cut at a webm decoder. It is not ready for review. This is what the playback in our webm build demo is build from.
URL: 567056
Attachment #446464 - Flags: review?(roc)
-  // a non-catastrophic failure, it will set a new task to continue loading
+  // a non-catestrophic failure, it will set a new task to continue loading

Don't add this typo :-)

I probably would have gone for MOZ_WEBM instead of MOZ_VP8, but it doesn't matter much.

With this patch, we return "maybe" for canPlayType("video/webm") and canPlayType("audio/webm"). I think we should return "probably". This is the point of WebM.

We need tests. Basically we need stuff like test_can_play_type_ogg.html.
Attached patch intial attempt at webm support (obsolete) — Splinter Review
Minor update to the previous patch to actually build and work with all the other patches applied.
Attachment #446431 - Attachment is obsolete: true
(In reply to comment #3)
> I probably would have gone for MOZ_WEBM instead of MOZ_VP8, but it doesn't
> matter much.

I used what bug 566247 defines so if cpearce changes that I'm happy to change to whatever.
Address review comments, add tests. I'll change to MOZ_WEBM when/if bug 566247 is updated.
Attachment #446464 - Attachment is obsolete: true
Attachment #446650 - Flags: review?(roc)
Attachment #446464 - Flags: review?(roc)
Attachment #446650 - Flags: review?(roc) → review+
Attached patch webm decoder (obsolete) — Splinter Review
Implements nsWebMDecoder. I haven't rebased on the latest bug 566247 yet so things are still guarded with MOZ_VP8 instead of MOZ_WEBM.
Attachment #446477 - Attachment is obsolete: true
Attachment #447245 - Flags: review?(kinetik)
+// Default framerate to use in frames per second for when
+// webm files don't include a framerate.
+#define DEFAULT_FRAMERATE 20.0

Can you make this comment clearer?  Maybe move the comment from where we use this value up to here as that one is better.  We won't ever have a framerate from WebM, and we don't use this value to determine when to show frames.  Does the value matter?  Maybe it could be something further away from common frame rates to make it obvious how little it matters.

In each of the webm callbacks:
+  nsMediaStream* stream = decoder->GetCurrentStream();

I assume this can't ever be NULL when we're calling into nestegg so we don't need a NULL check here?

+  mDecoder->SetDuration(duration / 1000000);

This'll assert as it needs to be called from the main thread, but the current function is running on the state machine thread.

+  for (PRUint32 i = 0; i < ntracks; ++i) {

I think we should pick a single video and audio track (the first each that we see?) and remember their indices, then in places like:

+  if (nestegg_track_type(mContext, track) != NESTEGG_TRACK_AUDIO) {

We should test against the track index rather than just whether it's audio or video.  That should allow things to work correctly in the presence of multiple tracks, as nestegg_read_packet could produce data for multiple video or audio tracks.

As a side note, there's probably bit of work required in nestegg to expose more track metadata before we could handle multiple video/audio tracks correctly.  For example, exposing the enabled and default flags in the WebM metadata.

+      mInfo.mPicture.width = params.crop_right - params.crop_left;
+      mInfo.mPicture.height = params.crop_bottom - params.crop_top;

This needs to subtract the crop areas from the frame width and height, e.g. width - crop_right - crop_left.

+      mInfo.mPixelAspectRatio = float(params.display_width) / params.width;

This should be:

mInfo.mPixelAspectRatio = (float(params.display_width) / params.width) /
                          (float(params.display_height) / params.height);

+    }
+    if (type == NESTEGG_TRACK_AUDIO) {

else if?

+      mInfo.mAudioRate = params.rate;
+      mInfo.mAudioChannels = params.channels;
+      mChannels = params.channels;

I think we should use the values from Vorbis once we've set up the decoder as they must be the correct values--the ones in the WebM metadata could be wrong.  It was probably a mistake to expose them in the nestegg API.  It's just a matter of using mVorbisDsp.vi->rate and mVorbisDsp.vi->channels once we've initialized the decoder.

The same applies to the frame width and height in the WebM file--we should either use the values from the VP8 decoder or at least assert that they match.  It's slightly trickier to get the values as we'd need to feed the decoder a complete frame of data and peek into the decode stream, so we can leave this change for another bug.

+        opacket.packet = data;
+        opacket.bytes = length;
+        opacket.e_o_s = 0;
+        opacket.b_o_s = header == 0;
+        opacket.granulepos = 0;
+        opacket.packetno = mPacketCount++;

Maybe move this (and the other copy of this code) into a small function to manufacture ogg_packets?

+        SoundData* s = new SoundData(0, //mPageOffset,

Unused code/comment.

As a general comment, it'd be nice to share the Vorbis decode logic between backends, but it looks like that'd require a bunch of refactoring work, and may not even be worth the effort.  Maybe sometime in the future.

+    vpx_codec_stream_info_t si;
+    si.sz = sizeof(si);
+    si.w = si.h = 0;
+    si.is_kf = 0;

Can this be memset(0) + sizeof?

+    while((img = vpx_codec_get_frame(&mVP8, &iter))) {

Maybe check that the image format is what we expect (4:2:0)?

+      b.mPlanes[1].mHeight = img->d_h >> 1;
+      b.mPlanes[1].mWidth = img->d_w >> 1;

If d_h/d_w are odd, do these values round down or up?

+      VideoData *v = VideoData::Create(mInfo,
+                                       mDecoder->GetImageContainer(),
+                                       0, /* mPageOffset */

Unused code/comment.

+                                       tstamp_ms,
+                                       b,
+                                       si.is_kf,
+                                       tstamp_ms);

Since the WebM reader doesn't use the frame timecodes, can we just set it to -1?  Maybe do the same thing for the byte offsets.  That makes it clear that they're invalid values.

+      if (video && video->mTime + 40 < aTarget) {

Oops, that + 40 is a leftover hack.  It needs to be the actual frame duration.

+  // libnestegg context for webm container. Access on state machine thread only.

This comment and:

+PRBool nsWebMReader::DecodeAudioData()
+{
+  MonitorAutoEnter mon(mMonitor);
+  NS_ASSERTION(mDecoder->OnStateMachineThread() || mDecoder->OnDecodeThread(),
+               "Should be on playback or decode thread.");

This assertion disagree.  I think the comment is wrong--it's fine to access mContext on another thread, as long as mContext is accessed on only a single thread at a time, which I assume our existing locking takes care of.

+               "Should be on state machine or AV thread.");

"or decode thread"
+      VideoData *v = VideoData::Create(mInfo,
+                                       mDecoder->GetImageContainer(),
+                                       0, /* mPageOffset */
+                                       tstamp_ms,
+                                       b,
+                                       si.is_kf,
+                                       tstamp_ms);

NULL check and error out.
Comment on attachment 447245 [details] [diff] [review]
webm decoder


>+  uint64_t duration = 0;
>+  r = nestegg_duration(mContext, &duration);
>+  if (r == -1) {
>+    Cleanup();
>+    return NS_ERROR_FAILURE;
>+  }
>+  mDecoder->SetDuration(duration / 1000000);

Should we fail if the media doesn't contain a duration? If we do we can't play live streams, such as http://195.10.10.75:8800/live.webm . Perhaps just set the duration if r!=-1, and otherwise ignore the error?
Updated to use MOZ_WEBM
Attachment #446650 - Attachment is obsolete: true
Attachment #447271 - Flags: review+
(In reply to comment #10)
> Should we fail if the media doesn't contain a duration? If we do we can't play
> live streams, such as http://195.10.10.75:8800/live.webm . Perhaps just set the
> duration if r!=-1, and otherwise ignore the error?

Please maintain the live use case - it's actually super-useful and people are starting to build products around it.
(In reply to comment #8)
> 
> +      if (video && video->mTime + 40 < aTarget) {
> 
> Oops, that + 40 is a leftover hack.  It needs to be the actual frame duration.

I'm not sure what the correct fix is here since you wrote the seeking code. Can you be more explicit about what the fix is, or provide the fix.
(In reply to comment #13)
> I'm not sure what the correct fix is here since you wrote the seeking code. Can
> you be more explicit about what the fix is, or provide the fix.

In the Ogg code, this test is |video && video->mTime + mCallbackPeriod < aTarget)|.  We can't rely on mCallbackPeriod here, so I think we need to decode ahead and work out the time between frames.  It's a little bit fiddly.  Maybe we should spin it off into another bug--I can work on the fix if you like.

I just noticed this:
+  for (PRUint32 i = 0; i < count; ++i) {
[...]
+    uint64_t tstamp = 0;
+    r = nestegg_packet_tstamp(aPacket, &tstamp);
[...]
+        PRInt64 duration = samples * 1000 / mVorbisDsp.vi->rate;
+        SoundData* s = new SoundData(0, //mPageOffset,
+                                     tstamp / 1000000,
+                                     duration,
+                                     samples,
+                                     buffer,
+                                     mChannels);

We only need to call packet_stamp once here (outside the loop), as the stamp is per packet, not per chunk of data inside the packet (one of the slightly unfortunate design aspects of WebM).  Which also means that the time we pass to SoundData here (tstamp / 1000000) needs to be calculated more accurately as we decode audio, e.g. something like tstamp plus the sum of the durations of the previously decoded chunks for this call.
(In reply to comment #14)
> Which also means that the time we pass to
> SoundData here (tstamp / 1000000) needs to be calculated more accurately as we
> decode audio, e.g. something like tstamp plus the sum of the durations of the
> previously decoded chunks for this call.

Or we could just queue all of the decoded audio from this packet as a single SoundData.  The problem as it is now is that we can have multiple SoundData instances from a single packet in the queue, but since they all have the same tstamp, mAudioQueue.Duration() will return 0 (assuming there aren't SoundData instances from a later packet already queued).
+        mInfo.mCallbackPeriod = 1.0 / DEFAULT_FRAMERATE;

Should be 1000 / DEFAULT_FRAMERATE.
Blocks: 567491
(In reply to comment #14)
> In the Ogg code, this test is |video && video->mTime + mCallbackPeriod <
> aTarget)|.  We can't rely on mCallbackPeriod here, so I think we need to decode
> ahead and work out the time between frames.  It's a little bit fiddly.  Maybe
> we should spin it off into another bug--I can work on the fix if you like.

Bug 568162.
Blocks: 568162
Bug 568431 tracks removing assumptions about the callback period from the shared decoder logic.
Attached patch Adds a webm test (obsolete) — Splinter Review
Attachment #447933 - Flags: review?(kinetik)
Attached patch Adds a webm test (obsolete) — Splinter Review
Forgot the binary test file in the last patch.
Attachment #447933 - Attachment is obsolete: true
Attachment #447934 - Flags: review?(kinetik)
Attachment #447933 - Flags: review?(kinetik)
Attached patch webm decoder (obsolete) — Splinter Review
Addresses review comments.
Attachment #447245 - Attachment is obsolete: true
Attachment #447936 - Flags: review?(kinetik)
Attachment #447245 - Flags: review?(kinetik)
Comment on attachment 447934 [details] [diff] [review]
Adds a webm test

Might be worth starting a new block of _TEST_FILES for WebM files like we have for Ogg and Wave, but it's not a big deal.
Attachment #447934 - Flags: review?(kinetik) → review+
Comment on attachment 447936 [details] [diff] [review]
webm decoder

This is mostly nitpicky stuff:

+nsresult nsWebMReader::Init() {

Braces riding up.

+  // time when no vorbis data has beeni read.

Typo.

+  for (PRUint32 i = 0; i < ntracks; ++i) {

Maybe rename i to track?  Because the loop body is so big, it'd be easier to read with a better name than i.

+    int id = nestegg_track_codec_id(mContext, i);
+    if (id == -1) {

If a WebM file has a track we can't play and a track we can (say someone violates the spec and encodes Theora+Vorbis in WebM), should we play the Vorbis or reject the file completely?  This is a spec issue, I'll raise it on webm-discuss.

+        ogg_packet opacket;
+        InitOggPacket(&opacket, data, length, header == 0, PR_FALSE, 0);
+

I'd like to suggested:

    ogg_packet opacket = MakeOggPacket(data, length, header == 0, PR_FALSE, 0);

But it's mostly personal preference, so if you prefer the current code then leave it as is.

+      mInfo.mAudioRate = mVorbisDsp.vi->rate;
+      mInfo.mAudioChannels = mVorbisDsp.vi->channels;
+      mChannels = mInfo.mAudioChannels;
+      NS_ASSERTION(mInfo.mPicture.width == static_cast<PRInt32>(img->d_w), 
+        "WebM picture width from header does not match decoded frame");
+      NS_ASSERTION(mInfo.mPicture.height == static_cast<PRInt32>(img->d_h),
+        "WebM picture height from header does not match decoded frame");

Cool, thanks.  It make sense to trust the codec data over the container data for this, but it's not clear what the behaviour should be when they disagree (try to work by using the codec data, or reject the file).  I'll bring this up as a spec issue on webm-discuss.

+nestegg_packet* nsWebMReader::NextPacket(TrackType aTrackType)

One issue I thought of with this approach is that we don't know when a track has ended.  For instance, imagine a video with a single audio packet at the beginning and a large number of video packets.  If we try to read a second audio packet (which doesn't exist in the stream), we'll read and buffer all of the video packets until EOS before we can be sure there are no more audio packets.  I think nestegg's API needs to change to solve this, but I'm not actually sure how to (or if it's possible) to detect a track has ended early in a WebM file anyway.

The Ogg backend doesn't have this problem in the general case as there is a flag signaling end of stream for a logical stream.  It does have the same problem if the EOS flag is missing, though.

It's a general problem with both backends if the file is badly muxed and has huge chunks of video data between audio packets (or vice versa), but I'm not sure that it comes up in practice, or if we care about the performance of such badly created files.
Attachment #447936 - Flags: review?(kinetik) → review+
+  if (nestegg_track_type(mContext, track) != NESTEGG_TRACK_VIDEO) {
+    mAudioPackets.Push(packet);
+    return PR_TRUE;
+  }
+

This block can be removed now.
Attached patch Adds a webm testSplinter Review
Rebased to trunk
Attachment #447934 - Attachment is obsolete: true
Attachment #448321 - Flags: review+
Rebased to trunk.
Attachment #447271 - Attachment is obsolete: true
Attachment #448322 - Flags: review+
Attached patch webm decoderSplinter Review
Address review comments.
Attachment #447936 - Attachment is obsolete: true
Attachment #448326 - Flags: review+
Blocks: 569530
We want WebM support in alpha 5, marking as a blocker.
blocking2.0: --- → alpha5+
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.