Closed Bug 1131563 Opened 9 years ago Closed 9 years ago

Only a few frames played in WebM video (Android 2.3/3.0/4.0+)

Categories

(Core :: Audio/Video, defect)

ARM
Android
defect
Not set
major

Tracking

()

VERIFIED FIXED
mozilla39
Tracking Status
firefox36 --- wontfix
firefox37 + wontfix
firefox38 + verified
firefox39 + verified
relnote-firefox --- 36+
fennec 37+ ---

People

(Reporter: ioana.chiorean, Assigned: bechen)

References

Details

(Keywords: regression)

Attachments

(6 files, 2 obsolete files)

36 Beta 8, 37.0a2, 38.0a1
Acer Iconia A500 - Android 3.2.1

Steps:
1. Load  http://base-n.de/webm/out9.webm
2. Play it.

Expected Results:
- video should be played without any issues

Actual Results:
- Video stops after 1-2s 

Note:
- reproducing it only on Acer Iconia A500 - Android 3.2.1
- video: Webm on honeycomb: http://youtu.be/xBO0dBCPDAM
This is also reproducible on Samsung Galaxy R (Android 2.3.4).
Component: Audio/Video → Video/Audio
Product: Firefox for Android → Core
Summary: [Honeycomb] Webm Video plays for a sec only → Only a few frames played in WebM video (Android 2.3/3.0)
Version: Firefox 36 → unspecified
There is a Galaxy R in MV. Is there anyone there that can look at that?
tracking-fennec: --- → ?
Flags: needinfo?(ioana.chiorean)
Could not find reliable regression range because of the crashing builds:
last good build: 2014-12-19
first bad build: 2015-01-01
The builds in between those two are crashing while accessing the url: http://base-n.de/webm/out9.webm
Flags: needinfo?(ioana.chiorean)
Managed to find the regression range on Acer Iconia A500
good: 2014-12-22
bad: 2014-12-23

http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=b915a50bc6be&tochange=0532f2509f3f
Thanks for the bug report! Given that this affects beta, this is pretty urgent.

Can someone who can reproduce this generate a log with the following environmental variables set:

MEDIA_LOG_SAMPLES=1 NSPR_LOG_MODULES="MediaDecoder:5,MediaSource:5,MediaPromise:5,MP4Demuxer:5,Nestegg:5"

And attach the logcat to this bug?

See https://wiki.mozilla.org/Mobile/Fennec/Android#PR_Logging for instructions on generating NSPR logs on android.
Flags: needinfo?(ioana.chiorean)
Attached file log1.txt.zip
Here's the log file. I hope it is useful.
Flags: needinfo?(ioana.chiorean)
Bobby, working under the assumption that this is  your regression, do you have a 2.3 or 3.0 device to test against? We're coming up on the end of the 36 beta period
Assignee: nobody → bobbyholley
tracking-fennec: ? → 36+
Flags: needinfo?(bobbyholley)
(In reply to Brad Lassey [:blassey] (use needinfo?) from comment #9)
> Bobby, working under the assumption that this is  your regression, do you
> have a 2.3 or 3.0 device to test against? We're coming up on the end of the
> 36 beta period

It does not reproduce on my personal Lollipop Nexus 5, so a device might be good, yes. I'm low on bandwidth at the moment, so this is definitely at-risk for shipping.

In the mean time, I've put together a try build with some extra logging to try to narrow down what's going on:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=ff905112891b

Iona, this build should be a vanilla mozilla-beta build with some extra logging. Can you give it a try on your device and post the logcat?

Another thing that would be useful would be to bisect to the exact changeset that caused the failure. I'm guessing it's going to be one of the pieces of bug 1109437.
Flags: needinfo?(bobbyholley)
Flags: needinfo?(ioana.chiorean)
Attached file log.txt
This is logged when tapping on 'play':
/Gecko   ( 2922): Returning false at /builds/slave/try-and-api-11-000000000000000/build/dom/media/webm/SoftwareWebMVideoDecoder.cpp:94
Flags: needinfo?(ioana.chiorean)
(In reply to Catalin Suciu from comment #11)
> Created attachment 8564083 [details]
> log.txt
> 
> This is logged when tapping on 'play':
> /Gecko   ( 2922): Returning false at
> /builds/slave/try-and-api-11-000000000000000/build/dom/media/webm/
> SoftwareWebMVideoDecoder.cpp:94

Thanks, that's extremely helpful.

I've added more logging in this build: https://treeherder.mozilla.org/#/jobs?repo=try&revision=2c19caf517c8

Can you reproduce the problem a second time and post the new log?

I've also copied the video to a different server. Can you see if the problem reproduces there as well? That will help us rule out funny network behavior: http://people.mozilla.org/~bholley/out9.webm
Flags: needinfo?(catalin.suciu)
(In reply to Bobby Holley (Busy with media, don't ask for DOM/JS/XPConnect things) from comment #12)
> I've added more logging in this build:
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=2c19caf517c8

This build was busted. Trying again:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=0f9eecf5d051
The bug is also reproducible on Samsung Galaxy Tab running Android 4.0.4
Sylvestre I take it this is wontfix 36 with a release-note.
Flags: needinfo?(sledru)
Sure, added to the release notes with "On Android 2.3/3.0, some WebM videos cannot be read properly" as wording.
Tracking in 37 & 38
Flags: needinfo?(sledru)
(In reply to Sylvestre Ledru [:sylvestre] from comment #17)
> Sure, added to the release notes with "On Android 2.3/3.0, some WebM videos
> cannot be read properly" as wording.

Maybe this needs an update since the bug is reproducible also on Android 4.0 (see comment#15)
(In reply to Catalin Suciu from comment #18)
> (In reply to Sylvestre Ledru [:sylvestre] from comment #17)
> > Sure, added to the release notes with "On Android 2.3/3.0, some WebM videos
> > cannot be read properly" as wording.
> 
> Maybe this needs an update since the bug is reproducible also on Android 4.0
> (see comment#15)

OK. Updated.
Well, that escalates things then.
Severity: normal → major
Summary: Only a few frames played in WebM video (Android 2.3/3.0) → Only a few frames played in WebM video (Android 2.3/3.0/4.0+)
Flags: in-testsuite?
So, the problem is that we're activating the skip-to-keyframe logic, and this 10s video only has one keyframe (at the very beginning). So we just scoot all the way through the video frames and declare the video done.

Presumably this is a regression from bug 1091992, which redesigned the logic for keyframe skipping and introduced MediaDecoderStateMachine::NeedToSkipToNextKeyframe. So Aaron was spot-on.

Chris, do you have thoughts on what, in an ideal world, would prevent us from activating the keyframe skipping logic in this case?
Blocks: 1091992
Flags: needinfo?(cpearce)
Not sure if this matters but Firefox for Android decided on a different keyframe behavior than FxOS. FxOS seeks to keyframes where as Firefox for Android wanted accurate scrubbing.
We should make the same fix to WebMReader that I made to MP4Reader in the MSE case; ignore the skip-to-next-keyframe from the MediaDecoderStateMachine, and engage the skip-to-next-keyframe if the current playback position is after the timestamp of the next keyframe. Then we will only drop frames we cannot play.
(In reply to Chris Pearce (:cpearce) from comment #23)
> We should make the same fix to WebMReader that I made to MP4Reader in the
> MSE case; ignore the skip-to-next-keyframe from the
> MediaDecoderStateMachine, and engage the skip-to-next-keyframe if the
> current playback position is after the timestamp of the next keyframe. Then
> we will only drop frames we cannot play.

Makes sense. We'd need support in WebMReader for determining the next keyframe time, but that shouldn't be too hard - WebMReader already buffers ahead of the packet it reads in certain cases, so we just need to read packets into m{Audio,Video}Packets until we find one that's in a keyframe (determined via vpx_codec_peek_stream_info).

Anyway, this is bechen's bug - over to him.
Assignee: bobbyholley → bechen
Flags: needinfo?(bechen)
Flags: needinfo?(cpearce)
bechen - this patch applies to ff36, and accurately simulates the problem on
Desktop builds, so you can avoid worrying about android specifically. :-)
Flags: needinfo?(bechen)
See Also: → 1123498
tracking-fennec: 36+ → 37+
Patch v01, in this patch, I add a new function ShouldSkipVideoFrame which tries to overwrite the aKeyframeSkip if aKeyframeSkip is true.

ShouldSkipVideoFrame function:
1. skip the video frames which's timestamp is less than aTimeThreshold
2  try to find a keyframe in the next 5 seconds. (aTimeThreshold~ aTimeThreshold+5)

If we find a keyframe in 5 seconds, then we drop the other frames before it.
Otherwise if we can't find a keyframe in 5 seconds, we must restore the whole frames and set the flag aKeyframeSkip to false.

I'm not sure 5 seconds is a proper number...
Attachment #8571260 - Flags: feedback?(cpearce)
cpearce - This bug is fairly urgent. It has been waiting on feedback from you for two weeks. Do you have time to review the patch today? If not, can you recommend someone else to provide feedback?
Flags: needinfo?(cpearce)
Comment on attachment 8571260 [details] [diff] [review]
bug-1131563-skipkeyframe.v01.patch

Review of attachment 8571260 [details] [diff] [review]:
-----------------------------------------------------------------

Matthew Gregan is our WebM expert, he can look over this.
Attachment #8571260 - Flags: feedback?(cpearce) → feedback?(kinetik)
Flags: needinfo?(cpearce)
Comment on attachment 8571260 [details] [diff] [review]
bug-1131563-skipkeyframe.v01.patch

Review of attachment 8571260 [details] [diff] [review]:
-----------------------------------------------------------------

ShouldSkipVideoFrame ends up being pretty big and complex.  Can you please take a look at Chris's patch in bug 1123498 and try to split this up the same way?  i.e. separate finding the timestamp of the next keyframe from the decision about skipping keyframes.

::: dom/media/webm/WebMReader.cpp
@@ +1034,5 @@
> +    // push back the last packet because it is a keyframe packet.
> +    PushVideoPacket(tempQueue.PopFront());
> +    // Notify the decoder we drop some frames.
> +    if (mDecoder) {
> +      mDecoder->NotifyDecodedFrames(tempQueue.GetSize(), 0);

This seems to expect 3 arguments: parsed, decoded, dropped.  And shouldn't we be counting these as dropped?
Attachment #8571260 - Flags: feedback?(kinetik) → feedback-
v02 patch.
The basic logic is the same to v01 patch:
1. filter out the packets by aTimeThreshold
2. find the keyframe timestamp

Also I remove the drop packet code.
Attachment #8571260 - Attachment is obsolete: true
Attachment #8579241 - Flags: review?(kinetik)
ni Anthony for awareness. The last mobile 37 Beta build is scheduled for Mon, Mar 23.
Flags: needinfo?(ajones)
Lawrence - looks like Benjamin's patch is moving things in the right direction. If we're falling down the skip to next keyframe logic then we're struggling to keep up anyway.
Comment on attachment 8579241 [details] [diff] [review]
bug-1131563-skipkeyframe.v02.patch

Review of attachment 8579241 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for reworking this!

::: dom/media/webm/WebMReader.cpp
@@ +981,5 @@
> +    return -1;
> +  }
> +
> +  // Find keyframe.
> +  int r;

Move this to where it's first used.

@@ +983,5 @@
> +
> +  // Find keyframe.
> +  int r;
> +  uint64_t tstamp = 0;
> +  uint64_t tstamp_usecs;

Ditto for these.

@@ +984,5 @@
> +  // Find keyframe.
> +  int r;
> +  uint64_t tstamp = 0;
> +  uint64_t tstamp_usecs;
> +  bool endloop = false;

Call this foundKeyframe or similar?

@@ +985,5 @@
> +  int r;
> +  uint64_t tstamp = 0;
> +  uint64_t tstamp_usecs;
> +  bool endloop = false;
> +  CheckedInt64 keyframeTime = -1;

I don't think we need CheckedInt64 here, since we only ever divide a positive value by a positive constant.  Also, the CheckedInt64 default-initializes to valid, so at the end of this function we'd return via the isValid() path rather than the final return, which could be confusing if someone modifies this code later without noticing.

@@ +986,5 @@
> +  uint64_t tstamp = 0;
> +  uint64_t tstamp_usecs;
> +  bool endloop = false;
> +  CheckedInt64 keyframeTime = -1;
> +  while (true) {

Use while (!foundKeyframe) here and remove the if at the end of the loop.

@@ +1048,5 @@
> +  int64_t nextKeyframeTime = GetNextKeyframeTime(aTimeThreshold);
> +  if (nextKeyframeTime == -1) {
> +    return false;
> +  }
> +  return true;

Replace the body of this function with:

    return GetNextKeyframeTime(aTimeThreshold) != -1;

::: dom/media/webm/WebMReader.h
@@ +220,2 @@
>  private:
> +  // Get the keyframe's timestamp which greater than aTimeThreshold.

Maybe "Get the timestamp of the next keyframe greater than aTimeThreshold"?
Attachment #8579241 - Flags: review?(kinetik) → review+
Flags: needinfo?(ajones)
Keywords: checkin-needed
We're now too late to take this fix in 37 but it looks like we should have a fix in 38.
https://hg.mozilla.org/mozilla-central/rev/af2ce13b4a6b
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla39
Benjamin, can we have an uplift request to 38 (aurora)? thanks
Flags: needinfo?(bechen)
Flags: needinfo?(bechen)
Comment on attachment 8581402 [details] [diff] [review]
bug-1131563.aurora.patch

Approval Request Comment
[Feature/regressing bug #]: 1091992
[User impact if declined]: Some video clips which have few Key-frame might not be played well.
[Describe test coverage new/current, TreeHerder]: Manaul test http://people.mozilla.org/~bholley/out9.webm
[Risks and why]: Low, change the logic about skip-to-next-keyframe for webm.
[String/UUID change made/needed]: none
Attachment #8581402 - Flags: approval-mozilla-aurora?
Attachment #8581402 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Verified as fixed in build 39.0a1 (2015-03-23);
Device: Samsung Galaxy R (Android 2.3.4)
Verified as fixed in build 38.0a2 (2015-03-26);
Device: Samsung Galaxy R (Android 2.3.4)
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.