Redesign buffering mode for MSE

RESOLVED FIXED in Firefox 36

Status

()

P1
normal
RESOLVED FIXED
4 years ago
4 years ago

People

(Reporter: bholley, Assigned: bholley)

Tracking

(Blocks: 1 bug)

unspecified
mozilla37
x86
Mac OS X
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox36 fixed, firefox37 fixed)

Details

Attachments

(7 attachments)

(Assignee)

Description

4 years ago
In bug 1093020 comment 5 and bug 1093020 comment 9, we discussed the need for an improved buffering mode mechanism for MSE.

With the current polling-based architecture, we have a delay of up to 1 second between receiving the required data and switching out of buffering mode. This is suboptimal, and especially important because play start time is one of the most important metrics that certain sites like Facebook care about.

The solution here is to create a new asynchronous callback-based buffering mode for situations like MSE where we have perfect data. This should be significantly easier to do now that we have MediaPromises.
(Assignee)

Comment 1

4 years ago
Sorry for the delay here - I keep getting sidetracked with DOM/JS/XPConnect/Wedding things. I'm going to make a more concerted effort to ignore those things now (see my bugzilla name).

So anyway - this afternoon I finally sat down and wrote up patches for this. They seem to work great so far - stuff is super zippy now, and buffering wait delays are now imperceptible in some cases. Previously they were gated at a the 1s state machine polling latency, and now we get delays like 0.016s.

I need to rebase and do a bit more testing, then I'll push to try.
(Assignee)

Comment 2

4 years ago
I've also confirmed that this eliminates the brief buffering lag when seeking backwards to already-downloaded data. \o/

https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=8e8dcbbeb399
(Assignee)

Comment 5

4 years ago
Created attachment 8538769 [details] [diff] [review]
Part 1 - Remove MOZ_COUNT_{C,D}TOR on refcounted promise classes. v1

These were accidental and redundant, because refcounted classes get this behavior
automatically. And this is very lucky, because it turns out that our MOZ_COUNT_*
infrastructure can't handle varying-sized instances identified with the same
string, which is exactly what we can get with these templated types.

The only remaining use of these macros is on the non-templated ThenValueBase,
which is happily not variable-sized. \o/
Attachment #8538769 - Flags: review?(cpearce)
(Assignee)

Comment 6

4 years ago
Created attachment 8538770 [details] [diff] [review]
Part 2 - Generalize GetBufferingWait to UseBufferingHeuristics. v1
Attachment #8538770 - Flags: review?(cpearce)
(Assignee)

Comment 7

4 years ago
Created attachment 8538771 [details] [diff] [review]
Part 3 - Stop using buffering heuristics when they're not requested. v1

Given that we set the buffering wait to 0 in this case already, the only practical
impact on our behavior of this change is that we'll no longer ping-pong between
states.
Attachment #8538771 - Flags: review?(cpearce)
(Assignee)

Comment 8

4 years ago
Created attachment 8538772 [details] [diff] [review]
Part 4 - Switch m{Audio,Video}RequestPending to a tri-state. v1
Attachment #8538772 - Flags: review?(cpearce)
(Assignee)

Comment 9

4 years ago
Created attachment 8538773 [details] [diff] [review]
Part 5 - Schedule the state machine thread in MediaSourceDecoder::Load. v1

This is necessary to get the state machine moving after it's initialized.
MediaDecoder::Load does this, but it looks like we missed this in the override.
So the current code relies on the ScheduleStateMachine call at the end of
TrackBuffer::AppendData to get things rolling. We're going to be removing that
call, so we need to fix this.
Attachment #8538773 - Flags: review?(cpearce)
(Assignee)

Comment 10

4 years ago
Created attachment 8538774 [details] [diff] [review]
Part 6 - Implement non-polling buffering. v1
Attachment #8538774 - Flags: review?(cpearce)
Attachment #8538769 - Flags: review?(cpearce) → review+
Comment on attachment 8538771 [details] [diff] [review]
Part 3 - Stop using buffering heuristics when they're not requested. v1

Review of attachment 8538771 [details] [diff] [review]:
-----------------------------------------------------------------

Some initial comments...

::: dom/media/MediaDecoderStateMachine.cpp
@@ +2621,5 @@
>  
> +      // When we enter buffering mode from playback, we push our most recent
> +      // video frame back into the queue. So depending on how we started
> +      // buffering, we may have one dummy frame in the queue. :-(
> +      bool outOfAudio = IsAudioDecoding() && !AudioQueue().IsFinished() && AudioQueue().GetSize() == 0;

The logic in AdvanceFrame() that causes us to enter buffering state when HasLowDecodedData() returns true. HasLowDecodedData() can return true when we still have a non-zero amount of decoded data remaining. So you may not actually be out of decoded data the first time you enter this case... Which means you'll jump back into StartDecoding() on line 2655, and then probably flop back into buffering state.

In the !UseBufferingHeuristics() case, I think you should try making the logic in AdvanceFrame() only enter buffering state when we've completely exhausted data. You could add a !UseBufferingHeuristics() branch to HasLowDecodedData() that only returns true if data is exhausted for example.

@@ +2909,5 @@
>    if (mState == DECODER_STATE_DECODING &&
>        mDecoder->GetState() == MediaDecoder::PLAY_STATE_PLAYING &&
>        HasLowDecodedData(remainingTime + EXHAUSTED_DATA_MARGIN_USECS) &&
>        mDecoder->IsExpectingMoreData()) {
> +    if (!mReader->UseBufferingHeuristics() || JustExitedQuickBuffering() || HasLowUndecodedData()) {

Do you need to add this check? What if the CPU is too slow to decode the video in real time? Then HasLowDecodedData() can return true consistently or intermittently; keyframes can be slow to decode in software. If HasLowDecodedData() does return true due to the decode falling behind, because we don't check HasLowUndecodedData() here we'll enter buffering state (when playing MSE content at least). This will pause playback. I'd guess we'd then immediately flop out of buffering state (since we'll have buffered data to decode), and then repeat conditions and soon after flip back in... This would cause glitchy playback.

This /probably/ isn't a problem for MP4, since the MP4Reader is async, and so since HasLowDecodedData only checks whether audio is falling behind that's less likely to happen for an async reader, but for WebM content (which still uses the synchronous WebMReader) this could be a problem.

Can you test without this added? You may be able to `nice` your firefox process to simulate a low end CPU. I usually test on a Windows Atom tablet or my old netbook when I need to test low end hardware.

Since MP4 /should not/ be affected by this, if this check is required, you can land with this change in your patch if you file a followup for someone to investigate this before we turn on WebM MSE.

Please do test your patches with this addition removed first though.
Attachment #8538773 - Flags: review?(cpearce) → review+
Comment on attachment 8538770 [details] [diff] [review]
Part 2 - Generalize GetBufferingWait to UseBufferingHeuristics. v1

Review of attachment 8538770 [details] [diff] [review]:
-----------------------------------------------------------------

I don't think you need the mBufferingWait change, but clearly the UseBufferingHeuristics() change is mandatory.

::: dom/media/MediaDecoderStateMachine.cpp
@@ -220,5 @@
>  
>    mAmpleVideoFrames =
>      std::max<uint32_t>(Preferences::GetUint("media.video-queue.default-size", 10), 3);
>  
> -  mBufferingWait = mScheduler->IsRealTime() ? 0 : mReader->GetBufferingWait();

Isn't mBufferingWait only touched on the !UseBufferingHeuristics path? You don't need this change right?
Attachment #8538770 - Flags: review?(cpearce) → review+
Attachment #8538772 - Flags: review?(cpearce) → review+
Comment on attachment 8538774 [details] [diff] [review]
Part 6 - Implement non-polling buffering. v1

Review of attachment 8538774 [details] [diff] [review]:
-----------------------------------------------------------------

Nice.
Attachment #8538774 - Flags: review?(cpearce) → review+
Priority: -- → P1
(Assignee)

Comment 14

4 years ago
(In reply to Chris Pearce (:cpearce) from comment #12)
> Isn't mBufferingWait only touched on the !UseBufferingHeuristics path? You
> don't need this change right?

Well, I need to change the line anyway, because GetBufferingWait is going away - so I figured it would be more sensible to set the value to 0 if it wasn't in use rather than leaving it at 30. I can leave it at 30 though if you like to make the logic cleaner.
(Assignee)

Comment 15

4 years ago
Created attachment 8538980 [details] [diff] [review]
Part 7 - Only switch to buffering mode when the reader is waiting for data. v1

This serves as the cleanup for the issues discussed in part 3. They can't land
together, because this depends on machinery introduced in part 6. This and
part 3 should be reviewed together.

I've tested with a busywait in the MP4 video PDM, and this successfully avoids
jerky switches into buffering mode when the decoder is slow.
Attachment #8538980 - Flags: review?(cpearce)
Comment on attachment 8538980 [details] [diff] [review]
Part 7 - Only switch to buffering mode when the reader is waiting for data. v1

Review of attachment 8538980 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/media/MediaDecoderStateMachine.cpp
@@ +2916,5 @@
> +      shouldBuffer = HasLowDecodedData(remainingTime + EXHAUSTED_DATA_MARGIN_USECS) &&
> +                     (JustExitedQuickBuffering() || HasLowUndecodedData());
> +    } else {
> +      MOZ_ASSERT(mReader->IsWaitForDataSupported());
> +      shouldBuffer = (IsVideoDecoding() && !VideoQueue().GetSize() && mVideoRequestStatus == RequestStatus::Waiting) ||

Can't you use OutOfDecodedVideo() etc here too?
Attachment #8538980 - Flags: review?(cpearce) → review+
Comment on attachment 8538771 [details] [diff] [review]
Part 3 - Stop using buffering heuristics when they're not requested. v1

Review of attachment 8538771 [details] [diff] [review]:
-----------------------------------------------------------------

r=cpearce in the presence of Part 7.
Attachment #8538771 - Flags: review?(cpearce) → review+
(Assignee)

Comment 18

4 years ago
(In reply to Chris Pearce (:cpearce) from comment #16)
> Can't you use OutOfDecodedVideo() etc here too?

Doh! Yes, that's precisely why I added them - I just forgot to go back and update the original code I wrote in the prototype. Thanks for catching that, and thanks for the quick reviews. Have a good vacation!
(Assignee)

Comment 20

4 years ago
(In reply to Bobby Holley (Busy with media, don't ask for DOM/JS/XPConnect things) from comment #19)
> https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=1685c1fdafc7

I had the assertion there backwards:

https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=89907c1f1b9f
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=b4ea5989dd1f

It's a bit too early to tell from the last 3 try runs, but it's possible that this push may increase the intermittent failure rates of certain tests in dom/media/mediasource/test (test_BufferingWait, test_WaitingForMissingData) on android. If it does, please just disable the test on the relevant platforms. Those tests are going to be intermittent until we fix bug 1093133, and these patches are very important for playback experience.

remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/0b8a156dcff0
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/65d172e7a1cf
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/ec275f65c676
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/72f171ec04ba
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/4c38dda06880
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/e6350e4fb18b
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/dc45e97d102d
(Assignee)

Comment 21

4 years ago
(In reply to Bobby Holley (Busy with media, don't ask for DOM/JS/XPConnect things) from comment #20)
> and these patches are very important for playback experience.

(and time here is important, especially given Sheila's recent call to everyone@ for people to dogfood MSE)
(Assignee)

Comment 23

4 years ago
(In reply to Ryan VanderMeulen [:RyanVM UTC-5] from comment #22)
> https://treeherder.mozilla.org/ui/logviewer.html#?job_id=4829269&repo=mozilla-inbound

I think this one is unrelated.

https://treeherder.mozilla.org/ui/logviewer.html#?job_id=4830470&repo=mozilla-inbound

This is related, given the callstack.

Doing some retriggers to get to the bottom of this: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=57addcbadd01
(Assignee)

Comment 24

4 years ago
Also, can you check once all the results come in whether there were any other instances of suspicious stuff between the landing and the backout? Your eyes are much better at this stuff than mine. :-)
Flags: needinfo?(ryanvm)
https://treeherder.mozilla.org/ui/logviewer.html#?job_id=4834000&repo=mozilla-inbound was another one, yes. Looks the same as the one you said is related :)

Beyond that, I've been away the last few hours and it's honestly hard to find them if you aren't catching them as they come in. Sorry :(
Flags: needinfo?(ryanvm)
Anyway, I'll file that Android one as a separate bug.
(Assignee)

Updated

4 years ago
Depends on: 1114383
(Assignee)

Comment 30

4 years ago
(In reply to Bobby Holley (Busy with media, don't ask for DOM/JS/XPConnect things) from comment #29)
> https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=85d5f96cad85

Looks like bug 1114383 fixed the crashes. \o/

Bumping this for emphasis:
(In reply to Bobby Holley (Busy with media, don't ask for DOM/JS/XPConnect things) from comment #20)
> It's a bit too early to tell from the last 3 try runs, but it's possible
> that this push may increase the intermittent failure rates of certain tests
> in dom/media/mediasource/test (test_BufferingWait,
> test_WaitingForMissingData) on android. If it does, please just disable the
> test on the relevant platforms. Those tests are going to be intermittent
> until we fix bug 1093133, and these patches are very important for playback
> experience.

remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/9072cf5c5bb8
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/ef08b3ad6fe0
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/39b086aebde1
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/b701c02ed8e3
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/87af4078b4be
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/9b8e6287c207
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/cb95d7109ba9
(Assignee)

Updated

4 years ago
Depends on: 1114830
(Assignee)

Updated

4 years ago
Blocks: 1114840
Comment on attachment 8538769 [details] [diff] [review]
Part 1 - Remove MOZ_COUNT_{C,D}TOR on refcounted promise classes. v1

Approval Request Comment
[Feature/regressing bug #]: MSE
[User impact if declined]: Less consistent testing, sites more likely to serve Flash video.
[Describe test coverage new/current, TBPL]: landed on m-c and shipping in Nightly.
[Risks and why]: Risk is low. This does have some affect on non-MSE media playback, but this is straightforward cleanup there with the primary changes restricted to the MSE code.
[String/UUID change made/needed]: None.

This request applies to all patches on this bug.
Attachment #8538769 - Flags: approval-mozilla-aurora?
status-firefox36: --- → affected
status-firefox37: --- → fixed
Attachment #8538769 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
You need to log in before you can comment on or make changes to this bug.