Closed Bug 1127203 Opened 5 years ago Closed 5 years ago

Playback never starts in MSE example, 100% CPU

Categories

(Core :: Audio/Video, defect, P1)

x86
macOS
defect

Tracking

()

VERIFIED FIXED
mozilla38
Tracking Status
firefox36 --- verified
firefox37 --- verified
firefox38 --- verified

People

(Reporter: jya, Assigned: bholley)

References

(Blocks 1 open bug)

Details

(Keywords: regression)

Attachments

(4 files)

Attached file log.txt
This has very recently stopped working.

http://people.mozilla.org/~jyavenard/tests/mse_mp4/paper.html?eos=1&duration=-1

This is a super simple MSE example: add an init segment followed by a media segment.

Playback doesn't start.

Seeking sometimes causes playback to start; other times it makes the throbber appear forever.

Log shows that we keep retrying over and over.
My guess is that bug 1096089 (commit 570a09a6eb68) introduced the regression.
Another interesting bug: after closing the tab showing that page, the MDSM continues to retry for a while, with rounds of media promises being created -> rejected. Why isn't everything shut down already?
(In reply to Jean-Yves Avenard [:jya] from comment #2)
> Another interesting bug: after closing the tab showing that page, the
> MDSM continues to retry for a while, with rounds of media promises being
> created -> rejected. Why isn't everything shut down already?

The state machine needs to run before it can recognize that it's been moved to SHUTDOWN state. As discussed in bug 1120241 this takes a tiny bit longer than it could, but isn't going to cause the decoding to continue for longer than a few milliseconds.
Bobby, setting aside the not-starting issue: the logs also show an anomaly where we loop forever while in WAITING_FOR_DATA, and appear to retry even when no data is appended. That leads to massive CPU usage.

I’ve seen that behaviour elsewhere (during the W4 tests, CPU usage goes very high for a few seconds while they are waiting to append more data).

I have logs there too if required.
Flags: needinfo?(bobbyholley)
Confirmed locally. This is bad.

This video starts at 0.066733s, and so I think it needs the fuzz factor to work at all.

What's happening is that we're being inconsistent in whether we apply the fuzz factor when selecting the reader. We don't apply it in Request{Audio,Video}Data, but we _do_ apply it when checking whether to resolve the waitfordata promise. So we end up bouncing back and forth.
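
The bouncing described above can be illustrated with a minimal, self-contained sketch. It is not the Gecko code: `BufferedRange`, `ContainsTime`, `Simulate` and `Outcome` are hypothetical names, and the 125000 µs tolerance and 5 s range end used in the note below are assumed values for illustration. Only the 66733 µs start time comes from this bug.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one buffered range of a decoder, in microseconds.
struct BufferedRange {
  int64_t startUs;
  int64_t endUs;
};

// True if timeUs lies in any range once each range is widened by toleranceUs.
bool ContainsTime(const std::vector<BufferedRange>& ranges, int64_t timeUs,
                  int64_t toleranceUs) {
  for (const BufferedRange& r : ranges) {
    if (timeUs >= r.startUs - toleranceUs && timeUs < r.endUs + toleranceUs) {
      return true;
    }
  }
  return false;
}

enum class Outcome { ReaderSelected, WaitingForAppend, Spinning };

// Models the request -> reject -> wait -> resolve cycle. The request path and
// the wait path each get their own tolerance so the mismatch can be shown.
// Returns Spinning if no reader was selected after maxRounds retries.
Outcome Simulate(const std::vector<BufferedRange>& ranges, int64_t timeUs,
                 int64_t requestToleranceUs, int64_t waitToleranceUs,
                 int maxRounds) {
  for (int round = 0; round < maxRounds; ++round) {
    // Request path: can a reader be picked for this time?
    if (ContainsTime(ranges, timeUs, requestToleranceUs)) {
      return Outcome::ReaderSelected;
    }
    // Request rejected. Wait path: would the wait-for-data promise resolve now?
    if (!ContainsTime(ranges, timeUs, waitToleranceUs)) {
      return Outcome::WaitingForAppend;  // idle until more data is appended
    }
    // The wait resolved immediately, so the request is retried at once.
  }
  return Outcome::Spinning;
}
```

With a single range of [66733, 5000000) and a request for time 0, a request tolerance of 0 combined with a wait tolerance of 125000 ends in `Spinning`: the request is always rejected and the wait always resolves at once. Using 125000 on both paths gives `ReaderSelected`, and using 0 on both gives `WaitingForAppend`, which would stall playback but not burn CPU.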

I'll figure out a patch.
Priority: -- → P1
Summary: Playback never starts in MSE example → Playback never starts in MSE example, 100% CPU
Looks like we also need to use the fuzz while seeking to avoid the hang when trying to replay the video after it ends.

Patches seem to work. Flagging mattwoodrow for review, but if someone else gets to them first please steal them.
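
As a rough illustration of the seeking case, the sketch below checks whether every track has data at the seek target, with and without a tolerance. It is a simplified model under stated assumptions, not the actual patch: `Range`, `Track`, `TrackContainsTime` and `CanSeekTo` are hypothetical names, and the tolerance value and track layout in the note are assumed.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical buffered range in microseconds.
struct Range {
  int64_t startUs;
  int64_t endUs;
};

// Hypothetical model of one track: the list of ranges it has buffered.
using Track = std::vector<Range>;

// True if the track has data at timeUs, allowing toleranceUs of slack at
// both ends of each buffered range.
bool TrackContainsTime(const Track& track, int64_t timeUs,
                       int64_t toleranceUs) {
  for (const Range& r : track) {
    if (timeUs >= r.startUs - toleranceUs && timeUs < r.endUs + toleranceUs) {
      return true;
    }
  }
  return false;
}

// A seek can be serviced only if every track contains the target time.
bool CanSeekTo(const std::vector<Track>& tracks, int64_t timeUs,
               int64_t toleranceUs) {
  if (tracks.empty()) {
    return false;
  }
  for (const Track& track : tracks) {
    if (!TrackContainsTime(track, timeUs, toleranceUs)) {
      return false;
    }
  }
  return true;
}
```

For example, with one track buffered over [0, 5000000) and another over [66733, 5000000), `CanSeekTo(tracks, 0, 0)` is false, which matches a replay from time 0 never completing. With an assumed tolerance of 125000 µs the same call returns true.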
Flags: needinfo?(bobbyholley)
aError is a really misleading name.
Attachment #8557276 - Flags: review?(matt.woodrow)
Assignee: nobody → bobbyholley
Attachment #8557276 - Flags: review?(matt.woodrow) → review+
Attachment #8557277 - Flags: review?(matt.woodrow) → review+
Attachment #8557278 - Flags: review?(matt.woodrow) → review+
Thanks for the fast review Matt!
Blocks: 1128069
(In reply to Bobby Holley (Busy with media, don't ask for DOM/JS/XPConnect things) from comment #10)
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=b1d79f1fec57

Green modulo 1 web-platform test, which just went from timing out to actually running and intermittently passing/failing. Filed bug 1128069 about that and pushed.

remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/74d5eb626c0d
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/e5591b1c4d64
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/99994b4a3682
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/320b02bd690c

There was also one failure in test_dataChannel_basicDataOnly.html, but that test has other oranges filed on it anyway, it didn't reproduce with retriggers, and the code that these patches touch is all MSE-only, so I'm going to call it unrelated.
Comment on attachment 8557278 [details] [diff] [review]
Part 3 - Use the tolerance value in TrackBuffersContainTime so that seeking operates with tolerance too. v1

Approval Request Comment
[Feature/regressing bug #]: MSE
[User impact if declined]: YouTube video playback can stall.
[Describe test coverage new/current, TreeHerder]: Landed on m-c.
[Risks and why]: This is an MSE-specific fix, so low.
[String/UUID change made/needed]: None.

Requesting uplift for all patches on this bug.
Attachment #8557278 - Flags: approval-mozilla-beta?
Attachment #8557278 - Flags: approval-mozilla-aurora?
Attachment #8557278 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #8557278 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Depends on: 1126465
Flags: qe-verify+
Reproduced the initial issue on Mac OS X 10.9.5 using an old Nightly. Verified that the issue is fixed using the latest Nightly, the latest Aurora and Firefox 36 beta 6 on the same Mac OS X 10.9.5 machine.