Closed Bug 1125970 Opened 5 years ago Closed 5 years ago

Remove or make safe MediaTaskQueue flushing capability

Categories

(Core :: Audio/Video, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla38
Tracking Status
firefox36 --- unaffected
firefox37 + fixed
firefox38 + fixed

People

(Reporter: cpearce, Assigned: bholley)

References

Details

Attachments

(3 files, 2 obsolete files)

In bug 1125472 we got bitten by flushing the MDSM's decode task queue causing an event that set critical state to be dropped, causing a playback problem.

We could prevent such problems in future by either not allowing task queue's to be flushed, or by otherwise making the flushing "safe".

We flush the task queues in pretty much every MediaDataDecoder implementation, in order to shut down their task queues faster, so we'd either need to Not Do That, or find another way to achieve that. Possibly this is an unnecessary optimization.
Blocks: 1130253
Picking up chris' patches from bug 1125472. Let's see where they stand on try now that the dust has settled:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=a5e131da339c
Ok, so we still have timeouts on dom/media/test/test_bug493187.html. Let's try with my programmatic logging setup:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=0f035eb7b904
Found the bug, or at least one of them. Pushing to try to confirm:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=5487e41c4590
(In reply to Bobby Holley (Busy with media, don't ask for DOM/JS/XPConnect things) from comment #3)
> Found the bug, or at least one of them. Pushing to try to confirm:
> 
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=5487e41c4590

Looking good so far. Did a full try push with an additional patch to make this stuff safe:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=931e8f5d4ae5
(My type-system based solution involves a lot of mechanical changes the code that I can't build locally, so I'm just iterating a bit on try)

https://treeherder.mozilla.org/#/jobs?repo=try&revision=192a25d2f0cc
This fixes the android/b2g hang in dom/media/test/test_bug493187.html. I will
fold it into Part 1 before landing.
Attachment #8564806 - Flags: review?(cpearce)
I'm going to hold off flagging for review on the last patch until comment 5 comes back green.
Comment on attachment 8564812 [details] [diff] [review]
Part 3 - Make flushing an opt-in mechanism, and use it only for the PDM task queues. v1

Review of attachment 8564812 [details] [diff] [review]:
-----------------------------------------------------------------

The name "FlushableMediaTaskQueue" is a bit of an eye sore, but I'll consider that as motivation to change our PDMs to not flush task queues.
Attachment #8564812 - Flags: review?(cpearce) → review+
Comment on attachment 8564806 [details] [diff] [review]
Part 1.5 - Reject video promise in RequestVideoWithSkipTask::Cancel. v1

Review of attachment 8564806 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/media/MediaDecoderReader.cpp
@@ +197,2 @@
>      mIsCanceled = true;
> +    mReader->mBaseVideoPromise.Reject(MediaDecoderReader::CANCELED, __func__);

Do you also need to reject mReader->mBaseAudioPromise? We can still return from MediaDecoderReader::RequestAudioData() without resolving mBaseAudioPromise without resolving the promise...
Attachment #8564806 - Flags: review?(cpearce) → review+
Assignee: nobody → bobbyholley
Attachment #8564829 - Flags: review?(cpearce) → review+
(In reply to Bobby Holley (Busy with media, don't ask for DOM/JS/XPConnect things) from comment #14)
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=c7f3ee2c169d

All good modulo static analysis failure for implicit constructor. Verifying fix:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=5feb52634342
Bobby - What do you think about uplifting this bug to 37 given that this bug fixed bug 1130253, which is an MSE P1, and that we're planning to ship MSE in 37? If you think that this can be uplifted, this bug contains quite a sizable code change touching many files. Can the fix be verified on Nightly first?
Flags: needinfo?(bobbyholley)
(In reply to Lawrence Mandel [:lmandel] (use needinfo) from comment #18)
> Bobby - What do you think about uplifting this bug to 37 given that this bug
> fixed bug 1130253, which is an MSE P1, and that we're planning to ship MSE
> in 37? If you think that this can be uplifted, this bug contains quite a
> sizable code change touching many files. Can the fix be verified on Nightly
> first?

We should absolutely uplift this to 37. It fixes lots of crashes and hangs by ceasing to do something that is very unsafe. The only verification that needs to be done is to ensure that we're not getting any new kinds of crashes and hangs (which we could, since this is a significant change in behavior).

The sizeable code change (part 3) doesn't change behavior - it just makes things more type-safe (i.e. prevents people from flushing the main task queue, which causes lots of crashes).  Part 2 is the big one, even though it's only 2 lines.
Flags: needinfo?(bobbyholley)
Comment on attachment 8564807 [details] [diff] [review]
Part 2 - Don't flush decode task queue in MediaDecoderStateMachine::FlushDecoding(). r=bholley

Requesting 37 uplift of all patches on this bug.

Approval Request Comment
[Feature/regressing bug #]: MSE
[User impact if declined]: Playback stalls an crashes playing video.
[Describe test coverage new/current, TreeHerder]: Landed on m-c.
[Risks and why]: This affects all video playback, so it's not protected by our ability to pref off MSE. On the other hand, this should be more robust than what we have for non-MSE playback too. Risk is acceptable, I think, and if not it's better to find out soon.
[String/UUID change made/needed]: None.

NB the third patch doesn't apply cleanly because of eme changes which haven't been uplifted. Per comment #19 it should be safe to just omit those parts of the patch. If we want to reduce risk, we can skip or delay uplifting the third patch, but I'd really like to have it for consistency in the MSE code.
Attachment #8564807 - Flags: approval-mozilla-aurora?
Blocks: 1129455
Comment on attachment 8564829 [details] [diff] [review]
Part 1 - Reject promises in MediaDecoderReader::ResetDecode and don't re-request audio and video when the promises have been rejected. v1

As noted, pre-approval was granted for a collection of MSE changes. Adding approval to the bug.
Attachment #8564829 - Flags: approval-mozilla-aurora+
Attachment #8564807 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #8564812 - Flags: approval-mozilla-aurora+
No longer blocks: 1129455
You need to log in before you can comment on or make changes to this bug.