Closed Bug 1316506 Opened 3 years ago Closed 2 years ago

Make sure OOP video decoders are completely finished before we shut down MFR

Categories

(Core :: Audio/Video: Playback, defect, P3)

defect

Tracking

()

RESOLVED FIXED
Tracking Status
firefox52 --- wontfix

People

(Reporter: mattwoodrow, Assigned: mattwoodrow)

References

Details

Attachments

(3 files, 3 obsolete files)

Attached patch flush-remote-video (obsolete) — Splinter Review
MediaFormatReader currently shuts down the video/audio decoder tasks queues as a way of forcing all of their work to be completed and ensuring that any async responses are posted to MFR's task queue.

This doesn't work with OOP decoding since IPDL is async, and we can get responses coming in too late.

These causes crashes if we're already shut down MFRs task queue [1]

I think the best way to fix this is to explicitly flush any OOP work as well as shutting down the decoder task queues.



[1] https://treeherder.mozilla.org/logviewer.html#?job_id=30786086&repo=try#L4930
Attachment #8809273 - Flags: review?(jyavenard)
Attachment #8809273 - Flags: review?(dvander)
Comment on attachment 8809273 [details] [diff] [review]
flush-remote-video

Review of attachment 8809273 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/media/MediaFormatReader.cpp
@@ +510,5 @@
> +  // to make sure that all responses have been posted
> +  // to our event queue. Add an explicit flush
> +  // to make sure that all OOP messages are
> +  // processed.
> +  dom::VideoDecoderManagerChild::Flush();

as discussed on IRC,

MediaDataDecoder::Flush() should be synchronous.

It is how all other decoders are implemented.

If we worry that when calling flush we would drain all OOP decoders, then you can track if there's a pending decoding or not and not drain then
Attachment #8809273 - Flags: review?(jyavenard)
Attachment #8809273 - Flags: review?(dvander)
Attachment #8809273 - Attachment is obsolete: true
Attached patch Make Flush sync (obsolete) — Splinter Review
The MediaDataDecoder API requires that Flush is sync, and we weren't doing this previously.

To be truly sync, then we need to wait on both 'other' threads (the video thread in the GPU process, and the video thread in the content process), so we have two sync calls to make sure we've processed everything in each of them.
Attachment #8809617 - Flags: review?(dvander)
Attached patch Make Flush synchronous (obsolete) — Splinter Review
The MediaDataDecoder API requires Flush() to be synchronous, and we're not doing that currently.

It's a little (a lot!) more complex with RemoteVideoDecoder, since there's 3 threads we need to be sync wrt instead of the usual 1.
Attachment #8809617 - Attachment is obsolete: true
Attachment #8809617 - Flags: review?(dvander)
Attachment #8809676 - Flags: review?(wmccloskey)
Comment on attachment 8809618 [details] [diff] [review]
Make sure we flush all decoders before shutdown

Review of attachment 8809618 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with the changes.

Would be nicer if you used reviewboard really.. absolute pain to review in bugzilla

::: dom/media/MediaFormatReader.cpp
@@ +247,5 @@
>      mDecoder->SetSeekThreshold(aTime);
>    }
>    void Shutdown() override
>    {
> +    mDecoder->Flush();

This one is unnecessary, Other than in MFR::Shutdown(), Shutdown() is never called without first calling Flush() or Drain (which then wait for DrainComplete)

and with your changes above, on MFR shutdown, Flush() will be called, so now you're flushing twice in a row.

@@ +399,5 @@
>      },
>      [this, &data, aTrack] (MediaResult aError) {
>        data.mInitPromise.Complete();
>        data.mStage = Stage::None;
> +      data.mDecoder->Flush();

You get here when an initialisation error occurs. No data was ever fed to the decoder.

no need to flush

::: dom/media/MediaFormatReader.h
@@ +271,5 @@
>      void ShutdownDecoder()
>      {
>        MonitorAutoLock mon(mMonitor);
>        if (mDecoder) {
> +        mDecoder->Flush();

don't need that one either.
This is only called once a drain a completed and there's nothing more to do by the decoder. (unless the DrainComplete implementation is broken for the OOP decoder)
Attachment #8809618 - Flags: review?(jyavenard) → review+
I'm still a little concerned about the possibility of media forgetting to Flush correctly before shutting down, or us delivering an IPDL message after we've Flushed (seems like an easy bug to put in).

This makes Shutdown sync from the calling thread -> manager thread, so by the time it's done, we know that mCallback (a raw pointer) is cleared, so should be impossible for it to ever be dangling.

It also null checks mCallback, so that if anything does come in after we've shut down, it doesn't crash the content process.
Attachment #8809713 - Flags: review?(dvander)
(In reply to Matt Woodrow (:mattwoodrow) from comment #6)
> Created attachment 8809713 [details] [diff] [review]
> Make Shutdown sync
> 
> I'm still a little concerned about the possibility of media forgetting to

I'm not :)
Attachment #8809713 - Flags: review?(dvander) → review+
Pushed by mwoodrow@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/dedcf2678054
Make sure we flush decoders before shutting them down. r=jya
https://hg.mozilla.org/integration/mozilla-inbound/rev/7b08bfceec28
Make Shutdown block on the manager thread so that we clear the callback pointer immediately. r=dvander
Keywords: leave-open
I don't feel like I can competently review this without reading a lot of code. Would dvander be a better reviewer?
The main bit I wanted review on is the implementation of RemoteVideoDecoder::Flush.

It's trying to implement a sync call across multiple threads (and IPDL), and I'm not sure if it's guaranteed to work correctly.

The comments explain what it's trying to do (and the assumptions that it makes), can you please check if these are actually guaranteed by IPDL?

Thanks!
Comment on attachment 8809676 [details] [diff] [review]
Make Flush synchronous

Review of attachment 8809676 [details] [diff] [review]:
-----------------------------------------------------------------

I think it makes more sense for VideoDecoderParent to post an event to itself that sends a message to VideoDecoderChild saying it's done.
Attachment #8809676 - Flags: review?(wmccloskey)
Attachment #8809676 - Attachment is obsolete: true
Attachment #8812668 - Flags: review?(wmccloskey)
Comment on attachment 8812668 [details] [diff] [review]
Make Flush synchronous

Review of attachment 8812668 [details] [diff] [review]:
-----------------------------------------------------------------

Is it possible to send a flush message while a different one is already pending? If there's only one RemoteVideoDecoder per VideoDecoderChild, I guess not. Just wanted to raise the question though.
Attachment #8812668 - Flags: review?(wmccloskey) → review+
(In reply to Bill McCloskey (:billm) from comment #14) 
> Is it possible to send a flush message while a different one is already
> pending? If there's only one RemoteVideoDecoder per VideoDecoderChild, I
> guess not. Just wanted to raise the question though.

Nope, they exist as a tightly coupled pair.
Pushed by mwoodrow@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/1a9b7a22027f
Make RemoteVideoDecoder::Flush synchronous. r=billm
Pushed by mwoodrow@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/68240063e11a
Initialize mFlushTask. r=bustage CLOSED TREE
Depends on: 1351370
Mass wontfix for bugs affecting firefox 52.
Matt, can we resolve this bug as fixed (in Nightly 53.0a1) or is follow up work needed? You marked this bug as leave-open when you landed the first patches in comment 8.
Flags: needinfo?(matt.woodrow)
Concerned from Matt was unwarranted. Plus since the code makes it even more explicit that the situation Matt was concerned about can't occur.
Status: NEW → RESOLVED
Closed: 2 years ago
Flags: needinfo?(matt.woodrow)
Resolution: --- → FIXED
Removing leave-open keyword from resolved bugs, per :sylvestre.
Keywords: leave-open
You need to log in before you can comment on or make changes to this bug.