Closed Bug 1097823 Opened 5 years ago Closed 5 years ago

Prototype a Promise-y mechanism for cross-thread notification and communication in media code

Categories

(Core :: Audio/Video, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla37
Tracking Status
firefox36 --- fixed
firefox37 --- fixed

People

(Reporter: bholley, Assigned: cpearce)

References

Details

Attachments

(3 files, 7 obsolete files)

20.38 KB, patch
cpearce
: review+
Details | Diff | Splinter Review
5.90 KB, patch
cpearce
: review+
Details | Diff | Splinter Review
12.74 KB, patch
cpearce
: review+
Details | Diff | Splinter Review
No description provided.
Hit submit too soon :-)
Summary: Prototype a Promise-y → Prototype a Promise-y mechanism for cross-thread notification and communication in media code
This came out of a discussion that cpearce and I had about the state of the state machine.

The media code is complicated and racey, mostly depending on MediaDecoderState machine to keep kicking the wheels whenever they get stuck. It's not going to be rewritten in a day, but it would be nice if we at least had a better model to move towards. Some needs:

* Async from the get-go.
* Easy to use, no need to hand-roll a bunch of machinery for every interaction.
* Strong static and dynamic enforcement of threading invariants.
* Strong guarantees that async requests are never lost (possibly with an optional timeout feature).

I've been hacking up a prototype for the past few hours, and it's going quite well. I'm not looking to spend a ton of time on this, but if we end up with machinery that does the above, I think it will be time well-spent.
Here's a proof-of-concept that more or less works. There's still tweaking to
be done here and there, but I wanted feedback on the overall design.

This patch introduces MediaPromise and uses it to replace
RequestSampleCallback for video data. The next steps would be to use it for
audio as well, and then to get rid of MediaDataDecodedListener and all the
custom runnables.

From there we can go about using it for new things like having the state
machine wait for the reader to notify it of newly-arrived data.
Attachment #8522366 - Flags: feedback?(cpearce)
Another question - we probably either want to make MediaData reference-counted, or build an optional 'exclusive' concept into the promise type. As it stands right now, having multiple consumers waiting on a VideoDataPromise isn't safe, because OnVideoDecoded takes ownership of the incoming pointer and deletes it (meaning that we can't pass that pointer to two consumers).

Chris, which would you prefer? The former seems more robust, but I could buy the argument that we just never want that to happen and should assert against it.
Comment on attachment 8522366 [details] [diff] [review]
Implement Media Promises and prototype their usage with RequestVideoData. PoC

Review of attachment 8522366 [details] [diff] [review]:
-----------------------------------------------------------------

That looks awesome.
Attachment #8522366 - Flags: feedback?(cpearce) → feedback+
(In reply to Bobby Holley (:bholley) from comment #5)
> Another question - we probably either want to make MediaData
> reference-counted, or build an optional 'exclusive' concept into the promise
> type.

I share this concern. I've been thinking recently that we should make MediaData refcounted.

I hadn't considered some sort of UniquePtr/exclusive ownership type thing... We often peek at the MediaData in the MediaQueues to get its timestamps etc, so we'd need to not break that.

Either would be fine with me. Being refcounted is easy to understand, but the reason MediaData wasn't refcounted in the first place was that it wasn't supposed to be shared between owners.
(In reply to Chris Pearce (:cpearce) from comment #6)
> That looks awesome.

\o/

(In reply to Chris Pearce (:cpearce) from comment #7)
> I share this concern. I've been thinking recently that we should make
> MediaData refcounted.

OK, I'll do that.
 
> I hadn't considered some sort of UniquePtr/exclusive ownership type thing...
> We often peek at the MediaData in the MediaQueues to get its timestamps etc,
> so we'd need to not break that.

I meant baking it into the promise type, such that it would be an error to invoke Then() a second time on that promise. Sounds like we don't need to do that.
Depends on: 1100776
Blocks: 1104964
No longer blocks: 1104964
Depends on: 1104964
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=7bf3f832af47

Finally green. Uploading patches and flagging for review.
This is necessary to have strong guarantees that promises will be resolved.
When we start flushing the task queue, normal dispatch starts to fail,
meaning that we can't dispatch promise resolution. We have 3 options to handle
this:

(A) Never respond to the promise.
(B) Invoke the Resolve/Reject callback synchronously if dispatch fails.
(C) Prevent dispatch from failing.

(C) seems like the option least likely to violate invariants if we can get away
with it. Promise resolution is unlikely to be a heavyweight task in the way that
a decode task might be, so this should hopefully be ok.
Attachment #8530668 - Flags: review?(cpearce)
Attachment #8522366 - Attachment is obsolete: true
Attachment #8530667 - Flags: review?(cpearce) → review+
Comment on attachment 8530668 [details] [diff] [review]
Part 2 - Implement MediaTaskQueue::ForceDispatch. v1

Review of attachment 8530668 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/media/MediaTaskQueue.cpp
@@ +169,5 @@
>  {
>    MonitorAutoLock mon(mQueueMonitor);
>    AutoSetFlushing autoFlush(this);
> +
> +  // Clear the tasks, but preserve those with mForceDispatch by re-appending

This is the second time you've written this code... You could factor it out into a common function to avoid repeating it?

::: dom/media/MediaTaskQueue.h
@@ +32,5 @@
>  
>    explicit MediaTaskQueue(TemporaryRef<SharedThreadPool> aPool);
>  
>    nsresult Dispatch(TemporaryRef<nsIRunnable> aRunnable);
> +  nsresult ForceDispatch(TemporaryRef<nsIRunnable> aRunnable);

Probably you should have a comment here explaining that only things that absolutely can't afford to be Flushed should use ForceDispatch, and that "normal" operations should use Dispatch().
Attachment #8530668 - Flags: review?(cpearce) → review+
Comment on attachment 8530669 [details] [diff] [review]
Part 3 - Implement Media Promises. v1

Review of attachment 8530669 [details] [diff] [review]:
-----------------------------------------------------------------

I'm concerned about the threadsafety seeming to be optional on the promise holder... So will refrain from r+ing it until I get a response on that...

::: dom/media/MediaPromise.h
@@ +24,5 @@
> +
> +namespace mozilla {
> +
> +extern PRLogModuleInfo* gMediaPromiseLog;
> +void EnsureMediaPromiseLog();

EnsureMediaPromiseLog() is unused in this header.

@@ +105,5 @@
> +        return NS_OK;
> +      }
> +
> +    private:
> +      ThenValueBase* mThenValue;

Any reason not to make mThenValue an nsAutoPtr?

@@ +233,5 @@
> +  bool IsPending() { return mResolveValue.isNothing() && mRejectValue.isNothing(); }
> +  void DispatchAll()
> +  {
> +    mMutex.AssertCurrentThreadOwns();
> +    for (size_t i = 0; i < mThenValues.Length(); ++i)

Always put braces on blocks.

@@ +266,5 @@
> +
> +  ~MediaPromiseHolder() { MOZ_ASSERT(!mPromise); }
> +
> +  already_AddRefed<PromiseType> Ensure(const char* aMethodName) {
> +    if (mMonitor) {

Should we assert that mMonitor is not null?

@@ +271,5 @@
> +      mMonitor->AssertCurrentThreadOwns();
> +    }
> +    if (!mPromise) {
> +      mPromise = new PromiseType(aMethodName);
> +    }

Should you assert or enforce that if the holder already holds a promise, that the promise has the same method name as aMethodName? What does it mean for aMethodName to be different to mPromise::mCreationSite?

@@ +276,5 @@
> +    nsRefPtr<PromiseType> p = mPromise;
> +    return p.forget();
> +  }
> +
> +  // Provide a Monitor that should always be held when accessing this instance.

Holding the monitor seems to be optional? Should it be? Or should you be asserting that mMonitor is not null?

My gut feeling is that threadsafety should not be optional...
Comment on attachment 8530670 [details] [diff] [review]
Part 4 - Use MediaPromises for RequestAudioData and RequestVideoData. v1

Review of attachment 8530670 [details] [diff] [review]:
-----------------------------------------------------------------

Looking good, but we should tighten up on the MP4Reader EOS and MediaCodecReader.

Let's go another round...

brsun, bechen do you have any feedback on the changes here to MediaCodecReader?

::: dom/media/MediaDecoderReader.cpp
@@ +298,5 @@
>    MOZ_ASSERT(OnDecodeThread());
>    mShutdown = true;
> +
> +  mAudioPromise.SetMonitor(nullptr);
> +  if (!mAudioPromise.IsEmpty()) {

I'm a bit worried that in the case of async readers we're touching the internals of the PromiseHolder here without hold the monitor that we were previously using to mutex the state. But I don't think it's safe to assume that by the time this function runs that the monitor is still valid in the case of an async Reader.

It seems to me that the Shutdown() implementation in the MediaDecoderReader subclasses should be required to reject the promise for us. So we should assert that they're empty here.

But of course since MediaDecoderReader must adapt the old synchronous readers to the async model you still need to have something here to reject the promises used by the base class MediaDecoderReader::Read*Data() implementation... So you can't assert... So I guess what you're doing is unavoidable for now...

::: dom/media/MediaDecoderReader.h
@@ +97,4 @@
>    // properly!
>    virtual nsresult ResetDecode();
>  
>    // Requests the Reader to call OnAudioDecoded() on aCallback with one

Comments are out of date...

::: dom/media/fmp4/MP4Reader.cpp
@@ +612,5 @@
> +        decoder.mOutput.RemoveElementAt(0);
> +        ReturnOutput(output, aTrack);
> +      }
> +      else if (decoder.mEOS) {
> +        RejectPromise(decoder.mType, END_OF_STREAM, __func__);

We have a bug in the MP4Reader that we send the EOS notification before the DrainComplete() callback has been made. We should not send the EOS notification until the DrainComplete() callback has been made, otherwise all output samples may not yet have been delivered to the MP4Reader.

pehrsons posted a patch to fix this in bug 1106547. His patch hasn't landed yet.

You should either rebase on top of pehrson's patch, or fix the issue here and tell pehrsons that you'll fix his issue in this bug and that he doesn't need to land his patch.

@@ +632,5 @@
>      } else {
>        {
>          MonitorAutoLock lock(decoder.mMonitor);
>          MOZ_ASSERT(!decoder.mEOS);
> +        decoder.mEOS = true;

Same thing here for EOS, don't reject the promise until the DrainComplete() callback has come in for this stream.

@@ +804,1 @@
>    MOZ_ASSERT(mVideo.mDecoder);

Can you please also assert in MP4Reader::Shutdown() (before calling MediaDecoderReader::Shutdown()) that our promise holders are empty. They should have been rejected when we flushed.

::: dom/media/mediasource/MediaSourceReader.cpp
@@ +294,4 @@
>  }
>  
>  void
>  MediaSourceReader::Shutdown()

We should be able to assert that *our* PromiseHolders are empty/rejected here, because the MediaSourceReader is properly async.

::: dom/media/omx/MediaCodecReader.cpp
@@ +379,5 @@
>  {
>    MOZ_ASSERT(GetTaskQueue()->IsCurrentThreadIn());
>    MOZ_ASSERT(HasAudio());
> +
> +  nsRefPtr<AudioDataPromise> p = mAudioPromise.Ensure(__func__);

Shouldn't you set a monitor on the promise holder? Ditto for the video case.

@@ +496,5 @@
>    }
> +  else if (AudioQueue().AtEndOfStream()) {
> +    mAudioPromise.Reject(END_OF_STREAM, __func__);
> +  } else {
> +    MOZ_CRASH("Dropping promise on the floor");

You can't assume that DecodeAudioDataSync() always produces output; it blocks waiting on output from the platform's decoder, and if it times out it dispatches a task to call DecodeAudioDataTask again. So you shouldn't fail here.

Ideally we shouldn't be blocking on the task queue, but that should be someone else's problem.

@@ +518,5 @@
>    }
> +  else if (VideoQueue().AtEndOfStream()) {
> +    mVideoPromise.Reject(END_OF_STREAM, __func__);
> +  } else {
> +    MOZ_CRASH("Dropping promise on the floor");

Same comment about that from the audio case about not being able to assume failure here applies.

@@ +523,3 @@
>    }
>    return result;
>  }

Can you also assert in the Shutdown() implementation here that the PromiseHolders are empty, they should have been rejected.
Attachment #8530670 - Flags: review?(cpearce)
Attachment #8530670 - Flags: review-
Attachment #8530670 - Flags: feedback?(brsun)
Attachment #8530670 - Flags: feedback?(bechen)
Comment on attachment 8530671 [details] [diff] [review]
Part 5 - Replace AudioDecodeRendezvous with promise-y goodness. v1

Review of attachment 8530671 [details] [diff] [review]:
-----------------------------------------------------------------

This looks great. Someone who works on WebAudio should OK this too.
Attachment #8530671 - Flags: review?(karlt)
Attachment #8530671 - Flags: review?(cpearce)
Attachment #8530671 - Flags: review+
Blocks: 1107368
(In reply to Chris Pearce (:cpearce) from comment #18)
> I'm concerned about the threadsafety seeming to be optional on the promise
> holder... So will refrain from r+ing it until I get a response on that...

Per IRL discussion, we need it to be null sometimes in the shutdown case (at least for now). I'll try to also use the monitor for the MediaDecoderReader base class, but that's an issue with the next patch, rather than this one. But I think it's also reasonable to imagine having a MediaPromiseHolder in a struct with serialized access (and no monitor), so I'm not sure we should always force a monitor to be used.

I agree that there may be improvements we can make to nudge consumers towards safer practices, but I don't think we should block this bug over it. I've filed bug 1107368 as a followup.

> Any reason not to make mThenValue an nsAutoPtr?

Yes - the destructor is private, and only invokable by the friend class.

> Should you assert or enforce that if the holder already holds a promise,

I don't think so. IMO the whole point of the |Ensure| piece is to allow multiple consumers to request the same thing (i.e. wait for data).

> that the promise has the same method name as aMethodName? What does it mean
> for aMethodName to be different to mPromise::mCreationSite?

It would mean that two different APIs both return a the same promise. This could be reasonable if two different APIs serviced the same underlying request. For example, we might have two methods to wait for some network data, one of which kicks off the load if it isn't already happening, and the other of which just rejects the promise immediately if it isn't already happening.
Comment on attachment 8530669 [details] [diff] [review]
Part 3 - Implement Media Promises. v1

Review of attachment 8530669 [details] [diff] [review]:
-----------------------------------------------------------------

OK, I look forward to the follow ups.
Attachment #8530669 - Flags: review?(cpearce) → review+
Comment on attachment 8530671 [details] [diff] [review]
Part 5 - Replace AudioDecodeRendezvous with promise-y goodness. v1

Can you manually test with the URL in bug 936317 comment 9, please to check
that this doesn't reintroduce 1026325?

>+ * This class is used to decode media buffers on the decode threadpool. It
>+ * doesn't hold anything for the time being, bug will when we fix bug 1104964.

*but* will

But, bug 1104964 is now fixed.

I guess it doesn't need to hold anything, or is there value changing things in
the future to keep the task queue on the AudioContext?

>-  nsCOMPtr<nsIThreadPool> mThreadPool;

I expect '#include "nsIThreadPool.h"' is no longer necessary.

>+MediaBufferDecoder::Shutdown()
>+{
> }

Follow-up for removal of Shutdown() or rethink the existence of
MediaBufferDecoder on AudioContext?  Or reference the follow-up bug if you expect this to be needed again?

>+  if (aReason == MediaDecoderReader::DECODE_ERROR) {
>+    mDecoderReader->Shutdown();
>+    ReportFailureOnMainThread(WebAudioDecodeJob::InvalidContent);
>+  } else {
>+    MOZ_ASSERT(aReason == MediaDecoderReader::END_OF_STREAM);
>+    FinishDecode();
>   }
>+}
>+
>+void
>+MediaDecodeTask::FinishDecode()
>+{
>   mDecoderReader->Shutdown();

Seems we could have a single mDecoderReader->Shutdown() call before the
aReason test.
Attachment #8530671 - Flags: review?(karlt) → review+
(In reply to Karl Tomlinson (back Dec 9 :karlt) from comment #23)
> I guess it doesn't need to hold anything, or is there value changing things
> in
> the future to keep the task queue on the AudioContext?

Good catch, Karl. That part was actually obsolete because I reordered bug 1104964 to land under this one - so this entire class can go. I renamed MediaBufferDecoder::AsyncDecodeMedia to mozilla::AsyncDecodeWebAudio.

> Seems we could have a single mDecoderReader->Shutdown() call before the
> aReason test.

We could, but I think it would actually be less clear, because it would violate the pattern used by all of the other error cases.
(In reply to Karl Tomlinson (back Dec 9 :karlt) from comment #23)
> Can you manually test with the URL in bug 936317 comment 9, please to check
> that this doesn't reintroduce 1026325?

Confirmed.
Chris and I went through review feedback on the bus together.
Attachment #8530670 - Attachment is obsolete: true
Attachment #8530670 - Flags: feedback?(brsun)
Attachment #8530670 - Flags: feedback?(bechen)
Attachment #8532933 - Flags: review+
Attachment #8532933 - Flags: feedback?(brsun)
Attachment #8532933 - Flags: feedback?(bechen)
There was one orange on the inbound landing:
https://treeherder.mozilla.org/ui/logviewer.html#?job_id=4420628&repo=mozilla-inbound

This makes total sense - the current code races between the asynchronous shutdown
of the borrowed task queue and the shutdown of the subdecoders that might need
to use it to reject their promises.

So I landed the following fix:
https://hg.mozilla.org/integration/mozilla-inbound/rev/47c97b8b490c

Nobody is around right now to review it, so I'm flagging chris for retroactive
review.
Attachment #8532955 - Flags: review?(cpearce)
Attachment #8532955 - Flags: review?(cpearce) → review+
Blocks: 1108421
Attachment #8532970 - Flags: review?(cpearce) → review+
Comment on attachment 8532968 [details] [diff] [review]
Part 7 - Followup to avoid null-derefing when promises have already been rejected during shutdown. v1 rpending=cpearce

Review of attachment 8532968 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/media/MediaDecoderReader.h
@@ +315,5 @@
>    bool mAudioDiscontinuity;
>    bool mVideoDiscontinuity;
>    bool mShutdown;
> +
> +public:

Doesn't really need to be public...
Attachment #8532968 - Flags: review?(cpearce) → review+
(In reply to Chris Pearce (:cpearce) from comment #31)
> Doesn't really need to be public...

Yeah, it could also be protected. Not sure if it makes much of a difference, but am happy to change it if you want.
Blocks: 1108701
Moving deps to bug 1097823.
No longer blocks: 821062, 1107368, 1108421
Blocks: 1108707
Attachment #8532970 - Attachment is obsolete: true
Attachment #8532968 - Attachment is obsolete: true
Attachment #8532955 - Attachment is obsolete: true
Attachment #8532933 - Attachment is obsolete: true
Attachment #8532933 - Flags: feedback?(brsun)
Attachment #8532933 - Flags: feedback?(bechen)
Attachment #8530671 - Attachment is obsolete: true
Blocks: 1109437
Comment on attachment 8530667 [details] [diff] [review]
Part 1 - Move NotDecodedReason from RequestSampleCallback to MediaDecoderReader. v1

Approval Request Comment
[Feature/regressing bug #]: MSE
[User impact if declined]: Less consistent playback support, video sites more
likely to use flash.
[Describe test coverage new/current, TBPL]: Landed on m-c.
[Risks and why]: Moderate as these patches affect normal video playback as well as the MediaSource extension. However, they are important for being able to backport future bug fixes from m-c.
[String/UUID change made/needed]: None
Attachment #8530667 - Flags: approval-mozilla-aurora?
Comment on attachment 8530668 [details] [diff] [review]
Part 2 - Implement MediaTaskQueue::ForceDispatch. v1

Approval Request Comment
[Feature/regressing bug #]: MSE
[User impact if declined]: Less consistent playback support, video sites more
likely to use flash.
[Describe test coverage new/current, TBPL]: Landed on m-c.
[Risks and why]: Moderate as these patches affect normal video playback as well as the MediaSource extension. However, they are important for being able to backport future bug fixes from m-c.
[String/UUID change made/needed]: None
Attachment #8530668 - Flags: approval-mozilla-aurora?
Comment on attachment 8530669 [details] [diff] [review]
Part 3 - Implement Media Promises. v1

Approval Request Comment
[Feature/regressing bug #]: MSE
[User impact if declined]: Less consistent playback support, video sites more
likely to use flash.
[Describe test coverage new/current, TBPL]: Landed on m-c.
[Risks and why]: Moderate as these patches affect normal video playback as well as the MediaSource extension. However, they are important for being able to backport future bug fixes from m-c.
[String/UUID change made/needed]: None
Attachment #8530669 - Flags: approval-mozilla-aurora?
Attachment #8530667 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #8530668 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #8530669 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
You need to log in before you can comment on or make changes to this bug.