Closed Bug 1198664 Opened 6 years ago Closed 6 years ago

[Aries][Video] GonkMediaDataDecoder keeps calling MediaDataDecoderCallback::InputExhausted() when video is paused.

Categories

(Core :: Audio/Video: Playback, defect, P1)

ARM
Gonk (Firefox OS)
defect

Tracking

()

RESOLVED FIXED
mozilla44
Tracking Status
firefox43 --- affected
firefox44 --- fixed

People

(Reporter: jhlin, Assigned: jhlin)

References

Details

Attachments

(2 files, 12 obsolete files)

31.30 KB, patch
jhlin
: review+
Details | Diff | Splinter Review
22.56 KB, patch
jhlin
: review+
Details | Diff | Splinter Review
Steps to reproduce:
1. Turn on media log modules.
2. Play a video in Video app.
3. Pause it.
4. 'adb logcat' shows MediaFormatReader::NotifyInputExhausted() is called continuously.

Root cause:
On Aries, there are only 3 active output buffers for video decoding, and when paused they're all hold by Gecko.
This problem won't happen on Flame because it has 7 active buffers.
I'm not sure I follow the cause to effect.

It's up to the decoder to not call InputExhausted too many times ; if not the reader will keep sending frames.

Maybe have something similar to what I did on the mac in bug 1198094.

But I thought the gonk PDM only called InputExhausted when GonkMediaDataDecoder::ProcessOutput returns NS_ERROR_NOT_AVAILABLE
(In reply to Jean-Yves Avenard [:jya] from comment #1)
> I'm not sure I follow the cause to effect.
> 
> It's up to the decoder to not call InputExhausted too many times ; if not
> the reader will keep sending frames.
> 
> Maybe have something similar to what I did on the mac in bug 1198094.
> 
> But I thought the gonk PDM only called InputExhausted when
> GonkMediaDataDecoder::ProcessOutput returns NS_ERROR_NOT_AVAILABLE

Exactly. In this case ProcessOutput() always returns NS_ERROR_NOT_AVAILABLE because all output buffers are hold by Gecko so InputExhausted() will be called.
Attached file Use all meta data output buffers. (obsolete) —
5 output buffers are allocated but only 3 are used because of how mMetaDataBuffersToSubmit is determined. Change the calculation to match the case of regular output buffers.
Attachment #8652794 - Flags: review?(sotaro.ikeda.g)
Make wasted buffers useful and avoid buffer starvation situation.
Attachment #8652800 - Flags: review?(jyavenard)
Attachment #8652800 - Flags: review?(bwu)
Rephrase the summary to describe the real cause of this problem.
Summary: [Aries][Video] MediaFormatReader keeps sending input/output requests to decoder when paused. → [Aries][Video] GonkMediaDataDecoder keeps calling MediaDataDecoderCallback::InputExhausted() when video is paused.
See Also: → 1009410
Comment on attachment 8652794 [details] [review]
Use all meta data output buffers.

Nice!
Attachment #8652794 - Flags: review?(sotaro.ikeda.g) → review+
Comment on attachment 8652800 [details] [diff] [review]
Ask stagefright to use all allocated buffers.

Review of attachment 8652800 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM!
Attachment #8652800 - Flags: review?(bwu) → review+
Comment on attachment 8652800 [details] [diff] [review]
Ask stagefright to use all allocated buffers.

Review of attachment 8652800 [details] [diff] [review]:
-----------------------------------------------------------------

I have no idea what that would do :)
Attachment #8652800 - Flags: review?(jyavenard)
Carry r+ from Blake.
Attachment #8652800 - Attachment is obsolete: true
Attachment #8653299 - Flags: review+
Luckily bug 1197075 landed before me and it shows using wasted output buffers is not good enough if too many frames will be decoded ahead.
After more thinking and discussing with Blake, it seems calling InputExhausted() when decoder returns NS_ERROR_NOT_AVAILABLE even when there is some input pending doesn't make much sense. We should distinguish the case that no output available because of no input provided, and the case that no output available because out of buffer.
Attachment #8652794 - Attachment is obsolete: true
Attachment #8653299 - Attachment is obsolete: true
Attachment #8653400 - Flags: review?(bwu)
How do you distinguish the case where you have pending input, but they aren't sufficient to allow decoding a full frame ?

This can happen with B-Frames for example. or is this patch going to handle it ?

Maybe we could modify so we don't always attempt to decode some frames ahead of time?

This would have happened without bug 1197075 ; like if the MDSM frames queue size dropped below 1 or if you were seeking (this was happening with bug 1171257).
With bug 1197075 we will now always decode ahead of needing the frame, because much less than what 1171257 could have done (like calling the decoder with thousands of input)
(In reply to Jean-Yves Avenard [:jya] from comment #11)
> How do you distinguish the case where you have pending input, but they
> aren't sufficient to allow decoding a full frame ?
 Good point.

> 
> This can happen with B-Frames for example. or is this patch going to handle
> it ?
  You're right. Checking pending input along won't handle this case. Will change it to check the return value of Input() instead.

> 
> Maybe we could modify so we don't always attempt to decode some frames ahead
> of time?

  I don't think it's harmful to 'attempt to' decode frames ahead. That won't succeed when we're running out of input and output buffers anyway.

> 
> This would have happened without bug 1197075 ; like if the MDSM frames queue
> size dropped below 1 or if you were seeking (this was happening with bug
> 1171257).
> With bug 1197075 we will now always decode ahead of needing the frame,
> because much less than what 1171257 could have done (like calling the
> decoder with thousands of input)
 
  It's not that bug 1197075 causes problem, just that it shows the previous fix won't actually fix the issue. In fact I'm very thankful that it's landed before my useless code is checked in. :)
Correction. Actually we should check if decoder is out of output buffer.
I wonder if this is what's causing some intermittent timeout that mostly happens on b2g, especially the EME tests.

With bug 1171257 under some circumstances it started to feed the decoder like crazy with compressed frames. Could have been in the thousands are demuxing a frame with MSE is extremely fast. That could have fed megabytes of data to the decoder at once waiting for decoded frames to come out
Priority: -- → P1
After thinking twice, maybe we should check if MediaCodec can notify us once input buffer is available and we can fill in more data. That would be better to make decoder let us know. 

John, 
I remember in the past you mentioned that we can register a callback in MediaCodec to know that. Is it supported in KK 4.4.2 which we are using now?
(In reply to Blake Wu [:bwu][:blakewu] from comment #15)
> After thinking twice, maybe we should check if MediaCodec can notify us once
> input buffer is available and we can fill in more data. That would be better
> to make decoder let us know. 
> 
> John, 
> I remember in the past you mentioned that we can register a callback in
> MediaCodec to know that. Is it supported in KK 4.4.2 which we are using now?
 
 Unfortunately, no. However MediaCodec::requestActivityNotification() might be used for that.
Comment on attachment 8653400 [details] [diff] [review]
Don't ask for more input when there is some pending.

Review of attachment 8653400 [details] [diff] [review]:
-----------------------------------------------------------------

Cancel the review temporarily and further discuss to see if we can find the better solution.
Attachment #8653400 - Flags: review?(bwu)
(In reply to John Lin [:jolin][:jhlin] from comment #16)
> (In reply to Blake Wu [:bwu][:blakewu] from comment #15)
> > After thinking twice, maybe we should check if MediaCodec can notify us once
> > input buffer is available and we can fill in more data. That would be better
> > to make decoder let us know. 
> > 
> > John, 
> > I remember in the past you mentioned that we can register a callback in
> > MediaCodec to know that. Is it supported in KK 4.4.2 which we are using now?
>  
>  Unfortunately, no. However MediaCodec::requestActivityNotification() might
> be used for that.
Cool. It looks it's a promising solution.
Looks like there're are several state changes that can trigger notification, and requestActivityNotification() doesn't clearly state what happened. I guess that's why Google added new callbacks in L. :(
I will try returning NS_ERROR_NOT_AVAILABLE in GonkVideoDecoderManager::Input() when out of input buffers and not calling InputExhausted() in that case.
Blocks: 1170589
Blocks: 1159343
WIP patch that
- uses a dedicate looper thread for decoding task
- drives input & output by notifications from MediaCodec
- requests one input from reader more than decoder can handle
- returns output buffer as soon as it can
Attachment #8653400 - Attachment is obsolete: true
Attachment #8664009 - Flags: feedback?(bechen)
Attached patch waitFenceOnebyOne.patch (obsolete) — Splinter Review
The patch might have help if we really spend some time for waiting fence.
Redundant code removed.
Attachment #8664009 - Attachment is obsolete: true
Attachment #8664635 - Attachment is obsolete: true
Attachment #8664009 - Flags: feedback?(bechen)
Attachment #8667105 - Flags: review?(bwu)
For MediaCodecProxy, calling InputExhausted() when output is unavailable doesn't seem right. Since there are fixed amount of input buffers, requesting and queuing some(2 in this impl) samples in advance seems a better choice.
To respond to decoder events (so we can send input and pull output as soon as possible), notification is needed and MediaCodecProxy::requestActivityNotification() is the only way in KK. It can only be used with looper so all decoder tasks are moved to the looper too.
Attachment #8667108 - Flags: review?(bwu)
Comment on attachment 8667105 [details] [diff] [review]
Refactor: move common behaviors to base class

Review of attachment 8667105 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good to me!
Attachment #8667105 - Flags: review?(bwu)
Attachment #8667105 - Flags: review?(ajones)
Attachment #8667105 - Flags: review+
Comment on attachment 8667108 [details] [diff] [review]
Run decoder tasks in dedicate looper thread.

Review of attachment 8667108 [details] [diff] [review]:
-----------------------------------------------------------------

This design looks good to me, just have one question and one suggestion. 
Is it required to protect mWaitOutput with lock?

::: dom/media/platforms/gonk/GonkMediaDataDecoder.cpp
@@ +12,5 @@
>  
>  #include "mozilla/Logging.h"
>  #include <android/log.h>
>  #define GMDD_LOG(...) __android_log_print(ANDROID_LOG_DEBUG, "GonkMediaDataDecoder", __VA_ARGS__)
> +#define INPUT_TIMEOUT_US 0LL

Could you add some comments to explain what 0 stands for? no wait for wait forever?
Thanks!

@@ +140,5 @@
>  }
>  
>  void
>  GonkDecoderManager::onMessageReceived(const sp<AMessage> &aMessage)
>  {

This function becomes a little long to me :-). 
To increase readability, maybe we could put some codes in some functions, like ProcessFlush(), ProcessInput(), ProcessOutput, and etc.

@@ +161,5 @@
> +      } else {
> +        GMDD_LOG("input processed: error#%d", rv);
> +        mDecodeCallback->Error();
> +      }
> +

Remove line.
Comment on attachment 8667105 [details] [diff] [review]
Refactor: move common behaviors to base class

It makes sense for Jean-Yves to review this one.
Attachment #8667105 - Flags: review?(ajones) → review?(jyavenard)
Address review comments.
> Is it required to protect mWaitOutput with lock?
I don't think so. It's accessed only in the looper thread.
Attachment #8667108 - Attachment is obsolete: true
Attachment #8667108 - Flags: review?(bwu)
Attachment #8668220 - Flags: review+
Comment on attachment 8668220 [details] [diff] [review]
Part 2 - run decoder tasks in dedicate looper thread.

Set review request. :-$
Attachment #8668220 - Flags: review+ → review?(bwu)
Comment on attachment 8668220 [details] [diff] [review]
Part 2 - run decoder tasks in dedicate looper thread.

Review of attachment 8668220 [details] [diff] [review]:
-----------------------------------------------------------------

Awesome! It looks good to me. 
This should fix many performance problems.
Attachment #8668220 - Flags: review?(bwu) → review+
Carry r+ from bwu.
Attachment #8668220 - Attachment is obsolete: true
Attachment #8669277 - Flags: review+
Comment on attachment 8667105 [details] [diff] [review]
Refactor: move common behaviors to base class

Review of attachment 8667105 [details] [diff] [review]:
-----------------------------------------------------------------

r+ = me with comments attached.

For the thread-safety issue, please fix it in a follow up patch (preferred) or another new bug.

Thanks.

::: dom/media/platforms/gonk/GonkMediaDataDecoder.cpp
@@ +64,5 @@
> +                           data->Size(),
> +                           data->mTime,
> +                           0);
> +    }
> +    if (rv == OK) {

if (NS_SUCCEEDED(rv))

@@ +65,5 @@
> +                           data->mTime,
> +                           0);
> +    }
> +    if (rv == OK) {
> +      mQueueSample.RemoveElementAt(0);

There seem to be a potential race here should Flush() be called during the mDecoder->Input() . The mutex is released during that time. Should Flush() be called then, mQueueSample will be clear. You then attempt to remove the first element -> boom.

So you should test that the queue hasn't been cleared while not holding the lock and handle that case appropriately.

Having said that, you only ever queue one sample at a time as as soon as it's added you remove it. So do you even need a queue of samples here? Or is mDecoder->Input() taking too long a time to require a queue and make GonkDecoderMananger::Input reentrant?

I realise that this is the same code as there was before, just moved ; but better handle the problem now or in a followup bug/patch

@@ +82,5 @@
> +nsresult
> +GonkDecoderManager::Flush()
> +{
> +  if (mDecoder == nullptr) {
> +    GMDD_LOG("Decoder is not inited");

Decoder isn't initialized

::: dom/media/platforms/gonk/GonkMediaDataDecoder.h
@@ +29,1 @@
>    // Add samples into OMX decoder or queue them if decoder is out of input buffer.

In the code, you check that this pointer isn't nullptr or otherwise assume EOS, this should be mentioned here.

@@ +61,5 @@
> +    , mLastDecodedTime(0)
> +    , mDecodeCallback(nullptr)
> +  {}
> +
> +  bool InitLoopers(const char* aName);

using an enum would be more elegant IMHO, and so it's up to InitLoopers to set the type with ALooper rather than each client

@@ +63,5 @@
> +  {}
> +
> +  bool InitLoopers(const char* aName);
> +
> +  virtual void onMessageReceived(const android::sp<android::AMessage> &aMessage) override;

virtual is redundant

@@ +79,5 @@
>  
>    MozPromiseHolder<InitPromise> mInitPromise;
>  
> +  Mutex mMutex; // Protects mQueueSample.
> +  // An queue that stores the samples which are waiting to be sent into mDecoder.

A queue that stores the samples waiting to be sent to mDecoder

As it can contain more than one sample, and those are queued, mQueuedSamples seems more logical

@@ +84,5 @@
> +  // Empty element means EOS and there shouldn't be any sample be queued after it.
> +  // Samples are queued in caller's thread and dequeued in mTaskLooper.
> +  nsTArray<nsRefPtr<MediaRawData>> mQueueSample;
> +
> +  int64_t mLastDecodedTime;  // The last decoded frame presentation time.

the name is confusing between presentation time and decode time. mLastTime would be less ambivalent.
Attachment #8667105 - Flags: review?(jyavenard) → review+
Address review comments and carry r+.
Attachment #8667105 - Attachment is obsolete: true
Attachment #8670154 - Flags: review+
Rebase on part 1. Also remove some unused data members.
Attachment #8669277 - Attachment is obsolete: true
Attachment #8670156 - Flags: review+
(In reply to Jean-Yves Avenard [:jya] from comment #31)
> Comment on attachment 8667105 [details] [diff] [review]
> Refactor: move common behaviors to base class
> 
> Review of attachment 8667105 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> r+ = me with comments attached.
> 
> For the thread-safety issue, please fix it in a follow up patch (preferred)
> or another new bug.
> 
> Thanks.

 Thanks for the review. The thread-safety issue is fixed in the part 2 patch, which doesn't unlock during Input(). (Not needed anymore because with 0 timeout value, it now returns immediately).

> 
> ::: dom/media/platforms/gonk/GonkMediaDataDecoder.cpp
> @@ +64,5 @@
> > +                           data->Size(),
> > +                           data->mTime,
> > +                           0);
> > +    }
> > +    if (rv == OK) {
> 
> if (NS_SUCCEEDED(rv))

 Don't think this applies here because rv is not a nsresult.

> 
> Having said that, you only ever queue one sample at a time as as soon as
> it's added you remove it. So do you even need a queue of samples here? Or is
> mDecoder->Input() taking too long a time to require a queue and make
> GonkDecoderMananger::Input reentrant?
> 

 There are chances that Input() will be failed because decoder is out of input buffer (i.e., the else block below). When that happens the sample will stay in queue.


 The rest are fixed in updated patch.
Comment on attachment 8670154 [details] [diff] [review]
Part 1 - Refactor: move common behaviors to base class

Review of attachment 8670154 [details] [diff] [review]:
-----------------------------------------------------------------

I don't see any of my comments addressed other than the grammatical mistake on a comment :(

::: dom/media/platforms/gonk/GonkMediaDataDecoder.cpp
@@ +67,5 @@
> +                           data->mTime,
> +                           0);
> +    }
> +    if (rv == OK) {
> +      mQueuedSamples.RemoveElementAt(0);

where have you addressed my comment about possible race on mQueuedSamples?
ohh... I had started before you posted the update.
thanks for that.
Fix the bustage by uploading the correct patch. :-$
Attachment #8670156 - Attachment is obsolete: true
Attachment #8670639 - Flags: review+
Sorry about that. Please try again.
Keywords: checkin-needed
Failed to upload the correct patch. :(
Keywords: checkin-needed
The one that really fix bustage.
Attachment #8670639 - Attachment is obsolete: true
Attachment #8670642 - Flags: review+
Keywords: checkin-needed
Blocks: 1210232
You need to log in before you can comment on or make changes to this bug.