Closed Bug 1258922 Opened 5 years ago Closed 5 years ago

test_BufferingWait_mp4.html times out for audio promise is not resolved or rejected

Categories

(Core :: Audio/Video: Playback, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla48
Tracking Status
firefox47 --- fixed
firefox48 --- fixed
firefox49 --- fixed

People

(Reporter: jwwang, Assigned: jya)

References

Details

(Keywords: intermittent-failure)

Attachments

(4 files)

repro steps:
1. apply the patch to print debug logs: https://hg.mozilla.org/try/rev/ededc1260512
2. push to try with: "try: -b o -p win32 -u mochitest-media[Windows 7] -t none[Windows 7]"
3. retrigger the mda job for 10 or 15 times.

results:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=08df968a0bc8&selectedJob=18409668

3 timeouts in test_BufferingWait_mp4.html out of 16 runs.
Blocks: 1237806
https://treeherder.mozilla.org/logviewer.html#?job_id=18413987&repo=try

Here are some logs from https://hg.mozilla.org/try/file/ededc1260512/dom/media/MediaDecoderStateMachine.cpp#l753

08:29:00     INFO -  [MediaPlayback #2]: D/MediaDecoder Decoder=f751000 OnNotDecoded (aType=1, aReason=0, t=3981405, ae=4226030, ve=4099999, mAudioStatus=pending, mVideoStatus=pending)

aType=1 means video
aReason=0 means END_OF_STREAM
mAudioStatus=pending means MDSM is waiting for the audio promise to be resolved or rejected.

It looks like MDSM never receives END_OF_STREAM for audio and playback never reaches the end.
Flags: needinfo?(jyavenard)
Logs of the green try:
http://archive.mozilla.org/pub/firefox/try-builds/jwwang@mozilla.com-08df968a0bc8e73c10c2fe5ee08c9d629121b61a/try-win32/try_win7-all_test-mochitest-media-bm119-tests1-windows-build83.txt.gz

07:58:01     INFO -  [MediaPlayback #4]: D/MediaDecoder Decoder=1c6cb300 OnNotDecoded (aType=0, aReason=0, t=6353560, ae=8359182, ve=4099999, mAudioStatus=pending, mVideoStatus=idle)

It is interesting to note that when END_OF_STREAM is received for audio, the audio end time is 8359182 which doesn't make sense for the end time should be around 4s.
(In reply to JW Wang [:jwwang] from comment #1)
> It looks like MDSM never receives END_OF_STREAM for audio and playback never
> reaches the end.

sounds like bug 1233639 except for audio.
Flags: needinfo?(jyavenard)
jwwang, since this happen quite often now can you fix this test failure ?
Flags: needinfo?(jwwang)
Priority: -- → P2
Hi Jya,
Do you have idea about the cause?
Flags: needinfo?(jwwang) → needinfo?(jyavenard)
not really. in bug 1233639 the problem occurs when the MDSM fails to request new data after the MediaFormatReader returns the last sample.

The MFR won't know that it hit EOS until then.

It's important that the MDSM keeps requesting new data until it sees EOS.

comment 1 however indicates that the MDSM is still waiting for the MFR to resolve a promise. I can't imagine how this could happen unless the WMF decoder didn't output that last frame.

kamidphish mentioned seeing something similar with the WMF decoder, where one frame was never returned.
Flags: needinfo?(jyavenard)
I will try to find a workaround to alleviate the test timeout before we have a right fix.
(In reply to JW Wang [:jwwang] from comment #2)
> Logs of the green try:
> http://archive.mozilla.org/pub/firefox/try-builds/jwwang@mozilla.com-
> 08df968a0bc8e73c10c2fe5ee08c9d629121b61a/try-win32/try_win7-all_test-
> mochitest-media-bm119-tests1-windows-build83.txt.gz
> 
> 07:58:01     INFO -  [MediaPlayback #4]: D/MediaDecoder Decoder=1c6cb300
> OnNotDecoded (aType=0, aReason=0, t=6353560, ae=8359182, ve=4099999,
> mAudioStatus=pending, mVideoStatus=idle)
> 
> It is interesting to note that when END_OF_STREAM is received for audio, the
> audio end time is 8359182 which doesn't make sense for the end time should
> be around 4s.

one of the segment has data from 0 to 0.8359182, sounds like a weird coincidence that the time returned be so similar
This seems to have spiked in frequency around April 1 or 2.  Any ideas why this is suddenly so much worse?
Flags: needinfo?(jyavenard)
Flags: needinfo?(jwwang)
I'm going to have a shot at it today...
Assignee: nobody → jyavenard
Flags: needinfo?(jyavenard)
I've been unable to reproduce the problem locally on either mac, Windows 10 VM 64 bits, on Windows 7 32 bits.

Had it running with --run-until-failure --repeat 1000 on each machines.
Attachment #8743765 - Flags: review?(jyavenard)
Comment on attachment 8743765 [details]
MozReview Request: Bug 1258922: [MSE] P2. Do not go over gap when attempting to find the next key frame. r?gerald

https://reviewboard.mozilla.org/r/48023/#review44815
Attachment #8743765 - Flags: review?(jyavenard) → review+
Can we keep the test still running on Linux which never fails this test?
https://hg.mozilla.org/mozilla-central/rev/6a407cbfffb0
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla48
I will re-add the test.

I believe the intermittent issue will be fixed by P1 in bug 1264199
Resolution: FIXED → DUPLICATE
Duplicate of bug: 1264199
sorry, that should read P9 in the comment above.
Unfortunately, it has happened again in a win7 PGO mode.

https://treeherder.mozilla.org/logviewer.html#?job_id=26625248&repo=mozilla-inbound

The new debugging logs are interesting:

 02:38:17     INFO -  [392] WARNING: Decoder=187e57b0 metadata: channels=2 rate=22050 hasAudio=1 hasVideo=1, state: mPlayState=PLAY_STATE_PLAYING mIsDormant=0, mShuttingDown=0:
 02:38:17     INFO -  [392] WARNING: Decoder=187e57b0 reader data:
 02:38:17     INFO -  audio decoder: wmf audio decoder
 02:38:17     INFO -  audio frames decoded: 91
 02:38:17     INFO -  video decoder: D3D9 Hardware Decoder
 02:38:17     INFO -  hardware video decoding: enabled
 02:38:17     INFO -  video frames decoded: 167 (skipped:0)
 02:38:17     INFO -  Dumping data for demuxer 976600:
 02:38:17     INFO -  	Dumping Audio Track Buffer(audio/mp4a-latm): - mLastAudioTime: 0.046439
 02:38:17     INFO -  		NumSamples:87 Size:17834 NextGetSampleIndex:1 NextInsertionIndex:69
 02:38:17     INFO -  		Buffered: ranges=[(0.000000, 4.100000)]
 02:38:17     INFO -  	Dumping Video Track Buffer(video/avc) - mLastVideoTime: 4.100000
 02:38:17     INFO -  		NumSamples:120 Size:129101 NextGetSampleIndex:120 NextInsertionIndex:96
 02:38:17     INFO -  		Buffered: ranges=[(0.095000, 4.100000)]

for the audio track, NextGetSampleIndex is 1, mLastAudioTime is 0.046439 ; this indicates that the MFR has only requested a single sample from the audio source buffer.

The only way it could get there is either:
1- The MediaSourceDemuxer didn't resolve its promise somehow, and the MFR is still waiting for it.
2- The MFR has decoded a sample, but hasn't started demuxing the 2nd one. This is also surprising as with the decode ahead machinery, as soon as a frame is decoded, it will start decoding another, always trying to keep two samples decoded in advance
3- The Windows WMF audio decoder didn't output a sample and didn't notify that it needed more data.

3, may not even matter however as IIRC, the MFR will continue to feed the decoder regardless of the NeedInput flag or the decoder outputting system, so long as it has fed the decoder >= 2 frames.

so issue is either 1 or 2 as far as I can tell.
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
Another case where only a single frame of audio was demuxed:
 17:54:06     INFO -  [1192] WARNING: Decoder=1b61c260 metadata: channels=2 rate=22050 hasAudio=1 hasVideo=1, state: mPlayState=PLAY_STATE_PLAYING mIsDormant=0, mShuttingDown=0:
 17:54:06     INFO -  [1192] WARNING: Decoder=1b61c260 reader data:
 17:54:06     INFO -  audio decoder: wmf audio decoder
 17:54:06     INFO -  audio frames decoded: 91
 17:54:06     INFO -  video decoder: D3D9 Hardware Decoder
 17:54:06     INFO -  hardware video decoding: enabled
 17:54:06     INFO -  video frames decoded: 167 (skipped:0)
 17:54:06     INFO -  Dumping data for demuxer 776600:
 17:54:06     INFO -  	Dumping Audio Track Buffer(audio/mp4a-latm): - mLastAudioTime: 0.046439
 17:54:06     INFO -  		NumSamples:87 Size:17834 NextGetSampleIndex:1 NextInsertionIndex:69
 17:54:06     INFO -  		Buffered: ranges=[(0.000000, 4.100000)]
 17:54:06     INFO -  	Dumping Video Track Buffer(video/avc) - mLastVideoTime: 4.100000
 17:54:06     INFO -  		NumSamples:120 Size:129101 NextGetSampleIndex:120 NextInsertionIndex:96
 17:54:06     INFO -  		Buffered: ranges=[(0.095000, 4.100000)]

It would be good to amend the logs to show if there are pending promises: the MDSM Request*Data and the MFR either the demuxer or the MediaDataDecoder ..

need to find out where the MediaFormatReader is stuck and hasn't demuxed more than one frame.
[392] WARNING: Decoder=187e57b0 GetMediaTime=4218730 GetClock=4218730 mState=DECODING mPlayState=3 mDecodingFirstFrame=0 IsPlaying=1 mAudioStatus=pending mVideoStatus=idle mDecodedAudioEndTime=4226030 mDecodedVideoEndTime=4099999 mIsAudioPrerolling=0 mIsVideoPrerolling=0: 

We already show the status for MDSM Request*Data promises. It should be easy to add logs for MFR.
Flags: needinfo?(jwwang)
https://treeherder.mozilla.org/#/jobs?repo=try&revision=7f3428ad82ae16ccdec05124dddeb1fb883bd008

I will monitor this one closely. It adds full output of the MFR state that is relevant. 

(note: Code isn't thread-safe but really we don't care as obviously it's called when we're stuck anyway and making this code thread-safe is not advised as the MediaFormatReader was designed to never run on multiple threads at once.)

Feel free to restart the test if all completed successfully, test shouldn't run very long, I've removed all mochitests but 3 MSE ones.
(In reply to Jean-Yves Avenard [:jya] from comment #49)
> (note: Code isn't thread-safe but really we don't care as obviously it's
> called when we're stuck anyway and making this code thread-safe is not
> advised as the MediaFormatReader was designed to never run on multiple
> threads at once.)
I am worried you might read wrong/staled values from an inconsistent memory view due to data race.
(In reply to JW Wang [:jwwang] from comment #50)
> I am worried you might read wrong/staled values from an inconsistent memory
> view due to data race.

it's very doubtful.

The bug above can only happen if the MFR is now idle and hasn't processed the next task, either because it's waiting on a pending promise, a task failed to be dispatch etc...

By the time the test timeout, the MFR must be idled.

If we do intend to have this committed, seeing that this is just for debugging purposes it will be fine. The cost to make access to those members thread-safe is just too high
I think bug 1268868 gave me a way to reproduce the problem..

looking at it, it's a wonder this test ever worked that well...
my bad, totally different issue.

had a timeout in the test again, this time:
 09:28:29     INFO -  4 INFO TEST-UNEXPECTED-FAIL | dom/media/mediasource/test/test_BufferingWait_mp4.html | uncaught exception - NotSupportedError: Operation is not supported at @http://mochi.test:8888/tests/dom/media/mediasource/test/test_BufferingWait_mp4.html:20:14
 09:28:29     INFO -  EventListener.handleEvent*@http://mochi.test:8888/tests/dom/media/mediasource/test/test_BufferingWait_mp4.html:16:3
 09:28:29     INFO -  bootstrapTest@http://mochi.test:8888/tests/dom/media/mediasource/test/mediasource.js:16:5
 09:28:29     INFO -  Async*runWithMSE/<@http://mochi.test:8888/tests/dom/media/mediasource/test/mediasource.js:20:5
 09:28:29     INFO -  SimpleTest._newCallStack/rval@http://mochi.test:8888/tests/SimpleTest/SimpleTest.js:144:17
 09:28:29     INFO -  EventHandlerNonNull*this.addLoadEvent@http://mochi.test:8888/tests/SimpleTest/SimpleTest.js:169:13
 09:28:29     INFO -  @http://mochi.test:8888/tests/SimpleTest/SimpleTest.js:1323:5
 09:28:29     INFO -      simpletestOnerror@SimpleTest/SimpleTest.js:1563:11
 09:28:29     INFO -      OnErrorEventHandlerNonNull*@SimpleTest/SimpleTest.js:1543:1
 09:28:29     INFO -  JavaScript error: http://mochi.test:8888/tests/dom/media/mediasource/test/test_BufferingWait_mp4.html, line 20: NotSupportedError: Operation is not supported
 09:28:29     INFO -  MEMORY STAT | vsize 774MB | vsizeMaxContiguous 4186512MB | residentFast 293MB | heapAllocated 101MB
 09:28:29     INFO -  5 INFO TEST-OK | dom/media/mediasource/test/test_BufferingWait_mp4.html | took 1649ms
 09:28:29     INFO -  6 INFO TEST-START | dom/media/mediasource/test/test_WaitingOnMissingData_mp4.html
 09:28:36     INFO -  JavaScript error: , line 0: NotSupportedError: Operation is not supported
 09:33:58     INFO -  Not taking screenshot here: see the one that was previously logged
 09:33:58     INFO -  7 INFO TEST-PASS | dom/media/mediasource/test/test_WaitingOnMissingData_mp4.html | Receive a sourceopen event
 09:33:58     INFO -  8 INFO TEST-UNEXPECTED-FAIL | dom/media/mediasource/test/test_WaitingOnMissingData_mp4.html | Test timed out.

the error occurs at var sb = ms.addSourceBuffer("video/mp4");

This can only fail if the machine doesnt have a h264 decoder. It looks like machine https://secure.pub.build.mozilla.org/builddata/reports/slave_health/slave.html?name=t-w864-ix-120 doesn't have Windows Media Expension pack installed.
This one also doesn't make much sense:
 08:26:26     INFO -  23 INFO SourceBuffer buffered ranges grew from TimeRanges: [0.095, 1.625396)[3.298333, 4.040272) to TimeRanges: [0.095, 2.414875)[3.298333, 4.040272)
 08:26:26     INFO -  24 INFO Got a waiting event at 2.46678
 08:26:26     INFO -  25 INFO TEST-PASS | dom/media/mediasource/test/test_BufferingWait_mp4.html | Reached target time of: 2.36843
 08:26:26     INFO -  26 INFO TEST-PASS | dom/media/mediasource/test/test_BufferingWait_mp4.html | Should not spend an inordinate amount of time buffering: 0.837
 08:26:26     INFO -  27 INFO TEST-PASS | dom/media/mediasource/test/test_BufferingWait_mp4.html | fetchWithXHR load uri='bipbop/bipbop4.m4s' status=200
 08:26:26     INFO -  28 INFO Loading buffer: [0, 23112)
 08:26:26     INFO -  29 INFO SourceBuffer buffered ranges grew from TimeRanges: [0.095, 2.414875)[3.298333, 4.040272) to TimeRanges: [0.095, 4.040272)
 08:26:26     INFO -  30 INFO TEST-UNEXPECTED-FAIL | dom/media/mediasource/test/test_BufferingWait_mp4.html | Test timed out.

The waiting event was fired *past* the end of the buffered data (2.46s vs 2.41s).

The details being:
08:26:27     INFO -  video frames decoded: 168 (skipped:0)
08:26:27     INFO -  MFR: audio: ni=0 no=0 ie=1 demuxr:0 demuxq:0 decoder:0 tt:-1.000000 in:86 out:91 qs=4294967295 pending:0 waiting:1 sid:0
08:26:27     INFO -  MFR: video: ni=0 no=0 ie=1 demuxr:0 demuxq:0 decoder:0 tt:-1.000000 in:168 out:168 qs=48 pending:0 waiting:0 sid:0
08:26:27     INFO -  Dumping data for demuxer 6f22a2ec00:
08:26:27     INFO -  	Dumping Audio Track Buffer(audio/mp4a-latm): - mLastAudioTime: 0.000000
08:26:27     INFO -  		NumSamples:87 Size:22706 NextGetSampleIndex:0 NextInsertionIndex:69
08:26:27     INFO -  		Buffered: ranges=[(0.000000, 4.100000)]
08:26:27     INFO -  	Dumping Video Track Buffer(video/avc) - mLastVideoTime: 4.100000
08:26:27     INFO -  		NumSamples:120 Size:135821 NextGetSampleIndex:120 NextInsertionIndex:96
08:26:27     INFO -  		Buffered: ranges=[(0.095000, 4.100000)]
08:26:27     INFO -  :
08:26:27     INFO -  MEMORY STAT | vsize 757MB | vsizeMaxContiguous 4281433MB | residentFast 69MB | heapAllocated 64MB
08:26:27     INFO -  [2920] WARNING: Decoder=6f1b179080 GetMediaTime=4218730 GetClock=4218730 mState=DECODING mPlayState=3 mDecodingFirstFrame=0 IsPlaying=1 mAudioStatus=waiting mVideoStatus=idle mDecodedAudioEndTime=4226030 mDecodedVideoEndTime=4099999 mIsAudioPrerolling=0 mIsVideoPrerolling=0:

the MFR shows "in:86 out:91" so it had received more frames from the WMF audio decoder than it asked to decode, this cause the queue size number (which is unsigned) to overflow, and no decoding will ever be requested again.

However, something weird is that the MDSM is waiting for an audio promise to be resolved, while the log "MFR: audio: ni=0 no=0" indicates that it has no audio MediaDataPromise to resolve, so something is amiss there.

Going to investigate for a start how the WMF audio decoder managed to return more frames than it was fed...
See Also: → 1269177
See Also: → 1269178
Depends on: 1269230
Attachment #8743765 - Attachment is obsolete: true
Review commit: https://reviewboard.mozilla.org/r/50707/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/50707/
Attachment #8743765 - Attachment description: MozReview Request: Bug 1258922: [MSE] Disable noisy test. r=test-only → MozReview Request: Bug 1258922: [MSE] P1. Do not go over gap when attempting to find the next key frame. r?gerald
Attachment #8743765 - Attachment is obsolete: false
Attachment #8749037 - Flags: review?(gsquelart)
Attachment #8749038 - Flags: review?(gsquelart)
Attachment #8743765 - Flags: review?(gsquelart)
data promise is only resolved once the decoder has been drained. It was possible for a promise to never be resolved if skipping to the next key frame failed.

Review commit: https://reviewboard.mozilla.org/r/50709/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/50709/
Comment on attachment 8743765 [details]
MozReview Request: Bug 1258922: [MSE] P2. Do not go over gap when attempting to find the next key frame. r?gerald

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/48023/diff/1-2/
Attachment #8743765 - Flags: review+
It was possible for the variable to be used uninitialized.

Review commit: https://reviewboard.mozilla.org/r/50733/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/50733/
Attachment #8743765 - Attachment description: MozReview Request: Bug 1258922: [MSE] P1. Do not go over gap when attempting to find the next key frame. r?gerald → MozReview Request: Bug 1258922: [MSE] P2. Do not go over gap when attempting to find the next key frame. r?gerald
Attachment #8749037 - Attachment description: MozReview Request: Bug 1258922: [MSE] P2. Check that the data we are attempting to skip to is buffered. r?gerald → MozReview Request: Bug 1258922: [MSE] P3. Check that the data we are attempting to skip to is buffered. r?gerald
Attachment #8749038 - Attachment description: MozReview Request: Bug 1258922: [MSE] P3. Set draining flag to true when skip to next keyframe failed. r?gerald → MozReview Request: Bug 1258922: [MSE] P4. Set draining flag to true when skip to next keyframe failed. r?gerald
Attachment #8749057 - Flags: review?(gsquelart)
Comment on attachment 8743765 [details]
MozReview Request: Bug 1258922: [MSE] P2. Do not go over gap when attempting to find the next key frame. r?gerald

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/48023/diff/2-3/
Comment on attachment 8749037 [details]
MozReview Request: Bug 1258922: [MSE] P3. Check that the data we are attempting to skip to is buffered. r?gerald

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/50707/diff/1-2/
Comment on attachment 8749038 [details]
MozReview Request: Bug 1258922: [MSE] P4. Set draining flag to true when skip to next keyframe failed. r?gerald

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/50709/diff/1-2/
Attachment #8749057 - Flags: review?(gsquelart) → review+
Comment on attachment 8749057 [details]
MozReview Request: Bug 1258922: [MSE] P1. Initialise variable. r?gerald

https://reviewboard.mozilla.org/r/50733/#review47421
Comment on attachment 8743765 [details]
MozReview Request: Bug 1258922: [MSE] P2. Do not go over gap when attempting to find the next key frame. r?gerald

https://reviewboard.mozilla.org/r/48023/#review47423
Attachment #8743765 - Flags: review?(gsquelart) → review+
Comment on attachment 8749037 [details]
MozReview Request: Bug 1258922: [MSE] P3. Check that the data we are attempting to skip to is buffered. r?gerald

https://reviewboard.mozilla.org/r/50707/#review47425

r+, once you pick one of the two proposed solutions to the issue. (Otherwise send it back for review.)

::: dom/media/mediasource/MediaSourceDemuxer.cpp:465
(Diff revision 2)
> +  uint32_t parsed = 0;
> +  // Ensure that the data we are about to skip to is still available.
> +  TimeIntervals buffered = mManager->Buffered(mType);
> +  buffered.SetFuzz(MediaSourceDemuxer::EOS_FUZZ);
> +  if (buffered.Contains(aTimeThreadshold)) {
> -  bool found;
> +    bool found;
> -  uint32_t parsed =
> +    uint32_t parsed =
> -    mManager->SkipToNextRandomAccessPoint(mType, aTimeThreadshold, found);
> +      mManager->SkipToNextRandomAccessPoint(mType, aTimeThreadshold, found);
> -  if (found) {
> +    if (found) {
> -    return SkipAccessPointPromise::CreateAndResolve(parsed, __func__);
> +      return SkipAccessPointPromise::CreateAndResolve(parsed, __func__);
> -  }
> +    }
> +  }
>    SkipFailureHolder holder(
>      mManager->IsEnded() ? DemuxerFailureReason::END_OF_STREAM :
>                            DemuxerFailureReason::WAITING_FOR_DATA, parsed);

The doubling of 'parsed' is a bit confusing.
I'm guessing the 2nd declaration at line 471 should just be an assignment instead, so it is used after the if-block, at line 479.
Otherwise if the inner 'parsed' should not exit the if-block, I don't think you actually need an outer 'parsed', as it would still be 0 at line 479.
Attachment #8749037 - Flags: review?(gsquelart) → review+
Attachment #8749038 - Flags: review?(gsquelart) → review+
Comment on attachment 8749038 [details]
MozReview Request: Bug 1258922: [MSE] P4. Set draining flag to true when skip to next keyframe failed. r?gerald

https://reviewboard.mozilla.org/r/50709/#review47427
JW: here is an interesting one:
https://archive.mozilla.org/pub/firefox/try-builds/jyavenard@mozilla.com-6348dd7beef5524eb20fc339e6337b8ac0ccdbdc/try-win32/try_win7_ix_test-mochitest-media-e10s-bm110-tests1-windows-build70.txt.gz

dom/media/mediasource/test/test_DrainOnMissingData_mp4.html failed because the "waiting" event wasn't received.

in the log you see that:
22:14:58     INFO -  [MediaPlayback #3]: D/MediaFormatReader MediaFormatReader(6c9c000)::Update: Rejecting Video promise: WAITING_FOR_DATA

at the time, the last video frame returned was 0.335000 which is exactly the last video frame in the source buffer.

The log shows that the MDSM is waiting on the waiting promise to be resolved.

The question now is: why wasn't the "waiting" event fired ?

or is the mochitest wrong?
https://treeherder.mozilla.org/logviewer.html#?job_id=20390057&repo=try#L2671

in all the tests failing, the issue is now that the "waiting" event wasn't fired. Instead "stalled" was fired.

I haven't been able to reproduce the problem where the promise wasn't resolved anymore. so at least there's progress
Flags: needinfo?(jwwang)
23:21:31 INFO - [3628] WARNING: Decoder=118318c0 GetMediaTime=369989 GetClock=315950385 mState=DECODING mPlayState=3 mDecodingFirstFrame=0 IsPlaying=1 mAudioStatus=idle mVideoStatus=waiting mDecodedAudioEndTime=0 mDecodedVideoEndTime=369989 mIsAudioPrerolling=0 mIsVideoPrerolling=0:

mVideoStatus=waiting: waiting promise is not resolved

I guess there is something wrong in MDSM which fails to enter buffering mode (to trigger 'waiting' event) even when mVideoStatus=waiting. I will look into that.
Flags: needinfo?(jwwang)
Comment on attachment 8749057 [details]
MozReview Request: Bug 1258922: [MSE] P1. Initialise variable. r?gerald

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/50733/diff/1-2/
Comment on attachment 8743765 [details]
MozReview Request: Bug 1258922: [MSE] P2. Do not go over gap when attempting to find the next key frame. r?gerald

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/48023/diff/3-4/
Comment on attachment 8749037 [details]
MozReview Request: Bug 1258922: [MSE] P3. Check that the data we are attempting to skip to is buffered. r?gerald

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/50707/diff/2-3/
Comment on attachment 8749038 [details]
MozReview Request: Bug 1258922: [MSE] P4. Set draining flag to true when skip to next keyframe failed. r?gerald

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/50709/diff/2-3/
Comment on attachment 8749057 [details]
MozReview Request: Bug 1258922: [MSE] P1. Initialise variable. r?gerald

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/50733/diff/2-3/
Comment on attachment 8743765 [details]
MozReview Request: Bug 1258922: [MSE] P2. Do not go over gap when attempting to find the next key frame. r?gerald

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/48023/diff/4-5/
Comment on attachment 8749037 [details]
MozReview Request: Bug 1258922: [MSE] P3. Check that the data we are attempting to skip to is buffered. r?gerald

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/50707/diff/3-4/
Comment on attachment 8749038 [details]
MozReview Request: Bug 1258922: [MSE] P4. Set draining flag to true when skip to next keyframe failed. r?gerald

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/50709/diff/3-4/
Comment on attachment 8749057 [details]
MozReview Request: Bug 1258922: [MSE] P1. Initialise variable. r?gerald

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/50733/diff/3-4/
Comment on attachment 8743765 [details]
MozReview Request: Bug 1258922: [MSE] P2. Do not go over gap when attempting to find the next key frame. r?gerald

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/48023/diff/5-6/
Comment on attachment 8749037 [details]
MozReview Request: Bug 1258922: [MSE] P3. Check that the data we are attempting to skip to is buffered. r?gerald

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/50707/diff/4-5/
Comment on attachment 8749038 [details]
MozReview Request: Bug 1258922: [MSE] P4. Set draining flag to true when skip to next keyframe failed. r?gerald

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/50709/diff/4-5/
Comment on attachment 8743765 [details]
MozReview Request: Bug 1258922: [MSE] P2. Do not go over gap when attempting to find the next key frame. r?gerald

Approval Request Comment
[Feature/regressing bug #]: 1258922
[User impact if declined]: Intermittent stalls (including with Youtube), more prevalent on slower machines.
[Describe test coverage new/current, TreeHerder]: Lots of mochitest and w3c platform test. Manual testing
[Risks and why]: Low, this mostly affects slow machines which also have less than optimal network connection. Though it was also happening rather regularly on try run.
[String/UUID change made/needed]: None
Attachment #8743765 - Flags: approval-mozilla-beta?
Attachment #8743765 - Flags: approval-mozilla-aurora?
Request is for all 4 patches.
Comment on attachment 8743765 [details]
MozReview Request: Bug 1258922: [MSE] P2. Do not go over gap when attempting to find the next key frame. r?gerald

Let's uplift to Aurora48. I would prefer to let this bake in Aurora before uplifting to Beta.
Attachment #8743765 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment on attachment 8743765 [details]
MozReview Request: Bug 1258922: [MSE] P2. Do not go over gap when attempting to find the next key frame. r?gerald

This has baked on Aurora/Nightly for a bit, let's uplift to Beta47.
Attachment #8743765 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Blocks: 1272916
Depends on: 1305826
No longer depends on: 1305826
You need to log in before you can comment on or make changes to this bug.