Closed Bug 1128069 Opened 9 years ago Closed 8 years ago

mediasource-play-then-seek-back.html intermittently failing

Categories

(Core :: Audio/Video: Playback, defect, P5)

x86
macOS
defect

Tracking

()

RESOLVED FIXED
mozilla51
Tracking Status
firefox48 --- wontfix
firefox49 --- fixed
firefox50 --- fixed
firefox51 --- fixed

People

(Reporter: bholley, Assigned: jya)

References

(Blocks 1 open bug, )

Details

Attachments

(10 files)

This test was marked as timing out, and I made some changes that fixed the timeout, and now it fails/passes intermittently. I don't want to block my work on this, so I'm disabling the test in bug 1127203.
Blocks: 1066467
Component: Audio/Video → Audio/Video: Playback
The test still appears to fail intermittently, the failure is always on assert_true(mediaElement.seeking, 'mediaElement is seeking');

for some reasons, sometimes the seeking attribute is false, even after we just started to seek.

I can reproduce it regularly on my debug build.

JW got an idea on what could be happening there?
Flags: needinfo?(jwwang)
Summary: re-enable mediasource-play-then-seek-back.html → mediasource-play-then-seek-back.html intermittently failing
Error is:
assert_true: mediaElement is seeking expected true got false

finishSeekThenPlay@http://localhost/webref/media-source/mediasource-play-then-seek-back.html:33:21
EventExpectationsManager.prototype.handleWaitCallback_@http://localhost/webref/media-source/mediasource-util.js:115:9
Test.prototype.step@http://localhost/resources/testharness.js:1403:20
Test.prototype.step_func/<@http://localhost/resources/testharness.js:1427:20
setTimeout handler*EventExpectationsManager.prototype.expectEvent/eventHandler<@http://localhost/webref/media-source/mediasource-util.js:89:17
Test.prototype.step@http://localhost/resources/testharness.js:1403:20
Test.prototype.step_func/<@http://localhost/resources/testharness.js:1427:20
EventListener.handleEvent*EventExpectationsManager.prototype.expectEvent@http://localhost/webref/media-source/mediasource-util.js:91:9
addExtraTestMethods/test.expectEvent@http://localhost/webref/media-source/mediasource-util.js:314:13
delayedPlayHandler@http://localhost/webref/media-source/mediasource-play-then-seek-back.html:43:21
addExtraTestMethods/test.waitForCurrentTimeChange/onTimeUpdate<@http://localhost/webref/media-source/mediasource-util.js:330:21
Test.prototype.step@http://localhost/resources/testharness.js:1403:20
Test.prototype.step_func/<@http://localhost/resources/testharness.js:1427:20
EventListener.handleEvent*addExtraTestMethods/test.waitForCurrentTimeChange@http://localhost/webref/media-source/mediasource-util.js:334:13
@http://localhost/webref/media-source/mediasource-play-then-seek-back.html:51:21
EventExpectationsManager.prototype.handleWaitCallback_@http://localhost/webref/media-source/mediasource-util.js:115:9
Test.prototype.step@http://localhost/resources/testharness.js:1403:20
Test.prototype.step_func/<@http://localhost/resources/testharness.js:1427:20
setTimeout handler*EventExpectationsManager.prototype.expectEvent/eventHandler<@http://localhost/webref/media-source/mediasource-util.js:89:17
Test.prototype.step@http://localhost/resources/testharness.js:1403:20
Test.prototype.step_func/<@http://localhost/resources/testharness.js:1427:20
EventListener.handleEvent*EventExpectationsManager.prototype.expectEvent@http://localhost/webref/media-source/mediasource-util.js:91:9
addExtraTestMethods/test.expectEvent@http://localhost/webref/media-source/mediasource-util.js:314:13
@http://localhost/webref/media-source/mediasource-play-then-seek-back.html:19:17
@http://localhost/webref/media-source/mediasource-util.js:396:17
loadData_/request.onload<@http://localhost/webref/media-source/mediasource-util.js:148:13
Test.prototype.step@http://localhost/resources/testharness.js:1403:20
Test.prototype.step_func/<@http://localhost/resources/testharness.js:1427:20
EventHandlerNonNull*loadData_@http://localhost/webref/media-source/mediasource-util.js:138:26
MediaSourceUtil.loadBinaryData@http://localhost/webref/media-source/mediasource-util.js:183:9
@http://localhost/webref/media-source/mediasource-util.js:394:13
@http://localhost/webref/media-source/mediasource-util.js:375:17
onSourceOpen@http://localhost/webref/media-source/mediasource-util.js:167:13
Test.prototype.step@http://localhost/resources/testharness.js:1403:20
Test.prototype.step_func/<@http://localhost/resources/testharness.js:1427:20
EventListener.handleEvent*openMediaSource_@http://localhost/webref/media-source/mediasource-util.js:170:9
@http://localhost/webref/media-source/mediasource-util.js:373:13
@http://localhost/webref/media-source/mediasource-util.js:354:13
Test.prototype.step@http://localhost/resources/testharness.js:1403:20
async_test@http://localhost/resources/testharness.js:518:13
@http://localhost/webref/media-source/mediasource-util.js:351:16
@http://localhost/webref/media-source/mediasource-util.js:359:16
@http://localhost/webref/media-source/mediasource-util.js:382:9
@http://localhost/webref/media-source/mediasource-play-then-seek-back.html:13:13
The test is skipped on my computer by running `./mach web-platform-tests testing/web-platform/tests/media-source/mediasource-play-then-seek-back.html` locally. How to enable it?
Flags: needinfo?(jwwang)
I think this could be the issue:
https://dxr.mozilla.org/mozilla-central/source/dom/media/MediaDecoder.cpp#812

mLogicallySeeking is set to true, and right after it calls MediaDecoder::CallSeek() which then calls DiscardOngoingSeekIfExists, which will set mLogicallySeeking to false.

from that point on, IsSeeking will always return false, even if there's a seek pending.
hmm.. misread.. forget the above.
ok..

it's definitely the test that is wrong.
it waits for the "seeking" event to be fired before continuing, and only then check that the seeking attribute is true.

This is wrong. When the seeking event is received, it's entirely possible that seeking as completed as there's a stable state in between.
It is incorrect to assume that the seeking attribute will still be true once the seeking event handler gets to run.

Review commit: https://reviewboard.mozilla.org/r/69476/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/69476/
Attachment #8778102 - Flags: review?(jwwang)
Attachment #8778103 - Flags: review?(jwwang)
Attachment #8778104 - Flags: review?(gsquelart)
Attachment #8778105 - Flags: review?(jwwang)
Attachment #8778106 - Flags: review?(gsquelart)
See w3c/MSE Issue 19, 20 & 26.

    Changing the duration now can never call the range removal algorithm. An explicit call to remove must be used for range removal.
    This spec change performed the following:
    - Require remove() for all Range Removals

Review commit: https://reviewboard.mozilla.org/r/69480/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/69480/
With MediaSource, the duration is always known and exact. It is entirely possible that we have played data at some point, then removed that data and adjusted the duration.

Review commit: https://reviewboard.mozilla.org/r/69482/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/69482/
Comment on attachment 8778105 [details]
Bug 1128069: [MSE] P4. Do not adjust duration to what data we may have been seen in the past.

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/69482/diff/1-2/
Attachment #8778105 - Attachment description: Bug 1128069: [MSE] P4. Do not adjust duration to what data we may have seen in the past. → Bug 1128069: [MSE] P4. Do not adjust duration to what data we may have been seen in the past.
Comment on attachment 8778106 [details]
Bug 1128069: [MSE] P7. Adjust webref expected results.

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/69484/diff/1-2/
Comment on attachment 8778103 [details]
Bug 1128069: [MSE] P2. Change webref expectations now that the test is valid.

https://reviewboard.mozilla.org/r/69478/#review66630
Attachment #8778103 - Flags: review?(jwwang) → review+
Comment on attachment 8778105 [details]
Bug 1128069: [MSE] P4. Do not adjust duration to what data we may have been seen in the past.

https://reviewboard.mozilla.org/r/69482/#review66634

::: dom/media/MediaDecoderStateMachine.cpp:1112
(Diff revision 2)
>      duration = mInfo.mMetadataDuration.ref();
>    } else {
>      return;
>    }
>  
> -  if (duration < mObservedDuration.Ref()) {
> +  if (mExplicitDuration.Ref().isNothing() &&

Add a comment here so it is obvious why we prefer mExplicitDuration over mObservedDuration in the MSE case.
Attachment #8778105 - Flags: review?(jwwang) → review+
Comment on attachment 8778104 [details]
Bug 1128069: Update webref MSE tests according to updated spec.

https://reviewboard.mozilla.org/r/69480/#review66638

r+ assuming you address my concerns (if you agree with my interpretation).

::: testing/web-platform/tests/media-source/mediasource-seek-beyond-duration.html:66
(Diff revision 1)
> +                    sourceBuffer.remove(1.5, Infinity);
> +                    assert_true(sourceBuffer.updating, "updating");
> +                    test.expectEvent(sourceBuffer, "updateend");

It seems to me that the specs allow the remove() operation to complete as quickly as it wants, including synchronously during the function call.
If I'm correct, then you cannot assert that 'updating' will still be true after remove() returns.
Also you could miss the event if it is sent during the remove() call, so I think you should move expectEvent before remove (as was done in the function above).
Attachment #8778104 - Flags: review?(gsquelart) → review+
Attachment #8778106 - Flags: review?(gsquelart) → review+
Comment on attachment 8778106 [details]
Bug 1128069: [MSE] P7. Adjust webref expected results.

https://reviewboard.mozilla.org/r/69484/#review66642

r+, but could you please give more information in the commit description about why deleting the .ini file adjusts the test results.
(In reply to Gerald Squelart [:gerald] from comment #19)
> Comment on attachment 8778104 [details]
> Bug 1128069: [MSE] P3. Update webref MSE tests according to updated spec.
> 
> https://reviewboard.mozilla.org/r/69480/#review66638
> 
> r+ assuming you address my concerns (if you agree with my interpretation).
> 
> :::
> testing/web-platform/tests/media-source/mediasource-seek-beyond-duration.
> html:66
> (Diff revision 1)
> > +                    sourceBuffer.remove(1.5, Infinity);
> > +                    assert_true(sourceBuffer.updating, "updating");
> > +                    test.expectEvent(sourceBuffer, "updateend");
> 
> It seems to me that the specs allow the remove() operation to complete as
> quickly as it wants, including synchronously during the function call.

hmmm.. I don't believe so 
https://w3c.github.io/media-source/index.html#widl-SourceBuffer-remove-void-double-start-unrestricted-double-end
remove calls the range remove algorithm:
https://w3c.github.io/media-source/index.html#sourcebuffer-range-removal
Step 3: "Set the updating attribute to true

then from step 5, the remaining steps are to be run synchronously.

So updating will always be true when remove returns.

> Also you could miss the event if it is sent during the remove() call, so I
> think you should move expectEvent before remove (as was done in the function
> above).

no you can't miss it either. because while the event *may* be queued, it can't have already been fired as we're still within the current event loop.
So setting the listener right after remove is fine. I can set it up right before if it's less confusing.
(In reply to Gerald Squelart [:gerald] from comment #20)
> Comment on attachment 8778106 [details]
> Bug 1128069: [MSE] P5. Adjust webref expected results.
> 
> https://reviewboard.mozilla.org/r/69484/#review66642
> 
> r+, but could you please give more information in the commit description
> about why deleting the .ini file adjusts the test results.

hmmm that's how the platform-test are run... it's such a usual practice that I didn't think it deserve a dedicated comment
Comment on attachment 8778102 [details]
Bug 1128069: Correct invalid test.

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/69476/diff/1-2/
Comment on attachment 8778103 [details]
Bug 1128069: [MSE] P2. Change webref expectations now that the test is valid.

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/69478/diff/1-2/
Comment on attachment 8778104 [details]
Bug 1128069: Update webref MSE tests according to updated spec.

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/69480/diff/1-2/
Comment on attachment 8778105 [details]
Bug 1128069: [MSE] P4. Do not adjust duration to what data we may have been seen in the past.

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/69482/diff/2-3/
Comment on attachment 8778106 [details]
Bug 1128069: [MSE] P7. Adjust webref expected results.

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/69484/diff/2-3/
I believe the fix in P4 will likely fix issues seen in the past where the ended event wasn't fired.
(In reply to Jean-Yves Avenard [:jya] from comment #21)
> (In reply to Gerald Squelart [:gerald] from comment #19)
> > Comment on attachment 8778104 [details]
> > It seems to me that the specs allow the remove() operation to complete as
> > quickly as it wants, including synchronously during the function call.
> 
> hmmm.. I don't believe so 
> https://w3c.github.io/media-source/index.html#widl-SourceBuffer-remove-void-
> double-start-unrestricted-double-end
> remove calls the range remove algorithm:
> https://w3c.github.io/media-source/index.html#sourcebuffer-range-removal
> Step 3: "Set the updating attribute to true
> 
> then from step 5, the remaining steps are to be run synchronously.
> 
> So updating will always be true when remove returns.

Oops, I missed step 5.

However... Even though it is asynchronous, wouldn't it be *possible* that the operation still goes through steps 6 and 7 quickly enough, so that when the js engine finally runs 'sourceBuffer.updating' it will be back to false?

> > Also you could miss the event if it is sent during the remove() call, so I
> > think you should move expectEvent before remove (as was done in the function
> > above).
> no you can't miss it either. because while the event *may* be queued, it
> can't have already been fired as we're still within the current event loop.
> So setting the listener right after remove is fine. I can set it up right
> before if it's less confusing.

I see, you're right, no need to move it, unless you want to be consistent with the other functions.
(In reply to Jean-Yves Avenard [:jya] from comment #28)
> I believe the fix in P4 will likely fix issues seen in the past where the
> ended event wasn't fired.

Can you elaborate that? I don't understand how it fixes the problem.
(In reply to Gerald Squelart [:gerald] from comment #29)
> However... Even though it is asynchronous, wouldn't it be *possible* that
> the operation still goes through steps 6 and 7 quickly enough, so that when
> the js engine finally runs 'sourceBuffer.updating' it will be back to false?

I'm told the JS model doesn't allow for that. So please forget&forgive my comments, go ahead with the patch.
Attached file log.txt —
JW, looking at https://treeherder.mozilla.org/#/jobs?repo=try&revision=eee25edb96c43c9de22b642b8d139ee1a6906d76

it still permafail on linux OPT (can't reproduce with a debug build).

I've managed to reproduce it locally, also only with an OPT build.

The script testing/web-platform/tests/media-source/mediasource-play-then-seek-back.html (actual test is Test playing then seeking back."
intermittently fails.

it appears related to the handling of the seeked event.
The listener for the seeked event is set once the "seeking" event is received.
My Javascript-fu is not advanced enough to see if they are scheduling a new task to set the seeked event listener (in which case the test would be invalid as it's entirely possible the seeked event is actually received prior setting the event handler")

I don't understand what they are doing, appearing to set some kind of timer with 0 delay to set the thing.

Is this something you could look into?

thank you.
Flags: needinfo?(jwwang)
Attached file seek-past-duration.log —
managed to reproduce the other on my machine too. took over 5000 repeats to do it (left it running overnight)
cleaned up log where I've removed most of the stuff related to the other tests running in parallel.

The difference between a successful test and a failed one is the MDSM after the final seek (target=1.509297) would normally go from "Changed state from SEEKING (to 1509297) to COMPLETED" to instead of "Changed state from SEEKING (to 1509297) to DECODING""

The duration change just occurred on the main thread but the MDSM has detected that we've reached the end of the media instead. The mirror that update the mExplicitDuration hasn't had the time to run then, and as such the end time is set to the last seek time which is 1.509297 in place of the duration. This cause the test assertion that duration != currentTime.

Not sure why the MDSM thinks that media has ended. There's at least another frame to play in there.

Maybe it's a race due to the TrackBuffersManagers knowing that the media has ended before the duration has had the chance to be completed.
for the seek-past-duration test, the media being played is weirdly formed:
it as a moof+mdat made of a single frame with:
Processing video/avc frame(pts:1494000 end:1527200, dts:1427600, duration:33200, kf:1)"
followed by another:
Processing video/avc frame(pts:1460800 end:1527200, dts:1460800, duration:66400, kf:1)"

we can see that the two frames overlapped.
depending on the timing of endOfStream being called, either it will have time to play the first stream (and the test will succeed) or it happens after the second sample is played. in which case playback will stop before the duration has changed. The MDSM uses a mirror for the explicit duration, it may not have received the updated duration before entering EOS mode, in which case the currentTime will be one frame behind the final duration.

Maybe we should simply adjust currentTime in the media decoder instead.

Still puzzled with the other permafail test, and I still believe the issue is in the JS
Blocks: 1293159
Assignee: nobody → jyavenard
Comment on attachment 8778750 [details]
Bug 1128069: [MSE] P6. Call NotifyDataArrived from MediaDecoder.

https://reviewboard.mozilla.org/r/69914/#review67042

r+ with suggestion:

::: dom/media/mediasource/MediaSourceDecoder.cpp:196
(Diff revision 2)
> +    // We want the MediaSourceReader to refresh its buffered range as it may
> +    // have been modified (end lined up).

I don't quite understand the comment in parentheses, could you please rephrase for clarity?
(I see it was just moved from somewhere else, but you may want to take this opportunity to clarify the comment.)
Attachment #8778750 - Flags: review?(gsquelart) → review+
Comment on attachment 8778750 [details]
Bug 1128069: [MSE] P6. Call NotifyDataArrived from MediaDecoder.

https://reviewboard.mozilla.org/r/69914/#review67042

> I don't quite understand the comment in parentheses, could you please rephrase for clarity?
> (I see it was just moved from somewhere else, but you may want to take this opportunity to clarify the comment.)

it's in the spec:
https://w3c.github.io/media-source/index.html#widl-SourceBuffer-buffered

step 4.2 "If readyState is "ended", then set the end time on the last range in track ranges to highest end time."

the comments is identical to what was there before, and was simply moved from the earlier place.
Comment on attachment 8778750 [details]
Bug 1128069: [MSE] P6. Call NotifyDataArrived from MediaDecoder.

https://reviewboard.mozilla.org/r/69914/#review67042

> it's in the spec:
> https://w3c.github.io/media-source/index.html#widl-SourceBuffer-buffered
> 
> step 4.2 "If readyState is "ended", then set the end time on the last range in track ranges to highest end time."
> 
> the comments is identical to what was there before, and was simply moved from the earlier place.

The grammar is confusing to me.
Do you mean something like: "Now that the end is lined up, we want the MSR to refresh etc."
Comment on attachment 8778749 [details]
Bug 1128069: [MSE] P5. Adjust currentTime to end position in MediaDecoder.

https://reviewboard.mozilla.org/r/69912/#review67062

Which MDSM mirror are you talking about? MDSM::Duration() returns mDuration which is a canonical.
Comment on attachment 8778750 [details]
Bug 1128069: [MSE] P6. Call NotifyDataArrived from MediaDecoder.

https://reviewboard.mozilla.org/r/69914/#review67042

> The grammar is confusing to me.
> Do you mean something like: "Now that the end is lined up, we want the MSR to refresh etc."

the buffered range is using mirrors for the calculations. It's refreshed whenever NotifyDataArrived is called.
The buffered range value depends on the readyState of the MediaSource. If it's ended, then it's aligned to the longest track in all sourcebuffer; otherwise it's the intersection of each track's buffered range.
Comment on attachment 8778749 [details]
Bug 1128069: [MSE] P5. Adjust currentTime to end position in MediaDecoder.

https://reviewboard.mozilla.org/r/69912/#review67106

::: dom/media/MediaDecoderStateMachine.cpp:2373
(Diff revision 1)
>  
>        if (mPlayState == MediaDecoder::PLAY_STATE_PLAYING &&
>            !mSentPlaybackEndedEvent)
>        {
>          int64_t clockTime = std::max(AudioEndTime(), VideoEndTime());
> -        clockTime = std::max(int64_t(0), std::max(clockTime, Duration().ToMicroseconds()));
> +        clockTime = std::max(int64_t(0), clockTime);

I would prefer not to change MDSM. Just do the adjustment in MediaDecoder.
Attachment #8778749 - Flags: review?(jwwang) → review+
Pushed by jyavenard@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/fbe63241a709
Correct invalid test. r=jwwang
https://hg.mozilla.org/integration/autoland/rev/8563ff06f0d5
[MSE] P2. Change webref expectations now that the test is valid. r=jwwang
https://hg.mozilla.org/integration/autoland/rev/c0aa02955756
Update webref MSE tests according to updated spec. r=gerald
https://hg.mozilla.org/integration/autoland/rev/95ecfed6808e
[MSE] P4. Do not adjust duration to what data we may have been seen in the past. r=jwwang
https://hg.mozilla.org/integration/autoland/rev/8cc5eea320fb
[MSE] P5. Adjust currentTime to end position in MediaDecoder. r=jwwang
https://hg.mozilla.org/integration/autoland/rev/b406ebdeebd2
[MSE] P6. Call NotifyDataArrived from MediaDecoder. r=gerald
https://hg.mozilla.org/integration/autoland/rev/4ce72936dad6
[MSE] P7. Adjust webref expected results. r=gerald
Flags: needinfo?(jwwang)
Jean-Yves, is it something you would like to uplift or we should let it ride the train? Thanks
Flags: needinfo?(jyavenard)
Comment on attachment 8778102 [details]
Bug 1128069: Correct invalid test.

Approval Request Comment
[Feature/regressing bug #]: MSE
[User impact if declined]: Events not fired as they should, could cause stalling with some MSE players (BBC has reported not getting some events)
[Describe test coverage new/current, TreeHerder]: In central for a few weeks, manual tests, lots of regression tests.
[Risks and why]: Low, we don't fundamentally change what was already there.
[String/UUID change made/needed]: None
Flags: needinfo?(jyavenard)
Attachment #8778102 - Flags: approval-mozilla-beta?
Attachment #8778102 - Flags: approval-mozilla-aurora?
uplift request is for all patches.
Comment on attachment 8778102 [details]
Bug 1128069: Correct invalid test.

Several tweaks to mediasource decoder and playback, with new tests. 
Should fix some stalling in video playback, let's take this for beta 7.
Attachment #8778102 - Flags: approval-mozilla-beta?
Attachment #8778102 - Flags: approval-mozilla-beta+
Attachment #8778102 - Flags: approval-mozilla-aurora?
Attachment #8778102 - Flags: approval-mozilla-aurora+
Flags: needinfo?(jyavenard)
Depends on: 1298043
The WPT tests are wrong and were fixed in bug 1293613 and bug 1293186. 

Backing out on the basis of invalid tests is the wrong approach.
Flags: needinfo?(jyavenard)
I discussed it on IRC with gerard and it wasn't clear whether it was a test issue or a code issue. We decided that it was lower risk backing out until you were around to weigh in rather than shipping a potentially-broken Beta build to our users. Thanks for the critique though, noted for the future.
Also, a "the tests are going to have issues if bugs XXX and YYY don't land too" comment would have gone a long way.
I will make a list of all the bugs that should be linked to this one.

I didn't expect this change to increase the likelihood of a timeout in the tests in their current form (there are already intermittents issues lodged for them)
actually puzzled on what's missing to be honest.
To succeed there's a dependency on bug 1293613. However, those also tests check on new MSE features added in the MSE V1 spec.

I think those can ride the train.

So the easiest to uplift everything but P7.
For the record, the fix for this error is found in https://reviewboard.mozilla.org/r/70836/diff/1#index_header
Backout of part 7 from Aurora:
https://hg.mozilla.org/releases/mozilla-aurora/rev/01a9a573e78d
Thank you RyanVM
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: