No video ended event fired when video file has a shorter audio track
Categories
(Core :: Audio/Video: Playback, defect, P2)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr60 | --- | unaffected |
firefox66 | --- | wontfix |
firefox67 | --- | wontfix |
firefox68 | --- | fixed |
People
(Reporter: luca, Assigned: pehrsons)
References
(Regression)
Details
(Keywords: regression)
Attachments
(15 files)
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review |
User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36
Steps to reproduce:
Our application waits for certain videos to fire the "ended" event, when a video has played until its end.
Actual results:
Since Firefox version 65, the ended event will not occur, if the videos audio track is shorter than the video itself. Infact firefox doesn't seem to play the video to its end at all.
we discovered this with this video:
http://ostsee-life.nabu.de/scenes/S_100/360/4096x2048/S_100_luftaufnahme_4096x2048_0_0.mp4
and also its .webm equivalent.
the audio seems to be about 700ms shorter.
ff version 61 to 64 work just fine. 65 and 66 are stopping the playback too early.
Expected results:
The video ended event should be fired
the previously linked video has been updated. the now faulty file has been moved to http://ostsee-life.nabu.de/scenes/S_100/360/4096x2048/bad/S_100_luftaufnahme_4096x2048_0_0.mp4
Updated•6 years ago
|
Comment 2•6 years ago
|
||
:alwu, sounds like a regression following your changes with audio looping.
Updated•6 years ago
|
Updated•6 years ago
|
Updated•6 years ago
|
Comment 3•6 years ago
|
||
Hi, Luca,
I can't reproduce the issue with latest Nightly 68.0a1 (2019-03-20) on OSX 10.12.6.
The ended
event was dispatched correctly and I used the video link you provided in comment1.
Comment 5•6 years ago
|
||
BTW, in Firefox Release 65.0.1, the ended
event is fired correctly as well.
Updated•6 years ago
|
Updated•6 years ago
|
Thanks for your example. You are right, the problem seems to be a little more specific.
In our engine, the video-element is created in javascript and is not added to the dom. We use it to render it into a WebGL-Context in order to create a 360°-View. We also use the WebAudioApi and "createMediaElementSource()" to route the videos Audio through our sound engine.
I have extended your Demo by creating the video node in javascript AND routing the sound into Webaudio (it seems to only occur in this combination). I then won't receive a "ended" event on firefox-versions >= 65 with this demo, i do receive the event with versions =< 64
Here is the link to my demo (feel free to download the html and linked video, if you need to):
http://luca.kfcloud.de/firefoxVideoEndedDemo/demo.html
PS: In the demo i skip towards the end of the video, so you don't have to listen to the whole thing.
PPS: I hosted the video separatly, because hooking it into web-audio would produce cross origin-problems otherwise
PPS: Example wont work in Browsers that require user interaction to start webaudio contexts (like chrome). It also works in Microsofts Edge for exampe ;)
Comment 8•6 years ago
|
||
(In reply to Luca from comment #6)
PS: In the demo i skip towards the end of the video, so you don't have to listen to the whole thing.
PPS: I hosted the video separatly, because hooking it into web-audio would produce cross origin-problems otherwise
This will not work in either chrome or FF version 67 and later due to block autoplay.
You call videoElement.play(); which will prevent play() from working.
You need to test if the promise is rejected.
Comment 9•6 years ago
|
||
(In reply to Luca from comment #7)
PPS: Example wont work in Browsers that require user interaction to start webaudio contexts (like chrome). It also works in Microsofts Edge for exampe ;)
A work-around would be to mute the video element.
Comment 10•6 years ago
•
|
||
The result of the preliminary investigation is that, the video sink stopped consuming video data from video queue after audio reached to the end, which caused no more video decoding task to be dispatched and the video decoding couldn't reach to the end.
I'm still investigating why we stop comsuming the video data and why didn't dispatch any decoding task after that.
This issue looks not related with audio looping, it seems more related with using decoded stream.
Updated•6 years ago
|
Comment 11•6 years ago
|
||
The reason that we didn't dispatch the ended
event is because video sink stopped consuming data, so that we couldn't decode video to the end, which causes decoding was never ended.
And the reason that video sink stopped consuming data is because the clock time [1] stopped increasing after audio reached to the end. The clock time was gotten from [2], which showed that the mLastOutputTime
didn't be updated. That is, after audio reahced to the end, MSG seems stopping calling NotifyOutput()
[3] which would use media source event to call DecodedStream::NotifyOutput()
[4], which would update mLastOutputTime
.
[1] https://searchfox.org/mozilla-central/rev/486b64e4bff86b7988af8c8b80845404ad197533/dom/media/mediasink/VideoSink.cpp#516
[2] https://searchfox.org/mozilla-central/rev/486b64e4bff86b7988af8c8b80845404ad197533/dom/media/mediasink/DecodedStream.cpp#751
[3] https://searchfox.org/mozilla-central/rev/486b64e4bff86b7988af8c8b80845404ad197533/dom/media/mediasink/DecodedStream.cpp#92
[4] https://searchfox.org/mozilla-central/rev/486b64e4bff86b7988af8c8b80845404ad197533/dom/media/mediasink/DecodedStream.cpp#417
Hi, Paul,
Could you help me check the MSG part to see why the NotifyOutput()
didn't be called after audio reached to the end?
Thank you!
Comment 12•6 years ago
|
||
Hmm, wait, we only captured the audio stream, so it makes sense to not call NotofyOutput() after there is no audio.
Comment 13•6 years ago
•
|
||
Now I have patches to fix this issue, I'm planing to use AudioSinkWrapper
to manager both AudioSink
and DecodedStream
, which should return correct clock time during different situations. I'm testing my patches and will request review later.
Comment 14•6 years ago
|
||
:alwu, any ideas which changes caused the regression?
I'm thinking some of the MediaStream/MSG changes caused it, :phersons?
Just asking as it would ensure the ideal fix.
Comment 15•6 years ago
|
||
I guess that this issue has existed for a while since we introduced the DecodedStream. Because DecodedStream doesn't consider the case to return increasing clock time after audio ended when audio is shorter than video.
Assignee | ||
Comment 16•6 years ago
|
||
Maybe this check [1] needs to account for an ended audio track and thus switch to video? It was part of bug 1423241 and the shift from a stream listener to track listeners, and landed in 65.
Like Alastor said we're only capturing audio here, but I don't think that makes a difference to MediaDecoder or DecodedStream. They will always process both tracks if present.
Assignee | ||
Comment 17•6 years ago
|
||
We could also pad the audio track with silence, and delay the ending of the audio track in the MSG, until the video track has also ended. This could have side effects if there's a second, still live, audio track and both are fed to web audio through a MediaStreamAudioSourceNode. I'm not sure right now which approach would be easier.
Comment 18•6 years ago
|
||
We've already had a place to handle this kinds of case [1], so I think it would be good to let AudioSinkWrapper to manager DecodedStream as well.
[1] https://searchfox.org/mozilla-central/rev/a7315d78417179b151fef6108f2bce14786ba64d/dom/media/mediasink/AudioSinkWrapper.cpp#79
Updated•6 years ago
|
Assignee | ||
Comment 19•6 years ago
|
||
(In reply to Alastor Wu [:alwu] from comment #18)
We've already had a place to handle this kinds of case [1], so I think it would be good to let AudioSinkWrapper to manager DecodedStream as well.
[1] https://searchfox.org/mozilla-central/rev/a7315d78417179b151fef6108f2bce14786ba64d/dom/media/mediasink/AudioSinkWrapper.cpp#79
As DecodedStream goes as an AudioSink (as far as VideoSink is concerned it's on the same level as AudioSinkWrapper) it feels natural that it only cares about reporting audio position. But it also processes video, sending video frames to a video track, so somehow it too needs access to the progressing video clock after audio has finished in order to send out the last video frames. Does your solution accommodate that?
Comment 20•6 years ago
|
||
(In reply to Andreas Pehrson [:pehrsons] from comment #19)
(In reply to Alastor Wu [:alwu] from comment #18)
We've already had a place to handle this kinds of case [1], so I think it would be good to let AudioSinkWrapper to manager DecodedStream as well.
[1] https://searchfox.org/mozilla-central/rev/a7315d78417179b151fef6108f2bce14786ba64d/dom/media/mediasink/AudioSinkWrapper.cpp#79As DecodedStream goes as an AudioSink (as far as VideoSink is concerned it's on the same level as AudioSinkWrapper) it feels natural that it only cares about reporting audio position.
I agree that it feels convoluted that an AudioSink would need to know about the video position. That is not the job of an AudioSink.
And while that approach may/will work, I don't think it should be the preferred one for the sake of expediency.
Comment 21•6 years ago
|
||
(In reply to Andreas Pehrson [:pehrsons] from comment #19)
As DecodedStream goes as an AudioSink (as far as VideoSink is concerned it's on the same level as AudioSinkWrapper) it feels natural that it only cares about reporting audio position. But it also processes video, sending video frames to a video track, so somehow it too needs access to the progressing video clock after audio has finished in order to send out the last video frames. Does your solution accommodate that?
That is what AudioSinkWrapper
did now, it would calculate position depending on the playback rate when audio has ended, which could make video sink get correct clock time to consume video frame.
What I would like to do is not to change any functionality inside the DecodedStream
, to make a wrapper to handle all other audio position non-related stuffs.
(In reply to Jean-Yves Avenard [:jya] from comment #20)
I agree that it feels convoluted that an AudioSink would need to know about the video position. That is not the job of an AudioSink.
And while that approach may/will work, I don't think it should be the preferred one for the sake of expediency.
It seems to me that we should move the check [1] to VideoSink
and to make VideoSink
switching to calculate the position after audio ended. Then AudioSinkWrapper
could be used to handle some difference between DecodedStream
and AudioSink
.
WDYT?
Comment 22•6 years ago
|
||
I'm thinking about the following concept.
AudioSinkWrapper
is responsible to return audio clock time for media with or without audio (using playback rate to calculate the clock)
So for the level of the audio sink, it actually doesn't need to know about video position, the function [1] here should be renamed to GetSystemClock()
or something similar.
VideoSink
contains AudioSinkWrapper
, it could always rely the clock time AudioSinkWrapper
returns. AudioSink
and DecodedStream
are two dedicated sink to consume audio data, so we can let AudioSinkWrapper
to manage them, so that it could know when we should switch to system clock in order to return the correct time.
Comment 23•6 years ago
|
||
If we capture audio for a media element without audio track, what would happen? Would we append silence to the track?
Because AudioSinkWrapper
won't create AudioSink
when media doesn't have audio track, I'm thinking about whether we have to create DecodedStream
in this case if we let AudioSinkWrapper
to manage DecodedStream
.
Comment 24•6 years ago
|
||
hmm, DecodedStream
would also be responsible to return video ended promise [1]. Now I'm not sure that it's a good idea to AudioSink
and DecodedStream
...
Assignee | ||
Comment 25•6 years ago
|
||
(In reply to Alastor Wu [:alwu] from comment #23)
If we capture audio for a media element without audio track, what would happen? Would we append silence to the track?
Because
AudioSinkWrapper
won't createAudioSink
when media doesn't have audio track, I'm thinking about whether we have to createDecodedStream
in this case if we letAudioSinkWrapper
to manageDecodedStream
.
If there's no audio track decoded by MediaDecoder, DecodedStream won't be outputting any audio track either. It'll be running off the video track's NotifyOutput instead.
Assignee | ||
Comment 26•6 years ago
|
||
(In reply to Jean-Yves Avenard [:jya] from comment #20)
(In reply to Andreas Pehrson [:pehrsons] from comment #19)
(In reply to Alastor Wu [:alwu] from comment #18)
We've already had a place to handle this kinds of case [1], so I think it would be good to let AudioSinkWrapper to manager DecodedStream as well.
[1] https://searchfox.org/mozilla-central/rev/a7315d78417179b151fef6108f2bce14786ba64d/dom/media/mediasink/AudioSinkWrapper.cpp#79As DecodedStream goes as an AudioSink (as far as VideoSink is concerned it's on the same level as AudioSinkWrapper) it feels natural that it only cares about reporting audio position.
I agree that it feels convoluted that an AudioSink would need to know about the video position. That is not the job of an AudioSink.
And while that approach may/will work, I don't think it should be the preferred one for the sake of expediency.
While this is true for idealistic reasons, this isn't how DecodedStream has been working in the past. DecodedStream will de facto have to feed video to its video track, so it will need access to video time as well. It would be unnecessarily complicated to break the video clock out of DecodedStream, just to hand it another way of getting the video clock so it can still feed video.
I say we keep the status quo and let DecodedStream drive the clock after audio ends too. This is what AudioSinkWrapper does, so we keep these two at the same level.
We can refactor this in the future, but it'll have to involve breaking out the feeding of the audio track to something on the same level as AudioSink (not AudioSinkWrapper), and the feeding of video to a dedicated VideoSink. HTMLMediaElement.captureStream() even says to keep playing the audio through the regular audio sink and feed it to an audio track, and to accomodate this a refactor like this is inevitable.
Unless anyone objects, I'll take this bug and fix the regression, and anything else that comes up from the tests I'll be adding; and we'll push the refactor of making DecodedStream a pure audio sink onto the future.
Assignee | ||
Comment 27•6 years ago
|
||
I've confirmed the regression range with mozregression.
Comment 28•6 years ago
|
||
According to comment27, pass this bug to Andreas, thank you!
Assignee | ||
Updated•6 years ago
|
Assignee | ||
Comment 29•6 years ago
|
||
I was in a deep rabbit hole making sure we could append and process precisely the right time of data in the MSG, but I think I made it out: https://treeherder.mozilla.org/#/jobs?repo=try&revision=da98d0eafe3df69843eaf9ddfad3275e3e2b46f4
There were some other bugs as well with weirdly formed media files among our tests, but we should behave more like VideoSink in what frames we pass on now.
Assignee | ||
Comment 30•6 years ago
|
||
FWIW there's an intermittent on 3 of the test cases left, that doesn't reproduce locally:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=96bf908f6f9f029e668314d2334389072371a7bd
There are a few shutdown hangs remaining after that, that I hope are not my fault, and hopefully fixed on the latest m-c.
Assignee | ||
Comment 31•6 years ago
|
||
Assignee | ||
Comment 32•6 years ago
|
||
Assignee | ||
Comment 33•6 years ago
|
||
Assignee | ||
Comment 34•6 years ago
|
||
Assignee | ||
Comment 35•6 years ago
|
||
This is similar to what VideoSink does, and is required for us to end playback
on HAVE_CURRENT_DATA (a frame for currentTime must be available).
Assignee | ||
Comment 36•6 years ago
|
||
Assignee | ||
Comment 37•6 years ago
|
||
Null frames could still be valid. It's how DecodedStream signals a pause. They
could also be valid and have an intrinsic size and the force-black flag set.
Assignee | ||
Comment 38•6 years ago
|
||
This is necessary to get an accurate total track duration in case DecodedStream
does a lot of seeking and pausing.
This also removes previous code for avoiding blocking a video track, as this new
code handles that case too.
Assignee | ||
Comment 39•6 years ago
|
||
Assignee | ||
Comment 40•6 years ago
|
||
Assignee | ||
Comment 41•6 years ago
|
||
A case where this wasn't working was bipbop-lateaudio.mp4, where the last frame
has duration 0 and starts and ends before the previous frame has ended. The
VideoSink still renders this frame at the end of playback, so this patch
brings DecodedStream closer to that behavior by rendering all frames with a
start time after the previous frame's start time. The track's duration is still
based on absolute times so things don't blow up.
Assignee | ||
Comment 42•6 years ago
|
||
streamCurrentTime
is relative to the start of the stream, as is the track's
segment. We need to account for the track's start time in the stream when
notifying of output.
Assignee | ||
Comment 43•6 years ago
|
||
Assignee | ||
Comment 44•6 years ago
|
||
This gives us a guarantee that the first frame of a media file can be rendered
with a second media element and mozCaptureStream(), even if the file is very
very short.
With longer video-only files there were also cases where nothing ended up being
rendered. Probably because the MediaStreamGraph ended up switching from an
AudioCallbackDriver to a SystemClockDriver and this took enough time to put the
SourceMediaStream::EndTrack and the SourceMediaStream::AddTrackListener calls
for this video track to be processed in the same iteration. The listener would
then always lose to the ending track and update main thread state too late,
leading to its media element not rendering any frames and nasty intermittent
failures.
Assignee | ||
Comment 45•6 years ago
|
||
Assignee | ||
Comment 46•6 years ago
|
||
FWIW, the rev without the last patch, looks like this: https://treeherder.mozilla.org/#/jobs?repo=try&revision=96bf908f6f9f029e668314d2334389072371a7bd
I managed to reproduce this under rr locally by increasing PARALLEL_TESTS
to 4.
What happened that caused the errors was that SourceMediaStream::EndTrack (from DecodedStream) and several SourceMediaStream::AddDirectTrackListener/AddTrackListener (from DecodedStream, HTMLMediaElement) ended up being processed in the same MSG iteration.
Because of the SwapMessageQueues
that happens at the end of the iteration, the listeners were only really added during the following iteration.
The current iteration meanwhile sent a NotifyEnded() to an existing listener (from MediaStreamTrack) which updated the track's main thread state, and made the HTMLMediaElement end playback. The listeners that took until the next iteration to get added would have notified about initial frame size, and provide the link to the VideoFrameContainer, but they were ignored due to the media element already being ended.
This could also be fixed by MediaStreamGraph processing the ControlMessages at the beginning of an iteration. I am not sure of the background behind why things are as they are however, so I'd be keen to hear more during review. Seeing that we long-term will remove SourceMediaStreams, break video out from the graph, and make all audio in the graph pull-only, this will not really be needed. However, because audio will be pull-only and because tracks shouldn't end at seeks or pauses per spec, we'll need a mechanism similar to this, that ends tracks only once playout of their content have finished, anyway. I think this is a pretty good compromise for the time being.
Updated•6 years ago
|
Updated•6 years ago
|
Comment 47•6 years ago
|
||
Comment 48•6 years ago
|
||
Comment 49•6 years ago
|
||
Backed out 14 changesets (bug 1536766) for causing RunWatchdog and mediaElementAudioSource wpt failures
push that caused the backout: https://treeherder.mozilla.org/#/jobs?repo=autoland&resultStatus=testfailed%2Cbusted%2Cexception%2Cretry%2Cusercancel%2Crunnable&selectedJob=241359260&revision=72d37a08f281c24833d1c49e947fb6db4df2fb87
backout: https://hg.mozilla.org/integration/autoland/rev/41f1dcbe9caab2d6c9b9407ec60bd83928d17c08
Assignee | ||
Comment 50•6 years ago
|
||
Thanks for the backout. The RunWatchdog failures are bug 1543980 that I forgot to add a workaround for before pushing. I've done that locally now.
The mediaElementAudioSource WPT were completely unexpected however. I'll try to assess whether they're intermittent and whether we should fix them or update their result expectations.
Assignee | ||
Comment 51•6 years ago
|
||
Comment 52•6 years ago
|
||
Comment 53•6 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/932ae0651492
https://hg.mozilla.org/mozilla-central/rev/d6286692b29a
https://hg.mozilla.org/mozilla-central/rev/b30dd5995d98
https://hg.mozilla.org/mozilla-central/rev/67442b20a350
https://hg.mozilla.org/mozilla-central/rev/7a96da026057
https://hg.mozilla.org/mozilla-central/rev/0bdfa8fcd3c2
https://hg.mozilla.org/mozilla-central/rev/46eb05673835
https://hg.mozilla.org/mozilla-central/rev/32d459a42132
https://hg.mozilla.org/mozilla-central/rev/0e1bbc399742
https://hg.mozilla.org/mozilla-central/rev/ea1f424b6c7d
https://hg.mozilla.org/mozilla-central/rev/c13cae5f1694
https://hg.mozilla.org/mozilla-central/rev/8b2771254100
https://hg.mozilla.org/mozilla-central/rev/93eb36847687
https://hg.mozilla.org/mozilla-central/rev/2787160c8537
https://hg.mozilla.org/mozilla-central/rev/e2a82cb2f43c
Description
•