Setting playbackRate on a media element does not work when the media element is going to the MediaStreamGraph
Categories
(Core :: Audio/Video: MediaStreamGraph, defect, P1)
Tracking
()
Tracking | Status | |
---|---|---|
firefox91 | --- | fixed |
People
(Reporter: ntim, Assigned: alwu, Mentored)
References
(Blocks 1 open bug)
Details
(Keywords: dev-doc-complete, Whiteboard: [lang=c++] [media-audio])
Attachments
(6 files, 4 obsolete files)
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review |
I'm facing the same problem after I changed the volume via GainNode and changing the playbackRate of the HMLAudioElement. Changing the playbackRate to 2.0 works with Chrome and Opera, but not with Firefox. To find a solution, I created a post on stackoverflow:
https://stackoverflow.com/questions/57391854/how-to-set-playbackrate-of-htmlaudioelement-in-combination-with-a-sourcenode-and
I don't understand why the playbackRate can't be changed when I just changed the volume with the Web audio API. Also I tried to find the proof that changing playbackRate is not supported in the specs. I found nothing. It doesn't make sense to me.
Please correct me, if im wrong. In test provided by Daniel, playback rate is changing, while we are in Bypass mode (isBypassing == true) and, doesn
t react in other mode (when isBypassing == false). That is the problem, it should has the same behavior in both modes, am I right?
I think I can try to solve this.
(In reply to Vadim from comment #4)
Please correct me, if i
m wrong. In test provided by Daniel, playback rate is changing, while we are in Bypass mode (isBypassing == true) and, doesn
t react in other mode (when isBypassing == false). That is the problem, it should has the same behavior in both modes, am I right?
I think I can try to solve this.
You are right. That's what i also experience. Is there any progress on that?
Updated•5 years ago
|
is there any progress on this issue? It's over an year old and still not fixed...
Assignee | ||
Comment 8•5 years ago
|
||
Hi, Paul,
Could you give me any suggestion about how should I fix this bug?
Thank you.
Comment 9•5 years ago
|
||
This method appends the audio to the graph. It's necessary to time-stretch it (for now we only apply the volume), using SoundTouch, like we do in AudioStream.cpp. It might be necessary to extract the time-stretcher if we're using it in two locations, to have the same options, etc.
Comment 12•4 years ago
|
||
Any progress or update on this bug? It's been 2 years now...
Comment 14•4 years ago
|
||
Has not yet appeared information?Very necessary thing
Reporter | ||
Comment 15•4 years ago
•
|
||
Paul, do you have an updated permalink in comment 9? It would be helpful for a contributor who wants to look at it in their free time. I may have time in the next months but I can't promise anything, though I'm sure anyone who's free right now would find this helpful :)
Comment 16•4 years ago
|
||
(In reply to Tim Nguyen :ntim from comment #15)
Paul, do you have an updated permalink in comment 9? It would be helpful for a contributor who wants to look at it in their free time. I may have time in the next months but I can't promise anything, though I'm sure anyone who's free right now would find this helpful :)
Hi, I've recently looked into the code but I'm not really fluent C++. Paul was referring to this line at the time, so the DecodedStream::SendAudio
method. This is the up-to-date permalink.
Reporter | ||
Comment 19•4 years ago
|
||
(In reply to Nerixyz from comment #16)
Hi, I've recently looked into the code but I'm not really fluent C++. Paul was referring to this line at the time, so the
DecodedStream::SendAudio
method. This is the up-to-date permalink.
Thanks! I've taken a quick look at this and all the extra context around it, there are a couple of things I found:
- bug 966247 comment 32 from :jwwang, so this will need some extra work for video input as well, and buffering latency as well.
We will need to overhaul DecodedStream which is responsible for sending
audio/video frames to a media stream.To support playbackRate rate changes in video is simple. We just need to
modify the duration of each frame (duration /= playbackRate) before sending
it to the media stream.We have more work to do when it comes to audio. We will need to employ
SoundTouch (which is also used by AudioStream) to stretch audio frames in
response to playbackRate changes.The biggest issue is the latency in playbackRate changes introduced by the
push-model of DecodedStream. DecodedStream pushes audio/video frames to the
media stream immediately after the frames are received by MDSM. So if the
amount of decode-ahead is 10s, the change in playbackRate will take effect
10s later.One solution is to change the push-model to pull-model to minimize the
latency (which we have done for AudioStream!). However it requires rewriting
most of the code of DecodedStream which is certainly not trivial.
-
bug 1443511 which solves the latency mentioned by the previous comment for volume changes (without rewriting DecodedStream)
-
Relevant code in AudioStream.cpp is
mTimeStretcher
and notablyGetTimeStretched
My familiarity with low-level audio/stream concepts isn't so good, so I hope that can be useful to anyone who wants to look at this.
Assignee | ||
Comment 20•4 years ago
|
||
Sorry for taking the bug but didn't do any actions, I was working on other higher priority stuffs. If anyone gets interest in this bug, feel free to take it, or I can take this bug again.
So from the things I know so far, the work would need to be done inside GraphDriver where we create low-level audio stream (cubeb) for the Gecko-level audio stream.
AudioStream is used for normal audio playback where we would perform the playback rate change so this issue won't happen on normal playback.
Assignee | ||
Updated•4 years ago
|
Comment 21•4 years ago
|
||
Alastor, I might take this again after I'm done with the telemetry stuff, let's keep each other informed of our respective progress.
Comment 22•4 years ago
|
||
Guys, is this really that hard to fix?
Assignee | ||
Comment 23•4 years ago
|
||
I started looking at this issue today, but I'm not sure if my direction is correct. First, as the AudioCallbackDriver
is the place interacting with cubeb, that probably is the place to change the playback rate. So I started adding a method on MediaTrackGraph
, which is to set the playback rate on the driver, and send the playback rate from DecodedStream
via accessing mData->mAudioTrack->Graph()
. That would append a message in order to set the playback rate to the driver on the graph thread.
However, when digging more details in the AudioCallbackDriver
, I found that it's actually a source to accept a mixed stream via MixerCallback()
, which is a mixed stream from different input and output stream together, that seems indicating that the playback rate change should not put on here, because not all streams need to be modified the rate.
Paul, would you mind to give me some suggestion about this?
Thank you.
Assignee | ||
Comment 24•4 years ago
|
||
In addition, the work for this issue are more media stream related, so move this to another component.
Comment 25•4 years ago
|
||
This needs to happen in DecodedStream (around here. Let me know if a quick call would be useful to get you started?
Assignee | ||
Comment 26•4 years ago
•
|
||
Quick update, I've talked to Paul today, the current architecture of processing audio samples in DecodedStream
is pull-driven, which makes us hard to implement a solution for playback rate on a realtime stream, because push-driven method causes some latency. That means the playback rate won't take effect immediately. Therefore, we will need to do some work to change this behavior and make a better solution for playback. That would need more time, so sorry for the waiting.
Assignee | ||
Comment 27•4 years ago
|
||
AudioDecoderInputTrack
stores the audio data in a SPSC queue which allows MTG to pull data from it by calling AudioDecoderInputTrack::AppendData()
.
Assignee | ||
Comment 28•4 years ago
|
||
As we will use AudioDecoderInputTrack
for audio track later, need to remove some dependency of SourceMediaTrack
first.
Depends on D106040
Assignee | ||
Comment 29•4 years ago
|
||
Depends on D106041
Assignee | ||
Comment 30•4 years ago
•
|
||
Hi, Andreas,
My current patches implement a new class AudioDecoderInputTrack
to fulfill the pull-driven track for DecodedStream
. However, after applying those patches, take comment1 as an example, the output sound is completedly distortion, and I haven't found the reason.
(Note, those patches haven't performed time stretching yet. They are simply changing the push-driven to pull-driven)
Could you give me some suggestion and see if anything I did is wrong?
Thank you so much.
Comment 31•4 years ago
|
||
I'm going to build your patches and dump the audio (MOZ_DUMP_AUDIO=1) to see if I can hear/inspect what the problem seems to be. In the meantime I'll provide some pointers on the AudioDecoderInputTrack for some best-practice things.
Comment 32•4 years ago
|
||
One issue I see is that you include silence from underruns in the clock. That will confuse the state machine because you will overrun the duration before ending. The other one I see is that you assume AUDIO_FORMAT_S16. This is causing the distortion. On desktop we always use float. On Android we still use int16 I believe, but that is subject to change. Some older versions did not support float AIUI.
s/AUDIO_FORMAT_S16/AUDIO_OUTPUT_FORMAT/ will make your troubles go away.
This could be handy perhaps, I rewrote the test case in comment 1 a bit to better be able to reason about the output. See https://codepen.io/pehrsons/pen/KKNyVmX. It generates a 100Hz sine tone on the fly (needs a pref) and plays that. I also removed the filter, so activating the filter will now turn on playback through the graph using DecodedStream. With MOZ_DUMP_AUDIO=1 it's useful to look at the generated wav file (appearing in the working directory, sandbox needs to be off) in e.g., Audacity.
Assignee | ||
Comment 33•4 years ago
•
|
||
Hi, Andreas,
Thank for the suggestion you gave me last time, and now I have some new questions. So for current WIP, I changed the way how DecodedStream
deal with the audio data, now audio track would determine when it should end, instead of setting the end time via EndTrackAt()
and comparing the time in NotifyOutput()
.
The reason of doing that is because, at the time calling EndTrackAt()
DecodedStream
actually didn't know the actual frames amount which would be writed. That would be determined later while appending data onto mSegement
in AudioDecoderInputTrack
(appending remaining + resampling / time stretching)
The problem I encounter now is that, I don't know how to suspend the track when DecodedStream
is not playing (i.e. media is paused). So even if the DecodedStream
is not playing, AudioDecoderInputTrack
would keep appending silience to the MTG
, which causes DecodedStream::NotifyOutput()
being called during media is paused.
I saw that MediaTrack::Suspend()
seems something I can try, but I'm not sure if that is a right way to do, because MTG
might have its own way to manage the tracks.
Thank you so much.
Comment 34•4 years ago
|
||
Suspending the input track wouldn't be enough, because you could still be underrunning while the DecodedStream is playing, and NotifyOutput would wrongly account for that silence too.
I mentioned a bit on this under Q5 in Paul's doc, pasted here for reference:
Q5 : Do I need to care about the A/V synchronization? Or that will be handled by MTG?
Currently the audio clock reported from DecodedStream to MediaDecoderStateMachine is powered by MediaTrackListener::NotifyOutput. You will have to change this because ProcessedMediaTracks are not allowed to underrun (so silence you add because no real data was available would count towards progressing the clock). But you can keep the listener for video and for audio you can call directly into NotifyOutput from the graph thread in ProcessInput. mOnOutput needs to signal the amount of real samples added to the track. When time stretching, I guess this number of frames would have to be converted too(?). Likewise for video in video-only cases, I suppose.
Assignee | ||
Comment 35•4 years ago
|
||
Thanks! That did solve my problem. Will update patches later after having more progress.
Assignee | ||
Comment 36•4 years ago
|
||
Quick update, I've finished the implementation of changing the data process model in DecodedStream
, now I'm testing them to ensure that they won't cause any breakage. If it goes well, then I will start the next step that is to implement the time stretching for audio data.
Comment 37•4 years ago
|
||
Landing these separately could be useful for detecting regressions from the first piece.
Comment 38•4 years ago
|
||
Comment on attachment 9204703 [details]
Bug 1517199 - part1 : implement AudioDecoderInputTrack.
Revision D106040 was moved to bug 1695265. Setting attachment 9204703 [details] to obsolete.
Comment 39•4 years ago
|
||
Comment on attachment 9204704 [details]
Bug 1517199 - part2 : use MediaTrack
as input parameter for DecodedStreamGraphListener's method.
Revision D106041 was moved to bug 1695265. Setting attachment 9204704 [details] to obsolete.
Comment 40•4 years ago
|
||
Comment on attachment 9204705 [details]
Bug 1517199 - part3 : use AudioDecoderInputTrack
for audio in DecodedStream
.
Revision D106042 was moved to bug 1695265. Setting attachment 9204705 [details] to obsolete.
Assignee | ||
Comment 41•4 years ago
|
||
Quick update: the pre-requirement work has been moved to bug1695265, will continue work on this one after finishing that.
Assignee | ||
Comment 42•4 years ago
|
||
update : just landed bug1695265, and I will continue to do the rest of work in this bug.
Assignee | ||
Comment 43•4 years ago
|
||
update : my current WIP can adjust playback rate successfully, will do more testing to ensure that everything goes well.
Assignee | ||
Comment 44•4 years ago
|
||
Assignee | ||
Comment 45•4 years ago
|
||
update : hooray! now my patch work on all kinds of audio, I tested that on Youtube and it worked well! I'm going to test my patch on the try server to see if we get any failure.
Updated•4 years ago
|
Assignee | ||
Comment 46•4 years ago
|
||
Depends on D114560
Comment 47•4 years ago
|
||
This is great news :alwu.
I emailed you another test case where I experienced something which I think is related to this issue.
Can you please see if your patch has an effect on it?
Short summary sample is when trying to use an AudioContext on a video element, you lose the ability to setPlaybackRate.
Simplest example being:
( new AudioContext() ).createMediaElementSource( obj_video_tag )
Assignee | ||
Comment 48•4 years ago
|
||
Yes, I've tested your site and it works well after applying my fix.
Assignee | ||
Comment 49•4 years ago
|
||
I'm going to target landing these patches on Fx90, so change the priority to P1.
Assignee | ||
Comment 50•4 years ago
|
||
Depends on D114916
Assignee | ||
Comment 51•4 years ago
|
||
Depends on D114921
Assignee | ||
Comment 52•4 years ago
|
||
I found another problem in DecodedStream
that would make us incorrectly determine when the video would end if we adjust playback rate to 1x+ (eg. 2x). Currently the video track and the audio track use different type of media stream track, and the one that video track uses doesn't support for the playback rate changes, which makes the video output time from the track is slower than the acutally audio output time.
Eg. if I play a video, which duration is 60s, in 2x playback rate, when the audio ends in 60s, the video output time is still in 30s. So we have to wait another 30s in order to receive the ended event for media element.
I'm working on this problem, and will submit another patch to handle that.
Assignee | ||
Comment 53•4 years ago
|
||
When we adjust the playback rate on the audio track, the audio clock time would be no longer align with the graph time.
Eg. playback rate=2, when the graph time passes 10s, the audio clock time actually already goes forward 20s.
After audio clock ended, the video clock would start to drive the clock time but the video clock time is align with the graph time, which means it would be smaller than the audio clock in that situation.
Therefore, we have to ignore the video clock time if that happens. In addition, this patch also address the duration change on video frames based on the playback rate.
Depends on D114922
Updated•4 years ago
|
Assignee | ||
Comment 54•4 years ago
•
|
||
Update : currently my patches already got r+, but the reviewer want me to add more test to ensure the A/V won't be broken when we change the playback rate for the media stream track.
When I was watching normal videos on Youtube and changed its playback rate during capturing the stream, I didn't notice that anything wrong on the A/V sync. But for a more precise check, I started testing on this video where we can clearly notice the defect if the A/V isn't sync.
Then, I found out most of time the A/V sync works well if you capture stream before video starts playing. But sometime it would be broken, one example here is to start video first without capturing the stream, then modify the playback rate to 0.5x, then we can see the A/V out of sink. Here is the profiled result. When DecodedStream
's clocktime was 4040854
, VideoSink
already outputed some frames that were later than the clocktime.
At that time the frames in VideoSink
are like this, it should only play the first frames, instead of those frames later than the clock time.
[Child 674857: MediaDecoderStateMachine #1]: V/MediaDecoder DecodedStream=7f94230e4800 time is now 4040854
[Child 674857: MediaDecoderStateMachine #1]: D/MediaDecoder DecodedStream=7f94230e4800 Dropping audio [4014500,4034500] sz=66
[Child 674857: MediaDecoderStateMachine #1]: V/MediaDecoder VideoSink=7f9410d857c0 playing video frame 4037000 (id=7b) (vq-queued=5)
[Child 674857: MediaDecoderStateMachine #1]: V/MediaDecoder VideoSink=7f9410d857c0 playing video frame 4071000 (id=7c) (vq-queued=5)
[Child 674857: MediaDecoderStateMachine #1]: V/MediaDecoder VideoSink=7f9410d857c0 playing video frame 4104000 (id=7d) (vq-queued=5)
[Child 674857: MediaDecoderStateMachine #1]: V/MediaDecoder VideoSink=7f9410d857c0 playing video frame 4137000 (id=7e) (vq-queued=5)
[Child 674857: MediaDecoderStateMachine #1]: V/MediaDecoder VideoSink=7f9410d857c0 playing video frame 4171000 (id=7f) (vq-queued=5)
But the weird thing is that, when you change the rate to other value than 0.5x, such like 1.5x, 2.0x, it would only have a short period of times where A/V sync is broken (eg. ~1s), then adjust back to normal soon. But for 0.5x, it would still stuck on the wrong A/V. However, if you pause the video, and resume it again, the A/V would back to normal immediately no matter what playback rate you set before.
I need to check the logic in VideoSink
to see what happened. In addition, I also noticed another issue that is sometime if we change the playback rate dramastically, eg. from 2.0x to 0.5x (only happening from very fast to very slow), then the audio clock would get stuck. The reason of clock getting stuck is because MDSM stopped decoding audio data as we have already had many audio data due to previous fast playback rate (MDSM
would control how many audio data we need to buffered based on the playback rate.) I haven't fully understood why that would happen comparing with simply playing video in a very slow playback rate. And wonder if those two issues have any similarity in the root cause.
Assignee | ||
Comment 55•4 years ago
|
||
I think I understand why A/V isn't sync after we capture the stream. That is actually not related with changing the playback rate. There are two reasons (1) we incorrectly discard some audio frames when we swtich to the DecodedStream
from AudioSink
(2) clock time isn't consistent between two sinks.
The first issue "incorrectly discarding some audio frames" happens when we switch to the DecodedStream
from AudioSink
. AudioSink
would store some frames in its own media queue before those audio frames get played. After AudioSink
got shutdown, it doesn't push those frames back to the original audio queue. Therefore, when DecodedStream
starts, it try to retrieve audio data from the audio queue, but it doesn't know that there is already a gap between "the audio DecodedStream
is going to play" and "the audio that AudioSink
had played". This issue also causes the problem that I mentioned in comment54, where the audio clock would stop working during the dramastically playback rate change.
After solving the first issue, the A/V syncs would be getting closer, but still has a little bit delay between audio and video. Because the DecodedStream
will determine the audio clock based on the start time. Currently the start time is based on the current media time that actually has some difference with the audio clock time. The audio clock time is calculated based on the amount of audio frames we've played. So if the DecodedStream
use the start time that is different with the amount of audio frames we've played, then the audio clock would be shifting a little bit from the video.
If I simply solved the first issue, but not to address the second issue. I can still notice that around 2 frames shifting between audio and video. But if I solved the second issue as well by using this patch. Then I can see the A/V sync works perfectly.
Assignee | ||
Comment 56•4 years ago
|
||
The interesting thing is that, the first issue has existed for a long while. But on the version where I didn't land the refactor in bug1695265 yet, I didn't see A/V going unsync after capturing a media stream. TBH I don't know why, but abandoning the frames that we've not played before is apparently wrong to me.
Assignee | ||
Comment 57•4 years ago
|
||
When we switch to the DecodedStream from AudioSink. AudioSink would store some frames in its own media queue before those audio frames get played. After AudioSink got shutdown, it doesn't push those frames back to the original audio queue.
Therefore, when DecodedStream starts, it try to retrieve audio data from the audio queue, but it doesn't know that there is already a gap between "the audio DecodedStream is going to play" and "the audio that AudioSink had played".
Comment 58•4 years ago
|
||
Comment on attachment 9223069 [details]
Bug 1517199 - part6 : ensure AudioSink won't discard frames that are not played yet.
Revision D115749 was moved to bug 1712595. Setting attachment 9223069 [details] to obsolete.
Updated•4 years ago
|
Assignee | ||
Comment 59•4 years ago
|
||
Update : reviews are all going well, and very close to finish. But we're in the soft-code-freeze already, enabling a new feature (adjusting playback rate on the captured stream) is not encourage by the release manager. So I will land these patches on Fx91, and sorry couldn't make that happen on Fx90.
Assignee | ||
Comment 60•4 years ago
|
||
But I will land those blocker bugs that will fix the A/V sync issue on Fx90.
Updated•4 years ago
|
Assignee | ||
Comment 61•4 years ago
|
||
Depends on D115035
Comment 62•3 years ago
|
||
Comment 63•3 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/172ef2cae6ed
https://hg.mozilla.org/mozilla-central/rev/772b0a555674
https://hg.mozilla.org/mozilla-central/rev/8d6a8d4836ea
https://hg.mozilla.org/mozilla-central/rev/1c845c5e1253
https://hg.mozilla.org/mozilla-central/rev/b1945b566398
https://hg.mozilla.org/mozilla-central/rev/2d291a99004d
Assignee | ||
Comment 64•3 years ago
|
||
Release Note Request (optional, but appreciated)
[Why is this notable]: This fixes a long existing issue (originally reported in bug966247, 7 years ago) so I think adding this on the note would be good for people waiting this change.
[Affects Firefox for Android]: Yes
[Suggested wording]: Fix the issue of changing playback rate failed on the media element which stream is being captured.
[Links (documentation, blog post, etc)]: None
Comment 65•3 years ago
•
|
||
This is more of a thing for developers, setting dev-doc-needed
.
Suggested wording for developers:
- Changing the speed of a media element ("<video>" or "<audio>") using the
playbackRate
attribute now works when the media element is captured to aMediaStream
or viaAudioContext.createMediaElementSource
.
Comment 66•3 years ago
|
||
Unfortunately there is no way to detect this, so websites using this feature are likely to UA sniff, and so it won't start working magically without a fix.
Description
•