Audio skips if <video>.playbackRate > 2
Categories
(Core :: Audio/Video: Playback, defect, P3)
People
(Reporter: s.bugzilla.mozilla, Assigned: ke5trel)
Attachments
(10 files)
24.07 KB, text/plain
558.53 KB, application/x-gzip
791.84 KB, audio/mpeg
840.66 KB, audio/mpeg
35.90 KB, image/png
36.99 KB, image/png
135.33 KB, image/png
87.87 KB, image/png
1.74 MB, application/zip
47 bytes, text/x-phabricator-request
Comment 18•6 years ago
I believe this is due to Firefox playing ~70ms samples of audio at original speed, and skipping between them. In my experience with a similar effect in mplayer (https://mplayerhq.hu/), blending samples of about 15ms produces quality comparable to Webkit (Chrome, Opera).
To test this yourself in Firefox and mplayer:
video: https://www.youtube.com/watch?v=K0Tsa3smr1w
"DASH Audio" stream URL for same video (retrieved with youtube-dl -j K0Tsa3smr1w)
mplayer -speed 2 -af scaletempo=stride=70 <media_file_or_url> # present Firefox quality
mplayer -speed 2 -af scaletempo=stride=15:overlap=1 <media_file_or_url> # better quality
Note that when using ] to speed up beyond 3x, the words are still intelligible with the latter method. BTW: "overlap=1" causes the samples to be blended together across their entire duration.
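A minimal sketch of what "blended together across their entire duration" can mean, assuming a plain linear crossfade between two equal-length chunks. This is only an illustration, not mplayer's scaletempo code; the helper name crossfade is made up for this example.

```python
def crossfade(outgoing, incoming):
    """Blend two equal-length chunks with a linear ramp (hypothetical helper)."""
    if len(outgoing) != len(incoming):
        raise ValueError("chunks must have the same length")
    n = len(outgoing)
    blended = []
    for i in range(n):
        w = i / n  # weight of the incoming chunk, rising from 0 towards 1
        blended.append(outgoing[i] * (1.0 - w) + incoming[i] * w)
    return blended


# Fade from a constant 1.0 signal into silence over four samples.
ramp = crossfade([1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0])
print(ramp)
```

With a partial overlap, only the first part of each chunk would be blended like this and the rest copied unchanged.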
Comment 19•6 years ago
(In reply to Bryce Seager van Dyk (:bryce) from comment #7)
I've been unable to repro this on OSX and a couple of other systems I've tested. Isaac, are you still seeing the issue? If so, could you please gather a log using MOZ_LOG=MediaFormatReader:5?
What are the prospects of this bug getting fixed in the near future? Do you need anything to reproduce the bug? I have personally noticed it for months (ever since I wanted to switch back to Firefox) across devices and platforms, so reproducing it should be easy, I guess.
Paul, do you have any thoughts on the difficulty of a fix here and if there are any related audio bugs we should be aware of/link to this bug?
Comment 21•6 years ago
It would be bug 1383363, and it's not too hard for someone who knows how to write DSP code, but it needs to be prioritized.
Comment 22•6 years ago
I also just want to clarify: I notice this at 1.5x, 1.75x, and 2x speeds as well. It's more subtle, but definitely still there. Since the title here talks about the issue at higher speeds, I just want to make sure that it's not ignored for speeds <2x.
Comment 23•6 years ago
I used MOZ_DUMP_AUDIO=1 to capture samples from 1x vs 3x. The waveforms at 3x are identical to 1x for about 45ms. Then it skips and copies another 45ms.
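The pattern described above (a stretch copied at original speed, then a jump) can be reproduced with a deliberately naive speed-up. The sketch below assumes a 48 kHz sample rate and uses no blending or similarity search; the function name chunk_skip_speedup is hypothetical and this is not Firefox's or SoundTouch's code.

```python
def chunk_skip_speedup(samples, rate, chunk_len):
    """Copy chunk_len samples unchanged, then jump ahead so that the result
    plays `rate` times faster overall (naive illustration, no blending)."""
    hop = int(round(chunk_len * rate))  # input samples consumed per chunk
    out = []
    pos = 0
    while pos + chunk_len <= len(samples):
        out.extend(samples[pos:pos + chunk_len])
        pos += hop
    return out


SAMPLE_RATE = 48000                # assumed sample rate
chunk = SAMPLE_RATE * 45 // 1000   # 45 ms expressed in samples
source = list(range(SAMPLE_RATE))  # one second; each value is its own index
fast = chunk_skip_speedup(source, 3.0, chunk)

# The first chunk is identical to the source; the next one starts later.
print(fast[:chunk] == source[:chunk], fast[chunk])
```

Because each sample value is its own index, the jump between consecutive chunks can be read directly from the output.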
Comment 24•6 years ago
In this screenshot, the first track is Chrome, 2nd track is Firefox at 3x, and 3rd track is Firefox at 1x.
Comment 25•6 years ago
Top is 3x. Bottom is 1x. Estimate of the amount of audio data that is skipped before waveforms align.
Comment 26•6 years ago
I started digging into the SoundTouch library and confirmed my suspicion. https://hg.mozilla.org/mozilla-central/file/tip/media/libsoundtouch/src/TDStretch.cpp#l637
The author says it's implemented using a "WSOLA-like method" (WSOLA: waveform similarity overlap-add). Maybe based on this paper from 1993? https://www.semanticscholar.org/paper/An-overlap-add-technique-based-on-waveform-(WSOLA)-Verhelst-Roelands/d94abd77e52a56c425e4b86e6c7d692583ea406d
Perhaps it doesn't scale well past 2x.
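For readers unfamiliar with the waveform-similarity step, here is a rough sketch of the idea: within a small seek window, pick the offset whose segment best matches a reference segment, so that the following overlap-add joins similar waveforms. This is a generic illustration using normalized cross-correlation, not the code in TDStretch.cpp; the name best_offset and its parameters are assumptions made for this example.

```python
def best_offset(signal, template, start, seek_len):
    """Return the offset in [0, seek_len) at which the segment of `signal`
    beginning at start + offset is most similar to `template`."""
    n = len(template)
    best, best_score = 0, float("-inf")
    for off in range(seek_len):
        seg = signal[start + off:start + off + n]
        if len(seg) < n:
            break  # ran past the end of the signal
        score = sum(a * b for a, b in zip(seg, template))
        norm = sum(a * a for a in seg) ** 0.5
        if norm > 0:
            score /= norm  # normalize so louder segments are not favored
        if score > best_score:
            best, best_score = off, score
    return best


# A short pulse hidden in silence; the search starts 5 samples before it.
pulse = [1.0, 2.0, 3.0, 2.0, 1.0]
signal = [0.0] * 10 + pulse + [0.0] * 10
offset = best_offset(signal, pulse, start=5, seek_len=10)
print(offset)
```

A wider seek window gives the search more candidates to choose from, at the cost of more correlation work per chunk.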
Comment 27•6 years ago
From the author: https://www.surina.net/article/time-and-pitch-scaling.html
Assignee | ||
Comment 28•6 years ago
(In reply to alex.korchemniy from comment #23)
The waveforms at 3x are identical to 1x for about 45ms. Then it skips and copies another 45ms.
This is because the default sequence size at 2x speed is 40ms, with some variation due to the 15ms seek window and 8ms overlap. I made a build with a 10ms sequence size that produces similar results to Chrome up to 4x speed; 1ms produces slightly better results, but the difference is subtle. According to the documentation, a smaller sequence is more computationally expensive, but I have been unable to detect any significant difference myself. I think 10ms is a good middle ground, below which there are diminishing returns. The seek window and overlap can be left alone, as these values help smooth out irregularities quite well.
Example video used in comparison: https://www.youtube.com/watch?v=fDek6cYijxI
Comment 30•6 years ago
Thanks. The patch is a significant improvement!
At 4x with a 10ms window, it's easy to comprehend speech. There is some slightly noticeable choppiness in some videos. I'm not an audio expert, but I tried to do some analysis using Audacity + GIMP. I plotted both tracks as mel spectrograms, took a screenshot, aligned the two graphs in GIMP, and set the layer mode to difference. There are some noticeable periodic artifacts, perhaps related to the "sliding" part of the algorithm.
I'll try 1ms later today.
Comment 31•6 years ago
With the SoundTouch algorithm there isn't much improvement with a window size below 10ms. I found a quicker way to test various parameters using mpv, which uses the same or a similar algorithm. When I get some time I'll generate random samples with various parameters and do a blind perceptual test.
mpv --af=scaletempo=stride=8:overlap=1:search=10 --speed=4 test.mp3
- Overlap in this case is a percentage rather than ms.
Comment 33•6 years ago
bugherder
Comment 34•6 years ago
Not sure whether this is considered inappropriate here, but I just wanted to say thank you for tackling this issue.
The latest Firefox Nightly sounds nearly identical to Chrome for me now, so this is a huge improvement! I'm not sure whether it would be worth the effort, but I can see use cases (especially for power users) for configuring SoundTouch without recompiling Firefox.
So thanks again for making this happen <3. This has kept me from switching from Chrome to Firefox for the past few years, and now I can finally make the switch. I'm really grateful for this.
FWICT there seem to be many people on the internet who did not or could not switch to Firefox because of this bug (OTOH this might be selective perception). Anyway, IMHO it might be a good idea to communicate this change broadly and not bury it somewhere deep in the changelog. This is a major feature for some people and might be the biggest improvement in Firefox 73!
Comment 35•6 years ago
Release Note Request (optional, but appreciated)
[Why is this notable]: Vastly improved audio quality when changing the audio playback rate during media playback. This was requested for years, and brings Firefox closer perceptually to other browsers
[Affects Firefox for Android]: yes
[Suggested wording]: Improved audio quality when playing back audio at a faster or slower speed
[Links (documentation, blog post, etc)]:
Comment 37•6 years ago
Firefox is an essential project. I am so god damn happy that this has been fixed. OMG. Speed control is a basic feature built into YouTube and Audible; any major media distributor has built-in speed controls and expects a functional browser implementation. This is especially important in the era of the Chromium-based browser monopoly. (I don't trust Chromium even though it is open source.) I listen to audio at around 3.5-5.5x speed with closed-back, studio-tuned headphones, and any improvement can be noticeable. I can't get past 2.0-3.0x with laptop speakers on Chrome/Firefox 73, or with pre-Firefox 73 and headphones. But now, YEET Chrome, Google. I moved my bookmarks over and deleted most of my Chrome data already; this here is my last action on Chrome before uninstalling.
As an aside, I am wondering: could I get some code usable in the console to adjust the window and other parameters more precisely, so I can experiment at very high speeds? (Only if you have spare time; I can figure it out after some finagling.) I am very satisfied with Firefox 73 beta audio at high speeds, but I want to experiment. I consume a lot of media, always involving speech. In any case, thank you, thank you, thank you guys ad infinitum for your very important work here and in general.
Comment 38•6 years ago
Clearly something that we can do, yes. I'm going to think about it today. Thanks for your kind message!