MSE audio playback not gapless
Categories
(Core :: Audio/Video: Playback, defect)
Tracking
()
| | Tracking | Status |
|---|---|---|
| firefox67 | --- | fixed |
People
(Reporter: tomerlahav, Assigned: jya)
References
Details
Attachments
(1 file)
| Assignee | ||
Comment 13•7 years ago
|
||
(In reply to Tomer Lahav from comment #2)
Thanks for your extremely detailed reply, I appreciate it!
If I understand correctly, you're saying that it's possible to achieve
gapless MSE playback on Firefox.
Do you have any examples of how I should go about it?
Let me tell you how I created these audio files, and maybe you'll be able to
tell me where I went wrong.
I created a sine wave as uncompressed 16-bit PCM audio at 44.1kHz - 10
seconds of uncompressed audio.
I then sliced these 10 seconds to 10 segments of PCM data - exactly 44100
samples each (1 second).
I encoded the segments to AAC using fdk-aac.
Now, AAC uses blocks of 1024 samples each.
Specifically, fdk-aac adds two blocks of silence ahead of the actual audio
data.
This accounts for 2048 samples, which at 44100 rate are ~0.046440 seconds.
That's why I have to get rid of the first 0.046440 seconds of each file.
Now, AAC also must round up to a whole frame/block of data, which must be a
multiple of 1024.
That's why junk audio is also being added to end of file which needs to be
cut out as well.
The actual audio data that I care about (sine wave) is exactly 1 second long.
These 1 second (44100 samples) segments should be able to connect gaplessly
to each other, provided that we get rid of the excess "junk" audio at the
start and end of the encoded file.
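The arithmetic in the quoted description can be checked with a short Python sketch. It assumes the encoder emits exactly the two priming blocks mentioned above plus the minimum number of 1024-sample blocks needed to hold the payload; a real encoder may emit extra blocks, so the frame and padding counts are illustrative only. The function name `encoder_overhead` is hypothetical.

```python
import math

SAMPLE_RATE = 44_100   # Hz, as stated in the report
AAC_FRAME = 1024       # samples per AAC block
PRIMING_FRAMES = 2     # blocks of leading silence the report attributes to fdk-aac


def encoder_overhead(payload_samples: int) -> dict:
    """Estimate priming and trailing padding for one encoded segment.

    Assumption: the encoder writes priming + payload, rounded up to a
    whole number of AAC blocks, and nothing else.
    """
    priming = PRIMING_FRAMES * AAC_FRAME
    total = priming + payload_samples
    frames = math.ceil(total / AAC_FRAME)
    padding = frames * AAC_FRAME - total
    return {
        "priming_samples": priming,
        "priming_seconds": priming / SAMPLE_RATE,
        "encoded_frames": frames,
        "padding_samples": padding,
    }


info = encoder_overhead(44_100)  # one 1-second segment
print(info)
```

Under these assumptions, the leading 2048 samples and the trailing padding are the "junk" that has to be excluded from each segment.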
With bug 1524890, we now trim frames so that, rather than dropping an entire frame, we make it fit within [appendWindowStart, appendWindowEnd].
We end up with a perfect [0, 10s] buffered range, with no more holes.
And yet, you can hear gaps in the audio.
You can give it a try.
Windows 64: https://queue.taskcluster.net/v1/task/CDvmPWdhQZq5Fwlk4lIUBg/runs/0/artifacts/public/build/install/sea/target.installer.exe
Mac: https://queue.taskcluster.net/v1/task/VC8v6OYJSnOjgjgRv2tBkQ/runs/0/artifacts/public/build/target.dmg
I'll investigate further what's going on here.
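The trimming described in this comment can be illustrated with a small Python sketch. This is not Gecko's code: it models frames as integer sample intervals, assumes each segment is shifted back by the 2048 priming samples, and assumes 46 encoded frames per segment (a hypothetical count based on minimal padding). `trim_to_window` is a hypothetical helper name.

```python
from typing import List, Optional, Tuple

Interval = Tuple[int, int]  # (start, end) in samples, end exclusive

AAC_FRAME = 1024
PRIMING = 2 * AAC_FRAME            # samples shifted before the window start
WINDOW: Interval = (0, 44_100)     # append window for the first 1-second segment


def trim_to_window(frame: Interval, window: Interval) -> Optional[Interval]:
    """Clip a frame to the append window; return None if nothing remains."""
    start = max(frame[0], window[0])
    end = min(frame[1], window[1])
    return (start, end) if start < end else None


# Assumed layout: 46 frames, the first starting 2048 samples before the window.
frames: List[Interval] = [
    (-PRIMING + k * AAC_FRAME, -PRIMING + (k + 1) * AAC_FRAME) for k in range(46)
]
kept = [t for t in (trim_to_window(f, WINDOW) for f in frames) if t is not None]
print(kept[0], kept[-1], len(kept))
```

In this model the two priming frames vanish entirely, the last frame is clipped rather than dropped, and the kept intervals cover the window with no hole. That matches the continuous buffered range reported above, but it says nothing about what the decoder outputs for the first kept frame.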
| Assignee | ||
Comment 14•7 years ago
|
||
After a bit of investigation, this is what's happening, and why we still hear a gap.
The first two audio packets (1024 audio frames each) are dropped because they are fully outside the append window. We rely on the demuxed data, which states that the packets start at 0 and 0.023220s respectively (each with a duration of 0.023220s). By that information, neither of those packets is needed.
So we start decoding from the 3rd packet.
When decoding starts from that 3rd packet, the first 449 decoded audio frames are silence (0s). That leads to an audible gap of 10.2ms every second, since this happens at the start of every one-second segment.
Now, if I don't drop those first two packets and instead feed them to the decoder (either FFmpeg's or Apple's CoreAudio AAC decoder), I get the following:
The first decoded packet results in 1024 frames of silence.
The 2nd decoded packet has only its first 512 frames as silence (more accurately, the first 450 are 0s and the rest isn't audible); the remaining 512 frames are noise.
The 3rd decoded packet outputs 1024 frames of noise.
So to get the behaviour we're expecting (apparently gapless playback), we would need to keep that 2nd packet and somehow make it invisible to MSE, while still using it for decoding and then dropping all of its decoded frames.
I have to think about how that could be done with the existing structure in place.
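As a quick check of the figures in this comment, the Python sketch below converts frame counts to milliseconds at 44.1kHz. The 449-frame silence and the 1024-frame packet size come from the comment; the helper name `frames_to_ms` and the 10-segment total are illustrative.

```python
SAMPLE_RATE = 44_100  # Hz


def frames_to_ms(frames: int, rate: int = SAMPLE_RATE) -> float:
    """Convert a number of audio frames to milliseconds."""
    return frames / rate * 1000


packet_ms = frames_to_ms(1024)      # duration of one AAC packet
gap_ms = frames_to_ms(449)          # silence at the start of each segment
total_gap_ms = frames_to_ms(449 * 10)  # across the ten 1-second segments

print(f"packet: {packet_ms:.3f} ms, gap: {gap_ms:.2f} ms, total: {total_gap_ms:.1f} ms")
```

The per-segment gap comes out at roughly 10.2ms, which agrees with the value quoted above.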
| Assignee | ||
Comment 15•7 years ago
|
||
Some audio decoders, such as AAC and Opus, need pre-roll content. So, in order to fully recover the content of the first frame, we keep the frame just before it, which would normally have been dropped.
We set this frame's duration to 1us so that it will be dropped later by the decoding pipeline. The start time of the first frame is adjusted so that we have continuous data, without a gap in the buffered range.
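A rough Python sketch of the idea follows. It is one possible reading of the description, not the actual Gecko patch: the comment does not spell out exactly how the start time is adjusted, so the choice here (place the 1us pre-roll packet at the start of the first kept packet and shift that packet by 1us) is an assumption. The `Packet` type and `apply_window_with_preroll` function are hypothetical.

```python
from dataclasses import dataclass
from typing import List

PREROLL_DURATION_US = 1  # the 1us duration mentioned in the comment


@dataclass(frozen=True)
class Packet:
    index: int        # position in the original stream
    start_us: int     # presentation start, microseconds
    duration_us: int

    @property
    def end_us(self) -> int:
        return self.start_us + self.duration_us


def apply_window_with_preroll(packets: List[Packet], window_start_us: int) -> List[Packet]:
    """Drop packets ending at or before the window start, but keep the
    last dropped packet as a 1us pre-roll so the decoder is primed."""
    first_kept = next(
        (i for i, p in enumerate(packets) if p.end_us > window_start_us),
        len(packets),
    )
    kept = list(packets[first_kept:])
    if first_kept == 0 or not kept:
        return kept  # nothing was dropped, or nothing survives
    first = kept[0]
    # Assumption: the pre-roll packet takes over the first kept packet's start time.
    preroll = Packet(packets[first_kept - 1].index, first.start_us, PREROLL_DURATION_US)
    # Assumption: the first kept packet is shifted by 1us to keep the timeline continuous.
    kept[0] = Packet(first.index, first.start_us + PREROLL_DURATION_US,
                     first.duration_us - PREROLL_DURATION_US)
    return [preroll] + kept


# Four packets of ~23.22ms each, shifted back by two packets of priming.
packets = [Packet(i, (i - 2) * 23_220, 23_220) for i in range(4)]
result = apply_window_with_preroll(packets, window_start_us=0)
print(result)
```

In this model the 2nd packet still reaches the decoder, while its 1us duration means it contributes essentially nothing to the buffered range.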
| Assignee | ||
Comment 16•7 years ago
|
||
With this latest change, both streams play without gaps.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=a25734ad0ea40ee3513861c52020414de5f9c619
| Reporter | ||
Comment 17•7 years ago
|
||
Awesome, glad to hear!
I'd be happy to test as well once you have a build to share...
| Assignee | ||
Comment 20•7 years ago
|
||
(In reply to Tomer Lahav from comment #17)
Awesome, glad to hear!
I'd be happy to test as well once you have a build to share...
This feature is now available in Firefox Nightly 67, thank you for testing :)
| Reporter | ||
Comment 21•7 years ago
|
||
Tested it on nightly - sounds perfectly seamless, thank you!!
Comment 22•3 years ago
|
||
Is this broken again? The demo in the original report (http://melatonin64.github.io/gapless-mse-audio/) doesn't work with the latest releases.
I'm testing it on:
Version: 105.0.3
Build ID: 20221007233509
User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:105.0) Gecko/20100101 Firefox/105.0
| Assignee | ||
Comment 23•3 years ago
|
||
Please open a new bug if the problem still occurs.