[MSE] Implement gapless audio support
Categories
(Core :: Audio/Video: Playback, enhancement, P2)
Tracking
()
Tracking | Status | |
---|---|---|
firefox67 | --- | fixed |
People
(Reporter: jya, Assigned: jya)
References
(Blocks 1 open bug, )
Details
Attachments
(17 files)
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review |
MSE specs have provision to implement gapless audio in the way of having appendWindowStart / appendWindowEnd truncate frames rather than dropping affected frames entirely.
http://w3c.github.io/media-source/index.html#sourcebuffer-coded-frame-processing
step 8 and 9:
-
If presentation timestamp is less than appendWindowStart, then set the need random access point flag to true, drop the coded frame, and jump to the top of the loop to start processing the next coded frame.
NoteSome implementations MAY choose to collect some of these coded frames with presentation timestamp less than appendWindowStart and use them to generate a splice at the first coded frame that has a presentation timestamp greater than or equal to appendWindowStart even if that frame is not a random access point. Supporting this requires multiple decoders or faster than real-time decoding so for now this behavior will not be a normative requirement.
-
If frame end timestamp is greater than appendWindowEnd, then set the need random access point flag to true, drop the coded frame, and jump to the top of the loop to start processing the next coded frame.
NoteSome implementations MAY choose to collect coded frames with presentation timestamp less than appendWindowEnd and frame end timestamp greater than appendWindowEnd and use them to generate a splice across the portion of the collected coded frames within the append window at time of collection, and the beginning portion of later processed frames which only partially overlap the end of the collected coded frames. Supporting this requires multiple decoders or faster than real-time decoding so for now this behavior will not be a normative requirement. In conjunction with collecting coded frames that span appendWindowStart, implementations MAY thus support gapless audio splicing.
Additionally, when adding overlapping frames:
step 13.
If last decode timestamp for track buffer is unset and presentation timestamp falls within the presentation interval of a coded frame in track buffer, then run the following steps:
right now we drop the frames whose starting times fall within the presentation interval of the added frame. We should truncate them instead.
This last part would have prevented bug 1524500 from occurring.
Updated•2 years ago
|
Assignee | ||
Comment 1•2 years ago
|
||
Assignee | ||
Comment 2•2 years ago
|
||
Depends on D20159
Assignee | ||
Comment 3•2 years ago
|
||
So that we never access the underlying buffer directly.
Depends on D20160
Assignee | ||
Comment 4•2 years ago
|
||
And we add some strong assertions that we never read passed the end of the buffer.
Depends on D20161
Assignee | ||
Comment 5•2 years ago
|
||
Make its use more explicit and less likely to be incorrect.
Depends on D20162
Assignee | ||
Comment 6•2 years ago
|
||
The number of frames is only meaningful with audio as a VideoData always contain a single frame.
So we only have this member in AudioData, which will simplify the logic in a future commit where the number of frames present depends on the trimming filter applied.
Depends on D20163
Assignee | ||
Comment 7•2 years ago
|
||
This will allow to remove mFrames member and calculate from the size of the content, which will dynamically change depending on a cropping filter.
Depends on D20164
Assignee | ||
Comment 8•2 years ago
|
||
Depends on D20165
Assignee | ||
Comment 9•2 years ago
|
||
Depends on D20166
Assignee | ||
Comment 10•2 years ago
|
||
Don't re-create a new trimmed AudioData when we want to remove some content. This remove the need for some copies.
Depends on D20167
Assignee | ||
Comment 11•2 years ago
|
||
It can be determined from the size of the buffer and the number of audio frames. Additionally, it ensures that the duration of the frame is always exactly what the AudioData contains.
Depends on D20168
Assignee | ||
Comment 12•2 years ago
|
||
Nicer than passing int64_t directly.
Depends on D20170
Assignee | ||
Comment 13•2 years ago
|
||
There's two cases to handle.
1- A Frame isn't entirely contained between appendWindowStart and appendWindowEnd
2- A new frame is appended which overlaps partially an existing frame.
To achieve this we tweak the start time and duration of the sample added (case 1), or the frame about to be partially covered (case 2). We then set a flag that will indicate to the decoder that the decompressed frame will have to be truncated.
Depends on D20171
Assignee | ||
Comment 14•2 years ago
|
||
A simplistic decoder wrapper that will take a decoded frame and truncate it should the originally compressed frame contained trimming information.
Depends on D20172
Assignee | ||
Comment 15•2 years ago
|
||
With these changes
https://nzhang227.github.io/gapless_audio_mse/demo.html
2nd test plays without gap.
Assignee | ||
Updated•2 years ago
|
Assignee | ||
Comment 16•2 years ago
|
||
Depends on D20173
Assignee | ||
Comment 17•2 years ago
|
||
Assignee | ||
Comment 18•2 years ago
|
||
We didn't set the duration on the created IMF sample before sending it to the decoder.
For audio this isn't a problem as the duration is calculated from the number of frames returned.
For video however, the duration appears to have been calculated by WMF as the delta of pts. However, for the last frame, the value returned was set to a value not matching our input.
So we set the duration on the IMFSample so it doesn't have to make it up.
Depends on D20330
Assignee | ||
Updated•2 years ago
|
Comment 19•2 years ago
|
||
Pushed by jyavenard@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/a00f0fd94fbd P1. Remove unused method. r=bryce https://hg.mozilla.org/integration/autoland/rev/2b3f87f5d02c P2. Add Intersects methods. r=gerald https://hg.mozilla.org/integration/autoland/rev/22d4f90342ef P3. Add AudioData::Data method that returns a Span. r=bryce https://hg.mozilla.org/integration/autoland/rev/90c5339325a3 P4. Use Span<> with AudioBufferCursor. r=bryce https://hg.mozilla.org/integration/autoland/rev/f5daf84b3b7e P5. Make MediaData::Type an enum class. r=bryce https://hg.mozilla.org/integration/autoland/rev/66eb3b2554bd P6. Remove mFrames member from MediaData. r=bryce https://hg.mozilla.org/integration/autoland/rev/926ec4aa5429 P7. No longer access AudioData::mFrames directly. r=bryce https://hg.mozilla.org/integration/autoland/rev/731a0f1e7f21 P8. Rely on buffer length to calculate the number of frames. r=bryce https://hg.mozilla.org/integration/autoland/rev/ddc76517c336 P9. Add += and -= operator. r=gerald https://hg.mozilla.org/integration/autoland/rev/08fd6ddd88e4 P10. Add AudioData::SetTrimWindow. r=bryce https://hg.mozilla.org/integration/autoland/rev/9b93c2718c18 P11. Remove duration from AudioData construction parameter. r=bryce https://hg.mozilla.org/integration/autoland/rev/c3f6ea574cdf P12. Serialise TimeUnit over ipdl. r=mjf https://hg.mozilla.org/integration/autoland/rev/c4e8f8c88edf P13. [MSE] Mark frames as truncated when needed. r=bryce https://hg.mozilla.org/integration/autoland/rev/d60fccd7de59 P14. Add AudioTrimmer wrapper. r=bryce https://hg.mozilla.org/integration/autoland/rev/4ff646d086c0 P15. [MSE] Add extra logging. r=bryce https://hg.mozilla.org/integration/autoland/rev/fe39e68ee383 P16. Fix mochitests. r=bryce https://hg.mozilla.org/integration/autoland/rev/1b377d3c27c3 P17. Set duration on IMF sample. r=bryce
Comment 20•2 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/a00f0fd94fbd
https://hg.mozilla.org/mozilla-central/rev/2b3f87f5d02c
https://hg.mozilla.org/mozilla-central/rev/22d4f90342ef
https://hg.mozilla.org/mozilla-central/rev/90c5339325a3
https://hg.mozilla.org/mozilla-central/rev/f5daf84b3b7e
https://hg.mozilla.org/mozilla-central/rev/66eb3b2554bd
https://hg.mozilla.org/mozilla-central/rev/926ec4aa5429
https://hg.mozilla.org/mozilla-central/rev/731a0f1e7f21
https://hg.mozilla.org/mozilla-central/rev/ddc76517c336
https://hg.mozilla.org/mozilla-central/rev/08fd6ddd88e4
https://hg.mozilla.org/mozilla-central/rev/9b93c2718c18
https://hg.mozilla.org/mozilla-central/rev/c3f6ea574cdf
https://hg.mozilla.org/mozilla-central/rev/c4e8f8c88edf
https://hg.mozilla.org/mozilla-central/rev/d60fccd7de59
https://hg.mozilla.org/mozilla-central/rev/4ff646d086c0
https://hg.mozilla.org/mozilla-central/rev/fe39e68ee383
https://hg.mozilla.org/mozilla-central/rev/1b377d3c27c3
Comment 21•2 years ago
|
||
Hi Jean, is there something we should test here in particular ?
Assignee | ||
Comment 22•2 years ago
|
||
you can test the attached URL if interested
Updated•2 years ago
|
Description
•