[MSE] Implement gapless audio support

RESOLVED FIXED in Firefox 67

Status

()

P2
normal
Rank:
15
RESOLVED FIXED
2 months ago
5 days ago

People

(Reporter: jya, Assigned: jya)

Tracking

(Blocks: 1 bug)

unspecified
mozilla67
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox67 fixed)

Details

(URL)

Attachments

(17 attachments)

47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
(Assignee)

Description

2 months ago

MSE specs have provision to implement gapless audio in the way of having appendWindowStart / appendWindowEnd truncate frames rather than dropping affected frames entirely.

http://w3c.github.io/media-source/index.html#sourcebuffer-coded-frame-processing

step 8 and 9:

  1. If presentation timestamp is less than appendWindowStart, then set the need random access point flag to true, drop the coded frame, and jump to the top of the loop to start processing the next coded frame.
    Note

    Some implementations MAY choose to collect some of these coded frames with presentation timestamp less than appendWindowStart and use them to generate a splice at the first coded frame that has a presentation timestamp greater than or equal to appendWindowStart even if that frame is not a random access point. Supporting this requires multiple decoders or faster than real-time decoding so for now this behavior will not be a normative requirement.

  2. If frame end timestamp is greater than appendWindowEnd, then set the need random access point flag to true, drop the coded frame, and jump to the top of the loop to start processing the next coded frame.
    Note

    Some implementations MAY choose to collect coded frames with presentation timestamp less than appendWindowEnd and frame end timestamp greater than appendWindowEnd and use them to generate a splice across the portion of the collected coded frames within the append window at time of collection, and the beginning portion of later processed frames which only partially overlap the end of the collected coded frames. Supporting this requires multiple decoders or faster than real-time decoding so for now this behavior will not be a normative requirement. In conjunction with collecting coded frames that span appendWindowStart, implementations MAY thus support gapless audio splicing.

Additionally, when adding overlapping frames:
step 13.
If last decode timestamp for track buffer is unset and presentation timestamp falls within the presentation interval of a coded frame in track buffer, then run the following steps:

right now we drop the frames whose starting times fall within the presentation interval of the added frame. We should truncate them instead.

This last part would have prevented bug 1524500 from occurring.

Updated

2 months ago
Rank: 15
Priority: -- → P2
(Assignee)

Comment 2

a month ago

Depends on D20159

(Assignee)

Comment 3

a month ago

So that we never access the underlying buffer directly.

Depends on D20160

(Assignee)

Comment 4

a month ago

And we add some strong assertions that we never read passed the end of the buffer.

Depends on D20161

(Assignee)

Comment 5

a month ago

Make its use more explicit and less likely to be incorrect.

Depends on D20162

(Assignee)

Comment 6

a month ago

The number of frames is only meaningful with audio as a VideoData always contain a single frame.
So we only have this member in AudioData, which will simplify the logic in a future commit where the number of frames present depends on the trimming filter applied.

Depends on D20163

(Assignee)

Comment 7

a month ago

This will allow to remove mFrames member and calculate from the size of the content, which will dynamically change depending on a cropping filter.

Depends on D20164

(Assignee)

Comment 9

a month ago

Depends on D20166

(Assignee)

Comment 10

a month ago

Don't re-create a new trimmed AudioData when we want to remove some content. This remove the need for some copies.

Depends on D20167

(Assignee)

Comment 11

a month ago

It can be determined from the size of the buffer and the number of audio frames. Additionally, it ensures that the duration of the frame is always exactly what the AudioData contains.

Depends on D20168

(Assignee)

Comment 12

a month ago

Nicer than passing int64_t directly.

Depends on D20170

(Assignee)

Comment 13

a month ago

There's two cases to handle.
1- A Frame isn't entirely contained between appendWindowStart and appendWindowEnd
2- A new frame is appended which overlaps partially an existing frame.

To achieve this we tweak the start time and duration of the sample added (case 1), or the frame about to be partially covered (case 2). We then set a flag that will indicate to the decoder that the decompressed frame will have to be truncated.

Depends on D20171

(Assignee)

Comment 14

a month ago

A simplistic decoder wrapper that will take a decoded frame and truncate it should the originally compressed frame contained trimming information.

Depends on D20172

(Assignee)

Comment 15

a month ago

With these changes

https://nzhang227.github.io/gapless_audio_mse/demo.html

2nd test plays without gap.

(Assignee)

Comment 16

a month ago

Depends on D20173

(Assignee)

Comment 18

a month ago

We didn't set the duration on the created IMF sample before sending it to the decoder.
For audio this isn't a problem as the duration is calculated from the number of frames returned.
For video however, the duration appears to have been calculated by WMF as the delta of pts. However, for the last frame, the value returned was set to a value not matching our input.

So we set the duration on the IMFSample so it doesn't have to make it up.

Depends on D20330

(Assignee)

Updated

a month ago
Assignee: nobody → jyavenard

Comment 19

28 days ago
Pushed by jyavenard@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/a00f0fd94fbd
P1. Remove unused method. r=bryce
https://hg.mozilla.org/integration/autoland/rev/2b3f87f5d02c
P2. Add Intersects methods. r=gerald
https://hg.mozilla.org/integration/autoland/rev/22d4f90342ef
P3. Add AudioData::Data method that returns a Span. r=bryce
https://hg.mozilla.org/integration/autoland/rev/90c5339325a3
P4. Use Span<> with AudioBufferCursor. r=bryce
https://hg.mozilla.org/integration/autoland/rev/f5daf84b3b7e
P5. Make MediaData::Type an enum class. r=bryce
https://hg.mozilla.org/integration/autoland/rev/66eb3b2554bd
P6. Remove mFrames member from MediaData. r=bryce
https://hg.mozilla.org/integration/autoland/rev/926ec4aa5429
P7. No longer access AudioData::mFrames directly. r=bryce
https://hg.mozilla.org/integration/autoland/rev/731a0f1e7f21
P8. Rely on buffer length to calculate the number of frames. r=bryce
https://hg.mozilla.org/integration/autoland/rev/ddc76517c336
P9. Add += and -= operator. r=gerald
https://hg.mozilla.org/integration/autoland/rev/08fd6ddd88e4
P10. Add AudioData::SetTrimWindow. r=bryce
https://hg.mozilla.org/integration/autoland/rev/9b93c2718c18
P11. Remove duration from AudioData construction parameter. r=bryce
https://hg.mozilla.org/integration/autoland/rev/c3f6ea574cdf
P12. Serialise TimeUnit over ipdl. r=mjf
https://hg.mozilla.org/integration/autoland/rev/c4e8f8c88edf
P13. [MSE] Mark frames as truncated when needed. r=bryce
https://hg.mozilla.org/integration/autoland/rev/d60fccd7de59
P14. Add AudioTrimmer wrapper. r=bryce
https://hg.mozilla.org/integration/autoland/rev/4ff646d086c0
P15. [MSE] Add extra logging. r=bryce
https://hg.mozilla.org/integration/autoland/rev/fe39e68ee383
P16. Fix mochitests. r=bryce
https://hg.mozilla.org/integration/autoland/rev/1b377d3c27c3
P17. Set duration on IMF sample. r=bryce
Depends on: 1530278
Depends on: 1530280
Depends on: 1530322
(Assignee)

Updated

5 days ago
See Also: → bug 1510093
You need to log in before you can comment on or make changes to this bug.