Don't create a new decoder for a TrackBuffer when timecode overlap is within error epsilon

RESOLVED FIXED in mozilla35

Status

()

defect
RESOLVED FIXED
5 years ago
5 years ago

People

(Reporter: kinetik, Assigned: kinetik)

Tracking

(Blocks 1 bug)

Trunk
mozilla35
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 1 obsolete attachment)

(Assignee)

Description

5 years ago
SwitchAudioReader(19.999219) ranges=[( 0.000000, 20.024000)]
SwitchAudioReader(19.999219) ranges=[(20.023000, 90.157000)]

The overlap heuristic in SourceBuffer::AppendData detects the overlap of the second range start time and the first range end time and creates a new decoder.  

The overlap is 1 millisecond, which is the timecode granularity of most WebM files (i.e. they have a timecode scale of 1000000).  The error causes by timecode resolution should be taken into account when determining if a range overlaps.

The current behaviour should work correctly (i.e. other than using unnecessary decoders, there should be no user observable difference), but it would be preferable to avoid creating decoders in this situation.  Note that this can only happen when ContainerParser::IsMediaSegmentPresent returns true (i.e. on a cluster boundary, for WebM), so the inefficiency caused by this problem is somewhat limited.
(Assignee)

Updated

5 years ago
See Also: → 1062670
(Assignee)

Comment 1

5 years ago
This fixes the issue for WebM video.  There are issues remaining with WebM audio due to the way frame durations are calculated, which can be incorrect with audio variable length audio frames.  I'll chase that up in a separate bug.
(Assignee)

Updated

5 years ago
Assignee: nobody → kinetik
Status: NEW → ASSIGNED
(Assignee)

Updated

5 years ago
See Also: → 1065207
(Assignee)

Updated

5 years ago
See Also: → 1065218
(Assignee)

Updated

5 years ago
Attachment #8486884 - Flags: review?(cajbir.bugzilla)

Updated

5 years ago
Attachment #8486884 - Flags: review?(cajbir.bugzilla) → review?(karlt)
(Assignee)

Comment 2

5 years ago
Rebased.
Attachment #8486884 - Attachment is obsolete: true
Attachment #8486884 - Flags: review?(karlt)
Attachment #8486974 - Flags: review?(karlt)
Comment on attachment 8486884 [details] [diff] [review]
Switch internal SourceBuffer timestamp handling from double/seconds to int64_t/microseconds.

Is there a convention that int64_t timestamps in media code are always
microseconds?  If not, please document the units of the member variables and
parameters, or use a typedef.

>+  virtual bool TimestampsFuzzyEqual(int64_t aLhs, int64_t aRhs)
>+  {
>+    NS_WARNING("Using default ContainerParser::TimestampFuzzyEquals implementation");

The frequency of the warning will essentially be saying that this *must* be
fixed for MP4.  Is that the intention?  Perhaps warn only on first call, if
this is less of a priority.

>+    aEnd = mapping[endIdx].mTimecode / NS_PER_USEC;
>+    aEnd += (mapping[endIdx].mTimecode - mapping[endIdx - 1].mTimecode) / NS_PER_USEC;

Wouldn't this be more likely to align with a contiguous start if the estimate
were calculated in NS and then aligned on USEC, instead of aligning
both the timecode and duration individually on USEC?

>-        (start < lastEnd || start - lastEnd > 0.1)) {

>+        !mParser->TimestampsFuzzyEqual(start, lastEnd)) {

I don't know what is the best algorithm here, but this change is more than
just switching from double to int64_t, so please document the change in the commit message.

Is the decision here to determine whether the existing decoder/reader pair
will process the new data appropriately?  Will it be OK with small overlaps?

How much overlap is not OK?
(Assignee)

Comment 4

5 years ago
(In reply to Karl Tomlinson (:karlt) from comment #3)
> Is there a convention that int64_t timestamps in media code are always
> microseconds?  If not, please document the units of the member variables and
> parameters, or use a typedef.

Outside of the format-specific code, yes.  

> >+  virtual bool TimestampsFuzzyEqual(int64_t aLhs, int64_t aRhs)
> >+  {
> >+    NS_WARNING("Using default ContainerParser::TimestampFuzzyEquals implementation");
> 
> The frequency of the warning will essentially be saying that this *must* be
> fixed for MP4.  Is that the intention?  Perhaps warn only on first call, if
> this is less of a priority.

That's the intention, once someone who understand MP4 timecodes has inspected it they can write the necessary (hopefully simple) implementation.

> Wouldn't this be more likely to align with a contiguous start if the estimate
> were calculated in NS and then aligned on USEC, instead of aligning
> both the timecode and duration individually on USEC?

In practice it doesn't matter because almost all WebM timestamps are in milliseconds, but I agree it'd be better and I'll change it.

> I don't know what is the best algorithm here, but this change is more than
> just switching from double to int64_t, so please document the change in the
> commit message.

Will do.

> Is the decision here to determine whether the existing decoder/reader pair
> will process the new data appropriately?  Will it be OK with small overlaps?
> 
> How much overlap is not OK?

In practice, probably up to a frame duration, since most formats/parsers will require/expect data muxed in monotonically increasing order.  What we're trying to detect here is cases that appear to be overlapping based on timecode comparison but are actually contiguous chunks of data (something that would be trivial with a byte offset supplied), and treat anything else as a real overlap that requires a new decoder.

I don't know what the best approach is here yet, either, so this is really just an attempt to improve the status quo.
Comment on attachment 8486974 [details] [diff] [review]
Switch internal SourceBuffer timestamp handling from double/seconds to int64_t/microseconds.

Thanks for the explanations.

(In reply to Matthew Gregan [:kinetik] from comment #4)
> (In reply to Karl Tomlinson (:karlt) from comment #3)
> > How much overlap is not OK?
> 
> In practice, probably up to a frame duration, since most formats/parsers
> will require/expect data muxed in monotonically increasing order.  What
> we're trying to detect here is cases that appear to be overlapping based on
> timecode comparison but are actually contiguous chunks of data (something
> that would be trivial with a byte offset supplied), and treat anything else
> as a real overlap that requires a new decoder.

That sounds OK, provided frame durations are not going to be
less than or equal to twice GetTimecodeScale().

> I don't know what the best approach is here yet, either, so this is really
> just an attempt to improve the status quo.

Changing from floating point to fixed is sensible at least.
Attachment #8486974 - Flags: review?(karlt) → review+
https://hg.mozilla.org/mozilla-central/rev/f63cd36acc65
Status: ASSIGNED → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla35

Updated

5 years ago
Blocks: 1067785
You need to log in before you can comment on or make changes to this bug.