Closed Bug 1239223 Opened 4 years ago Closed 4 years ago

Curious timestamps of audio data when decoding sine.webm

Categories

(Core :: Audio/Video: Playback, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla46
Tracking Status
firefox46 --- fixed

People

(Reporter: jwwang, Assigned: jya)

References

Details

Attachments

(3 files)

Run "NSPR_LOG_MODULES=timestamp:1,MediaSample:4 ./mach run dom/media/test/sine.webm". We will get logs like:

2016-01-13 07:00:08.099949 UTC - 166700800[7f6402b741a0]: Decoder=7f640432b300 OnAudioDecoded [0,0] disc=1
2016-01-13 07:00:08.100140 UTC - 368506624[7f6402b73f60]: Decoder=7f640432b300 OnAudioDecoded [0,12000] disc=0
2016-01-13 07:00:08.100264 UTC - 166700800[7f6402b741a0]: Decoder=7f640432b300 OnAudioDecoded [0,21333] disc=0
2016-01-13 07:00:08.100435 UTC - 368506624[7f6402b73f60]: Decoder=7f640432b300 OnAudioDecoded [0,21333] disc=0
2016-01-13 07:00:08.100538 UTC - 368506624[7f6402b73f60]: Decoder=7f640432b300 OnAudioDecoded [0,21333] disc=0
2016-01-13 07:00:08.100669 UTC - 166700800[7f6402b741a0]: Decoder=7f640432b300 OnAudioDecoded [0,21333] disc=0
2016-01-13 07:00:08.100879 UTC - 368506624[7f6402b73f60]: Decoder=7f640432b300 OnAudioDecoded [0,21333] disc=0
2016-01-13 07:00:08.101005 UTC - 166700800[7f6402b741a0]: Decoder=7f640432b300 OnAudioDecoded [0,21333] disc=0
2016-01-13 07:00:08.101097 UTC - 166700800[7f6402b741a0]: Decoder=7f640432b300 OnAudioDecoded [141324,162657] disc=0
2016-01-13 07:00:08.104721 UTC - 368506624[7f6402b73f60]: Decoder=7f640432b300 OnAudioDecoded [141324,162657] disc=0
2016-01-13 07:00:08.104855 UTC - 166700800[7f6402b741a0]: Decoder=7f640432b300 OnAudioDecoded [141324,162657] disc=0
2016-01-13 07:00:08.104973 UTC - 368506624[7f6402b73f60]: Decoder=7f640432b300 OnAudioDecoded [141324,162657] disc=0
2016-01-13 07:00:08.105081 UTC - 209991424[7f6402b74080]: Decoder=7f640432b300 OnAudioDecoded [141324,162657] disc=0
2016-01-13 07:00:08.105209 UTC - 368506624[7f6402b73f60]: Decoder=7f640432b300 OnAudioDecoded [141324,162657] disc=0
2016-01-13 07:00:08.105374 UTC - 166434560[7f6402b742c0]: Decoder=7f640432b300 OnAudioDecoded [141324,162657] disc=0
2016-01-13 07:00:08.105548 UTC - 166434560[7f6402b742c0]: Decoder=7f640432b300 OnAudioDecoded [141324,162657] disc=0
2016-01-13 07:00:08.105648 UTC - 166434560[7f6402b742c0]: Decoder=7f640432b300 OnAudioDecoded [312000,333333] disc=0
2016-01-13 07:00:08.105812 UTC - 166434560[7f6402b742c0]: Decoder=7f640432b300 OnAudioDecoded [312000,333333] disc=0
2016-01-13 07:00:08.105919 UTC - 209991424[7f6402b74080]: Decoder=7f640432b300 OnAudioDecoded [312000,333333] disc=0
2016-01-13 07:00:08.106027 UTC - 209991424[7f6402b74080]: Decoder=7f640432b300 OnAudioDecoded [312000,333333] disc=0
2016-01-13 07:00:08.106163 UTC - 368506624[7f6402b73f60]: Decoder=7f640432b300 OnAudioDecoded [312000,333333] disc=0
2016-01-13 07:00:08.106236 UTC - 368506624[7f6402b73f60]: Decoder=7f640432b300 OnAudioDecoded [312000,333333] disc=0
2016-01-13 07:00:08.106380 UTC - 368506624[7f6402b73f60]: Decoder=7f640432b300 OnAudioDecoded [312000,333333] disc=0
2016-01-13 07:00:08.106513 UTC - 209991424[7f6402b74080]: Decoder=7f640432b300 OnAudioDecoded [312000,333333] disc=0

There are many samples with the same start time.
Flags: needinfo?(jyavenard)
IIRC, if a block contains more than one sample, they will all have the same timestamp due to the way webm same timestamp.
Not sure if this is what's happening here.
Matthew would know more about this.
Flags: needinfo?(jyavenard) → needinfo?(kinetik)
WebM has timecodes on the (Simple)Block, but a Block may contain more than one frame.  In the case of sine.webm, we see:

| + SimpleBlock (key, track number 1, 8 frame(s), timecode 0.000s = 00:00:00.000) at 8707
|  + Frame with size 28
|  + Frame with size 57
|  + Frame with size 44
|  + Frame with size 43
|  + Frame with size 44
|  + Frame with size 44
|  + Frame with size 43
|  + Frame with size 44
| + [I frame for track 1, timecode 0]
| + SimpleBlock (key, track number 1, 8 frame(s), timecode 0.141s = 00:00:00.141) at 9069
[...]
| + SimpleBlock (key, track number 1, 8 frame(s), timecode 0.312s = 00:00:00.312) at 9434
[...]

Where the first 8 frames correspond to the entries in the log with timecode 0, the next 8 correspond with timecode 141, etc.

To have accurate start times for these frames, we'd need to decode each frame and synthesize the timecodes ourselves.  Given that we're clearly already doing this work to synthesize the end timecode/frame duration, it looks like a bug that we're not updating the start timecode.
Flags: needinfo?(kinetik)
Assignee: nobody → jyavenard
Priority: -- → P5
https://dxr.mozilla.org/mozilla-central/source/dom/media/webm/WebMDemuxer.cpp#545

if a block contains more than one frames, they will all have the same timecode / duration

This was the same in the old WebMReader:
http://hg.mozilla.org/mozilla-central/file/f23035b9e6bd/dom/media/webm/WebMReader.cpp#l459
As far as the demuxer knows every frame in a block has the same start time.  There's (almost always) no duration available from the demuxer (and a duration, like a timecode, is attached to the block rather than the frame(s) contained within).  The actual duration of each frame can differ in the case of audio and you need to decode (or at least inspect) each frame to find out the real duration.

For reference, the old code in VorbisDecoder::Decode in http://hg.mozilla.org/mozilla-central/file/f23035b9e6bd/dom/media/webm/AudioDecoder.cpp#l191

...computes the start time and duration of each frame in the block based on the duration of the last one, as I mentioned in comment 2.
thank you.

So I need to re-assess how we do MSE with Opus and Vorbis then, as we store frames straight out of the demuxer, and the buffered range is calculated per what has been demuxed.

If all the frames have the same start time (we haven't decoded anything yet), it means that our buffered range will be one sample long.. that's no good.

I'll check however in this particular case, why the frames have the same start time out of the decoder. That should be identical to the previous webmreader / AudioDecoder
Priority: P5 → P2
Attachment #8708215 - Flags: review?(kinetik) → review+
See Also: → 1240201
https://hg.mozilla.org/mozilla-central/rev/2fbc29d0ecfe
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla46
Blocks: 1241057
Blocks: 1241062
No longer blocks: 1241062
sorry, bz got confused where to upload the patch... this patch is for bug 1241062
Comment on attachment 8709941 [details] [diff] [review]
Bug 1241062: [opus] P1. Properly calculate decoded audio sample timestamps.

Review of attachment 8709941 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/media/platforms/agnostic/OpusDecoder.cpp
@@ +258,5 @@
>    if (!duration.isValid()) {
>      NS_WARNING("OpusDataDecoder: Int overflow converting WebM audio duration");
>      return -1;
>    }
> +  CheckedInt64 time = startTime - FramesToUsecs(mOpusParser->mPreSkip - mFrames,

Can't this underflow?
Attachment #8709941 - Flags: review?(kinetik)
the maximum size of an opus sample is 120ms, so it's fairly unlikely it ever could.

I could of course change that to startTime + FramesToUsecs(mFrames,...) - FramesToUsecs(mOpusParser->mPreSkip - mFrames

mFrames will in practice always be 0 once bug 1240201 land
Just add an assert then.  r+
Attachment #8710262 - Flags: review?(kinetik) → review+
You need to log in before you can comment on or make changes to this bug.