Open Bug 1220051 Opened 9 years ago Updated 2 years ago

audio out of sync in mp4 on firefox windows 7

Categories

(Core :: Audio/Video: Playback, defect, P3)

x86_64
Windows 7
defect

Tracking

()

People

(Reporter: peeyush, Unassigned)

Details

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.71 Safari/537.36

Steps to reproduce:


Example URL:
https://www.dropbox.com/s/zjq2brw7comtadp/videoStream_1444945180029_363.mp4?dl=1

Steps to reproduce the problem:
1. Download the video from the URL above.
2. Play it in Firefox.



Actual results:

You will notice a significant audio sync problem.


Expected results:

The audio and video should be in sync. When the same video is played in Firefox on Mac, the audio and video are in sync.
OS: Unspecified → Windows 7
Hardware: Unspecified → x86_64
Confirming on Nightly 45.0a1 20151029 and 41.0.2 on Windows 8.1. Plays in sync in VLC.
Status: UNCONFIRMED → NEW
Component: Untriaged → Audio/Video: Playback
Ever confirmed: true
Product: Firefox → Core
Version: 41 Branch → Trunk
Flags: needinfo?(jyavenard)
How was that video encoded?

Every single frame has a different duration. I've never seen that before!
Flags: needinfo?(peeyush)
This video was encoded using the Wowza transcoder on a Wowza server. We have a setup where an RTMP (Flash) webcam stream from the browser is first recorded by Wowza to an FLV file and then encoded to an MP4. The webcam stream is captured using HDFVR (http://hdfvr.com).

Could you please tell me which command you used to get the information about the frame durations?
Flags: needinfo?(peeyush)
Something like:
ffprobe -show_entries packet=pts_time,dts_time,duration_time,stream_index videoStream_1444945180029_363.mp4

You need to install the ffprobe utility (it ships with FFmpeg).
I ran the command and can see what you meant. However, I also ran it on a second video in which only a few frames have different durations rather than all of them, and that one has no sync issue in Firefox on Windows 7.

https://www.dropbox.com/s/gah9f5nq9fdjyh3/videoStream_1446206627573_216.mp4?dl=1
That file is badly muxed.
I'm tempted to close this as invalid.

There are multiple frames with the same timestamp in there, like:
stream_index=0
pts_time=1.135000
dts_time=1.135000
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=1.135000
dts_time=1.135000
duration_time=0.044000

Twice we have a frame at 1.135000s.

or:
[PACKET]
stream_index=0
pts_time=2.218000
dts_time=2.218000
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=2.218000
dts_time=2.218000
duration_time=0.051000
[/PACKET]


Every 20 frames, there is a frame with the same start time as the previous one and usually a different duration.

You're lucky it plays at all.
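
A minimal Python sketch for spotting such duplicates in other files, assuming the ffprobe output is in the default [PACKET]...[/PACKET] text form shown above. The function names (parse_packets, duplicate_pts) are made up for illustration and are not part of ffprobe or Firefox.

```python
from collections import defaultdict


def parse_packets(text):
    """Parse ffprobe's [PACKET]...[/PACKET] text output into a list of dicts."""
    packets, current = [], None
    for line in text.splitlines():
        line = line.strip()
        if line == "[PACKET]":
            current = {}
        elif line == "[/PACKET]":
            if current is not None:
                packets.append(current)
            current = None
        elif current is not None and "=" in line:
            key, value = line.split("=", 1)
            current[key] = value
    return packets


def duplicate_pts(packets, stream_index="0"):
    """Return {pts_time: [duration_time, ...]} for timestamps used more than once."""
    seen = defaultdict(list)
    for packet in packets:
        if packet.get("stream_index") == stream_index:
            seen[packet.get("pts_time")].append(packet.get("duration_time"))
    return {pts: durations for pts, durations in seen.items() if len(durations) > 1}


# Two packets copied from the ffprobe output quoted in this bug.
sample = """\
[PACKET]
stream_index=0
pts_time=2.218000
dts_time=2.218000
duration_time=0.033333
[/PACKET]
[PACKET]
stream_index=0
pts_time=2.218000
dts_time=2.218000
duration_time=0.051000
[/PACKET]
"""

print(duplicate_pts(parse_packets(sample)))
# {'2.218000': ['0.033333', '0.051000']}
```

To check a whole file, the text printed by the ffprobe command above can be saved and passed to parse_packets() in place of the sample string.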
So Chrome plays it just the same on Windows (broken A/V sync); however, Edge plays it properly.

Not surprising that Chrome plays it the same; their WMF-based decoder is almost identical to ours.

Our code only sets the presentation timestamp on the samples about to be decoded, never the duration.
According to the IMFTransform doc:
https://msdn.microsoft.com/en-us/library/windows/desktop/dd940438%28v=vs.85%29.aspx
"For video, if the duration is not available in the compressed format, the decoder should calculate the duration as the inverse of the frame rate, converted to 100-nanosecond units and rounded down."
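
As a worked example of the rule quoted above, here is a small Python sketch of the arithmetic. The function name and the numerator/denominator form of the frame rate are assumptions for illustration; this is not WMF code.

```python
def default_duration_hns(fps_num, fps_den=1):
    """Inverse of the frame rate in 100-nanosecond units, rounded down.

    One second is 10,000,000 units of 100 ns; integer floor division
    provides the rounding down.
    """
    return (10_000_000 * fps_den) // fps_num


print(default_duration_hns(30))           # 333333
print(default_duration_hns(30000, 1001))  # 333666
```

For a 30 fps stream this gives 333333 units, i.e. about 0.033333 s, which is the duration_time that appears on many packets in the ffprobe output.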

Now here, the decoder doesn't appear to do precisely that, and the behaviour is closer to setting the duration based on the pts (which makes more sense); however, the durations set on the output samples do not always match those of the input.

I tried explicitly setting the duration on the input samples; while that does cause some variation, the durations on the output samples still do not match what was set on input, and the WMF decoder recalculates them anyway.

We could do what I did for the ffmpeg decoder and keep a map that matches timestamps to durations (doing so makes that video play just fine); however, the WMF doc clearly states that the decoder is free to alter the returned samples, including combining them (though that probably doesn't make much sense for video).
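
A rough Python sketch of the timestamp-to-duration map idea, not the actual ffmpeg decoder code. The class and method names are hypothetical. Because this file contains repeated timestamps, the sketch keeps a FIFO of durations per timestamp; that detail is an assumption of the sketch, not something confirmed in this bug.

```python
from collections import defaultdict, deque


class DurationMap:
    """Remember input durations by timestamp and hand them back on output."""

    def __init__(self):
        # pts (microseconds) -> queue of durations, oldest first, so that
        # repeated timestamps do not overwrite each other.
        self._by_pts = defaultdict(deque)

    def remember(self, pts_us, duration_us):
        """Record the demuxer's duration before the sample goes to the decoder."""
        self._by_pts[pts_us].append(duration_us)

    def take(self, pts_us, fallback_us):
        """Return the remembered duration for a decoded sample, else the fallback."""
        queue = self._by_pts.get(pts_us)
        if not queue:
            return fallback_us
        duration = queue.popleft()
        if not queue:
            del self._by_pts[pts_us]
        return duration


durations = DurationMap()
durations.remember(2_218_000, 33_333)
durations.remember(2_218_000, 51_000)
print(durations.take(2_218_000, fallback_us=0))  # 33333
print(durations.take(2_218_000, fallback_us=0))  # 51000
print(durations.take(2_218_000, fallback_us=0))  # 0
```

The fallback covers the case raised above: if the decoder combines or drops samples, an output timestamp may have no matching input entry.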

I also tried setting the time only on the first sample, and then setting only the durations on all remaining samples, but the result is rubbish: all durations on the output happen to be 0.
That, by the way, exhibits something I believe is wrong behaviour in our code: all frames are displayed very quickly, one after the other.

Chris, got any ideas?
Flags: needinfo?(cpearce)
It's hard to believe that being 1 frame per 20 frames out could cause such noticeable A/V sync issues, but anyhoo...

(In reply to Jean-Yves Avenard [:jya] from comment #6)
> [PACKET]
> stream_index=0
> pts_time=2.218000
> dts_time=2.218000
> duration_time=0.033333
> [/PACKET]
> [PACKET]
> stream_index=0
> pts_time=2.218000
> dts_time=2.218000
> duration_time=0.051000
> [/PACKET]

I'm guessing that in this case, Edge plays the first packet from [2.218000,2.251333] and the second packet from [2.251333,2.269]...

What if in the WMF decoder we remember the end time of the last output, and set the timestamp of the next output to be the maximum of the timestamp reported by WMF and the end time of the previous sample? Then we'd have the above behaviour.
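
A minimal Python sketch of this idea, using integer microseconds. The function name is hypothetical, and keeping each sample's original end time (pts + duration) rather than shifting it is an assumption chosen to reproduce the intervals guessed above; it is not existing decoder code.

```python
def clamp_start_times(samples):
    """Map (pts_us, duration_us) pairs to non-overlapping (start_us, end_us) pairs.

    Each start time is the maximum of the reported timestamp and the end
    time of the previous output.
    """
    out = []
    prev_end = None
    for pts, duration in samples:
        start = pts if prev_end is None else max(pts, prev_end)
        # Assumption: keep the original end time, but never let it precede start.
        end = max(start, pts + duration)
        out.append((start, end))
        prev_end = end
    return out


# The two packets quoted above, in microseconds.
print(clamp_start_times([(2_218_000, 33_333), (2_218_000, 51_000)]))
# [(2218000, 2251333), (2251333, 2269000)]
```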
Flags: needinfo?(cpearce)
(In reply to Chris Pearce (:cpearce) from comment #8)
> It's hard to believe that being 1 frame per 20 frames out could cause such
> noticeable A/V sync issues, but anyhoo...

It's not just that.

The durations returned by IMFSample::GetSampleDuration() are mostly wrong.

> I'm guessing that in this case, Edge plays the first packet from
> [2.218000,2.251333] and the second packet from [2.251333,2.269]...
> 
> What if in the WMF decoder we remember the end time of the last output, and
> set the timestamp of the next output to be the maximum of the timestamp
> reported by WMF and the end time of the previous sample? Then we'd have the
> above behaviour.

We could try that...
However, I don't know whether the behaviour would be identical with all HW decoders, as it appears that what a decoder does is up to it.
Flags: needinfo?(jyavenard)
Priority: -- → P2
Peeyush,

It may pay to file a bug against the application generating the file or to use something like ffmpeg to re-encode it properly.
Mass change P2 -> P3
Priority: P2 → P3
Severity: normal → S3