Closed Bug 466699 Opened 16 years ago Closed 15 years ago

audio video out of sync with ogg theora + vorbis

Categories

(Core :: Audio/Video, defect, P2)

x86
macOS
defect

Tracking

()

VERIFIED FIXED

People

(Reporter: asa, Assigned: cajbir)

References

Details

(Keywords: verified1.9.1)

Attachments

(2 files, 1 obsolete file)

I'm told by chris double that this requires a platform specific fix but I see it on both Mac and Windows so I hope it's OK to file this as a single bug for now. 

Audio and video get out of sync with ogg theora + vorbis in the latest Firefox trunk builds.

examples: http://piercedotzler.com/asa/obama-medium-high.ogv which was encoded with Handbrake and http://piercedotzler.com/asa/obama-scaled-high.ogg which was encoded with ffmpeg2theora (not sure if those are different libtheora verions) both show the problem. 

Both are theora and vorbis I think. Some readers at my blog say it works fine the first time and loses sync only on subsequent plays.
Flags: blocking1.9.1?
Sorry, I should have noted (in case it's important) that both were transcoded from an h.264 QuickTime movie. I don't remember what the specific transcoding options were except that they were to the high side on quality.
Chris Double wrote:
> The sync issues are due to us using the system to clock to synchronize the audio with the video. So we start the audio playing, start the system clock count, and use the latter to decide what video frame to display.
>
> This has issues with drift - which is what you noticed and why pause/unpause fixed it - we reset the system clock value that we sync with when you do that.
>
> It also has issues due to latency. Where the audio hardware takes a few milliseconds to actually play the sample we sent to it due to buffering and hardware latency. 
>
> Ideally we'd sync with the audio hardware clock time. That is, we find out the actual time of the current sample being played by the audio hardware and sync the video frame to that. This is in fact what we originally did in beta1 on linux. Unfortunately the support for getting this information is unreliable in the audio framework we're using and I'm waiting for feedback regarding a fix. This is why I took it out following beta 1. It caused major problems on PulseAudio systems, didn't work on Windows but did work on Mac. 
>
> I'm hoping to get hardware sync support back in soon. Yes, there really should be a bug for it.
>
> Another issue that can complicate matters is some programs that convert to Ogg don't synchronise correctly. But then you'd notice the issue in VLC too. I use ffmpeg2theora for the conversions at http://tinyvid.tv and it works quite well.
>
> Chris.
Chris Double wrote:

> The sync issues are due to us using the system to clock to
> synchronise the audio with the video. So we start the audio playing,
> start the system clock count, and use the latter to decide what video
> frame to display.
> 
> This has issues with drift - which is what you noticed Mark and why
> pause/unpause fixed it - we reset the system clock value that we sync
> with when you do that.
> 
> It also has issues due to latency. Where the audio hardware takes a
> few milliseconds to actually play the sample we sent to it due to
> buffering and hardware latency.

> Ideally we'd sync with the audio hardware clock time. That is, we
> find out the actual time of the current sample being played by the
> audio hardware and sync the video frame to that. This is in fact what
> we originally did in beta1 on linux. Unfortunately the support for
> getting this information is unreliable in the audio framework we're
> using and I'm waiting for feedback regarding a fix. This is why I
> took it out following beta 1. It caused major problems on PulseAudio
> systems, didn't work on Windows but did work on Mac.

> I'm hoping to get hardware sync support back in soon. Yes, there
> really should be a bug for it.

> Another issue that can complicate matters is some programs that
> convert to Ogg don't synchronise correctly. But then you'd notice the
> issue in VLC too. I use ffmpeg2theora for the conversions at
> http://tinyvid.tv and it works quite well.
Flags: blocking1.9.1? → blocking1.9.1+
Assignee: nobody → chris.double
Chris, are you working on a patch for this?  I thought you were but I haven't seen anything here.
Yes, I'm currently working on this.
Upcoming patch needs fixes in bug 485433 for bugs in the libsydneyaudio alsa backend.
Depends on: 485433
I still really really want this for 1.9.1 but we aren't going to actually block the release on it.
Flags: blocking1.9.1+ → wanted1.9.1+
Is it safe to take on a 3.5.1 release?
Blizzard argued for this to actually block based on the visibility of the bug. Since we have fixes for Windows and Mac in hand, I'm willing to block on those, but not for Linux where the situation is more murky.
Flags: wanted1.9.1+ → blocking1.9.1+
These are patches to libsydneyaudio required to get the avsync working. It implements pause/resume in the alsa backend and fixes a bug in the mac pause/resume so that it matches the behavior of the other backends.

I'll arrange upstreaming of these.
Attached patch Fix (obsolete) — Splinter Review
Synchronizes the video frame with the audio position obtained from the audio backend. We constantly feed data to the audio backend whenever we can rather than when the frame is due to ensure a constant uninterrupted stream of audio data is available and accurate audio location is obtained.

Pausing and resuming now use the pause/resume feature of the audio backend. This keeps the audio time accurate. It also fixes the issue where pausing resulted in buffered audio data still playing until the hardware buffer was exhausted.
Attachment #377353 - Flags: superreview?(roc)
Attachment #377353 - Flags: review?(roc)
+        time = 
+          hwtime < 0.0 ?

Same line

-        time = (TimeStamp::Now() - mPlayStartTime - mPauseDuration).ToSeconds();
+        QueueDecodedFrames();

Why do we need QueueDecodedFrames here?

+        PlayAudio(frame);

Seems like we can call PlayAudio(frame) for the same frame multiple times? Surely that's a bug?

+        double hwtime = mAudioStream ? mAudioStream->GetPosition() : -1.0;
+        time = 
+          hwtime < 0.0 ?
+            (TimeStamp::Now() - mPlayStartTime - mPauseDuration).ToSeconds() :
+            hwtime;
         if (time < frame->mTime) {
-          mon.Wait(PR_MillisecondsToInterval(PRInt64((frame->mTime - time)*1000)));
+          mon.Wait(PR_MillisecondsToInterval(PRInt64((frame->mTime - time)*500)));

Seems like if frame->mTime - time is one second, then if timers are really accurate we'll first wait for 500 ms, then for 250, then for 125, ... isn't that bad?

       while (!mDecodedFrames.IsEmpty() && time >= mDecodedFrames.Peek()->mTime) {
         LOG(PR_LOG_DEBUG, ("Skipping frame time %f with audio at time %f", mDecodedFrames.Peek()->mTime, time));
+        PlayAudio(frame);
         delete frame;
         frame = mDecodedFrames.Peek();
         mDecodedFrames.Pop();
       }

I don't understand why we play the audio for skipped frames. That audio is going to take up time, so shouldn't we display the video for that frame while its audio is playing? I guess my real question is why we even try to catch up here. What's the point?

It seems to me the logical thing to do would be to treat the audio and video separately, play enough audio to keep the buffers full and scan the frame queue to play the appropriate video frame for the current audio stream position.

+  mLastFrameTime = 0;
+  mDecodedFrames.ResetTimes(mCallbackPeriod);

This seems wrong. mLastFrameTime is the frame for the last frame in the queue, but ResetTimes sets the last frame in the queue's time to mCallbackPeriod times the number of frames minus one, right?

In fact, do we actually need to store times in the frames? It seems to me we can compute the time for each frame from mLastFrameTime and the number of frames in the queue.
> Same line

fixed.

> Why do we need QueueDecodedFrames here?

I've moved it outside of the loop. It's needed to place frames decoded by liboggplay into mDecodedFrames so the check we skip frames correctly.

> Seems like we can call PlayAudio(frame) for the same frame multiple times?
> Surely that's a bug?

No because we reset the audio data in the frame when its been played so it only gets played once. nsAudioStream queues this data up if it can't write it immediately to the hardware. We still need to call PlayAudio however, even if we are sending no frame audio data, so it can play audio data it has buffered but not yet written to hardware. I've modifed the comment on PlayAudio data to mention this.

> Seems like if frame->mTime...

Fixed.

> I don't understand why we play the audio for skipped frames.

We're not really playing the audio for the skipped frame. We're playing additional audio that liboggplay has decoded, along with audio buffered by nsAudioStream but not yet sent to the hardware. liboggplay doesn't match exactly the audio data for the frame. So the audio data attached to a frame isn't the exact amount of audio required for that particular frame.

There is a certain amount of lead-in data, sometimes some data for the next frame, etc. We must pump all the audio data so we can get the correct data from the audio time to do the sync. 

So we are in effect doing exactly what you say in the 'seems to me the logical thing to do...'.

Now that we understand better what liboggplay is doing with audio data in relation to the video data it would probably be better not to store the audio data as part of the frame object and queue it immediately to be played, then regularly call nsAudioStream to play it. Maybe better would be to have this as an event queue that we send messages to whenever we get audio data. These are refactorings I thought too big for this patch so close to release.

> This seems wrong.

Fixed. I assign mLastFrameTime to the next time after the last item in the queue.

> In fact, do we actually need to store times in the frames?

At the time I did it this way because there was the chance of reusing the frame data in a cache or for optimizing seeking which I ended up not doing. I wanted to know the actual time associated with the frame data. We could change it in this patch if you prefer or maybe do a separate cleanup bug for it.
Attachment #377353 - Attachment is obsolete: true
Attachment #377400 - Flags: superreview?(roc)
Attachment #377400 - Flags: review?(roc)
Attachment #377353 - Flags: superreview?(roc)
Attachment #377353 - Flags: review?(roc)
I've been seeking around this morning in a long ogg file on Linux and one thing I've noticed is that sometimes the audio just stops playing when you seek, even though the video keeps playing.  Have you seen this symptom and if you have, do you think this patch will help with that?
Why don't you just try the patch?
Comment on attachment 377400 [details] [diff] [review]
address review comments

Thanks.
Attachment #377400 - Flags: superreview?(roc)
Attachment #377400 - Flags: superreview+
Attachment #377400 - Flags: review?(roc)
Attachment #377400 - Flags: review+
(In reply to comment #14)
> So we are in effect doing exactly what you say in the 'seems to me the logical
> thing to do...'.

I believe you...

> Now that we understand better what liboggplay is doing with audio data in
> relation to the video data it would probably be better not to store the audio
> data as part of the frame object and queue it immediately to be played, then
> regularly call nsAudioStream to play it. Maybe better would be to have this as
> an event queue that we send messages to whenever we get audio data. These are
> refactorings I thought too big for this patch so close to release.

Yes, I think if the audio data was handled separately this would be much clearer.

Maybe we could restructure things like this, post 3.5: have a single decoder thread that takes turns running oggplay_step_decode for all videos. The audio data is immediately extracted and sent to nsAudioStreams, the video frames are put into per-decoder frame queues. Have a single control thread that runs the state machines for all the decoders, takes care of flushing the nsAudioStreams, and pulls video frames from the frame queue to display the one that best matches the current audio time.
Pushed to mozilla-central:
http://hg.mozilla.org/mozilla-central/rev/dab9a93171de
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Whiteboard: [needs landing] → [baking for 1.9.1]
Pushed libsydneyaudio patch to mozilla-central:
http://hg.mozilla.org/mozilla-central/rev/8cae9be96140
Blocks: 470639
I feel AUDIO/VIDEO sync better, but still see some out off sync issues.
1. when playing long video
2. after skipping video

As there are many A/V bugs getting fixed today, i will retest it tomorrow.
I'm not sure this is fixed, when seeking through the video and awaiting audio to catch up.  See my screencast i captured that if you engage the video seek, the audio will quickly get out of sync.

See my screencast url:  http://www.screencast.com/t/5BqIYjywy
my testurl: http://tinyvid.tv/show/arjohrmgt2yf

Tested on Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2a1pre) Gecko/20090526 Minefield/3.6a1pre
and Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1pre) Gecko/20090526 Shiretoko/3.5pre Ubiquity/0.1.5

Reopening for blocker attention.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1pre)
Gecko/20090526 Shiretoko/3.5pre)

I can confirm Tony's behaviour as described above.  Audio lags when jumping forward, corrects when moving back with seek bar.

NOTE:
The youtube and downloaded versions of the same video do not show this behaviour.  I think it is safe to say it's us.
Possibly it's bug 493958 that is still causing problems.
Hmm... I see a/v sync problems with that video.
(In reply to comment #24)
> I'm not sure this is fixed, when seeking through the video and awaiting audio
> to catch up. 
> [...] 
> Reopening for blocker attention.

Opened bug Bug 495004 for this. Closing this bug.
Status: REOPENED → RESOLVED
Closed: 15 years ago15 years ago
Resolution: --- → FIXED
Given the comments from bug 495004, this is fixed in Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2a1pre) Gecko/20090526 Minefield/3.6a1pre
and Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1pre) Gecko/20090526 Shiretoko/3.5pre Ubiquity/0.1.5.   Thanks for investigating.
Status: RESOLVED → VERIFIED
Blocks: 483493
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: