Closed Bug 788185 Opened 12 years ago Closed 10 years ago

Investigate A/V sync issues in WebRTC

Categories

(Core :: WebRTC: Audio/Video, defect)


Tracking


RESOLVED FIXED
mozilla21

People

(Reporter: jesup, Assigned: jesup)

References

Details

(Whiteboard: [getUserMedia], [WebRTC], [blocking-gum-], [blocking-webrtc-][s=fx32][p=5, 1.5:p1, ft:webrtc])

Attachments

(1 file, 1 obsolete file)

Investigate A/V sync issues with WebRTC

This includes issues like clock drift and compensation in getUserMedia and MediaStreams.

Adding instrumentation would be good; unit tests to make sure compensation works would be even better.
Attachment #708866 - Attachment is obsolete: true
Comment on attachment 708930 [details] [diff] [review]
add a/v sync to Audio/Video Conduits

Syncing a/v turned out to be pretty easy since audio and video are both in the mPipelines array. After 10+ minutes, both sides still appeared to have good sync. Debug logs verify the two sides get synced up.
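The pairing described here can be sketched in simplified form (names and types are assumed stand-ins, not the real conduit classes): iterate the per-stream pipeline map and, when a track of the opposite type is found, cross the streams by syncing the video side to the audio side.

```cpp
#include <cassert>
#include <map>

// Minimal stand-in for the real conduit types (hypothetical; the actual
// classes live in media/webrtc/signaling/src/media-conduit/).
struct Conduit {
  bool isVideo = false;
  const Conduit* syncedTo = nullptr;
};

// Mirrors the idea in RemoteSourceStreamInfo: pipelines for one MediaStream,
// keyed by track id. When a new track arrives, look for one of the other
// type and pair them.
static void PairForSync(std::map<int, Conduit>& pipelines,
                        int newTrack, bool newIsVideo) {
  for (auto& [track, conduit] : pipelines) {
    if (track != newTrack && conduit.isVideo != newIsVideo) {
      // One audio, one video in the same stream: cross the streams.
      Conduit& video = newIsVideo ? pipelines[newTrack] : conduit;
      Conduit& audio = newIsVideo ? conduit : pipelines[newTrack];
      video.syncedTo = &audio;  // the real code calls VideoConduit::SyncTo()
      return;
    }
  }
}
```

Because both tracks of a stream live in the same map, the pairing needs no global state: adding the second track of a pair is enough to trigger the sync.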
Attachment #708930 - Attachment description: add a/v sync to Audio/Video Conduits WIP → add a/v sync to Audio/Video Conduits
Attachment #708930 - Flags: review?(tterribe)
Attachment #708930 - Flags: review?(ekr)
Comment on attachment 708930 [details] [diff] [review]
add a/v sync to Audio/Video Conduits

Review of attachment 708930 [details] [diff] [review]:
-----------------------------------------------------------------

This will work, so r=me, but see the final comment, which would probably require a re-review should you choose to address it.

::: media/webrtc/signaling/src/media-conduit/VideoConduit.cpp
@@ +241,5 @@
> +{
> +  CSFLogDebug(logTag, "%s Synced to %p", __FUNCTION__, aConduit);
> +
> +  if (aConduit) {
> +    mPtrViEBase->SetVoiceEngine(aConduit->GetVoiceEngine());

You might want to add a comment saying that this will break things if we ever share ViEs among multiple conduits (including if you plan to merge the send/receive conduits), since setting the VoE will clear the audio channel association for all video channels.

@@ +246,5 @@
> +    mPtrViEBase->ConnectAudioChannel(mChannel, aConduit->GetChannel());
> +    // NOTE: this means the VideoConduit will keep the AudioConduit alive!
> +    mSyncedTo = aConduit;
> +  } else if (mSyncedTo) {
> +    mPtrViEBase->DisconnectAudioChannel(mChannel);

This call shouldn't actually be necessary (as mentioned above, setting the VoE... to anything, including NULL... clears all audio channels).

::: media/webrtc/signaling/src/peerconnection/PeerConnectionMedia.cpp
@@ +345,5 @@
> +  // XXX Needs to be adjusted when we support multiple streams of the same type
> +  for (std::map<int, bool>::iterator it = mTypes.begin(); it != mTypes.end(); ++it) {
> +    if (it->second != aIsVideo) {
> +      // Ok, we have one video, one non-video - cross the streams!
> +      mozilla::WebrtcAudioConduit *audio_conduit = static_cast<mozilla::WebrtcAudioConduit*> 

Trailing whitespace.

@@ +349,5 @@
> +      mozilla::WebrtcAudioConduit *audio_conduit = static_cast<mozilla::WebrtcAudioConduit*> 
> +                                                   (aIsVideo ?
> +                                                    mPipelines[it->first]->Conduit() :
> +                                                    aPipeline->Conduit());
> +      mozilla::WebrtcVideoConduit *video_conduit = static_cast<mozilla::WebrtcVideoConduit*> 

Ditto.

::: media/webrtc/signaling/src/peerconnection/PeerConnectionMedia.h
@@ +224,5 @@
>    NS_INLINE_DECL_THREADSAFE_REFCOUNTING(RemoteSourceStreamInfo)
>   private:
>    nsRefPtr<nsDOMMediaStream> mMediaStream;
>    std::map<int, mozilla::RefPtr<mozilla::MediaPipeline> > mPipelines;
> +  std::map<int, bool> mTypes;

I think it would make a lot more sense to make this an attribute of the MediaPipeline. Then you don't need to have a separate hash map, etc.
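Tim's suggestion can be sketched as follows (a hypothetical simplification, not the real MediaPipeline class): the pipeline carries its own type, so the containing class no longer needs a parallel map.

```cpp
#include <cassert>
#include <memory>

// Hypothetical sketch: give MediaPipeline a type attribute and accessor so
// the separate std::map<int, bool> mTypes becomes unnecessary.
class MediaPipeline {
 public:
  enum class Type { AUDIO, VIDEO };
  explicit MediaPipeline(Type aType) : mType(aType) {}
  Type GetType() const { return mType; }
  bool IsVideo() const { return mType == Type::VIDEO; }
 private:
  Type mType;
};
```

A caller holding `mPipelines` can then ask each pipeline directly, keeping the type and the pipeline from ever disagreeing.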
Attachment #708930 - Flags: review?(tterribe) → review+
Attachment #708930 - Flags: review?(ekr)
Comment on attachment 708930 [details] [diff] [review]
add a/v sync to Audio/Video Conduits

Review of attachment 708930 [details] [diff] [review]:
-----------------------------------------------------------------

This seems like the wrong design.... I'd rather see the pairing logic live in VCM.

::: media/webrtc/signaling/src/media-conduit/AudioConduit.h
@@ +161,5 @@
>  
>    MediaConduitErrorCode Init(WebrtcAudioConduit *other);
>  
> +  int GetChannel() { return mChannel; }
> +  webrtc::VoiceEngine* GetVoiceEngine() { return mVoiceEngine; }

This isn't something we generally want people doing. I wonder if it would be better if VideoConduit were a friend.

::: media/webrtc/signaling/src/media-conduit/VideoConduit.cpp
@@ +241,5 @@
> +{
> +  CSFLogDebug(logTag, "%s Synced to %p", __FUNCTION__, aConduit);
> +
> +  if (aConduit) {
> +    mPtrViEBase->SetVoiceEngine(aConduit->GetVoiceEngine());

Check for return value.

@@ +242,5 @@
> +  CSFLogDebug(logTag, "%s Synced to %p", __FUNCTION__, aConduit);
> +
> +  if (aConduit) {
> +    mPtrViEBase->SetVoiceEngine(aConduit->GetVoiceEngine());
> +    mPtrViEBase->ConnectAudioChannel(mChannel, aConduit->GetChannel());

Check for return value, here and below.
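The requested error handling might look like the following sketch. `StubViEBase` is a hypothetical stand-in for the real ViE interface (whose calls return an int, with a nonzero result indicating failure); the point is the early-return-and-log pattern, not the exact API.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical stand-in for webrtc's ViEBase interface: each call returns
// 0 on success, -1 on failure.
struct StubViEBase {
  int SetVoiceEngine(void* aVoE) { return aVoE ? 0 : -1; }
  int ConnectAudioChannel(int aVideoCh, int aAudioCh) {
    return (aVideoCh >= 0 && aAudioCh >= 0) ? 0 : -1;
  }
};

// Sketch of SyncTo() with each return value checked: abort the sync and
// log on the first failure instead of continuing with a half-wired state.
static bool SyncTo(StubViEBase& base, void* voe, int videoCh, int audioCh) {
  if (base.SetVoiceEngine(voe) != 0) {
    std::fprintf(stderr, "SetVoiceEngine failed\n");
    return false;
  }
  if (base.ConnectAudioChannel(videoCh, audioCh) != 0) {
    std::fprintf(stderr, "ConnectAudioChannel failed\n");
    return false;
  }
  return true;
}
```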

@@ +244,5 @@
> +  if (aConduit) {
> +    mPtrViEBase->SetVoiceEngine(aConduit->GetVoiceEngine());
> +    mPtrViEBase->ConnectAudioChannel(mChannel, aConduit->GetChannel());
> +    // NOTE: this means the VideoConduit will keep the AudioConduit alive!
> +    mSyncedTo = aConduit;

This seems like a pretty big architectural change. Are you sure this is needed? What happens in the ViE code if you destroy the voice engine that you are connected to?

::: media/webrtc/signaling/src/media-conduit/VideoConduit.h
@@ +51,5 @@
>  
>    /**
> +   * Set up A/V sync between this (incoming) VideoConduit and an audio conduit.
> +   */
> +  void SyncTo(WebrtcAudioConduit *aConduit);

const WebrtcAudioConduit&

::: media/webrtc/signaling/src/peerconnection/PeerConnectionMedia.cpp
@@ +340,3 @@
>      return;
>    }
> +  CSFLogDebug(logTag, "%s track %d %s = %p", __FUNCTION__, aTrack, aIsVideo ? "video" : "audio",

This seems like the wrong place to put this logic.

In general, we are *only* using the structures in PCMedia for storage and shutdown. All manipulation (creation, pairing of audio conduits, setup, ...) is happening in VCM. Similarly, the logic for pairing A and V streams should be in VCM. This will also have the advantage that when we have multiple streams of the same type, we don't need to move a lot of the sync information into here from where it is present in VCM.

::: media/webrtc/signaling/src/peerconnection/PeerConnectionMedia.h
@@ +206,5 @@
>  
>    nsDOMMediaStream* GetMediaStream() {
>      return mMediaStream;
>    }
> +  void StorePipeline(int aTrack, bool aIsVideo,

A bool here seems very error-prone. We should use an enum instead.
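The hazard ekr is pointing at is that a bare `bool` argument reads the same as any other boolean at the call site. A hypothetical sketch of the enum alternative (names assumed, not the real patch):

```cpp
#include <cassert>

// Hypothetical sketch: StorePipeline(aTrack, true, ...) says nothing at the
// call site and silently accepts any boolean; an enum class makes the call
// self-documenting and rejects unrelated values at compile time.
enum class TrackKind { AUDIO, VIDEO };

struct PipelineStore {
  TrackKind lastStored = TrackKind::AUDIO;
  void StorePipeline(int /*aTrack*/, TrackKind aKind) { lastStored = aKind; }
};
```

A call like `store.StorePipeline(1, TrackKind::VIDEO)` cannot be confused with some other flag, and `store.StorePipeline(1, true)` no longer compiles.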

@@ +224,5 @@
>    NS_INLINE_DECL_THREADSAFE_REFCOUNTING(RemoteSourceStreamInfo)
>   private:
>    nsRefPtr<nsDOMMediaStream> mMediaStream;
>    std::map<int, mozilla::RefPtr<mozilla::MediaPipeline> > mPipelines;
> +  std::map<int, bool> mTypes;

In fact, the pipelines already know what type they are, at least via the conduit.
(In reply to Eric Rescorla (:ekr) from comment #5)
> Comment on attachment 708930 [details] [diff] [review]
> add a/v sync to Audio/Video Conduits
> 
> Review of attachment 708930 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> This seems like the wrong design.... I'd rather see the pairing logic live
> in VCM.

I'll address that.

> 
> ::: media/webrtc/signaling/src/media-conduit/AudioConduit.h
> @@ +161,5 @@
> >  
> >    MediaConduitErrorCode Init(WebrtcAudioConduit *other);
> >  
> > +  int GetChannel() { return mChannel; }
> > +  webrtc::VoiceEngine* GetVoiceEngine() { return mVoiceEngine; }
> 
> This isn't something we generally want people doing. I wonder if it would be
> better if VideoConduit were a friend.

Sure, no problem - though if we're really going that route, a shared MediaDevice that wraps all the Voice/VideoEngines might be better.  I already have a step in that direction as part of refactoring the send/receive merge.

> 
> ::: media/webrtc/signaling/src/media-conduit/VideoConduit.cpp
> @@ +241,5 @@
> > +{
> > +  CSFLogDebug(logTag, "%s Synced to %p", __FUNCTION__, aConduit);
> > +
> > +  if (aConduit) {
> > +    mPtrViEBase->SetVoiceEngine(aConduit->GetVoiceEngine());
> 
> Check for return value.
> 
> @@ +242,5 @@
> > +  CSFLogDebug(logTag, "%s Synced to %p", __FUNCTION__, aConduit);
> > +
> > +  if (aConduit) {
> > +    mPtrViEBase->SetVoiceEngine(aConduit->GetVoiceEngine());
> > +    mPtrViEBase->ConnectAudioChannel(mChannel, aConduit->GetChannel());
> 
> Check for return value, here and below.

Ok.

> 
> @@ +244,5 @@
> > +  if (aConduit) {
> > +    mPtrViEBase->SetVoiceEngine(aConduit->GetVoiceEngine());
> > +    mPtrViEBase->ConnectAudioChannel(mChannel, aConduit->GetChannel());
> > +    // NOTE: this means the VideoConduit will keep the AudioConduit alive!
> > +    mSyncedTo = aConduit;
> 
> This seems like a pretty big architectural change. Are you sure this is
> needed? What happens in the ViE code if you destroy the voice engine that
> you are connected to

Because we're holding the AudioConduit here, that can't happen (per the comment).  Moving to a separate Device structure to hold the VoiceEngine and channels/etc would allow us to let the audioconduit go away (see above).  This isn't a problem in our current usage; we may need to address it in some manner to support full renegotiation (but for that I likely would want the refactor of the Conduit->Engine interface first).
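The lifetime guarantee being described (the VideoConduit holds a strong reference, so the AudioConduit and the VoiceEngine it owns cannot go away while synced) can be sketched with standard smart pointers; `mozilla::RefPtr` behaves analogously for refcounted types. The class names here are simplified stand-ins.

```cpp
#include <cassert>
#include <memory>

// Simplified stand-ins for the real conduit types.
struct AudioConduit {
  int channel = 3;  // the voice channel the video side syncs against
};

struct VideoConduit {
  // Strong reference: as long as this is set, the AudioConduit (and hence
  // its voice engine) cannot be destroyed out from under the sync.
  std::shared_ptr<AudioConduit> mSyncedTo;
  void SyncTo(std::shared_ptr<AudioConduit> aConduit) {
    mSyncedTo = std::move(aConduit);
  }
};
```

Even if every other owner drops its reference, the synced AudioConduit stays alive until the VideoConduit clears `mSyncedTo`.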

The alternative would be to add pairing à la the receive/transmit audio pairing, and have the audio engine undo the sync when needed - but that would get trickier when we add support for multiple tracks of a type all being synced.

I'm already planning to refactor the pairing and at the same time bring VideoConduits into the fold and pair them (right now we fail to send some RTCP packets because of the lack of VideoConduit pairing), so addressing this in that work makes the most sense.

> ::: media/webrtc/signaling/src/media-conduit/VideoConduit.h
> @@ +51,5 @@
> >  
> >    /**
> > +   * Set up A/V sync between this (incoming) VideoConduit and an audio conduit.
> > +   */
> > +  void SyncTo(WebrtcAudioConduit *aConduit);
> 
> const WebrtcAudioConduit&

ok

> ::: media/webrtc/signaling/src/peerconnection/PeerConnectionMedia.cpp
> @@ +340,3 @@
> >      return;
> >    }
> > +  CSFLogDebug(logTag, "%s track %d %s = %p", __FUNCTION__, aTrack, aIsVideo ? "video" : "audio",
> 
> This seems like the wrong place to put this logic.
> 
> In general, we are *only* using the structures in PCMedia as storage and
> shutdown. All manipulation
> (creation, pairing of audio conduits, setup..) is happening in VCM.
> Similarly, the logic for pairing
> A and V streams should be in VCM. This will also have the advantage that
> when we have multiple
> streams of the same type, we don't need to move a lot of the sync
> information into here from where
> it is present in VCM.

Well, the pairing relies on this code to act as the array storage, but I can move it at the expense of a more complicated interface. I put it here because tracks in the same MediaStream live together in this structure, so it directly maps to the set of tracks that must be synchronized with each other. Moving the logic to the VCM seems to me to export what should be internal knowledge/state of the MediaStreams, and I don't see it actually helping anything. If tracks are in the same stream, they should be synced, always. I don't see that as manipulation; I see it as a direct consequence of them being put together, which was already decided by what stream each went into (in VCM). If I missed something or there's a consequence I don't see, please let me know, but otherwise I'd like to keep it here.

> ::: media/webrtc/signaling/src/peerconnection/PeerConnectionMedia.h
> @@ +206,5 @@
> >  
> >    nsDOMMediaStream* GetMediaStream() {
> >      return mMediaStream;
> >    }
> > +  void StorePipeline(int aTrack, bool aIsVideo,
> 
> A bool here seems very error-prone. We should use an enum instead.

Ok, see below

> @@ +224,5 @@
> >    NS_INLINE_DECL_THREADSAFE_REFCOUNTING(RemoteSourceStreamInfo)
> >   private:
> >    nsRefPtr<nsDOMMediaStream> mMediaStream;
> >    std::map<int, mozilla::RefPtr<mozilla::MediaPipeline> > mPipelines;
> > +  std::map<int, bool> mTypes;
> 
> In fact, the pipelines already know what type they are, at least via the
> conduit.

Ah, for some reason I missed that (perhaps because I didn't see them used much). Ok, that simplifies things a little, and matches Tim's request, which I was already working on.

I already landed the initial patch on m-c:

https://hg.mozilla.org/mozilla-central/rev/be76182b91a6

Leaving open for cleanup
Target Milestone: --- → mozilla21
Blocks: 970426
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED