Closed Bug 792175 (webrtc-big-lock) Opened 7 years ago Closed 7 years ago

Locks for PeerConnection to protect against multiple thread access

Categories

(Core :: WebRTC: Signaling, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla20

People

(Reporter: ekr, Assigned: ekr)

References

Details

(Whiteboard: [WebRTC], [blocking-webrtc+] [qa-])

Attachments

(2 files, 11 obsolete files)

2.22 KB, patch
Details | Diff | Splinter Review
127.45 KB, patch
jesup
: review+
Details | Diff | Splinter Review
No description provided.
Attached patch Lock PeerConnection (obsolete) — Splinter Review
Summary: Lock PeerConnection → Locks for PeerConnection to protect against multiple thread access
Whiteboard: [WebRTC]
Blocks: 791330
Whiteboard: [WebRTC] → [WebRTC], [blocking-webrtc+]
Attached patch Lock PeerConnection (obsolete) — Splinter Review
Attachment #662272 - Attachment is obsolete: true
Attachment #663040 - Flags: review?(tterribe)
Attachment #663040 - Flags: review?(ethanhugg)
Note: this was causing crashes in linked_ptr but now that I have cleaned up a few issues in the ASan for PCImpl, it seems to be ok.
Comment on attachment 663040 [details] [diff] [review]
Lock PeerConnection

Review of attachment 663040 [details] [diff] [review]:
-----------------------------------------------------------------

The big lock is a bit scary, but every lock I could find in VcmSIPCCBinding and PeerConnectionImpl is scoped so it should be OK.

::: media/webrtc/signaling/src/peerconnection/PeerConnectionImpl.cpp
@@ +962,5 @@
> +  return CloseInt();
> +}
> +
> +
> +NS_IMETHODIMP

This should be nsresult like the .h
Attachment #663040 - Flags: review?(ethanhugg) → review+
Comment on attachment 663040 [details] [diff] [review]
Lock PeerConnection

Review of attachment 663040 [details] [diff] [review]:
-----------------------------------------------------------------

I think this needs some work. See comments in-line.

::: media/webrtc/signaling/src/mediapipeline/MediaPipeline.cpp
@@ +362,5 @@
>  
>    if (main_thread_) {
>      main_thread_->Dispatch(WrapRunnable(
>        stream_->GetStream(), &mozilla::MediaStream::AddListener, listener_),
> +      NS_DISPATCH_NORMAL);

I don't see what guarantees stream_ and listener_ are still around when this event gets executed.

::: media/webrtc/signaling/src/peerconnection/PeerConnectionCtx.h
@@ +42,5 @@
>  
>  namespace sipcc {
>  
> +// A class to hold some fo the singleton objects we need:
> +// * The global PeerConnectionImpl table

End with a period, please.

@@ +44,5 @@
>  
> +// A class to hold some fo the singleton objects we need:
> +// * The global PeerConnectionImpl table
> +// * Currently SIPCC only allows a single stack instance to exist in a process
> +//   at once. This class implements a singleton object that wraps that

End with a period, please.

@@ +64,5 @@
>    PeerConnectionImpl::SipccState sipcc_state() { return mSipccState; }
>  
> +  // We could make these available only via accessors but it's too much trouble.
> +  std::map<const std::string, PeerConnectionImpl *> mPeerConnections;
> +  mozilla::Mutex mLock;

You need to document exactly what this lock is protecting. Even if it's "everything in PeerConnectionCtx", you need to say so.

::: media/webrtc/signaling/src/peerconnection/PeerConnectionImpl.cpp
@@ +65,5 @@
> +
> +  ~ApiDebug() {
> +    std::cerr << "Exiting " << fxn_ << std::endl;
> +  }
> + 

Trailing whitespace.

@@ +71,5 @@
> +};
> +
> +// Enter an API call and check that the state is OK,
> +// the PC isn't closed, etc.
> +#define PC_ENTER_API_CALL(assert_ice_ready) \

I'd recommend "PC_AUTO_ENTER_API_CALL" to make it clear this is an RAII construct. Otherwise someone might look for a corresponding PC_LEAVE_API_CALL() or (if less observant), not realize the lock is held until the end of the scope, even if they only wanted to make a single call.

@@ +72,5 @@
> +
> +// Enter an API call and check that the state is OK,
> +// the PC isn't closed, etc.
> +#define PC_ENTER_API_CALL(assert_ice_ready) \
> +  /*    ApiDebug apiDebug(__FUNCTION__); */                            \

Either do this properly (e.g., a construct only enabled with debugging or a log message), or remove it. Don't just leave it as commented-out code.

@@ +75,5 @@
> +#define PC_ENTER_API_CALL(assert_ice_ready) \
> +  /*    ApiDebug apiDebug(__FUNCTION__); */                            \
> +    MutexAutoLock pcApiLock_(PeerConnectionCtx::GetInstance()->mLock); \
> +    do { nsresult res = CheckApiState(assert_ice_ready); \
> +    if (NS_FAILED(res)) return res;} while(0)

So normally the "do { } while(0)" construct is used to allow things like
  if(foo)PC_ENTER_API_CALL(false);
  else ...
to work. That's obviously not going to work here, so it's just extra verbosity.

@@ +77,5 @@
> +    MutexAutoLock pcApiLock_(PeerConnectionCtx::GetInstance()->mLock); \
> +    do { nsresult res = CheckApiState(assert_ice_ready); \
> +    if (NS_FAILED(res)) return res;} while(0)
> +
> +// Enter an API call but don't check state. 

Trailing whitespace.

@@ +80,5 @@
> +
> +// Enter an API call but don't check state. 
> +#define PC_ENTER_API_CALL_NO_CHECK() \
> +  /*    ApiDebug apiDebug(__FUNCTION__); */                             \
> +    MutexAutoLock pcApiLock_(PeerConnectionCtx::GetInstance()->mLock);

Remove the trailing ";".

@@ +335,5 @@
>  
>  PeerConnectionImpl::~PeerConnectionImpl()
>  {
> +  // Remove ourselves from the table.
> +  MutexAutoLock pctable_lock(PeerConnectionCtx::GetInstance()->mLock);

Why aren't you using the PC_ENTER_API_CALL_NO_CHECK() macro here?

@@ +338,5 @@
> +  // Remove ourselves from the table.
> +  MutexAutoLock pctable_lock(PeerConnectionCtx::GetInstance()->mLock);
> +  PeerConnectionCtx::GetInstance()->mPeerConnections.erase(mHandle);
> +
> +  // Now self-destruct. Note that the lock protects us from

From what, exactly?

Also, as a point of style, any time you begin a sentence with "Note that", you can just drop those two words.

@@ +399,1 @@
>    ), NS_DISPATCH_SYNC);

There are a number of functions here which operate on a PeerConnectionImpl but don't acquire the global lock. You should either assert lock ownership or document why they don't need to.

@@ +963,5 @@
> +}
> +
> +
> +NS_IMETHODIMP
> +PeerConnectionImpl::CloseInt() {

Please assert mLock.AssertCurrentThreadOwns().

@@ +1072,5 @@
>    mReadyState = ready_state;
> +  RUN_ON_THREAD(mThread, WrapRunnable(mPCObserver,
> +                                      &IPeerConnectionObserver::OnStateChange,
> +                                      IPeerConnectionObserver::kReadyState),
> +                NS_DISPATCH_NORMAL);

Same lifetime question as above.

@@ +1077,3 @@
>  }
>  
> +// Acquire an instance. Note that this leaves the entire PC system locked

This kind of API seems incredibly dangerous. From what I can tell, it should be illegal to call this without immediately storing the result in a ScopedDeletePtr (which every callsite I saw does, but no guarantee that will continue in the future). Instead, I think you should replace PeerConnectionWrapper with an RAII friend class that calls AcquireInstance() in its constructor and ReleaseInstance() in its destructor, and then make those two methods private. You can make it extend MutexAutoLock (so you can inherit its benefits, like making heap allocation illegal) and then assert lock ownership in AcquireInstance() and ReleaseInstance().

@@ +1128,5 @@
>  PeerConnectionImpl::IceCompleted(NrIceCtx *ctx)
>  {
>    CSFLogDebug(logTag, "ICE completed");
>    mIceState = kIceConnected;
> +  

Whitespace-only change.

::: media/webrtc/signaling/src/peerconnection/PeerConnectionImpl.h
@@ +370,4 @@
>  
>    void ChangeReadyState(ReadyState ready_state);
> +  nsresult CheckApiState(bool assert_ice_ready) {
> +    PR_ASSERT(!assert_ice_ready || (mIceState != kIceGathering));

You should also assert lock ownership here.

::: media/webrtc/signaling/test/signaling_unittests.cpp
@@ +681,5 @@
>  class SignalingTest : public ::testing::Test {
>  public:
> +  void SetUp() {
> +    nsIThread *thread;
> +    

Trailing whitespace.
Attachment #663040 - Flags: review?(tterribe) → review-
> ::: media/webrtc/signaling/src/mediapipeline/MediaPipeline.cpp
> @@ +362,5 @@
> >  
> >    if (main_thread_) {
> >      main_thread_->Dispatch(WrapRunnable(
> >        stream_->GetStream(), &mozilla::MediaStream::AddListener, listener_),
> > +      NS_DISPATCH_NORMAL);
> 
> I don't see what guarantees stream_ and listener_ are still around when this
> event gets executed.

Yeah this looks like a bug.


> ::: media/webrtc/signaling/src/peerconnection/PeerConnectionCtx.h
> @@ +42,5 @@
> >  
> >  namespace sipcc {
> >  
> > +// A class to hold some fo the singleton objects we need:
> > +// * The global PeerConnectionImpl table
> 
> End with a period, please.
> 
> @@ +44,5 @@
> >  
> > +// A class to hold some fo the singleton objects we need:
> > +// * The global PeerConnectionImpl table
> > +// * Currently SIPCC only allows a single stack instance to exist in a process
> > +//   at once. This class implements a singleton object that wraps that
> 
> End with a period, please.
> 
> @@ +64,5 @@
> >    PeerConnectionImpl::SipccState sipcc_state() { return mSipccState; }
> >  
> > +  // We could make these available only via accessors but it's too much trouble.
> > +  std::map<const std::string, PeerConnectionImpl *> mPeerConnections;
> > +  mozilla::Mutex mLock;
> 
> You need to document exactly what this lock is protecting. Even if it's
> "everything in PeerConnectionCtx", you need to say so.
> 
> ::: media/webrtc/signaling/src/peerconnection/PeerConnectionImpl.cpp
> @@ +65,5 @@
> > +
> > +  ~ApiDebug() {
> > +    std::cerr << "Exiting " << fxn_ << std::endl;
> > +  }
> > + 
> 
> Trailing whitespace.
> 
> @@ +71,5 @@
> > +};
> > +
> > +// Enter an API call and check that the state is OK,
> > +// the PC isn't closed, etc.
> > +#define PC_ENTER_API_CALL(assert_ice_ready) \
> 
> I'd recommend "PC_AUTO_ENTER_API_CALL" to make it clear this is an RAII
> construct. Otherwise someone might look for a corresponding
> PC_LEAVE_API_CALL() or (if less observant), not realize the lock is held
> until the end of the scope, even if they only wanted to make a single call.

OK.


> @@ +72,5 @@
> > +
> > +// Enter an API call and check that the state is OK,
> > +// the PC isn't closed, etc.
> > +#define PC_ENTER_API_CALL(assert_ice_ready) \
> > +  /*    ApiDebug apiDebug(__FUNCTION__); */                            \
> 
> Either do this properly (e.g., a construct only enabled with debugging or a
> log message), or remove it. Don't just leave it as commented-out code.

OK.


> @@ +75,5 @@
> > +#define PC_ENTER_API_CALL(assert_ice_ready) \
> > +  /*    ApiDebug apiDebug(__FUNCTION__); */                            \
> > +    MutexAutoLock pcApiLock_(PeerConnectionCtx::GetInstance()->mLock); \
> > +    do { nsresult res = CheckApiState(assert_ice_ready); \
> > +    if (NS_FAILED(res)) return res;} while(0)
> 
> So normally the "do { } while(0)" construct is used to allow things like
>   if(foo)PC_ENTER_API_CALL(false);
>   else ...
> to work. That's obviously not going to work here, so it's just extra
> verbosity.

I think I disagree with this. IMO it's good hygiene to have a construct
that doesn't create duplicate semicolons. And this will be legal
(though useless) if we just remove the debugging ApiDebug class.

> @@ +77,5 @@
> > +    MutexAutoLock pcApiLock_(PeerConnectionCtx::GetInstance()->mLock); \
> > +    do { nsresult res = CheckApiState(assert_ice_ready); \
> > +    if (NS_FAILED(res)) return res;} while(0)
> > +
> > +// Enter an API call but don't check state. 
> 
> Trailing whitespace.
> 
> @@ +80,5 @@
> > +
> > +// Enter an API call but don't check state. 
> > +#define PC_ENTER_API_CALL_NO_CHECK() \
> > +  /*    ApiDebug apiDebug(__FUNCTION__); */                             \
> > +    MutexAutoLock pcApiLock_(PeerConnectionCtx::GetInstance()->mLock);
> 
> Remove the trailing ";".

My plan here is actually to wrap this in another do while().


> @@ +335,5 @@
> >  
> >  PeerConnectionImpl::~PeerConnectionImpl()
> >  {
> > +  // Remove ourselves from the table.
> > +  MutexAutoLock pctable_lock(PeerConnectionCtx::GetInstance()->mLock);
> 
> Why aren't you using the PC_ENTER_API_CALL_NO_CHECK() macro here?

Good call. Bit rot.


> > +// Acquire an instance. Note that this leaves the entire PC system locked
> 
> This kind of API seems incredibly dangerous. From what I can tell, it should
> be illegal to call this without immediately storing the result in a
> ScopedDeletePtr (which every callsite I saw does, but no guarantee that will
> continue in the future). Instead, I think you should replace
> PeerConnectionWrapper with an RAII friend class that calls AcquireInstance()
> in its constructor and ReleaseInstance() in its destructor, and then make
> those two methods private. You can make it extend MutexAutoLock (so you can
> inherit its benefits, like making heap allocation illegal) and then assert
> lock ownership in AcquireInstance() and ReleaseInstance().

OK.


> @@ +1128,5 @@
> >  PeerConnectionImpl::IceCompleted(NrIceCtx *ctx)
> >  {
> >    CSFLogDebug(logTag, "ICE completed");
> >    mIceState = kIceConnected;
> > +  
> 
> Whitespace-only change.
> 
> ::: media/webrtc/signaling/src/peerconnection/PeerConnectionImpl.h
> @@ +370,4 @@
> >  
> >    void ChangeReadyState(ReadyState ready_state);
> > +  nsresult CheckApiState(bool assert_ice_ready) {
> > +    PR_ASSERT(!assert_ice_ready || (mIceState != kIceGathering));
> 
> You should also assert lock ownership here.
> 
> ::: media/webrtc/signaling/test/signaling_unittests.cpp
> @@ +681,5 @@
> >  class SignalingTest : public ::testing::Test {
> >  public:
> > +  void SetUp() {
> > +    nsIThread *thread;
> > +    
> 
> Trailing whitespace.
Attached patch Lock PeerConnection (obsolete) — Splinter Review
Attachment #663040 - Attachment is obsolete: true
I have discovered a deadlock on shutdown and I can reproduce quite regularly when I add over 30 Init signaling unit tests. I added these tests to ensure SIPCC was not leaking threads.

The deadlock is called from these two (on device event gets called during shutdown)

a) test::SignalingEnvironment:TearDown -> PeerConnection::Shutdown

b) PeerConnectionCtx::onDeviceEvent -> Cleanup

a)

#6	0x100085383 in CSF::CC_SIPCCService::removeCCObserver at CC_SIPCCService.cpp:745
#7	0x10004bfb1 in CSF::CallControlManagerImpl::disconnect at CallControlManagerImpl.cpp:264
#8	0x10004a881 in CSF::CallControlManagerImpl::destroy at CallControlManagerImpl.cpp:87
#9	0x100089765 in sipcc::PeerConnectionCtx::Cleanup at PeerConnectionCtx.cpp:113
#10	0x1000896f4 in sipcc::PeerConnectionCtx::Destroy at PeerConnectionCtx.cpp:69
#11	0x10008ee39 in sipcc::PeerConnectionImpl::Shutdown at PeerConnectionImpl.cpp:1135
#12	0x100008d41 in test::SignalingEnvironment::TearDown at signaling_unittests.cpp:664
#13	0x100c00ca6 in testing::internal::TearDownEnvironment at gtest.cc:4145



b)

#6	0x100085383 in CSF::CC_SIPCCService::removeCCObserver at CC_SIPCCService.cpp:745
#7	0x10004bfb1 in CSF::CallControlManagerImpl::disconnect at CallControlManagerImpl.cpp:264
#8	0x10004a881 in CSF::CallControlManagerImpl::destroy at CallControlManagerImpl.cpp:87
#9	0x100089765 in sipcc::PeerConnectionCtx::Cleanup at PeerConnectionCtx.cpp:113
#10	0x1000898e6 in sipcc::PeerConnectionCtx::onDeviceEvent at PeerConnectionCtx.cpp:135
#11	0x10004cb7b in CSF::CallControlManagerImpl::notifyDeviceEventObservers at CallControlManagerImpl.cpp:499
#12	0x10004ca51 in CSF::CallControlManagerImpl::onDeviceEvent at CallControlManagerImpl.cpp:477
Duplicate of this bug: 805691
Attached patch Lock PeerConnection (obsolete) — Splinter Review
Attachment #665028 - Attachment is obsolete: true
Attached patch Lock PeerConnection (obsolete) — Splinter Review
Attachment #682803 - Attachment is obsolete: true
Attachment #682961 - Flags: review?(rjesup)
Comment on attachment 682961 [details] [diff] [review]
Lock PeerConnection

Review of attachment 682961 [details] [diff] [review]:
-----------------------------------------------------------------

Adding derf since he r-ed the first patch.
Attachment #682961 - Flags: review?(tterribe)
Comment on attachment 682961 [details] [diff] [review]
Lock PeerConnection

Review of attachment 682961 [details] [diff] [review]:
-----------------------------------------------------------------

Adding derf since he r-ed the first patch.
Attachment #682961 - Flags: feedback?(ethanhugg)
ehugg: this is causing linked_ptr failures on exit in the unit tests.

I used to get these and then I stopped getting them and now I get them
again, so I kind of suspect this patch is not to blame and I've just re-triggered
it. Still, it makes me somewhat nervous. Any thoughts?
#0  0x00000001000549e5 in linked_ptr_internal::depart (this=0x10d350bb0) at linked_ptr.h:67
#1  0x0000000100073b65 in linked_ptr<CSF::CC_SIPCCCallInfo>::depart (this=0x10d350ba8) at linked_ptr.h:139
#2  0x0000000100073b35 in linked_ptr<CSF::CC_SIPCCCallInfo>::~linked_ptr (this=0x10d350ba8) at linked_ptr.h:84
#3  0x0000000100072315 in linked_ptr<CSF::CC_SIPCCCallInfo>::~linked_ptr (this=0x10d350ba8) at linked_ptr.h:84
#4  0x000000010007997c in std::pair<cc_call_info_t_* const, linked_ptr<CSF::CC_SIPCCCallInfo> >::~pair (this=0x10d350ba0) at stl_pair.h:68
#5  0x0000000100079955 in std::pair<cc_call_info_t_* const, linked_ptr<CSF::CC_SIPCCCallInfo> >::~pair (this=0x10d350ba0) at stl_pair.h:68
#6  0x0000000100079879 in __gnu_cxx::new_allocator<std::pair<cc_call_info_t_* const, linked_ptr<CSF::CC_SIPCCCallInfo> > >::destroy (this=0x7fff5fbfea78, __p=0x10d350ba0) at new_allocator.h:110
#7  0x00000001000797fb in std::_Rb_tree<cc_call_info_t_*, std::pair<cc_call_info_t_* const, linked_ptr<CSF::CC_SIPCCCallInfo> >, std::_Select1st<std::pair<cc_call_info_t_* const, linked_ptr<CSF::CC_SIPCCCallInfo> > >, std::less<cc_call_info_t_*>, std::allocator<std::pair<cc_call_info_t_* const, linked_ptr<CSF::CC_SIPCCCallInfo> > > >::_M_destroy_node (this=0x100f197c0, __p=0x10d350b80) at stl_tree.h:401
#8  0x00000001000796cb in std::_Rb_tree<cc_call_info_t_*, std::pair<cc_call_info_t_* const, linked_ptr<CSF::CC_SIPCCCallInfo> >, std::_Select1st<std::pair<cc_call_info_t_* const, linked_ptr<CSF::CC_SIPCCCallInfo> > >, std::less<cc_call_info_t_*>, std::allocator<std::pair<cc_call_info_t_* const, linked_ptr<CSF::CC_SIPCCCallInfo> > > >::_M_erase (this=0x100f197c0, __x=0x10d350b80) at stl_tree.h:1326
#9  0x00000001000796ae in std::_Rb_tree<cc_call_info_t_*, std::pair<cc_call_info_t_* const, linked_ptr<CSF::CC_SIPCCCallInfo> >, std::_Select1st<std::pair<cc_call_info_t_* const, linked_ptr<CSF::CC_SIPCCCallInfo> > >, std::less<cc_call_info_t_*>, std::allocator<std::pair<cc_call_info_t_* const, linked_ptr<CSF::CC_SIPCCCallInfo> > > >::_M_erase (this=0x100f197c0, __x=0x10d333320) at stl_tree.h:1324
#10 0x00000001000796ae in std::_Rb_tree<cc_call_info_t_*, std::pair<cc_call_info_t_* const, linked_ptr<CSF::CC_SIPCCCallInfo> >, std::_Select1st<std::pair<cc_call_info_t_* const, linked_ptr<CSF::CC_SIPCCCallInfo> > >, std::less<cc_call_info_t_*>, std::allocator<std::pair<cc_call_info_t_* const, linked_ptr<CSF::CC_SIPCCCallInfo> > > >::_M_erase (this=0x100f197c0, __x=0x10d3337c0) at stl_tree.h:1324
#11 0x00000001000796ae in std::_Rb_tree<cc_call_info_t_*, std::pair<cc_call_info_t_* const, linked_ptr<CSF::CC_SIPCCCallInfo> >, std::_Select1st<std::pair<cc_call_info_t_* const, linked_ptr<CSF::CC_SIPCCCallInfo> > >, std::less<cc_call_info_t_*>, std::allocator<std::pair<cc_call_info_t_* const, linked_ptr<CSF::CC_SIPCCCallInfo> > > >::_M_erase (this=0x100f197c0, __x=0x10d33b9e0) at stl_tree.h:1324
#12 0x00000001000796ae in std::_Rb_tree<cc_call_info_t_*, std::pair<cc_call_info_t_* const, linked_ptr<CSF::CC_SIPCCCallInfo> >, std::_Select1st<std::pair<cc_call_info_t_* const, linked_ptr<CSF::CC_SIPCCCallInfo> > >, std::less<cc_call_info_t_*>, std::allocator<std::pair<cc_call_info_t_* const, linked_ptr<CSF::CC_SIPCCCallInfo> > > >::_M_erase (this=0x100f197c0, __x=0x10d31c710) at stl_tree.h:1324
#13 0x0000000100079605 in std::_Rb_tree<cc_call_info_t_*, std::pair<cc_call_info_t_* const, linked_ptr<CSF::CC_SIPCCCallInfo> >, std::_Select1st<std::pair<cc_call_info_t_* const, linked_ptr<CSF::CC_SIPCCCallInfo> > >, std::less<cc_call_info_t_*>, std::allocator<std::pair<cc_call_info_t_* const, linked_ptr<CSF::CC_SIPCCCallInfo> > > >::clear (this=0x100f197c0) at stl_tree.h:712
#14 0x00000001000795d5 in std::map<cc_call_info_t_*, linked_ptr<CSF::CC_SIPCCCallInfo>, std::less<cc_call_info_t_*>, std::allocator<std::pair<cc_call_info_t_* const, linked_ptr<CSF::CC_SIPCCCallInfo> > > >::clear (this=0x100f197c0) at stl_map.h:509
#15 0x0000000100077f26 in Wrapper<CSF::CC_SIPCCCallInfo>::reset (this=0x100f197c0) at Wrapper.h:147
#16 0x0000000100076101 in CSF::CC_SIPCCCallInfo::reset () at CC_SIPCCCallInfo.cpp:30
#17 0x000000010008e966 in CSF::CC_SIPCCService::destroy (this=0x10d00a460) at CC_SIPCCService.cpp:372
#18 0x00000001000567ce in CSF::CallControlManagerImpl::disconnect (this=0x10d00a250) at CallControlManagerImpl.cpp:232
#19 0x0000000100055051 in CSF::CallControlManagerImpl::destroy (this=0x10d00a250) at CallControlManagerImpl.cpp:53
#20 0x0000000100094bde in sipcc::PeerConnectionCtx::Cleanup (this=0x10d009dd0) at PeerConnectionCtx.cpp:92
#21 0x0000000100094b41 in sipcc::PeerConnectionCtx::Destroy () at PeerConnectionCtx.cpp:46
#22 0x0000000100099e79 in sipcc::PeerConnectionImpl::Shutdown () at PeerConnectionImpl.cpp:1042
#23 0x0000000100011f81 in test::SignalingEnvironment::TearDown (this=0x10d0098a0) at /Users/ekr/dev/mozilla-inbound/media/webrtc/signaling/test/signaling_unittests.cpp:763
#24 0x0000000100bec976 in TearDownEnvironment (env=0x10d0098a0) at /Users/ekr/dev/mozilla-inbound/media/webrtc/trunk/testing/gtest/src/gtest.cc:4145
#25 0x0000000100bf13e8 in std::for_each<std::reverse_iterator<__gnu_cxx::__normal_iterator<testing::Environment**, std::vector<testing::Environment*, std::allocator<testing::Environment*> > > >, void (*)(testing::Environment*)> (__first=@0x7fff5fbff278, __last=@0x7fff5fbff270, __f=0x100bec960 <TearDownEnvironment>) at stl_algo.h:159
#26 0x0000000100beb6e7 in testing::internal::UnitTestImpl::RunAllTests (this=0x10bc1f860) at /Users/ekr/dev/mozilla-inbound/media/webrtc/trunk/testing/gtest/src/gtest.cc:4255
#27 0x0000000100bf5463 in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (object=0x10bc1f860, method={ptr = 4307465088, ptr = 0}, location=0x100ca649c "auxiliary test code (environments or event listeners)") at /Users/ekr/dev/mozilla-inbound/media/webrtc/trunk/testing/gtest/src/gtest.cc:2071
#28 0x0000000100bf09ae in testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (object=0x10bc1f860, method={ptr = 4307465088, ptr = 0}, location=0x100ca649c "auxiliary test code (environments or event listeners)") at /Users/ekr/dev/mozilla-inbound/media/webrtc/trunk/testing/gtest/src/gtest.cc:2123
#29 0x0000000100beb364 in testing::UnitTest::Run (this=0x1010075a0) at /Users/ekr/dev/mozilla-inbound/media/webrtc/trunk/testing/gtest/src/gtest.cc:3882
#30 0x000000010000604e in main (argc=1, argv=0x7fff5fbff458) at /Users/ekr/dev/mozilla-inbound/media/webrtc/signaling/test/signaling_unittests.cpp:1239
(gdb)
Comment on attachment 682961 [details] [diff] [review]
Lock PeerConnection

Review of attachment 682961 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/media/MediaStreamGraph.cpp
@@ +1683,5 @@
>  MediaStreamGraphImpl::EnsureRunInStableState()
>  {
> +  if (!NS_IsMainThread()) {
> +    MOZ_ASSERT(false);
> +  }

Per IRC this was unintentional; remove

@@ +1688,2 @@
>    NS_ASSERTION(NS_IsMainThread(), "main thread only");
> +  

whitespace

::: media/webrtc/signaling/src/media/VcmSIPCCBinding.cpp
@@ +98,5 @@
> +      CSFLogDebug(logTag, "%s: couldn't acquire peerconnection %s", __FUNCTION__, peerconnection); \
> +        return VCM_ERROR; \
> +     } \
> +  } while (0)
> +

#define ACQUIRE_PC(retval) sipcc::PeerConnectionWrapper pc(peerconnection); \
		do { \
		if (!pc.impl()) { \
		CSFLogDebug(logTag, "%s: couldn't acquire peerconnection %s", __FUNCTION__, peerconnection); \
		return retval; \
		} \
		} while (0)

And then for void funcs, ACQUIRE_PC(/* */), and for others ACQUIRE_PC(VCM_ERROR) (or whatever error is appropriate).  The "/* */" as an argument is bizarre, but works.

Not a big deal, but it means you can use it in pretty much any function regardless of return type.

I'm ok if you leave it.

::: media/webrtc/signaling/src/mediapipeline/MediaPipeline.cpp
@@ +449,3 @@
>                                                    &MediaStream::AddListener,
>                                                    listener_),
> +                       NS_DISPATCH_NORMAL);			

trailing spaces

@@ +674,5 @@
> +  nsRefPtr<MediaStream> stream (stream_->GetStream());
> +  return RUN_ON_THREAD(main_thread_, WrapRunnable(stream,
> +                                                  &MediaStream::AddListener,
> +                                                  listener_),
> +                       NS_DISPATCH_NORMAL);			

spaces

::: media/webrtc/signaling/src/peerconnection/PeerConnectionCtx.h
@@ +39,5 @@
>    PeerConnectionImpl::SipccState sipcc_state() { return mSipccState; }
>  
> +  // We could make these available only via accessors but it's too much trouble.
> +  std::map<const std::string, PeerConnectionImpl *> mPeerConnections;
> +  // This is a big giant lock for everything in both PeerConnectionCtx and

add a blank line before this comment for readability; this is an important comment.

::: media/webrtc/signaling/src/peerconnection/PeerConnectionImpl.cpp
@@ +979,5 @@
> +PeerConnectionImpl::CheckApiState(bool assert_ice_ready)
> +{
> +  PeerConnectionCtx::GetInstance()->mLock.AssertCurrentThreadOwns();
> +  PR_ASSERT(!assert_ice_ready || (mIceState != kIceGathering));
> +  

space

@@ +997,5 @@
> +nsresult
> +PeerConnectionImpl::CloseInt() {
> +  PC_ASSERT_LOCK();
> +
> +  if (mCall != nullptr)

if (mCall)

@@ +1002,3 @@
>      mCall->endCall();
>  #ifdef MOZILLA_INTERNAL_API
> +  if (mDataConnection != nullptr)

ditto

::: media/webrtc/signaling/test/signaling_unittests.cpp
@@ +508,5 @@
>  
>    void Close()
>    {
>      cout << "Close" << endl;
> +    

space

@@ +510,5 @@
>    {
>      cout << "Close" << endl;
> +    
> +    pc->GetMainThread()->Dispatch(
> +      WrapRunnable(pc, &sipcc::PeerConnectionImpl::Close),

I made this take a bool... so this will need to be reved

@@ +767,5 @@
>  class SignalingTest : public ::testing::Test {
>  public:
> +  void SetUp() {
> +    nsIThread *thread;
> +    

space
Attachment #682961 - Flags: review?(rjesup) → review+
Comment on attachment 682961 [details] [diff] [review]
Lock PeerConnection

Review of attachment 682961 [details] [diff] [review]:
-----------------------------------------------------------------

r+ with comments addressed.

::: media/webrtc/signaling/src/media/VcmSIPCCBinding.cpp
@@ +83,5 @@
>                                                               const char *fingerprint
>                                                               );
>  
> +// Convenience macro to acquire PC
> +#define ACQUIRE_PC_VOID() sipcc::PeerConnectionWrapper pc(peerconnection); \

For the record, I'm not a big fan of using things like pc and peerconnection in macros without passing them as arguments. Even if they're the same in every call, it makes it harder to see what is going on.

@@ +91,5 @@
> +      return; \
> +    }         \
> +  } while(0)
> +
> +#define ACQUIRE_PC() sipcc::PeerConnectionWrapper pc(peerconnection); \

Maybe ACQUIRE_PC_OR_RETURN... even having just read this macro I spent a while staring at vcmSetDataChannelParameters(), thinking the call could be completely removed, before remembering that it would change behavior if the acquisition failed.

::: media/webrtc/signaling/src/peerconnection/PeerConnectionImpl.cpp
@@ +118,5 @@
>  
> +// Enter an API call and check that the state is OK,
> +// the PC isn't closed, etc.
> +#define PC_AUTO_ENTER_API_CALL(assert_ice_ready) \
> +    MutexAutoLock pcApiLock_(PeerConnectionCtx::GetInstance()->mLock); \

Could use PC_AUTO_ENTER_API_CALL_NO_CHECK here to make sure the code in one is never changed without changing the other.

@@ +122,5 @@
> +    MutexAutoLock pcApiLock_(PeerConnectionCtx::GetInstance()->mLock); \
> +    do { \
> +      /* do/while prevents res from conflicting with locals */    \
> +      nsresult res = CheckApiState(assert_ice_ready);             \
> +      if (NS_FAILED(res)) return res; } while(0)

} while(0) on a separate line, please.

@@ +301,5 @@
>    NS_GetMainThread(getter_AddRefs(mainThread));
>    NS_ProxyRelease(mainThread, mPCObserver);
>    */
>  }
>  

Can you make MakeMediaStream() and MakeRemoteSource() static? Otherwise they need to acquire a lock or assert lock ownership.

Also, CreateRemoteSourceStreamInfo() needs to acquire a lock or assert lock ownership, right?

@@ +711,4 @@
>    }
>  
>    return NS_OK;
>  }

Can you make ConvertConstraints() static? Otherwise it would need to acquire a lock or assert lock ownership.

Also, ConnectDataConnection(), CreateDataChannel(), NotifyConnection(), NotifyClosedConnection(), and NotifyDataChannel() all need to acquire a lock or assert lock ownership, right?

@@ +738,5 @@
> +  return CreateOfferInt(constraints);
> +}
> +
> +NS_IMETHODIMP
> +PeerConnectionImpl::CreateOfferInt(MediaConstraints& constraints) {

Brace doesn't match the style of the rest of the code.

@@ +739,5 @@
> +}
> +
> +NS_IMETHODIMP
> +PeerConnectionImpl::CreateOfferInt(MediaConstraints& constraints) {
> +

Please assert lock ownership here.

@@ +773,5 @@
> +}
> +
> +NS_IMETHODIMP
> +PeerConnectionImpl::CreateAnswerInt(MediaConstraints& constraints)
> +{

Please assert lock ownership here.

@@ +828,2 @@
>    if (mReadyState != PeerConnectionImpl::kClosed)  {
>      ChangeReadyState(PeerConnectionImpl::kClosing);

Unrelated to this patch: if we're in kClosing now, this will cause duplicate events to be issued. Is that really what we want or what is allowed/required by the spec?

@@ +914,4 @@
>    *fingerprint = tmp;
>    return NS_OK;
>  }
>  */

It would be nice to get rid of this dead code (not necessary in this patch, but it made it harder for me to confirm all the API calls were getting locks, since it changed which function Splinter thought this patch hunk applied to).

@@ +977,5 @@
>  
> +nsresult
> +PeerConnectionImpl::CheckApiState(bool assert_ice_ready)
> +{
> +  PeerConnectionCtx::GetInstance()->mLock.AssertCurrentThreadOwns();

Can you use PC_ASSERT_LOCK() for consistency, please?

@@ +1013,5 @@
>  }
>  
>  void
>  PeerConnectionImpl::ShutdownMedia()
>  {

Please assert lock ownership here.

@@ +1021,1 @@
>    // Check that we are on the main thread.

Unrelated to this patch: this comment is no longer accurate.

@@ +1041,1 @@
>  {

Can you add a comment that no lock is needed solely because it makes a single call to another function that is responsible for locking? That gives us a better chance that someone will remember to add a lock if this function grows later.

::: media/webrtc/signaling/src/peerconnection/PeerConnectionImpl.h
@@ +80,5 @@
>      kActive,
>      kClosing,
>      kClosed
>    };
>  

GetRole() and media() now need to assert lock ownership, right?

@@ +134,5 @@
>    void IceGatheringCompleted(NrIceCtx *aCtx);
>    void IceCompleted(NrIceCtx *aCtx);
>    void IceStreamReady(NrIceMediaStream *aStream);
>  
>    static void ListenThread(void *aData);

GetMainThread(), GetSTSThread(), GetIdentity(), and GetWindow() now need to assert lock ownership, right?

::: media/webrtc/signaling/test/FakeMediaStreams.h
@@ +74,3 @@
>   protected:
>    std::set<Fake_MediaStreamListener *> mListeners;
> +  mozilla::Mutex mMutex;

You need to document what this mutex is protecting. There are a number of members in subclasses (e.g., mTimer, mPeriodic in Fake_SourceMediaStream) whose access you are not consistently protecting with this lock. Is that okay?
Attachment #682961 - Flags: review?(tterribe)
Attachment #682961 - Flags: review?
Attachment #682961 - Flags: review+
Abandoning attempt to update this patch for nits in reviews; applying the current patch causes the "single host peerconnection" (i.e. loopback video call) test to deadlink.  I'll upload my un-bitrot of the current patch (no nits addressed)
Attachment #682961 - Attachment is obsolete: true
Attachment #682961 - Flags: review?
Attachment #682961 - Flags: feedback?(ethanhugg)
Alias: webrtc-big-lock
Attached patch Lock PeerConnection (obsolete) — Splinter Review
PC synchronization
Comment on attachment 688946 [details] [diff] [review]
Lock PeerConnection

Review of attachment 688946 [details] [diff] [review]:
-----------------------------------------------------------------

Ethan,

This now works on my machine. I need to scrub it, but you asked for early access, so...
Attachment #688946 - Flags: feedback?(ethanhugg)
Attachment #683211 - Attachment is obsolete: true
Attached patch Lock PeerConnection (obsolete) — Splinter Review
PC synchronization
Attachment #688946 - Attachment is obsolete: true
Attachment #688946 - Flags: feedback?(ethanhugg)
Attachment #689013 - Flags: feedback?(tterribe)
Attachment #689013 - Flags: feedback?(rjesup)
Attachment #689013 - Flags: feedback?(ethanhugg)
This patch still crashes on exit, apparently b/c we have not completely terminated the receiving media conduits. Assuming that everything else works, we can fix that in another patch.
Attached patch Lock PeerConnection (obsolete) — Splinter Review
PC synchronization
Attachment #689013 - Attachment is obsolete: true
Attachment #689013 - Flags: feedback?(tterribe)
Attachment #689013 - Flags: feedback?(rjesup)
Attachment #689013 - Flags: feedback?(ethanhugg)
Comment on attachment 689202 [details] [diff] [review]
Lock PeerConnection

Review of attachment 689202 [details] [diff] [review]:
-----------------------------------------------------------------

Partial review (had gone through a bunch on another machine; will push those comments in a bit).

::: media/webrtc/signaling/src/media/VcmSIPCCBinding.cpp
@@ +77,5 @@
> +    if (!pc.impl()) {                                                 \
> +      CSFLogDebug(logTag, "%s: couldn't acquire peerconnection %s", __FUNCTION__, peerconnection); \
> +      return; \
> +    }         \
> +  } while(0)

I'd prefer to not hide the local var, and have something like:

sipcc::PeerConnectionWrapper pc(peerconnection);
ENSURE_PC_VOID();
(or ENSURE_PC(/* */);)
Comment on attachment 689202 [details] [diff] [review]
Lock PeerConnection

Review of attachment 689202 [details] [diff] [review]:
-----------------------------------------------------------------

::: media/webrtc/signaling/src/peerconnection/PeerConnectionImpl.cpp
@@ +112,5 @@
> +    do { \
> +      /* do/while prevents res from conflicting with locals */    \
> +      CheckThread();                                              \
> +      nsresult res = CheckApiState(assert_ice_ready);             \
> +      if (NS_FAILED(res)) return res; } while(0)

Is there a reason this should be a macro and not an inline function?  Might be a little easier when stepping through in gdb/etc

@@ +181,5 @@
>            nsDOMMediaStream* stream;
>            uint32_t hint;
>  
> +          if (!mRemoteStream)
> +          {

style

@@ +185,5 @@
> +          {
> +            CSFLogErrorS(logTag, __FUNCTION__ << " GetRemoteStream returned NULL");
> +          }
> +          else
> +          {

style

@@ +573,5 @@
> +}
> +
> +
> +// Not a member function so that we don't need to keep the PC live.
> +// TODO(jesup): Please review

remove comment

@@ +718,5 @@
>      return rv;
>    }
>  
> +
> + CreateAnswerInt(*cs);

indent

@@ +1109,5 @@
>    }
>  #endif
>  }
>  
> +// TODO(ekr@rtfm.com): Need to implement disconnect.

File a bug on this
Attachment #689202 - Flags: review+
Attached patch Lock PeerConnection (obsolete) — Splinter Review
PC synchronization
Attachment #689202 - Attachment is obsolete: true
Attachment #689916 - Flags: review?(tterribe)
Attachment #689916 - Flags: review?(ethanhugg)
Comment on attachment 689916 [details] [diff] [review]
Lock PeerConnection

Review of attachment 689916 [details] [diff] [review]:
-----------------------------------------------------------------

I'm r-'ing this only because you didn't address some of my comments from last time, but this is getting pretty close.

::: media/webrtc/signaling/src/media/VcmSIPCCBinding.cpp
@@ +71,5 @@
>                                                               const char *fingerprint
>                                                               );
>  
> +// Convenience macro to acquire PC
> +#define ACQUIRE_PC_VOID() sipcc::PeerConnectionWrapper pc(peerconnection); \

FWIW, I agree with jesup here. Also acceptable would be to take pc and peerconnection as macro parameters:

#define ACQUIRE_PC_VOID(pc,peerconnection) \
  sipcc::PeerConnectionWrapper pc(peerconnection); \
  ...

It's more verbose, but that's a good thing for those trying to follow the code, and allows you to, for example, acquire two difference PCs in the same function (if for some horrible reason we ever wanted to do that). In any case, the variable declaration should be on a separate line from the #define.

See also my previous comments about the name.

@@ +886,5 @@
>   *  new stream model.
>   *
>   *  Returns: zero(0) for success; otherwise, ERROR for failure
>   */
> +short vcmCreateRemoteStream_m(

Should be static.

@@ +1248,5 @@
>   *
>   *  Returns: zero(0) for success; otherwise, ERROR for failure
>   */
>  
> +int vcmRxStartICE_m(cc_mcapid_t mcap_id,

Should be static.

@@ +1870,5 @@
>   *
>   */
>  #define EXTRACT_DYNAMIC_PAYLOAD_TYPE(PTYPE) ((PTYPE)>>16)
>  
> +int vcmTxStartICE_m(cc_mcapid_t mcap_id,

Should be static.

@@ +2039,5 @@
> +                        fingerprint,
> +                        attrs,
> +                        &ret),
> +      NS_DISPATCH_SYNC);
> +  

Whitespace.

::: media/webrtc/signaling/src/media/VcmSIPCCBinding.h
@@ +63,5 @@
>          StreamObserver* streamObserver;
>          MediaProviderObserver *mediaProviderObserver;
>          static int mAudioCodecMask;
>          static int mVideoCodecMask;
> +	static nsIThread *_mainThread;

You've got a mix of Mozilla mFoo and Google _foo style here. Can we pick one, please?

::: media/webrtc/signaling/src/mediapipeline/MediaPipeline.cpp
@@ +449,3 @@
>                                                    &MediaStream::AddListener,
>                                                    listener_),
> +                       NS_DISPATCH_NORMAL);

I'm going to have to leave detailed review of this change until you've documented the MediaPipeline threading model (in bug 807238), though I'm at least convinced this solves the lifetime issues.

@@ +674,5 @@
> +  nsRefPtr<MediaStream> stream (stream_->GetStream());
> +  return RUN_ON_THREAD(main_thread_, WrapRunnable(stream,
> +                                                  &MediaStream::AddListener,
> +                                                  listener_),
> +                       NS_DISPATCH_NORMAL);

I'm going to have to leave detailed review of this change until you've documented the MediaPipeline threading model (in bug 807238), though I'm at least convinced this solves the lifetime issues.

::: media/webrtc/signaling/src/peerconnection/PeerConnectionCtx.cpp
@@ +29,5 @@
> +void PeerConnectionCtx::InitializeGlobal(nsIThread *mainThread) {
> +  if (!gMainThread) {
> +    gMainThread = mainThread;
> +    CSF::VcmSIPCCBinding::setMainThread(gMainThread);
> +  }

If gMainThread != nsnull, then we should probably assert that it == mainThread, as anything else would be a serious error. You should also probably assert that we are in fact on mainThread (which requires it to be non-NULL... something that is not guaranteed unless you add some other checks I propose below, and is probably worth re-asserting here).

@@ +35,3 @@
>  
>  PeerConnectionCtx* PeerConnectionCtx::GetInstance() {
>    if (instance)

As per discussion on IRC: it would be better to split the singleton creation into a separate CreateInstance() function that is only called by PCImpl, because there's no legitimate reason for anyone else to ever create one on demand. We can just cleanly assert that instance is non-NULL here.

@@ +37,5 @@
>    if (instance)
>      return instance;
>  
>    CSFLogDebug(logTag, "Creating PeerConnectionCtx");
>    PeerConnectionCtx *ctx = new PeerConnectionCtx();

AIUI, InitializeGlobal() must have been called by this point to set CSF::VcmSipCCBinding::_mainThread. Can we assert that getMainThread() is not NULL here, to make sure that's true? Or just combine the CreateInstance() proposed above with InitializeGlobal().

Also, unlike gMainThread/CSF::VcmSipCCBinding::_mainThread, which are only set once before any SIPCC threads are created and never changed, this object can be deleted in Destroy() below, and might get re-created later. It's not clear to me what a) prevents the SIPCC threads from using the old object or b) ensures the initialization of any newly-created object has a happens-before relation with any attempt to access it from a SIPCC thread.

Since shutdown currently does not work at all, I'm okay with not addressing that in this patch, but we need to address it. Perhaps just add a TODO in Destroy that we need to shutdown SIPCC (i.e., return the CC device manager to the MGMT_STATE_IDLE state). It's possible we're already doing that (we call CCAPI_Service_stop(), but in some cases the actual shutdown is deferred, and it's not clear to me that we can't hit one of those cases).

@@ +152,2 @@
>    CSFLogDebug(logTag, "onCallEvent()");
> +  PeerConnectionWrapper pc(aCall->getPeerConnection());

If you parameterize the ACQUIRE_PC_VOID() macro as I suggested above, you could use it here.

::: media/webrtc/signaling/src/peerconnection/PeerConnectionCtx.h
@@ +38,5 @@
>    CSF::CC_CallPtr createCall();
>  
>    PeerConnectionImpl::SipccState sipcc_state() { return mSipccState; }
>  
> +  // We could make these available only via accessors but it's too much trouble.

AIUI, this should only be accessed from PeerConnectionWrapper and PeerConnectionImpl. Can we make it private and make those two classes friends? That would guarantee everyone else has to go through the wrapper without the trouble of making accessors.

::: media/webrtc/signaling/src/peerconnection/PeerConnectionImpl.cpp
@@ +110,5 @@
> +// the PC isn't closed, etc.
> +#define PC_AUTO_ENTER_API_CALL(assert_ice_ready) \
> +    do { \
> +      /* do/while prevents res from conflicting with locals */    \
> +      CheckThread();                                              \

I think you should delete this CheckThread() call and instead use PC_AUTO_ENTER_API_CALL_NO_CHECK() in CheckApiState().

@@ +112,5 @@
> +    do { \
> +      /* do/while prevents res from conflicting with locals */    \
> +      CheckThread();                                              \
> +      nsresult res = CheckApiState(assert_ice_ready);             \
> +      if (NS_FAILED(res)) return res; } while(0)

} while (0) on a separate line, please.

@@ +133,2 @@
>  
> +      streams = aInfo->getMediaStreams();

Combine the previous two lines.

@@ +214,5 @@
> +  StatusCode mCode;
> +  std::string mSdpStr;
> +  cc_call_state_t mCallState;
> +  std::string mStateStr;
> +  nsRefPtr<RemoteSourceStreamInfo> mRemoteStream;

Nice!

@@ +228,5 @@
>    , mPCObserver(NULL)
>    , mWindow(NULL)
>    , mIdentity(NULL)
>    , mSTSThread(NULL)
> +    , mMedia(new PeerConnectionMedia(this)) {}

Please MOZ_ASSERT(NS_IsMainThread()) here.

@@ +237,5 @@
> +  PeerConnectionCtx::GetInstance()->mPeerConnections.erase(mHandle);
> +  CloseInt(false);
> +
> +#if 0
> +  // TODO(ekr@rtfm.com): figure out how to shut down PCCtx.

Can you reference a bug number here? File one if you need to.

@@ +323,3 @@
>  
>    return NS_OK;
>  }

CreateRemoteSourceStreamInfo() needs to invoke PC_AUTO_ENTER_API_CALL_NO_CHECK(). You should also eliminate the NS_DISPATCH_SYNC dispatches, and just call the associated functions directly, as you are now running on the right thread to begin with.

@@ +327,5 @@
>  NS_IMETHODIMP
>  PeerConnectionImpl::Initialize(IPeerConnectionObserver* aObserver,
>                                 nsIDOMWindow* aWindow,
>                                 nsIThread* aThread) {
> +  PC_AUTO_ENTER_API_CALL_NO_CHECK();

mThread is NULL here, so this check does nothing (and what I propose below will break its usage here).

@@ +332,1 @@
>    MOZ_ASSERT(NS_IsMainThread());

This is the check that actually does something. After this point, however, I think we need to be more consistent in our usage of these two checks.

@@ +348,5 @@
>    mWindow = do_QueryInterface(aWindow);
>    NS_ENSURE_STATE(mWindow);
>  #endif
>  
>    // The thread parameter can be passed in as NULL

This comment is now a lie. In fact you should assert the opposite here.

@@ +545,1 @@
>  void

You still forgot to add PC_AUTO_ENTER_API_CALL_NO_CHECK() to ConnectDataConnection() and CreateDataChannel().

@@ +545,4 @@
>  void
>  PeerConnectionImpl::NotifyConnection()
>  {
>    MOZ_ASSERT(NS_IsMainThread());

Use PC_AUTO_ENTER_API_CALL_NO_CHECK() instead for consistency.

@@ +559,5 @@
>  
>  void
>  PeerConnectionImpl::NotifyClosedConnection()
>  {
>    MOZ_ASSERT(NS_IsMainThread());

Use PC_AUTO_ENTER_API_CALL_NO_CHECK() instead for consistency.

@@ +576,5 @@
> +// Not a member function so that we don't need to keep the PC live.
> +// TODO(jesup): Please review
> +static void NotifyDataChannel_m(nsRefPtr<nsIDOMDataChannel> aChannel,
> +                                nsCOMPtr<IPeerConnectionObserver> aObserver)
> +{

This should MOZ_ASSERT(NS_IsMainThread()).

@@ +586,5 @@
>  
>  void
>  PeerConnectionImpl::NotifyDataChannel(mozilla::DataChannel *aChannel)
>  {
>    MOZ_ASSERT(NS_IsMainThread());

Use PC_AUTO_ENTER_API_CALL_NO_CHECK() instead for consistency.

@@ +670,4 @@
>    }
>  
>    return NS_OK;
>  }

ConvertConstraints needs to either be made static or invoke PC_AUTO_ENTER_API_CALL_NO_CHECK().

@@ +696,5 @@
> +  return CreateOfferInt(constraints);
> +}
> +
> +NS_IMETHODIMP
> +PeerConnectionImpl::CreateOfferInt(MediaConstraints& constraints) {

I don't think we need a separate function for this anymore. There's no reason not to always invoke PC_AUTO_ENTER_API_CALL(true), and it's safer to do so.

@@ +731,5 @@
> +  return CreateAnswerInt(constraints);
> +}
> +
> +NS_IMETHODIMP
> +PeerConnectionImpl::CreateAnswerInt(MediaConstraints& constraints)

Don't need a separate function for this, either.

@@ +937,5 @@
>  
> +nsresult
> +PeerConnectionImpl::CheckApiState(bool assert_ice_ready)
> +{
> +  PR_ASSERT(!assert_ice_ready || (mIceState != kIceGathering));

See above @ PC_AUTO_ENTER_API_CALL().

@@ +955,5 @@
> +
> +
> +nsresult
> +PeerConnectionImpl::CloseInt(bool aIsSynchronous)
> +{

Need to invoke PC_AUTO_ENTER_API_CALL_NO_CHECK(). I know all the callers already do, but we could grow more callers in the future.

@@ +974,5 @@
>  void
>  PeerConnectionImpl::ShutdownMedia(bool aIsSynchronous)
>  {
> +
> +  if (!mMedia)

Can you explain why you moved this? We can still only safely be called from the main thread, and if that thread still exists, we shouldn't have a problem accessing it below (we never NULL out mThread).

@@ +981,1 @@
>    // Check that we are on the main thread.

Replace this with PC_AUTO_ENTER_API_CALL_NO_CHECK().

@@ +1049,3 @@
>      runnable->Run();
>    }
>  }

onCallEvent() needs to invoke PC_AUTO_ENTER_API_CALL_NO_CHECK().

@@ +1051,5 @@
>  }
>  
>  void
>  PeerConnectionImpl::ChangeReadyState(PeerConnectionImpl::ReadyState aReadyState)
>  {

Need to invoke PC_AUTO_ENTER_API_CALL_NO_CHECK().

@@ +1074,5 @@
>  }
>  
>  const std::string&
>  PeerConnectionImpl::GetHandle()
>  {

Need to invoke PC_AUTO_ENTER_API_CALL_NO_CHECK().

@@ +1114,3 @@
>  void
>  PeerConnectionImpl::IceCompleted(NrIceCtx *aCtx)
>  {

I have not reviewed that the lifetimes here are correct (i.e., that this function won't get invoked after this PeerConnectionImpl() has been destroyed) nor that there is a happens-before relationship that ensures the caller sees the correct value of mThread here.

You should probably at least document here what thread this gets called from (even if it's obvious to you, it won't be to others).

::: media/webrtc/signaling/src/peerconnection/PeerConnectionImpl.h
@@ +174,5 @@
> +    // Thread assertions are disabled in the C++ unit tests because those
> +    // make API calls off the main thread.
> +    // TODO(ekr@rtfm.com): Fix the unit tests so they don't do that.
> +    bool on;
> +    if (!mThread)

Really? I think we should just MOZ_ASSERT(mThread) instead, as I can't see how it's possibly valid for it to be NULL anymore. In fact, we probably need to fix dom/media/bridge/IPeerConnection.idl to stop declaring the thread parameter optional. Alternatively, we should fix the unit tests to pump the XPCOM event loop properly and use the real main thread, and just get rid of this parameter entirely. That can be in a follow-up bug.

@@ +186,5 @@
>  
>    // Shut down media. Called on any thread.
>    void ShutdownMedia(bool isSynchronous);
>  
>    nsresult MakeMediaStream(uint32_t aHint, nsIDOMMediaStream** aStream);

This needs to either be made static or invoke PC_AUTO_ENTER_API_CALL_NO_CHECK(). Currently it only uses this for debug logging.

@@ +187,5 @@
>    // Shut down media. Called on any thread.
>    void ShutdownMedia(bool isSynchronous);
>  
>    nsresult MakeMediaStream(uint32_t aHint, nsIDOMMediaStream** aStream);
>    nsresult MakeRemoteSource(nsDOMMediaStream* aStream, RemoteSourceStreamInfo** aInfo);

This needs to either be made static or invoke PC_AUTO_ENTER_API_CALL_NO_CHECK(). Currently it doesn't use this at all.

::: media/webrtc/signaling/test/FakeMediaStreams.h
@@ +74,3 @@
>   protected:
>    std::set<Fake_MediaStreamListener *> mListeners;
> +  mozilla::Mutex mMutex;  // Locks the listener list against periodic.

Periodic what? Perhaps: "Lock to prevent the listener list from being modified while executing Periodic()." Good practice would be to make a PeriodicInt() which is protected and have the public Period() be non-virtual, acquire the lock, and invoke PeriodicInt() (but I realize this is just test code).

This explanation also doesn't explain why you acquire this lock in Start()/Stop() below... if you want to re-use this lock to protect additional members in the subclasses, you should document that where those members are declared.

::: media/webrtc/signaling/test/signaling_unittests.cpp
@@ +303,3 @@
>    onAddStreamCalled = true;
>  
> +  streams.push_back(ms);

Why are you adding ms to streams twice?
Attachment #689916 - Flags: review?(tterribe) → review-
Attached patch Lock PeerConnection (obsolete) — Splinter Review
PC synchronization
Attachment #689916 - Attachment is obsolete: true
Attachment #689916 - Flags: review?(ethanhugg)
Attachment #690456 - Flags: review?(tterribe)
Attachment #690456 - Flags: review?(rjesup)
This should address derf and Jesup's issues, though there were a lot so I may have missed some.
Comment on attachment 690456 [details] [diff] [review]
Lock PeerConnection

Review of attachment 690456 [details] [diff] [review]:
-----------------------------------------------------------------

Let's land this thing.

::: media/webrtc/signaling/src/media/VcmSIPCCBinding.cpp
@@ +101,5 @@
>  
>  /* static */
>  StreamObserver * VcmSIPCCBinding::getStreamObserver()
>  {
> +    if (gSelf != NULL)

Mozilla style guide would have if (gSelf) here.

@@ +2042,5 @@
> +                        fingerprint,
> +                        attrs,
> +                        &ret),
> +      NS_DISPATCH_SYNC);
> +  

trailing whitespace.

::: media/webrtc/signaling/src/peerconnection/PeerConnectionCtx.cpp
@@ +36,5 @@
> +
> +  if (!gInstance) {
> +    CSFLogDebug(logTag, "Creating PeerConnectionCtx");
> +    PeerConnectionCtx *ctx = new PeerConnectionCtx();
> +    

We just removed all the trailing whitespace from signaling.

::: media/webrtc/signaling/src/peerconnection/PeerConnectionImpl.cpp
@@ +572,5 @@
> +{
> +  MOZ_ASSERT(NS_IsMainThread());
> +
> +  aObserver->NotifyDataChannel(aChannel);
> +#ifdef MOZILLA_INTERNAL_API

If you don't put this ifdef around the whole method, you'll make this warning:

/home/ehugg/mozilla/work/media/webrtc/signaling/src/peerconnection/PeerConnectionImpl.cpp:570:13: warning: ‘void sipcc::NotifyDataChannel_m(nsRefPtr<nsIDOMDataChannel>, nsCOMPtr<IPeerConnectionObserver>)’ defined but not used [-Wunused-function]
Attachment #690456 - Flags: review+
Comment on attachment 690456 [details] [diff] [review]
Lock PeerConnection

Review of attachment 690456 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with the last few comments addressed.

::: media/webrtc/signaling/src/media/VcmSIPCCBinding.h
@@ +63,5 @@
>          StreamObserver* streamObserver;
>          MediaProviderObserver *mediaProviderObserver;
> +        static int gAudioCodecMask;
> +        static int gVideoCodecMask;
> +	static nsIThread *gMainThread;

Thanks. Sorry to juberti you there.

::: media/webrtc/signaling/src/peerconnection/PeerConnectionCtx.cpp
@@ +31,5 @@
> +    gMainThread = mainThread;
> +    CSF::VcmSIPCCBinding::setMainThread(gMainThread);
> +  } else {
> +    MOZ_ASSERT(gMainThread == mainThread);
> +  }

You should also assert that we're actually on mainThread.

::: media/webrtc/signaling/src/peerconnection/PeerConnectionImpl.cpp
@@ +111,5 @@
> +#define PC_AUTO_ENTER_API_CALL(assert_ice_ready) \
> +    do { \
> +      /* do/while prevents res from conflicting with locals */    \
> +      nsresult res = CheckApiState(assert_ice_ready);             \
> +      if (NS_FAILED(res)) return res; } \

It's the brace that really needs to be on a new line (the while(0) can be on the same line as the brace, though).

@@ +565,5 @@
> +                NS_DISPATCH_NORMAL);
> +#endif
> +}
> +
> +

The "Not a member function so that we don't need to keep the PC live." comment that used to be here was a good one.

@@ +673,2 @@
>  
> +  MediaConstraints cs;

Nice! I did not notice this leak before.

::: media/webrtc/signaling/src/peerconnection/PeerConnectionImpl.h
@@ +112,2 @@
>  
>    Role GetRole() const { return mRole; }

Sorry, I forgot to copy over some comments from my second review:

GetRole(), media(), GetMainThread(), GetSTSThread(), GetIdentity(), and GetWindow() all need to invoke PC_AUTO_ENTER_API_CALL_NO_CHECK().
Attachment #690456 - Flags: review?(tterribe) → review+
PC synchronization
Attachment #690456 - Attachment is obsolete: true
Attachment #690456 - Flags: review?(rjesup)
Comment on attachment 690650 [details] [diff] [review]
Lock PeerConnection

carrying forward r=derf, r=ehugg
Attachment #690650 - Flags: review+
Attachment #690650 - Flags: review+ → review?(rjesup)
Jesup: just want to make sure you are cool with this before I land it.
Comment on attachment 690650 [details] [diff] [review]
Lock PeerConnection

Review of attachment 690650 [details] [diff] [review]:
-----------------------------------------------------------------

::: media/webrtc/signaling/src/media/VcmSIPCCBinding.cpp
@@ +808,4 @@
>  {
>    CSFLogDebug( logTag, "%s: PC = %s", __FUNCTION__, peerconnection);
>  
> +  sipcc::PeerConnectionWrapper pc(peerconnection);  

trailing space

@@ +2042,5 @@
> +                        fingerprint,
> +                        attrs,
> +                        &ret),
> +      NS_DISPATCH_SYNC);
> +  

trailing space

::: media/webrtc/signaling/src/peerconnection/PeerConnectionCtx.cpp
@@ +43,5 @@
> +  res = gMainThread->IsOnCurrentThread(&on);
> +  NS_ENSURE_SUCCESS(res, res);
> +  MOZ_ASSERT(on);
> +#endif
> +    

trailing spaces in a few places

::: media/webrtc/signaling/src/peerconnection/PeerConnectionImpl.cpp
@@ +279,3 @@
>  
>    nsIDOMMediaStream* stream;
> +  

trailing space

@@ +930,5 @@
> +PeerConnectionImpl::CloseInt(bool aIsSynchronous)
> +{
> +  PC_AUTO_ENTER_API_CALL(true);
> +
> +  if (mCall != nullptr)

I think ethan flagged this, but it's a linked_ptr and thus "if (foo)" won't work, so this is right.

::: media/webrtc/signaling/test/FakeMediaStreams.h
@@ +48,5 @@
>  
>  // Note: only one listener supported
>  class Fake_MediaStream {
> + public:
> +  Fake_MediaStream () : mListeners(), mMutex("MediaStream Mutex") {}

"Fake MediaStream" please (and we know it's a mutex)

::: media/webrtc/signaling/test/signaling_unittests.cpp
@@ +802,5 @@
> +    nsIThread *thread;
> +
> +    nsresult rv = NS_NewThread(&thread);
> +    ASSERT_TRUE(NS_SUCCEEDED(rv));
> +    

whitespace (and again lower)
Attachment #690650 - Flags: review?(rjesup) → review+
https://hg.mozilla.org/mozilla-central/rev/2350ed0a5b3f
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla20
Whiteboard: [WebRTC], [blocking-webrtc+] → [WebRTC], [blocking-webrtc+] [qa-]
You need to log in before you can comment on or make changes to this bug.