Closed Bug 1276156 Opened 5 years ago Closed 5 years ago

deadlock shutting down vie_encoder in statistics


(Core :: WebRTC, defect, P1)




Tracking Status
firefox49 --- fixed


(Reporter: jesup, Assigned: jesup)




(1 file)

This may be a major cause of intermittent test-timed-out failures.

Looking through the logs of a couple timeouts on Try, I noticed that at least two had the same signature:

At least one thread hanging shutting down a vie_encoder thread: (MainThread in this instance:)
ProcessThreadImpl::DeRegisterModule() (criticalsection.h:117)
ViEEncoder::StopThreadsAndRemoveSharedMembers []
WebrtcVideoConduit::Destroy [VideoConduit.cpp:506]

This waits on the per-thread 'lock_' to be available, which is held while running code for the thread (note).

At least one thread in ProcessThreadImpl::Process() and *not* in Wait(), and generally in vie statistics code:
ProcessThreadImpl::Process []
VideoCodingModuleImpl::Process []
ViECodecImpl::GetSendCodecStatistics [] ***

and in another timeout (android):
vcm::VideoSender::Process []
ViEEncoder::SendStatistics []
VideoCodecStatistics::OutgoingRate [CodecStatistics.cpp:60]
ViECodecImpl::GetSendCodecStatistics [] ***
ViEInputManagerScoped::ViEInputManagerScoped []
RWLockGeneric::AcquireLockShared []

Note the *** lines
This implies a deadlock around ViEManagerScopedBase::ViEManagerScopedBase()/instance_rwlock_.AcquireLockShared()
this should only be possible if it's owned exclusively by someone I think.  I'll note all uses of     ViEManagerWriteScoped() are scopes, and unless it's sitting in them the lock should be released, and I don't see any thread in them.
Rank: 15
Blocks: 1275814
Blocks: 1248848
To avoid deadlocks between DeleteChannel() and Process() code

MozReview-Commit-ID: G650SWDH8s0
Attachment #8757492 - Flags: review?(pkerr)
This is the deadlock scenario:

[15:15]	jesup	we're in VideoCodingModuleImpl::Process() (and thus the ProcessThread lock_ is held while we run)
[15:15]	jesup	on MainThread, we delete the VideoConduit
[15:16]	jesup	which calls DeleteChannel, which grabs an exclusive lock on the ViEManager (ViEManagerWriteScoped wl ((this))
[15:17]	jesup	then ::Process tries to do ViECodecImpl::GetSendCodecStatistics() which grabs the lock in readonly mode.  Since it's held for write, that waits.  
[15:22]	jesup	and then DeleteChannel eventually calls ViEEncoder::StopThreadsAndRemoveSharedMembers [] which calls ProcessThreadImpl::DeRegisterModule() (criticalsection.h:117) which needs the lock_ -- which is held by the thread waiting on the channel lock
[15:22]	jesup	Boom, deadlock

Solution is to stop the threads *before* taking the exclusive lock which can deadlock us.
Duplicate of this bug: 1248848
Duplicate of this bug: 1275814
Attachment #8757492 - Flags: review?(pkerr) → review+
Pushed by
Stop encoder ProcessThreads before deleting the channel r=pkerr
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla49
You need to log in before you can comment on or make changes to this bug.