Closed Bug 845523 Opened 8 years ago Closed 8 years ago
TSan: Thread data race in sipcc::Peer
Connection Ctx::on Device Event() vs . sipcc::Peer Connection Ctx::Change Sipcc State()
During initial tests with ThreadSanitizer (LLVM version), we get a data race reported as described in the attached log. Trace was created on mozilla-central with changeset 122820:c233837cce08. According to the TSan devs, most of the reported traces should be real data races, even though they can be "benign". We need to determine if the race can/should be fixed, or put on the ignore list. Even for benign races, TSan devs suggest to fix them (second priority), as they can also cause problems .  http://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong
Perhaps another way the sipcc startup code can go wrong, unless this is a "benign" race. -> adam who's looked at this a lot
Assignee: nobody → adam
Whiteboard: [tsan][tsan-test-blocker] → [tsan][tsan-test-blocker][webrtc][blocking-webrtc+]
This is another race related to Bug 839647. It looks like we need to move the signaling of the ccAppReadyToStart condition up a level, from CallControlManagerImpl::startSDPMode() to PeerConnectionCtx::Initialize() (immediately before it returns). This should be fixed, but I would consider it ultralow priority at the moment. The CCApp thread will be doing a lot of stuff before it gets around to reading this variable, giving the main thread a very wide berth for its unsynchronized write (which happens immediately after the CCApp thread is given permission to start processing messages). It's a race that can't be lost. This should be coupled with a removal of the non-atomic hack to PeerConnectionCtx::onDeviceEvent that moves state into a stack variable "to avoid a data race": once the condition signaling is moved, the race is completely eliminated.
Priority: P2 → P5
comment 2 sends the impression that this is blocking- now
Whiteboard: [tsan][tsan-test-blocker][webrtc][blocking-webrtc+] → [tsan][tsan-test-blocker][webrtc][blocking-webrtc-]
I'm going to go ahead and fix this now, since it'll clear up some uncertainty I have regarding the behavior I see in the logs for Bug 839677.
Comment on attachment 720153 [details] [diff] [review] Move condition signaling into PeerConnectionCtx init Ethan: This is pretty much the same fix you reviewed for Bug 839647, but with the signaling of the condition moved up one frame.
Attachment #720153 - Flags: review?(ethanhugg)
Here's a try (along with a patch for bug 846942): https://tbpl.mozilla.org/?tree=Try&rev=a44a035255ff
Re-try after compile fixes for Linux & Windows: https://tbpl.mozilla.org/?tree=Try&rev=5157500f9c63
Comment on attachment 720153 [details] [diff] [review] Move condition signaling into PeerConnectionCtx init Review of attachment 720153 [details] [diff] [review]: ----------------------------------------------------------------- lgtm
Attachment #720153 - Flags: review?(ethanhugg) → review+
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla22
Whiteboard: [tsan][tsan-test-blocker][webrtc][blocking-webrtc-] → [tsan][tsan-test-blocker][webrtc][blocking-webrtc-][qa-]
You need to log in before you can comment on or make changes to this bug.