Closed Bug 1729366 Opened 3 years ago Closed 3 years ago

test_peerConnection_twoAudioVideoStreams.html and test_peerConnection_twoAudioVideoStreamsCombined.html are timing out

Categories

(Core :: WebRTC, defect, P2)

RESOLVED FIXED

People

(Reporter: ng, Assigned: pehrsons)

Details

TEST-UNEXPECTED-FAIL | dom/media/webrtc/tests/mochitests/test_peerConnection_twoAudioVideoStreams.html | Test timed out. -

TEST-UNEXPECTED-FAIL | dom/media/webrtc/tests/mochitests/test_peerConnection_twoAudioVideoStreamsCombined.html | Test timed out. -

https://treeherder.mozilla.org/jobs?repo=try&author=mfroman%40mozilla.com&selectedTaskRun=W5ZoDEjWR0aRRM3eHpBahw.3

https://treeherder.mozilla.org/logviewer?job_id=350517429&repo=try&lineNumber=448084

https://treeherder.mozilla.org/logviewer?job_id=350517429&repo=try&lineNumber=448937

This is a deadlock between starting conduits (which triggers a SyncRunnable from Call thread to main as part of video codec init), and GetRtpSources (which grabs a conduit Mutex on main). Fixed by D124373.
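The lock-versus-synchronous-dispatch pattern described in this comment can be illustrated with a small simulation. This is only a sketch in Python, not Gecko code: the names `simulate_lock_vs_sync_dispatch`, `conduit_mutex` and `runnable_done` are hypothetical, and it assumes the worker holds the mutex while it waits on main, which the comment does not state explicitly. A timeout stands in for the hang so the example terminates.

```python
import threading


def simulate_lock_vs_sync_dispatch(acquire_timeout=0.2):
    """Model a worker that holds a mutex while blocking on the main thread.

    Returns True if "main" managed to take the mutex, False if it timed out
    (the case that would be a permanent hang without the timeout).
    """
    conduit_mutex = threading.Lock()        # stands in for the conduit Mutex
    worker_holds_mutex = threading.Event()  # set once the worker owns the mutex
    runnable_done = threading.Event()       # set once main has run the runnable

    def call_thread():
        # Assumption: the mutex is held across the synchronous dispatch.
        with conduit_mutex:
            worker_holds_mutex.set()
            # Synchronous dispatch: block until main has run our runnable.
            runnable_done.wait()

    worker = threading.Thread(target=call_thread)
    worker.start()
    worker_holds_mutex.wait()

    # "Main thread": a GetRtpSources-like call needs the mutex before main
    # can get back to its event loop and run the queued runnable.
    main_got_mutex = conduit_mutex.acquire(timeout=acquire_timeout)
    if main_got_mutex:
        conduit_mutex.release()

    # Without the timeout both sides would now wait on each other forever.
    # Unblock the worker so the simulation can exit cleanly.
    runnable_done.set()
    worker.join()
    return main_got_mutex
```

Calling `simulate_lock_vs_sync_dispatch()` reports whether main was able to take the mutex. Under the stated assumption it cannot, because the worker releases the mutex only after main has serviced it.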

Assignee: nobody → apehrson
Status: NEW → ASSIGNED

Hmm. This cannot have been the deadlock I mention in comment 1 because these tests don't use getContributingSources or getSynchronizationSources.

Since this is a TSAN run, it looks like the tests are simply too slow. We could try increasing the size of the thread pool. Bug 1706925 is meant to follow up on this.

The call thread is a global (process-wide) TaskQueue which, among other things, routes all network packets (ugh, heavy), so it may still be a bottleneck.

I'm not sure how many cores the machines running TSAN have, but I would like to see us set a thread pool size that is better adapted to the local machine's CPU. If we are saturating the thread pool, this should ease the pain for the call thread. It's at least worth trying a number higher than 4 to see whether it has any effect on this test. We could also consider making the call thread a dedicated thread (or a TaskQueue backed by a single-thread thread pool, which probably makes for a simpler patch) so it doesn't compete with the other TaskQueues for the threads in the pool. Do you have cycles to play with this, Michael?
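The two ideas in this comment (sizing the shared pool from the machine's core count, and giving the call thread its own single-thread pool) can be sketched outside Gecko as follows. This is an illustration using Python's standard `concurrent.futures`, not the Gecko TaskQueue API. The function names, the floor of 4 (taken from the number mentioned above) and the cap of 16 are assumptions made for the example.

```python
import os
import threading
from concurrent.futures import ThreadPoolExecutor


def shared_pool_size(cpu_count, floor=4, cap=16):
    """Hypothetical heuristic: follow the core count, clamped to [floor, cap]."""
    return max(floor, min(cap, cpu_count or floor))


def build_executors():
    """Create a shared pool plus a dedicated single-thread pool for "call" work."""
    shared = ThreadPoolExecutor(
        max_workers=shared_pool_size(os.cpu_count()),
        thread_name_prefix="shared",
    )
    # One worker only: tasks run serially and never compete with the
    # shared pool's tasks for a thread.
    call = ThreadPoolExecutor(max_workers=1, thread_name_prefix="call")
    return shared, call


def call_thread_names(task_count=8):
    """Submit tasks to the dedicated pool and collect the thread names used."""
    shared, call = build_executors()
    try:
        futures = [
            call.submit(lambda: threading.current_thread().name)
            for _ in range(task_count)
        ]
        return {f.result() for f in futures}
    finally:
        shared.shutdown()
        call.shutdown()
```

In this sketch every task submitted to the dedicated pool runs on the same thread, so ordering is preserved the way it would be on a serial queue, while heavy work on the shared pool cannot starve it of a thread. Whether that removes the bottleneck in practice would still need to be measured.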

Flags: needinfo?(apehrson) → needinfo?(mfroman)

I've written a patch to improve this by making the call thread sit on a dedicated single-thread thread pool. It seems to help locally under rr. Checking TSAN on try here.

Flags: needinfo?(mfroman)

See D127263 for this (on bug 1654112).

This appears fixed by D127263. See here. I'm going to close this as fixed for now.

Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED