test_peerConnection_twoAudioVideoStreams.html and test_peerConnection_twoAudioVideoStreamsCombined.html are timing out
Categories
(Core :: WebRTC, defect, P2)
People
(Reporter: ng, Assigned: pehrsons)
TEST-UNEXPECTED-FAIL | dom/media/webrtc/tests/mochitests/test_peerConnection_twoAudioVideoStreams.html | Test timed out. -
TEST-UNEXPECTED-FAIL | dom/media/webrtc/tests/mochitests/test_peerConnection_twoAudioVideoStreamsCombined.html | Test timed out. -
https://treeherder.mozilla.org/logviewer?job_id=350517429&repo=try&lineNumber=448084
https://treeherder.mozilla.org/logviewer?job_id=350517429&repo=try&lineNumber=448937
Assignee
Comment 1•3 years ago
This is a deadlock between starting conduits (which triggers a SyncRunnable from Call thread to main as part of video codec init), and GetRtpSources (which grabs a conduit Mutex on main). Fixed by D124373.
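The shape of that deadlock can be sketched in plain standard C++. This is only a model, not the Gecko code: `conduitMutex`, the two promises, and `MainCanLockWhileCallThreadWaits` are hypothetical names, and it assumes the call thread holds the conduit lock while it waits on main. A timed lock attempt is used so the sketch reports the stall instead of hanging.

```cpp
#include <chrono>
#include <future>
#include <mutex>
#include <thread>

// Returns true if "main" could take the conduit lock while the "call thread"
// was blocked waiting on main. A false result models the deadlock.
bool MainCanLockWhileCallThreadWaits() {
  std::timed_mutex conduitMutex;       // hypothetical stand-in for the conduit Mutex
  std::promise<void> callHoldsLock;    // call thread -> main: the lock is now held
  std::promise<void> mainRanSyncTask;  // main -> call thread: sync task serviced

  std::thread callThread([&] {
    // Assumption: the lock is held across the synchronous dispatch to main.
    std::lock_guard<std::timed_mutex> lock(conduitMutex);
    callHoldsLock.set_value();
    // Models the SyncRunnable: block until main has run the task.
    mainRanSyncTask.get_future().wait();
  });

  callHoldsLock.get_future().wait();

  // Main is inside a GetRtpSources-like call that wants the same lock.
  // With an untimed lock() this is where both threads would wait forever.
  bool acquired = conduitMutex.try_lock_for(std::chrono::milliseconds(100));
  if (acquired) {
    conduitMutex.unlock();
  }

  // Only after giving up on the lock does main service the sync task,
  // which lets the call thread finish and release the mutex.
  mainRanSyncTask.set_value();
  callThread.join();
  return acquired;
}
```

In this model the lock attempt always times out, because the only thing that would release the mutex is main servicing the task it has not reached yet.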
Comment 2•3 years ago
Andreas, D124373 is in our stack now, but we're still seeing the timeouts on linux1804-64-tsan-qr opt builds:
https://treeherder.mozilla.org/jobs?repo=try&revision=df56c392371dfd991f9647ef4fc275ea2a3d6595&selectedTaskRun=LD1K2tWbRnaFqER4U2BQCg.0
and
https://treeherder.mozilla.org/jobs?repo=try&revision=df56c392371dfd991f9647ef4fc275ea2a3d6595&selectedTaskRun=LD1K2tWbRnaFqER4U2BQCg.0
Any thoughts?
Assignee
Comment 3•3 years ago
Hmm. This cannot have been the deadlock I mention in comment 1 because these tests don't use getContributingSources or getSynchronizationSources.
This being TSAN, it just kinda looks like it's too slow. We could try to increase the size of the thread pool. Bug 1706925 is meant to follow up on this.
The call thread is a global (process-wide) TaskQueue which (among other things) routes all network packets (ugh, heavy), so it may still be a bottleneck.
I'm not sure how many cores the machines running TSAN have, but I would like to see us set a thread pool size that's a bit more adapted to the local machine's CPU. If we were saturating the thread pool, this should help ease the pain for the call thread. It's at least worth exploring a higher number than 4 to see whether it has any effect on this test. We could also consider making the call thread a dedicated thread (or a single-thread thread-pool-backed TaskQueue, which probably makes for a simpler patch) so it doesn't compete with the other TaskQueues over the threads in the pool. Do you have cycles to play with this, Michael?
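One way to picture a CPU-adapted pool size, as a rough sketch only: keep the current 4 as a floor and otherwise follow the reported core count. The function names and the "max of the two" policy are assumptions for illustration, not what any patch does.

```cpp
#include <algorithm>
#include <thread>

// Hypothetical policy: never go below the existing fixed size (4), and grow
// with the number of cores the machine reports.
unsigned SuggestedPoolSize(unsigned aReportedCores, unsigned aMinimum = 4) {
  return std::max(aReportedCores, aMinimum);
}

unsigned SuggestedPoolSizeForThisMachine() {
  // hardware_concurrency() may return 0 when the value is not computable;
  // the floor above covers that case.
  return SuggestedPoolSize(std::thread::hardware_concurrency());
}
```

Whether a larger pool actually helps under TSAN would still need to be measured on try.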
Assignee
Comment 4•3 years ago
I've written a patch to improve this by making the call thread sit on a dedicated single-thread thread pool. Seems to help locally under rr. Checking tsan on try here.
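For readers unfamiliar with the idea, here is a minimal standard-C++ sketch of a task queue that owns its one worker thread, so its tasks never wait behind other queues for a shared pool thread. `DedicatedTaskQueue` is a hypothetical illustration and not the code in the patch.

```cpp
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical stand-in for a TaskQueue backed by its own one-thread pool.
class DedicatedTaskQueue {
 public:
  DedicatedTaskQueue() : mWorker([this] { Run(); }) {}

  ~DedicatedTaskQueue() {
    {
      std::lock_guard<std::mutex> lock(mMutex);
      mShutdown = true;
    }
    mCondVar.notify_one();
    mWorker.join();  // the worker drains any queued tasks before exiting
  }

  // Tasks run one at a time, in dispatch order, on the dedicated thread.
  void Dispatch(std::function<void()> aTask) {
    {
      std::lock_guard<std::mutex> lock(mMutex);
      mTasks.push_back(std::move(aTask));
    }
    mCondVar.notify_one();
  }

 private:
  void Run() {
    for (;;) {
      std::function<void()> task;
      {
        std::unique_lock<std::mutex> lock(mMutex);
        mCondVar.wait(lock, [this] { return mShutdown || !mTasks.empty(); });
        if (mTasks.empty()) {
          return;  // shutdown was requested and the queue is drained
        }
        task = std::move(mTasks.front());
        mTasks.pop_front();
      }
      task();  // run outside the lock so Dispatch() is never blocked by a task
    }
  }

  std::mutex mMutex;
  std::condition_variable mCondVar;
  std::deque<std::function<void()>> mTasks;
  bool mShutdown = false;
  std::thread mWorker;  // declared last so the members above exist before Run()
};
```

The trade-off is one extra thread that may sit idle, in exchange for the call thread's work no longer competing with other TaskQueues for pool threads.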
Assignee
Comment 5•3 years ago
See D127263 for this (on bug 1654112).
Comment 6•3 years ago