Closed Bug 1728616 Opened 3 years ago Closed 2 years ago

Intermittent deadlock in webrtc/RTCDataChannel-close.html breaking tsan?

Status:

RESOLVED DUPLICATE of bug 1795697

People

(Reporter: bwc, Unassigned)

Byron Campen [:bwc]

Reporter

Description

•

3 years ago

•

Edited

This is different from what I've observed while working on bug 1635911:

[task 2021-08-31T23:32:14.266Z] 23:32:14 INFO - PID 1268 | [Child 1575: Main Thread]: D/DataChannel 7b2c00021160: Close()ing 7b3400034750
[task 2021-08-31T23:33:13.494Z] 23:33:13 INFO - PID 1268 | [Child 1575: Unnamed thread 7b440003bd80]: D/DataChannel In receive_cb, ulp_info=41
[task 2021-08-31T23:33:13.495Z] 23:33:13 INFO - PID 1268 | [Child 1575: Unnamed thread 7b440003bd80]: D/DataChannel In ReceiveCallback
[task 2021-08-31T23:35:21.451Z] 23:35:21 INFO - Got timeout in harness
[task 2021-08-31T23:35:21.454Z] 23:35:21 INFO - TEST-UNEXPECTED-TIMEOUT | /webrtc/RTCDataChannel-close.html | TestRunner hit external timeout (this may indicate a hang)
[task 2021-08-31T23:35:21.454Z] 23:35:21 INFO - TEST-INFO took 195004ms

That last log line is here:

https://searchfox.org/mozilla-central/rev/ac7da6c7306d86e2f86a302ce1e170ad54b3c1fe/netwerk/sctp/datachannel/DataChannel.cpp#2372

We do not see the logging at the following line, which means we're in the case where data is non-null (!!data):

https://searchfox.org/mozilla-central/rev/ac7da6c7306d86e2f86a302ce1e170ad54b3c1fe/netwerk/sctp/datachannel/DataChannel.cpp#2375

From the logging, we are on an unnamed thread (in other words, we're getting a callback from libusrsctp), so we'll end up trying to lock here:

https://searchfox.org/mozilla-central/rev/ac7da6c7306d86e2f86a302ce1e170ad54b3c1fe/netwerk/sctp/datachannel/DataChannel.cpp#2379

Right before that, we see the "Close()ing" line; this ends up locking the same mutex here:

https://searchfox.org/mozilla-central/source/netwerk/sctp/datachannel/DataChannel.cpp#2989

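It looks like there might be cases where we call into libusrsctp while holding that lock. If libusrsctp also holds one of its own locks while invoking our receive callback, that would be a lock-order inversion: main would block inside libusrsctp while the callback thread blocks on our mutex. A deadlocked main thread would also explain why we stop seeing logging for the entire process. This is just a hypothesis, though.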
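
To make the hypothesis concrete, here is a minimal sketch of the suspected pattern. It is not taken from DataChannel.cpp or libusrsctp: every name in it (connectionLock, stackLock, SimulateLockOrderInversion) is a hypothetical stand-in, and the existence of a lock inside the SCTP stack that is held during the receive callback is an assumption. Timed lock attempts are used so the sketch reports the inversion instead of hanging.

```cpp
#include <chrono>
#include <future>
#include <mutex>
#include <thread>

// All names here are hypothetical stand-ins; none come from
// DataChannel.cpp or libusrsctp.
struct InversionResult {
  bool mainBlocked;      // "main" could not take the stack's lock
  bool callbackBlocked;  // "callback" could not take the connection lock
};

InversionResult SimulateLockOrderInversion() {
  std::timed_mutex connectionLock;  // stands in for the mutex taken on close
  std::timed_mutex stackLock;       // assumed lock internal to the SCTP stack

  // One-shot signals used to force the problematic interleaving.
  std::promise<void> mainHolds, callbackHolds, mainTried, callbackTried;
  std::future<void> mainHoldsF = mainHolds.get_future();
  std::future<void> callbackHoldsF = callbackHolds.get_future();
  std::future<void> mainTriedF = mainTried.get_future();
  std::future<void> callbackTriedF = callbackTried.get_future();

  InversionResult result{false, false};
  const auto timeout = std::chrono::milliseconds(100);

  // "Main thread": takes connectionLock, then calls into the stack.
  std::thread mainThread([&] {
    std::lock_guard<std::timed_mutex> held(connectionLock);
    mainHolds.set_value();
    callbackHoldsF.wait();  // wait until the other side holds stackLock
    if (stackLock.try_lock_for(timeout)) {
      stackLock.unlock();
    } else {
      result.mainBlocked = true;
    }
    mainTried.set_value();
    callbackTriedF.wait();  // keep holding until the other side has tried
  });

  // "Callback thread": holds stackLock, then wants connectionLock.
  std::thread callbackThread([&] {
    std::lock_guard<std::timed_mutex> held(stackLock);
    callbackHolds.set_value();
    mainHoldsF.wait();  // wait until the other side holds connectionLock
    if (connectionLock.try_lock_for(timeout)) {
      connectionLock.unlock();
    } else {
      result.callbackBlocked = true;
    }
    callbackTried.set_value();
    mainTriedF.wait();  // keep holding until the other side has tried
  });

  mainThread.join();
  callbackThread.join();
  return result;
}
```

In this sketch both attempts time out, so both flags come back true. With plain blocking lock() calls in place of the timed attempts, neither thread would ever make progress, which would be consistent with the harness timeout in the log above.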

Byron Campen [:bwc]

Reporter

Updated

•

3 years ago

Severity: S3 → S2

Priority: P3 → P2

Byron Campen [:bwc]

Reporter

Comment 1

•

3 years ago

Maybe related to bug 1735972?

Comment 2

•

2 years ago

Are we still seeing this? I think this could be an S3 rather than an S2, since it's not user-facing.

Flags: needinfo?(docfaraday)

Byron Campen [:bwc]

Reporter

Comment 3

•

2 years ago

I'm fairly sure that bug 1795697 fixes this.

Status: NEW → RESOLVED

Closed: 2 years ago

Duplicate of bug: 1795697 (alias CVE-2022-46871)

Flags: needinfo?(docfaraday)

Resolution: --- → DUPLICATE

Categories

(Core :: WebRTC: Networking, defect, P2)

Security

(public)
