Closed
Bug 807647
Opened 12 years ago
Closed 12 years ago
Hang when calling DataChannel.send
Categories
(Core :: WebRTC, defect)
Tracking
()
VERIFIED
FIXED
mozilla19
People
(Reporter: ted, Assigned: jesup)
References
Details
(Keywords: hang)
Attachments
(2 files, 3 obsolete files)
89.75 KB,
text/plain
|
Details | |
5.16 KB,
patch
|
ekr
:
review+
akeybl
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
I have a demo I'm hacking on that reliably hangs on my Linux64 system when I call DataChannel.send. Attached is "thread apply all bt" from the hung process.
The salient bits seem to be:
1) Connect a PeerConnection
2) Add two DataChannels (in this demo: one reliable, one unreliable)
3) Call send on the reliable DataChannel
I'll try to reduce my demo down to a smaller testcase.
Reporter | ||
Comment 1•12 years ago
|
||
Note that I have a different demo that only creates one DataChannel, and this does not hang (on the exact same build).
Reporter | ||
Comment 2•12 years ago
|
||
jesup's suggested hack makes the hang go away:
diff --git a/netwerk/sctp/datachannel/DataChannel.cpp b/netwerk/sctp/datachannel
--- a/netwerk/sctp/datachannel/DataChannel.cpp
+++ b/netwerk/sctp/datachannel/DataChannel.cpp
@@ -410,17 +410,18 @@ DataChannelConnection::SendPacket(const
/* static */
int
DataChannelConnection::SctpDtlsOutput(void *addr, void *buffer, size_t length,
uint8_t tos, uint8_t set_df)
{
DataChannelConnection *peer = static_cast<DataChannelConnection *>(addr);
int res;
- if (peer->IsSTSThread()) {
+ if (1) {
Assignee | ||
Comment 3•12 years ago
|
||
Reporter | ||
Comment 4•12 years ago
|
||
I've had no luck producing a really reduced testcase, but here's a test that does repro for me, albeit with a few manual steps:
http://people.mozilla.org/~tmielczarek/webrtc-datachannels-hang/data-demo.html
1) Load some other page (not sure why, but this is required)
2) Load that page
3) Once the URL changes to include ?id=..., copy the URL and load it in a new tab.
4) Once both tabs say "Connected" type something into the input on the first tab and hit enter
5) Hang
Assignee | ||
Comment 5•12 years ago
|
||
WIP #2, this appears to work in the reduced testcase - still with a few cleanups to do
Assignee | ||
Updated•12 years ago
|
Attachment #677429 -
Attachment is obsolete: true
Reporter | ||
Comment 6•12 years ago
|
||
This patch seems to break my demo from comment 4 in other interesting ways. Notably I don't get onconnect notifications for my PeerConnections on both sides, so things just get stuck. Also I get a 100% CPU hang on shutdown.
Assignee | ||
Comment 7•12 years ago
|
||
This should solve crashing and shutdown problems with your testcases; please try (had wrong return value for cheating about whether it succeded sending)
Assignee | ||
Updated•12 years ago
|
Attachment #677444 -
Attachment is obsolete: true
Reporter | ||
Comment 8•12 years ago
|
||
The combination of the latest patch here + the patch from bug 807929 fixes my issues.
Assignee | ||
Comment 9•12 years ago
|
||
Cleaned up, with commentary on why we're doing this and disabled code we can use if we resolve the lock issues in SCTP
Assignee | ||
Updated•12 years ago
|
Attachment #677696 -
Attachment is obsolete: true
Assignee | ||
Comment 10•12 years ago
|
||
Comment on attachment 677757 [details] [diff] [review]
proxy incoming DTLS data to SCTP via the STS thread to avoid locking issues
Once I have a chance to unravel the locks in SCTP I may be able to make the output call without any locks held, or just 'safe' locks, so I'm not deleting the IsSTSThread case, just disabling it and commenting.
Attachment #677757 -
Flags: review?(ekr)
Comment 11•12 years ago
|
||
Comment on attachment 677757 [details] [diff] [review]
proxy incoming DTLS data to SCTP via the STS thread to avoid locking issues
Review of attachment 677757 [details] [diff] [review]:
-----------------------------------------------------------------
lgtm
Attachment #677757 -
Flags: review?(ekr) → review+
Assignee | ||
Comment 12•12 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/ff560027a8ba
The previous fix bug 806169 and this one (fallout from that) should be uplifted to Aurora after verification. Ted's tests and mine seem to run and stop cleanly now.
Assignee | ||
Comment 13•12 years ago
|
||
Marking FF18 as unaffected because it isn't until Bug 806169 uplifts (which fixes a major problem with calling NSS on teh wrong thread)
status-firefox18:
--- → unaffected
status-firefox19:
--- → affected
Target Milestone: --- → mozilla19
Reporter | ||
Updated•12 years ago
|
Assignee: nobody → rjesup
Comment 14•12 years ago
|
||
Status: NEW → RESOLVED
Closed: 12 years ago
Flags: in-testsuite-
Resolution: --- → FIXED
Assignee | ||
Comment 15•12 years ago
|
||
Comment on attachment 677757 [details] [diff] [review]
proxy incoming DTLS data to SCTP via the STS thread to avoid locking issues
[Approval Request Comment]
Bug caused by (feature/regressing bug #): 806169
User impact if declined: Can't land bug 801221 (or need to disable DataChannels)
Testing completed (on m-c, etc.): baked on m-c for weeks
Risk to taking this patch (and alternatives if risky): Minimal; this copies all the data and makes the send asynchronous instead of synchronous (which can cause a deadlock on usrsctp_connect()).
String or UUID changes made by this patch: none
Attachment #677757 -
Flags: approval-mozilla-aurora?
Comment 16•12 years ago
|
||
Comment on attachment 677757 [details] [diff] [review]
proxy incoming DTLS data to SCTP via the STS thread to avoid locking issues
[Triage Comment]
Approving for FF18, as our understanding is that this could only regress WebRTC (pref'd off in FF18).
If this is landed on mozilla-aurora before ~11AM PT tomorrow, this will make the merge from Aurora 18 to Beta 18. If landed 11AM-5PM PT tomorrow on mozilla-beta, it will make the first FF18 Beta. If landed after that, it will end up in the second FF18 beta.
Attachment #677757 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Assignee | ||
Comment 17•12 years ago
|
||
Comment 18•12 years ago
|
||
Ted - I tried running your test case steps from comment 4. However, when I try to run it, I end up getting an error reported:
Timestamp: 11/23/2012 11:41:58 PM
Error: Firefox can't establish a connection to the server at http://webrtc-broker.herokuapp.com/offer/6263524D-7C25-4FEE-BEF8-93AFC8322278/answer.
Source File: http://people.mozilla.org/~tmielczarek/webrtc-datachannels-hang/webrtc-broker-client.js
Line: 40
Any ideas?
Flags: needinfo?(ted)
Updated•12 years ago
|
Flags: needinfo?(ted)
Comment 19•12 years ago
|
||
Verified by doing a sending of a message from a reliable DC to a unreliable DC.
Status: RESOLVED → VERIFIED
Keywords: verifyme
You need to log in
before you can comment on or make changes to this bug.
Description
•