DataChannels broken by thread tests in mtransport

RESOLVED FIXED in Firefox 18

Status

()

defect
--
critical
RESOLVED FIXED
7 years ago
7 years ago

People

(Reporter: jesup, Assigned: jesup)

Tracking

Trunk
mozilla19
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox18 fixed, firefox19 fixed)

Details

(Whiteboard: [qa-])

Attachments

(1 attachment)

DataChannels (and SCTP over DTLS) assumes that it can make IO calls from threads other than the SocketTransportService thread, and relies on this to avoid main-thread IO for example, by spawning a thread specifically to call usrsctp_connect().  This worked with UDP sockets.

With DTLS/ICE/UDP sockets, this is broken by the CheckThread() calls in TransportLayer, in particular in SendPacket(), but probably other calls as well.

If all DTLS IO *must* occur on the STS thread, we'll need to proxy all IO to that thread, either in DataChannels, or internally within the Transport code (which would provide a nicer API for callers).  Or we'll need to modify things to allow multithreaded access...  but I'm not holding my breath for that.

Thinking about it, while I could move all the DataChannel IO to the STS thread (with pain, but I was thinking of it anyways), that wouldn't solve the problem of the SCTP library needing to do heartbeats, acks, etc, which it does via its own threads.  

Because of that, I think we really have to modify the transport to provide a true UDP-like API, in particular to allow writes (and maybe other calls like reads) from any thread, and internally proxying that to the STS thread if needed.

On the plus side, this will make an interface easier to interface with by application code.
I may be able to do this within the SendPacket() callback in DataChannel used to revector IO to DTLS, using WrapRunnableRet().  Still not great, but may get basics back up and working

Anant, the test page should be modified to warn people (nightlies on/after Sunday - check Sat; I don't know if the changeset made it into Nightly on Sat.

I have this half-coded but I'm too tired to finish now.
I'm very, very afraid - I wrote this patch while falling asleep on my keyboard; it compiled after one syntax typo, and worked first time
Attachment #675925 - Flags: review?(ekr)
Comment on attachment 675925 [details] [diff] [review]
proxy all SendPackets to the STS thread

Review of attachment 675925 [details] [diff] [review]:
-----------------------------------------------------------------

I am a little concerned that using SYNC here is going to result in reentrancy problems. You may want to put in a TODO to put in some asserts for that.
Attachment #675925 - Flags: review?(ekr) → review+
https://hg.mozilla.org/mozilla-central/rev/e069342dc665
With comment added

Note that on 18 things will work, but with IO off the STS thread random problems may crop up.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla19
Keywords: verifyme
Keywords: verifyme
Whiteboard: [qa-]
Assignee: nobody → rjesup
Comment on attachment 675925 [details] [diff] [review]
proxy all SendPackets to the STS thread

[Approval Request Comment]
Bug caused by (feature/regressing bug #): 801221
User impact if declined: Can't land bug 801221
Testing completed (on m-c, etc.): baked on m-c for weeks
Risk to taking this patch (and alternatives if risky): Minimal risk, proxies sends to the STS thread.  Also needs fallout patch from this, bug 807647
String or UUID changes made by this patch: none
Attachment #675925 - Flags: approval-mozilla-aurora?
Comment on attachment 675925 [details] [diff] [review]
proxy all SendPackets to the STS thread

[Triage Comment]
Approving for FF18, as our understanding is that this could only regress WebRTC (pref'd off in FF18).

If this is landed on mozilla-aurora before ~11AM PT tomorrow, this will make the merge from Aurora 18 to Beta 18. If landed 11AM-5PM PT tomorrow on mozilla-beta, it will make the first FF18 Beta. If landed after that, it will end up in the second FF18 beta.
Attachment #675925 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
You need to log in before you can comment on or make changes to this bug.