Closed
Bug 1194817
Opened 10 years ago
Closed 10 years ago
Certain high-traffic uses of DataChannels/sctp fail when the packets get too large due to PMTU
Categories
(Core :: WebRTC: Networking, defect, P1)
Tracking
()
RESOLVED
FIXED
mozilla43
backlog | webrtc/webaudio+ |
People
(Reporter: jesup, Assigned: jesup)
References
Details
Attachments
(1 file, 1 obsolete file)
2.21 KB,
patch
|
tuexen
:
review+
ritu
:
approval-mozilla-aurora+
ritu
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
Certain complex scenarios with high traffic and large numbers of channels (hundreds) appear to cause the MTU to grow (PMTU), and when it hit 1464 bytes (of SCTP payload, as decoded via SCTP:4 logging) the packets don't get through the DTLS connection.
After a number of retransmits, it decides the connection has failed and aborts the association, killing all datachannels.
Assignee | ||
Updated•10 years ago
|
backlog: --- → webrtc/webaudio+
Rank: 10
Priority: -- → P1
Assignee | ||
Comment 1•10 years ago
|
||
tested that basic functionality works; bemasc - can you retest with this patch?
Attachment #8648174 -
Flags: review?(tuexen)
Assignee | ||
Updated•10 years ago
|
Assignee: nobody → rjesup
Status: NEW → ASSIGNED
Comment 2•10 years ago
|
||
Comment on attachment 8648174 [details] [diff] [review]
disable PMTU in DataChannels/SCTP
Review of attachment 8648174 [details] [diff] [review]:
-----------------------------------------------------------------
This patch doesn't set the PMTU, have a look at the patch I sent you a couple of minutes ago.
That disables the PMTUD and sets the PMTU to save value. If you don't want to set the PMTU,
you need to address the issues I marked.
::: netwerk/sctp/datachannel/DataChannel.cpp
@@ +454,5 @@
> goto error_cleanup;
> }
>
> + LOG(("Disabling SCTP PMTU"));
> + memset(&paddrparams, 0, sizeof(paddrparams));
The next line disables Heartbeats. I suggest to call
usrsctp_getsockopt() first and then do
paddrparams.spp_flags &= !SPP_PMTUD_ENABLE;
paddrparams.spp_flags |= SPP_PMTUD_DISABLE;
@@ +455,5 @@
> }
>
> + LOG(("Disabling SCTP PMTU"));
> + memset(&paddrparams, 0, sizeof(paddrparams));
> + paddrparams.spp_flags = SPP_PMTUD_DISABLE;
You need to set
paddrparams.spp_address.ss_family = AF_CONN;
You also want
#if defined(__Userspace_os_Darwin)
paddrparams.spp_address.ss_len = sizeof(struct sockaddr_conn);
#endif
@@ +456,5 @@
>
> + LOG(("Disabling SCTP PMTU"));
> + memset(&paddrparams, 0, sizeof(paddrparams));
> + paddrparams.spp_flags = SPP_PMTUD_DISABLE;
> + paddrparams.spp_assoc_id = SCTP_ALL_ASSOC;
Setting spp_assoc_id is not necessary.
Attachment #8648174 -
Flags: review?(tuexen) → review-
Assignee | ||
Comment 3•10 years ago
|
||
Based on Tuexen's patch by email, with some fixes and making sure it runs if the connect succeeded without EINPROGRESS. r+ from me for Tuexen's, r? to tuexen for my mods (though those don't really need it
Attachment #8648223 -
Flags: review?(tuexen)
Updated•10 years ago
|
Attachment #8648223 -
Flags: review?(tuexen) → review+
Comment 4•10 years ago
|
||
Just for clarification:
The problem is not related to high number of streams (data channels). It is just related to the fact that the SCTP association is up for more than 10 minutes. The initial path MTU is 1280 bytes, after 10 minutes path MTU discovery kicks in by a timeout and raises the path MTU to 1492 bytes. If now a user message is sent which allows messages of that size, they don't get through, since with the UDP and DTLS overhead it exceeds the MTU. Therefore the packet is dropped, even on retransmissions (SCTP relies currently on IP fragmentation). Finally the association is terminated. We are working on an improved version of PMTUD. Right now, it is best just to disable PMTUD and use a save value. This is what the patch does.
Comment 6•10 years ago
|
||
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
status-firefox43:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla43
Assignee | ||
Updated•10 years ago
|
status-firefox40:
--- → wontfix
status-firefox41:
--- → affected
status-firefox42:
--- → affected
status-firefox-esr31:
--- → wontfix
status-firefox-esr38:
--- → affected
Assignee | ||
Updated•10 years ago
|
Attachment #8648174 -
Attachment is obsolete: true
Assignee | ||
Comment 7•10 years ago
|
||
Comment on attachment 8648223 [details] [diff] [review]
disable PMTUD in DataChannels/SCTP, set initial MTU
Approval Request Comment
[Feature/regressing bug #]: N/A
[User impact if declined]: DataChannels can fail after 10 minutes if the transport doesn't allow PMTUD to work properly. This definitely affects uProxy, and *may* affect some TURN cases. Simple use of TURN with datachannels in a Hello call without this patch (and width TURN forced) didn't seem to cause PMTUD to be invoked. Other uses might, and there are scenarios (at least some unusual ones) where we believe it can fail.
[Describe test coverage new/current, TreeHerder]: Not easily testable in automation due to the network requirements and 10 minute timer. uProxy has tested the fix successfully. Should be manually verifiable with uProxy.
[Risks and why]: Very low risk - isolated change that disables an SCTP feature that's purely an efficiency-of-network feature, at least for discovery of larger-than-normal MTUs. Limit set is the number from the DataChannels IETF draft.
[String/UUID change made/needed]: none.
Attachment #8648223 -
Flags: approval-mozilla-beta?
Attachment #8648223 -
Flags: approval-mozilla-aurora?
Comment on attachment 8648223 [details] [diff] [review]
disable PMTUD in DataChannels/SCTP, set initial MTU
Approved for uplift to Aurora and Beta. Given that the likelihood of running into this scenario is low, it seems safe to uplift to Beta directly.
Attachment #8648223 -
Flags: approval-mozilla-beta?
Attachment #8648223 -
Flags: approval-mozilla-beta+
Attachment #8648223 -
Flags: approval-mozilla-aurora?
Attachment #8648223 -
Flags: approval-mozilla-aurora+
Comment 9•10 years ago
|
||
Comment 10•10 years ago
|
||
Comment 11•10 years ago
|
||
Randell, Coverity complains about this patch:
CID 1320439 (#1 of 1): Logical vs. bitwise operator (CONSTANT_EXPRESSION_RESULT)logical_vs_bitwise: paddrparams.spp_flags &= 0U /* !8 */ always assigns 0 to paddrparams.spp_flags.
Did you intend to use '~' rather than '!'?
https://dxr.mozilla.org/mozilla-central/source/netwerk/sctp/datachannel/DataChannel.cpp#616
Correct? Do you want a new bug report?
Flags: needinfo?(rjesup)
Assignee | ||
Comment 12•10 years ago
|
||
Ugh.. yes, that looks wrong; should be ~.
Flags: needinfo?(rjesup)
Comment 13•10 years ago
|
||
Yes, it needs to be ~
You need to log in
before you can comment on or make changes to this bug.
Description
•