Closed Bug 1510487 Opened 11 months ago Closed 9 months ago

DTLS without SRTP extension (for datachannel only) closes connection

Categories

(Core :: WebRTC: Networking, defect, P1)

64 Branch
defect

Tracking

()

RESOLVED FIXED
mozilla66
Tracking Status
firefox-esr60 --- unaffected
firefox64 --- wontfix
firefox65 --- fixed
firefox66 --- fixed

People

(Reporter: stapatrick9, Assigned: drno, NeedInfo)

References

(Blocks 1 open bug)

Details

(Keywords: regression)

Attachments

(3 files, 1 obsolete file)

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36

Steps to reproduce:

Connect to Firefox 64 datachannel session through webrtc.
The remote/local SDP is attached. 


Actual results:

The DTLS exchange failed. Specifically after the DTLS certificate message is send, the remote DTLS server connection closes. Our Native DTLS exchange that we capture in our native stack is also attached.


Expected results:

I expected DTLS exchange to pass with datachannels. Did something change in Firefox 64 that could effect this?(maybe in the sdp ,DTLS or SCTP). I will attach a success version of DTLS exchange when connected to chrome (or firefox 63). 
Our data channel native stack connects fine with firefox 63 and before. It also working fine with Chrome. The DTLS exchange works fine with Audio/Video connection in firefox 64 also.
I am assigning a component to this issue in order to involve the development team and get an opinion on this.
Component: Untriaged → WebRTC
Product: Firefox → Core
(In reply to undefined from comment #undefined)
> 

Thanks
This is the wireShark pcapng file from the failted datachannels dtls exchange with firefox 64. 192.168.124.173 is firefox in this case.
We found that there is bug in firefox 64 where if connections with only data streams that don’t negotiate an SRTP profiles are immediately closed after the DTLS handshake completes.
Status: UNCONFIRMED → NEW
Component: WebRTC → WebRTC: Networking
Ever confirmed: true
jib, can you triage this?
Flags: needinfo?(jib)
Could LiveSwitch be objecting to the new SCTP window? Nils, WDYT?
Flags: needinfo?(drno)
No this is not caused by the window size.

Most likely it's the code from bug 1485883 which is causing this.

@mt: looks like we over looked a legit use case for not negotiating any SRTP profiles with DTLS, when you only want to negotiate a data channel only. Do you think this is covered by the specs in any way?

Assuming for now this a regression and setting therefore high priority.
Depends on: 1485883
Flags: needinfo?(drno) → needinfo?(martin.thomson)
Keywords: regression
Priority: -- → P1
Attached patch 1510487-wip.patch (obsolete) — Splinter Review
I'm surprised that this is a problem.  An error in DTLS will result in an error in the SRTP layer.  I think that we were previously ignoring failures from the SRTP negotiation.  If so, then what I will attach will likely work.

I had a look at the code here and removing the check changes the guarantees that TransportLayerDtls provides.  Basically, if we offer SRTP (which we always do), and the peer doesn't accept it because it expects to only be using data channels, then we will die later.  We don't conditionally add TransportLayerSrtp based on the availability of SRTP.

I have some changes (attached), but no time to go over them carefully.  I don't build gecko often enough and I'm currently stuck upgrading all sorts of things to allow it to build.  So I haven't tested this.  I think that we'll want lots of tests around this to be sure that we don't regress more.
Flags: needinfo?(martin.thomson)
(In reply to Martin Thomson [:mt:] from comment #9)
> I had a look at the code here and removing the check changes the guarantees
> that TransportLayerDtls provides.  Basically, if we offer SRTP (which we
> always do), and the peer doesn't accept it because it expects to only be
> using data channels, then we will die later.  We don't conditionally add
> TransportLayerSrtp based on the availability of SRTP.

Maybe this is the right thing to do going forward: only offer SRTP extensions if the transport contains audio or video. And then also only insert TransportLayerSrtp if it's needed. I think the biggest issue is at which layer to figure out if SRTP is needed or not and then pass it down to the transportlayer code.

The other question is: if we would no longer offer SRTP extension in DTLS for datachannel only transport how many implementations would barf?

I guess the less problematic option would to continue to negotiate the SRTP extension, but simply not insert TransportLayerSrtp if we know it a data channel only transport.
 
> I have some changes (attached), but no time to go over them carefully.  I
> don't build gecko often enough and I'm currently stuck upgrading all sorts
> of things to allow it to build.  So I haven't tested this.  I think that
> we'll want lots of tests around this to be sure that we don't regress more.

Thanks for the patch. It turns out that your last patch already had tests for both situations (client not offering SRTP and server not accepting SRTP), but they expected to fail. So I think that was simply the wrong expectation.

Unfortunately it looks like the TransportLayerSrtp code itself doesn't have any test coverage, except indirectly through mochitests. But these don't handle the non SRTP scenario. I guess in this case we would want to have a combined test of TransportLayerSrtp and TransportLayerDtls.

I think we should land as is for now and request uplift to Beta to fix this regression. And then take care of improvements in follow up bugs which ride the trains.
Summary: DTLS Exchange closing connection - datachannel Webrtc → DTLS without SRTP extension (for datachannel only) closes connection
Byron, can you drive this until Nils is back?
Flags: needinfo?(docfaraday)
I've added a review comment, this is pretty close to being able to land, but Nils owns the patch.
Flags: needinfo?(docfaraday)
Duplicate of this bug: 1520094

Is this close to ready to land? Tomorrow's the final Beta of the Fx65 cycle.

Assignee: nobody → drno
Blocks: 1485883
No longer depends on: 1485883
Flags: needinfo?(stapatrick9) → needinfo?(drno)

(In reply to Ryan VanderMeulen [:RyanVM] from comment #16)

Is this close to ready to land? Tomorrow's the final Beta of the Fx65 cycle.

Yes I can get it in today.

Flags: needinfo?(drno)
Blocks: 1520692
Attachment #9031339 - Attachment is obsolete: true

Comment on attachment 9031730 [details]
Bug 1510487: allow DTLS connection w/o SRTP extension. r?bwc

[Beta/Release Uplift Approval Request]

Feature/Bug causing the regression: Bug 1485883

User impact if declined: For certain WebRTC services Data Channel connections fail to establish (it depends on how the service negotiates the connection on the DTLS layer).

Is this code covered by automated tests?: Yes

Has the fix been verified in Nightly?: No

Needs manual test from QE?: No

If yes, steps to reproduce:

List of other uplifts needed: None

Risk to taking this patch: Low

Why is the change risky/not risky? (and alternatives if risky): It only removes an error check introduced by bug 1485883. Also the scenario which triggers this not very common.

String changes made/needed: N/A

Attachment #9031730 - Flags: approval-mozilla-beta?
Pushed by rvandermeulen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/8fb54b5db199
allow DTLS connection w/o SRTP extension. r=bwc
Status: NEW → RESOLVED
Closed: 9 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla66

Comment on attachment 9031730 [details]
Bug 1510487: allow DTLS connection w/o SRTP extension. r?bwc

[Triage Comment]
Fixes Data Channel connection failures for certain WebRTC services. Approved for 65.0b12.

Attachment #9031730 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

Can you verify that the latest Firefox Beta build fixes the issue for you?

Flags: needinfo?(stapatrick9)

Yes, the issue is fixed in Firefox 65.0b12!

You need to log in before you can comment on or make changes to this bug.