Closed Bug 1619102 Opened 2 years ago Closed 2 years ago

WebRTC DTLS fails between Nightly 75 and Release 73 or Beta 74 (if Nightly is DTLS client)

Categories

(NSS :: Libraries, defect, P1)

Tracking

(firefox-esr68 unaffected, firefox73 unaffected, firefox74 unaffected, firefox75+ disabled, firefox76+ fixed)

RESOLVED FIXED

People

(Reporter: cpeterson, Assigned: kjacobs)

References

(Blocks 1 open bug, Regression)

Details

(Keywords: regression)

Attachments

(1 file)

I was trying to use Facebook Messenger yesterday (Mac to Mac):

https://www.facebook.com/messages/

A call between Firefox 73 and Chrome 80.0.3987.122 worked correctly. A call between Firefox 73 and 75 Nightly on the same computers connected, but neither client played any audio or video, regardless of which client initiated the call. There was just an empty gray Facebook window with the local camera preview video playing in the corner.

This bug looks a lot like Messenger/WebRender bug 1600937 in Firefox 70, but that bug was resolved as invalid. My 75 client does have both Fission and WebRender enabled, but I'm not sure why that would break audio and video on both clients. I didn't have time to test with Fission or WebRender disabled. I haven't tested Firefox 73 and 74 Beta yet either.

If this is not a known bug, I can try to bisect the regression. I wanted to ask first because bisecting with mozregression will be a lot of work because I will have to use two separate computers and Facebook accounts to test the Messenger calls.

drno wondered whether this problem might be caused by DTLS 1.3 (bug 1284103).

I will try disabling DTLS 1.3 (by reverting the media.peerconnection.dtls.version.max pref to 771) and retesting later today.

Or pessimistic: bug 1615445

Priority: -- → P1

[Tracking Requested - why for this release]:

Facebook Messenger video calls broke in 75 Nightly due to WebRTC DTLS 1.3 bug 1284103.

(In reply to Chris Peterson [:cpeterson] from comment #1)

drno wondered whether this problem might be caused by DTLS 1.3 (bug 1284103).

I will try disabling DTLS 1.3 (by reverting the media.peerconnection.dtls.version.max pref to 771) and retesting later today.

I confirmed that DTLS 1.3 is the cause of this problem. I reverted media.peerconnection.dtls.version.max to 771 and Firefox 73 and 75 could call successfully. I then restored media.peerconnection.dtls.version.max to its new default 772 and then Firefox 73 and 75 could connect but didn't play audio or video on either end.
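
For reference, the two pref values are easier to read in hex. The sketch below only converts the numbers quoted in this bug. The DTLS 1.3 label for 772 follows from the comments here; the DTLS 1.2 label for 771 is an assumption based on the pref reusing TLS-style version codes.

```python
# Values of media.peerconnection.dtls.version.max mentioned in this bug.
# The labels are assumptions (see the note above), not taken from Firefox source.
PREF_VALUES = {
    771: "DTLS 1.2",  # previous default, DTLS 1.3 disabled
    772: "DTLS 1.3",  # new default in 75 Nightly
}

def describe_pref(value: int) -> str:
    """Render a pref value as decimal, 16-bit hex, and a protocol label."""
    name = PREF_VALUES.get(value, "unknown")
    return f"{value} = 0x{value:04X} ({name})"

for v in (771, 772):
    print(describe_pref(v))
```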

Regressed by: 1284103
See Also: 1600937
Summary: Facebook Messenger video call between Firefox 73 and 75 does not play audio or video → Facebook Messenger video call between Firefox 73 and 75 does not play audio or video (DTLS 1.3)
Component: WebRTC: Audio/Video → WebRTC: Networking

This is related to bug 1615208.

Firefox versions before the patch for bug 1615208 send TLS version numbers in the supported_versions extension for DTLS (which is incorrect) instead of DTLS version numbers. With that corrected, we can negotiate with Chrome, but not with older versions of Firefox. Even though DTLS 1.3 is disabled in those older versions, they still negotiate through supported_versions if it is present.

The only obvious solution I see is to advertise both TLS and DTLS version numbers. I have a build in-progress to verify this works.
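
A rough sketch of what "advertise both" could look like on the wire. The extension layout (type 43, a 2-byte length, then a 1-byte list length followed by 2-byte versions) is the ClientHello form from RFC 8446. The specific code points, their order, and the helper name `encode_supported_versions` are assumptions for illustration: the DTLS values shown are the final RFC code points, while builds at the time may have used draft values. This is not the NSS patch.

```python
import struct

SUPPORTED_VERSIONS_EXT = 43  # extension type defined in RFC 8446

# Illustrative code points (assumptions, see the note above).
TLS_1_2, TLS_1_3 = 0x0303, 0x0304
DTLS_1_2, DTLS_1_3 = 0xFEFD, 0xFEFC

def encode_supported_versions(versions):
    """Encode a ClientHello supported_versions extension (hypothetical helper)."""
    body = b"".join(struct.pack("!H", v) for v in versions)
    ext_data = struct.pack("!B", len(body)) + body  # 1-byte list length prefix
    header = struct.pack("!HH", SUPPORTED_VERSIONS_EXT, len(ext_data))
    return header + ext_data

# Corrected behavior: DTLS version numbers only.
dtls_only = encode_supported_versions([DTLS_1_3, DTLS_1_2])
# Compatibility idea from this comment: DTLS numbers plus TLS numbers.
compat = encode_supported_versions([DTLS_1_3, DTLS_1_2, TLS_1_3, TLS_1_2])

print(dtls_only.hex())
print(compat.hex())
```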

Assignee: nobody → kjacobs.bugzilla
Severity: normal → critical
Component: WebRTC: Networking → Libraries
Product: Core → NSS
QA Contact: jjones
Version: unspecified → trunk
Summary: Facebook Messenger video call between Firefox 73 and 75 does not play audio or video (DTLS 1.3) → WebRTC DTLS fails between Nightly 75 and Release 73 or Beta 74 (if Nightly is DTLS client)
Blocks: 1614746

Kevin just to make sure I understand the impact here:
The problem is that if we enable DTLS 1.3 in Fx, it adds supported_versions to its client hello, which then gets rejected by the Fx DTLS server because it misinterprets the version numbers in there?
Currently we can't establish connections with older versions of Firefox, because of this problem, right?

How come that Chrome is now fine with Fx advertising 1.3, but Fx can't handle it?
Would it be possible to create a patch for Fx Beta, which makes it stop aborting because of the "wrong protocol" version? Or maybe that will even happen automatically once 75 becomes Beta?

I think I need answers to the above questions before we decide whether to disable DTLS 1.3 until we have a better interop situation.

Flags: needinfo?(kjacobs.bugzilla)

(In reply to Nils Ohlmeier [:drno] from comment #5)

Kevin just to make sure I understand the impact here:
The problem is that if we enable DTLS 1.3 in Fx, it adds supported_versions to its client hello, which then gets rejected by the Fx DTLS server because it misinterprets the version numbers in there?
Currently we can't establish connections with older versions of Firefox, because of this problem, right?

Correct - it gets rejected by an old (pre-75) Fx DTLS server. The interop problem is between pre-1615208 and post-1615208 versions, even though that patch itself was correct. There are two issues at play:

  1. Prior to the patch for bug 1615208, NSS (and Firefox) were sending and expecting TLS version numbers in the supported_versions extension of the DTLS 1.3 client hello. This is wrong; sending DTLS version numbers instead corrected the Chrome interop issue.
  2. If the supported_versions extension is present, NSS attempts to negotiate through it even when DTLS 1.3 and/or TLS 1.3 are disabled. This is somewhat unexpected, since supported_versions was introduced in TLS 1.3.
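
The interaction of these two issues can be sketched with a toy model. It assumes a pre-1615208 server simply looks for TLS-style code points in supported_versions whenever the extension is present. The function name `old_server_pick`, the code points, and the selection logic are all hypothetical and do not reflect the actual NSS implementation.

```python
# Illustrative code points (assumptions; DTLS values are the final RFC ones).
TLS_1_2, TLS_1_3 = 0x0303, 0x0304
DTLS_1_2, DTLS_1_3 = 0xFEFD, 0xFEFC

def old_server_pick(offered, dtls13_enabled=False):
    """Toy model of a pre-1615208 DTLS server.

    It consults supported_versions whenever the list is present (issue 2)
    and only recognizes TLS-style version numbers in it (issue 1).
    Returns the chosen code point, or None if nothing matches.
    """
    acceptable = [TLS_1_3, TLS_1_2] if dtls13_enabled else [TLS_1_2]
    for version in acceptable:
        if version in offered:
            return version
    return None  # no overlap: the handshake is rejected

# Post-1615208 client: correct DTLS numbers, which the old server cannot match.
new_client_offer = [DTLS_1_3, DTLS_1_2]
# Compatibility idea: offer both encodings so the old server finds a match.
compat_offer = [DTLS_1_3, DTLS_1_2, TLS_1_3, TLS_1_2]

print(old_server_pick(new_client_offer))
print(old_server_pick(compat_offer))
```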

How come that Chrome is now fine with Fx advertising 1.3, but Fx can't handle it?

Bug 1615208 fixed the issue of sending wrong versions in a DTLS 1.3 ClientHello. Old versions of Fx process the supported_versions extension (which only DTLS 1.3 sends) and expect it to hold the wrong version numbers.

Would it be possible to create a patch for Fx Beta, which makes it stop aborting because of the "wrong protocol" version? Or maybe that will even happen automatically once 75 becomes Beta?

Not easily, since this is low-level code in the TLS stack and the behavior applies to all prior TLS 1.3-comprehending versions of NSS.

Flags: needinfo?(kjacobs.bugzilla)

Add an experimental function for enabling a DTLS 1.3 supported_versions compatibility workaround.

Patched in NSS: https://hg.mozilla.org/projects/nss/rev/53803dc4628f9750125c2cb27319845df75cc189

I'll open a WebRTC bug for the patch enabling this in Firefox.

Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 3.52
Blocks: 1621036
Duplicate of this bug: 1620716

Is beta actually affected here, or are we ok at this point (bug 1284103 was nightly only)?

No, beta is not affected.

Duplicate of this bug: 1623258