Closed Bug 1672703 Opened 4 years ago Closed 4 years ago

regression: CVE-2020-25648 fix breaks purple-discord: nss: Handshake failed (-12251)

Categories

(NSS :: Libraries, defect, P1)

3.58
x86_64
Linux

Tracking

(Root Cause:Coding: Logical Error, firefox-esr78 unaffected, firefox82 unaffected, firefox83 unaffected, firefox84 unaffected)

RESOLVED FIXED

People

(Reporter: pabs3, Unassigned)

Details

(Keywords: regression)

Attachments

(3 files)

Attached file TLS Hello packets

After upgrading from NSS 3.56 to 3.58, purple-discord can no longer connect to Discord; it fails with "nss: Handshake failed (-12251)". According to the Mozilla NSS docs, this code means "SSL received a malformed Change Cipher Spec record." Other people seem to hit the same problem with XMPP, although my own XMPP accounts are unaffected.

I was able to work around this in two different ways:

  • I used the Pidgin NSS Preferences plugin to disable TLS version 1.3 and this fixed the issue.
  • I recompiled NSS 3.58 with the fix for CVE-2020-25648 reverted and that also fixed the issue.

I wasn't sure where to go from here, and Julien Cristau suggested I file a bug here, adding Daiki Ueno to the moreinfo field.

In Wireshark, I see a "TLSv1.3 Client Hello" packet, then a "TLSv1.3 Server Hello, Change Cipher Spec" packet, then from server to client a "TLSv1.3 Application Data" packet, then from client to server a "TLSv1.3 Application Data" packet and finally the TCP connection is torn down. The two TLS Hello packets from Wireshark are attached.

PS: I'm using NSS 2:3.58-1 on Debian GNU/Linux 11 (bullseye).

Flags: needinfo?(dueno)

The error is SSL_ERROR_RX_MALFORMED_CHANGE_CIPHER.

The problem is that the client is not enabling TLS 1.3 compatibility mode. I can see this because the ClientHello.legacy_session_id field is empty.
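For anyone wanting to repeat that check on a capture, here is a minimal sketch in C that reads the legacy_session_id length out of a raw TLS record holding a ClientHello. The function name and the assumption that the buffer starts at the 5-byte record header are mine, not anything from NSS. The offsets follow the ClientHello layout (record header, 4-byte handshake header, 2-byte legacy_version, 32-byte random). A length of 0 suggests compatibility mode is off; a non-empty value can also mean a pre-1.3 session is being offered, so treat it as a hint only.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of NSS): return the length of
 * ClientHello.legacy_session_id, or -1 if the buffer does not look like
 * a TLS record that starts with a complete-enough ClientHello.
 */
static int client_hello_session_id_len(const uint8_t *rec, size_t len)
{
    /* 5 (record header) + 4 (handshake header) + 2 (legacy_version)
     * + 32 (random) = offset of the session id length byte. */
    const size_t off = 5 + 4 + 2 + 32;

    if (len <= off) {
        return -1;              /* too short to contain the length byte */
    }
    if (rec[0] != 22) {
        return -1;              /* record content type is not handshake */
    }
    if (rec[5] != 1) {
        return -1;              /* handshake type is not client_hello */
    }

    uint8_t sid_len = rec[off];
    if (sid_len > 32 || len < off + 1 + (size_t)sid_len) {
        return -1;              /* invalid length or truncated session id */
    }
    return (int)sid_len;
}
```

Feeding it the bytes of the attached ClientHello (starting at the TLS record layer) should return 0 if the field is empty as described above.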

The fault is technically with the server. It assumes compatibility mode and sends ChangeCipherSpec even though the client did not request or permit it.

A workaround might be to set the SSL_ENABLE_TLS13_COMPAT_MODE option to PR_TRUE in the client using SSL_OptionSet[Default]. Taking this up with the server stack might be more difficult.

Oh, my mistake. I think that appendix D.4 expressly permits the server to send CCS without compatibility mode enabled. Gross.

We've just run TLS interop tests against other libraries (OpenSSL and GnuTLS), and with SSL_ENABLE_TLS13_COMPAT_MODE enabled most of the failures are gone. The only remaining failures are between a GnuTLS client and an NSS server, but I suspect that is an issue in GnuTLS, as we already know of a few issues there.

(In reply to Martin Thomson [:mt:] from comment #2)

Oh, my mistake. I think that appendix D.4 expressly permits the server to send CCS without compatibility mode enabled. Gross.

Would it still make sense to enable SSL_ENABLE_TLS13_COMPAT_MODE by default, or should we somehow tolerate CCS reception without enabling it?

Flags: needinfo?(dueno)

Daiki, are you able to take this one?

[Tracking Requested - why for this release]:

Connection failure regression in NSS 3.58 in Firefox 83. This is going to require a 3.58.1 point release and uplift to Beta.

Severity: -- → S1
Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: regression
Priority: -- → P1
Regressions: CVE-2020-25648

Daiki,

Would it still make sense to enable SSL_ENABLE_TLS13_COMPAT_MODE by default [...]

I think that the specific fix needs to be that we allow CCS always. Maybe we need to invert the polarity of the flag and have it as ccsReceived. Then it will be initialized to PR_FALSE and we can set it when the first CCS is received (or when we detect that the ClientHello is after a potentially stateless HelloRetryRequest: code is hard).

JC, this isn't a regression in Firefox as we set the SSL_ENABLE_TLS13_COMPAT_MODE. Firefox won't be affected.

Flagging JC to double check my work.

Root Cause: --- → Coding: Logical Error
Flags: needinfo?(jjones)

(In reply to J.C. Jones [:jcj] (he/him) [increased latency due to COVID-19] from comment #4)

Daiki, are you able to take this one?

I can take it unless Martin has already started working on the fix.

Thanks, Martin, I forgot Necko set that. Confirmed.

Flags: needinfo?(jjones)

This flips the meaning of the flag used to check for excessive CCS
messages, so that only repeated CCS messages are rejected, while the
first CCS message is always accepted.
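To illustrate the change in polarity, here is a simplified model in C. It is not the NSS code: the struct, the function names, and the old-behavior model (CCS allowed only when the client enabled compat mode) are assumptions reconstructed from the discussion in this bug, and the ccs_received name merely follows the suggestion made earlier in the thread.

```c
#include <stdbool.h>

/* Hypothetical, simplified handshake state; not NSS's real structure. */
typedef struct {
    bool allow_ccs;    /* old polarity: true only if compat mode was enabled */
    bool ccs_received; /* new polarity: false until the first CCS arrives */
} hs_state;

static void hs_init(hs_state *hs, bool compat_mode)
{
    hs->allow_ccs = compat_mode;
    hs->ccs_received = false;
}

/*
 * Old check (as described in this bug): a CCS is accepted only if compat
 * mode allowed it, and only once. Returning false stands in for failing
 * the handshake with SSL_ERROR_RX_MALFORMED_CHANGE_CIPHER (-12251).
 */
static bool accept_ccs_old(hs_state *hs)
{
    if (!hs->allow_ccs) {
        return false;
    }
    hs->allow_ccs = false;  /* a second CCS would be rejected */
    return true;
}

/*
 * New check: the first CCS is always accepted, regardless of compat
 * mode; any further CCS is rejected.
 */
static bool accept_ccs_new(hs_state *hs)
{
    if (hs->ccs_received) {
        return false;
    }
    hs->ccs_received = true;
    return true;
}
```

With compat mode off, accept_ccs_old() rejects the server's CCS (the failure reported here), while accept_ccs_new() accepts the first one and rejects a second, which keeps the limit on CCS messages that the CVE-2020-25648 fix was meant to enforce.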

This fixes the problem we have seen with cURL using NSS backend for us.

Could you explain why, for example, only cURL showed the problem (i.e. curl https://github.com failed) while Firefox itself, linked against the same NSS version, was able to open https://github.com/? The same goes for Waterfox: it was unable to open google.com, for example, while everything worked fine in Firefox at the same time.

Thomas, Firefox enables the compat mode, which works around the issue, as is explained in comment 5.

Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 3.59

FWIW, I confirm that the patch has fixed the issue for purple-discord.

It would be great to have a 3.58.1 release to fix the applications affected by the issue, which seems to mostly be pidgin plugins so far.

Are we sure this is fixed in nss-3.59? In Gentoo Linux we hit the same problem again: cURL linked against nss-3.59 is unable to connect to github.com, for example. The TLS handshake looks similar to the original problem. Downgrading NSS to 3.58.1 makes it work again.

Flags: needinfo?(dueno)

I bisected NSS and tracked the issue down to bug 1663661. Reverting https://hg.mozilla.org/projects/nss/rev/0ed11a5835ac1556ff978362cd61069d48f4c5db makes it work. Filed a new bug 1679290. Sorry for the noise.

Flags: needinfo?(dueno)