Mozilla NSS TLS 1.3 CCS Flood remote DoS Attack
Categories
(NSS :: Libraries, defect, P1)
Tracking
(firefox-esr68 wontfix, firefox-esr78 disabled, firefox80 wontfix, firefox81 wontfix, firefox82 wontfix, firefox83 fixed)
People
(Reporter: lywang90, Assigned: ueno, NeedInfo)
References
(Regression)
Details
(Keywords: csectype-dos, regression, sec-moderate, Whiteboard: [adv-main83-] server-side, Firefox unaffected)
Attachments
(2 files)
User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.61 Safari/537.36
Steps to reproduce:
Step 1. Check out the latest version of the NSS code and build it. Revision ID: 661e3e3f6ba515a36fc97038164979a216c9f87b
Step 2. Run the ssl_gtests.sh test to create the data that the selfserv tool needs.
HOST=localhost DOMSUF=localdomain USE_64=1 ./nss/tests/ssl_gtests/ssl_gtests.sh
Step 3. Run selfserv in TLS 1.3 mode.
NSS_DIR="$(pwd)/dist/$(cat dist/latest)"
LD_LIBRARY_PATH="$NSS_DIR/lib" "$NSS_DIR/bin/selfserv" -n rsa -p 4433 -d ~/nss-dev/tests_results/security/localhost.1/ssl_gtests/ -v -V tls1.3:tls1.3
Step 4. Configure the environment for the PoC (named ccs_dos_poc.py in the attachments).
Python version: 3.8
Package requirements: tlslite-ng==0.8.0-alpha37
Change the host and port variables in the PoC code according to your environment.
Step 5. Run the PoC code.
Actual results:
On the server side, the CPU usage of the NSS server process reaches 100% immediately and stays there. The CPU usage of the PoC process on the client side, however, remains very low.
This is a remote server-side DoS (Denial of Service) issue. An unauthenticated attacker can mount a DoS attack against an NSS TLS 1.3 server remotely and very efficiently.
Expected results:
The NSS server should handle CCS messages in TLS 1.3 more carefully to prevent this kind of attack. I will explain the issue in as much detail as possible.
Section 1. Issue Analysis
In TLS 1.3, the CCS message is used only for compatibility purposes:
https://tools.ietf.org/html/rfc8446#section-5
https://tools.ietf.org/html/rfc8446#appendix-D.4
NSS does follow RFC 8446, but its state machine check is relatively loose and there are no other limits. It allows an attacker to send CCS messages in a row after the ClientHello message. If an attacker puts multiple CCS messages in a single TCP packet, the NSS server gets stuck in a loop, iterating many times to process the messages. The relevant code is in the ssl3_HandleRecord() function of /nss/lib/ssl/ssl3con.c.
Normally, if an attacker needs to keep sending packets to sustain a remote DoS attack, it is not considered a security vulnerability, because the processing power required of the client and the server is basically the same. In this issue, however, the server requires much more processing power than the client, because an attacker can put multiple CCS messages in a single TCP packet. It is like a CCS message bomb: the server needs to loop thousands of times to process a single TCP packet, while the client only needs to do a raw socket send.
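To put rough numbers on that asymmetry: per RFC 8446, a TLS 1.3 compatibility CCS record is 6 bytes on the wire (a 5-byte record header plus the single payload byte 0x01). The sketch below only does that arithmetic. The buffer sizes are illustrative assumptions, not measurements from the PoC, and the helper name is hypothetical.

```python
# Size of one change_cipher_spec record: 1 byte content type + 2 bytes
# legacy version + 2 bytes length + 1 byte payload (RFC 8446, section 5).
CCS_RECORD_SIZE = 5 + 1


def ccs_records_in(buffer_size: int) -> int:
    """How many whole CCS records fit in buffer_size bytes (hypothetical helper)."""
    return buffer_size // CCS_RECORD_SIZE


# Illustrative buffer sizes (assumptions): a typical Ethernet TCP payload,
# a 16 KiB write, and a 64 KiB write.
for size in (1460, 16 * 1024, 64 * 1024):
    print(size, "bytes ->", ccs_records_in(size), "CCS records")
```

Each of those records costs the server one pass through its record-handling loop, while the sender only has to write the buffer once.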
A very good example of this kind of vulnerability is the SSL Renegotiation DoS problem (CVE-2011-1473, CVE-2011-5094), in which the difference in processing power is only about a factor of 10. From that perspective, the issue here is more serious. Here is some information about the SSL Renegotiation DoS problem:
http://www.ietf.org/mail-archive/web/tls/current/msg07553.html
https://bugzilla.redhat.com/show_bug.cgi?id=707065
Section 2. Fix suggestion
A good example of how to handle CCS messages properly in TLS 1.3 is the OpenSSL implementation, which limits the maximum number of consecutive CCS messages. The way OpenSSL does this is a little complicated. First, OpenSSL treats a CCS message in TLS 1.3 as an empty record and keeps an empty record count. Then, the empty record count is compared to the MAX_EMPTY_RECORDS constant, which is 32. The code comment also explains that this is done to prevent an attack similar to the one discussed here: "MAX_EMPTY_RECORDS defines the number of consecutive, empty records that will be processed per call to ssl3_get_record. Without this limit an attacker could send empty records at a faster rate than we can process and cause ssl3_get_record to loop forever."
The key to addressing this issue is to limit the number of consecutive CCS messages.
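The idea of capping consecutive CCS messages can be sketched as follows. This is a minimal illustration in Python, not NSS or OpenSSL code: the class and method names are hypothetical, and the default limit of 32 is simply borrowed from the MAX_EMPTY_RECORDS value quoted above.

```python
CONTENT_TYPE_CCS = 20  # change_cipher_spec content type in the TLS record layer

# Assumption: reuse the value of OpenSSL's MAX_EMPTY_RECORDS as the cap.
MAX_CONSECUTIVE_CCS = 32


class ConsecutiveCcsLimiter:
    """Hypothetical helper that counts CCS records arriving back to back."""

    def __init__(self, limit: int = MAX_CONSECUTIVE_CCS) -> None:
        self.limit = limit
        self.count = 0

    def accept(self, content_type: int) -> bool:
        """Return False once more than `limit` CCS records arrive in a row."""
        if content_type == CONTENT_TYPE_CCS:
            self.count += 1
            return self.count <= self.limit
        # Any other record type breaks the run and resets the counter.
        self.count = 0
        return True
```

A record loop would call accept() for each incoming record and tear down the connection on the first False, so the work per connection is bounded no matter how many CCS records the peer packs into one buffer.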
Section 3. Conclusion
The core of the issue is the disparity between the processing power required of the server and of the client, which makes the remote DoS attack possible. The comparison with the SSL Renegotiation DoS problem and the OpenSSL implementation may help you see this issue clearly. I hope it will be fixed in NSS.
Comment 1•4 years ago
jcj: just guessing at how important this might be for server clients, is sec-moderate appropriate? Sounds like it's not really a problem for Firefox client use of NSS.
Comment 2•4 years ago
Adding more folks, particularly RedHat, to this list.
This could be used against WebRTC eventually, though TLS 1.3 WebRTC is enabled in prerelease Firefox only at present, probably until Q4 2020.
Comment 3•4 years ago
I could reproduce this. It is probably an S2 severity for RedHat, though not currently a threat for Firefox.
Comment 4•4 years ago
I'm going to suggest that we do something simple, which is to set a flag when CCS is permitted (i.e., when we negotiate TLS 1.3 and the ClientHello contains a non-empty legacy_session_id field) and clear it when we find that CCS. Then, if we receive another CCS, we'll hit the standard processing paths and drop the connection.
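The flag-based suggestion above can be sketched roughly like this. It is an illustration only, not the NSS patch: the class, method names, and return values are hypothetical, and "abort" stands in for whatever the standard processing path does with an unexpected record.

```python
from dataclasses import dataclass

CONTENT_TYPE_CCS = 20  # change_cipher_spec content type in the TLS record layer


@dataclass
class CcsGate:
    """Hypothetical helper: permits at most one compatibility CCS."""

    allow_ccs: bool = False

    def on_client_hello(self, negotiated_tls13: bool, legacy_session_id: bytes) -> None:
        # CCS is permitted only when TLS 1.3 was negotiated and the client
        # sent a non-empty legacy_session_id (middlebox compatibility mode).
        self.allow_ccs = negotiated_tls13 and len(legacy_session_id) > 0

    def on_record(self, content_type: int) -> str:
        if content_type != CONTENT_TYPE_CCS:
            return "process"
        if self.allow_ccs:
            # Consume the single permitted CCS and clear the flag.
            self.allow_ccs = False
            return "ignore"
        # A second CCS, or one that was never permitted, is treated as an
        # unexpected record and the connection is dropped.
        return "abort"
```

With this shape, a flood of CCS records ends at the second record instead of being looped over one by one.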
Reporter
Comment 5•4 years ago
Hello everyone, is there any progress in fixing this bug?
FYI, I agree with Martin. In fact, I found the same issue in wolfSSL and reported it to them earlier. They fixed it in the way Martin suggested.
https://github.com/wolfSSL/wolfssl/pull/2927/files
Assignee
Comment 6•4 years ago
This makes the server reject CCS when the client doesn't indicate the
use of the middlebox compatibility mode with a non-empty
ClientHello.legacy_session_id, or when it sends multiple CCS messages in a row.
Reporter
Comment 7•3 years ago
Hello everyone, it's been over three months since I reported this issue. How is everything going?
I plan to make this issue public after it is fixed, so I'd like to know the approximate release date of the fixed version. Thanks.
Assignee
Comment 9•3 years ago
Sorry, I have been a bit stuck on this (it's more complicated than I anticipated), but I'll post an update this week.
Comment 10•3 years ago
I'm afraid this missed NSS 3.57 (Firefox 82), so we'll take it for NSS 3.58 / Firefox 83.
Comment 11•3 years ago
There's an r+ patch which didn't land and no activity in this bug for 2 weeks.
:ueno, could you have a look please?
For more information, please visit auto_nag documentation.
Comment 13•3 years ago
Scheduling this to land Monday as part of NSS_3_58_BETA1
Comment 14•3 years ago
Is Mozilla planning to assign a CVE to this, or should Red Hat?
Comment 15•3 years ago
https://hg.mozilla.org/projects/nss/rev/57bbefa793232586d27cee83e74411171e128361
I suspect we'll let RedHat assign it - J.C.?
Comment 16•3 years ago
Red Hat has assigned CVE-2020-25648 to this flaw.
Comment 17•3 years ago
Do we need to do something for ESR78 here still this cycle?
Comment 18•3 years ago
I don't know that this really warrants an ESR78 uplift. TLS1.3 for WebRTC is not enabled in ESR 78, and there are no plans to enable that.
Unless someone has a differing opinion, I think we can mark this disabled for ESR.