Closed Bug 1134548 Opened 10 years ago Closed 10 years ago

Resumption attempts cap the version to the session version

Categories

(NSS :: Libraries, defect, P2)

defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mt, Assigned: mt)

Details

Attachments

(1 file, 1 obsolete file)

This means that a server that is intolerant to a higher version will not appear to be intolerant. And a server that is upgraded to support a higher version will not get an opportunity to negotiate that version. A great many implementations don't reduce the ClientHello version in this way so evidently this doesn't cause large scale compatibility issues. Given the stance on backward compatibility, we should probably add an opt-in to control this.
Attached patch bug1134548.patch (obsolete) — Splinter Review
Will need to find reviewers.
Flags: needinfo?(martin.thomson)
Comment on attachment 8577380 [details] [diff] [review] bug1134548.patch Review of attachment 8577380 [details] [diff] [review]: ----------------------------------------------------------------- Also on rietveld: https://codereview.appspot.com/219750043 ::: external_tests/ssl_gtest/ssl_loopback_unittest.cc @@ +45,5 @@ > TEST_P(TlsConnectGeneric, Connect) { > Connect(); > > // Check that we negotiated the expected version. > + client_->CheckVersion(SSL_LIBRARY_VERSION_TLS_1_2); This fix wasn't part of this, but it was annoying to have the tests fail, so I fixed it. ::: lib/ssl/ssl3con.c @@ +5007,5 @@ > + * Previously, we would cap the version based on sid->version, > + * but that prevents negotiation of a higher version if the > + * previous session was reduced (e.g., with version fallback) > + */ > + if (sid->version < ss->vrange.min || My bad, I'll clean up the whitespace.
Comment on attachment 8577380 [details] [diff] [review] bug1134548.patch Review of attachment 8577380 [details] [diff] [review]: ----------------------------------------------------------------- ::: lib/ssl/ssl3con.c @@ -4982,5 @@ > - * Whenever a client already knows the highest protocol known to > - * a server (for example, when resuming a session), it should > - * initiate the connection in that native protocol. > - * So we pass sid->version to ssl3_NegotiateVersion() here, except > - * when renegotiating. Please preserve this comment in the new code. I added this comment after asking around about the rationale behind the current behavior. @@ +5014,5 @@ > + } else { > + rv = ssl3_NegotiateVersion(ss, SSL_LIBRARY_VERSION_MAX_SUPPORTED, > + PR_TRUE); > + if (rv != SECSuccess) { > + return rv; /* error code was set */ Is this early return intentional? Why don't we set sidOK to PR_FALSE and fall through? If this is intentional, I think this is worth a comment because it is handled differently from all the other errors in this part of the code.
Thanks Wan-Teh, I'll preserve the comment; I was overzealous. The early return was entirely intentional, but I think that I might instead just use PORT_Assert. We shouldn't reach that point with a version range that doesn't permit that call to succeed.
Attached patch bug1134548.patchSplinter Review
Since you expressed interest here Wan-Teh. Don't feel obligated. If you want to discuss the change in behavior, we can wait until the call next week.
Attachment #8577380 - Attachment is obsolete: true
Attachment #8577451 - Flags: review?(wtc)
Assignee: nobody → martin.thomson
Flags: needinfo?(martin.thomson)
Comment on attachment 8577451 [details] [diff] [review] bug1134548.patch Review of attachment 8577451 [details] [diff] [review]: ----------------------------------------------------------------- ::: lib/ssl/ssl3con.c @@ +4978,5 @@ > sidOK = PR_FALSE; > } > > + /* Set ss->version based on the session cache. > + * TLS 1.0 (RFC 2246) Appendix E says: Ah, I'm sorry. I didn't notice that you changed the arguments to the ssl3_NegotiateVersion call on line 5020. Please remove my old comment. You don't need to keep a comment that describes old behavior. We just need to add comment that describes current behavior, and only if the code is not obvious. @@ +5018,5 @@ > sidOK = PR_FALSE; > + } else { > + rv = ssl3_NegotiateVersion(ss, SSL_LIBRARY_VERSION_MAX_SUPPORTED, > + PR_TRUE); > + PORT_Assert(rv == SECSuccess); PORT_Assert is conditionally compiled in debug builds only. We usually handle this kind of code like this: rv = ssl3_NegotiateVersion(ss, SSL_LIBRARY_VERSION_MAX_SUPPORTED, PR_TRUE); if (rv != SECSuccess) { PORT_Assert(0); // Equivalent to a "not reached" assertion. PORT_SetError(SEC_ERROR_LIBRARY_FAILURE); // This is our error code for programming errors. return rv; }
IIRC, you need to store ClientHello.client_version in the session and avoid resuming any session where the resumption's ClientHello.client_version is different than the session's ClientHello.client_version. See https://code.google.com/p/chromium/issues/detail?id=441456#c3 and https://github.com/tlswg/tls13-spec/issues/136.
I just reviewed the dialog on the Chrome bug. I think that we're OK, even if it is just on the basis that this is how boringssl operates. https://git.openssl.org/gitweb/?p=openssl.git;a=commit;h=9e189b9dc10786c755919e6792e923c584c918a1 is the only potential risk identified. I can see an argument against change until that fix is more widely deployed in servers, but I think we're OK. Making the change risks errors as a result of OpenSSL resuming at 1.2 from a cached 1.1 (or lower) session. davidben notes that Chrome saw a 0.06% error rate bump, but that is in the presence of version fallback, which we've seen is largely triggered for bogus reasons: 0.06% is about right if you consider the value to be P(erroneous fallback)*P(OpenSSL servers). On Firefox, we only risk triggering the bug by connecting to an OpenSSL server that has a change of mind about what version it supports, without also flushing its session cache. In that case, we will have a hard fail. Not doing fallback likely avoids the problem entirely. More so given that most servers flush session caches after a config change. Chrome won't be affected now that they split the session cache to work around this problem. CC'ing davidben in case I'm wrong about that. I'm not aware of anyone else doing version fallback. They take on the same risk that Firefox does.
(I dunno, I think that bug was more of a monologue than a dialog. ;-) ) After splitting the cache, it dropped to 0.02% on our end (note: just canary and dev), which is low enough that I'll probably do a CL to start enforcing the match next week. But it's not zero. With the sharded cache, I believe it can no longer be fallback related. The only remaining scenario I can think of is server config changes and multi-homed hosts that can cross-resume (maybe they just share session ticket keys) but, for some reason, one of the servers has a different configuration than the other[*]. (If they have a separate session cache or keep the session ticket key in a file somewhere, it wouldn't get flushed on config change.) Note: we'd be more affected in this situation because we don't include the IP address in the session cache key on our BoringSSL ports. [*] This isn't all that far-fetched. I started down this rathole because https://www.debian.org/ has two IPs, one of does TLS 1.2 and the other TLS 1.0. Though they couldn't cross-resume.
Yes, cross-resumption or config changes + the OpenSSL bug seem the most plausible source of errors. (debian fixed their servers, at least according to ssllabs.com.)
Comment on attachment 8577451 [details] [diff] [review] bug1134548.patch Review of attachment 8577451 [details] [diff] [review]: ----------------------------------------------------------------- r=wtc. Martin, I am giving you review+ in advance because the changes I requested are straightforward. ::: external_tests/ssl_gtest/ssl_loopback_unittest.cc @@ +180,5 @@ > + SSL_LIBRARY_VERSION_TLS_1_1); > + Connect(); > + > + Reset(); > + EnsureTlsSetup(); IMPORTANT: could you explain what the Reset() and EnsureTlsSetup() calls do? Do they destroy and re-create the SSL/TLS session ID caches? If so, then session resumption is impossible in this test.
Attachment #8577451 - Flags: review?(wtc) → review+
(In reply to Wan-Teh Chang from comment #11) > Comment on attachment 8577451 [details] [diff] [review] > bug1134548.patch > > Review of attachment 8577451 [details] [diff] [review]: > ----------------------------------------------------------------- > > r=wtc. Martin, I am giving you review+ in advance because the changes > I requested are straightforward. > > ::: external_tests/ssl_gtest/ssl_loopback_unittest.cc > @@ +180,5 @@ > > + SSL_LIBRARY_VERSION_TLS_1_1); > > + Connect(); > > + > > + Reset(); > > + EnsureTlsSetup(); > > IMPORTANT: could you explain what the Reset() and EnsureTlsSetup() calls do? > > Do they destroy and re-create the SSL/TLS session ID caches? If so, then > session resumption is impossible in this test. The session cache is setup in the SetUp() function for the test: https://dxr.mozilla.org/mozilla-central/source/security/nss/external_tests/ssl_gtest/tls_connect.cc#29 The cache persists for the whole test run and is cleared in the TearDown() function: https://dxr.mozilla.org/mozilla-central/source/security/nss/external_tests/ssl_gtest/tls_connect.cc#42 Thus, Reset() does nothing to the session cache. What is does do is entirely replace and reinitialize the client and server sides of the test case. EnsureTlsSetup() just makes sure that the client and server are properly initialized. Only ekr knows why that doesn't happen in Reset().
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: