Closed Bug 839310 Opened 11 years ago Closed 11 years ago

Add insecure fallback from TLS 1.1 -> TLS 1.0

Categories

(Core :: Security: PSM, enhancement)

enhancement
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla27
Tracking Status
firefox26 --- fixed
firefox27 --- fixed
b2g-v1.2 --- fixed

People

(Reporter: briansmith, Assigned: briansmith)

References

()

Details

Attachments

(1 file, 12 obsolete files)

42.47 KB, patch
briansmith
: review+
briansmith
: checkin+
Details | Diff | Splinter Review
+++ This bug was initially created as a clone of Bug #733647 +++
+++ This bug was initially created as a clone of Bug #565047 +++

If we encounter an error that seems to be TLS intolerance when we have TLS 1.1 enabled, then we need to fall back to TLS 1.0. (And, if we get TLS intolerance again then we need to fall back to SSL 3.0, like we currrently do.)
Depends on: 839644
https://bugzilla.mozilla.org/show_bug.cgi?id=733642#c28

Dolske says that nsIPermissionManager can maybe be used for persistent storage of intolerant servers; however, this poses a problem: what do we do if the user moves behind a firewall that's TLS version intolerant? The code will think that the site is definitely tolerant when it really isn't.

Do we have any statistics on those firewalls, corporate or otherwise?
I poked at Yngve wrt TLS prober statistics; this is what he sent me:


> Based on my numbers (from 567K servers), as reported on Dec 24th, 2012,
> the version (V) and extensions (E) intolerance numbers when client
> signals TLS 1.0 to TLS 1.2 are:
> 
> V: 0.81%
> E: 1.25%
> v+E: 0.79%
> v|E: 1.27%
> 
> That is, almost all version intolerant servers are also extension
> intolerant, and they are the main reason for performing a version rollback.
> 
> 8.93% of the intolerant servers are Renego patched, but that is just
> 0.11% of all servers (and 0.15% of the 75%  Renego patched servers).
> 
> My recommendation is to implement the procedure outlined in
> <http://datatracker.ietf.org/doc/draft-pettersen-tls-version-rollback-removal/>.
> 
> 
> That procedure will get you a working connection with 98.73% of servers
> in one attempt, and if you implement the rollback protection it will
> only fail with 0.11% of all servers (and IMO that number will be reduced
> to virtually 0.00% within a week or two, particularly if it announce
> clearly, perhaps with names of sites (can't give a more complete
> overview unless I can do a full IPv4 scan, and I do not have the
> capability available at present))

Should that process be implemented; if not, should we keep the current implementation, that is to say, poke at it with different versions till something works then lock it in; or should we pick a different one?
How can we avoid the downgrade attack?
(In reply to Alex Xu from comment #2)
> Should that process be implemented; if not, should we keep the current
> implementation, that is to say, poke at it with different versions till
> something works then lock it in; or should we pick a different one?

I think we should stick with the current implementation for now.

(In reply to Masatoshi Kimura [:emk] from comment #3)
> How can we avoid the downgrade attack?

The purpose of this bug should be to get us interoperating with servers that speak *only* TLS 1.1+. We can have a separate bug for mitigating or eliminating the possibility of downgrade attacks.
Now I'm confused, "interoperating with servers that speak *only* TLS 1.1+" is different from what the summary and your opening description says, which is about falling back to TLS 1.0 if a server does *not* (or not correctly) speak 1.1?
(In reply to rsx11m from comment #5)
> Now I'm confused, "interoperating with servers that speak *only* TLS 1.1+"
> is different from what the summary and your opening description says, which
> is about falling back to TLS 1.0 if a server does *not* (or not correctly)
> speak 1.1?

Let me put it another way: the goal of this bug shouldn't be to increase *security* of TLS in Gecko, but rather just to increase *compatibility*. That is, we shouldn't be too concerned about downgrade attacks in this bug; we will deal with them eventually in another bug.
Ok, thanks.
Assignee: nobody → mhamrick
Issue #1: Telemetry
Can we please have telemetry on this, i. e. the number of (different) hosts that need the fallback (e. g. number of affected *unique* hosts, plus total number of fallbacks [maybe per host, too, if this measure makes any sense at all])? I guess privacy policy does not allow us to collect actual hosts and/or IPs of those hosts? May it be possible to send hashes of IPs and/or Hosts back via telemetry to get a clearer picture on the actual number of broken hosts? 


Issue #2: Once a fallback, always a fallback?
What happens if a host changes to being TLS 1.1 capable? How is this detected, how can it be assured that a user always uses the safest connection available? 

There's support for temporary "permissions" for the site permission manager (as mentioned in Comment 1) coming up (due to CTP), so it may be good to just remember TLS 1 downgrade for a session, and retry v1.1 after the temporary "permission" has expired. That way, a user may use the most secure connection sooner than later. 


Issue #3: The warning.
Something I was meaning to send to firefox-dev but couldn't do so yet, so here's a rough sketch: 
I'd like to see a warning page similar (but not too similar ;-) ) to the certificate warning page when users hit a downgrading TLS connection. There's the risk of warning-fatigue, yes, but according to the stats above, there are acutally not that many servers that are v1.1 intolerant (with the exception of intolerant proxies which will degrade the UX of affected users for every site they visit). To lighten that issue (and to accommodate bad proxies), such a warning page can be shown for known-good sites only, at first (i. e. if the users once encountered a site being v1.1 ready, he may be warned upon a later, downgrading visit). 

Is that a viable option?
Before adding the _real_ fallback code, it seemed like a good idea to change the interface to the PSM's ssl socket info data structure to use a range of values instead of booleans for each revision. BSmith indicated this _may_ be a fix for another bug, but I did some searching and couldn't find it, so i'm adding it here since we think we want to do this to make fallback between TLS revisions easier.
Attachment #764169 - Flags: review?(bsmith)
Attachment #764169 - Attachment is patch: true
Attachment #764169 - Attachment mime type: text/x-patch → text/plain
(In reply to Florian Bender from comment #8)
> Issue #1: Telemetry
> Can we please have telemetry on this, i. e. the number of (different) hosts
> that need the fallback

Are the statistics from comment 2 not useful enough?

If you really think you need more statistics, I can try and get them.
Yes, they are helpful for initially implementing this fallback. However, keeping track of the numbers of affected hosts may help determining when this insecure fallback can be removed (or replaced with a more secure mechanism, e. g. see issue #3 resp. a variant), i. e. when the number of affected users (regarding the hit times in a specific timeframe) drops below a certain threshold. The data may also be helpful for other research into this field (assuming the data is privacy-aware) regardless of if this research is related to Fx development or not. 

Furthermore, if we can obtain URLs or at least IP addresses of affected hosts, we may be able to evangelize admins to update their software. However, I do understand that this poses a privacy problem, so we might not be able to gain this data (at least automatically). Also, this data can be skewed by bad (corporate) proxies (however the data is still valuable, and false positives can be detected with little effort).
It seems you need to replace this code in nsSSLIOLayerSetOptions with a call to SSL_VersionRangeSet:

  PRBool enabled;
  if (SECSuccess != SSL_OptionGet(fd, SSL_ENABLE_SSL3, &enabled)) {
    return NS_ERROR_FAILURE;
  }
  infoObject->SetSSL3Enabled(enabled);
  if (SECSuccess != SSL_OptionGet(fd, SSL_ENABLE_TLS, &enabled)) {
    return NS_ERROR_FAILURE;
  }
  infoObject->SetTLSEnabled(enabled);

Then, you may need to remove or replace SetSSL3Enabled and SetTLSEnabled.
bsmith... i'm not sure i understand your comment. the last patch i added does what you just mentioned, though it does a SSL_VersionRangeGet and not a SSL_VersionRangeSet.

it seemed this function was taking information from the NSS managed options and setting the PSM managed nsNSSSocketInfo structure without side-effects. after we retrieve the range from fd with SSL_VersionRangeGet and populate the range in the infoObject structure, i need to call SSL_VersionRangeSet with the same values?
I got the following information from Ivan Ristic, data from ssl labs, data for june 2013:

> - 169,801 no problems
> - 58 servers intolerant to TLS 1.0+
> - 1,691 servers intolerant to TLS 1.1+
> - 4 servers intolerant only to TLS 1.1
> - 56 servers intolerant only to TLS 1.2
>
> 1,809 problematic servers in total.

That's 1.05% that's problematic.

This is only testing for version tolerance without any extensions.
Attached patch new TLS Intolerance logic (obsolete) — Splinter Review
Here's the patch for the most recently discussed algorithm for dealing with intolerant sites:

when we successfully negotiate a TLS connection, we remember the TLS Version we used.

when we fail to negotiate a TLS connection. we remember that as well.

before connecting, we see if there's an entry for the current site:port in the list of successful connections. if so, we use the version in the table as the max tls version we'll try for the new connection.

if there's an entry in the failure (intolerance) table, we use the version in the table minus one.

if we fail to negotiate a connection and the max version is the same as the min version, then we don't ask for a retry.
Attachment #769158 - Flags: review?(bsmith)
Comment on attachment 769158 [details] [diff] [review]
new TLS Intolerance logic

what happens when the user roams to a network behind one of them "strict corporate firewalls"* that doesn't let TLS > 1.0 pass?

* so far, I have not seen any specific examples of such firewalls
(In reply to Alex Xu from comment #16)
> Comment on attachment 769158 [details] [diff] [review]
> new TLS Intolerance logic
> 
> what happens when the user roams to a network behind one of them "strict
> corporate firewalls"* that doesn't let TLS > 1.0 pass?

Such a firewall is AFAICT indistinguishable from a man-in-the-middle attempting a downgrade attack, therefore we should not tolerate it.

I'd also like to second Florian's request (comment 8) for telemetry on the number of affected hosts.  This will be how we know when we can drop support for TLS 1.0 altogether.
Attachment #769158 - Attachment is patch: true
Attachment #769158 - Attachment mime type: text/x-patch → text/plain
Comment on attachment 769158 [details] [diff] [review]
new TLS Intolerance logic

Review of attachment 769158 [details] [diff] [review]:
-----------------------------------------------------------------

::: security/manager/ssl/src/nsNSSIOLayer.cpp
@@ +140,5 @@
>  {
>    return mHasCleartextPhase;
>  }
>  
> +void nsNSSSocketInfo::GetTLSVersionRange( uint16_t *min, uint16_t *max ) {

Please follow the style conventions in the rest of the code. Please read https://developer.mozilla.org/en-US/docs/Developer_Guide/Coding_Style

1. return type on its own line
2. no spaces after opening parentheses is before closing parenthesis.
3. opening brace of a function should be on its own line.
4. "if (x) {" not "if( x ) {"

This will require changes, not just here, but throughout the patch.

@@ +715,5 @@
>    }
>  
>    if (socketInfo->IsSSL3Enabled()) {
>      // Add this site to the list of TLS intolerant sites.
> +    

Remove this blank line (and also trailing whitespace).

@@ +731,5 @@
> +  nsAutoCString key;
> +  getSiteKey(socketInfo, key);
> +
> +  bool isIntolerant = isKnownAsIntolerantSite( key );
> +  if( isIntolerant  ) {

Is it necessary to call isKnownAsIntolerantSite?

Note that there is a race condition (maybe benign?), because isKnownIntolerantSite and removeIntolerantSite both acquire and release the mutex.

@@ +735,5 @@
> +  if( isIntolerant  ) {
> +    removeIntolerantSite(key);
> +  }
> +
> +  return isIntolerant;

It seems like this return value is never used and so the return type should just be "void".

@@ +760,5 @@
> +bool nsSSLIOLayerHelpers::isKnownAsTolerantSite(const nsCString &str)
> +{
> +  MutexAutoLock lock(*mutex);
> +  uint16_t value;
> +  return !!mTLSTolerantSites->Get(str,&value);

C++'s bool type doesn't require "!!".

@@ +2604,5 @@
>    key = nsDependentCString(host) + NS_LITERAL_CSTRING(":") + nsPrintfCString("%d", port);
> +  nsSSLIOLayerHelpers helpers = infoObject->SharedState().IOLayerHelpers();
> +
> +  SSLVersionRange range;
> +  if (SECSuccess != SSL_VersionRangeGet(fd, &range)) {

At this point, |range| will contain the default version range, as determined by the ".min" and ".max" preferences, since we called SSL_VersionRangeSetDefault in nsNSSComponent::Init (or in some function it called) based on the value of those preferences.

I guess you intended to clamp the values of the range you compute below according to the values retrieved from the call above (to prevent the use of any version above the maximum allowed by the pref or any version below the minimum allowed by the other pref). But, the code below doesn't do such clamping.

@@ +2614,5 @@
> +  // in the intolerant table, then use the version one less than the current one.
> +  // If it's in neither, use the range from the SSL_VersionRangeGet() call.
> +
> +  if(helpers.isKnownAsTolerantSite(key)) {
> +    helpers.tolerantAtWhatVersion(key,&(range.max));

Because of race conditions and whatnot, it seems like all of this logic should be moved to a place where it can execute under the protection of a mutex.

Also, it seems wasteful to do a lookup to calculate the result of isKnownAsTolerantSite and then do another lookup for tolerantAtWhatVersion.

In fact, since you've done the work of converting these tables to nsDataHashtables, it seems like you could actually combine both tables into one that maps key -> (maxTolerant, minIntolerant).

(Note: it seems like the code was racy even prior to your modifications, but the new code seems even more racy.)

@@ +2625,5 @@
>      // One advantage of this approach, if a site only supports the older
>      // hellos, it is more likely that we will get a reasonable error code
>      // on our single retry attempt.
> +
> +    infoObject->SetAllowTLSIntoleranceTimeout(false);

SetAllowTLSIntoleranceTimeout should only be done when we are about to try the lowest acceptable version of TLS (currently SSL 3.0). The idea of the original logic that called SetAllowedTLSIntoleranceTimeout is that it would disable the timeout logic when it was falling back to SSL 3.0 because there is nothing further to fall back to anyway.

@@ +2627,5 @@
>      // on our single retry attempt.
> +
> +    infoObject->SetAllowTLSIntoleranceTimeout(false);
> +    helpers.intolerantAtWhatVersion(key,&(range.max));
> +    range.max--;

What prevents range.max-- from underflowing? i.e. what prevents range.min > range.max after this?

@@ +2633,1 @@
>    }

Notice that the original code called SSL_OptionSet(fd, SSL_ENABLE_TLS, false) to disable TLS 1.0. It seems like here you need to call SSL_VersionRangeSet(fd, &range) here; otherwise, all the calculations we do above won't actually affect the range we choose to tell the server we support, unless I am overlooking something.
Attachment #769158 - Flags: review?(brian) → review-
Comment on attachment 764169 [details] [diff] [review]
Use range instead of booleans on socket info data structure

Review of attachment 764169 [details] [diff] [review]:
-----------------------------------------------------------------

Please fold all the patches for this bug together. This will make it easier to understand. In particular, it should be easier to understand which code has become dead code if they are combined together.

::: security/manager/ssl/src/nsNSSIOLayer.h
@@ +95,5 @@
>  
> +  bool IsSSL3Enabled() const { return mMinTLSVersion == NS_TLSV_SSL3; }
> +  bool IsTLSEnabled() const { return mMaxTLSVersion >= NS_TLSV_TLS1; }
> +  void SetSSL3Enabled(bool enabled) { mMinTLSVersion = NS_TLSV_SSL3; }
> +  void SetTLSEnabled(bool enabled) { mMaxTLSVersion = NS_TLSV_MAX; }

My guess is that, when this bug is completed, none of the above functions will be needed and so all of them should be removed. Similarly, the NS_TLSV_* macros you added above should be removed because I doubt any of them will be needed. In the event you do need those values, you can access them using the SSL_LIBRARY_VERSION_* macros in sslproto.h.
Attachment #764169 - Flags: review?(brian)
Attached patch new new TLS intolerance patch (obsolete) — Splinter Review
Merged two patches into one on top of m-c (with -F 3000 :\) and fixed miscellaneous cosmetic issues. (trailing whitespace, blatant dead code)
Attachment #775424 - Flags: review?(brian)
Comment on attachment 775424 [details] [diff] [review]
new new TLS intolerance patch

Review of attachment 775424 [details] [diff] [review]:
-----------------------------------------------------------------

In addition to addressing the comments below, please correct the newly-added code to conform to the formatting style of the existing code and the coding style guide:
https://developer.mozilla.org/en-US/docs/Developer_Guide/Coding_Style

In particular, s/NULL/nullptr/g, replace C-style casts with C++-style casts, s/if(/if (/g, ensure parameters are separated by spaces.

::: security/manager/ssl/src/nsNSSIOLayer.cpp
@@ +800,5 @@
> +  getSiteKey(socketInfo, key);
> +
> +  if (isKnownAsIntolerantSite(key)) {
> +    removeIntolerantSite(key);
> +  }

Can't this just be "removeIntolerantSite"? It seems like otherwise there is a race condition because each method locks and unlocks the mutex independently.

@@ +815,5 @@
>  
> +  uint16_t maxTLSVersion;
> +  socketInfo->GetTLSVersionRange((uint16_t *)NULL, &maxTLSVersion);
> +
> +  PR_LOG(gPIPNSSLog, PR_LOG_DEBUG, ("Remembering TLS tolerant site %s (TLS Version %04X)\n", key.get(), maxTLSVersion));

Do you need to pass maxTLSVersion as static_cast<PRIntn>(maxTLSVersion)? Or, does conversion from uint16_t to int happen automatically?

@@ +1538,3 @@
>    MutexAutoLock lock(*mutex);
>    // Remember intolerant site only if it is not known as tolerant
> +  if (!mTLSTolerantSites->Get(str,&value)) {

Should this be something like this:

if (!mTLSTolerantSites->Get(str,&value) ||
    value < maxTLSVersion)

For example, if the site is known to be tolerant to TLS 1.1 we may still want to remember that it is intolerant to TLS 1.2.

@@ +1560,5 @@
> +}
> +
> +bool nsSSLIOLayerHelpers::intolerantAtWhatVersion(const nsCString &str, uint16_t *maxTLSVersion)
> +{
> +  if( maxTLSVersion ) {

No caller calls this with a null maxTLSVersion so I suggest replacing this check with MOZ_ASSERT(maxTLSVersion).

@@ +1562,5 @@
> +bool nsSSLIOLayerHelpers::intolerantAtWhatVersion(const nsCString &str, uint16_t *maxTLSVersion)
> +{
> +  if( maxTLSVersion ) {
> +    MutexAutoLock lock(*mutex);
> +    return !!mTLSIntolerantSites->Get(str,maxTLSVersion);

No need for the "!!" construct.

@@ +1570,5 @@
> +}
> +
> +bool nsSSLIOLayerHelpers::tolerantAtWhatVersion(const nsCString &str, uint16_t *maxTLSVersion)
> +{
> +  if( maxTLSVersion ) {

ditto.

@@ +2700,5 @@
> +
> +  if(helpers.isKnownAsTolerantSite(key)) {
> +    helpers.tolerantAtWhatVersion(key,&(range.max));
> +    PR_LOG(gPIPNSSLog, PR_LOG_DEBUG, ("found in list of tolerant sites: %s %04X\n", key.get(), range.max));
> +  }

I think this "if (helpers.isKnownAsTolerantSite(key)" block is wrong or at least unnecessary. AFAICT, we will do the right thing if we just delete it.

@@ +2701,5 @@
> +  if(helpers.isKnownAsTolerantSite(key)) {
> +    helpers.tolerantAtWhatVersion(key,&(range.max));
> +    PR_LOG(gPIPNSSLog, PR_LOG_DEBUG, ("found in list of tolerant sites: %s %04X\n", key.get(), range.max));
> +  }
> +  else if(helpers.isKnownAsIntolerantSite(key)) {

Don't call isKnownAsIntolerantSite(key) before intolerantAtWhatVersion. intolerantAtWhatVersion already returns exactly what helpers.isKnownAsIntolerantSite(key) returns.

(One hint that this code is wrong is that you don't check the return value of intolerantAtWhatVersion, even though in the body of intolerantAtWhatVersion you realized that it is important to indicate to the caller whether an entry as found.)

@@ +2712,4 @@
>  
> +    infoObject->SetAllowTLSIntoleranceTimeout(false);
> +    helpers.intolerantAtWhatVersion(key,&(range.max));
> +    range.max--;

AFAICT, this should be something like:

You need to make sure you're not increasing range.max. For example, let's say range.max started out as "TLS 1.0" and let's say helpers.intolerantAtWhatVersion returns "TLS 1.2". Then, you *don't* want to change range.max to "TLS 1.1".

@@ +2717,3 @@
>    }
> +
> +  infoObject->SetTLSVersionRange( range.min, range.max );

The call to SSL_VersionRangeSet is still missing.

::: security/manager/ssl/src/nsNSSIOLayer.h
@@ +10,5 @@
> +#define NS_TLSV_SSL3   0x0300
> +#define NS_TLSV_TLS1   0x0301
> +#define NS_TLSV_TLS1_1 0x0302
> +#define NS_TLSV_TLS1_2 0x0303
> +#define NS_TLSV_MAX    0x0303

None of these are needed.

@@ +95,5 @@
>    }
>  
> +  bool IsSSL3Enabled() const { return mMinTLSVersion == NS_TLSV_SSL3; }
> +  bool IsTLSEnabled() const { return mMaxTLSVersion >= NS_TLSV_TLS1; }
> +  void SetTLSVersionRange(uint16_t min, uint16_t max) { mMinTLSVersion = min; mMaxTLSVersion = max; }

Neither IsSSL3Enabled or IsTLSEnabled are useful anymore and they should be removed.

Nit:

  void SetTLSVersionRange(const SSLVersionRange & range) {
      this.mTLSVersionRange = range;
  }

@@ +130,5 @@
>    CertVerificationState mCertVerificationState;
>  
>    mozilla::psm::SharedSSLState& mSharedState;
> +  uint16_t mMinTLSVersion;
> +  uint16_t mMaxTLSVersion;

Nit: SSLVersionRange mTLSVersionRange;
Attachment #775424 - Flags: review?(brian) → review-
Attached patch TLS intolerance patch (obsolete) — Splinter Review
(In reply to Brian Smith (:briansmith), was bsmith@mozilla.com (:bsmith) from comment #21)
> Comment on attachment 775424 [details] [diff] [review]
> new new TLS intolerance patch
> 
> Review of attachment 775424 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> In addition to addressing the comments below, please correct the newly-added
> code to conform to the formatting style of the existing code and the coding
> style guide:
> https://developer.mozilla.org/en-US/docs/Developer_Guide/Coding_Style
> 
> In particular, s/NULL/nullptr/g, replace C-style casts with C++-style casts,
> s/if(/if (/g, ensure parameters are separated by spaces.

Fixed. Also fixed at least one other spot it was wrong in nsNSSIOLayer.cpp.

> ::: security/manager/ssl/src/nsNSSIOLayer.cpp
> @@ +800,5 @@
> > +  getSiteKey(socketInfo, key);
> > +
> > +  if (isKnownAsIntolerantSite(key)) {
> > +    removeIntolerantSite(key);
> > +  }
> 
> Can't this just be "removeIntolerantSite"? It seems like otherwise there is
> a race condition because each method locks and unlocks the mutex
> independently.

I don't know what happens if nsDataHashtable::Remove is called with a non-existent key, so I moved the check into removeIntolerantSite. The race condition is gone, but possibly it is still not efficient.

> @@ +815,5 @@
> >  
> > +  uint16_t maxTLSVersion;
> > +  socketInfo->GetTLSVersionRange((uint16_t *)NULL, &maxTLSVersion);
> > +
> > +  PR_LOG(gPIPNSSLog, PR_LOG_DEBUG, ("Remembering TLS tolerant site %s (TLS Version %04X)\n", key.get(), maxTLSVersion));
> 
> Do you need to pass maxTLSVersion as static_cast<PRIntn>(maxTLSVersion)? Or,
> does conversion from uint16_t to int happen automatically?

I have no idea. Left as is.

> @@ +1538,3 @@
> >    MutexAutoLock lock(*mutex);
> >    // Remember intolerant site only if it is not known as tolerant
> > +  if (!mTLSTolerantSites->Get(str,&value)) {
> 
> Should this be something like this:
> 
> if (!mTLSTolerantSites->Get(str,&value) ||
>     value < maxTLSVersion)
> 
> For example, if the site is known to be tolerant to TLS 1.1 we may still
> want to remember that it is intolerant to TLS 1.2.

Done.

> @@ +1560,5 @@
> > +}
> > +
> > +bool nsSSLIOLayerHelpers::intolerantAtWhatVersion(const nsCString &str, uint16_t *maxTLSVersion)
> > +{
> > +  if( maxTLSVersion ) {
> 
> No caller calls this with a null maxTLSVersion so I suggest replacing this
> check with MOZ_ASSERT(maxTLSVersion).

Done.

> @@ +1562,5 @@
> > +bool nsSSLIOLayerHelpers::intolerantAtWhatVersion(const nsCString &str, uint16_t *maxTLSVersion)
> > +{
> > +  if( maxTLSVersion ) {
> > +    MutexAutoLock lock(*mutex);
> > +    return !!mTLSIntolerantSites->Get(str,maxTLSVersion);
> 
> No need for the "!!" construct.

Done, and the same for all the others.

> @@ +1570,5 @@
> > +}
> > +
> > +bool nsSSLIOLayerHelpers::tolerantAtWhatVersion(const nsCString &str, uint16_t *maxTLSVersion)
> > +{
> > +  if( maxTLSVersion ) {
> 
> ditto.

ditto.

> @@ +2700,5 @@
> > +
> > +  if(helpers.isKnownAsTolerantSite(key)) {
> > +    helpers.tolerantAtWhatVersion(key,&(range.max));
> > +    PR_LOG(gPIPNSSLog, PR_LOG_DEBUG, ("found in list of tolerant sites: %s %04X\n", key.get(), range.max));
> > +  }
> 
> I think this "if (helpers.isKnownAsTolerantSite(key)" block is wrong or at
> least unnecessary. AFAICT, we will do the right thing if we just delete it.

Thought about it, and that seems reasonable. Done.

> @@ +2701,5 @@
> > +  if(helpers.isKnownAsTolerantSite(key)) {
> > +    helpers.tolerantAtWhatVersion(key,&(range.max));
> > +    PR_LOG(gPIPNSSLog, PR_LOG_DEBUG, ("found in list of tolerant sites: %s %04X\n", key.get(), range.max));
> > +  }
> > +  else if(helpers.isKnownAsIntolerantSite(key)) {
> 
> Don't call isKnownAsIntolerantSite(key) before intolerantAtWhatVersion.
> intolerantAtWhatVersion already returns exactly what
> helpers.isKnownAsIntolerantSite(key) returns.
> 
> (One hint that this code is wrong is that you don't check the return value
> of intolerantAtWhatVersion, even though in the body of
> intolerantAtWhatVersion you realized that it is important to indicate to the
> caller whether an entry as found.)
> 
> @@ +2712,4 @@
> >  
> > +    infoObject->SetAllowTLSIntoleranceTimeout(false);
> > +    helpers.intolerantAtWhatVersion(key,&(range.max));
> > +    range.max--;
> 
> AFAICT, this should be something like:
> 
> You need to make sure you're not increasing range.max. For example, let's
> say range.max started out as "TLS 1.0" and let's say
> helpers.intolerantAtWhatVersion returns "TLS 1.2". Then, you *don't* want to
> change range.max to "TLS 1.1".

I'm not entirely sure what's being done here, so this was left as is.

> @@ +2717,3 @@
> >    }
> > +
> > +  infoObject->SetTLSVersionRange( range.min, range.max );
> 
> The call to SSL_VersionRangeSet is still missing.

Fixed. Although I suspect I'm calling it wrong.

> ::: security/manager/ssl/src/nsNSSIOLayer.h
> @@ +10,5 @@
> > +#define NS_TLSV_SSL3   0x0300
> > +#define NS_TLSV_TLS1   0x0301
> > +#define NS_TLSV_TLS1_1 0x0302
> > +#define NS_TLSV_TLS1_2 0x0303
> > +#define NS_TLSV_MAX    0x0303
> 
> None of these are needed.

Done.

> @@ +95,5 @@
> >    }
> >  
> > +  bool IsSSL3Enabled() const { return mMinTLSVersion == NS_TLSV_SSL3; }
> > +  bool IsTLSEnabled() const { return mMaxTLSVersion >= NS_TLSV_TLS1; }
> > +  void SetTLSVersionRange(uint16_t min, uint16_t max) { mMinTLSVersion = min; mMaxTLSVersion = max; }
> 
> Neither IsSSL3Enabled or IsTLSEnabled are useful anymore and they should be
> removed.

They're called in rememberPossibleTLSProblemSite. I've substituted SSL_LIBRARY_VERSION_* for NS_TLSV_* but otherwise left them as is, as I'm not sure how the logic here is supposed to work. As it is, something's not right here.

> Nit:
> 
>   void SetTLSVersionRange(const SSLVersionRange & range) {
>       this.mTLSVersionRange = range;
>   }
> 
> @@ +130,5 @@
> >    CertVerificationState mCertVerificationState;
> >  
> >    mozilla::psm::SharedSSLState& mSharedState;
> > +  uint16_t mMinTLSVersion;
> > +  uint16_t mMaxTLSVersion;
> 
> Nit: SSLVersionRange mTLSVersionRange;

Fixed. I hope.
Attachment #775424 - Attachment is obsolete: true
Attachment #775638 - Flags: feedback?(mhamrick)
Comment on attachment 775638 [details] [diff] [review]
TLS intolerance patch

Review of attachment 775638 [details] [diff] [review]:
-----------------------------------------------------------------

::: security/manager/ssl/src/nsNSSIOLayer.cpp
@@ +76,5 @@
>      mCertVerificationState(before_cert_verification),
>      mSharedState(aState),
>      mForSTARTTLS(false),
> +    mMinTLSVersion(0),
> +    mMaxTLSVersion(0),

Why do those variables exist when you also have an SSLVersionRange?

@@ +775,4 @@
>  
>    if (socketInfo->IsSSL3Enabled()) {
>      // Add this site to the list of TLS intolerant sites.
> +    addIntolerantSite(key, maxTLSVersion);

What is this maxTLSVersion?  Is that with the current max of the range?
Wouldn't this if make more sense if you checked that max > min?

@@ +2686,4 @@
>  
> +    infoObject->SetAllowTLSIntoleranceTimeout(false);
> +    helpers.intolerantAtWhatVersion(key, &(range.max));
> +    range.max--;

Should there be a check that it doesn't go below min?

::: security/manager/ssl/src/nsNSSIOLayer.h
@@ +89,5 @@
>    }
>  
> +  bool IsSSL3Enabled() const { return mMinTLSVersion == SSL_LIBRARY_VERSION_3_0; }
> +  bool IsTLSEnabled() const { return mMaxTLSVersion >= SSL_LIBRARY_VERSION_TLS_1_0; }
> +  void GetTLSVersionRange() { return this.mTLSVersionRange; }

This should probably return SSLVersionRange instead of void

@@ +124,4 @@
>    CertVerificationState mCertVerificationState;
>  
>    mozilla::psm::SharedSSLState& mSharedState;
> +  SSLVersionRange mTLSVersionRange;

Why is this called mTLSVersionRange and not mSSLVersionRange?  Same goes for the functions.

@@ +187,5 @@
>  
>    static void getSiteKey(nsNSSSocketInfo *socketInfo, nsCSubstring &key);
>    bool rememberPossibleTLSProblemSite(nsNSSSocketInfo *socketInfo);
> +  void forgetPossibleTLSProblemSite(nsNSSSocketInfo *socketInfo);
> +  bool isKnownAsTolerantSite( const nsCString &str);

The known tolerant sites don't seem to be used?  I assume this is supposed to be used to prevent downgrade attacks, but that's not part of this patch?
(In reply to Kurt Roeckx from comment #23)
> Comment on attachment 775638 [details] [diff] [review]
> TLS intolerance patch
> 
> Review of attachment 775638 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: security/manager/ssl/src/nsNSSIOLayer.cpp
> @@ +76,5 @@
> >      mCertVerificationState(before_cert_verification),
> >      mSharedState(aState),
> >      mForSTARTTLS(false),
> > +    mMinTLSVersion(0),
> > +    mMaxTLSVersion(0),
> 
> Why do those variables exist when you also have an SSLVersionRange?

Oops. I only defined it in nsNSSIOLayer.h but didn't initialize it. That should be something like mTLSVersionRange(nullptr) if I'm correct.

> @@ +775,4 @@
> >  
> >    if (socketInfo->IsSSL3Enabled()) {
> >      // Add this site to the list of TLS intolerant sites.
> > +    addIntolerantSite(key, maxTLSVersion);
> 
> What is this maxTLSVersion?  Is that with the current max of the range?
> Wouldn't this if make more sense if you checked that max > min?

I don't know.

> @@ +2686,4 @@
> >  
> > +    infoObject->SetAllowTLSIntoleranceTimeout(false);
> > +    helpers.intolerantAtWhatVersion(key, &(range.max));
> > +    range.max--;
> 
> Should there be a check that it doesn't go below min?

Almost certainly. However, I'm still confused about where/how this is being called, so I left that alone.

> ::: security/manager/ssl/src/nsNSSIOLayer.h
> @@ +89,5 @@
> >    }
> >  
> > +  bool IsSSL3Enabled() const { return mMinTLSVersion == SSL_LIBRARY_VERSION_3_0; }
> > +  bool IsTLSEnabled() const { return mMaxTLSVersion >= SSL_LIBRARY_VERSION_TLS_1_0; }
> > +  void GetTLSVersionRange() { return this.mTLSVersionRange; }
> 
> This should probably return SSLVersionRange instead of void

Uh... yes. That used to be void (int *, int *).

> @@ +124,4 @@
> >    CertVerificationState mCertVerificationState;
> >  
> >    mozilla::psm::SharedSSLState& mSharedState;
> > +  SSLVersionRange mTLSVersionRange;
> 
> Why is this called mTLSVersionRange and not mSSLVersionRange?  Same goes for
> the functions.

Because that's what bsmith said to put there. It does seem that mSSLVersionRange would be a more sensible name, however.

> @@ +187,5 @@
> >  
> >    static void getSiteKey(nsNSSSocketInfo *socketInfo, nsCSubstring &key);
> >    bool rememberPossibleTLSProblemSite(nsNSSSocketInfo *socketInfo);
> > +  void forgetPossibleTLSProblemSite(nsNSSSocketInfo *socketInfo);
> > +  bool isKnownAsTolerantSite( const nsCString &str);
> 
> The known tolerant sites don't seem to be used?  I assume this is supposed
> to be used to prevent downgrade attacks, but that's not part of this patch?

I believe so.

I'm still wondering if we should somehow check for TLS extension intolerance as an indicator of TLS version intolerance. (see comment 2)
Attached patch Updated Fallback Patch (obsolete) — Splinter Review
This patch  is intended to replace  previous patches from Alex  and myself. It's
also supposed  to address issues raised  in previous comments. The  objective of
this patch  is to implement logic  that reliably and gracefully  downgrades from
higher TLS versions to lower ones.  However, a) it does not address the issue of
downgrade attacks and b) it does  not track the implementation of TLS connection
telemetry.

Additional problems addressed by this patch include:

a) how  do we  recover  from  intermittant connection  issues  (like the  server
   normally does 1.2, but an intermittant  problem has caused it to downgrade to
   1.0 or 1.1?)

b) if a server upgrades its maximum supported TLS version during a user browsing
   session, how do  we ensure that we  try the new, higher TLS  version. (i.e. -
   intolerance tracking  shouldn't prevent  the client  from trying  the maximum
   allowable TLS version.)

c) how  do we  properly  reflect  client configuration  changes  (like the  user
   changes the maximum allowable TLS version from 1.0 to 1.1.)

Numerous changes are introduced, most notably:

a) The mSSLEnabled and  mTLSEnabled boolean flags have been removed  in favor of
   the SSLVersionRange member mTLSVersionRange.

b) The  {Is|Set}{SSL|TLS}Enabled functions  have been  removed in  favor of  the
   TLSVersionRangeSet() and TLSVersionRangeGet() functions.

c) The mTLSIntolerantSites  and mTLSTolerantSites  hash tables have  been merged
   into  a  single table  (mTLSToleranceInfo).  Also,  we  are now  storing  max
   tolerant and intolerant revisions in the table, so it has been changed from a
   nsTHashtable to a nsDataHashtable.

Logic  for manipulating  socket  TLS  Version maximums  and  storing server  TLS
intolerance   has   been   moved  into   three   functions:   getRangeForSite(),
tolerantAtVersion() and  intolerantAtVersion(). The first  takes a site  key and
returns a SSLVersionRange  object used to initialize a  socket before attempting
to connect.  The latter two  are used to store  which sites are  (in)tolerant at
which versions.

Theory of Operation

During  secure socket  initialization,  the system  calls the  getRangeForSite()
function  with  the  server's  siteKey  and a  reference  to  a  SSLVersionRange
object. The function examines the state  of the mTLSToleranceInfo hash table and
returns a version range that reflects current configuration settings. This range
object is used to initialize the socket.

  SSLVersionRange range;
  nsSSLIOLayerHelpers& helpers = socketInfo->SharedState().IOLayerHelpers();
  helpers.getRangeForSite(socketInfo, &range);
  SSL_VersionRangeSet(range);

Following  successful  negotiation,  tolerantAtVersion()   is  called  with  the
currently negotiated TLS maximum version.

  SSLVersionRange range;
  nsSSLIOLayerHelpers& helpers = socketInfo->SharedState().IOLayerHelpers();
  SSL_VersionRangeGet(&range);
  helpers.tolerantAtVersion(socketInfo, range.max);

If an  intolerance error is  detected in the callback,  intolerantAtVersion() is
called;  it  returns a  boolean  indicating  whether  the connection  should  be
retried.

  bool wantRetry;
  SSLVersionRange range;
  nsSSLIOLayerHelpers& helpers = socketInfo->SharedState().IOLayerHelpers();
  SSL_VersionRangeGet(&range);
  wantRetry = helpers.intolerantAtVersion(socketInfo, range.max);

A Few Notes about Intolerance Logic

The mTLSIntoleranceInfo hashtable contains a  tIntoleranceEntry for each site we
with which we attempt to negotiate  a TLS connection. The tIntoleranceEntry data
structure  contains  two uint16_t's  initialized  to  0.  If the  values,  named
tolerant  and intolerant,  are non-zero,  then  they represent  the TLS  version
passed in from tolerantAtVersion() or intolerantAtVersion() calles.

The 'tolerant' structure member (if non-zero) represents the TLS version we most
recently successfully negotiated with a particular site. The 'intolerant' member
(if non-zero) represents the version  that most recently failed. The intolerance
logic starts  at the higest  TLS version set  in configuration settings,  so (in
theory)  'tolerant' represents  the highest  common version  between client  and
server.

For reasons described  below, we also establish an invariant  across accesses to
mTLSIntoleranceInfo hashtable such that the intolerant member (if non-zero) must
be  less than  the tolerant  member  (if non-zero).  In other  words, the  patch
enforces the tolerant > intolerant invariant.

When  getRangeForSite() is  called, we  check the  following conditions  for the
tIntoleranceEntry for the current site:

* Are tolerant and intolerant both unset?

  This probably  means we've never tried  to connect to this  site before, so...
  use   default  version   range   set   with  SSL_VersionRangeSetDefault()   in
  nsNSSComponent.cpp.

* Is tolerant unset but intolerant set?

  This  means we  tried to  connect  at the  version stored  in the  intolerance
  member, but failed.  We decrement the this revison and  assuming it's not less
  than the minimum version, use it to connect again.

* Is tolerant set but intolerant unset?

  This means  we connected  to the site  at the version  listed in  the tolerant
  entry without a problem. You might think  this means we should set the maximum
  range to the  version in the tolerant  member. Instead, we use  the default in
  case  the  user  has  raised  (or  lowered)  the  maxium  range  configuration
  setting. If there's no intolerant entry,  it means we didn't have problems the
  last time we tried  this, and if the defaults didn't change  we should get the
  same results.

* Are both tolerant and intolerant set?

  This means we  probably started with a  TLS version that was too  high for the
  server so  we backed off  until we found  one it could  handle. So we  use the
  version in the tolerant member as the max version for the version range.

If we encounter an intermittant error when negotiating a session, it is possible
for a call to tolerantAtVersion to store a tolerant version that is greater than
or equal to a version that had previously been intolerant. This happens after we
recover from the downgrade. In this situation, we delete the intolerant value so
the next time we connect we will start with the default maximum version from the
configuration settings.
Attachment #764169 - Attachment is obsolete: true
Attachment #769158 - Attachment is obsolete: true
Attachment #775638 - Attachment is obsolete: true
Attachment #775638 - Flags: feedback?(mhamrick)
Attachment #777710 - Flags: review?(brian)
Reformatted + spell checked: (hard wrap is IMPOSSIBLE to read with email):

This patch is intended to replace previous patches from Alex and myself. It's also supposed to address issues raised in previous comments. The objective of this patch is to implement logic that reliably and gracefully downgrades from higher TLS versions to lower ones. However, a) it does not address the issue of downgrade attacks and b) it does not track the implementation of TLS connection telemetry.

Additional problems addressed by this patch include:

a) how do we recover from intermittent connection issues (like the server normally does 1.2, but an intermittent problem has caused it to downgrade to 1.0 or 1.1?)

b) if a server upgrades its maximum supported TLS version during a user browsing session, how do we ensure that we try the new, higher TLS version. (i.e. - intolerance tracking shouldn't prevent the client from trying the maximum allowable TLS version.)

c) how do we properly reflect client configuration changes (like the user changes the maximum allowable TLS version from 1.0 to 1.1.)

Numerous changes are introduced, most notably:

a) The mSSLEnabled and mTLSEnabled boolean flags have been removed in favor of the SSLVersionRange member mTLSVersionRange.

b) The {Is|Set}{SSL|TLS}Enabled functions have been removed in favor of the TLSVersionRangeSet() and TLSVersionRangeGet() functions.

c) The mTLSIntolerantSites and mTLSTolerantSites hash tables have been merged into a single table (mTLSToleranceInfo). Also, we are now storing max tolerant and intolerant revisions in the table, so it has been changed from a nsTHashtable to a nsDataHashtable.

Logic for manipulating socket TLS Version maximums and storing server TLS intolerance has been moved into three functions: getRangeForSite(), tolerantAtVersion() and intolerantAtVersion(). The first takes a site key and returns a SSLVersionRange object used to initialize a socket before attempting to connect. The latter two are used to store which sites are (in)tolerant at which versions.

Theory of Operation

During secure socket initialization, the system calls the getRangeForSite() function with the server's siteKey and a reference to a SSLVersionRange object. The function examines the state of the mTLSToleranceInfo hash table and returns a version range that reflects current configuration settings. This range object is used to initialize the socket.

  SSLVersionRange range;
  nsSSLIOLayerHelpers& helpers = socketInfo->SharedState().IOLayerHelpers();
  helpers.getRangeForSite(socketInfo, &range);
  SSL_VersionRangeSet(range);


Following successful negotiation, tolerantAtVersion() is called with the currently negotiated TLS maximum version.

  SSLVersionRange range;
  nsSSLIOLayerHelpers& helpers = socketInfo->SharedState().IOLayerHelpers();
  SSL_VersionRangeGet(&range);
  helpers.tolerantAtVersion(socketInfo, range.max);


If an intolerance error is detected in the callback, intolerantAtVersion() is called; it returns a boolean indicating whether the connection should be retried.

  bool wantRetry;
  SSLVersionRange range;
  nsSSLIOLayerHelpers& helpers = socketInfo->SharedState().IOLayerHelpers();
  SSL_VersionRangeGet(&range);
  wantRetry = helpers.intolerantAtVersion(socketInfo, range.max);


A Few Notes about Intolerance Logic

The mTLSIntoleranceInfo hashtable contains a tIntoleranceEntry for each site with which we attempt to negotiate a TLS connection. The tIntoleranceEntry data structure contains two uint16_t's initialized to 0. If the values, named tolerant and intolerant, are non-zero, then they represent the TLS version passed in from tolerantAtVersion() or intolerantAtVersion() calls.

The 'tolerant' structure member (if non-zero) represents the TLS version we most recently successfully negotiated with a particular site. The 'intolerant' member (if non-zero) represents the version that most recently failed. The intolerance logic starts at the highest TLS version set in configuration settings, so (in theory) 'tolerant' represents the highest common version between client and server.

For reasons described below, we also establish an invariant across accesses to mTLSIntoleranceInfo hashtable such that the intolerant member (if non-zero) must be less than the tolerant member (if non-zero). In other words, the patch enforces the tolerant > intolerant invariant.

When getRangeForSite() is called, we check the following conditions for the tIntoleranceEntry for the current site:

* Are tolerant and intolerant both unset?

  This probably means we've never tried to connect to this site before, so... use default version range set with SSL_VersionRangeSetDefault() in nsNSSComponent.cpp.

* Is tolerant unset but intolerant set?

  This means we tried to connect at the version stored in the intolerance member, but failed. We decrement the intolerant version and assuming it's not less than the minimum version, use it to connect again.

* Is tolerant set but intolerant unset?

  This means we connected to the site at the version listed in the tolerant entry without a problem. You might think this means we should set the maximum range to the version in the tolerant member. Instead, we use the default in case the user has raised (or lowered) the maximum range configuration setting. If there's no intolerant entry, it means we didn't have problems the last time we tried this, and if the defaults didn't change we should get the same results.

* Are both tolerant and intolerant set?

  This means we probably started with a TLS version that was too high for the server so we backed off until we found one it could handle. So we use the version in the tolerant member as the max version for the version range.

If we encounter an intermittent error when negotiating a session, it is possible for a call to tolerantAtVersion to store a tolerant version that is greater than or equal to a version that had previously been intolerant. This happens after we recover from the downgrade. In this situation, we delete the intolerant value so the next time we connect we will start with the default maximum version from the configuration settings.
Comment on attachment 777710 [details] [diff] [review]
Updated Fallback Patch

Review of attachment 777710 [details] [diff] [review]:
-----------------------------------------------------------------

In addition to addressing the comments below, please look at the splinter view of the patch to find all the trailing whitespace you added, and remove it.

::: security/manager/ssl/src/nsNSSCallbacks.cpp
@@ +1019,2 @@
>    nsSSLIOLayerHelpers& ioLayerHelpers = infoObject->SharedState().IOLayerHelpers();
> +  if (SECSuccess == SSL_VersionRangeGet(fd, &range)) {

You can replace the call to SSL_VersionRangeGet with infoOject->GetTLSVersionRange, which is guaranteed to not fail. Then you can avoid the conditional logic.

@@ +1019,3 @@
>    nsSSLIOLayerHelpers& ioLayerHelpers = infoObject->SharedState().IOLayerHelpers();
> +  if (SECSuccess == SSL_VersionRangeGet(fd, &range)) {
> +    infoObject->TLSVersionRangeSet(range);

This was already done when the socket was set up, so no need to do it again.

::: security/manager/ssl/src/nsNSSIOLayer.cpp
@@ +74,5 @@
>  nsNSSSocketInfo::nsNSSSocketInfo(SharedSSLState& aState, uint32_t providerFlags)
>    : mFd(nullptr),
>      mCertVerificationState(before_cert_verification),
>      mSharedState(aState),
> +    mTLSVersionRange({0,0}),

Does this build on all platforms? (did you push it to tryserver?)

@@ +597,1 @@
>    }

Let's just make mTLSIntoleranceInfo a smart pointer so that this block isn't needed.

@@ +751,5 @@
>  {
> +  IntoleranceEntry entry;
> +  MutexAutoLock lock(*mutex);
> +  return !!mTLSIntoleranceInfo->Get(key, &entry);
> +}

I think isKnownAsIntolerantSite is dead code now.

@@ +754,5 @@
> +  return !!mTLSIntoleranceInfo->Get(key, &entry);
> +}
> +
> +void
> +nsSSLIOLayerHelpers::tolerantAtVersion(nsNSSSocketInfo *socketInfo, uint16_t max)

It seems like "rememberTolerantAtVersion" would be a better name for this function.

@@ +765,5 @@
> +  MutexAutoLock lock(*mutex);
> +  if (!mTLSIntoleranceInfo->Get(key, &entry)) {
> +    entry.intolerant = 0;
> +  } 
> +  entry.tolerant = max;

Shouldn't this be entry.tolerant = std::max(entry.tolerant, max)?

@@ +768,5 @@
> +  } 
> +  entry.tolerant = max;
> +
> +  // Enforce invariant on intolerance info hashtable
> +  if ((0 != entry.intolerant)&&(entry.intolerant <= entry.tolerant)) {

add spaces around "&&"

@@ +769,5 @@
> +  entry.tolerant = max;
> +
> +  // Enforce invariant on intolerance info hashtable
> +  if ((0 != entry.intolerant)&&(entry.intolerant <= entry.tolerant)) {
> +    entry.intolerant = 0;

why not entry.intolerant = entry.tolerant + 1?

@@ +774,3 @@
>    }
>  
> +  mTLSIntoleranceInfo->Put(key,entry);

add space after the comma.

@@ +776,5 @@
> +  mTLSIntoleranceInfo->Put(key,entry);
> +}
> +
> +bool
> +nsSSLIOLayerHelpers::intolerantAtVersion(nsNSSSocketInfo *socketInfo, uint16_t max)

Again, it seems like "rememberIntolerantAtVersion" is a better name since it makes it clearer that this function is saving some information.

Also, please wrap lines at 80 characters.

It seems you don't need to pass "max" in as a parameter because it is redundant with range.max from GetTLSVersionRange().

Please add a comment above this method indicating the significance of the return value.

@@ +788,5 @@
> +  MutexAutoLock lock(*mutex);
> +  if (!mTLSIntoleranceInfo->Get(key, &entry)) {
> +    entry.tolerant = 0;
> +  } 
> +  entry.intolerant = max;

entry.intolerant = std::min(entry.intolerant, range.max)?

@@ +789,5 @@
> +  if (!mTLSIntoleranceInfo->Get(key, &entry)) {
> +    entry.tolerant = 0;
> +  } 
> +  entry.intolerant = max;
> +

Why don't you need to "Enforce invariant on intolerance info hashtable" like you did in the other method? It seems like a good idea to be consistent.

@@ +790,5 @@
> +    entry.tolerant = 0;
> +  } 
> +  entry.intolerant = max;
> +
> +  mTLSIntoleranceInfo->Put(key,entry);

space after comma.

@@ +792,5 @@
> +  entry.intolerant = max;
> +
> +  mTLSIntoleranceInfo->Put(key,entry);
> +
> +  return range.max != (range.min + 1);

shouldn't it be this?:

return range.min < entry.intolerance;

@@ +796,5 @@
> +  return range.max != (range.min + 1);
> +}
> +
> +bool
> +nsSSLIOLayerHelpers::getRangeForSite(nsNSSSocketInfo *socketInfo, SSLVersionRange *range, bool *useDefault)

This function shouldn't return a boolean result. It must be defined in such a way that it always calculates a reasonable result. Then you can make useDefault the return value instead of an out parameter.

Please add a comment above this function that documents its effects on the range in/out parameter and describes its return value and/or outputs.

In general, please label output parameters /*out*/ and in/out parameters /*in/out*/. (If you follow my suggestion, there will be no such parameters in this function.)

Since range seems to be an in/out parameter, a better name for the function would be "adjustVersionRange".

Also, wrap to 80 characters.

Note to self: re-review this function carefully when it gets updated.

@@ +800,5 @@
> +nsSSLIOLayerHelpers::getRangeForSite(nsNSSSocketInfo *socketInfo, SSLVersionRange *range, bool *useDefault)
> +{
> +  IntoleranceEntry entry;
> +  nsCString key;
> +  MutexAutoLock lock(*mutex);

I think you were intending to get the intolerance entry for the site after acquiring the mutex, but it seems that call is missing. As a result, entry is used uninitialized below.

When you update the code to initialize entry, then after that initialization you should have a MOZ_ASSERT that verifies that the invariant is true. That will simplify the logic below and make it clear that this function will always be able to calculate a reasonable version range.

@@ +808,5 @@
> +      *useDefault = true;
> +    } else {
> +      range->max--;
> +      if (range->max <= range->min) {
> +        return false;

range->min == range->max is an acceptable version range.

And, because of the invariant you enforce, range->max < range.min can never happen.

@@ +828,1 @@
>    }

I am pretty sure that all of these nested ifs can be simplified to:


// Rely on the invariant
MOZ_ASSERT(entry.intolerant == 0 || entry.intolerant > entry.tolerant);

// note entry.intolerant > range->min implies
// entry.intolerant != 0
if (entry.intolerant > range->min &&
    entry.intolerant <= range->max) {
  range->max = entry.intolerant - 1;
}

Or similar.

@@ +1476,4 @@
>    // Initialize the tolerant site hashtable to 16 items at the start seems
>    // reasonable as most servers are TLS tolerant. We just want to lower 
>    // the rate of hashtable array reallocation.
> +  mTLSIntoleranceInfo->Init(16);

You can remove this. This kind of thing doesn't matter any more. Plus the user probably isn't going to encounter an intolerant server at all.

@@ +2621,5 @@
> +    }
> +  }
> +
> +  PR_LOG(gPIPNSSLog, PR_LOG_DEBUG,
> +         ("Setting Range, min: %0X4, max: %04X\n", range.min, range.max));

This code seems resaonable.

Why did you put it here instead of in nsSSLIOLayerSetOptions where the version intolerance logic was previously? It seems to me like it should all be in the same place, instead of being split into two places.

@@ +2658,1 @@
>      infoObject->SetAllowTLSIntoleranceTimeout(false);

This logic isn't correct. The idea of this logic is that if we already fell back from TLS 1.0 to SSL 3.0, then we shouldn't let the SSL 3.0 connection time out for intolerance since there would be nothing to fall back through.

AFAICT, the condition needs to be changed to "range.min < range.max" where range is the range calculated based on the intolerance info.

::: security/manager/ssl/src/nsNSSIOLayer.h
@@ +62,5 @@
>  
>    void SetAllowTLSIntoleranceTimeout(bool aAllow);
>  
> +  void TLSVersionRangeSet(SSLVersionRange range) { mTLSVersionRange = range; }
> +  SSLVersionRange TLSVersionRangeGet() { return mTLSVersionRange; };

In Gecko, we make Get/Set prefixes, not suffixes. Please rename these.

@@ +175,5 @@
>    static PRIOMethods nsSSLIOLayerMethods;
>    static PRIOMethods nsSSLPlaintextLayerMethods;
>  
>    mozilla::Mutex *mutex;
> +  nsDataHashtable<nsCStringHashKey,IntoleranceEntry> *mTLSIntoleranceInfo;

This doesn't need to be a pointer. It can just be:

nsDataHashtable<nsCStringHashKey, IntoleranceEntry> mTLSIntoleranceInfo;
Attachment #777710 - Flags: review?(brian) → review-
(In reply to Alex Xu from comment #26)
> Additional problems addressed by this patch include:
> 
> a) how do we recover from intermittent connection issues (like the server
> normally does 1.2, but an intermittent problem has caused it to downgrade to
> 1.0 or 1.1?)
> 
> b) if a server upgrades its maximum supported TLS version during a user
> browsing session, how do we ensure that we try the new, higher TLS version.
> (i.e. - intolerance tracking shouldn't prevent the client from trying the
> maximum allowable TLS version.)
> 
> c) how do we properly reflect client configuration changes (like the user
> changes the maximum allowable TLS version from 1.0 to 1.1.)

You say those problems are addressed by this patch?  How?

I've been wondering if we should save the tolerant and intolerant version, and when we should reset them.

I'm guessing we don't want to do this version rollback on each connection attempt, and want to save at it at least for some time.  I could live with the intolerant version being reset after restarting the browser.

Only to prevent the downgrade attacks I want to keep the tolerant version, but I think this patch is not supposed to address that.
Attached patch Non Working Patch (obsolete) — Splinter Review
hey brian... this works with local testing but causes the SSL engine to fail in interesting ways. can you take a quick look at it and see if you can figure out what i'm doing wrong?
Attachment #781323 - Flags: feedback?(brian)
(In reply to Meadhbh Hamrick :mhamrick from comment #29)
> Created attachment 781323 [details] [diff] [review]
> Non Working Patch
> 
> hey brian... this works with local testing but causes the SSL engine to fail
> in interesting ways. can you take a quick look at it and see if you can
> figure out what i'm doing wrong?

Define "interesting ways."

Like I said before, it is confusing that some of the logic is in nsSSLIOLayerImportFD and another part is in nsSSLIOLayerSetOptions. I suggest that you move all the intolerance logic in nsSSLIOLayerImportFD to nsSSLIOLayerSetOptions to see if that clarifies things.

The logic in nsSSLIOLayerSetOptions looks strange. First, it calls infoObject->getTLSVersionRange to get the version range calculated in nsNSSIOLayerImportFD. Then, later it calls infoObject->setTLSVersionRange to change that range. It would be clearer if you only called infoObject->setTLSVersionRange once at the same time that you call SSL_VersionRangeSet.

I suspect that if you clarify that stuff the failures you are experiencing may get corrected.
Attachment #781323 - Flags: feedback?(brian)
okay. we originally said we were going to put all the logic in nsSSLIOLayerImportFD, but i'll put it all in nsSSLIOLayerSetOptions now.
Attached patch yet another diff (obsolete) — Splinter Review
meh. reverted back to last working diff. hopefully this will help the next person who works on this.
Attachment #777710 - Attachment is obsolete: true
Attachment #781323 - Attachment is obsolete: true
Attachment #784079 - Flags: review?(brian)
Assignee: mhamrick → nobody
Assignee: nobody → brian
Wan-Teh: If you have time, could you please review the logic that is tested with the GTest test TLSIntoleranceTest.cpp: rememberIntolerantAtVersion, rememberTolerantAtVersion, and adjustForTLSIntolerance?

David: I recommend reading TLSIntoleranceTest.cpp first, to get an idea of the expected behavior.
Attachment #784079 - Attachment is obsolete: true
Attachment #784079 - Flags: review?(brian)
Attachment #785566 - Flags: review?(wtc)
Attachment #785566 - Flags: review?(dkeeler)
Wan-Teh: Do you know of any examples of straight-up TLS 1.1 and/or TLS 1.2 version intolerance that are live on the internet now? Also, in this WIP, I have just hard-coded the intolerance into ssl3_HandleClientHello. It would be useful if we could add an option to libssl to let it simulate TLS intolerance. Do you think it would be bad to add such an option?

David: As you can see, this test was derived heavily from your OCSP stapling test. I used "hg cp" so that the patch contains diffs between your OCSP stapling test code and the TLS intolerance test code. We should refactor this stuff to avoid the code duplication. Let me know if you have suggestions for how to do that.
Attachment #785567 - Flags: feedback?(wtc)
Attachment #785567 - Flags: feedback?(dkeeler)
Attachment #785566 - Attachment description: Expand TLS intolerance logic to work for versions beyond TLS 1.0, r?keeler, r?cviecco → Expand TLS intolerance logic to work for versions beyond TLS 1.0
(In reply to Brian Smith (:briansmith), was bsmith@mozilla.com (:bsmith) from comment #34)
> Wan-Teh: Do you know of any examples of straight-up TLS 1.1 and/or TLS 1.2
> version intolerance that are live on the internet now?

One site I know that still has a problem with a 1.2 ClientHello is: www.eboekhuis.nl
(It will just redirect for that URL)
(In reply to Kurt Roeckx from comment #35)
> (In reply to Brian Smith (:briansmith), was bsmith@mozilla.com (:bsmith)
> from comment #34)
> > Wan-Teh: Do you know of any examples of straight-up TLS 1.1 and/or TLS 1.2
> > version intolerance that are live on the internet now?
> 
> One site I know that still has a problem with a 1.2 ClientHello is:
> www.eboekhuis.nl
> (It will just redirect for that URL)

Thanks. I verified that this does indeed exercise the TLS intolerance logic:

74224[8c8f328]: [2564bb8] nsSSLIOLayerSetOptions: using TLS version range (0x0300,0x0303)
74224[8c8f328]: [2564bb8] checkHandshake: will retry with lower max TLS version
74224[8c8f328]: [2564bb8] nsSSLIOLayerSetOptions: using TLS version range (0x0300,0x0302)
74224[8c8f328]: [2564bb8] checkHandshake: will retry with lower max TLS version
74224[8c8f328]: [2564bb8] nsSSLIOLayerSetOptions: using TLS version range (0x0300,0x0301)
74224[8c8f328]: [17246ab8] HandshakeCallback: succeeded using TLS version range (0x0300,0x0301)
 
This server sends an "unexpected message" alert in response to a TLS 1.2 client hello and it sends a "close notify" alert in response to a TLS 1.1 client hello.

I also verified behavior similar to Firefox for Chrome Canary using Wireshark. It seems MSIE falls back immediately to TLS 1.0 when the TLS 1.2 handshake fails.
(In reply to Brian Smith (:briansmith), was bsmith@mozilla.com (:bsmith) from comment #36)
> It seems MSIE falls back immediately to TLS 1.0 when the TLS 1.2
> handshake fails.

I hope you don't want to emulate that behaviour. :)
Though it will need another roundtrip to check for TLS 1.1, IMHO it's always better to always try to use the highest available security. Thus, if a server does not support TLS 1.2 but TLS 1.1, the latter will be used; if it supports neither, the connection setup will take longer, however that situation will improve in the future as more endpoints support TLS 1.1 or TLS 1.2.
I think it's unlikely that if a server is intolerant to 1.2 that it will be tolerant to 1.1 and it might be an option to fall back to 1.0 in this case.  Reading the rollback draft, it also seems to be saying to fall back to 1.0 directly.
Comment on attachment 785566 [details] [diff] [review]
Expand TLS intolerance logic to work for versions beyond TLS 1.0

Review of attachment 785566 [details] [diff] [review]:
-----------------------------------------------------------------

r=wtc.

Although I read every line of code in this patch, I didn't check the code
dealing with mTLSIntoleranceInfo in nsNSSIOLayer.cpp carefully. So please
doublecheck that logic yourself.

::: security/manager/ssl/src/TransportSecurityInfo.h
@@ +49,5 @@
>    nsresult SetSecurityState(uint32_t aState);
>    nsresult SetShortSecurityDescription(const PRUnichar *aText);
>  
> +  const nsACString & GetHostName() const { return mHostName; }
> +  const char * GetHostNameRaw() const { return mHostName.get(); }

Nit: it seems that this method (which returns a char * as a return value)
should have the same name as the GetHostName method below, which returns
a char * as an output argument. Perhaps the GetHostName method that returs
a nsACString should be renamed instead?

::: security/manager/ssl/src/nsNSSCallbacks.cpp
@@ +1019,5 @@
> +  SSLVersionRange versions(infoObject->GetTLSVersionRange());
> +
> +  PR_LOG(gPIPNSSLog, PR_LOG_DEBUG,
> +         ("[%p] HandshakeCallback: succeeded using TLS version range (0x%04x,0x%04x)\n",
> +          fd, static_cast<int>(versions.min), static_cast<int>(versions.max)));

It's better to cast to unsigned int.

::: security/manager/ssl/src/nsNSSIOLayer.cpp
@@ -69,5 @@
>  
>  #ifdef PR_LOGGING
>  extern PRLogModuleInfo* gPIPNSSLog;
>  #endif
> -

Nit: I would keep this blank line.

@@ +649,5 @@
>        nsIInterfaceRequestor *csi = static_cast<nsIInterfaceRequestor*>(socketInfo);
> +
> +      nsCString hostWithPortString;
> +      getSiteKey(socketInfo->GetHostName(), socketInfo->GetPort(),
> +                 hostWithPortString);

You should be able to call
    getSiteKey(*socketInfo, hostWithPortString);

@@ +791,5 @@
> +  if (mTLSIntoleranceInfo.Get(key, &entry)) {
> +    entry.AssertInvariant();
> +    if (intolerant <= entry.tolerant) {
> +      // We already know the server is tolerant at an equal or higher version.
> +      return false;

Should we reset entry.intolerant to 0 in this case?

@@ +1205,5 @@
>    return result;
>  }
>  
>  nsSSLIOLayerHelpers::nsSSLIOLayerHelpers()
> +: mutex("nsSSLIOLayerHelpers.mutex")

If you rename 'mutex' to 'mMutex' as I suggested, the mutex name should also be updated.

@@ +2652,5 @@
> +    .adjustForTLSIntolerance(infoObject->GetHostName(), infoObject->GetPort(),
> +                             range);
> +  PR_LOG(gPIPNSSLog, PR_LOG_DEBUG,
> +         ("[%p] nsSSLIOLayerSetOptions: using TLS version range (0x%04x,0x%04x)\n",
> +          fd, static_cast<int>(range.min), static_cast<int>(range.max)));

It's better to cast to unsigned int.

::: security/manager/ssl/src/nsNSSIOLayer.h
@@ +122,5 @@
>    CertVerificationState mCertVerificationState;
>  
>    mozilla::psm::SharedSSLState& mSharedState;
>    bool mForSTARTTLS;
> +  SSLVersionRange mTLSVersionRange;

Nit: maybe this member should be named mSSLVersionRange? It's up to you.

@@ +189,5 @@
> +      MOZ_ASSERT(intolerant == 0 || tolerant < intolerant);
> +    }
> +  };
> +  nsDataHashtable<nsCStringHashKey, IntoleranceEntry> mTLSIntoleranceInfo;
> +public:

It seems bad to have multiple 'private' and 'public' sections in a class.
Is this allowed by the codig style guidelines?

@@ +190,5 @@
> +    }
> +  };
> +  nsDataHashtable<nsCStringHashKey, IntoleranceEntry> mTLSIntoleranceInfo;
> +public:
> +  void rememberTolerantAtVersion(const nsACString & hostname, int16_t port,

Your use of a space between '&' and the argument name is different from the
style of existing code, which puts '&' right next to the argument name.

@@ +204,5 @@
>  
>    bool mFalseStartRequireNPN;
>    bool mFalseStartRequireForwardSecrecy;
>  private:
> +  mozilla::Mutex mutex;

It would be nice to document which members this mutex protects.
I think it protects mTLSIntoleranceInfo, mRenegoUnrestrictedSites,
mTreatUnsafeNegotiationAsBroken, and mWarnLevelMissingRFC5746.

Also, this mutex should be named mMutex to follow the naming convention of class members.

::: security/manager/ssl/tests/gtest/TLSIntoleranceTest.cpp
@@ +5,5 @@
> +
> +NS_NAMED_LITERAL_CSTRING(HOST, "example.org");
> +const int16_t PORT = 443;
> +
> +// The fixture for testing class Foo.

Should "Foo" be replaced by the name of a specific class?
Attachment #785566 - Flags: review?(wtc) → review+
Comment on attachment 785566 [details] [diff] [review]
Expand TLS intolerance logic to work for versions beyond TLS 1.0

Review of attachment 785566 [details] [diff] [review]:
-----------------------------------------------------------------

Looks great - I certainly don't see any show-stoppers. I have a few questions about edge cases, though.
Also, just a note: I would have a build peer take a look at the moz.build/Makefile.in changes.

::: security/manager/ssl/src/SSLServerCertVerification.cpp
@@ +319,5 @@
>      = do_GetService(NS_STSSERVICE_CONTRACTID, &nsrv);
>    if (NS_SUCCEEDED(nsrv)) {
>      nsCOMPtr<nsISSLSocketControl> sslSocketControl = do_QueryInterface(
>        NS_ISUPPORTS_CAST(nsITransportSecurityInfo*, mInfoObject));
> +    nsrv = stss->IsStsHost(mInfoObject->GetHostNameRaw(),

Depending on when this lands, bug 887052 may cause a merge conflict with this change. (nsIStrictTransportSecurityService gets replaced with nsISiteSecurityService, etc.)

::: security/manager/ssl/src/nsNSSIOLayer.cpp
@@ +747,5 @@
> +
> +  IntoleranceEntry entry;
> +  if (mTLSIntoleranceInfo.Get(key, &entry)) {
> +    entry.AssertInvariant();
> +    entry.tolerant = std::max(entry.tolerant, tolerant);

Would it ever be the case that we would call rememberTolerantAtVersion with tolerant < entry.tolerant?

@@ +794,5 @@
> +      // We already know the server is tolerant at an equal or higher version.
> +      return false;
> +    }
> +    if ((entry.intolerant != 0 && intolerant >= entry.intolerant)) {
> +      // We already know that the server is intolerant at a lower version.

Again, I'm not clear on when this would happen. If it does, is it an error/warning?

@@ +830,5 @@
> +
> +  if (entry.intolerant != 0) {
> +    // We've tried connecting at a higher range but failed, so try at the
> +    // version we haven't tried yet, unless we have reached the minimum.
> +    if (range.min < entry.intolerant) {

What if range.min >= entry.intolerant? Should this be an error?

::: security/manager/ssl/src/nsNSSIOLayer.h
@@ +190,5 @@
> +    }
> +  };
> +  nsDataHashtable<nsCStringHashKey, IntoleranceEntry> mTLSIntoleranceInfo;
> +public:
> +  void rememberTolerantAtVersion(const nsACString & hostname, int16_t port,

I would say upper-case the initial letter of these functions, but it looks like that would make this file's style inconsistent, so maybe that can be part of some future cleanup.

::: security/manager/ssl/tests/gtest/Makefile.in
@@ +1,1 @@
> +#

Probably don't need this first line.

::: security/manager/ssl/tests/gtest/TLSIntoleranceTest.cpp
@@ +1,1 @@
> +#include "nsNSSIOLayer.h"

OK - I think I understand the desired behavior in these tests. It might help to have a block comment that explains what's supposed to happen or is a link to a document that does. (Or maybe that should go in the implementation itself.)

::: security/manager/ssl/tests/gtest/moz.build
@@ +1,1 @@
> +MODULE = 'ssltest'

I think this file needs the boilerplate mode lines and license block the others have.
Attachment #785566 - Flags: review?(dkeeler) → review+
(In reply to Brian Smith (:briansmith), was bsmith@mozilla.com (:bsmith) from comment #34)
> 
> Wan-Teh: Do you know of any examples of straight-up TLS 1.1 and/or TLS 1.2
> version intolerance that are live on the internet now?

No. You can try searching in the Chromium issue tracker for "TLS 1.1" to find
them. But some intolerance is actually caused by a firewall device on a corporate
network. The firewall-injected TCP reset was the last TLS 1.1 intolerance I worked
on, which is why I no longer remember the websites that are TLS 1.1 intolerant.

> Also, in this WIP, I
> have just hard-coded the intolerance into ssl3_HandleClientHello. It would
> be useful if we could add an option to libssl to let it simulate TLS
> intolerance. Do you think it would be bad to add such an option?

Yes, this should be useful for testing. (I do note that this would be the first
option of this kind.)
Comment on attachment 785567 [details] [diff] [review]
WIP: Add xpcshell integration test of TLS intolerance logic with the PSM IO layer

Review of attachment 785567 [details] [diff] [review]:
-----------------------------------------------------------------

(In reply to Brian Smith (:briansmith), was bsmith@mozilla.com (:bsmith) from comment #34)
> David: As you can see, this test was derived heavily from your OCSP stapling
> test. I used "hg cp" so that the patch contains diffs between your OCSP
> stapling test code and the TLS intolerance test code. We should refactor
> this stuff to avoid the code duplication. Let me know if you have
> suggestions for how to do that.

My only suggestion would be to refactor that file into a generalized TLSTestingServer (or something) and add command-line switches (or continue with the environment-variable business).
Also, the shared code in the two xpcshell tests can probably go in head_psm.js.
Attachment #785567 - Flags: feedback?(dkeeler) → feedback+
Comment on attachment 785567 [details] [diff] [review]
WIP: Add xpcshell integration test of TLS intolerance logic with the PSM IO layer

Review of attachment 785567 [details] [diff] [review]:
-----------------------------------------------------------------

::: security/manager/ssl/tests/unit/test_tls_intolerance.js
@@ +160,5 @@
>  
>  function run_test_body() {
> +  add_connection_test("localhost", Cr.NS_ERROR_NET_RESET); // TLS 1.2
> +  add_connection_test("localhost", Cr.NS_ERROR_NET_RESET); // TLS 1.1
> +  add_connection_test("localhost", Cr.NS_ERROR_NET_RESET); // TLS 1.0

Nit: it is not necessary to allow a "connection reset" error to cause a
fallback from TLS 1.0 to SSL 3.0. I have only observed an "EOF" error
(graceful TCP connection closure) or some sort of TLS/SSL error alert
message in response to a TLS 1.0 ClientHello.

::: security/nss/lib/ssl/ssl3con.c
@@ +7158,5 @@
>  
> +    if (ss->clientHelloVersion > SSL_LIBRARY_VERSION_3_0) {
> +      errCode = PR_INVALID_STATE_ERROR;
> +      goto loser;
> +    }

Is this change for simulating a TLS-intolerant server?
It would be nice to be able to control what the response
is (close the TCP connection, send a TCP reset packet, or
send a specific TLS/SSL error alert message).
Attachment #785567 - Flags: feedback?(wtc) → feedback+
Thanks for the careful review Wan-Teh. I really appreciate it.

(In reply to Wan-Teh Chang from comment #39)
> @@ +791,5 @@
> > +  if (mTLSIntoleranceInfo.Get(key, &entry)) {
> > +    entry.AssertInvariant();
> > +    if (intolerant <= entry.tolerant) {
> > +      // We already know the server is tolerant at an equal or higher version.
> > +      return false;
> 
> Should we reset entry.intolerant to 0 in this case?

Let's say that we attempted TLS 1.2 and that failed and then we attempted TLS 1.1 and that succeeded. It should be the case that we never call this function with intolerant < TLS 1.2 anyway. But, if we did, we wouldn't want to forget that we were intolerant to TLS 1.2.

> It seems bad to have multiple 'private' and 'public' sections in a class.
> Is this allowed by the codig style guidelines?

I will see if it is clearer to rearrange the code.

I will also make the other changes you suggested.
Thanks for reviewing the patch carefully David.

(In reply to David Keeler (:keeler) from comment #40)
> Depending on when this lands, bug 887052 may cause a merge conflict with
> this change. (nsIStrictTransportSecurityService gets replaced with
> nsISiteSecurityService, etc.)

No problem. Let me know if you need any help getting bug 887052 relanded soon.

> ::: security/manager/ssl/src/nsNSSIOLayer.cpp
> @@ +747,5 @@
> > +
> > +  IntoleranceEntry entry;
> > +  if (mTLSIntoleranceInfo.Get(key, &entry)) {
> > +    entry.AssertInvariant();
> > +    entry.tolerant = std::max(entry.tolerant, tolerant);
> 
> Would it ever be the case that we would call rememberTolerantAtVersion with
> tolerant < entry.tolerant?

It should never happen. But, I want to handle every case so that I have to worry less about potential race conditions with multiple connections and future changes and whatnot.

> > +    if ((entry.intolerant != 0 && intolerant >= entry.intolerant)) {
> > +      // We already know that the server is intolerant at a lower version.
> 
> Again, I'm not clear on when this would happen. If it does, is it an
> error/warning?

Same as above. It should never happen in normal operation but I want to handle all the cases as long as it is not problematic to do so.

> > +  if (entry.intolerant != 0) {
> > +    // We've tried connecting at a higher range but failed, so try at the
> > +    // version we haven't tried yet, unless we have reached the minimum.
> > +    if (range.min < entry.intolerant) {
> 
> What if range.min >= entry.intolerant? Should this be an error?

The idea is that adjustForTLSIntolerance will always give us a range that libssl will accept. If range.min >= entry.intolerant then that means we're trying to connect with the TLS version prefs set out of bounds compared to what we've learned from previous connections. But, what can we do? If the server is really intolerant with the unmodified range, there will be an error. But, maybe we get lucky and our previous information was wrong based on some kind of false positive.

> I would say upper-case the initial letter of these functions, but it looks
> like that would make this file's style inconsistent, so maybe that can be
> part of some future cleanup.

I will see what makes sense given the existing naming inconsistency.

> ::: security/manager/ssl/tests/gtest/TLSIntoleranceTest.cpp
> @@ +1,1 @@
> > +#include "nsNSSIOLayer.h"
> 
> OK - I think I understand the desired behavior in these tests. It might help
> to have a block comment that explains what's supposed to happen or is a link
> to a document that does. (Or maybe that should go in the implementation
> itself.)

OK. I will find a place in the implementation to document this. Basically, we try the highest version we support, and we keep retrying with lower versions, decrementing the version number one at a time, until we find a version that works. If we get all the way to the minimum version we support and that still doesn't work, we forget everything we learned and give up; the next time we try to connect to the server, we will start from the top again.

> I think this file needs the boilerplate mode lines and license block the
> others have.

Good catch. Thanks.
(In reply to Wan-Teh Chang from comment #43)
> >  function run_test_body() {
> > +  add_connection_test("localhost", Cr.NS_ERROR_NET_RESET); // TLS 1.2
> > +  add_connection_test("localhost", Cr.NS_ERROR_NET_RESET); // TLS 1.1
> > +  add_connection_test("localhost", Cr.NS_ERROR_NET_RESET); // TLS 1.0
> 
> Nit: it is not necessary to allow a "connection reset" error to cause a
> fallback from TLS 1.0 to SSL 3.0. I have only observed an "EOF" error
> (graceful TCP connection closure) or some sort of TLS/SSL error alert
> message in response to a TLS 1.0 ClientHello.

I noticed this before too. I just filed bug 901718 about it.

Note, though that the NS_ERROR_NET_RESET you see above is the error code that is OUTPUT from the TLS intolerance calculations, not the input. The higher levels (above PSM) react to NS_ERROR_NET_RESET by retrying the connection. So, even if/when we fix bug 901718, we will still be returning Cr.NS_ERROR_NET_RESET from the intolerance logic, IIUC.

> 
> ::: security/nss/lib/ssl/ssl3con.c
> @@ +7158,5 @@
> >  
> > +    if (ss->clientHelloVersion > SSL_LIBRARY_VERSION_3_0) {
> > +      errCode = PR_INVALID_STATE_ERROR;
> > +      goto loser;
> > +    }
> 
> Is this change for simulating a TLS-intolerant server?
> It would be nice to be able to control what the response
> is (close the TCP connection, send a TCP reset packet, or
> send a specific TLS/SSL error alert message).

Yes, this is how I simulated a TLS-intolerant server. I agree with your desire to have flexibility in how the server displays intolerance. I will make the option flexible enough to allow for all these options, but I may not implement all the options right away.
(In reply to Kurt Roeckx from comment #10)
> If you really think you need more statistics, I can try and get them.

This may help you, but will not with the bad-firewall-case: nakedsecurity.sophos.com/2013/08/20/welcome-to-zmap-the-one-hour-turnaround-internet-scanner/
Attachment #785567 - Attachment is obsolete: true
This test requires the NSS patch in bug 909162 and it also requires TLS 1.2 to be enabled by default.
Attachment #803985 - Flags: review?(dkeeler)
Comment on attachment 803985 [details] [diff] [review]
Integration test for TLS version intolerance logic

Review of attachment 803985 [details] [diff] [review]:
-----------------------------------------------------------------

Awesome.

::: security/manager/ssl/src/TransportSecurityInfo.cpp
@@ +272,5 @@
>    return rv;
>  }
>  
> +NS_IMETHODIMP
> +TransportSecurityInfo::GetNegotiatedTLSVersion(uint16_t * result)

Not that this file is consistent at all, but "uint16_t *result" or "uint16_t* result" would be more consistent.

::: security/manager/ssl/src/nsNSSIOLayer.cpp
@@ +112,5 @@
>      mProviderFlags(providerFlags),
>      mSocketCreationTimestamp(TimeStamp::Now()),
>      mPlaintextBytesRead(0)
>  {
> +  mTLSVersionRange.min = 0;

It's not a big deal, but I'm bothered I can't figure out what changed here.

::: security/manager/ssl/tests/unit/test_tls_intolerance.js
@@ +43,5 @@
> +  add_success_case("hates-tls-12-unexpected-message-alert.example.com", TLS_1_1);
> +  
> +  add_fallback_case("hates-tls-12-illegal-parameter-alert.example.com");
> +  add_success_case("hates-tls-12-illegal-parameter-alert.example.com", TLS_1_1);
> +  

some unnecessary whitespace here (and on the blank lines, above)

::: security/manager/ssl/tests/unit/tlsserver/cmd/TLSIntoleranceServer.cpp
@@ +26,5 @@
> +  {
> +    const char *mHostName;
> +    SSLIntoleranceTypeFlags mIntoleranceTypes;
> +    SSLIntoleranceResponseType mIntoleranceResponseType;
> +  } hosts[] = {

Is it worth doing this all in-line? (as in, maybe it would be more clear to separate the definition of the struct TLSIntoleranceHost and the declaration of the array hosts)

@@ +43,5 @@
> +      SSL_INTOLERANCE_VERSION_1_2, SSL_INTOLERANCE_PROTOCOL_VERSION_ALERT
> +    },
> +  };
> +
> +  const TLSIntoleranceHost * host = GetHostForSNI(aSrvNameArr,

"*host" would be more consistent
Attachment #803985 - Flags: review?(dkeeler) → review+
(In reply to David Keeler (:keeler) from comment #50)
> ::: security/manager/ssl/src/nsNSSIOLayer.cpp
> @@ +112,5 @@
> >      mProviderFlags(providerFlags),
> >      mSocketCreationTimestamp(TimeStamp::Now()),
> >      mPlaintextBytesRead(0)
> >  {
> > +  mTLSVersionRange.min = 0;
> 
> It's not a big deal, but I'm bothered I can't figure out what changed here.

LF -> CRLF it seems.
(In reply to Cykesiopka from comment #51)
> (In reply to David Keeler (:keeler) from comment #50)
> > ::: security/manager/ssl/src/nsNSSIOLayer.cpp
> > @@ +112,5 @@
> > >      mProviderFlags(providerFlags),
> > >      mSocketCreationTimestamp(TimeStamp::Now()),
> > >      mPlaintextBytesRead(0)
> > >  {
> > > +  mTLSVersionRange.min = 0;
> > 
> > It's not a big deal, but I'm bothered I can't figure out what changed here.
> 
> LF -> CRLF it seems.

In that case, let's not do that - my understanding is LF is preferred over CRLF.
No longer blocks: 901698
Attached patch fix-build.patch (obsolete) — Splinter Review
This patch fixes a bug (no null termination of the TLS intolerance host array) and fixes the build on B2G (apparently B2G can't handle local structure definitions used as template argument types).
Attachment #816049 - Flags: review?(dkeeler)
Comment on attachment 813892 [details] [diff] [review]
Expand TLS intolerance logic to work for versions beyond TLS 1.0 (rebased and review comments addressed)

Alex, thanks for contributing this patch. I actually already have fixed the review comments from the previous reviews that need fixing. Mostly I am waiting on NSS changes now. If we don't get the review of the NSS changes soon, then we can check in the change without the test that requires bug 909162 to be fixed in NSS. Let's wait a couple of days though.
Attachment #813892 - Flags: feedback?(brian)
Comment on attachment 816049 [details] [diff] [review]
fix-build.patch

Review of attachment 816049 [details] [diff] [review]:
-----------------------------------------------------------------

Cool.

::: security/manager/ssl/tests/unit/tlsserver/cmd/TLSIntoleranceServer.cpp
@@ +18,5 @@
> +  const char *mHostName;
> +  SSLIntoleranceTypeFlags mIntoleranceTypes;
> +  SSLIntoleranceResponseType mIntoleranceResponseType;
> +} hosts;
> + 

trailing space

::: security/manager/ssl/tests/unit/tlsserver/lib/TLSServer.h
@@ +50,2 @@
>  inline const Host *
>  GetHostForSNI(const SECItem *aSrvNameArr, uint32_t aSrvNameArrSize,

Do we need to change anything about how this gets called?
Attachment #816049 - Flags: review?(dkeeler) → review+
(In reply to David Keeler (:keeler) from comment #56)
> ::: security/manager/ssl/tests/unit/tlsserver/lib/TLSServer.h
> @@ +50,2 @@
> >  inline const Host *
> >  GetHostForSNI(const SECItem *aSrvNameArr, uint32_t aSrvNameArrSize,
> 
> Do we need to change anything about how this gets called?

I am not sure what you mean.
(In reply to Brian Smith (:briansmith, was :bsmith@mozilla.com) from comment #57)
> (In reply to David Keeler (:keeler) from comment #56)
> > ::: security/manager/ssl/tests/unit/tlsserver/lib/TLSServer.h
> > @@ +50,2 @@
> > >  inline const Host *
> > >  GetHostForSNI(const SECItem *aSrvNameArr, uint32_t aSrvNameArrSize,
> > 
> > Do we need to change anything about how this gets called?
> 
> I am not sure what you mean.

For example, https://mxr.mozilla.org/mozilla-central/source/security/manager/ssl/tests/unit/tlsserver/cmd/OCSPStaplingServer.cpp#249 calls GetHostForSNI - I just wanted to make sure that there weren't any changes that needed to happen to that line.
(In reply to David Keeler (:keeler) from comment #58)
> For example,
> https://mxr.mozilla.org/mozilla-central/source/security/manager/ssl/tests/
> unit/tlsserver/cmd/OCSPStaplingServer.cpp#249 calls GetHostForSNI - I just
> wanted to make sure that there weren't any changes that needed to happen to
> that line.

OK. No changes are needed. The type of sOCSPHosts is OCSPHost[13], and the compiler figures that out automatically. This uses the same technique described here:
http://stackoverflow.com/questions/6376000/how-does-this-array-size-template-work
Comment on attachment 803983 [details] [diff] [review]
Expand TLS intolerance logic to work for versions beyond TLS 1.0 (rebased and review comments addressed)

Review of attachment 803983 [details] [diff] [review]:
-----------------------------------------------------------------

https://hg.mozilla.org/integration/mozilla-inbound/rev/5db1f111ac19
Attachment #803983 - Flags: checkin+
Whiteboard: [leave open]
Comment on attachment 813892 [details] [diff] [review]
Expand TLS intolerance logic to work for versions beyond TLS 1.0 (rebased and review comments addressed)

Thanks for your help on this bug, Alex.
Attachment #813892 - Attachment is obsolete: true
Depends on: 927209
Is this fixed with comment 62?
Is there a reason this is still open?
Flags: needinfo?(brian)
(In reply to Florian Bender from comment #63)
> Is this fixed with comment 62?

The fallback from TLS 1.2 -> TLS 1.1 -> TLS 1.0 -> SSL 3.0 was implemented.

(In reply to Kurt Roeckx from comment #64)
> Is there a reason this is still open?

Some of the tests either need to be rewritten, dropped, or we need to resolve bug 909162. Unfortunately, resolving bug 909162 did not go as smoothly as I would like. I am working on it.
Flags: needinfo?(brian)
Was the telemetry probe (for servers that triggered the fallback) added in any of these patches? I thought there was a separate bug covering this, but I can not find it … I'm asking because Bug 742500 might disable them too soon (I'd keep it at least until Fx 35 if not Fx 40, considering how slow the web adopts network updates).
Flags: needinfo?(brian)
(In reply to Florian Bender from comment #66)
> Was the telemetry probe (for servers that triggered the fallback) added in
> any of these patches? I thought there was a separate bug covering this, but
> I can not find it … I'm asking because Bug 742500 might disable them too
> soon (I'd keep it at least until Fx 35 if not Fx 40, considering how slow
> the web adopts network updates).

https://wiki.mozilla.org/Security/Features/TLS_Telemetry -> bug 707275
Comment on attachment 803983 [details] [diff] [review]
Expand TLS intolerance logic to work for versions beyond TLS 1.0 (rebased and review comments addressed)

[Approval Request Comment]
Bug caused by (feature/regressing bug #): none (new improvement)

User impact if declined: Users who have enabled TLS 1.1 and/or TLS 1.2 in mozilla-beta builds will have reduced compatibility with some sites that do not implement TLS version negotiation correctly. This is up to 1% or so of websites. This patch also makes it possible to apply the already-approved patch for bug 754356 cleanly; otherwise, I have to resolve merge conflicts by hand. This patch also makes it possible for us to enable TLS 1.1 in mozilla-beta

Testing completed (on m-c, etc.): All of the automated tests that haven't yet been checked in (due to delays in NSS-related coordination only) pass and this has been working successfully on m-c and m-a for weeks.

Risk to taking this patch (and alternatives if risky): This is a very low-risk change. We've had no reports of any difficulty related to this change whatsoever on Nightly or Aurora channels.

String or IDL/UUID changes made by this patch: none
Attachment #803983 - Flags: approval-mozilla-beta?
Ah yes, thanks Alex. Clearing ni?.
Flags: needinfo?(brian)
(In reply to Kurt Roeckx from bug 733647 comment 48)
> So some stats:
> June 2013:
> - 169,801 no problems
> - 58 servers intolerant to TLS 1.0+
> - 1,691 servers intolerant to TLS 1.1+
> - 4 servers intolerant only to TLS 1.1
> - 56 servers intolerant only to TLS 1.2
> 
> 1,809 problematic servers in total.
> 
> 
> November 2013:
> 
> Total servers: 163,587
> 
> TLS 1.0 intolerance        9
> TLS 1.1 intolerance    1,388
> TLS 1.2 intolerance    1,448 (~ 0.9%)
> TLS 1.3 intolerance   17,840 (~10.9%)
> TLS 2.98 intolerance 122,698 (~75.0%)
> 
> Long handshake intolerance: 4,795 (~2.9%)
> 
> Numbers provided by Ivan Ristic from ssllabs.

These numbers show that this workaround is necessary for many people that enable TLS 1.1 in any version of Firefox. Good justification for the uplift to m-b, IMO. Thanks Kurt.
Comment on attachment 803983 [details] [diff] [review]
Expand TLS intolerance logic to work for versions beyond TLS 1.0 (rebased and review comments addressed)

approving for uplift since it's low risk and early in beta cycle.
Attachment #803983 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Needs a branch-specific patch. Brian, please close this bug if nothing else is happening for it on m-c either.
Flags: needinfo?(brian)
Whiteboard: [leave open]
No longer blocks: 936818
Depends on: 936818
Marking this FIXED after filing bug 936818 for actually landing the tests.
No longer depends on: 909162
Flags: needinfo?(brian)
Target Milestone: --- → mozilla27
(In reply to Brian Smith (:briansmith, was :bsmith; NEEDINFO me if you want a response) from comment #74)
> Marking this FIXED after filing bug 936818 for actually landing the tests.

For real this time.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Comment on attachment 803985 [details] [diff] [review]
Integration test for TLS version intolerance logic

Replaced by the patch in attachment 830014 [details] [diff] [review] in bug 936818.
Attachment #803985 - Attachment is obsolete: true
Comment on attachment 816049 [details] [diff] [review]
fix-build.patch

This patch was folded into the patch in attachment 830014 [details] [diff] [review] [diff] [review] in bug 936818.
Attachment #816049 - Attachment is obsolete: true
Depends on: 939498
Keywords: verifyme
Is there any help needed for manual testing here? If yes can you please provide some steps in order to reproduce/verify this?
Flags: needinfo?(brian)
Since no steps/guidelines for verifying this were available, QA just tested browsing secure sites to make sure this fix didn't regress anything: https://etherpad.mozilla.org/Fx26b10-TLS-SSL.
Keywords: verifyme
(In reply to Bogdan Maris, QA [:bogdan_maris] from comment #80)
> Is there any help needed for manual testing here? If yes can you please
> provide some steps in order to reproduce/verify this?

Sorry I missed this. We actually did a bunch of automated testing (mwobensmith did) to verify that this was working.
Flags: needinfo?(brian)
Depends on: 1084025
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: