Closed Bug 1084025 Opened 10 years ago Closed 9 years ago

Disable insecure TLS version fallback

Categories

(Core :: Security: PSM, defect)

33 Branch
defect
Not set
critical

Tracking


RESOLVED FIXED
mozilla37
Tracking Status
firefox35 + wontfix
firefox36 + wontfix
firefox37 + fixed
relnote-firefox --- 37+

People

(Reporter: s.franke, Assigned: emk)

References

Details

(Keywords: site-compat)

Attachments

(7 files, 5 obsolete files)

1.14 KB, patch
keeler
: review+
Details | Diff | Splinter Review
5.65 KB, patch
keeler
: review+
Details | Diff | Splinter Review
1.31 KB, patch
emk
: review+
keeler
: review+
Details | Diff | Splinter Review
6.00 KB, patch
Details | Diff | Splinter Review
3.43 KB, patch
Details | Diff | Splinter Review
38.69 KB, text/plain
Details
528.11 KB, text/plain
Details
User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:33.0) Gecko/20100101 Firefox/33.0
Build ID: 20141011015303

Steps to reproduce:

I connect to a server providing TLS 1.0, TLS 1.1 and TLS 1.2.


Actual results:

An attacker blocks the handshake if TLS 1.1 or TLS 1.2 is used.

The browser then downgrades the TLS version to TLS 1.0.


Expected results:

The browser MUST NOT downgrade the security level, since the TLS protocol negotiates the TLS version itself.

The offending code is in nsNSSIOLayer.cpp:

void
nsSSLIOLayerHelpers::adjustForTLSIntolerance(const nsACString& hostName,
                                             int16_t port,
                                             /*in/out*/ SSLVersionRange& range)
{
  IntoleranceEntry entry;

  {
    nsCString key;
    getSiteKey(hostName, port, key);

    MutexAutoLock lock(mutex);
    if (!mTLSIntoleranceInfo.Get(key, &entry)) {
      return;
    }
  }

  entry.AssertInvariant();

  if (entry.intolerant != 0) {
    // We've tried connecting at a higher range but failed, so try at the
    // version we haven't tried yet, unless we have reached the minimum.
    if (range.min < entry.intolerant) {
      range.max = entry.intolerant - 1;
    }
  }
}
OS: Windows 7 → All
Hardware: x86_64 → All
Component: Untriaged → Security
Severity: normal → critical
Mozilla is already aware, see https://blog.mozilla.org/security/2014/10/14/the-poodle-attack-and-the-end-of-ssl-3-0/

"SSLv3 will be disabled by default in Firefox 34, which will be released on Nov 25. The code to disable it is landing today in Nightly, and will be promoted to Aurora and Beta in the next few weeks."
Status: UNCONFIRMED → RESOLVED
Closed: 10 years ago
Resolution: --- → WORKSFORME
This ticket is not related to SSL 3.0.

It's about a bogus downgrade from TLS 1.2 to TLS 1.0.
Status: RESOLVED → UNCONFIRMED
Resolution: WORKSFORME → ---
Then how is this bug related to POODLE?
Summary: (POODLE) TLS is downgraded by Browser → TLS is downgraded by Browser
Part of the POODLE name is "On Downgraded Legacy Encryption".

An attacker can still force a connection to use TLS 1.0 instead of TLS 1.2.

And TLS 1.0 is weaker than TLS 1.2.
That applies to downgrade attacks generally. How is it specifically related to POODLE?
Now you can effectively disable the insecure fallback by setting "security.tls.version.fallback-limit" to 3.
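For reference, the pref's value scale matches security.tls.version.min/max (0 = SSL 3.0, 1 = TLS 1.0, 2 = TLS 1.1, 3 = TLS 1.2). A sketch in the security-prefs.js style:

// Cap how far the insecure fallback may downgrade. With the limit at
// 3 (TLS 1.2), no insecure fallback below TLS 1.2 will be attempted.
pref("security.tls.version.fallback-limit", 3);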

I found a discussion about removing the insecure fallback in dev.tech.crypto:
https://groups.google.com/d/msg/mozilla.dev.tech.crypto/Qis_B8b2b7U/CCOouHbcLN4J

FYI, Brian Smith introduced the insecure fallback when enabling TLS 1.1 and later (bug 839310). According to his comment, he wasn't much concerned with making Firefox more secure than browsers that only supported TLS 1.0.
https://bugzilla.mozilla.org/show_bug.cgi?id=839310#c6
Blocks: 839310
Status: UNCONFIRMED → NEW
Component: Security → Security: PSM
Depends on: 1083058
Ever confirmed: true
Product: Firefox → Core
Please don't close bugs that you didn't open, unless you are a module owner or module peer in this module.

Back when I wrote my comments in bug 839310, our goal was to turn TLS 1.2 and TLS 1.1 on in Firefox ASAP. Because we wanted to do so ASAP, and because other browsers were already doing the non-secure fallback, we didn't have much choice except to do the fallback the same way the others do. I always planned on trying to find some way to do so with minimal compatibility impact, but I left Mozilla before I could do so.

Now, you can see on the IETF TLS WG mailing list that I am arguing that all browsers should remove the non-secure fallback. And, I've provided a pretty good argument as to why. See [1]. I believe now is the best time, politically, to disable the non-secure fallback in all browsers, and there are practical security concerns motivating the decision to do so. Mozilla should take the lead in doing so, so that other browsers feel comfortable in joining them. However, that said, I have not done the research to verify how plausible that is, as far as compatibility is concerned, and I haven't talked to Firefox management about whether they're willing to accept the compatibility risk. David Keeler is the PSM module owner now. So, it will be up to other people, at Mozilla, to decide this.

[1] http://www.ietf.org/mail-archive/web/tls/current/msg14072.html
As a former developer of NSS from 2001-2009, I wholeheartedly agree with Brian.
I have always thought that this PSM insecure fallback was crazy.
IMO, it's high time for it to go.
There is mitigation for this: bug 1083058 allows you to control this behaviour now.
(In reply to Martin Thomson [:mt] from comment #9)
> There is mitigation for this: bug 1083058 allows you to control this
> behaviour now.

That's not a real mitigation, because we can't expect end-users to know about that preference. What matters is the default behavior.
According to [1], TLS 1.2-intolerant servers were already under 1% a year ago.
Let's experiment with a fallback-free world.

[1] http://www.ietf.org/mail-archive/web/tls/current/msg10657.html
Attachment #8511585 - Flags: review?(dkeeler)
Attachment #8511585 - Attachment description: bug1084025_disable_insecure_fallback → Disable insecure version fallback (Nightly only)
What information do you plan to collect from this experiment? The screams of frustrated users? We haven't had many as a result of disabling SSL 3.0, even in Aurora, so I'm not sure that we'll be able to definitively say that the experiment succeeded. Failure is easier to detect, but I don't want this experiment to fail.

Here's an experiment that might work, and the only cost is latency (which can be tuned differently).

Leave fallback in place for now.  But - with some probability - ignore any previously learned intolerance information.  Then, record telemetry if a version previously determined to be intolerant turns out to be tolerant, and the converse if the intolerance is re-confirmed.  We can use this to learn the false fallback rate.  From that, we can determine which reasons for fallback can be removed.

With SSL 3.0 out of the picture, this is definitely worth pursuing in one form or other, but I think that the above experiment might be a better first step.
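A minimal sketch of the proposed experiment, with hypothetical names (this is not existing Gecko code):

#include <cstdint>
#include <random>

struct IntoleranceEntry {
  uint16_t intolerant; // lowest version the server seemed intolerant of;
                       // 0 means no intolerance has been recorded
};

// With probability |probeRate|, ignore what we previously learned and let
// the caller retry at the full version range. The caller then records
// telemetry: "re-confirmed" if the handshake fails again at the probed
// version, "false fallback" if it now succeeds.
bool ShouldProbeIntolerance(const IntoleranceEntry& entry, double probeRate,
                            std::mt19937& rng) {
  if (entry.intolerant == 0) {
    return false; // nothing was learned, so there is nothing to re-check
  }
  std::bernoulli_distribution probe(probeRate);
  return probe(rng);
}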
I extrapolated from Mozilla's telemetry data that disabling SSL 3.0 will cause about 7 times as many failures as disabling non-secure fallback from TLS 1.2 to lower versions. I think it is likely that the failure can be fixed in a different way than re-enabling the TLS intolerance fallback logic. Also, many of the things that need to be done to make disabling the TLS intolerance fallback logic tolerable should also be done prior to disabling SSL 3.0, to minimize the compatibility impact of disabling SSL 3.0.

I think it is a good idea to land Masatoshi's patch from bug 1088915 before landing a patch for this bug, because Masatoshi's patch from bug 1088915 is likely to be a foundation for some compatibility workarounds that don't involve version fallback.
Comment on attachment 8511585 [details] [diff] [review]
Disable insecure version fallback (Nightly only)

Review of attachment 8511585 [details] [diff] [review]:
-----------------------------------------------------------------

Please make sure you get David's review before checking this in.

::: netwerk/base/public/security-prefs.js
@@ +3,5 @@
>   * file, You can obtain one at http://mozilla.org/MPL/2.0/. */
>  
>  pref("security.tls.version.min", 1);
>  pref("security.tls.version.max", 3);
> +#ifdef NIGHTLY_BUILD

I don't think it is a good idea to have these conditions in these patches. Instead, you could just do this unconditionally, and then when Nightly gets uplifted to Aurora, the change can be backed out of Aurora, if it is causing problems. That is the normal way of doing things and it works well.
Attachment #8511585 - Flags: review+
OK, that would be simple enough to flip back in case we have to.
Attachment #8511941 - Flags: review?(dkeeler)
Attachment #8511585 - Flags: review?(dkeeler)
Attachment #8511585 - Attachment is obsolete: true
Summary: TLS is downgraded by Browser → Disable insecure TLS version fallback
Comment on attachment 8511941 [details] [diff] [review]
Disable insecure version fallback, v2

Review of attachment 8511941 [details] [diff] [review]:
-----------------------------------------------------------------

I'm all for trying this, but we need some way of measuring the outcome (e.g. a telemetry count of failures due to not falling back). r=me with that.
Attachment #8511941 - Flags: review?(dkeeler) → review+
(In reply to David Keeler (:keeler) [use needinfo?] from comment #16)
> I'm all for trying this, but we need some way of measuring the outcome (e.g.
> a telemetry count of failures due to not falling back). r=me with that.

SSL_TLS12_INTOLERANCE_REASON_PRE will serve the purpose.
> I'm all for trying this, but we need some way of measuring the outcome (e.g.
> a telemetry count of failures due to not falling back). r=me with that.

Does this telemetry contain information about how many failures are caused by attackers?
(In reply to Stefan Franke from comment #18)
> > I'm all for trying this, but we need some way of measuring the outcome (e.g.
> > a telemetry count of failures due to not falling back). r=me with that.
> 
> Does this telemetry contain information about how many failures are caused
> by attackers?

That's impossible by definition. If it were possible, we could stop the fallback only when an attacker shuts down the connection.
I had to change the logic to reduce false positives, so requesting an additional review.
Attachment #8513857 - Flags: review?(dkeeler)
Comment on attachment 8513857 [details] [diff] [review]
Add telemetry to measure failures due to not falling back

Review of attachment 8513857 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for doing this. In general this is good, but it will need rebasing (see comments). Also, I made a suggestion for a slightly different way of collecting the data that I think will give us more insight. r- for now.

::: security/manager/ssl/src/nsNSSIOLayer.cpp
@@ +879,5 @@
>  
>    MutexAutoLock lock(mutex);
>  
> +  bool fallbackLimitReached =
> +    intolerant <= minVersion || intolerant <= mVersionFallbackLimit;

I was thinking we would want to distinguish between hitting the limit from mVersionFallbackLimit vs. hitting minVersion (and maybe even depending on what minVersion is?)

How about we have a histogram with 8 buckets. The first 4 will be for hitting minVersion, for minVersion=0, 1, 2, 3. The last 4 will be the same, but for mVersionFallbackLimit. Does that sound like useful data to collect?

@@ +892,4 @@
>        // We already know that the server is intolerant at a lower version.
>        return true;
> +    } else if (fallbackLimitReached) {
> +      // We can't fall back any further. Assume that intolerance isn't the issue.

Looks like this will have to be rebased on top of the recent changes from bug 1088950.

@@ +906,5 @@
> +    Telemetry::Accumulate(Telemetry::SSL_FALLBACK_LIMIT_REACHED, true);
> +    return false;
> +  }
> +
> +

nit: unnecessary blank line

::: toolkit/components/telemetry/Histograms.json
@@ +6382,5 @@
>      "n_values": 64,
>      "description": "TLS/SSL version intolerance was falsely detected, server rejected handshake"
>    },
> +  "SSL_FALLBACK_LIMIT_REACHED": {
> +    "expires_in_version": "never",

We should probably have an expiration on this, since this is more or less a temporary experiment, right? Version 40 or maybe 42 is probably good.
Attachment #8513857 - Flags: review?(dkeeler) → review-
Resolved review comments.
Attachment #8513857 - Attachment is obsolete: true
Attachment #8515427 - Flags: review?(dkeeler)
Moved the telemetry call to the caller.
Attachment #8515897 - Flags: review?(dkeeler)
Attachment #8515427 - Attachment is obsolete: true
Attachment #8515427 - Flags: review?(dkeeler)
Comment on attachment 8515897 [details] [diff] [review]
Add telemetry to measure failures due to not falling back

Review of attachment 8515897 [details] [diff] [review]:
-----------------------------------------------------------------

I think we can gather this telemetry with fewer changes to the code. Also, be careful about how you convert version values to histogram buckets. For example, TLS 1.0 is represented by the value 0x0301, not 0x01.

::: security/manager/ssl/src/nsNSSIOLayer.cpp
@@ +879,5 @@
>    MutexAutoLock lock(mutex);
> +  forgetIntolerance(key, lock);
> +}
> +
> +void nsSSLIOLayerHelpers::forgetIntolerance(const nsCString& key,

I'm not sure refactoring this function is useful.

@@ -903,5 @@
>  {
> -  if (intolerant <= minVersion || intolerant <= mVersionFallbackLimit) {
> -    // We can't fall back any further. Assume that intolerance isn't the issue.
> -    forgetIntolerance(hostName, port);
> -    return false;

These changes to rememberIntolerantAtVersion change the control flow in a way that I'm not convinced is equivalent to the original code. Why can't we add the bucket determination/telemetry accumulation here?

@@ +940,5 @@
> +    if (intolerant <= minVersion) {
> +      fallbackLimitBucket += minVersion;
> +    }
> +    if (intolerant <= mVersionFallbackLimit) {
> +      fallbackLimitBucket += (mVersionFallbackLimit << 2);

We should document how this is encoded (i.e. what each bucket in the histogram corresponds to).
And, actually, this won't work as-is, because mVersionFallbackLimit is something like 0x0301, which puts the bucket out of range for just 16 values.
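
A sketch of an encoding that stays within range (hypothetical helper; NSS represents TLS 1.0-1.2 as 0x0301-0x0303, so the constants must first be mapped to small indices):

#include <cstdint>

const uint16_t SSL_LIBRARY_VERSION_TLS_1_0 = 0x0301;

// Map 0x0301..0x0303 to 1..3; 0 is reserved for "this limit not involved".
uint32_t VersionBucket(uint16_t version) {
  return version - SSL_LIBRARY_VERSION_TLS_1_0 + 1;
}

// Low two bits: the minimum version that stopped the fallback (if it did);
// next two bits: the fallback-limit version that stopped it (if it did).
// This yields at most 16 distinct values, fitting an enumerated histogram.
uint32_t FallbackLimitBucket(uint16_t intolerant, uint16_t minVersion,
                             uint16_t versionFallbackLimit) {
  uint32_t bucket = 0;
  if (intolerant <= minVersion) {
    bucket += VersionBucket(minVersion);
  }
  if (intolerant <= versionFallbackLimit) {
    bucket += VersionBucket(versionFallbackLimit) << 2;
  }
  return bucket;
}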

@@ +1296,5 @@
>    if (!socketInfo->SharedState().IOLayerHelpers()
>                   .rememberIntolerantAtVersion(socketInfo->GetHostName(),
>                                                socketInfo->GetPort(),
> +                                              range.min, range.max, err,
> +                                              fallbackLimitBucket)) {

I think it would be best if rememberIntolerantAtVersion accumulated the appropriate telemetry on its own, rather than having an extra output parameter.
Attachment #8515897 - Flags: review?(dkeeler) → review-
(In reply to David Keeler (:keeler) [use needinfo?] from comment #24)
> ::: security/manager/ssl/src/nsNSSIOLayer.cpp
> @@ +879,5 @@
> >    MutexAutoLock lock(mutex);
> > +  forgetIntolerance(key, lock);
> > +}
> > +
> > +void nsSSLIOLayerHelpers::forgetIntolerance(const nsCString& key,
> 
> I'm not sure refactoring this function is useful.

MutexAutoLock is an RAII helper for mozilla::Mutex [1], which is a bare wrapper around PRLock [2], which is not recursive, at least on Linux [3][4]. I moved a forgetIntolerance call into the MutexAutoLock scope, so I had to add a variant of the function that does not take the lock.
[1] https://developer.mozilla.org/ja/docs/Namespace/Mozilla/MutexAutoLock
[2] https://developer.mozilla.org/en-US/docs/Namespace/Mozilla/Mutex
[3] https://mxr.mozilla.org/mozilla-central/source/nsprpub/pr/src/pthreads/ptsynch.c?rev=3a49b88c5bb0#52
[4] https://stackoverflow.com/questions/19863734/what-is-pthread-mutex-adaptive-np#answer-19863902
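
For illustration, the resulting pattern looks like this (a self-contained sketch using std::mutex; in the actual patch the MutexAutoLock itself is passed as proof that the lock is held):

#include <mutex>
#include <string>

class IOLayerHelpersSketch {
public:
  void forgetIntolerance(const std::string& key) {
    std::lock_guard<std::mutex> lock(mMutex);
    forgetIntolerance(key, lock); // delegate without re-acquiring
  }

private:
  // Callable only with evidence that mMutex is already held, so callers
  // that hold the lock can reuse this without deadlocking on a
  // non-recursive mutex.
  void forgetIntolerance(const std::string& key,
                         const std::lock_guard<std::mutex>& /*proofOfLock*/) {
    // ... erase the intolerance entry for |key| ...
    (void)key;
  }

  std::mutex mMutex;
};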

> @@ -903,5 @@
> >  {
> > -  if (intolerant <= minVersion || intolerant <= mVersionFallbackLimit) {
> > -    // We can't fall back any further. Assume that intolerance isn't the issue.
> > -    forgetIntolerance(hostName, port);
> > -    return false;
> 
> These changes to rememberIntolerantAtVersion change the control flow in a
> way that I'm not convinced is equivalent to the original code. Why can't we
> add the bucket determination/telemetry accumulation here?

I'm attempting to avoid accumulating the telemetry when we already know whether the server is tolerant at the version. I called these cases "false positives" in comment #20. If we don't have to exclude this case, the second patch is unnecessary in the first place, because SSL_TLS12_INTOLERANCE_REASON_PRE would be sufficient, as I said in comment #17. Do you think this patch is not needed?
Flags: needinfo?(dkeeler)
What scenarios cause these false positives?
Assignee: nobody → VYV03354
Flags: needinfo?(dkeeler) → needinfo?(VYV03354)
1. Access example.com with TLS version 1.2. Firefox will remember the server is tolerant at TLS 1.2.
2. Press the reload button. The access fails due to a network glitch.
3. rememberIntolerantAtVersion will find that the version has reached the fallback limit. It will accumulate the new telemetry and return false, even though we already know the server is tolerant at version 1.2. This is what I called a "false positive".

Moreover, retryDueToTLSIntolerance will accumulate SSL_TLS12_INTOLERANCE_REASON_PRE and will not accumulate SSL_TLS12_INTOLERANCE_REASON_POST whenever rememberIntolerantAtVersion returns true. Therefore the new telemetry is redundant.
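
A simplified sketch of that accounting (the probe names are the real ones mentioned above; the surrounding types are stand-ins):

#include <cstdint>
#include <map>

enum Probe {
  SSL_TLS12_INTOLERANCE_REASON_PRE,
  SSL_TLS12_INTOLERANCE_REASON_POST
};

std::map<Probe, std::map<uint32_t, uint32_t>> gHistograms;

void Accumulate(Probe probe, uint32_t reason) {
  ++gHistograms[probe][reason];
}

// Stub: in the real code this returns false at the fallback limit.
bool rememberIntolerantAtVersion() { return true; }

bool retryDueToTLSIntolerance(uint32_t reason) {
  // PRE counts every would-be fallback...
  Accumulate(SSL_TLS12_INTOLERANCE_REASON_PRE, reason);
  if (!rememberIntolerantAtVersion()) {
    return false; // limit reached: no retry, and no POST accumulation
  }
  // ...while POST counts only fallbacks actually performed, so the
  // difference measures handshakes stopped at the fallback limit.
  Accumulate(SSL_TLS12_INTOLERANCE_REASON_POST, reason);
  return true;
}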
Flags: needinfo?(VYV03354)
> whenever rememberIntolerantAtVersion returns true.

Correction: whenever rememberIntolerantAtVersion returns false.
I'm still not convinced of the usefulness of this telemetry, but I can live with it, and I would like to land this early in the cycle.

> We should document how this is encoded (i.e. what each bucket in the
> histogram corresponds to).

I think the new code is self-documenting enough.
Attachment #8515897 - Attachment is obsolete: true
Attachment #8530477 - Flags: review?(dkeeler)
OK, I found a much cleaner way to reduce false positives.
Attachment #8530477 - Attachment is obsolete: true
Attachment #8530477 - Flags: review?(dkeeler)
Attachment #8530480 - Flags: review?(dkeeler)
My feeling is that the best thing would be to set a clear deadline for disabling TLS 1.2 intolerance fallback (I suggest six months to a year) and to announce it widely. TLS 1.3 intolerance will probably take longer to solve, though I recently tried to notify a few sites.
(In reply to Yuhong Bao from comment #32)
> My feeling is that the best thing would be to set a clear deadline to
> disable TLS 1.2 intolerance fallback (I suggest 6 months to a year), and to
> announce it widely.

That cannot be done until more is learned about TLS 1.2/1.1 intolerance. EMK's patch in this bug will help us learn more about it. With EMK's patch, people at Mozilla can use Matt Wobensmith's compatibility testing system to identify some of the most popular servers that would be affected by this change. EMK's patch can always be augmented with improved compatibility logic and/or backed out if it is too disruptive. 

There are many compatibility issues that get treated as "version intolerance" that aren't really version intolerance, so once there is a (partial) list of affected sites, people can triage that list to find out if it is really version intolerance, or intolerance to a specific cipher suite, or intolerance to a specific extension, or intolerance to the specific ordering of extensions in NSS's ClientHello, or something else. Then safer workarounds can be implemented for those other issues (e.g. reordering cipher suites, and/or reordering extensions, and/or asking CAs to give their customers better certificates).
(In reply to Brian Smith (:briansmith, :bsmith, use NEEDINFO?) from comment #33)
> (In reply to Yuhong Bao from comment #32)
> > My feeling is that the best thing would be to set a clear deadline to
> > disable TLS 1.2 intolerance fallback (I suggest 6 months to a year), and to
> > announce it widely.
> 
> That cannot be done until more is learned about TLS 1.2/1.1 intolerance.
> EMK's patch in this bug will help us learn more about it. With EMK's patch,
> people at Mozilla can use Matt Wobensmith's compatibility testing system to
> identify some of the most popular servers that would be affected by this
> change. EMK's patch can always be augmented with improved compatibility
> logic and/or backed out if it is too disruptive. 

Actually, I agree, and the deadline should depend on how many servers are affected.
(In reply to Yuhong Bao from comment #34)
> (In reply to Brian Smith (:briansmith, :bsmith, use NEEDINFO?) from comment
> #33)
> > (In reply to Yuhong Bao from comment #32)
> > > My feeling is that the best thing would be to set a clear deadline to
> > > disable TLS 1.2 intolerance fallback (I suggest 6 months to a year), and to
> > > announce it widely.
> > 
> > That cannot be done until more is learned about TLS 1.2/1.1 intolerance.
> > EMK's patch in this bug will help us learn more about it. With EMK's patch,
> > people at Mozilla can use Matt Wobensmith's compatibility testing system to
> > identify some of the most popular servers that would be affected by this
> > change. EMK's patch can always be augmented with improved compatibility
> > logic and/or backed out if it is too disruptive. 
> 
> I agree actually, and that the deadline set should depend on how many
> servers are affected.

Summary:
Since it is impossible to distinguish between a bad server and an attacked server, attacks will remain possible indefinitely.

The intention of this ticket was to prevent attacks.
Comment on attachment 8530480 [details] [diff] [review]
Add telemetry to measure failures due to not falling back, v3

Review of attachment 8530480 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM with nits addressed.

::: security/manager/ssl/src/nsNSSIOLayer.cpp
@@ +874,1 @@
>                                              int16_t port)

nit: indentation (also, put the return type on its own line)

@@ +917,5 @@
> +    uint32_t fallbackLimitBucket = 0;
> +    // added if the version has reached the min version.
> +    if (intolerant <= minVersion) {
> +      switch (minVersion) {
> +      case SSL_LIBRARY_VERSION_TLS_1_0:

nit: indent the body of the switch statement one more level

@@ +931,5 @@
> +    }
> +    // added if the version has reached the fallback limit.
> +    if (intolerant <= mVersionFallbackLimit) {
> +      switch (mVersionFallbackLimit) {
> +      case SSL_LIBRARY_VERSION_TLS_1_0:

nit: indentation

::: security/manager/ssl/src/nsNSSIOLayer.h
@@ +224,5 @@
>                                     uint16_t intolerant, uint16_t minVersion,
>                                     PRErrorCode intoleranceReason);
>    bool rememberStrongCiphersFailed(const nsACString& hostName, int16_t port,
>                                     PRErrorCode intoleranceReason);
> +  // returns the known tolerant version

nit: add "or 0 if there is no known tolerant version" or equivalent

::: toolkit/components/telemetry/Histograms.json
@@ +6456,5 @@
> +  "SSL_FALLBACK_LIMIT_REACHED": {
> +    "expires_in_version": "default",
> +    "kind": "enumerated",
> +    "n_values": 16,
> +    "description": "TLS/SSL version fallback reached the limit, stopped the fallback"

Which limit? From what I can tell, this telemetry will be collected both when the version fallback limit is reached and when the minimum enabled version is reached. We should document that here.
Attachment #8530480 - Flags: review?(dkeeler) → review+
And a followup to fix the wrong description:
https://hg.mozilla.org/integration/mozilla-inbound/rev/fa0ba58df12f
https://hg.mozilla.org/mozilla-central/rev/35f5ec149ad5
https://hg.mozilla.org/mozilla-central/rev/38616c7e2da9
https://hg.mozilla.org/mozilla-central/rev/fa0ba58df12f
Status: ASSIGNED → RESOLVED
Closed: 10 years ago → 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla37
Depends on: 1109613
Depends on: 1095507
I don't think bug 1095507 or any bugs for SSL3-only sites should be linked to *this* bug; instead, they should get linked (only) to the POODLEBITES bug. It's better to keep the dependency list of this bug reserved for TLS 1.2 and TLS 1.1 intolerance bugs. In particular, reverting the patch for this bug won't fix any of those SSL3-only sites.
No longer depends on: 1095507
www.kredodirect.com.ua is not only SSLv3-only but also TLS 1.1+ intolerant. If the site administrator only enables TLS 1.0 without fixing the intolerance, Nightly still cannot access the site.
Should I file a separate bug even if the site URL happens to be the same?
Flags: needinfo?(brian)
(In reply to Masatoshi Kimura [:emk] from comment #41)
> www.kredodirect.com.ua is not only SSLv3-only but also TLS 1.1+ intolerant.
> If the site administrator only enables TLS 1.0 without fixing the
> intolerance, Nightly still cannot access the site.
> Should I file a separate bug even if the site URL happens to be the same?

I would only file bugs for the case where the site already supports TLS 1.0+ but no longer works due to this patch, i.e. only for sites where backing out this patch would get the site working again.

It is possible that the site admin might fix the bug in a way that makes this bug relevant, but it is unlikely that will happen.
Flags: needinfo?(brian)
> i.e. only for sites where backing out this patch would get the site working again.

The site will work again if *both* bug 1076983 and this bug are backed out. So I set the dependency on both.
(In reply to Masatoshi Kimura [:emk] from comment #43)
> > i.e. only for sites where backing out this patch would get the site working again.
> 
> The site will work again if *both* bug 1076983 and this bug are backed out.
> So I set the dependency on both.

Yes, but at this point I think it is safe to assume that bug 1076983 won't get backed out, so it doesn't matter.
When we change a preference in security-prefs.js or firefox.js or elsewhere, we also need to make a change like the one in this patch.
Attachment #8535930 - Flags: review?(dkeeler)
Attachment #8535930 - Flags: review?(VYV03354)
Attachment #8535930 - Attachment description: disable-non-secure-fallback-p3.patch → Part 3: Clean up some bits
David and Masatoshi-san, I think we should uplift these patches to mozilla-aurora and mozilla-beta. I just saw the latest SSL Pulse report at https://www.trustworthyinternet.org/ssl-pulse/ that indicates that 10% of the (approx.) top 200,000 sites have the POODLE vulnerability. That's quite a large percentage.

Obviously, this change has some higher-than-normal compatibility risks. But, the setting is controlled by a pref, so we can revert it if necessary. At the very least, it would be good to get the code (including the new telemetry changes) into Firefox 35 so that there is the *option* of controlling this with minimal risk, should things really get out of hand. WDYT?
Flags: needinfo?(dkeeler)
Flags: needinfo?(VYV03354)
Attachment #8535930 - Flags: review?(dkeeler) → review+
I wouldn't be opposed to it, but I think we need more bake time for the pref change. We could start by uplifting the telemetry changes, see how the numbers look, and go from there.
Flags: needinfo?(dkeeler)
Flags: needinfo?(VYV03354)
Attachment #8535930 - Flags: review?(VYV03354) → review+
Oops, needinfo request was accidentally cleared. I don't disagree with uplifting the change.
(In reply to David Keeler (:keeler) [use needinfo?] from comment #47)
> I wouldn't be opposed to it, but I think we need more bake time for the pref
> change. We could start by uplifting the telemetry changes, see how the
> numbers look, and go from there.

Do you think that a few days on Nightly would suffice?  Or maybe we can get as much time as possible before the next release?  Is this worth a chemspill release?  It seems like it might be, almost.

I think that we should consider updating the "SSL version control" add-on to bump this pref, or shipping another similar add-on, so that we can provide some sort of coverage in the meantime.  Richard, see comment 46.
Flags: needinfo?(rlb)
Comment on attachment 8530480 [details] [diff] [review]
Add telemetry to measure failures due to not falling back, v3

Approval Request Comment
[Feature/regressing bug #]: bug 1076983, the POODLE attack
[User impact if declined]: Increased exposure to the POODLE attack.
[Describe test coverage new/current, TBPL]: This is just telemetry. This has been on mozilla-central since 2014-12-09.

[Risks and why]: This is just telemetry. Seems like low risk. (:emk, :keeler: correct me if I'm wrong.) AFAICT, no issues related to this *telemetry* have been reported.

[String/UUID change made/needed]: none

It is important that Mozilla be prepared to land all the patches in this bug in case the POODLE stuff gets real bad. Some would argue it is already really bad. In the discussion above, we agreed to uplift the lowest-risk part (the telemetry) ahead of a possible decision to uplift the riskier parts (the actual disabling of the non-secure fallback logic). Landing this patch in -aurora and -beta ASAP will reduce risk later.
Attachment #8530480 - Flags: approval-mozilla-beta?
Attachment #8530480 - Flags: approval-mozilla-aurora?
Comment on attachment 8511941 [details] [diff] [review]
Disable insecure version fallback, v2

Approval Request Comment
[Feature/regressing bug #]: See above uplift request for the telemetry.

[User impact if declined]: Less testing of the impact of disabling non-secure version fallback. Without this testing, it will be riskier to decide whether to uplift this to -beta later. Note that :mt has already hinted at uplifting it to -release, even. If so, we need as much testing as possible.

[Describe test coverage new/current, TBPL]: The test coverage from pre-existing tests is good. This landed on -central on 2014-12-09.

[Risks and why]: Note that this uplift request is for -aurora only, not -beta. My hope is that testing on -central and -aurora over the next few days will go well, and then we'll uplift to -beta. The risk for -aurora is low, because it is so easy to undo (just revert the patch, which just flips a pref). Risk for -release is much higher, because there are potentially serious and hard-to-predict compatibility issues that can arise from this. (This bug is disabling a feature that trades security for compatibility, because it turns out the security tradeoff is much larger than we expected.) Firefox would be the first browser to turn off this compatibility feature. Although Mozilla can and should do additional compatibility testing before uplifting to -beta and -release, lots of end-user testing on -aurora and soon -beta will be needed because of compatibility issues that arise due to anti-virus software and proxies, which cannot easily be tested.

[String/UUID change made/needed]: None
Attachment #8511941 - Flags: approval-mozilla-aurora?
Comment on attachment 8535930 [details] [diff] [review]
Part 3: Clean up some bits

Approval Request Comment: See the most recent, aurora-only, comment.
Attachment #8535930 - Flags: approval-mozilla-aurora?
As you can see from the above uplift requests, my suggestion for handling this is to uplift to -aurora immediately. Concurrently with that, somebody should run the pkix-compatibility-test thing to find known-incompatible servers. The results of the pkix-compatibility-test should be triaged so that compatibility issues that Gecko can work around can be fixed, and so that Mozilla can contact server vendors about compatibility fixes. Then, a decision should be made about uplifting to -beta or, like :mt hinted at, to -release. Because the compatibility issues are due to things that pkix-compatibility-test cannot measure, as much end-user testing as possible is needed. It would be good for Mozilla to make a special public call-to-action to ask volunteers to test this.
The nightly/aurora patch did not apply cleanly to beta.
Before we get carried away with uplifts to release channels, we have to ask if this is going to fix a real problem (this in particular: http://arstechnica.com/security/2014/12/meaner-poodle-bug-that-bypasses-tls-crypto-bites-10-percent-of-websites/).  That flaw hits a large number of sites, but doesn't get fixed by having a newer TLS version.  For instance, bankofamerica.com negotiates a CBC mode with TLS 1.2.

It might be addressed if an AEAD cipher suite is selected, but I haven't seen any evidence that the affected sites are selecting GCM, even though a number of them support TLS 1.2.  Someone asked if we were willing to mark CBC as "weak" in the same way that we mark RC4, but I suspect that the net effect will be that we slow down 90% of the web for no measurable improvement in security.

(Sorry if I got too enthusiastic about this.  I thought that it might help.  After looking at the numbers, I'm fairly sure it will have a negligible impact.)
Flags: needinfo?(rlb)
(In reply to Martin Thomson [:mt] from comment #57)
> That flaw hits a large number of
> sites, but doesn't get fixed by having a newer TLS version.  For instance,
> bankofamerica.com negotiates a CBC mode with TLS 1.2.

Good point.

> It might be addressed if an AEAD cipher suite is selected, but I haven't
> seen any evidence that the affected sites are selecting GCM, even though a
> number of them support TLS 1.2.

>  Someone asked if we were willing to mark
> CBC as "weak" in the same way that we mark RC4, but I suspect that the net
> effect will be that we slow down 90% of the web to no measurable improvement
> in security.

I agree.

> (Sorry if I got too enthusiastic about this.  I thought that it might help.
> After looking at the numbers, I'm fairly sure it will have a negligible
> impact.)

It would be good to see the actual numbers and methodology used to collect them.

Also, keep in mind that Ivan's current number of 0.7% of websites (7/1,000) seems to greatly overestimate the compatibility impact of this change, because when he collected those numbers he put 0x0303 in the ClientHello record layer, but browsers (including Firefox) put 0x0301. I think he's planning on collecting new numbers, but in the meantime the Firefox telemetry numbers are a better indicator of compatibility impact. As I said above, the compatibility impact is about 1/7th the compatibility impact of disabling SSL 3.0, based on the Firefox telemetry.
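
For reference, the two version fields in play (a sketch of the RFC 5246 layout):

//   record header:  16 03 01 <length>   -- type 0x16 (handshake),
//                                          record-layer version 0x0301
//   handshake body: client_version = 03 03  -- TLS 1.2
//
// Firefox puts 0x0301 in the record-layer version and its real maximum in
// client_version; a scan that sets both fields to 0x0303 also counts
// servers that only choke on the record-layer version, overstating
// intolerance.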
Also, I think that uplifting the additional telemetry to -aurora and -beta makes sense, regardless of the final decision on whether to uplift the actual change. And, I think it makes sense to uplift the preference change to -aurora for now, to get (slightly) more testing before a decision is made.
Comment on attachment 8530480 [details] [diff] [review]
Add telemetry to measure failures due to not falling back, v3

Approving the telemetry portion for now.
Attachment #8530480 - Flags: approval-mozilla-beta?
Attachment #8530480 - Flags: approval-mozilla-beta+
Attachment #8530480 - Flags: approval-mozilla-aurora?
Attachment #8530480 - Flags: approval-mozilla-aurora+
Keywords: checkin-needed
Attachment #8535985 - Flags: checkin?
Attachment #8535986 - Flags: checkin?
Comment on attachment 8535985 [details] [diff] [review]
Telemetry patch for aurora

We have bug queries specifically for uplifts. Adding flags all over just clutters up other queries.
Attachment #8535985 - Flags: checkin?
Attachment #8535986 - Flags: checkin?
(In reply to Brian Smith (:briansmith, :bsmith, use NEEDINFO?) from comment #58)
> > (Sorry if I got too enthusiastic about this.  I thought that it might help.
> > After looking at the numbers, I'm fairly sure it will have a negligible
> > impact.)
> 
> It would be good to see the actual numbers and methodology used to collect
> them.

I'm using these numbers, which were derived from telemetry (see slide 12): https://tools.ietf.org/agenda/91/slides/slides-91-saag-3.pdf

The key information I'm looking for is the number of sites that have GCM, but don't otherwise select it.  This indicates that it's better than you'd think.  25% moved off RC4; 1.2% moved from RC4 to GCM.  That's probably the 1.2% that were concerned about BEAST and weren't able to order for <=1.0 differently.  I don't expect that we can do anything at all about the 1.2% who moved to 3DES, and I don't typically see someone picking a CBC mode over GCM if GCM is present.  I don't see any significant moves.

Let's look at what the telemetry tells us about fallback.  I'm still concerned that we aren't buying ourselves much value here.  BEAST mitigation is perhaps the best we can hope for.  Maybe that's enough.
 
> Also, keep in mind that Ivan's current number of 0.7% of websites (7/1,000)
> seems to greatly overestimate the compatibility impact of this change,
> because when he collected those numbers he put 0x0303 in the ClientHello
> record layer, but browsers (including Firefox) put 0x0301. I think he's
> planning on collecting new numbers, but in the meantime the Firefox
> telemetry numbers are a better indicator of compatibility impact. As I said
> above, the compatibility impact is about 1/7th the compatibility impact of
> disabling SSL 3.0, based on the Firefox telemetry.

Based on nightly, we hit our fallback limit pretty rarely.  I have 122k hitting the fallback (mostly due to TCP RST) against 2008k that encounter a fallback condition.  But we're only falling back now for RC4, so it's hard to tell how many of those might have succeeded if we let them progress.

Most fallbacks fall all the way through, so it's hard to get a good sense of what is going on.
Depends on: 1111354
Depends on: 1112110
Depends on: 1112399
Depends on: 1095507
(In reply to Masatoshi Kimura [:emk] from comment #41)
> www.kredodirect.com.ua is not only SSLv3-only but also TLS 1.1+ intolerant.
> If the site administrator only enables TLS 1.0 without fixing the
> intolerance, Nightly still cannot access the site.

Sadly I got it right :(
Comment on attachment 8511941 [details] [diff] [review]
Disable insecure version fallback, v2

Brian, can a user override the error page?
Flags: needinfo?(brian)
Attachment #8511941 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #8535930 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
The user can change the "security.tls.version.fallback-limit" pref value from about:config to bring back the old behavior.
(In reply to Sylvestre Ledru [:sylvestre] from comment #66)
> Comment on attachment 8511941 [details] [diff] [review]
> Disable insecure version fallback, v2
> 
> Brian, Can an user override the error page?

No.

(In reply to Masatoshi Kimura [:emk] from comment #67)
> The user can change the "security.tls.version.fallback-limit" pref value
> from about:config to bring back the old behavior.

Yes. Another (better) option would be to create a pref that is a comma-separated list of domain names for which non-secure fallback to TLS 1.0 is still done. We could even pre-populate this pref with a list of domains we know are TLS 1.2 intolerant. This would help ensure that the feature stays enabled for release.
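
A sketch of how such a pref could be consumed (the pref name and helpers here are hypothetical, not an existing Gecko API):

#include <set>
#include <sstream>
#include <string>

// e.g. pref("security.tls.fallback-whitelist",
//           "example.com,legacy.example.net");
std::set<std::string> ParseFallbackWhitelist(const std::string& prefValue) {
  std::set<std::string> hosts;
  std::istringstream stream(prefValue);
  std::string host;
  while (std::getline(stream, host, ',')) {
    if (!host.empty()) {
      hosts.insert(host);
    }
  }
  return hosts;
}

// Non-secure fallback to TLS 1.0 stays allowed only for listed hosts.
bool MayFallBackInsecurely(const std::set<std::string>& whitelist,
                           const std::string& hostName) {
  return whitelist.count(hostName) > 0;
}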
Flags: needinfo?(brian)
Depends on: 1114816
Filed bug 1114816.
We're too late in 35 to take this, wontfixing.
Depends on: 1116891
No longer depends on: 1116891
I think that I would like to see this in aurora for another release before taking this to release.  We have seen a few sites that break already, and we know that release and beta users encounter more legacy sites than on nightly and aurora.  :keeler, :emk, :briansmith, am I being too cautious here? Or could this actually risk losing us users?

:sylvestre, Providing that others agree with me, how do I ensure that this doesn't accidentally ride the trains?
Flags: needinfo?(sledru)
Flags: needinfo?(dkeeler)
Matt is working on an analysis of the impact of this change. With clear numbers, it should be easier to make a call on this for 36.
Flags: needinfo?(sledru) → needinfo?(mwobensmith)
36 is about to go to beta, right? I think it would be worthwhile to see the impact of this on the beta population (given that aurora and nightly are much smaller and different populations). If there's too much breakage, we can back it out of 36 once it's in beta.
Flags: needinfo?(dkeeler)
Depends on: 1120942
I added two more sites to bug 1116891 that appear to be TLS intolerant in some form.

In short, only about a dozen sites out of the 200k+ that were run appear to be affected by this change, if I'm not mistaken.

I did notice a small number of sites broken by our recent removal of 1024-bit signed root certs (bug 881553), but that's to be expected.

In summary, I'm not seeing a huge impact.
Flags: needinfo?(mwobensmith)
If you are willing to stand by that, I think that I'm willing to proceed with this.  If we get a catastrophic number of complaints in Beta, that might be a cue for backout, but I think that we're fine.  I should ping the other browser vendors about this.
(In reply to Martin Thomson [:mt] from comment #75)
> I should ping the other browser vendors about this.

Hi. I work on WebKitGTK+, which is used by many desktop applications and smaller browsers. I want to drop insecure TLS version fallback, but hesitate to do so because we already have enough problems with HTTPS sites that Firefox handles and we don't. If you go through with this change, then we will drop insecure fallback from libsoup, affecting both WebKitGTK+ and EFL WebKit.

This change will also be good for the web, since TLS intolerance is a chicken-and-egg problem: servers will likely be fixed much faster if a major browser like Firefox stops connecting, and other major browsers will hesitate to make the change until another one does first.
Martin, what would you consider to be an indicator of whether we ship this or not? Percent of sites affected? Specific sites affected? And - to be sure - the "connection reset" issues (bug 1116891) that only happen in Fx36 are solely due to this change, right?
(In reply to Michael Catanzaro from comment #76)
> If you go through with this change, then we
> will drop insecure fallback from libsoup, 

Thanks for the feedback. I have received positive signals from Microsoft and a "good luck with that" from Google.  Note that we will retain the preference that will re-enable this (that helps us back out the change too); and we will still likely retain the fallback code when we start deploying TLS 1.3.

(In reply to Matt Wobensmith from comment #77)
> Martin, what would you consider to be an indicator of whether we ship this
> or not? Percent of sites affected? Specific sites affected? 

Probably a combination of all of the above.  If we see a noticeable increase in connection failures in telemetry, that's a bad sign.  When we dropped SSL3, the change was hard to detect, though we might have lost somewhere in the order of 0.3%.  If this has a similar effect, then I think we're good.

> And - to be sure
> - the "connection reset" issues (bug 1116891) that only happen in Fx36 are
> solely due to this change, right?

Yes, those sites are intolerant to our handshake in various interesting ways.  Those should be addressed by making our fallback to RC4 more sensitive to different failures though (i.e., :emk's patch on that bug).

Interesting observation:  Removing fallback entirely exposes initial page loads to network glitches.  With fallback enabled, an initial page load has a higher probability of success because we get two or more attempts at connecting (assuming the server doesn't support the downgrade SCSV).  The glitch rate seems to be quite high (based on the number of spurious inappropriate_fallback alerts we're seeing).  Therefore, I suspect that it will be very hard to properly distinguish between mere network glitches and real intolerance with the telemetry we have.  That's why your testing is so valuable.
(In reply to Martin Thomson [:mt] from comment #78)

> Probably a combination of all of the above.  If we see a noticeable increase
> in connection failures in telemetry, that's a bad sign.  When we dropped
> SSL3, the change was hard to detect, though we might have lost somewhere in
> the order of 0.3%.  If this has a similar effect, then I think we're good.

I use two sets of data for testing SSL compatibility - the Pulse top sites list (220k) and the Google CT list (1.8 million). The former is more interesting as far as active, popular sites, while the latter contains a lot of outliers.

For Fx36, it appears that when using the Pulse list, we have a failure rate of less than .005% for this specific change. My findings were recorded in bugs 1116891, 1116892 and 1116893. To me, this seems very small. Affected sites are listed in those bugs.

I haven't run the entire Google CT list due to time constraints, but I did a pass yesterday with a smaller random selection of 100k sites. From that sample, I am seeing a failure rate of around .024% (24 out of 100k). I haven't dug into all errors individually, but the majority appear to be the same as bug 1116891. Another data point - many broken sites from that list appear abandoned or often unused (default IIS home page). I have the error site list but it is not conclusive since it is only a subset. If you need a full run with the Google CT list, it will take more time.

Let me know if you need more from me.
Yeah, those approvals got missed due to the other approval noise from this bug (doing things the way this bug did them makes uplift tracking a big PITA, fwiw). Anyway, you'll need to request beta approval on those patches at this point.
Flags: needinfo?(VYV03354)
Also, Part 3 depends on bug 1093724, so presumably you're going to want to request beta approval on that as well.
Indeed, next time it would be nice to have one bug for the experiment and another for the change itself.

Anyway, is it a big deal if it is not released with 36 but with 37?
:sylvestre, I don't think the delay harms us much, if at all.  This does have the potential to cause problems.
Martin, any idea then what caused the TLS intolerance regressions in Fx36, if it wasn't this change? Was it bug 861310?
Flags: needinfo?(martin.thomson)
Matt, I don't know what you tested with.  I think that the patch is on Aurora, and it only missed the uplift to Beta.  If you tested on Aurora, then this would still be the cause.  None of the regressions that I've seen are likely to be as a result of false start changes.
Flags: needinfo?(martin.thomson)
IMO if we land this on beta, we should also land some mitigations (e.g. bug 1112399 and bug 1114816).
Flags: needinfo?(VYV03354)
(In reply to Martin Thomson [:mt] from comment #86)
> Matt, I don't know what you tested with.  I think that the patch is on
> Aurora, and it only missed the uplift to Beta.  If you tested on Aurora,
> then this would still be the cause.  None of the regressions that I've seen
> are likely to be as a result of false start changes.

Matt is referring to bugs 1116891 through 1116893. And indeed, bug 1116891 and bug 1116892 happen on Firefox 36 (currently Beta, and Aurora when Matt tested).
I'd suggest that you do try to enable this in Firefox 36 beta, and make a decision whether to ship it in the release in the last beta cycle.

Reasoning:

* The non-secure fallback logic makes it easy for an active attacker to downgrade connections from the secure AEAD cipher suites to RC4 (and CBC-mode) cipher suites, because the AEAD cipher suites only work for TLS 1.2.

* There is a talk [1] scheduled for around March 24th, which is AFTER the Firefox 36 release and BEFORE the Firefox 37 release. Deferring this change until Firefox 37 means that it won't be helpful during the period of time between March 24 and the Firefox 37 release.

* I've heard rumors that another paper on attacking RC4 in TLS will be released soon--maybe even sooner than the Blackhat paper.

* :emk has already written a patch for adding a whitelist of known-TLS-intolerant sites to Firefox.

* If Matt re-runs his compatibility tests against the entire Google CT dataset, he will be able to report a fairly comprehensive list of sites that can be added to the default value of the whitelist.

* Uplifting the patch for bug 1112399 (and bug 1096197) will allow Beta channel users to report additional broken websites through the "tell Mozilla about this" feature on the SSL error page. By monitoring this feedback, Mozilla can expand the whitelist further as necessary.

* Using the Hotfix update, Mozilla can dynamically expand the set of sites in the known-TLS-intolerant whitelist without having to do a chemspill release.

* There is telemetry for measuring this feature, which can be monitored during the next three weeks in order to make a decision whether to ship this in the final release.

* The patches that would need to be uplifted are all low risk and all are easy to undo.

* Even if it is decided to be too risky to ship in Firefox 36, and the feature gets delayed until Firefox 37, Firefox 37 would still benefit from the additional 3 weeks of learning about how the feature works on Firefox 36 beta.

We had a very similar situation with TLS 1.2 around this time last year. Firefox ended up getting some notably bad press for being 2-3 weeks behind, from people who decided to make a huge deal about the fact that it didn't support TLS 1.2 (even though it practically didn't matter). I think it would be good to learn from that experience and be more preemptive here.

[1] https://www.blackhat.com/asia-15/briefings.html#bar-mitzva-attack-breaking-ssl-with-13-year-old-rc4-weakness
This seems like a change that we should list in our release notes. Can you help with the wording? Is "Disabled insecure TLS version fallback" clear enough? Is this specific to a version of TLS or is there any other way that we should further qualify this change?
relnote-firefox: --- → ?
Flags: needinfo?(VYV03354)
RC4 has nothing to do with TLS version fallback though.
(In reply to Yuhong Bao from comment #91)
> RC4 has nothing to do with TLS version fallback though.

Yes it does, because many websites will use RC4 when the browser uses a TLS 1.1 or TLS 1.0 ClientHello, but will use AES-GCM when the browser uses a TLS 1.2 ClientHello.
(In reply to Lawrence Mandel [:lmandel] (use needinfo) from comment #90)
> This seems like a change that we should list in our release notes. Can you
> help with the wording? Is "Disabled insecure TLS version fallback" clear
> enough? Is this specific to a version of TLS or is there any other way that
> we should further qualify this change?

Brian could give much better wording than me.
Flags: needinfo?(VYV03354)
I think that :lmandel's suggestion is fine.
(In reply to Brian Smith (:briansmith, :bsmith, use NEEDINFO?) from comment #89)
> I'd suggest that you do try to enable this in Firefox 36 beta, and make a
> decision whether to ship it in the release in the last beta cycle.
Thanks for the long and impressive rationale :)
I would like to see the results of Matt's tests before making a call on this.
I've run site compatibility testing for Fx36.0b2, changing the pref "security.tls.version.fallback-limit" from its default of 1 to 3. I ran the Pulse top sites list of ~211k sites and diffed the results.

With the fallback limit set to 3, we see 561 uniquely broken sites, or approximately 0.27% rounded up. I'm attaching the list of broken sites to this bug. Of the broken sites:

- About half (284) have "NetworkInterruptError" with error string "pr_end_of_file_error". I've examined half a dozen of these sites and all have TLS version intolerance, so I'm guessing that might be the primary reason.

- The rest - including errors labeled "undefined" - appear to be a grab-bag of issues that are caused when the server is forced to downgrade its TLS version, such as attempting to use SSLv3.  

That's the quantitative part. The qualitative part is that - while many sites on this list would be kindly labeled "inactive" - there are also some high-visibility sites such as "products.geotrust.com" and three properties for "bankofthewest.com".

If you need me to run this test against Google's CT list - 10x the domains - it will be at least another week of work, as I'm out of the office part of this week and couldn't begin until 2015-02-04. In the meantime, I hope the above gives you some insight into feasibility for inclusion into Fx36.

Lastly, please note that previous test results in comment 79 were erroneously thought to be testing this change, but were in fact due to bug 1088915. I've factored those failures out of this specific testing.
Adding Rick Andrews as products.geotrust.com was mentioned in comment 96.
Depends on: TLS-Intolerance
(In reply to Matt Wobensmith from comment #96)
> The qualitative part is that - while many
> sites on this list would be kindly labeled "inactive" - there are also some
> high-visibility sites such as "products.geotrust.com" and three properties
> for "bankofthewest.com".

If there is a small set of high-visibility sites like this, then filing a dozen or so TE bugs for them now might be a good idea.

I just created bug 1126620 to start to organize this. Users will undoubtedly file more.
Comment #96 summary:
0x804b0014 / NetworkError / pr_connect_reset_error: 1
0x804b0047 / NetworkInterruptError / pr_end_of_file_error: 282
0x804b0014 / NetworkError / pr_end_of_file_error: 2
0x805a2ff0 / SecurityError / ssl_error_bad_mac_alert: 5
0x805a2ff1 / SecurityError / ssl_error_bad_mac_read: 47
0x805a2fc3 / SecurityError / ssl_error_handshake_failure_alert: 1
0x805a2fc5 / SecurityError / ssl_error_handshake_unexpected_alert: 3
0x805a2ffe / SecurityError / ssl_error_no_cypher_overlap: 137
0x804b0014 / NetworkError / ssl_error_no_cypher_overlap: 1
0x805a2f9e / SecurityError / ssl_error_protocol_version_alert: 26
0x8000ffff / undefined / undefined: 56
I also manually checked "undefined" sites:
	pr_end_of_file_error: 37
	ssl_error_bad_mac_read: 3
	ssl_error_no_cypher_overlap: 15
	(Loading forever): 1
(In reply to Matt Wobensmith from comment #96)
> With the fallback limit set to 3, we see 561 uniquely broken sites, or
> approximately 0.27% rounded up. I'm attaching the list of broken sites to
> this bug.
561 websites is a big number - too big this close to the release date (less than 3 weeks away).
I think we should do this in 37 (with the agreement of the 37 driver, Lawrence).
I think it is more like 5 weeks than 3 weeks.
(In reply to Yuhong Bao from comment #101)
> I think it is more like 5 weeks than 3 weeks.

If you're referring to the Firefox 36 release date, the release is scheduled for Feb 24, which is 26 days from now.
Removing evang bugs because bug 1126620 will track them.
Depends on: 1128227
No longer depends on: 1095507, 1111354, 1112110, 1115883, 1126652, 1126654
About products.geotrust.com, here's the server configuration: Weblogic 10.3, using the Certicom SSL stack, with only TLS1 enabled and no support for TLS1.1/1.2. 

Is there something unique about that configuration that causes your logic to fail?
It is more likely a bug in the server software than in the configuration, but thanks for naming the server software affected.
Are you asking us to take any particular action?
See if you can obtain a fix from the vendor.
It's not clear to me from reading this thread exactly what "fix" I would ask for. Can you explain the bug as you see it?
The server should respond with a ServerHello carrying the TLS 1.0 (SSL 3.1) version in response to a ClientHello advertising any later version (such as TLS 1.2 or TLS 1.3).
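
A sketch of that negotiation rule (simplified; real servers also enforce a minimum version):

#include <algorithm>
#include <cstdint>

const uint16_t TLS_1_0 = 0x0301; // "SSL 3.1"

// The ServerHello version is the highest version the server supports that
// does not exceed the ClientHello version; an unknown newer client version
// is not a reason to abort the handshake.
uint16_t NegotiateVersion(uint16_t clientHelloVersion, uint16_t serverMax) {
  return std::min(clientHelloVersion, serverMax);
}

// e.g. a TLS-1.0-only server receiving a TLS 1.2 (0x0303) ClientHello:
// NegotiateVersion(0x0303, TLS_1_0) == 0x0301 -- not a handshake failure.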
(In reply to Rick Andrews from comment #108)

The total fix, however, is to upgrade your SSL/TLS stack to a version that supports TLS 1.2 (preferably also with forward secrecy and AES-GCM). TLS 1.0 will eventually be phased out as well, and TLS 1.2 provides far better security. It would be best for everyone to start upgrading long before vendors have to do another panic disable for TLS 1.0 like the one needed for SSL 3. (Of course, nobody knows what the timetables are here, but being complacent is not good.) TLS clients that support 1.2 would generally be expected to handle fallbacks correctly.

Qualys SSL Labs server test is the best way to check for TLS version intolerance at the moment:
https://www.ssllabs.com/ssltest/analyze.html?d=products.geotrust.com

The versions it breaks with due to intolerance are listed in the "Protocol Details" section. To not break with current standards, it can't be 1.1 or 1.2 intolerant (or extension intolerant, which is not a problem here). Your server is also intolerant to 1.3, so when the new version comes out you'll have issues there as well.

Further discussion on this topic should be done elsewhere, as this bug is resolved. New TLS intolerance TE bugs can be opened as blocking bug 1126620.
We're already working on upgrading the site, so I'm not going to file any bugs with any vendor.
Depends on: 1129925
I have completed a pass on Fx36 using the Google Certificate Transparency list of ~2.6 million sites, with the same parameters as what I used in comment 96. I am attaching a list of broken sites to this bug.

In summary:

* 7647 uniquely broken sites out of 2678142 sites total, or roughly 0.286% breakage.
* Most errors look like what we see with the previous run against the Pulse list - issues when downgrading to an unsupported TLS protocol or TLS intolerance.
* IMPORTANT: I have not individually verified all ~7k broken sites as legitimately broken. This is too time-consuming to do right now. This testing is imperfect and has noise - especially with a data set this large - so please allow for a small degree of error. However, this still helps us understand the scope of the changes we wish to make.

The results of running this list seem pretty close to the previous run - whose results were meticulously confirmed - so I'm reasonably confident that we know how invasive this change will be, should we choose to do it.
relnoted as "Disabled insecure TLS version fallback for site security"
Looks like api.geotrust.com and products.geotrust.com were fixed, but the following subdomains are still broken:
canadaca.geotrust.com
services.geotrust.com
smarticon.geotrust.com
Keywords: site-compat
Folks - while I applaud the technical decision here - there needs to be better communication to end users about this change.  Forums are absolutely littered with discussion on this topic.  End users don't know or care why their web sites are no longer working; they just want them to work.

We need a detailed article discussing this change and a PROPER mitigation strategy should a user encounter a website that no longer works.  Stating "contact the website owner" is not acceptable.  Users will dump Firefox and switch to another web browser that "just works" with the given website.