Closed Bug 944465 Opened 11 years ago Closed 11 years ago

Firefox 26.0b4/b5/b6/b7/b8 cannot connect on www.mymedicare.gov

Categories

(Core :: Security: PSM, defect, P1)

26 Branch
defect

Tracking

()

RESOLVED WORKSFORME
mozilla28
Tracking Status
firefox25 --- unaffected
firefox26 + fixed
firefox27 - affected
firefox28 - affected
b2g-v1.2 --- fixed

People

(Reporter: rvjanc, Assigned: briansmith)

References

Details

(Keywords: regression, sec-other)

Attachments

(1 file)

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:26.0) Gecko/20100101 Firefox/26.0 (Beta/Release)
Build ID: 20131107161719

Steps to reproduce:

Went to https://www.mymedicare.gov/ using FF 26.0b4/b5/b6/b7/b8 and Aurora 27.0a2 (2013-11-28); got this message:

An error occurred during a connection to www.mymedicare.gov. SSL received a record with an incorrect Message Authentication Code. (Error code: ssl_error_bad_mac_read)


Actual results:

See above. 

Note this works on FF26.0b3.


Expected results:

Should connect.
Brian, this sounds like your turf.
Flags: needinfo?(brian)
Keywords: regression
Regression window(m-c)
Good:
http://hg.mozilla.org/mozilla-central/rev/35b73bb96ca0
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:28.0) Gecko/20100101 Firefox/28.0 ID:20131029074058
Bad:
http://hg.mozilla.org/mozilla-central/rev/829d7bef8b0a
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:28.0) Gecko/20100101 Firefox/28.0 ID:20131029192237
Pushlog:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=35b73bb96ca0&tochange=829d7bef8b0a


Regression window(s-c)
Good:
http://hg.mozilla.org/services/services-central/rev/cdfe0db3e762
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:28.0) Gecko/20100101 Firefox/28.0 ID:20131029150117
Bad:
http://hg.mozilla.org/services/services-central/rev/829d7bef8b0a
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:28.0) Gecko/20100101 Firefox/28.0 ID:20131029192138
Pushlog:
http://hg.mozilla.org/services/services-central/pushloghtml?fromchange=cdfe0db3e762&tochange=829d7bef8b0a



Regression window(b2g-i)
Good:
http://hg.mozilla.org/integration/b2g-inbound/rev/568f78e76079
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:27.0) Gecko/20100101 Firefox/27.0 ID:20131029092212
Bad:
http://hg.mozilla.org/integration/b2g-inbound/rev/a88f75f73769
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:28.0) Gecko/20100101 Firefox/28.0 ID:20131029093633
Pushlog:
http://hg.mozilla.org/integration/b2g-inbound/pushloghtml?fromchange=568f78e76079&tochange=a88f75f73769
Triggered by Bug 733647
FYI
setting security.tls.version.max = 1 fixes this
Blocks: 733647
Status: UNCONFIRMED → NEW
Ever confirmed: true
Not only Firefox, but also IE 11 and Opera Presto 12.16 fail to connect to this site when either TLS 1.1 or 1.2 is enabled.

Firefox 25.0.1
Beta 26.0b8
Aurora 27.0a2 (ID:20131128004001 CSet: 3cadac0a4a78)
Nightly 28.0a1 (ID:20131128030201 CSet: 3c44c1f43a67)
IE 11
Opera 12.16
(tested on Win 7, TLS 1.0 and SSL 3.0 are always enabled)
TLS 1.2 enabled,  TLS 1.1 enabled : bad  (default of Firefox 27a2, 28.0a1, IE 11)
TLS 1.2 enabled,  TLS 1.1 disabled: bad
TLS 1.2 disabled, TLS 1.1 enabled : bad  (default of Firefox 26b8)
TLS 1.2 disabled, TLS 1.1 disabled: good (default of Firefox 25.0.1, Opera 12.16)

Chrome 31.0.1650.57 and Opera Blink 18.0.1284.49 (both have TLS 1.1 and 1.2 enabled by default) can connect to this site; however, the connections use TLS 1.0 even though the server says it supports TLS 1.2.
SSL server test: https://www.ssllabs.com/ssltest/analyze.html?d=mymedicare.gov

It seems to be a server-side problem according to these results.
Component: Untriaged → Security: PSM
OS: Mac OS X → All
Product: Firefox → Core
Hardware: x86 → All
Flags: needinfo?(brian)
See Also: → 442983
Flags: needinfo?(brian)
Google Chrome says: "The connection had to be retried using an older version of the TLS or SSL protocol. This typically means that the server is using very old software and may have other security issues."

So, Chrome is doing TLS intolerance fallback for this issue. However, IE11 and Firefox are not.

I looked at the code for Gecko's version intolerance fallback, and SSL_ERROR_BAD_MAC_READ is already on our list of errors that we fall back for. So, it seems like Gecko's version intolerance fallback logic is not working in all situations that we expect it to work in.
Assignee: nobody → brian
Priority: -- → P1
Target Milestone: --- → mozilla28
Group: crypto-core-security, core-security
Flags: needinfo?(brian)
I think this is a bug in Blink (Chrome, Opera), and that Firefox and IE11 are working correctly. The other possibility is that we are not correctly verifying the record MAC in libssl for the Finished message, which would be a serious bug in libssl.

It seems like this site is broken in a bad way that we've never encountered before. The server negotiates TLS 1.2 or TLS 1.1, depending on the maximum version that the client supports. The initial handshake completes, including the verification of the server's Finished message. Then we send some application data (the HTTP request). When we receive the response to the HTTP request, we get the SSL_ERROR_BAD_MAC_READ error.

In Gecko, we have had logic to do TLS intolerance fallback on SSL_ERROR_BAD_MAC_READ for years. However, we have always restricted our TLS intolerance fallback to conditions that happen BEFORE we complete the handshake. That is, we basically only fall back on SSL_ERROR_BAD_MAC_READ if the verification of the MAC of the Finished message fails.

In this case, the verification of the Finished message succeeds but the verification of the next encrypted record (the first application_data record) fails. Because this happens after the handshake completes, Firefox will not execute its version intolerance fallback logic. However, Chrome does implement its version intolerance fallback logic. Presumably, this means that Chrome will re-send the HTTP request. However, from the trace, it looks like the server has already processed the HTTP request and has sent back a response. So, Chrome's behavior is only safe if the HTTP request is idempotent.
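The difference between the two policies described above can be sketched roughly as follows. This is only an illustration of the reasoning, not Gecko or Chromium code: `ConnectionState`, `handshake_only_fallback`, and `lax_fallback_is_safe` are hypothetical names, and the error set contains only the one error named in this bug.

```python
from dataclasses import dataclass

# Hypothetical set of errors that trigger version fallback; only
# SSL_ERROR_BAD_MAC_READ is taken from this bug, real lists are longer.
FALLBACK_ERRORS = {"SSL_ERROR_BAD_MAC_READ"}


@dataclass
class ConnectionState:
    handshake_complete: bool   # server's Finished message was verified
    request_sent: bool         # application data already went out
    request_idempotent: bool   # e.g. GET/HEAD rather than POST


def handshake_only_fallback(error: str, state: ConnectionState) -> bool:
    """Strict policy: retry with a lower version only if the error
    happened before the handshake completed."""
    return error in FALLBACK_ERRORS and not state.handshake_complete


def lax_fallback_is_safe(error: str, state: ConnectionState) -> bool:
    """Laxer policy: also retry after the handshake, which is only safe
    when no request was sent or the request can be repeated harmlessly."""
    if error not in FALLBACK_ERRORS:
        return False
    if not state.handshake_complete:
        return True
    return (not state.request_sent) or state.request_idempotent


# The situation in this bug: handshake done, request sent, MAC failure
# on the first application_data record of the response.
state = ConnectionState(handshake_complete=True, request_sent=True,
                        request_idempotent=False)
print(handshake_only_fallback("SSL_ERROR_BAD_MAC_READ", state))
print(lax_fallback_is_safe("SSL_ERROR_BAD_MAC_READ", state))
```

Under the strict policy the connection simply fails, which is what Firefox does here. Under the laxer policy a retry would re-send a request the server has already processed, hence the idempotency concern.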

My current inclination is that Blink should be changed so that it limits its fallback logic the way Firefox and presumably IE11 do (to the handshake phase only). Otherwise, Firefox and IE11 are likely going to get pushed to make our TLS intolerance fallback logic even more lax.

There is a (remote) possibility that this is a bug in libssl, where we do not verify the MAC of the server's Finished message correctly (or at all?). This would explain why we get the SSL_ERROR_BAD_MAC_READ error on the second encrypted record instead of the first.

This seems to be a high-profile site. The official Medicare site is https://www.medicare.gov. That page has a login button "MyMedicare.gov login" that takes the user to https://www.mymedicare.gov/, which triggers this error.

For the Firefox 26 release, I guess the safest course of action may be to disable TLS 1.1 support, though that would make me sad.

I will investigate this more on Monday.
Attachment #8340557 - Attachment description: mymedicare.gov.pcapng → mymedicare.gov.pcapng - Wireshark capture of *Chrome* (not Firefox). Firefox is similar, but without the TLS intolerance fallback
FWIW - Most users of this site go directly to https://www.mymedicare.gov/ via a bookmark.
This definitely looks like a major user-impacting bug if we ship with TLS 1.1 support enabled. Please prepare a disabling patch for our final beta and release candidate builds ASAP, since builds are starting today.
Flags: needinfo?(brian)
Well since we are not the only ones with this bug, and we really want to ship TLS 1.1 (before 1.2 comes in the next Release), how about either taking the risk as-is, or implement a (bad) single-case exception for this site in this Release only?

Since this only concerns flipping a pref, if the fallout is bigger than necessary, the option can be disabled through a maintenance/hotfix addon (if needed).
(In reply to Florian Bender from comment #9)
> Well since we are not the only ones with this bug, and we really want to
> ship TLS 1.1 (before 1.2 comes in the next Release), how about either taking
> the risk as-is, or implement a (bad) single-case exception for this site in
> this Release only?

I share your thinking, Florian. But, I have to admit the negative end-user impact of disabling TLS 1.1 in Firefox 26 is not that big of a deal.

> Since this only concerns flipping a pref, if the fallout is bigger than
> necessary, the option can be disabled through a maintenance/hotfix addon (if
> needed).

Let's pick our battles. When the TLS 1.1 enabling was uplifted to Firefox 26, I said "If we notice any problems then we'll pull TLS 1.1 off of m-b immediately." Although it is encouraging that we have not received reports of other websites being broken in this way, we have to admit we really don't know what other websites have this issue. I'll be working with Matt Wobensmith and others on more wide-ranging compatibility testing regarding the enabling of TLS 1.2/1.1 that should be done while Firefox 27 is in beta, so that Firefox 27 doesn't run into similar last-minute issues.
Thank you, Brian. This is better than putting out a hotfix *after* an outcry from users. Let's take advantage of the extra time in FF27 Beta to get through more compatibility testing. Matt, do you already have a list of top sites you typically test against, or shall we go with the Alexa top 1000 like with MCB?
Flags: needinfo?(mwobensmith)
QA Contact: mwobensmith
http://www.alexa.com/siteinfo/mymedicare.gov

mymedicare.gov is ranked 98,858 on Alexa. We'll have to come up with another mechanism for finding these compatibility issues for sites that are not ultra-popular. Matt's already contacted me about this by email last week. I will follow up with him privately after we've created a concrete plan.
What Brian said.

Going forward, we will increase our compatibility test coverage of all SSL-related features.  It's a high priority.
Clear needinfo.
Flags: needinfo?(mwobensmith)
Blocks: 943259
No longer blocks: 943259
See Also: → 943259
Disabling TLS 1.1 by default on Firefox 26 by this bug will give only strong negative impact to non-US users even though there is no report about compatibility problems on other websites.
1. The Alexa rank of mymedicare.gov is low (almost 100,000).
2. mymedicare.gov is completely irrelevant to non-US users (and there are far more non-US users than US users).
3. Firefox will keep the title "only major browser which does not support either TLS 1.1 or 1.2 by default" for an additional 6 weeks.
(In reply to Kosuke Kaizuka from comment #17)
> Disabling TLS 1.1 by default on Firefox 26 by this bug will give only strong
> negative impact to non-US users even though there is no report about
> compatibility problems on other websites.

I understand your concerns. FWIW, some of the delays in enabling TLS 1.1 and TLS 1.2 were due to the need to implement workarounds for non-US sites, including even Japanese sites. If our TLS 1.1 support caused the Japanese NHI website to break similarly, we'd likely do the same thing. Postponing TLS 1.1 until Firefox 27 gives us two months to work with the website and/or the Chromium team to get to a better state. When we break major websites, especially when those websites work in other browsers, then it creates big problems for us. For example, many people try to fix the problem themselves by permanently disabling TLS 1.2/1.1 support. See http://answers.microsoft.com/en-us/ie/forum/ie11-iewindows8_1/cannot-reach-wwwmymedicaregov-win-81ie11/38f01303-3937-437a-9185-07c38c33b410 to see how people find short-term solutions that cause long-term problems. We already had a similar experience last month where a similar issue in Canada resulted in the ISP telling users to switch to Chrome and/or disable TLS 1.0 support in Firefox. Now we have an unknown number of users in Canada that can't do anything but SSL 3.0.

https://www.ssllabs.com/ssltest/analyze.html?d=mymedicare.gov

Interestingly, the ssllabs test for this website gives it an A rating and says that it correctly implements TLS 1.2 and that other browsers will successfully negotiate TLS 1.2 with it. However, I just re-verified that Chrome 31 does TLS intolerance fallback and that IE11 on Windows 7 cannot load it at all. I CC'd Ivan on this bug so he is aware of this. I suspect that ssllabs is only measuring the connection up to the verification of the server's Finished message, and doesn't measure how the server responds to an actual request. Since the brokenness only manifests when the server attempts to respond to a request, ssllabs doesn't notice it, I guess.
(In reply to Brian Smith (:briansmith, was :bsmith; Please NEEDINFO? me if you want a response) from comment #18)
> We already
> had a similar experience last month where a similar issue in Canada resulted
> in the ISP telling users to switch to Chrome and/or disable TLS 1.0 support
> in Firefox. Now we have an unknown number of users in Canada that can't do
> anything but SSL 3.0.

Is there any bug to track this?
(In reply to Masatoshi Kimura [:emk] from comment #19)
> Is there any bug to track this?

It was bug 935394.
Assuming mymedicare.gov fixes this issue within the next few (<= 4) weeks, can we try to reenable TLS 1.1 via the hotfix addon? That way we can get more testing before TLS 1.2 hits release 27. Can the hotfix addon be throttled in rollout? That way we can reduce the impact of enabling it while getting significantly more data.
Regarding the SSL Labs results: most tests do not even try to complete the handshake; they just trust that the server will do what it says it wants to do (e.g., in ServerHello). This was necessary when I started in 2009, when it was difficult to find a TLS 1.2 stack, for example.

Toward the end of the assessment I do submit one full HTTP request (using OpenSSL), and in this case, it fails. (You can see "HTTP status code: Request failed" at the bottom of the page.) The failure is not prominently featured because too many servers fail at that point. I used to randomly check such servers with browsers, and the failures were almost always real. But I couldn't explain them. I suspect those are hostnames that are not really configured to operate as SSL servers.
It seems mymedicare.gov was fixed yesterday or today. Chrome, IE, and Firefox are all successfully loading the site and all are using TLS 1.2 (no TLS intolerance fallback).

(In reply to Ivan Ristic from comment #22)
> Regarding the SSL Labs results: most tests do not even try to complete the
> handshake; they just trust that the server will do what it says it wants to
> do (e.g., in ServerHello). This was necessary when I started in 2009, when
> it was difficult to find a TLS 1.2 stack, for example.

I see. FYI, besides this relatively unusual type of TLS intolerance, a more common form of TLS intolerance can only be detected when the verification of the MAC of the server's Finished message fails. So, I recommend trying to do a full handshake in your tests when it is practical. I can see why it may be less practical to send/receive an HTTP request. Thanks for sharing.
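A probe along the lines discussed above (complete the handshake, then send one real request) might look like the following minimal sketch using Python's standard `ssl` module. It is not how SSL Labs works; `probe` and `classify` are hypothetical names, and the sketch assumes a failure while reading the response (such as a bad record MAC) surfaces as `ssl.SSLError` or `OSError`.

```python
import socket
import ssl


def classify(handshake_ok: bool, response_ok: bool) -> str:
    """Map the two observations onto the failure modes from this bug."""
    if not handshake_ok:
        return "handshake failure"
    if not response_ok:
        return "post-handshake failure"
    return "ok"


def probe(host: str, port: int = 443, timeout: float = 10.0) -> str:
    """Complete a full TLS handshake, then send one HTTP request and
    check that a response can actually be read back."""
    context = ssl.create_default_context()
    handshake_ok = False
    response_ok = False
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            # wrap_socket performs the full handshake before returning.
            with context.wrap_socket(raw, server_hostname=host) as tls:
                handshake_ok = True
                request = (f"HEAD / HTTP/1.1\r\nHost: {host}\r\n"
                           "Connection: close\r\n\r\n")
                tls.sendall(request.encode("ascii"))
                # A record-level error on the response raises here.
                response_ok = tls.recv(4096).startswith(b"HTTP/")
    except (ssl.SSLError, OSError):
        pass
    return classify(handshake_ok, response_ok)
```

Calling `probe("www.mymedicare.gov")` against a server behaving as described in this bug would be expected to report a post-handshake failure, while the more common form of intolerance would show up as a handshake failure.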
(In reply to Brian Smith (:briansmith, was :bsmith; Please NEEDINFO? me if you want a response) from comment #23)
> It seems mymedicare.gov was fixed yesterday or today. Chrome, IE, and
> Firefox are all successfully loading the site and all are using TLS 1.2 (no
> TLS intolerance fallback).
> 

I can verify that using FF26b8
(In reply to Brian Smith (:briansmith, was :bsmith; Please NEEDINFO? me if you want a response) from comment #23)
> It seems mymedicare.gov was fixed yesterday or today. Chrome, IE, and
> Firefox are all successfully loading the site and all are using TLS 1.2 (no
> TLS intolerance fallback).
> 

I have confirmed this using both Firefox 26.0b8 and 26.0b10 with any combination of security.tls.version.(max|min), including (max = 3, min = 3: only TLS 1.2 enabled) and (max = 2, min = 2: only TLS 1.1 enabled).
Chrome 31, IE 11, Opera 18 and Opera 12.16 can also connect using TLS 1.1/1.2 without any fallback.
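For readers unfamiliar with the pref values used above, the mapping can be sketched as a small lookup. The values 2 and 3 come from the comment above and 1 follows from the earlier workaround of setting security.tls.version.max = 1; the value 0 for SSL 3.0 is my understanding of Firefox builds of this era and should be treated as an assumption. `enabled_versions` is a hypothetical helper, not a Firefox API.

```python
# Numeric values of security.tls.version.min / .max and the protocol
# version each one selects (0 = SSL 3.0 is an assumption, see above).
TLS_PREF_VALUES = {
    0: "SSL 3.0",
    1: "TLS 1.0",
    2: "TLS 1.1",
    3: "TLS 1.2",
}


def enabled_versions(min_pref: int, max_pref: int) -> list[str]:
    """Return the protocol versions enabled by a min/max pref pair."""
    if min_pref > max_pref:
        raise ValueError("security.tls.version.min exceeds .max")
    return [TLS_PREF_VALUES[v] for v in range(min_pref, max_pref + 1)]


print(enabled_versions(3, 3))  # only TLS 1.2
print(enabled_versions(2, 2))  # only TLS 1.1
print(enabled_versions(0, 1))  # the workaround: nothing above TLS 1.0
```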
Brian, do you still think this is a security issue that needs to be hidden?  Thanks.
Flags: needinfo?(brian)
Wan-Teh, please see my description of what I think is a bug in Chrome, which is one reason this bug is hidden in security-group. Do you think that comment 6 merits this bug remaining hidden?
Flags: needinfo?(brian)
Flags: needinfo?(wtc)
Brian: thank you for the heads-up. I inspected your Wireshark packet capture.
I have tracked down the Chromium source code (it's not in Blink) that allows
TLS version fallback on SSL protocol errors that occur after the handshake
is completed. Please keep this bug hidden for now while I investigate it.
Flags: needinfo?(wtc)
Keywords: sec-other
Whiteboard: [leave open] → [leave open][keep hidden for Chrome investigation]
Brian: I think we can close this bug as WONTFIX or WORKSFORME now.

Please keep the bug hidden until 2014-02-28 for Chrome. Thanks.
Flags: needinfo?(brian)
Whiteboard: [leave open][keep hidden for Chrome investigation] → [keep hidden until 2014-02-28 for Chrome]
Status: NEW → RESOLVED
Closed: 11 years ago
Flags: needinfo?(brian)
Resolution: --- → WORKSFORME
Group: crypto-core-security, core-security
Whiteboard: [keep hidden until 2014-02-28 for Chrome]