Closed Bug 935394 Opened 6 years ago Closed 6 years ago

Shaw.ca customers cannot use Facebook or Google TLS due to ssl_error_rx_record_too_long

Categories

(Tech Evangelism Graveyard :: English Other, defect, critical)


Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: MattN, Unassigned)

Attachments

(4 files)

Shaw customers in parts of British Columbia, Canada have been unable to access secure sites such as https://www.google.ca/ and https://www.facebook.com/ in Firefox for many hours. Some people have found that setting security.tls.version.max to 0 fixes the problem, but my understanding is that this leaves them less secure.

I'm wondering if something like bug 839310 would have helped here. It would be good to know the cause of the issue, which may mean contacting Shaw technicians.

References:
* https://support.mozilla.org/en-US/questions/976504
* https://community.shaw.ca/message/51905
* https://twitter.com/Shawhelp/status/397925373058748417
Flags: needinfo?(brian)
This is output from openssl from someone supposedly still experiencing the problem.
I am in Vancouver right now. Apparently their headquarters are just a few blocks from where I am staying. I talked to their tech support and was told that I would be able to meet with their tech people if I show up at Shaw Tower tomorrow. So, that is what I intend to do.
Flags: needinfo?(brian)
See https://support.mozilla.org/en-US/questions/976504?page=4#answer-498096:

"... http://pastebin.com/txYtt6aZ ... Somehow the google server is sending back html inside a TLSv1 Server Hello message, which is not supposed to happen."

I suspect it may be shaw.ca's captive portal acting badly. I just talked to some local people who are shaw.ca customers. Apparently this network has bandwidth caps.
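For anyone wondering how an HTML reply turns into ssl_error_rx_record_too_long: a TLS record starts with a 5-byte header (1-byte content type, 2-byte version, 2-byte length). If the client reads the ASCII bytes "HTTP/" as that header, the length field comes out as 0x502F = 20527, which is above the 2^14 + 2048 = 18432-byte limit that TLS 1.0 (RFC 2246) places on a record. The sketch below only illustrates that arithmetic; the function names are made up for this example and this is not NSS's actual code.

```python
import struct

# Largest ciphertext fragment a TLS 1.0 record may carry: 2^14 + 2048 bytes.
MAX_TLS_RECORD_LEN = 2**14 + 2048


def parse_record_header(data: bytes):
    """Interpret the first 5 bytes of `data` as a TLS record header.

    Returns (content_type, version, length). Hypothetical helper for
    illustration only.
    """
    if len(data) < 5:
        raise ValueError("need at least 5 bytes for a record header")
    # Network byte order: 1-byte type, 2-byte version, 2-byte length.
    return struct.unpack("!BHH", data[:5])


def record_too_long(data: bytes) -> bool:
    """True if the claimed record length exceeds the TLS 1.0 limit."""
    _, _, length = parse_record_header(data)
    return length > MAX_TLS_RECORD_LEN


# A plaintext HTTP reply misread as a TLS record.
plaintext = b"HTTP/1.1 404 Not Found\r\n"
content_type, version, length = parse_record_header(plaintext)
print(hex(content_type), hex(version), length, record_too_long(plaintext))
```

The same parse on a genuine handshake record header such as `b"\x16\x03\x01\x00\x51"` gives content type 22 (handshake), version 0x0301 (TLS 1.0) and a length of 81, which is well within the limit.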
I'd be more interested in knowing why this is happening to ONLY Firefox users. If this was just an error on the ISP's network, you'd expect Google & Facebook to fail in ALL browsers.

Is this a good thing because our security is too strict? Or are we clearly doing something wrong? I also hope someone can offer a detailed technical account of what's going on here.
Also, more specifically, what are the security implications of setting security.tls.version.max to 0?

We're going to have a few hundred or thousand people changing this pref since word has spread and some curious minds are inquiring "Should I be concerned that I'm forcing SSL 3.0 over TLS 1.0?". People want to know. I would assume yes since I know these settings were pulled from the UI due to Limi's "Checkboxes that Kill" study.

Will other sites break if this pref is set to 0? If yes, then we'll have to push a hotfix to reset the security.tls.version prefs once this episode has passed. As we all know, people are not going to remember to change security.tls.version.max (1 by default) & security.tls.version.min (0 by default) back to their defaults.
If any Mozilla people need access to a system on the affected Shaw network, please let me know (by email) and I can arrange to give you full access to a Windows VM.
A tcpdump of a failing connection would be very interesting. The OpenSSL trace from #3 shows a normal connection. (Except for the huge certificate, which is caused by not passing -servername www.google.ca on the command line to send an SNI extension.)

It sounds very much like the ISP is doing something very stupid but without more details I can't say. Hopefully bsmith can get those details from them. I've not heard reports from Chrome users yet and it's unclear why that is since we're both using NSS.
Also affected by this issue.  If there is any help that I can provide in resolving this issue, please ping me.  NI me here, @cyee on IRC or email.
(In reply to Casey Yee [:cyee] from comment #10)
> Also affected by this issue.  If there is any help that I can provide in
> resolving this issue, please ping me.  NI me here, @cyee on IRC or email.

A tcpdump of the failing connection? (i.e. tcpdump -i eth0 -s 9999 -w /tmp/dump, view /tmp/dump in Wireshark to make sure that there's nothing unexpected in there and attach to the bug.)
Adam, just sent you a dump to look at by email.  I have no idea what I'm looking for/at so if this isn't what you are looking for just direct me on what exactly I should be capturing.
I received a tcpdump from :cyee.

In frame 34 (for those following along) is a ClientHello for a handshake to twitter.com which completes successfully.

In frame 97 is a ClientHello for www.google.com. This is followed by some very odd TCP packets. The SYN-ACK packet and the immediate ACK of the ClientHello have an IP TTL of 43 and 44. Then we find packets from the `server' with a TTL of 63 which are ACKing a byte that the client didn't send, and are replying to the ClientHello with a couple of copies of the following:

HTTP/1.1 404 Not Found
Connection: close
Content-Type: text/html; charset=utf-8

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>404 Not Found</title>
</head><body>
<h1>Not Found</h1>
<p>The requested URL was not found.</p>
<hr>
<address></address>
</body></html>

The IP address that the client has is a valid Google IP address, but, if this is affecting many users of the same ISP, it would appear to be misbehaviour on the part of the ISP.
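The TTL observation above can be checked mechanically: packets that claim the same source address but arrive with markedly different IP TTLs probably started at different hop distances, which is what an injecting middlebox would look like. A minimal sketch, assuming the capture has already been reduced to (source address, TTL) pairs. The helper name, the spread threshold and the documentation-range addresses are all assumptions for illustration; only the TTL values 43, 44 and 63 come from the capture described above.

```python
from collections import defaultdict


def inconsistent_ttls(packets, max_spread=2):
    """Return {src: sorted TTLs} for sources whose TTLs vary by more than max_spread.

    `packets` is an iterable of (src_address, ttl) pairs. Hypothetical
    helper; a small spread is tolerated because routes can shift slightly.
    """
    seen = defaultdict(set)
    for src, ttl in packets:
        seen[src].add(ttl)
    return {
        src: sorted(ttls)
        for src, ttls in seen.items()
        if max(ttls) - min(ttls) > max_spread
    }


# Placeholder addresses from the documentation ranges, not the real ones.
sample = [
    ("203.0.113.10", 43),  # SYN-ACK
    ("203.0.113.10", 44),  # ACK of the ClientHello
    ("203.0.113.10", 63),  # plaintext 404 "reply"
    ("198.51.100.7", 52),
    ("198.51.100.7", 52),
]
print(inconsistent_ttls(sample))
```

Only the first address is flagged here, since its TTLs span 43 to 63 while the second address is consistent.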
Would affected users mind visiting http://www.whatismyip.com/ (or search for "what is my ip" on Google, if you can reach Google) and reporting what IP address they appear to the world as? (You can also email me directly if you don't wish to post that publicly.)
The first affected users at https://support.mozilla.org/en-US/questions/976504 report that the issue appears resolved for them.
My external IP is 96.48.113.4
As per philipp, the issue seems to have been cleared up.
As a postmortem, it would nevertheless be interesting to know what was going on and why it was only affecting Firefox.
Here is a link to Shaw's update on this. Not much new but it could provide contacts for a postmortem: https://community.shaw.ca/docs/DOC-2328

Brian, did you end up talking to anyone there today? Do you understand why Chrome was fine? Did Chrome fall back to TLS 1.0? (Just a guess, as I don't know if that's part of NSS or outside of it.)
Flags: needinfo?(brian)
Whiteboard: [Issue is resolved. Leave open for postmortem]
Attached file bug-935394
Posting TCP dump for anyone that might find it useful.
BTW, this seems to have been fixed as of ~ noon yesterday. Do we know what was fixed?
I was able to get in touch with a tech person from Shaw.ca just as the problem was fixed. Adam Langley's analysis above is correct. Apparently some equipment on some local part of Shaw's network was causing this error. It may be the case that Chrome and Internet Explorer are more eager to fall back from newer, more secure, TLS versions to SSL 3.0 than Firefox is, and there's some speculation that Chrome and IE were automatically falling back to SSL 3.0 in response to the plain text HTTP response, whereas Firefox doesn't.

I filed bug 936850 about being smarter about captive portal detection as it applies to TLS version fallback.
Flags: needinfo?(brian)
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WORKSFORME
Whiteboard: [Issue is resolved. Leave open for postmortem]
Product: Tech Evangelism → Tech Evangelism Graveyard