Closed Bug 1462303 Opened 7 years ago Closed 7 years ago

Intermittent SSL_RX_MALFORMED_SERVER_HELLO on HN and YouTube (TLS 1.3)

Tracking

(firefox-esr52 unaffected, firefox-esr6060+ fixed, firefox60+ fixed, firefox61+ fixed, firefox62+ fixed)

Status:

RESOLVED FIXED

Milestone:

3.38

Tracking Flags:

Tracking

Status

firefox-esr52

---

unaffected

firefox-esr60

60+

fixed

firefox60

fixed

firefox61

fixed

firefox62

fixed

People

(Reporter: ekr, Assigned: mt)

References

Details

Attachments

(1 file)

Bug 1462303 - Allow TLS 1.3 compat mode when attempting to resume TLS 1.2, r?ekr 7 years ago Phabricator Automation 46 bytes, text/x-phabricator-request	ttaubert : review+	Details \| Review

Eric Rescorla (:ekr)

Reporter

Description

•

7 years ago

We are seeing occasional reports of SSL_RX_MALFORMED_SERVER_HELLO on TLS 1.3-hosted sites with Fx60. Here's the available data: - These sites are running BoringSSL - It's intermittent: if you reload, it works - At least one user reports this error when they are pretty sure they are not behind a middlebox. I saw it myself on an airplane, but again, a reload fixed it. - We didn't get any reports of this with Fx59 even though it had TLS 1.3 on. There are no super-obvious changes to the NSS code that would cause this, though our current best bet is the error is coming from the session-id comparison code in compat mode: https://searchfox.org/nss/rev/535129af9c575342f8cd1fb227da7fc5722f3acd/lib/ssl/ssl3con.c#6165 We are trying to get a PCAP, but I think the next step is to run NSS (or better yet, Firefox) in a loop and try to reproduce.

Eric Rescorla (:ekr)

Reporter

Comment 1

•

7 years ago

We also need to measure the actual error rate we're getting here, and see if it's worth scaling back TLS 1.3 on 60 while we sort this out.

J.C. Jones [:jcj] (he/they)

Comment 2

•

7 years ago

We're working on a Firefox loop based on the Canary + tcpdump. Assigning to me.

Assignee: nobody → jjones

Priority: -- → P1

Ritu Kothari (:ritu) (Inactive, please n-i to RyanVM, jcristau, or pascal)

Updated

•

7 years ago

status-firefox60: --- → affected

tracking-firefox60: --- → +

Martin Thomson [:mt:]

Assignee

Comment 3

•

7 years ago

ekr found the bug in theory, I reproduced with a test case and there is a fix inbound. Let's talk about uplift. It's a small change, but one that will prevent some ugly errors during TLS 1.3 rollout.

Martin Thomson [:mt:]

Assignee

Comment 4

•

7 years ago

https://hg.mozilla.org/projects/nss/rev/43018e031dee1b0c7cac16d3de268de3de6d938d

Assignee: jjones → martin.thomson

Status: NEW → RESOLVED

Closed: 7 years ago

Resolution: --- → FIXED

Target Milestone: --- → 3.38

Version: trunk → 3.35

Martin Thomson [:mt:]

Assignee

Updated

•

7 years ago

status-firefox61: --- → affected

status-firefox62: --- → affected

OS: Unspecified → All

Hardware: Unspecified → All

Martin Thomson [:mt:]

Assignee

Comment 5

•

7 years ago

Ok, the case for/against uplift. This IS user visible. It will manifest with a scary warning once (for each user and each site). Reloading the page (or the browser) will make it go away. It only happens when the site upgrades to TLS 1.3, which we've seen a lot of recently. It only happens if the user has an open session during the upgrade. This is a pure interop issue. There is no security risk here, just an ugly warning. So... I would say that we want to uplift to Beta and - provided that release management are ok - wait for the next regular build of 60 to catch the change. I can start the process of getting the NSS releases ready in case someone wants to move.

Julien Cristau [:jcristau]

Comment 6

•

7 years ago

(In reply to Eric Rescorla (:ekr) from comment #1) > We also need to measure the actual error rate we're getting here, and see if > it's worth scaling back TLS 1.3 on 60 while we sort this out. Is someone working on getting that data? Should this block rolling out 60 further in the mean time?

Flags: needinfo?(ekr)

Phabricator Automation

Comment 7

•

7 years ago

Attached file Bug 1462303 - Allow TLS 1.3 compat mode when attempting to resume TLS 1.2, r?ekr — Details

Phabricator Automation

Comment 8

•

7 years ago

Comment on attachment 8976993 [details] Bug 1462303 - Allow TLS 1.3 compat mode when attempting to resume TLS 1.2, r?ekr Tim Taubert [:ttaubert] has approved the revision. https://phabricator.services.mozilla.com/D1310

Attachment #8976993 - Flags: review+

Liz Henry (:lizzard) (relman/hg->git project)

Updated

•

7 years ago

tracking-firefox61: --- → +

tracking-firefox62: --- → +

Ryan VanderMeulen [:RyanVM]

Updated

•

7 years ago

status-firefox-esr52: --- → unaffected

status-firefox-esr60: --- → affected

tracking-firefox-esr60: --- → ?

Eric Rescorla (:ekr)

Reporter

Comment 9

•

7 years ago

I am running a Spark job to try to collect this,

Flags: needinfo?(ekr)

Eric Rescorla (:ekr)

Reporter

Comment 10

•

7 years ago

(In reply to Eric Rescorla (:ekr) from comment #9) > I am running a Spark job to try to collect this, But now that we have diagnosed it, I expect it to be pretty rare and given that we know a reload fixes it, I'm tempted not to block. What do others think?

J.C. Jones [:jcj] (he/they)

Comment 11

•

7 years ago

(In reply to Eric Rescorla (:ekr) from comment #10) > (In reply to Eric Rescorla (:ekr) from comment #9) > > I am running a Spark job to try to collect this, > > But now that we have diagnosed it, I expect it to be pretty rare and given > that we know a reload fixes it, I'm tempted not to block. What do others > think? Looking at the patch -- and assuming I understand it correctly -- I don't see a need to block deployment. I vote to continue.

Eric Rescorla (:ekr)

Reporter

Comment 12

•

7 years ago

Spark doesn't show that 60 is worse than 59, so I don't think there is a need to throttle 60. I suggest that we do as MT says and trial if for a little while on Beta and then look at promoting to Release.

Martin Thomson [:mt:]

Assignee

Comment 13

•

7 years ago

3.37: https://hg.mozilla.org/projects/nss/rev/2f5e8b4bf89729ea5ee18c089e684d4e8138b557 3.36: https://hg.mozilla.org/projects/nss/rev/ff35a3edaafb1586d839fca8cdcae53dcdf02ee9 I'll need help cutting (NSS) releases. The process is not nearly automated enough for my liking.

J.C. Jones [:jcj] (he/they)

Comment 14

•

7 years ago

This is included in NSS 3.36.2 [1] and 3.37.1 [2], and landed in m-c this morning [3]. [1] https://developer.mozilla.org/en-US/docs/Mozilla/Projects/NSS/NSS_3.36.2_release_notes [2] https://developer.mozilla.org/en-US/docs/Mozilla/Projects/NSS/NSS_3.37.1_release_notes [3] https://hg.mozilla.org/mozilla-central/rev/6d3ee860c038

J.C. Jones [:jcj] (he/they)

Updated

•

7 years ago

Blocks: 1463537

J.C. Jones [:jcj] (he/they)

Updated

•

7 years ago

Blocks: 1463539

Julien Cristau [:jcristau]

Updated

•

7 years ago

status-firefox62: affected → fixed

Ritu Kothari (:ritu) (Inactive, please n-i to RyanVM, jcristau, or pascal)

Comment 15

•

7 years ago

Hey J.C., is there a good telemetry probe that we can monitor on release 60 to get a sense of how severe this issue is? Thanks!

Flags: needinfo?(jjones)

Julien Cristau [:jcristau]

Updated

•

7 years ago

tracking-firefox-esr60: ? → 61+

J.C. Jones [:jcj] (he/they)

Comment 16

•

7 years ago

(In reply to Ritu Kothari (:ritu) from comment #15) > Hey J.C., is there a good telemetry probe that we can monitor on release 60 > to get a sense of how severe this issue is? Thanks! I believe this is bucket 29 [1] of SSL_HANDSHAKE_RESULT [2]. Currently 0.01% of Nightly 62 errors. Graph: https://screenshots.firefox.com/ai8yzHlOVbK06VYn/telemetry.mozilla.org [1] https://searchfox.org/mozilla-central/rev/dc6d85680539cb7e5fe2843d38a30a0581bfefe1/security/nss/lib/ssl/sslerr.h#70 [2] https://mzl.la/2GJfiCM

Flags: needinfo?(jjones)

Eric Rescorla (:ekr)

Reporter

Comment 17

•

7 years ago

Do we know what the story was for Release 58 (no TLS 1.3)?

J.C. Jones [:jcj] (he/they)

Comment 18

•

7 years ago

(In reply to Eric Rescorla (:ekr) from comment #17) > Do we know what the story was for Release 58 (no TLS 1.3)? No idea. It looks like the background noise for that one is ~1k-4k for most release cycles. (I'm not seeing Release 58 as an option on t.m.o anymore, but Beta 58 has ~4k errors noted, which is 0%. I'm guessing you're looking at stmo?)

Ryan VanderMeulen [:RyanVM]

Updated

•

7 years ago

status-firefox61: affected → fixed

status-firefox-esr60: affected → fixed

Ryan VanderMeulen [:RyanVM]

Updated

•

7 years ago

status-firefox60: affected → fixed

tracking-firefox-esr60: 61+ → 60+

You need to log in before you can comment on or make changes to this bug.