Open Bug 1317857 Opened 3 years ago Updated 8 months ago

There are servers using old and broken versions of NSS

Categories

(Web Compatibility :: Desktop, defect, P3)

Tracking

(Not tracked)

People

(Reporter: mwobensmith, Assigned: adamopenweb)

Details

(Keywords: dev-doc-needed, site-compat)

Attachments

(1 file)

Attached file broken_sites.txt
TLS Canary on latest Nightly 52.0a1 shows 20 sites newly broken with error pr_end_of_file_error. 

This seems to have regressed on 2016-11-09. Previous builds are fine.

See attachment for isolated list of broken sites, and see canary run for full results:

https://tlscanary.mozilla.org/runs/2016-11-10-09-39-35/
Maybe bug 1310516 (Enable TLS 1.3)?
Thanks for the canary run, Matt. I'm not sure what the issue is here. Testing the sites with NSS I can connect just fine. What's interesting is that the sites seem broken in Chrome Canary as well. This makes me think it's a cert validation issue.
(In reply to Masatoshi Kimura [:emk] from comment #4)
> Reverting <https://hg.mozilla.org/mozilla-central/rev/ca0017c90ad0> fixed
> the issue.

Ha, SSL_SignatureSchemePrefSet is messing about here. This looks like a bug in NSS.
Assignee: nobody → nobody
Component: Security: PSM → Libraries
Product: Core → NSS
See Also: → 1309446
Summary: PR_END_OF_FILE_ERROR regression on Nightly → SSL_SignatureSchemePrefSet is broken
Target Milestone: --- → 3.28
Version: 52 Branch → 3.28
The culprit is PSS. Disabling PSS signatures fixes this.
So the server is intolerant. I don't think there's anything we can do about this other than telling them to configure their servers properly. (Unless we want to disable PSS.)
Flags: needinfo?(martin.thomson)
Flags: needinfo?(dkeeler)
Summary: SSL_SignatureSchemePrefSet is broken → There are PSS intolerant servers
This isn't limited to 52, it's in 50 too.  Chrome has adopted the policy of sticking with PSS.  We should send the operators of those servers an email.  It worked last time (why we didn't discover these few extra, I don't know).
Flags: needinfo?(martin.thomson)
If DSA signature algorithms are present, the servers work even without disabling PSS. The servers' signature algorithm negotiation is totally broken.
Sounds like this is more of a tech evangelism bug.
Assignee: nobody → nobody
Component: Libraries → Desktop
Flags: needinfo?(dkeeler)
Product: NSS → Tech Evangelism
Target Milestone: 3.28 → ---
Version: 3.28 → unspecified
This may be that old bug in NSS (on the server side) where they only look at the first 8 algorithms or so....
(In reply to Franziskus Kiefer [:fkiefer or :franziskus] from comment #6)
> The culprit is PSS. Disabling PSS signatures fixes this.
> So the server is intolerant. I don't think there's anything we can do about
> this other than telling them to configure their servers properly. (Unless we
> want to disable PSS.)

My team can help with outreach, but we'll need more information than this since we're not experts. Is there a link to more documentation, or to instructions on how to configure common servers the way we need?
Flags: needinfo?(franziskuskiefer)
The servers in question have a bug. They don't need to reconfigure anything; they need to update to a newer server version. In the case of the servers that are returning pr_end_of_file_error, I believe they have an old version of NSS and need to upgrade.
Specifically, if they are affected by the NSS bug, it would be bug #1119983. That bug has two failure modes:

1. The server will only sign rsa_pkcs1_sha1 and nothing else.

2. There is a bug in the signature_algorithms parsing. The connection will fail unless rsa_pkcs1_sha1 is in the first N entries, where N is the total number of entries that have a known hash algorithm as the first byte.

The combination of breaking with hash/sig decomposition for TLS 1.3, TLS 1.3 losing DSA, and the MUST-level requirement to put SHA-1 at the end means that a compliant TLS 1.3 ClientHello will basically always hit #2.

I am aware of a second RSA-PSS intolerance, which is in Erlang. Bug #1306481 is an example of that. That was fixed by these two commits:
https://github.com/erlang/otp/commit/ae7347bfdcab2486bb55dfe54918a0c994d8b7c7
https://github.com/erlang/otp/commit/0b19d46eaa4e4cf40be51acca9760c8f969638f2

If the bug is triggered by merely adding RSA-PSS, it is probably the Erlang bug. If it only broke after https://hg.mozilla.org/mozilla-central/rev/ca0017c90ad0, it's probably the NSS bug. (Or maybe we have yet another server bug out there.)
According to my local testing,

fails (bug 1195434 order): ssl_sig_ecdsa_secp256r1_sha256, ssl_sig_ecdsa_secp384r1_sha384, ssl_sig_ecdsa_secp521r1_sha512, ssl_sig_rsa_pss_sha256, ssl_sig_rsa_pss_sha384, ssl_sig_rsa_pss_sha512, ssl_sig_rsa_pkcs1_sha256, ssl_sig_rsa_pkcs1_sha384, ssl_sig_rsa_pkcs1_sha512, ssl_sig_ecdsa_sha1, ssl_sig_rsa_pkcs1_sha1

fails (NSS default minus DSA): ssl_sig_ecdsa_secp256r1_sha256, ssl_sig_ecdsa_secp384r1_sha384, ssl_sig_ecdsa_secp521r1_sha512, ssl_sig_ecdsa_sha1, ssl_sig_rsa_pss_sha256, ssl_sig_rsa_pss_sha384, ssl_sig_rsa_pss_sha512, ssl_sig_rsa_pkcs1_sha256, ssl_sig_rsa_pkcs1_sha384, ssl_sig_rsa_pkcs1_sha512, ssl_sig_rsa_pkcs1_sha1

works (NSS default): ssl_sig_ecdsa_secp256r1_sha256, ssl_sig_ecdsa_secp384r1_sha384, ssl_sig_ecdsa_secp521r1_sha512, ssl_sig_ecdsa_sha1, ssl_sig_rsa_pss_sha256, ssl_sig_rsa_pss_sha384, ssl_sig_rsa_pss_sha512, ssl_sig_rsa_pkcs1_sha256, ssl_sig_rsa_pkcs1_sha384, ssl_sig_rsa_pkcs1_sha512, ssl_sig_rsa_pkcs1_sha1, ssl_sig_dsa_sha256, ssl_sig_dsa_sha384, ssl_sig_dsa_sha512, ssl_sig_dsa_sha1

The NSS default works even though it contains PSS, but fails if DSA (not PSS!) algorithms are removed from the list. So it would not be the NSS bug nor the Erlang bug.
And I'm not sure this is really a simple PSS intolerance although the server is completely broken anyway.
Thanks for looking into this :emk.  I would suggest then that we continue with the plan where we contact the server operator.
Thanks everyone for the info! 

Adam, can you make this a priority for outreach? 

See Comments #12 and after for a general description of the problem and suggestions for what the sites can do.
Assignee: nobody → astevenson
Flags: needinfo?(astevenson)
Comment #14: Hrm? That looks like the NSS bug to me. To restate, the rule is:

"2. There is a bug in the signature_algorithms parsing. The connection will fail unless rsa_pkcs1_sha1 is in the first N entries, where N is the total number of entries that have a known hash algorithm as the first byte."

fails (bug 1195434 order): ssl_sig_ecdsa_secp256r1_sha256, ssl_sig_ecdsa_secp384r1_sha384, ssl_sig_ecdsa_secp521r1_sha512, ssl_sig_rsa_pss_sha256, ssl_sig_rsa_pss_sha384, ssl_sig_rsa_pss_sha512, ssl_sig_rsa_pkcs1_sha256, ssl_sig_rsa_pkcs1_sha384, ssl_sig_rsa_pkcs1_sha512, ssl_sig_ecdsa_sha1, ssl_sig_rsa_pkcs1_sha1

N = 8, ssl_sig_rsa_pkcs1_sha1 is not in the first 8.

fails (NSS default minus DSA): ssl_sig_ecdsa_secp256r1_sha256, ssl_sig_ecdsa_secp384r1_sha384, ssl_sig_ecdsa_secp521r1_sha512, ssl_sig_ecdsa_sha1, ssl_sig_rsa_pss_sha256, ssl_sig_rsa_pss_sha384, ssl_sig_rsa_pss_sha512, ssl_sig_rsa_pkcs1_sha256, ssl_sig_rsa_pkcs1_sha384, ssl_sig_rsa_pkcs1_sha512, ssl_sig_rsa_pkcs1_sha1

N = 8, ssl_sig_rsa_pkcs1_sha1 is not in the first 8.

works (NSS default): ssl_sig_ecdsa_secp256r1_sha256, ssl_sig_ecdsa_secp384r1_sha384, ssl_sig_ecdsa_secp521r1_sha512, ssl_sig_ecdsa_sha1, ssl_sig_rsa_pss_sha256, ssl_sig_rsa_pss_sha384, ssl_sig_rsa_pss_sha512, ssl_sig_rsa_pkcs1_sha256, ssl_sig_rsa_pkcs1_sha384, ssl_sig_rsa_pkcs1_sha512, ssl_sig_rsa_pkcs1_sha1, ssl_sig_dsa_sha256, ssl_sig_dsa_sha384, ssl_sig_dsa_sha512, ssl_sig_dsa_sha1

N = 12, ssl_sig_rsa_pkcs1_sha1 is the 11th entry. 11 < 12
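
For anyone who wants to re-check the arithmetic above, here is a minimal Python sketch of rule 2 applied to the three lists. It is not the NSS server code, only the rule as quoted. The two-byte values are the registered signature scheme codepoints as I understand them (PSS uses 0x08 as its first byte), and treating TLS 1.2 hash ids 1 through 6 as the "known" set is an assumption about what the old server recognizes. The `ssl_sig_` prefix is dropped for brevity, and all function and variable names are made up for illustration.

```python
# Two-byte wire values. A TLS 1.2 parser reads the first byte as a hash id
# and the second byte as a signature id.
SCHEMES = {
    "ecdsa_secp256r1_sha256": 0x0403, "ecdsa_secp384r1_sha384": 0x0503,
    "ecdsa_secp521r1_sha512": 0x0603, "ecdsa_sha1": 0x0203,
    "rsa_pss_sha256": 0x0804, "rsa_pss_sha384": 0x0805, "rsa_pss_sha512": 0x0806,
    "rsa_pkcs1_sha256": 0x0401, "rsa_pkcs1_sha384": 0x0501,
    "rsa_pkcs1_sha512": 0x0601, "rsa_pkcs1_sha1": 0x0201,
    "dsa_sha256": 0x0402, "dsa_sha384": 0x0502, "dsa_sha512": 0x0602,
    "dsa_sha1": 0x0202,
}

# Assumption: the old server knows the TLS 1.2 hash ids md5(1) .. sha512(6).
KNOWN_HASH_IDS = {1, 2, 3, 4, 5, 6}

NSS_DEFAULT = [
    "ecdsa_secp256r1_sha256", "ecdsa_secp384r1_sha384", "ecdsa_secp521r1_sha512",
    "ecdsa_sha1", "rsa_pss_sha256", "rsa_pss_sha384", "rsa_pss_sha512",
    "rsa_pkcs1_sha256", "rsa_pkcs1_sha384", "rsa_pkcs1_sha512", "rsa_pkcs1_sha1",
    "dsa_sha256", "dsa_sha384", "dsa_sha512", "dsa_sha1",
]
NSS_DEFAULT_MINUS_DSA = [s for s in NSS_DEFAULT if not s.startswith("dsa_")]
BUG_1195434_ORDER = [
    "ecdsa_secp256r1_sha256", "ecdsa_secp384r1_sha384", "ecdsa_secp521r1_sha512",
    "rsa_pss_sha256", "rsa_pss_sha384", "rsa_pss_sha512",
    "rsa_pkcs1_sha256", "rsa_pkcs1_sha384", "rsa_pkcs1_sha512",
    "ecdsa_sha1", "rsa_pkcs1_sha1",
]


def check_rule_2(offered):
    """Return (N, 1-based position of rsa_pkcs1_sha1, whether the rule passes)."""
    codes = [SCHEMES[name] for name in offered]
    # N counts entries whose first byte is a hash id the old parser knows.
    n = sum(1 for code in codes if (code >> 8) in KNOWN_HASH_IDS)
    position = codes.index(SCHEMES["rsa_pkcs1_sha1"]) + 1
    return n, position, position <= n


for label, offered in [
    ("bug 1195434 order", BUG_1195434_ORDER),
    ("NSS default minus DSA", NSS_DEFAULT_MINUS_DSA),
    ("NSS default", NSS_DEFAULT),
]:
    n, position, ok = check_rule_2(offered)
    print(f"{label}: N={n}, rsa_pkcs1_sha1 at {position}, "
          f"{'works' if ok else 'fails'}")
```

Under these assumptions the sketch reproduces the numbers given above: the four DSA entries are what raise N from 8 to 12, which is why removing DSA (not PSS) flips the result.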
Flags: needinfo?(franziskuskiefer)
Summary: There are PSS intolerant servers → There are servers using old and broken versions of NSS
On Nov 16, Franziskus had set this bug as a blocker for the NSS 3.28 release.

Based on the latest comments, I understand you no longer think that any technical fix should go into NSS. Based on that, I'm removing the bug from the blocker list for the 3.28 release. Please comment if my understanding is wrong.
No longer blocks: 1305970
Duplicate of this bug: 1328431
I re-ran the TLS canary for bug 1328600, and with the union of the results, we are still at ~20 regressed sites for NSS 3.28. We can also see that they break in Chrome canary, with the exception of two.

https://www.onlinecreditcenter6.com/
https://www.jcpcreditcard.com

So it might merit a look to see what makes these two sites different.
I've found those hosts to work flakily in Chrome canary which is really confusing. I'm in contact with someone from Synchrony Financial who runs both those hosts (see WHOIS) and they're going to upgrade the hosts. My assumption has been that they're in the process of updating and maybe sometimes we hit a server with the fix.

(Disappointingly, when it does work, they still only sign rsa_pkcs1_sha1. If that behavior persists to when they've completely fixed the problem, I'll investigate that.)
Please note that NSS 3.28.1 is in Firefox 51, which releases January 24.
Reference: Bug #1328600
(In reply to Kathleen Wilson from comment #23)
> Please note that NSS 3.28.1 is in Firefox 51, which releases January 24.
> Reference: Bug #1328600

Meaning that these compatibility regressions hit release on January 24.
I've just emailed all the hostmaster whois contacts from attachment 8811077, pointing them to this bug and to the release dates of both affected Firefox and Chrome. (Note: there are 2 domains without emailable contacts: uob.com.my and morawa-buch.at)
Based on information from EKR, I understand NSS 3.17.4 and all newer versions contain the required fix.
This hasn't been fixed yet. How would people like to proceed?
Bad idea that would fix this: override the order of preference temporarily so that RSA+SHA1 appears earlier in the list (if not first).  It's wrong, but it would avoid the issue until those servers are updated.  We should put an expiration date on any fix though.

Note that NSS servers pick based on their preferences, so we wouldn't get a weak scheme unless the server was either busted or respected client preference (I don't know how common this is, it varies).
If ordering strategically, I'd suggest moving all the rsa_pkcs1_* entries before rsa_pss_*, rather than just rsa_pkcs1_sha1. This way servers that pick based on client preferences still pick correctly. You just won't be ordering rsa_pkcs1_* relative to rsa_pss_* correctly, but rsa_pss_* is new so this doesn't particularly matter yet.
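
A rough Python sketch of what that reordering would look like, for illustration only. It works on scheme names rather than NSS data structures, the helper names are hypothetical, and the check assumes the PSS entries are the only ones the old parser fails to count toward N (as in the analysis earlier in this bug). The starting list is the "bug 1195434 order" from the local testing above, with the `ssl_sig_` prefix dropped.

```python
def move_pkcs1_before_pss(offered):
    """Move every rsa_pkcs1_* entry to just before the first rsa_pss_* entry.

    Relative order inside each group is preserved; other entries stay put.
    """
    pkcs1 = [s for s in offered if s.startswith("rsa_pkcs1_")]
    rest = [s for s in offered if not s.startswith("rsa_pkcs1_")]
    # Index of the first PSS entry, or the end of the list if there is none.
    first_pss = next(
        (i for i, s in enumerate(rest) if s.startswith("rsa_pss_")), len(rest)
    )
    return rest[:first_pss] + pkcs1 + rest[first_pss:]


def sha1_within_first_n(offered):
    """Rule 2 from this bug, with N taken as the number of non-PSS entries."""
    n = sum(1 for s in offered if not s.startswith("rsa_pss_"))
    return "rsa_pkcs1_sha1" in offered[:n]


BUG_1195434_ORDER = [
    "ecdsa_secp256r1_sha256", "ecdsa_secp384r1_sha384", "ecdsa_secp521r1_sha512",
    "rsa_pss_sha256", "rsa_pss_sha384", "rsa_pss_sha512",
    "rsa_pkcs1_sha256", "rsa_pkcs1_sha384", "rsa_pkcs1_sha512",
    "ecdsa_sha1", "rsa_pkcs1_sha1",
]

reordered = move_pkcs1_before_pss(BUG_1195434_ORDER)
print(reordered)
print("before:", sha1_within_first_n(BUG_1195434_ORDER))
print("after: ", sha1_within_first_n(reordered))
```

With this list, rsa_pkcs1_sha1 moves from the 11th slot to the 7th while N stays at 8, so the old servers would accept the hello, and the SHA-2 PKCS#1 schemes still come before SHA-1 for servers that honor client preference.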
Sorry for the delayed update. I reached out to 4 of these sites and had no responses from them. I even tried phoning a couple; support at reviewmyaccount.com didn't take it seriously, as I'm not a customer.

epsoninsider.com, gmprograminfo.com, lowesvisacredit.com, reviewmyaccount.com.
Flags: needinfo?(astevenson)
I am informed that SYF has started the upgrade of their servers.
Every affected host run by SYF that I am aware of now appears to be fixed.
Duplicate of this bug: 1341375
It seems like we can close this now, maybe? All the sites are working, with the exception of the following:

http://www.performnet.com/
http://www.onlinecreditcenter4.com/
http://www.onlinecreditcenter2.com/
http://cardoverview.com/
http://foodservicerewards.com/

These two get the following error/warning in Nightly, and in Canary as well:

Error code: MOZILLA_PKIX_ERROR_ADDITIONAL_POLICY_CONSTRAINT_FAILED
https://bioiatriki.gr/
https://www.epsoninsider.com/cgi-bin/Insider/Login/login.jsp

Matt, is that related or something else?
Flags: needinfo?(mwobensmith)
:miketaylr, the last two sites are fallout from issue #1409257.
Flags: needinfo?(mwobensmith)
Product: Tech Evangelism → Web Compatibility