See https://bugzilla.mozilla.org/show_bug.cgi?id=279099#c8 part 1. Should the domain name in an SSL cert be the punycode-encoded domain name (that's what we expect), or raw Unicode (that's what some other browsers are expecting)?
I'm not sure whether to answer this question here or in bug 279099. The answer appears to be: it depends on where the server's name is encoded in the certificate.

Remember that hostnames in CommonNames are now deprecated. There is another way, one that is RFC standard compliant, to include hostnames in certs, and Mozilla supports it. The standard way would require punycode for names that include non-ASCII characters, because it encodes as IA5String. The old legacy way (CommonName) allows any Unicode string (including UTF8, UCS2/UTF16, and UCS4), and so presumably would NOT use punycode.

According to the informational RFC 2818, there are two places in an https server cert where the server's dNSName can be encoded:
a) in the optional list of "subject alternative names", encoded as a "dNSName", or
b) in the CommonName attribute of the cert's subjectName (the legacy method).

According to RFC 3280, a "dNSName" may only be encoded as an IA5String. IA5String is a subset of ASCII. Therefore, a dNSName in the subjectAltName extension cannot contain Unicode characters (except those that are part of the IA5String subset of Unicode). If a hostname that contained Unicode characters was going to be encoded as a standards-compliant dNSName in a subjectAltName, it would have to be in punycode.

According to RFC 3280, in certs issued prior to Jan 1, 2004, the CommonName attribute may be encoded as any of these types:
    teletexString    TeletexString   (SIZE (1..ub-common-name)),
    printableString  PrintableString (SIZE (1..ub-common-name)),
    universalString  UniversalString (SIZE (1..ub-common-name)),
    utf8String       UTF8String      (SIZE (1..ub-common-name)),
    bmpString        BMPString       (SIZE (1..ub-common-name))
After December 31, 2003, they must be encoded as UTF8String, although we think some CAs still do it the old way. The CommonName method was put into use as a de facto industry standard before the "subject alternative name" extension was defined and standardized, and is still widely used.
But now that the subjectAltName extension is standardized, use of the CommonName for the server's dNSName is *deprecated*.

NSS allows the application that uses it to perform its own cert validation, and NSS provides its own functions that an application may use for this purpose. MOST applications that use NSS also use NSS's cert chain validation and host-name matching functions, and do not implement their own.

NSS's function for cert name matching accepts as input arguments a cert object and a UTF8 string that is the name the application expects to find in the cert. NSS attempts to match that application-supplied string against the names in the cert. First it looks for dNSNames in the cert's subjectAltName extension and compares against them. If no dNSNames are found in the cert's subjectAltName (or if the cert has no subjectAltName), it checks against the CommonName attribute in the cert's subjectName. If the CommonName attribute is encoded as TeletexString, BMPString, or UniversalString (which are ISO-8859-1, UCS2, or UCS4), it converts the CommonName to UTF8 before comparing. (PrintableString is a subset of UTF8, and needs no conversion before being compared to a UTF8 string.)

In some sense, the question being asked here, about what *should* be expected, is one for the standards bodies to answer. But the CA industry today is not conforming very tightly to the standards, and I expect we will have to wait and see what the CAs really do. If we find, as I expect, that punycode will be used in subjectAltNames, and not in subject commonNames, then we may need to change the NSS function for cert name checking to expect the application to provide the real (non-punycode) name in UTF8 form, and to convert it to punycode before comparing against dNSNames in subjectAltNames.
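For illustration, the matching order described above (subjectAltName dNSNames first, with the subject CommonName consulted only as a fallback) can be sketched in Python. The dict-based cert structure here is a hypothetical stand-in for NSS's real CERTCertificate object, not an actual NSS API:

```python
def match_cert_hostname(cert: dict, expected: str) -> bool:
    """Match an application-supplied hostname against a cert's names,
    in the order NSS is described as using: SAN dNSNames first, and the
    subject CommonName only if no dNSNames are present."""
    dns_names = cert.get("subjectAltName_dNSNames", [])
    if dns_names:
        # dNSNames are present: the CN is not consulted at all.
        return any(expected.lower() == name.lower() for name in dns_names)
    cn = cert.get("subject_commonName")
    return cn is not None and expected.lower() == cn.lower()
```

Note that a cert whose subjectAltName lists dNSNames never falls back to its CN, even when the CN would have matched.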
> If we find, as I expect, that punycode will be used in subjectAltNames,
> and not in subject commonNames, then we may need to change the NSS
> function for cert name checking to expect the application to provide the
> real (non-punycode) name in UTF8 form, and be able to convert it to
> punycode before comparing against dNSNames in subjectAltNames,

Sounds reasonable. Can you do this?
My proposal (converting UTF8 to punycode before comparing dNSNames in subject alt names) is feasible. It requires a C-callable function to do that conversion, which NSS presently does not have. Perhaps it would be expedient to do this conversion via a callback, just as NSS once used callbacks to convert between various flavors of Unicode. That way, we could potentially re-use Mozilla's existing punycode conversion code, through some C-to-C++ wrapper. My work priorities will not permit me to implement this this year. Perhaps one of my NSS colleagues can do this.
> Perhaps it would be expedient to do this conversion via a callback,

Yeah, that sounds like the right approach. There are several steps involved in converting UTF-8 to punycode. One step involves NFKC normalization, which presently involves invoking code in mozilla/intl via an XPCOM interface. Necko's IDN support is provided via nsIIDNService.
(In reply to comment #4)
> Necko's IDN support is provided via nsIIDNService.

So, I think NSS would provide a function to register a punycode conversion callback, and the cert name checking function would call that callback, if it is non-NULL, and otherwise would work as it does now. Presumably PSM would contain the C-callable C++ callback function that uses nsIIDNService, and would register that function with NSS.
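The register-a-callback-and-fall-back pattern proposed above can be sketched as follows. This is purely an illustration of the pattern, not any real NSS interface; the function names are invented, and the registered converter here uses Python's stdlib idna codec in place of the Mozilla/nsIIDNService code PSM would actually supply:

```python
# Module-level converter slot; None means "no callback registered",
# analogous to a NULL function pointer in the proposed NSS API.
_punycode_converter = None

def register_punycode_converter(fn) -> None:
    """Hypothetical analog of NSS registering a punycode conversion callback."""
    global _punycode_converter
    _punycode_converter = fn

def name_for_dnsname_compare(utf8_name: str) -> str:
    """Return the form of the name to compare against SAN dNSNames:
    converted via the callback if one is registered, else unchanged
    (i.e. behave exactly as the code does today)."""
    if _punycode_converter is not None:
        return _punycode_converter(utf8_name)
    return utf8_name
```

The key property is that an application which never registers a callback sees no behavior change at all.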
Sigh. Even though we have callbacks to do UTF conversions, NSS also supplies default functions as well. In this case maybe we just fail the punycode compare? This has a bigger impact on command-line tools than on applications, typically.

bob
Here's one interesting hack to work with Safari: http://bob.pythonmac.org/archives/2005/02/07/idn-spoofing-defense-for-safari/

Also, you may wish to review the following RFCs: 3454, 3490, 3491.

Cheers,
Eric
Here is a relevant presentation by someone from thawte: http://www.icann.org/presentations/valentin-idn-ct-01dec04.pdf
This whole situation reveals that mozilla is passing the punycode version of the hostname to NSS, rather than passing the UTF8 version of the hostname. I believe the correct thing to do is to pass the UTF8 hostname string to NSS. That change (to PSM, I believe) would be an immediate improvement to this issue, I think.
> I believe the correct thing to do is to pass the UTF8 hostname
> string to NSS.

Hmm... Necko passes PSM the hostname via nsISocketProvider. That API requires an ASCII hostname. So, I suppose we could make PSM convert from punycode to UTF-8 before communicating with NSS. nsIIDNService provides the methods IsACE() and ConvertACEtoUTF8(), which could be used.
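For readers unfamiliar with those two methods, here is a minimal sketch of what IsACE()/ConvertACEtoUTF8()-style helpers do, written in Python with the stdlib idna codec as a stand-in. This is not the nsIIDNService implementation, just an illustration of the semantics:

```python
def is_ace(hostname: str) -> bool:
    """Roughly what IsACE() answers: is any label of the hostname
    in ACE (punycode, "xn--"-prefixed) form?"""
    return any(label.lower().startswith("xn--")
               for label in hostname.split("."))

def ace_to_unicode(hostname: str) -> str:
    """Roughly what ConvertACEtoUTF8() does: decode an ACE-form
    hostname back to its Unicode form, label by label."""
    return hostname.encode("ascii").decode("idna")
```

So PSM could check is_ace() on the name it gets from Necko and, when true, hand NSS the decoded Unicode form instead.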
In light of Darin's comment 10, let it be said that there are (at least) two ways we could approach this:

a) Enhance NSS's existing cert name checking function. Leave it defined as it is, where the name received from the application for comparison with the cert is defined to be UTF8, and enhance that function to make a punycode-encoded copy.

b) Define a new additional cert name checking function (could be in NSS or in PSM), which expects the application to supply ASCII-only names (including IDN/punycode names), as Mozilla now supplies. This function would make a UTF8-encoded copy of the input ASCII/punycode string.

In either function, subject CommonNames would be compared to BOTH the UTF8 and punycode strings. SubjectAltName dNSNames would be compared only to the punycode string.

In choosing between these solutions, it matters whether CAs are going to finally stop using cert subject CNs for dNSNames and use subjectAltNames instead, or whether they are going to continue to encode DNS names in cert subject CNs; and in the latter case, whether they are going to encode those names with UTF8 (per RFC 3280) or with punycode (as I gather is wanted by the punycode promoters).

Option b above can be implemented without changing any existing NSS function signatures, and while preserving 100% backwards compatibility in NSS. Here's why and how: when libSSL has collected a cert chain and wants to validate it, including validating the name in the chain, it calls an *application-supplied* callback function. Most applications just register one of libSSL's own functions (namely SSL_AuthCertificate) as that callback. SSL_AuthCertificate calls CERT_VerifyCertName, which is the existing function that accepts UTF8 hostnames for comparison. But some applications (including Mozilla) register their own callback functions.
Mozilla's is seen at http://lxr.mozilla.org/seamonkey/source/security/manager/ssl/src/nsNSSCallbacks.cpp#296 Mozilla's function could call a new function that behaves as SSL_AuthCertificate but which calls the function described in b above, and NSS's backwards compatibility would be unchanged.
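Option (b) above can be sketched in Python for clarity: the caller supplies an ASCII/punycode name, the checker derives the UTF8 copy itself, and the two forms are applied as described (CN compared against both, SAN dNSNames against punycode only). The cert dict is again a hypothetical stand-in for a real cert object, and the stdlib idna codec stands in for the missing NSS converter:

```python
def check_cert_name_ace(cert: dict, ace_name: str) -> bool:
    """Option (b): accept an ASCII/punycode name (as Mozilla now supplies)
    and make a UTF8-decoded copy of it for the legacy CN comparison."""
    utf8_name = ace_name.encode("ascii").decode("idna")
    dns_names = cert.get("subjectAltName_dNSNames", [])
    if dns_names:
        # SAN dNSNames are IA5String: compare only the punycode form.
        return any(ace_name.lower() == n.lower() for n in dns_names)
    cn = cert.get("subject_commonName", "")
    # A legacy CN might hold either form: compare against both.
    return cn.lower() in (ace_name.lower(), utf8_name.lower())
```

This matches the backwards-compatibility argument above: nothing in the existing code path changes; applications that want the new behavior call the new function from their own callback.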
As a practical matter, IE is still the dominant browser, so CAs will do what works in IE+IDN plugin. From Thawte's slideshow in comment 8 and http://www.thawte.com/IDN/ it looks like they're putting user-readable stuff in the CN, and I guess punycode in the subjectAltName (because IE validates the cert), though they don't give details.
minus for 1.0.1 and plus for a well-tested solution in 1.1
There is no need to speculate about how to compare IDNs, or about whether particular fields of SSL certs should be expected to contain the ACE form or the Unicode form. Sections 2 and 3 of RFC 3490 answer those questions.

Any field of an SSL cert, no matter whether it allows non-ASCII characters or not, is expected to contain the ACE form unless the spec for that field explicitly cites RFC 3490 and invites non-ASCII domain names into the field. That doesn't mean you are forbidden from accepting non-ASCII domain names in that field, but it does mean that whoever put a non-ASCII domain name there was violating the IDNA spec (and, by the way, they can expect the SSL cert to fail unpredictably when processed by IDN-unaware applications, or by IDN-aware applications that choose not to be so liberal in what they accept).

As for how to compare domain names, the required method is to convert them both to ASCII and then compare the ASCII names as usual. (Technically, you are free to use any method that always returns the same answer as that method, but the corner cases are tricky, so your best bet is probably just to do that.)
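The compare-in-ASCII rule above can be demonstrated with Python's stdlib idna codec, which implements the RFC 3490 ToASCII operation label by label. This is just an illustration of the rule, not the code NSS would use:

```python
def to_ace(hostname: str) -> str:
    """Apply ToASCII label-by-label; labels already in ASCII
    (including ACE-form "xn--" labels) pass through unchanged."""
    return hostname.encode("idna").decode("ascii")

def idn_equal(a: str, b: str) -> bool:
    """RFC 3490's required comparison: convert both names to ACE form,
    then compare the ASCII strings case-insensitively as usual."""
    return to_ace(a).lower() == to_ace(b).lower()
```

With this, a Unicode name and its ACE form compare equal regardless of which form each side started in.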
We're coming into the endgame on 1.8b2. Is this something that's gonna happen in the next few days or should it be pushed out to 1.8b3?
Asa,

This bug is not going to be addressed for 1.8b2. Let's review the situation to understand the severity.

As I understand it, there is *no spoofing issue here*. The only issue is a potential cert name mismatch when the cert contains a Unicode (non-ACE) dNSName. This leads to cert name mismatches, but no false positive matches, so there is no spoofing vulnerability here.

Mozilla is passing ACE-form names to NSS for comparison. This works JUST FINE for certs that have ACE-form dNSNames in their CNs and/or subject alt names. In comment 14, Adam Costello asserts that RFC 3490 conformant certs will use ACE-form names. So the existing NSS code works fine for those certs. But we know that other standards that predate 3490 allow non-ACE-form CNs, and that there are CAs that issue certs with non-ACE-form dNSNames in CNs. We declare a hostname mismatch for them today. Some other browsers make these work.

I think this bug is of interest because many people want Mozilla to work in all cases where other browsers do (with the possible exception of MSIE :). Presently, we are effectively being strict about requiring certs to contain ACE-form dNSNames. It is being suggested that we loosen this to also work for certs that contain Unicode dNSNames. Doing that means adding an MPLed UTF8-to-ACE converter, written in C, to NSS. I don't have one of those in my pocket. It won't happen this week, or next.
Created attachment 180993 [details] [diff] [review]: pseudo-code patch showing where the conversion is needed.
moving out to a 1.8b3 nomination.
AFAIK, we haven't had any bugs filed in the last 24 months about certs containing non-ACE-form names, so I think this is WFM.