Closed Bug 141882 Opened 23 years ago Closed 21 years ago

NSS should convert email query keys to lowercase before searching

Categories

(NSS :: Libraries, defect, P2)

defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: KaiE, Assigned: julien.pierre)

Details

Attachments

(1 file, 5 obsolete files)

See bug 130692 for background info, and which implements a workaround. In short, NSS seems to convert all cert's email address to lowercase internally, and an application needs to convert search keys to lowercase before it queries NSS for matching certs, otherwise the search will fail. It is suggested, if NSS decides to handle email addresses lowercase internally, it should also convert search keys.
Bob, Ian, Julien, Nelson, Is this a good idea?
Perhaps we should try the address as-is, and if nothing returns, try it again having converted it to lowercase. However, I agree that if we are always converting the address to lowercase before returning it, then we should only search by lowercase. For one thing, the cache entries will be keyed by a lowercase string.
As I understand it, when we create smime profile records, we downshift the email address, so the email address in the smime profile (used as a kind of "nickname", to find the cert for an email address in a message) often does not match the email address in the cert, due to case mismatch. When we search for a cert by email address, we do not downshift the address we're looking for, so case mismatches cause failure to find the cert. I believe the correct solution is to downshift only the hostname part of the email address, not the mailbox name part. Hostname comparisons should not be case sensitive, but mailbox name comparisons should be. This downshifting should be done when creating a new smime profile record and also when searching for a cert. The comparison of these values should remain case sensitive. Examples: When creating an SMIME profile for Fred@Murtz.com, it should be transformed into Fred@murtz.com (notice the hostname was downshifted, but the mailbox name (Fred) was not). Then when searching for a name, the same processing should be done on the name being searched. So, when looking for a cert for Fred@MURTZ.COM, that name would be transformed to Fred@murtz.com, and that would be found. But when searching for fred@murtz.com, no match would be found, because Fred doesn't match fred.
I disagree with Nelson's proposal. Everything everywhere should be considered case-insensitive: s/mime certificates rfcs considers email addresses to be case-insensitive. Although this is in contradiction to rfc 822, the usage has been that no ISPs out there will create accounts that are case sensitive, and all ISPs will route email addresses successfully regardless of the case of the local part. NSS and the application need to cooperate and agree to do all searches lower case. Hence, NSS should create s/mime profiles in lower case, and PSM etc... should convert to lower case before passing on to NSS (or NSS can do it). We have in fact modified PSM to do just that. Send yourself an email to nElSoNb@netscape.com and you'll receive it.
rfc2459: In addition, legacy implementations exist where an RFC 822 name is embedded in the subject distinguished name as an EmailAddress attribute. The attribute value for EmailAddress is of type IA5String to permit inclusion of the character '@', which is not part of the PrintableString character set. EmailAddress attribute values are not case sensitive (e.g., "fanfeedback@redsox.com" is the same as "FANFEEDBACK@REDSOX.COM"). if the E= is case insensitive, it could be that the subjectAltName is case sensitive (and then could be use to list alternative case-sensitive equivalent email addresses), but to cover an email address fully given the knowledge that one's organization is using a case-insensitive local-part, one would have to add 2^n email addresses to the subjectAltNames where "n" is the length of the local part. I think it's save to assume that the rfc822Name contained in subjectAltNames are also case insensitive.
NSS is always converting the email address before dealing with it. There's code called 'fixupEmailAddress' which makes sure the email address is always lower case. This is important since email addresses are case insensitive.
Assigned the bug to Julien.
Assignee: wtc → jpierre
Priority: -- → P1
Target Milestone: --- → 3.6
Kai, can you tell us whether we need this in 3.5? We've already changed PSM to assume lower case.
> can you tell us whether we need this in 3.5? Bug 130692 uses a workaround, and at this time, I'm not aware of other PSM issues that depend on this bug. So I'd say it's not urgent at the moment.
Which specific lookup APIs are affected by this ? PK11_FindCertByNickNameOrEmailAddr ? Is there any other ?
I'd like to fix this. Kai, could you please respond to my question in comment #10 ?
PSM uses CERT_FindCertByNicknameOrEmailAddr. I do not know whether there are other functions in NSS' API that offer to search by email address, but if there are, it would be nice if all of them behaved the same.
Moved to 3.7.
Target Milestone: 3.6 → 3.7
Comment on attachment 102888 [details] [diff] [review] convert case for e-mail search in CERT_FindCertByNicknameOrEmailAddress 1. If 'lowercaseName' is NULL, you will be passing NULL to NSSCryptoContext_FindBestCertificateByEmail. Does that function work correctly on a NULL argument? In any case, you should be able to put the whole thing inside the "if (lowercaseName) { ... }" block. 2. Is 'name' guaranteed to be (7-bit) ASCII? If not, the char-by-char tolower calls may not produce the correct result in some languages. I don't have a good solution if 'name' is not constrained to be ASCII.
1. You are correct, the call should be moved inside the if. Though you will note that there was nothing preventing the name passed in from the caller from being NULL before, so one would hope that it already deals with that case ... 2. I think email systems aren't guaranteed to work with addresses above 7 bits. I could always add code to check if the 8th bit is set on any character, and skip the conversion in that case. 3. It has come to my attention that PK11_FindCertFromNickname also does a token search by e-mail address. That code is conditioned by the presence of a '@' character in the string. This brings several possible enhancements : 1) Should I add that condition to my patch also, and skip the call to NSSCryptoContext_FindBestCertificateByEmail altogether if missing ? It seems to me that it isn't a strict requirement for an S/MIME profile to have a @ in it, but most email systems probably require one, unless they are setup to default to a particular domain if not specified (ie. you would send an email to "jpierre" without domain through the corporate mail server, and it would get sent to jpierre@netscape.com ). For the purpose of searching for a recipient cert though, I don't think we ever want to depend on such defaults, so it would be a good thing to add the check. The downside is that some existing profiles that don't have the @ in them may no longer be found. 2) I need to add code inside of PK11_FindCertFromNickname to do another case conversion. I can't do it at the upper level because that would break nickname case-sensitive searches. So it must be done inside, in the email search part of the code. I will generate a patch for that.
This patch improves upon the last one by : - checking for NULL before calling NSSCryptoContext_FindBestCertificateByEmail - checking for the presence of the @ character - checking for an 8-bit string
Attachment #102888 - Attachment is obsolete: true
The changes over the existing functions include : - check for an 8-bit string - conversion of the nickname to lowercase
Would anyone like to review these 2 patches before I check them in ?
Here's a problem with one of the patches: + while (!(eightbitstring = (0x80 & nickname[i++]))); For an ordinary null terminated string with no 8-bit characters, this will run past the end of the string until it crashes or finds an 8-bit character.
I see that while loop is in both patches. Also, one question: Why not downshift strings that contain ISO-Latin-1 chars? Although RFC 2822 says these should all be ASCII, we know from experience that a email product allows ISO-Latin-1 (8-bit) characters to be used in email addresses. Should a string not be downshifted just because it contains a vowel with extra marks, such as ÀÂâÖôüö ? Is tolower unable to downshift them properly??
The Solaris man page on "tolower" seems to imply that the machine's locale parameters are taken into account. So it is possible that the examples you posted are downshifted properly. However, when e-mailing someone using a different locale, there could be a problem. I'm not sure what specs say about the locale / codepage to use for E= in certificates. What RFC would I start looking at ? x509 ?
Comment on attachment 103000 [details] [diff] [review] do case-insensitive searches for email address in PK11_FindCertFromNickname 1. The two if (lowercaseName) statements can be combined. 2. If the PORT_Strdup call fails, we do nothing. Is that what we want? 3. It seems that the two while loops can be combined, that is, you test for 8-bit characters as you convert the characters to lowercase. In bug 152986 comment 17, Nelson said that RFC 822 and its successor RFC 2822 both state that all the characters in the email headers (including email addresses) must be ASCII characters in the range 1-127. So it seems that my worry about 8-bit characters in email addresses is unfounded. 4. Finally, it seems that if we can't convert 'nickname' to lowercase (because PORT_Strdup fails or 'nickname' contains an 8-bit character), we should still attempt to find the certs by 'nickname'.
While the standard forbids it, nonetheless, a certain widely deployed email program reportedly allows ISO-Latin-1 characters in certificates, even in string types that should never include them. I believe it is not terribly uncommon, especially in Europe, to see email addresses with 8-bit ISO-Latin-1 characters in them. Here's an expression for downshifting that works for 7-bit uppercase characters, in strings of type "printableString", IA5, ASCII, and ISO-Latin-1. Unsigned char cc; (cc >= 'A' && cc <= 'Z' ? cc | 0x20 : cc)
Are we trying to downshift nicknames, too?
Nelson, Case conversion cannot always be done character by character. An example is the German ß, which becomes two characters (SS) when converted to uppercase. RFC 3280 says an EmailAddress or rfc822Name is an IA5String. The code you provided can't downshift À to à, can it?
I propose that when we have a string which is a mixture of 7bit characters and 8-bit characters that we downshift the upper case 7bit characters and leave the 8bit characters alone. This proposal differs from what I saw in the patches, which was to not downshift any characters in the string if any of them was an 8big character.
Actually the behavior in the patch was to stop looking for an email cert if the address contained any 8 bit character.
Attached patch patch update (obsolete) — Splinter Review
Take Wan-Teh and Nelson's suggestions into account
Attachment #102997 - Attachment is obsolete: true
Attached patch patch update (obsolete) — Splinter Review
Take Wan-Teh and Nelson's suggestions into account
Attachment #103000 - Attachment is obsolete: true
Comment on attachment 103935 [details] [diff] [review] patch update Your while loop will stop the downshifting at the first 8-bit character and skip the cert lookup. This is not what Nelson proposed. You should remove the 'eightbitstring' boolean variable altogether and write the loop like this: for (i = 0; lowercaseName[i]; i++) { if (!(0x80 & lowercaseName[i])) { lowercaseName[i] = tolower(lowercaseName[i]); } }
Attachment #103935 - Flags: needs-work+
The question is if there are characters in the string greater than 128, what are those characters? Latin1? Unicode? UTF8? It seems best to leave those characters alone. Using the localization of your machine doesn't help because the email may have come from (and probably did come from) a machine with a different localization. Since the standard prohibits them, there is no way we can interpret them correctly. On the other hand, even though I get a ton of korean mail (spam), it looks like none of the email addresses are anything but 7 bit ascii (though the 'labels' next to that email tend to be Korean). bob
Sorry I should have read the rest of the discussion. I agree with Nelson's comment #27. This works for CodePage encodings and UTF8. I don't think straight Unicode would work at all with email addresses. RE Nickname: General nicknames should not be converted to lower case. Only email addresses. I believe there is already code to convert the email addresses: CERT_FixupEmailAddr (or something like that, grep -i for FixupEmail). bob
Here's yet another proposal. Given that - tolower is LOCALE dependent, and the downshifting should not be LOCALE dependent, and - we only want to downshift plain 7-bit ascii uppercase characters, and not any 8-bit characters I propose that we create a new #define macro NSS_TOLOWER(x) and replace all of NSS's tolower calls with calls to NSS_TOLOWER(x). Once this is done, if we decide to change the downshifting criteria, we only have to change one #define to affect all places where NSS downshifts. I propose this implementation of NSS_TOLOWER. #define NSS_TOLOWER(cc) ((cc) >= 'A' && (cc) <= 'Z' ? (cc) | 0x20 : (cc)) This correctly downshifts the ASCII letters A-Z and no other characters. This works for all character sets of which ASCII letters are a subset, e.g. UTF8, printable string, ISO-Latin-*, IA5, etc. It also eliminates the need for code like this: if (!(0x80 & lowercaseName[i])) lowercaseName[i] = tolower(lowercaseName[i]); That code can be replaced with lowercaseName[i] = NSS_TOLOWER(lowercaseName[i]); The 0x80 test is no longer necessary because the NSS_TOLOWER macro doesn't downshift any characters with the high bit set.
nelson's proposal in comment 34 gets my vote. bob
This is in response to comment #31 . I don't I agree with that particular suggestion, because in the real world, e-mail addresses may not be more than 8 bits. This is due to limitations in email servers, as well as as what's spelled out in RFCs. As examples of this : - I just tried to create an AKA with a latin-1 8 bit character, and it wouldn't let me, I think for this very reason. - Thawte disallows certificate requests if the email address contains 8-bit characters. It says the e-mail address is invalid. Therefore, a certificate with an 8-bit email address cannot be obtained. I'm not sure if this is the case with all CAs but I would argue that Thawte is doing the right thing in rejecting it This is my reasoning for not bothering to look up certs for e-mail addresses that contain 8-bit characters. Those e-mail addresses are invalid. I can't set an error because it is still possibly a valid cert nickname.
Refusing to lookup a cert just because we believe no such cert should exist is likely to lead to a bug report where someone says "I have this cert, but NSS won't find it". Unless we also prevent all such certs from getting in all PKCS#11 tokens that we ever deal with, I think we should still be able to search for them. It seems to me that the issue presented by 8-bit characters is not whether or not we should look for them but rather is whether or not we should attempt to downshift them. We've answered that latter question with no.
FYI, I tried to email an id at netscape.com containing an 8-bit character, rançon@netscape.com . The message bounced back because the user "ranon" didn't exist . Ie, the mail server stripped the 8-bit character out of the ID before searching for the e-mail user. Maybe that's what we should do too when searching for an e-mail cert ?
Moved to target milestone 3.8 because the original NSS 3.7 release has been renamed 3.8.
Target Milestone: 3.7 → 3.8
Following up on comment #37 : on the theoritical possibility that users would complain that their certs are not found, I ask the reverse question : in what cases should we expect the searches to succeed when 8-bit characters are present in the e-mail address ? If we decide not to downshift or convert anything, but instead do a raw search (ie. byte comparison), then a successful match could only be made if the search parameter passed to the search function and the email address in the cert happened to be in the same character set. As somebody pointed out earlier, it isn't very likely to be the case that everybody and their recipients uses the same application in the same character set, which is the only way the searches would succeed. Until there is proof that non-urlencoded 8-bit email addresses actually work somewhere, I would prefer that our search function be consistent in its behavior and never return any certs for such addresses. I think this may be better than randomly failing with machine-specific results. Such bug reports would be far more difficult to diagnose.
Target Milestone: 3.8 → 3.9
Not a vulnerability or crash, and AOL product request : re-prioritizing to P2
Priority: P1 → P2
In response to comment #33 from Bob : There is indeed already a function called CERT_FixupEmailAddr . Currently that function just calls tolower(), which has been determined to be locale-dependent . It would be good if we had a single downshifting function in NSS that was used accross the board for the conversion; ie. the same function when converting the e-mail address from the cert to create the S/MIME profile, and when doing the lookup. CERT_FixupEmailAddr isn't being called when saving the S/MIME profile. In fact, I looked at the code, and CERT_SMimeProfile does not seem to do any downshifting. It just takes the first e-mail address from the cert using CERT_GetFirstEmailAddress() . Then that gets passed into PK11_SaveSMimeProfile . If there is in fact downshifting happening when we store the cert in the profile, does anyone know where it happens ? If there is no downshifting going on in the database, then I think it's fair to say that we should not change the query function. At this point, I believe the only reason things have worked with our code so far is that all the e-mail addresses we have in our certs subject and subjectaltname are lower case. I submit that if they were not, then PSM would break today, because it converts the address to lowercase before calling the NSS query function, and would not find the cert.
Actually, in alg1845.c, the function appendStringToBuf does a tolower() conversion ... That is the one that is actually done when saving the profile. So indeed the profile is saved in lowercase using tolower() ... That means we should probably do the same when doing queries.
Attachment #103934 - Attachment is obsolete: true
Attachment #103935 - Attachment is obsolete: true
Attachment #132377 - Flags: review?(wchang0222)
Comment on attachment 132377 [details] [diff] [review] use CERT_FixupEmailAddr for conversions during queries >Index: certdb/stanpcertdb.c >=================================================================== >RCS file: /cvsroot/mozilla/security/nss/lib/certdb/stanpcertdb.c,v >retrieving revision 1.57 >diff -u -r1.57 stanpcertdb.c >--- certdb/stanpcertdb.c 18 Feb 2003 20:53:12 -0000 1.57 >+++ certdb/stanpcertdb.c 29 Sep 2003 23:50:05 -0000 >@@ -482,13 +482,21 @@ > NSSCertificate *c, *ct; > CERTCertificate *cert; > NSSUsage usage; >+ >+ if (NULL == name) { >+ return NULL; >+ } We need to set the SEC_ERROR_INVALID_ARGS error code before returning NULL. (BTW, the indentation of "return NULL" seems to be wrong.) I gave this patch r=wtc with some comments. 1. It *may* be better to do the conversion to lowercase in the Stan layer. 2. It's hard to verify whether all the necessary changes have been made. I checked pk11func.h and cert.h, looking for words related to email. Please examine the following four functions (especially the first two) and see if they need change: - PK11_FindSMimeProfile - PK11_SaveSMimeProfile - CERT_FindSMimeProfile - CERT_SaveSMimeProfile
Attachment #132377 - Flags: review?(wchang0222) → review+
Comment on attachment 132377 [details] [diff] [review] use CERT_FixupEmailAddr for conversions during queries Julien, please ignore my comment about the indentation of "return NULL".
Wan-Teh, PK11_FindSMimeProfile and PK11_SaveSMimeProfile do not downconvert the e-mail address. Neither of these functions is exported. CERT_SaveSMimeProfile downconverts the e-mail address, before calling PK11_SaveSMimeProfile with that address. I'm still investigating CERT_FindSMimeProfile .
CERT_FindSMimeProfile is also OK . It uses the cert->emailAddr field which has also been downconverted for the query; or if the cert is a temporary cert it it looks up in the Stan cache by issuer and SN. Checking in certdb/stanpcertdb.c; /cvsroot/mozilla/security/nss/lib/certdb/stanpcertdb.c,v <-- stanpcertdb.c new revision: 1.58; previous revision: 1.57 done Checking in pk11wrap/pk11cert.c; /cvsroot/mozilla/security/nss/lib/pk11wrap/pk11cert.c,v <-- pk11cert.c new revision: 1.124; previous revision: 1.123 done
Status: NEW → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: