Closed Bug 141882 Opened 22 years ago Closed 21 years ago

NSS should convert email query keys to lowercase before searching

Categories

(NSS :: Libraries, defect, P2)

defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: KaiE, Assigned: julien.pierre)

Details

Attachments

(1 file, 5 obsolete files)

See bug 130692 for background info, and which implements a workaround.

In short, NSS seems to convert all cert's email address to lowercase internally,
and an application needs to convert search keys to lowercase before it queries
NSS for matching certs, otherwise the search will fail.

It is suggested, if NSS decides to handle email addresses lowercase internally,
it should also convert search keys.
Bob, Ian, Julien, Nelson,

Is this a good idea?
Perhaps we should try the address as-is, and if nothing returns, try it again
having converted it to lowercase.

However, I agree that if we are always converting the address to lowercase
before returning it, then we should only search by lowercase.  For one thing,
the cache entries will be keyed by a lowercase string.
 As I understand it, when we create smime profile records, we downshift
the email address, so the email address in the smime profile (used as a 
kind of "nickname", to find the cert for an email address in a message)
often does not match the email address in the cert, due to case mismatch.

When we search for a cert by email address, we do not downshift the 
address we're looking for, so case mismatches cause failure to find the
cert.

I believe the correct solution is to downshift only the hostname part
of the email address, not the mailbox name part.  Hostname comparisons
should not be case sensitive, but mailbox name comparisons should be.
This downshifting should be done when creating a new smime profile record 
and also when searching for a cert.  The comparison of these values should 
remain case sensitive.

Examples:  When creating an SMIME profile for Fred@Murtz.com, it 
should be transformed into Fred@murtz.com (notice the hostname was
downshifted, but the mailbox name (Fred) was not).  Then when searching
for a name, the same processing should be done on the name being searched.
So, when looking for a cert for Fred@MURTZ.COM, that name would be 
transformed to Fred@murtz.com, and that would be found.  
But when searching for fred@murtz.com, no match would be found, because
Fred doesn't match fred.  

I disagree with Nelson's proposal.  Everything everywhere should be considered
case-insensitive:

s/mime certificates rfcs considers email addresses to be case-insensitive.
Although this is in contradiction to rfc 822, the usage has been that no ISPs
out there will create accounts that are case sensitive, and all ISPs will route
email addresses successfully regardless of the case of the local part.

NSS and the application need to cooperate and agree to do all searches lower
case. Hence, NSS should create s/mime profiles in lower case, and PSM etc...
should convert to lower case before passing on to NSS (or NSS can do it).

We have in fact modified PSM to do just that.

Send yourself an email to nElSoNb@netscape.com and you'll receive it.





rfc2459:

   In addition, legacy implementations exist where an RFC 822 name is
   embedded in the subject distinguished name as an EmailAddress
   attribute.  The attribute value for EmailAddress is of type IA5String
   to permit inclusion of the character '@', which is not part of the
   PrintableString character set.  EmailAddress attribute values are not
   case sensitive (e.g., "fanfeedback@redsox.com" is the same as
   "FANFEEDBACK@REDSOX.COM").

if the E= is case insensitive, it could be that the subjectAltName is case
sensitive (and then could be use to list alternative case-sensitive equivalent
email addresses), but to cover an email address fully given the knowledge that
one's organization is using a case-insensitive local-part, one would have to add
2^n email addresses to the subjectAltNames where "n" is the length of the local
part.
I think it's save to assume that the rfc822Name contained in subjectAltNames are
also case insensitive.

NSS is always converting the email address before dealing with it. There's code
called 'fixupEmailAddress' which makes sure the email address is always lower case.


This is important since email addresses are case insensitive.
Assigned the bug to Julien.
Assignee: wtc → jpierre
Priority: -- → P1
Target Milestone: --- → 3.6
Kai, can you tell us whether we need this in 3.5?

We've already changed PSM to assume lower case.
> can you tell us whether we need this in 3.5?

Bug 130692 uses a workaround, and at this time, I'm not aware of other PSM
issues that depend on this bug. So I'd say it's not urgent at the moment.
Which specific lookup APIs are affected by this ?
PK11_FindCertByNickNameOrEmailAddr ?
Is there any other ?
I'd like to fix this. Kai, could you please respond to my question in comment #10 ?
PSM uses CERT_FindCertByNicknameOrEmailAddr.
I do not know whether there are other functions in NSS' API that offer to search
by email address, but if there are, it would be nice if all of them behaved the
same.
Moved to 3.7.
Target Milestone: 3.6 → 3.7
Comment on attachment 102888 [details] [diff] [review]
convert case for e-mail search in CERT_FindCertByNicknameOrEmailAddress

1. If 'lowercaseName' is NULL, you will be passing
NULL to NSSCryptoContext_FindBestCertificateByEmail.
Does that function work correctly on a NULL argument?
In any case, you should be able to put the whole
thing inside the "if (lowercaseName) { ... }" block.

2. Is 'name' guaranteed to be (7-bit) ASCII?  If not,
the char-by-char tolower calls may not produce the
correct result in some languages.  I don't have a
good solution if 'name' is not constrained to be
ASCII.
1. You are correct, the call should be moved inside the if. Though you will note
that there was nothing preventing the name passed in from the caller from being
NULL before, so one would hope that it already deals with that case ...

2. I think email systems aren't guaranteed to work with addresses above 7 bits.
I could always add code to check if the 8th bit is set on any character, and
skip the conversion in that case.

3. It has come to my attention that PK11_FindCertFromNickname also does a token
search by e-mail address. That code is conditioned by the presence of a '@'
character in the string.

This brings several possible enhancements :

1) Should I add that condition to my patch also, and skip the call to
NSSCryptoContext_FindBestCertificateByEmail altogether if missing ?
It seems to me that it isn't a strict requirement for an S/MIME profile to have
a @ in it, but most email systems probably require one, unless they are setup to
default to a particular domain if not specified (ie. you would send an email to
"jpierre" without domain through the corporate mail server, and it would get
sent to jpierre@netscape.com ). For the purpose of searching for a recipient
cert though, I don't think we ever want to depend on such defaults, so it would
be a good thing to add the check. The downside is that some existing profiles
that don't have the @ in them may no longer be found.

2) I need to add code inside of PK11_FindCertFromNickname to do another case
conversion. I can't do it at the upper level because that would break nickname
case-sensitive searches. So it must be done inside, in the email search part of
the code. I will generate a patch for that.
This patch improves upon the last one by :
- checking for NULL before calling NSSCryptoContext_FindBestCertificateByEmail
- checking for the presence of the @ character
- checking for an 8-bit string
Attachment #102888 - Attachment is obsolete: true
The changes over the existing functions include :
- check for an 8-bit string
- conversion of the nickname to lowercase
Would anyone like to review these 2 patches before I check them in ?
Here's a problem with one of the patches:

+        while (!(eightbitstring = (0x80 & nickname[i++])));

For an ordinary null terminated string with no 8-bit characters, this will
run past the end of the string until it crashes or finds an 8-bit character.
I see that while loop is in both patches.

Also, one question:  Why not downshift strings that contain ISO-Latin-1 chars?
Although RFC 2822 says these should all be ASCII, we know from experience 
that a email product allows ISO-Latin-1 (8-bit) characters to be used in 
email addresses.  
Should a string not be downshifted just because it contains a vowel with 
extra marks, such as ÀÂâÖôüö ?
Is tolower unable to downshift them properly??  
The Solaris man page on "tolower" seems to imply that the machine's locale
parameters are taken into account. So it is possible that the examples you
posted are downshifted properly. However, when e-mailing someone using a
different locale, there could be a problem. I'm not sure what specs say about
the locale / codepage to use for E= in certificates. What RFC would I start
looking at ? x509 ?
Comment on attachment 103000 [details] [diff] [review]
do case-insensitive searches for email address in PK11_FindCertFromNickname

1. The two if (lowercaseName) statements can be combined.

2. If the PORT_Strdup call fails, we do nothing.  Is that
what we want?

3. It seems that the two while loops can be combined, that
is, you test for 8-bit characters as you convert the characters
to lowercase.  In bug 152986 comment 17, Nelson said that RFC
822 and its successor RFC 2822 both state that all the
characters in the email headers (including email addresses)
must be ASCII characters in the range 1-127.  So it seems
that my worry about 8-bit characters in email addresses is
unfounded.

4. Finally, it seems that if we can't convert 'nickname'
to lowercase (because PORT_Strdup fails or 'nickname' contains
an 8-bit character), we should still attempt to find the certs
by 'nickname'.
While the standard forbids it, nonetheless, a certain widely deployed email
program reportedly allows ISO-Latin-1 characters in certificates, even in 
string types that should never include them.  I believe it is not terribly 
uncommon, especially in Europe, to see email addresses with 8-bit ISO-Latin-1 
characters in them.  

Here's an expression for downshifting that works for 7-bit uppercase characters, 
in strings of type "printableString", IA5, ASCII, and ISO-Latin-1.

     Unsigned char cc;
     (cc >= 'A' && cc <= 'Z' ? cc | 0x20 : cc)    
Are we trying to downshift nicknames, too?
Nelson,

Case conversion cannot always be done character
by character.  An example is the German ß, which
becomes two characters (SS) when converted to
uppercase.

RFC 3280 says an EmailAddress or rfc822Name is
an IA5String.

The code you provided can't downshift À to à,
can it?
I propose that when we have a string which is a mixture of 7bit characters
and 8-bit characters that we downshift the upper case 7bit characters and
leave the 8bit characters alone.  
This proposal differs from what I saw in the patches, which was to not
downshift any characters in the string if any of them was an 8big character.
Actually the behavior in the patch was to stop looking for an email cert if the
address contained any 8 bit character.
Attached patch patch update (obsolete) — Splinter Review
Take Wan-Teh and Nelson's suggestions into account
Attachment #102997 - Attachment is obsolete: true
Attached patch patch update (obsolete) — Splinter Review
Take Wan-Teh and Nelson's suggestions into account
Attachment #103000 - Attachment is obsolete: true
Comment on attachment 103935 [details] [diff] [review]
patch update

Your while loop will stop the downshifting at
the first 8-bit character and skip the cert
lookup.  This is not what Nelson proposed.

You should remove the 'eightbitstring' boolean
variable altogether and write the loop like this:

    for (i = 0; lowercaseName[i]; i++) {
	if (!(0x80 & lowercaseName[i])) {
	    lowercaseName[i] = tolower(lowercaseName[i]);
	}
    }
Attachment #103935 - Flags: needs-work+
The question is if there are characters in the string greater than 128, what are
those characters? Latin1? Unicode? UTF8? It seems best to leave those characters
alone. Using the localization of your machine doesn't help because the email may
have come from (and probably did come from) a machine with a different
localization. Since the standard prohibits them, there is no way we can
interpret them correctly.

On the other hand, even though I get a ton of korean mail (spam), it looks like
none of the email addresses are anything but 7 bit ascii (though the 'labels'
next to that email tend to be Korean).

bob
Sorry I should have read the rest of the discussion. I agree with Nelson's
comment #27. This works for CodePage encodings and UTF8. I don't think straight
Unicode would work at all with email addresses.

RE Nickname: General nicknames should not be converted to lower case. Only email
addresses. I believe there is already code to convert the email addresses:
CERT_FixupEmailAddr (or something like that, grep -i for FixupEmail).

bob
Here's yet another proposal.  
Given that 
- tolower is LOCALE dependent, and the downshifting should not be LOCALE 
dependent, and 
- we only want to downshift plain 7-bit ascii uppercase characters, and 
not any 8-bit characters

I propose that we create a new #define macro NSS_TOLOWER(x) and replace all
of NSS's tolower calls with calls to NSS_TOLOWER(x).   Once this is done,
if we decide to change the downshifting criteria, we only have to change
one #define to affect all places where NSS downshifts.  

I propose this implementation of NSS_TOLOWER.

#define NSS_TOLOWER(cc) ((cc) >= 'A' && (cc) <= 'Z' ? (cc) | 0x20 : (cc))    

This correctly downshifts the ASCII letters A-Z and no other characters.
This works for all character sets of which ASCII letters are a subset, 
e.g. UTF8, printable string, ISO-Latin-*, IA5, etc.

It also eliminates the need for code like this:

        if (!(0x80 & lowercaseName[i])) 
            lowercaseName[i] = tolower(lowercaseName[i]);

That code can be replaced with 
        lowercaseName[i] = NSS_TOLOWER(lowercaseName[i]);

The 0x80 test is no longer necessary because the NSS_TOLOWER macro doesn't
downshift any characters with the high bit set.
nelson's proposal in comment 34 gets my vote.

bob
This is in response to comment #31 .

I don't I agree with that particular suggestion, because in the real world,
e-mail addresses may not be more than 8 bits. This is due to limitations in
email servers, as well as as what's spelled out in RFCs.

As examples of this :
- I just tried to create an AKA with a latin-1 8 bit character, and it wouldn't
let me, I think for this very reason.

- Thawte disallows certificate requests if the email address contains 8-bit
characters. It says the e-mail address is invalid. Therefore, a certificate with
an 8-bit email address cannot be obtained. I'm not sure if this is the case with
all CAs but I would argue that Thawte is doing the right thing in rejecting it

This is my reasoning for not bothering to look up certs for e-mail addresses
that contain 8-bit characters. Those e-mail addresses are invalid. I can't set
an error because it is still possibly a valid cert nickname.

Refusing to lookup a cert just because we believe no such cert should exist
is likely to lead to a bug report where someone says "I have this cert, but 
NSS won't find it".  Unless we also prevent all such certs from getting in
all PKCS#11 tokens that we ever deal with, I think we should still be able
to search for them.  

It seems to me that the issue presented by 8-bit characters is not whether
or not we should look for them but rather is whether or not we should 
attempt to downshift them. We've answered that latter question with no.
FYI, I tried to email an id at netscape.com containing an 8-bit character,
rançon@netscape.com .

The message bounced back because the user "ranon" didn't exist . Ie, the mail
server stripped the 8-bit character out of the ID before searching for the
e-mail user.

Maybe that's what we should do too when searching for an e-mail cert ?
Moved to target milestone 3.8 because the original
NSS 3.7 release has been renamed 3.8.
Target Milestone: 3.7 → 3.8
Following up on comment #37 : on the theoritical possibility that users would
complain that their certs are not found, I ask the reverse question : in what
cases should we expect the searches to succeed when 8-bit characters are present
in the e-mail address ?

If we decide not to downshift or convert anything, but instead do a raw search
(ie. byte comparison), then a successful match could only be made if the search
parameter passed to the search function and the email address in the cert
happened to be in the same character set.

As somebody pointed out earlier, it isn't very likely to be the case that
everybody and their recipients uses the same application in the same character
set, which is the only way the searches would succeed.

Until there is proof that non-urlencoded 8-bit email addresses actually work
somewhere, I would prefer that our search function be consistent in its behavior
and never return any certs for such addresses. I think this may be better than
randomly failing with machine-specific results. Such bug reports would be far
more difficult to diagnose.
Target Milestone: 3.8 → 3.9
Not a vulnerability or crash, and AOL product request : re-prioritizing to P2


Priority: P1 → P2
In response to comment #33 from Bob :

There is indeed already a function called CERT_FixupEmailAddr . Currently that
function just calls tolower(), which has been determined to be locale-dependent . 

It would be good if we had a single downshifting function in NSS that was used
accross the board for the conversion; ie. the same function when converting the
e-mail address from the cert to create the S/MIME profile, and when doing the
lookup.

CERT_FixupEmailAddr isn't being called when saving the S/MIME profile. In fact,
I looked at the code, and CERT_SMimeProfile does not seem to do any
downshifting. It just takes the first e-mail address from the cert using
CERT_GetFirstEmailAddress() . Then that gets passed into PK11_SaveSMimeProfile .
If there is in fact downshifting happening when we store the cert in the
profile, does anyone know where it happens ?

If there is no downshifting going on in the database, then I think it's fair to
say that we should not change the query function.

At this point, I believe the only reason things have worked with our code so far
is that all the e-mail addresses we have in our certs subject and subjectaltname
are lower case.

I submit that if they were not, then PSM would break today, because it converts
the address to lowercase before calling the NSS query function, and would not
find the cert.
Actually, in alg1845.c, the function appendStringToBuf does a tolower()
conversion ... That is the one that is actually done when saving the profile. So
indeed the profile is saved in lowercase using tolower() ...
That means we should probably do the same when doing queries.
Attachment #103934 - Attachment is obsolete: true
Attachment #103935 - Attachment is obsolete: true
Attachment #132377 - Flags: review?(wchang0222)
Comment on attachment 132377 [details] [diff] [review]
use CERT_FixupEmailAddr for conversions during queries

>Index: certdb/stanpcertdb.c
>===================================================================
>RCS file: /cvsroot/mozilla/security/nss/lib/certdb/stanpcertdb.c,v
>retrieving revision 1.57
>diff -u -r1.57 stanpcertdb.c
>--- certdb/stanpcertdb.c	18 Feb 2003 20:53:12 -0000	1.57
>+++ certdb/stanpcertdb.c	29 Sep 2003 23:50:05 -0000
>@@ -482,13 +482,21 @@
>     NSSCertificate *c, *ct;
>     CERTCertificate *cert;
>     NSSUsage usage;
>+
>+    if (NULL == name) {
>+	return NULL;
>+    }

We need to set the SEC_ERROR_INVALID_ARGS error code before returning
NULL.  (BTW, the indentation of "return NULL" seems to be wrong.)

I gave this patch r=wtc with some comments.

1. It *may* be better to do the conversion to lowercase in the Stan
layer.

2. It's hard to verify whether all the necessary changes have been
made.  I checked pk11func.h and cert.h, looking for words related to
email.	Please examine the following four functions (especially the
first two) and see if they need change:
- PK11_FindSMimeProfile
- PK11_SaveSMimeProfile
- CERT_FindSMimeProfile
- CERT_SaveSMimeProfile
Attachment #132377 - Flags: review?(wchang0222) → review+
Comment on attachment 132377 [details] [diff] [review]
use CERT_FixupEmailAddr for conversions during queries

Julien, please ignore my comment about the indentation of "return NULL".
Wan-Teh,

PK11_FindSMimeProfile and PK11_SaveSMimeProfile do not downconvert the e-mail
address. Neither of these functions is exported.

CERT_SaveSMimeProfile downconverts the e-mail address, before calling
PK11_SaveSMimeProfile with that address.

I'm still investigating CERT_FindSMimeProfile .
CERT_FindSMimeProfile is also OK . It uses the cert->emailAddr field which has
also been downconverted for the query; or if the cert is a temporary cert it it
looks up in the Stan cache by issuer and SN.

Checking in certdb/stanpcertdb.c;
/cvsroot/mozilla/security/nss/lib/certdb/stanpcertdb.c,v  <--  stanpcertdb.c
new revision: 1.58; previous revision: 1.57
done
Checking in pk11wrap/pk11cert.c;
/cvsroot/mozilla/security/nss/lib/pk11wrap/pk11cert.c,v  <--  pk11cert.c
new revision: 1.124; previous revision: 1.123
done
Status: NEW → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: