Closed Bug 196717 Opened 22 years ago Closed 22 years ago

Decode ACE to UTF-8

Categories

(Core :: Networking, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: nhottanscp, Assigned: nhottanscp)

Details

Attachments

(2 files, 1 obsolete file)

Decoding ACE to UTF-8, nsIIDNService::ConvertACEtoUTF8 is not implemented. darin, what kind of situation we get ACE from server?
lot's of cases... - HTTP redirect - cookies - HTML content web content in general can rightfully contain ACE encoded hostnames. nsStandardURL simply needs to always check the hostname to see if it matches an ACE encoding. seems to me that nsIIDNService should have a method to check if a string is ACE encoded or not. because if not, then it is cheaper to avoid the buffer copy implicit in nsIIDNService::ConvertACEtoUTF8.
yes, we can check the prefix "xn--"
Attachment #117922 - Flags: review?(shanjian)
Comment on attachment 117922 [details] [diff] [review] Implements ACE to UTF-8 and modified the test program. looks fine.
Attachment #117922 - Flags: review?(shanjian) → review+
Attachment #117922 - Flags: superreview?(darin)
Comment on attachment 117922 [details] [diff] [review] Implements ACE to UTF-8 and modified the test program. >Index: dns/src/nsIDNService.cpp >+ if (!IsASCII(input)) { >+ _retval.Assign(input); >+ return NS_OK; >+ } hmm... i'm not sure we should be checking this here. it seems like added overhead that shouldn't be needed. it should be enough to just scan the first bytes for the ACE prefix. remember: this function is going tobe called a lot, so it is worth it to spend the time to carefully optimize it. >+ // RFC 3490 - 4.2 ToUnicode >+ // The ToUnicode output never contains more code points than its input. >+ PRInt32 prefixLen = strlen(mACEPrefix); strlen(mACEPrefix) == 4 ... why not just hardcode the value? or use a #define at least. strlen is overkill. >+ PromiseFlatCString( >+ Substring( >+ in, >+ prefixLen, >+ in.Length() - prefixLen)).get(), isn't this equivalent to: in.get() + 4 provided you declare |in| to be |const nsAFlatCString&| or |const nsCString&|. PromiseFlatCString is expensive here since it will heap allocate and copy the input string. avoid if possible. >+ // UCS4 -> UTF8 >+ output[output_length] = 0; >+ nsAutoString utf16; >+ ucs4toUtf16(output, utf16); >+ delete [] output; >+ out.Assign(NS_ConvertUCS2toUTF8(utf16)); seems like there should be a common routine for this conversion someplace. are you duplicating code that could be factored into a common subroutine? >+ // Validation: encode back to ACE and compare the strings ouch! is this strictly required? i guess so, but it is going to make the conversion from ACE to UTF-8 much more expensive. can't we just assume the input is valid? can this validation step be DEBUG only?
Attachment #117922 - Flags: superreview?(darin) → superreview-
>+ if (!IsASCII(input)) { >+ _retval.Assign(input); >+ return NS_OK; >+ } This is based on RFC 3490 4.2.1. The spec seems to be allowing non-Ascii (4.2.2) which we do not support but we need to check that case. The prefix check is required after the Ascii check, I forgot to put that but I need to add the prefix check. I think the caller may call IsACE() before calling ConvertACEtoUTF8() for performance. >+ PRInt32 prefixLen = strlen(mACEPrefix); This is because the prefix is also specified by pref. But since we only support "xn--" and "bq--" anyway, we can change it to a constant. I will do the change. >are you duplicating code that could be factored into a common subroutine? No, that is specific to that code. There is other code which converts UCS4 to UTF-16 but not to UTF-8. >+ // Validation: encode back to ACE and compare the strings Yes, that is required (RFC 3490 4.2.6-7). I think it is better to follow the spec at least for now since we may get invalid domain names and it is safter to do the required check.
>I think the caller may call IsACE() before calling ConvertACEtoUTF8() >for performance. But the case like www.xn--ENCODED.com, IsACE() returns false for that. I think IsACE() has to change to search for "xn--" not only for the beginning of the input string. (but I have no idea how to do that for nsACString).
>+ PRInt32 prefixLen = strlen(mACEPrefix); This is because the prefix is also specified by pref. But since we only support "xn--" and "bq--" anyway, we can change it to a constant. I will do the change. remember you made IsACE hardcode 4. if it is variable, then you should use strlen. i didn't realize it was pref controlled.
That was my mistake using 4 there. But we can make it constant now since "xn--" is official and we only support "xn--" and "bq--".
Attachment #118341 - Flags: superreview?(darin)
does our pref checking code ensure that mACEPrefix will always be 4 characters in length?
yes, the code checks the length http://lxr.mozilla.org/seamonkey/source/netwerk/dns/src/nsIDNService.cpp#89 89 rv = prefBranch->GetCharPref("network.IDN_prefix", getter_Copies(prefix)); 90 if (NS_SUCCEEDED(rv) && 91 prefix.Length() <= kACEPrefixLen) { 92 strncpy(mACEPrefix, prefix.get(), kACEPrefixLen); 93 mACEPrefix[sizeof(mACEPrefix)-1] = '\0'; 94 }
Comment on attachment 118341 [details] [diff] [review] Use constant value for prefix length, changed IsACE to handle entire domain name as an input. >Index: dns/src/nsIDNService.cpp might be good to move decl of nsDependentCString(mACEPrefix, ...) out: nsDependentCString prefix(mACEPrefix, kACEPrefixLen); then you won't have to instantiate it twice. >+ // also check for the case like "www.xn--ENCODED.com" >+ // in case for this is called for an entire domain name what about "aaa.bbb.xn--bar.com"? nothing show-stopping... the code has a lot of buffer copies, but i don't see a quick way to eliminate those.
>what about "aaa.bbb.xn--bar.com"? that is also taken care by searching for "." + prefix
Attachment #118442 - Flags: superreview?(darin)
Comment on attachment 118442 [details] [diff] [review] nsDependentCString change by darin's suggestion. sr=darin (ok)
Attachment #118442 - Flags: superreview?(darin) → superreview+
checked in to the trunk darin, who should call this decode function, shall I file a separate bug for that?
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
nsStandardURL will need to call this decode function. yes, a separate bug would be best.
filed bug 199229 for the caller side implementation
Attachment #118341 - Flags: superreview?(darin)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: