Closed
Bug 196717
Opened 22 years ago
Closed 22 years ago
Decode ACE to UTF-8
Categories
(Core :: Networking, defect)
Core
Networking
Tracking
()
RESOLVED
FIXED
People
(Reporter: nhottanscp, Assigned: nhottanscp)
Details
Attachments
(2 files, 1 obsolete file)
7.65 KB,
patch
|
Details | Diff | Splinter Review | |
7.58 KB,
patch
|
darin.moz
:
superreview+
|
Details | Diff | Splinter Review |
Decoding ACE to UTF-8, nsIIDNService::ConvertACEtoUTF8 is not implemented.
darin, what kind of situation we get ACE from server?
Comment 1•22 years ago
|
||
lot's of cases...
- HTTP redirect
- cookies
- HTML content
web content in general can rightfully contain ACE encoded hostnames.
nsStandardURL simply needs to always check the hostname to see if it matches an
ACE encoding. seems to me that nsIIDNService should have a method to check if a
string is ACE encoded or not. because if not, then it is cheaper to avoid the
buffer copy implicit in nsIIDNService::ConvertACEtoUTF8.
Assignee | ||
Comment 2•22 years ago
|
||
yes, we can check the prefix "xn--"
Assignee | ||
Comment 3•22 years ago
|
||
Assignee | ||
Updated•22 years ago
|
Attachment #117922 -
Flags: review?(shanjian)
Comment 4•22 years ago
|
||
Comment on attachment 117922 [details] [diff] [review]
Implements ACE to UTF-8 and modified the test program.
looks fine.
Attachment #117922 -
Flags: review?(shanjian) → review+
Assignee | ||
Updated•22 years ago
|
Attachment #117922 -
Flags: superreview?(darin)
Comment 5•22 years ago
|
||
Comment on attachment 117922 [details] [diff] [review]
Implements ACE to UTF-8 and modified the test program.
>Index: dns/src/nsIDNService.cpp
>+ if (!IsASCII(input)) {
>+ _retval.Assign(input);
>+ return NS_OK;
>+ }
hmm... i'm not sure we should be checking this here. it seems like
added overhead that shouldn't be needed. it should be enough to just
scan the first bytes for the ACE prefix.
remember: this function is going tobe called a lot, so it is worth it
to spend the time to carefully optimize it.
>+ // RFC 3490 - 4.2 ToUnicode
>+ // The ToUnicode output never contains more code points than its input.
>+ PRInt32 prefixLen = strlen(mACEPrefix);
strlen(mACEPrefix) == 4 ... why not just hardcode the value? or
use a #define at least. strlen is overkill.
>+ PromiseFlatCString(
>+ Substring(
>+ in,
>+ prefixLen,
>+ in.Length() - prefixLen)).get(),
isn't this equivalent to:
in.get() + 4
provided you declare |in| to be |const nsAFlatCString&| or |const nsCString&|.
PromiseFlatCString is expensive here since it will heap allocate and copy the
input string. avoid if possible.
>+ // UCS4 -> UTF8
>+ output[output_length] = 0;
>+ nsAutoString utf16;
>+ ucs4toUtf16(output, utf16);
>+ delete [] output;
>+ out.Assign(NS_ConvertUCS2toUTF8(utf16));
seems like there should be a common routine for this conversion someplace.
are you duplicating code that could be factored into a common subroutine?
>+ // Validation: encode back to ACE and compare the strings
ouch! is this strictly required? i guess so, but it is going to make
the conversion from ACE to UTF-8 much more expensive. can't we just
assume the input is valid? can this validation step be DEBUG only?
Attachment #117922 -
Flags: superreview?(darin) → superreview-
Assignee | ||
Comment 6•22 years ago
|
||
>+ if (!IsASCII(input)) {
>+ _retval.Assign(input);
>+ return NS_OK;
>+ }
This is based on RFC 3490 4.2.1. The spec seems to be allowing non-Ascii (4.2.2)
which we do not support but we need to check that case.
The prefix check is required after the Ascii check, I forgot to put that but I
need to add the prefix check.
I think the caller may call IsACE() before calling ConvertACEtoUTF8() for
performance.
>+ PRInt32 prefixLen = strlen(mACEPrefix);
This is because the prefix is also specified by pref. But since we only support
"xn--" and "bq--" anyway, we can change it to a constant. I will do the change.
>are you duplicating code that could be factored into a common subroutine?
No, that is specific to that code. There is other code which converts UCS4 to
UTF-16 but not to UTF-8.
>+ // Validation: encode back to ACE and compare the strings
Yes, that is required (RFC 3490 4.2.6-7). I think it is better to follow the
spec at least for now since we may get invalid domain names and it is safter to
do the required check.
Assignee | ||
Comment 7•22 years ago
|
||
>I think the caller may call IsACE() before calling ConvertACEtoUTF8()
>for performance.
But the case like www.xn--ENCODED.com, IsACE() returns false for that.
I think IsACE() has to change to search for "xn--" not only for the beginning of
the input string. (but I have no idea how to do that for nsACString).
Comment 8•22 years ago
|
||
Comment 9•22 years ago
|
||
>+ PRInt32 prefixLen = strlen(mACEPrefix);
This is because the prefix is also specified by pref. But since we only support
"xn--" and "bq--" anyway, we can change it to a constant. I will do the change.
remember you made IsACE hardcode 4. if it is variable, then you should use
strlen. i didn't realize it was pref controlled.
Assignee | ||
Comment 10•22 years ago
|
||
That was my mistake using 4 there. But we can make it constant now since "xn--"
is official and we only support "xn--" and "bq--".
Assignee | ||
Comment 11•22 years ago
|
||
Attachment #117922 -
Attachment is obsolete: true
Assignee | ||
Updated•22 years ago
|
Attachment #118341 -
Flags: superreview?(darin)
Comment 12•22 years ago
|
||
does our pref checking code ensure that mACEPrefix will always be 4 characters
in length?
Assignee | ||
Comment 13•22 years ago
|
||
yes, the code checks the length
http://lxr.mozilla.org/seamonkey/source/netwerk/dns/src/nsIDNService.cpp#89
89 rv = prefBranch->GetCharPref("network.IDN_prefix", getter_Copies(prefix));
90 if (NS_SUCCEEDED(rv) &&
91 prefix.Length() <= kACEPrefixLen) {
92 strncpy(mACEPrefix, prefix.get(), kACEPrefixLen);
93 mACEPrefix[sizeof(mACEPrefix)-1] = '\0';
94 }
Comment 14•22 years ago
|
||
Comment on attachment 118341 [details] [diff] [review]
Use constant value for prefix length, changed IsACE to handle entire domain name as an input.
>Index: dns/src/nsIDNService.cpp
might be good to move decl of nsDependentCString(mACEPrefix, ...) out:
nsDependentCString prefix(mACEPrefix, kACEPrefixLen);
then you won't have to instantiate it twice.
>+ // also check for the case like "www.xn--ENCODED.com"
>+ // in case for this is called for an entire domain name
what about "aaa.bbb.xn--bar.com"?
nothing show-stopping... the code has a lot of buffer copies,
but i don't see a quick way to eliminate those.
Assignee | ||
Comment 15•22 years ago
|
||
>what about "aaa.bbb.xn--bar.com"?
that is also taken care by searching for "." + prefix
Assignee | ||
Comment 16•22 years ago
|
||
Assignee | ||
Updated•22 years ago
|
Attachment #118442 -
Flags: superreview?(darin)
Comment 17•22 years ago
|
||
Comment on attachment 118442 [details] [diff] [review]
nsDependentCString change by darin's suggestion.
sr=darin (ok)
Attachment #118442 -
Flags: superreview?(darin) → superreview+
Assignee | ||
Comment 18•22 years ago
|
||
checked in to the trunk
darin, who should call this decode function, shall I file a separate bug for that?
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
Comment 19•22 years ago
|
||
nsStandardURL will need to call this decode function. yes, a separate bug would
be best.
Assignee | ||
Comment 20•22 years ago
|
||
filed bug 199229 for the caller side implementation
Updated•22 years ago
|
Attachment #118341 -
Flags: superreview?(darin)
You need to log in
before you can comment on or make changes to this bug.
Description
•