All important is described on URL: http://tinyurl.com/am8gjy
Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2a1pre) Gecko/20090225 Minefield/3.2a1pre I see www.google.comOOOOOOOOOOOOOOOOOOOOO.phreedom.org where each O is a "1158" hexbox. Do you see hexboxes or something different?
FWIW, the tinyURL redirects to: https://www.google.xn--com-edoaaaaaaaaaaaaaaaaaaaaaaaaaaaa.phreedom.org/
Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.2a1pre) Gecko/20090224 Minefield/3.2a1pre (.NET CLR 3.5.30729) Still see the same problem. I will attach screenshot.
Seems like \u1158 needs to be added to the IDN blacklist?
Sorry, it's 115A, not 1158. And it doesn't make sense to blacklist it by itself, since it's one of several invalid unicode characters near each other: http://www.fileformat.info/info/unicode/char/115b/index.htm
Why is it invisible on Windows?
I also see the hexbox on XP.
batang.ttc and gulim.ttc (on Vista by default) provide blank glyphs for U+115A-115E.
Filed bug 480450 for creating tests capable of catching this kind of issue.
(In reply to comment #9) > batang.ttc and gulim.ttc (on Vista by default) provide blank glyphs for > U+115A-115E. I see the blanks on Mac OS X 10.5, but I believe this is using fonts that I have installed personally, not ones shipped with the system by default. It's a long time since I looked at IDN rules, but I'm surprised unassigned Unicode characters would be permitted at all.
http://tools.ietf.org/html/rfc3490#section-4 1) Decide whether the domain name is a "stored string" or a "query string" as described in [STRINGPREP]. If this conversion follows the "queries" rule from [STRINGPREP], set the flag called "AllowUnassigned". http://tools.ietf.org/html/rfc3454#section-7 In general, "stored strings" are strings that are used in protocol identifiers and named entities, such as names in digital certificates and DNS domain name parts. All code points not assigned in the character repertoire named in a stringprep profile are called "unassigned code points". Stored strings using the profile MUST NOT contain any unassigned code points. Queries for matching strings MAY contain unassigned code points. So existing domain name parts should not contain unassigned code points. It seems that this would be a sensible verification to perform. Even though the verification can only use the latest know version of Unicode, the worst case is that a domain name using a newer version of Unicode is presented in its ASCII/Punycode form. If the verification is not performed, a new Unicode assignment may add further space or control characters or other characters that should be prohibited, and fonts may appropriately provide support for these characters. It wouldn't be sensible to rely on the font system to ensure that all currently unknown characters are visible, or always look different from known characters.
(In reply to comment #12) > It seems that this would be a sensible verification to perform. > Even though the verification can only use the latest know version of Unicode, > the worst case is that a domain name using a newer version of Unicode is > presented in its ASCII/Punycode form. There is already an API to do this, idn_nameprep_isunassigned, which we could presumably add to the checks in nsIDNService::stringPrep. I'm not sure how or where the distinction between stored string and queries exists in our code.
Oh, and the data tables used by nsIDNService deliberately don't get updated when we update Unicode support to the latest version, so we would be using the version of Unicode specified in the IDN RFCs.
I can reproduce same problem on Windows XP SP3 / Japanase. And, "Security" pane in Page info is also spoofed.
At Windows, title bar of Cert viewer shows incorrect site name when seeing certficate of www.google.xn--com-edoaaaaaaaaaaaaaaaaaaaaaaaaaaaa.phreedom.org . (U+11a5 becomes to "Latin small letter a with acute" and "middle single dot"). Should I file it as another bug?
(In reply to comment #14) > so we would be using the version of Unicode specified in the IDN RFCs. fwiw those rfc's are getting updated (bug 479520) but even using the old list would be an improvement.
Simon, is this something you can work on? If not, please let me know. I think we should block on this.
(In reply to comment #19) > Simon, is this something you can work on? If not, please let me know. I think > we should block on this. Yes, taking.
Created attachment 365641 [details] [diff] [review] Patch I think this is sufficient to fix the bug. Are there any other consequences of stringPrep() failing apart from forcing display of the URL in punycode?
(In reply to comment #21) > Are there any other consequences of stringPrep() failing apart from forcing > display of the URL in punycode? I'm assuming that conversion of a user provided domain in Unicode form also passes through this code. That would be consistent with the IDNA ToAscii transformation, http://tools.ietf.org/html/rfc3490#section-4.1 . But a user provided domain would be a "query string" so the "AllowUnassigned" flag should be true in that case. However, our code doesn't have the AllowUnassigned flag. What do you think about doing the unassigned check only in nsIDNService::ConvertToDisplayIDN? (Or perhaps even nsIDNService::isInWhitelist.)
9 years ago
Comment on attachment 365641 [details] [diff] [review] Patch So the unit test seems to set the "network.IDN.whitelist.com" pref to false if it's not already set? Just want to make sure that's what we want (assuming I'm reading the code correctly). The actual code patch seems logical, but I'm waiting for answer comment #22 before approving review.
Created attachment 368266 [details] [diff] [review] patch v.2 So the answer to comment 22 is that Karl is right, and we need to do the unassigned check only when converting for display.
Comment on attachment 368266 [details] [diff] [review] patch v.2 My apologies for taking so long to review--I didn't notice you had a new patch attached to your last comment. Patch looks good. Handing off for SR to JST, who should be able to either approve or say who should take it.
Comment on attachment 368266 [details] [diff] [review] patch v.2 Looks good. sr=jst
Comment on attachment 368266 [details] [diff] [review] patch v.2 Approved for 126.96.36.199, a=dveditz for release-drivers
Verified fixed in 188.8.131.52 using Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:184.108.40.206pre) Gecko/2009051105 GranParadiso/3.0.11pre (.NET CLR 3.5.30729).
This appears to affect the 1.8 branch as well.
Backporting this to 1.8 will require changing the nsIIDNService interface. Is that OK?
No, we require that published IDL files not change because we've had bad cases where extensions or other apps depended on them unbeknownst to us. The standard workaround is a new iface inheriting from the old: interface nsIIDNService_MOZILLA_1_8_BRANCH : nsIIDNService Or even just a separate interface to the object if you're not simply adding methods. Either is fine, whichever is easier in the given situation. Note that when we're strict something that was already extended in 2.0.0.x needs yet another interface when modified again in 2.0.0.y. Note poor nsIDocShell is up to _BRANCH3: http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/docshell/base/nsIDocShell.idl&rev=220.127.116.11&mark=454#454 nsIDocShell didn't bother to inherit, nsIXPConnect did: http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/js/src/xpconnect/idl/nsIXPConnect.idl&rev=18.104.22.168&mark=725#725
Created attachment 381260 [details] [diff] [review] Patch for 1.8 branch This growed a bit... I took the interface change over from bug 402008, which I thought would be simpler and safer than trying to reinvent it, and then had to include bug 427957 as well, since it fixed a regression from bug 402008.