Closed
Bug 286534
Opened 20 years ago
Closed 19 years ago
Implement IDN punycode display by .tld
Categories
(Core :: Networking, defect)
Core
Networking
Tracking
()
RESOLVED
FIXED
People
(Reporter: gerv, Assigned: jshin1987)
References
Details
Attachments
(1 file, 2 obsolete files)
7.47 KB,
patch
|
gerv
:
review+
darin.moz
:
superreview+
asa
:
approval-aviary1.1a2+
|
Details | Diff | Splinter Review |
We need to replace the network.IDN_show_punycode pref with something more
fine-grained, based on the TLD of the domain. This allows us to enable/disable
correct display of IDNs in individual TLDs based on whether or not they have
good homograph-protection policies in place.
I suggest a pref branch, so:
network.IDN.disable.com
network.IDN.disable.net
...
Then, we can assess a particular URL with a single pref lookup. (The only way
this might one day run into trouble is if we get non-ASCII characters in TLDs
themselves. How likely is that?)
Gerv
Comment 1•20 years ago
|
||
Actually, it's quite likely in the long run. But then, you cound probably do
something like network.IDN.disable.xn--asdfghjk ... does this mean that pref
names are ASCII-only?
Comment 2•20 years ago
|
||
We have too many hostname based lookups. In theory, nsIPermissionManager could
do this as well, but those setting aren't this easy to update from xpi's
Assignee | ||
Comment 3•20 years ago
|
||
*** Bug 290275 has been marked as a duplicate of this bug. ***
Assignee | ||
Comment 4•20 years ago
|
||
The following was taken from bug 290275 (assuming this bug is not only about
blacklisting),
All IDN urls are currently displayed as Punycode by default. Whitelist of safe
TLDs like Opera is needed to resolve this IDN problem more smartly.
http://www.opera.com/windows/changelogs/800b2/
I suggest new pref like this:
pref("network.IDN_TLDs_whitelist", "no, jp, de, se, kr, tw, cn, at, dk, ch, li");
If IDN url matches whitelist, the url should be displayed as Unicode whether
"network.IDN_show_punycode" pref is true or false.
Comment 5•20 years ago
|
||
Opera 8.0 beta 3 allows the user to switch from a whitelist to a blacklist:
http://www.opera.com/windows/changelogs/800b3/
How about using the same format in Mozilla? E.g.:
user_pref("network.IDN_TLD", ":no:jp:de:se:kr:tw:cn:at:dk:ch:li:museum:hu:");
user_pref("network.IDN_TLD", "~:com:");
This allows users of both Mozilla and Opera to copy and paste their preferences
from one product to the other.
Reporter | ||
Comment 6•20 years ago
|
||
> This allows users of both Mozilla and Opera to copy and paste their preferences
> from one product to the other.
And who apart from you is ever going to do that? :-)
This bug is currently being worked on by Neil and Lorenzo, with me providing
input. The current design calls for a pref tree (given that this sort of thing
is what they were invented for), allowing a zero-parsing, single-pref-lookup
solution:
network.idn.whitelist.com
network.idn.whitelist.org
...
This also allows users and extensions to easily set or unset a single TLD
without worrying about the others.
Gerv
Comment 7•20 years ago
|
||
> And who apart from you is ever going to do that? :-)
:-)
> network.idn.whitelist.com
Is this a boolean pref? (I assume network.idn.blacklist.* is not planned?)
Comment 8•20 years ago
|
||
Reporter | ||
Comment 9•20 years ago
|
||
Erik: yes, they are boolean prefs. We could have a whitelist or a blacklist;
having both doesn't make much sense. I argued for a blacklist, but I think it's
more important that we do something which is consistent with the stance of the
other browser manufacturers.
Masayuki: both ideas require a single pref lookup; Mozilla has support for
things called "pref trees" which allow you to get a handle on e.g.
network.idn.whitelist and then look up relative prefs. The "all in one string"
idea requires extra parsing. I agree the performance difference probably isn't
big - the key requirements are ease of updating and maintenance.
If it's all one string, how do you suggest the Foundation publish updated
whitelists to people who have modified their whitelist? In the pref branch plan,
it's easy - the new prefs.js is shipped, and everything automatically works.
Gerv
Reporter | ||
Comment 10•20 years ago
|
||
Nominating for 1.8b3. We need this for Firefox 1.1.
Gerv
Blocks: 290872
Flags: blocking1.8b3?
Updated•20 years ago
|
Flags: blocking1.8b3?
Flags: blocking1.8b3+
Flags: blocking-aviary1.1+
Comment 11•20 years ago
|
||
Gerv:
Who fix this bug?
If nobody can create the patch, I try it.
Reporter | ||
Comment 12•20 years ago
|
||
Masayuki: thank you for your kind offer. Two guys are supposed to be working on
it; I'll give them a ping and make sure they assign this bug to themselves.
Gerv
Comment 13•20 years ago
|
||
For a possible implementation, see bug 279099 comment 230. I'm working on this,
but I probably won't have time to do much until next week. If someone else
thinks they can have a patch ready before then, feel free to step up and do it.
Status: NEW → ASSIGNED
Comment 14•20 years ago
|
||
> ------- Additional Comment #13 From Lorenzo Colitti 2005-04-19 16:21 PDT
[reply] -------
>
>For a possible implementation, see bug 279099 comment 230. I'm working on this,
>but I probably won't have time to do much until next week. If someone else
>thinks they can have a patch ready before then, feel free to step up and do it.
Lorenzo, did you get a chance to work on this...
jshin, darin and I thought you would be a good person to dedicate some time to
this bug so it doesn't fall thought the cracks. can you help out with coming
up with a fix or talking a look at the proposals made by lorenzo or others?
thanks
Assignee: darin → jshin1987
Status: ASSIGNED → NEW
Assignee | ||
Comment 15•20 years ago
|
||
Ok. I'll try to drive this. Lorenzo, have you had a chance to work on this
since? I'll begin my work later today.
Assignee | ||
Comment 16•20 years ago
|
||
I introduced 'network.IDN.safe.TLD' (the name is tentative). For TLDs with
'network.IDN.safe.TLD set to true, we show IDNs in UTF-8 rather than in
punycode. If pref. is not defined for a TLD or set to false, we use punycode.
These preference entries are only referred to when network.IDN_show_punycode'
is 'false'. If it's set to true, we always use punycode for all TLDs regardless
of the values of 'network.IDN.safe.TLD'.
I'm not sure if I need caching and/or adding observer(s).
darin, what would you say?
Assignee | ||
Updated•20 years ago
|
Status: NEW → ASSIGNED
Assignee | ||
Comment 17•20 years ago
|
||
If we decide to support both whitelist and blacklist (not at the same time),
'network.IDN.show_punycode' (with an appropriate name change) can be used to
indicate which to use, blacklist or whitelist. We can do as follows:
if ((whitelist && not explicitly safe) || (blacklist && explicitly set unsafe))
use punycode
else
use UTF-8
Needless to say, we can only support blacklist if it's decided that it's better.
Assignee | ||
Comment 18•20 years ago
|
||
any comment, good or bad?
Comment 19•20 years ago
|
||
Jungshik, it's great that you're doing this work. I won't attempt to answer
all of your questions, but Gerv said that he would like to adopt the same
stance as other browser vendors, and Opera has chosen the whitelist approach
(by default), so Mozilla should too.
I don't think we can allow the user to switch to a blacklist since Gerv would
like it to be possible to update the whitelist from the mozilla.org side.
E.g. additional pref entries for whitelisted domains.
Gerv, is Mozilla going to adopt Opera's whitelist?
Reporter | ||
Comment 20•20 years ago
|
||
We're planning to adopt Opera's whitelist unless either someone comes to me with
a good reason why we should remove one of their entries, or I have time to do
some in-depth research myself.
So yes, we do not want to allow switching between whitelist and blacklist. This
is a security feature, and should have no UI.
I would prefer the pref stub to be "network.IDN.whitelist.". Other than that,
this patch looks pretty good in terms of function.
Gerv
Comment 21•20 years ago
|
||
Jungshik: sorry for not replying earlier, but I wasn't CC'd on this bug.
The patch looks like what I wanted to do, except it's better written. :) I have
a couple of questions though:
1. You use PromiseFlatCString(host).get() to convert the hostname to ASCII. Is
this safe? And if any Unicode TLDs are ever assigned, it will need to be changed.
2. Why not add another pref observer and put the code that loads the pref into
PrefsChanged()? If I understand the code correctly (but maybe I don't) that
would be simple and would give you the benefit of being able to react to pref
changes.
Comment 22•20 years ago
|
||
(In reply to comment #21)
> 1. You use PromiseFlatCString(host).get() to convert the hostname to ASCII. Is
> this safe? And if any Unicode TLDs are ever assigned, it will need to be changed.
That function does not convert to ASCII... it essentially just allows you to
call .get().
Comment 23•20 years ago
|
||
I think we should consider using nsIPermissionManager for hosts the user wants
to explicitly allow or explicitly block. The preferences systems is a bad
choice for storing this information since it is affected by changes to the
default prefs.
IMO, we should use a non-pref backend for storing the information we download.
Perhaps we could store a file in the profile, and fetch it periodically (using a
conditional HTTP query). We could look at extending the permission manager to
aggregate multiple data files, or we could layer something on top of the
permission manager.
Reporter | ||
Comment 24•19 years ago
|
||
> The preferences systems is a bad choice for storing this information since it is
> affected by changes to the default prefs.
IMO, that's what makes it a good choice. If the Turks and Caicos Islands come
out with appropriate policies on IDN, the next update we ship needs to turn on
IDN for that TLD for everyone who has not explicitly disabled it in the past.
Changing the default prefs achieves exactly that, doesn't it?
Gerv
Assignee | ||
Comment 25•19 years ago
|
||
(In reply to comment #23)
> I think we should consider using nsIPermissionManager for hosts the user wants
> to explicitly allow or explicitly block.
'Per host' en/disabling is beyond the scope of this bug even if it's desirable.
Reporter | ||
Comment 26•19 years ago
|
||
Jungshik: any chance of an updated patch with the tweak I requested? I'll then
review it and we can chase SR.
Gerv
Assignee | ||
Comment 27•19 years ago
|
||
I changed the name of the pref. branch to network.IDN.whitelist as suggested by
gerv.
Attachment #184849 -
Flags: review?(gerv)
Comment 28•19 years ago
|
||
jshin:
Your patch is not valid file. Please check it.
Assignee | ||
Comment 29•19 years ago
|
||
ooops. thanks for pointing that out. this is 'the' real thing.
Attachment #182326 -
Attachment is obsolete: true
Attachment #184849 -
Attachment is obsolete: true
Attachment #184858 -
Flags: review?(gerv)
Assignee | ||
Updated•19 years ago
|
Attachment #184849 -
Flags: review?(gerv)
Comment 30•19 years ago
|
||
Why is only de, jp and kr included in the patch ? What about no, se, tw, cn, at,
dk, ch, and li (rest of the Opera domains) ?
Reporter | ||
Comment 31•19 years ago
|
||
Comment on attachment 184858 [details] [diff] [review]
patch update
r=gerv.
I note that this patch doesn't include all the domains. That's fine - we can do
a separate patch adding (or removing) domains later. This is certainly not a
reason to hold this up.
Gerv
Attachment #184858 -
Flags: superreview?(darin)
Attachment #184858 -
Flags: review?(gerv)
Attachment #184858 -
Flags: review+
Comment 32•19 years ago
|
||
Comment on attachment 184858 [details] [diff] [review]
patch update
>Index: netwerk/base/src/nsStandardURL.cpp
>+
>+ nsCOMPtr<nsIPrefService> prefs( do_GetService(NS_PREFSERVICE_CONTRACTID) );
>+ if (prefs) {
>+ nsCOMPtr<nsIPrefBranch> branch;
>+ if (NS_SUCCEEDED(prefs->GetBranch( NS_NET_PREF_IDNWHITELIST,
>+ getter_AddRefs(branch) )))
>+ NS_ADDREF(gIdnWhitelistPrefBranch = branch);
> }
hmm... can't you just QI prefBranch to nsIPrefService instead of calling
GetService again?
> NS_IF_RELEASE(gIDN);
> NS_IF_RELEASE(gCharsetMgr);
>+ NS_IF_RELEASE(gIdnWhitelistPrefBranch);
I think it'd be good to stick with "IDN" instead of using "Idn" here.
>+ if (gShowPunycode || !IsInWhitelist(PromiseFlatCString(host).get())) {
Can you make it so that IsInWhitelist does not need a null
terminated host string? How about using the string API
all the way down into IsInWhitelist?
> static nsIIDNService *gIDN;
>+ static nsIPrefBranch *gIdnWhitelistPrefBranch;
> static nsICharsetConverterManager *gCharsetMgr;
> static PRBool gInitialized;
> static PRBool gEscapeUTF8;
> static PRBool gAlwaysEncodeInUTF8;
> static PRBool gShowPunycode;
nit: please list these variables in the same order in the cpp file
as they are listed here.
general comment: so, we anticipate using the update service to add or
remove TLDs periodically? and just bail on trying to deal with the
problems associated with user-set preferences matching a changed
default pref value, thus resulting in the user-set value being cleared?
Attachment #184858 -
Flags: superreview?(darin) → superreview+
Reporter | ||
Comment 33•19 years ago
|
||
(I appear to remember having this conversation before, but I may need
enlightening again.) Regarding Darin's general comment: have I misunderstood how
default prefs work? Surely user-set prefs override default prefs? That's why
they are called "default"?
Gerv
Assignee | ||
Comment 34•19 years ago
|
||
Comment on attachment 184858 [details] [diff] [review]
patch update
asking for a.
I made sure this works well. We for sure want this feature in 1.1
Darin's comments were addressed in my tree. As for IsInWhitelist, I tried a few
alternatives and ended up doing this. At one point, I need a null-terminated
string because GetBoolPref needs 'char *'. Because |host| is usually less than
64 bytes, this seems to be a good way to avoid the memory allocation.
if (gIDNWhitelistPrefBranch &&
(pos = nsCAutoString(host).RFind(".")) != kNotFound &&
NS_SUCCEEDED(gIDNWhitelistPrefBranch->
GetBoolPref(nsCAutoString(Substring(host, pos + 1)).get(),
&safe)))
return safe;
Attachment #184858 -
Flags: approval-aviary1.1a2?
Updated•19 years ago
|
Attachment #184858 -
Flags: approval-aviary1.1a2? → approval-aviary1.1a2+
Assignee | ||
Comment 35•19 years ago
|
||
fix checked in. I added no, se, tw, cn, at,dk, ch, and li to the pref. file.
Status: ASSIGNED → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
Comment 36•19 years ago
|
||
Jungshik, any reason we're still blocking hebrew and arabic tld(s)?
Assignee | ||
Comment 37•19 years ago
|
||
It's not about scripts/languages but about TLDs. As Gerv wrote in comment #31,
we have to deal with that in a separate 'bug'. Anyway, basically what we did is
the same as what Opera did: (
http://www.opera.com/windows/changelogs/800b2/)
# TLDs are considered safe if they have implemented anti-homographic character
policies or otherwise limited the available set of characters to prevent spoofing.
# Current whitelist contains the following top-level domains: no, jp, de, se,
kr, tw, cn, at, dk, ch, and li
Comment 38•19 years ago
|
||
> what we did is the same as what Opera did:
> (http://www.opera.com/windows/changelogs/800b2/)
>
> Current whitelist contains the following top-level domains:
> no, jp, de, se, kr, tw, cn, at, dk, ch, and li
The cited document is incorrect. Starting several months ago, the whitelist
in the Opera distribution package has been:
IDNA White List=:no:jp:de:se:kr:tw:cn:at:dk:ch:li:museum:hu:
Comment 39•19 years ago
|
||
> > what we did is the same as what Opera did:
> > (http://www.opera.com/windows/changelogs/800b2/)
From the changelog in http://www.opera.com/windows/changelogs/800b3/ :
Improvement to IDNA handling.
* Script detection now allows more IDNs to be displayed in Unicode. Only a
single script and specific combinations are allowed in each label (separated by
"." or "-").
* Added support for switching from IDN whitelist to blacklist by adding a
'~' as the first character in the string, that is, "~:com:tw:"
* Added .hu and .museum to list of trusted domains.
Reporter | ||
Comment 40•19 years ago
|
||
Please don't make any changes to the list until I've had a chance to review the
current one. If you think a TLD has the appropriate policies in place and should
be added, please send me an email with URLs proving that this is the case.
Gerv
Reporter | ||
Comment 41•19 years ago
|
||
OK. I've reviewed the existing list, and also .museum and .hu. Here's the result:
http://www.mozilla.org/projects/security/tld-idn-policy-list.html
I've filed bug 299927 on getting those last two added to the list.
Gerv
You need to log in
before you can comment on or make changes to this bug.
Description
•