Closed Bug 447815 Opened 16 years ago Closed 16 years ago

Update eTLD (public suffix) list

Categories

(Core :: Networking, defect)

1.9.0 Branch
defect
Not set
normal

RESOLVED FIXED

People

(Reporter: pamg.bugs, Assigned: pamg.bugs)

Details

Attachments

(2 files, 3 obsolete files)

Attached patch Updates and additions (obsolete) — Splinter Review
Additional information has become available, or has been tracked down, for some TLDs in our list. Here's a patch incorporating the missing eTLDs for .ru contributed by Sami Tolvanen in bug 403655, adding .rs and .me information, and updating info for several TLDs based on posted registrar information (with occasional small additions from Wikipedia).
Attachment #331125 - Flags: review?(jo.hermans)
Summary: Update eTLD (public host) list → Update eTLD (public suffix) list
Attached patch Patch v.1 (obsolete) — Splinter Review
Here's a diff containing all the authenticated submissions I have. Please merge this in also. If you have the same changes from another source, please preserve the lines giving the registry source, as those are still important :-) Thanks, Gerv
Comment on attachment 331248
Patch v.1

> // pa : http://www.nic.pa/
>+// Submitted by registry <edna.samudio@utp.ac.pa> 2008-06-18
> *.pa
>+!nic.pa
>+!pannet.pa
>+!presidencia.pa
>+!milpolleras.pa
>+!sume911.pa
>+!root-ca.pa
>+ac.pa
>+gob.pa
>+com.pa
>+org.pa
>+sld.pa
>+edu.pa
>+net.pa
>+ing.pa
>+abo.pa
>+med.pa
>+nom.pa

The *.pa at the top subsumes all these later [something].pa entries. If *.pa stays, the individual ones can go; if the individual ones stay, none of the exception lines are needed.

> // sa : http://www.saudinic.net.sa/page.php?page=1&lang=1
>+// Submitted by registry <sa-tld-tech-contact@nic.net.sa> 2008-06-23
> *.sa
>+com.sa
>+net.sa
>+org.sa
>+gov.sa
>+med.sa
>+pub.sa
>+edu.sa
>+sch.sa

Same here.
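To make the subsumption point concrete, here is a minimal Python sketch of wildcard and exception matching. It is an illustration only, not the code the browser uses: the function name `public_suffix` and the shortened rule lists are assumptions made for this example, and it assumes the list semantics discussed in this bug (a `*` matches exactly one label, a `!` exception overrides a wildcard, and a name that matches no rule falls back to its last label).

```python
def public_suffix(domain, rules):
    """Return the public suffix of `domain` under a list of rule strings.

    Hypothetical helper for illustration; `rules` uses the list's own
    notation, e.g. "*.pa", "!nic.pa", "com.pa".
    """
    labels = domain.lower().split(".")
    best = None  # label count of the longest matching normal rule
    for rule in rules:
        is_exception = rule.startswith("!")
        rule_labels = rule.lstrip("!").split(".")
        if len(rule_labels) > len(labels):
            continue
        tail = labels[-len(rule_labels):]
        if all(r == "*" or r == l for r, l in zip(rule_labels, tail)):
            if is_exception:
                # An exception wins outright: the suffix is the rule
                # with its leftmost label removed.
                return ".".join(rule_labels[1:])
            if best is None or len(rule_labels) > best:
                best = len(rule_labels)
    # No rule matched: fall back to the implicit one-label rule.
    n = best if best is not None else 1
    return ".".join(labels[-n:])


# Shortened versions of the submitted .pa data (assumed for the example).
WILDCARD_ONLY = ["*.pa", "!nic.pa"]
WILDCARD_PLUS_LIST = ["*.pa", "!nic.pa", "com.pa", "gob.pa"]
LIST_ONLY = ["com.pa", "gob.pa"]
```

Under these assumptions, `public_suffix("www.example.com.pa", ...)` gives `com.pa` for both `WILDCARD_ONLY` and `WILDCARD_PLUS_LIST`, so the explicit entries add nothing once `*.pa` is present. With `LIST_ONLY`, `www.nic.pa` already resolves to plain `pa`, so no exception line is needed.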
Pam: I made those points to the submitters, but that's what they asked for. Given that we'd like it to be their responsibility if it's wrong, my thought is that we stick to what they give us, particularly if it's functionally equivalent to what we would do. What do you think? Gerv
It irks me to have them there, but I have to agree: as long as they're (mostly) harmless we ought to leave them alone. And look at adding to the documentation so it's clearer that this is pointless? If somebody tries to send a 4000-line [something].[blah] entry with a *.[blah] at the top, though, I'll be making more of a fuss. :)
I think the file should express intent clearly, and we should be willing to edit submissions (and note the edits we made inline in comments) if they don't meet that. For example, with "sa" above, is the intent that the registrar has established 8 eTLDs for now but will establish an arbitrary number of additional ones in the future? Or did they just screw up when putting in *.sa? The case for "pa" is even less obvious to me: http://pannet.pa/ is a live site, so rather than "*.pa", I think the registrar meant "pa"?
i think we should apply the same standard of code review to this file, when accepting changes - making sure that they make sense and aren't contradictory or redundant. some people might be confused by these things and either a) think it's a mistake, as peter points out above, or b) think the documentation is wrong or that they don't understand things properly. and we don't want these mistakes to spread and appear in new entries and such... can we point this out by email to them and give them two options to correct it (either just *.sa or the list of entries)?
This patch merges the 2 patches already submitted by Pam Greene and Gervase Markham. Furthermore, it adds these changes:

+ [bz] Added edu.bz and gov.bz (source: Wikipedia)
+ [bw] Added org.bw (source: Wikipedia)
+ [asia] Added asia (source: Wikipedia)
+ [tl] Added gov.tl (source: Wikipedia)
+ [mk] Added inf.mk + name.mk (source: http://dns.marnet.net.mk/postapka.php)
+ [cd] Added gov.cd (source: https://www.nic.cd/domain/insertDomain_2.jsp?act=1)

(If you need a patchfile with only the changes I made, just drop me a mail.)

As I need the list for a project, I am interested in keeping it up to date. If you need help to maintain this list and/or publicsuffix.org, just drop me a mail as well.
David: thank you for your offer in helping to maintain the list. Don't go away :-) Not sure why Jo isn't on this bug; CCing him. He has some additional fixes to make too. We should get this done as soon as we can, as 3.1 is entering the path to release. As for those two cases, I'll post the correspondence when I get back from holiday, and people can decide based on what they said. Gerv
Please check this patch using http://www.domain.me/ and then roll it in too; it was submitted via the email address on the website. Gerv
Attachment #331125 - Attachment is obsolete: true
Attachment #331248 - Attachment is obsolete: true
Attachment #331125 - Flags: review?(jo.hermans)
There is a dot in front of priv.me, but other than that it looks valid (verified using the official registry homepage[1]). The .me info is already included in the patch by Pam Greene and hence in my patch.

There are some registries in the list that are just selling subdomains and are not approved as registries by the IANA (e.g. priv.at or the centralnic.com ones), but are still reachable using the "official" DNS root. Considering this, should we add domains from alternative DNS root zones[2] too? Maybe using a second list for unofficial ones?

-------
1: http://www.domain.me/index.php?page=6
2: http://en.wikipedia.org/wiki/Alternative_DNS_root
(In reply to comment #9)
> Created an attachment (id=333218)
> .me info, submitted.
>
> Please check this patch using http://www.domain.me/ and then roll it in too; it
> was submitted via the email address on the website.
>
> Gerv

Gerv: the Montenegro (.me) entries are also in Pam's list, and her version is more correct (this attachment has an extra dot in front of priv.me that doesn't belong there).
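A stray leading dot like the one in front of priv.me is easy to catch mechanically. The following is only a sketch of such a check, assuming one rule per line with `//` comments as in the excerpts quoted in this bug; the function name `malformed_rules` is made up for the example and is not part of any existing tool.

```python
def malformed_rules(lines):
    """Return rules that contain an empty label (leading, trailing or doubled dot).

    Hypothetical lint helper; comment lines (starting with "//") and
    blank lines are skipped.
    """
    bad = []
    for line in lines:
        rule = line.strip()
        if not rule or rule.startswith("//"):
            continue
        # Ignore the exception marker when splitting into labels.
        body = rule[1:] if rule.startswith("!") else rule
        if any(label == "" for label in body.split(".")):
            bad.append(rule)
    return bad
```

Run over a submission before merging, this would report `.priv.me` while leaving entries such as `co.me`, `*.sa` and `!nic.pa` alone.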
Comments on Pam's list (as promised in private mail):

.ao : ok
.bf : not found in any official list, but gov.br really exists
.bg : ok, but maybe we can simplify it with a "?.bg" rule
.bh : no official list found, but com.bh exists
.bi : ok
.bw : the official document only mentions co.bw (indirectly), but org.bw also exists (and is mentioned on Wikipedia). Should we add it?
.by : com.by isn't mentioned in any official document
.bz : ok
.ci : ok
.cl : no official document found (probably ok)
.cm : no official document found (probably ok)
.cx : no official document found (probably ok)
.gy : ok
.io : no official document found
.km : first list is correct, but the "suggestions" are only suggestions (and not complete, and pharmacien.km is misspelled)
.kn : mostly ok (gov.kn isn't mentioned in that url, but probably ok)
.ma : ok
.me : ok (see also remark on Gerv's attachment)
.mr : no official list found, but probably ok
.mu : ok
.museum : well yes, it's the list at <http://index.museum/>, but that shows both domains and eTLDs, for instance clinton.museum and film.museum. How are we going to maintain it, since it's so large and dynamic? The regular "museum" rule would cover most of the cases.
.na : ok
.ps : ok
.rs : ok
.ru : ok (I'll mark the other bug reports that mention this as duplicates of this one)
.sz : ok
.tn : ok
.tv : no official list found
.ws : ok
Gerv: a while ago, you sent me a sorted list that was pulled from a website somewhere. It largely overlaps with Pam's list, but misses several eTLDs and even contains some spelling mistakes. However, while walking through it together with Pam's list, I discovered several additions (besides the ones that David mentions). I can add them later, once we agree on Pam's list and the changes mentioned above; otherwise it gets too complicated with all these patches.
Here are some more eTLDs (without patch, as I don't know if we should add these):

* .gb: Old domain for the United Kingdom. Wikipedia[1] says that there is still one active page[2], but I can't resolve it.
* .kp: Domain for North Korea. No registrations are possible yet, but two domains are already active (see Wikipedia[3]).
* .so: Somalia is currently without an internationally recognized government. The domain is active, however, although according to Wikipedia[4] 2 of 3 nameservers are dead. nic.so, wcd.so and www.amoud-university.borama.ac.so/ resolve fine for me though.
* .pm, .yt: French overseas territories. No registrations are allowed, but nic.pm/nic.yt redirect to the Afnic homepage.
* .wf: French overseas territory. No registrations are allowed, but nic.wf as well as some domains of government organisations[5][6] are active.

Gervase: You said that you wanted to write to a mailing list for registries[7]. There are some entries marked as "Confirmed/Submitted by registry", but are those few the only ones where the registry answered? Can we ask ICANN to send a second mail, asking for more submissions and updates?

-------
1: http://en.wikipedia.org/wiki/.gb
2: dra.hmg.gb
3: http://en.wikipedia.org/wiki/.kp
4: http://en.wikipedia.org/wiki/.so_(domain_name)
5: http://www.afnic.fr/actu/nouvelles/general/CP20080404
6: http://www.google.com/search?site=seach&hl=en&q=site%3A*.wf&btnG=Search
7: https://bugzilla.mozilla.org/show_bug.cgi?id=403655#c8
Let's focus on David's patch, and do a new one for further additions. If we've decided we should canonicalize, let's fix .pa and .sa, and then I think it's ready to go. David/Pam? Gerv
At least one registry (norid for .no domains) already declined to notify us in case of changes. We should try to position ourselves as a central collecting point for eTLDs, where registries have (to some extent) free choice in what they submit; otherwise more registries will do the same. If we find errors/redundancies, we can point them out, but I guess that altering submissions will lead to friction between publicsuffix.org/Mozilla and the registries. (Of course, we can't accept a list of 4000 subdomains where a *.tld would be enough.) Other than that, I think we are clear to commit (actually the sooner the better, as every missing entry can be a security problem).
Letting registries submit apparently-redundant data is problematic because such submissions are ambiguous. It is not clear from:

*.foo
a.foo
b.foo
c.foo

...whether "d.foo" was intended to be treated as a valid TLD or not. Therefore we should not commit submissions blindly but should instead respond to such submissions by saying "this is the same as just *.foo. Is that what you meant?" and come to some kind of agreement on an unambiguous version we can commit.
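One way to spot submissions that need this kind of follow-up is to flag explicit entries that sit directly under a wildcard in the same list. The Python sketch below is only an illustration of that idea; `redundant_rules` is a hypothetical name, and it assumes rules are given as plain strings in the list's notation.

```python
def redundant_rules(rules):
    """Return explicit rules already covered by a "*.parent" wildcard.

    Hypothetical review helper: wildcard and exception ("!") rules are
    never reported, only plain entries one label below a wildcard.
    """
    wildcard_parents = {r[2:] for r in rules if r.startswith("*.")}
    redundant = []
    for rule in rules:
        if rule.startswith(("*", "!")):
            continue
        # Everything after the first label is the rule's parent.
        _, _, parent = rule.partition(".")
        if parent in wildcard_parents:
            redundant.append(rule)
    return redundant
```

For the example above, `redundant_rules(["*.foo", "a.foo", "b.foo", "c.foo"])` reports all three explicit entries, which is the cue to ask the submitter whether `*.foo` alone is what they meant.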
I agree that we shouldn't add unclear, self-contradictory information. It largely defeats the purpose of the list if you can't tell unambiguously what the public suffixes actually are. We should point out the problem and ask for clarification, and if none is forthcoming (or we want something to use in the meantime), make our best/safest guess.
Can someone please fix the patch and post a new version for review? I can then review it, and we can get it checked in and move on to the next set of changes. :-) Thanks, Gerv
Okay, I removed the redundant entries by keeping the wildcards and removing the rest. The new entries look like this:

// sa : http://www.saudinic.net.sa/page.php?page=1&lang=1
// List with redundant entries as submitted by registry <sa-tld-tech-contact@nic.net.sa> 2008-06-23
//*.sa
//com.sa
//net.sa
//org.sa
//gov.sa
//med.sa
//pub.sa
//edu.sa
//sch.sa
// As discussed in bug #447815 on bugzilla.mozilla.org, the list has been truncated.
*.sa
Attachment #332539 - Attachment is obsolete: true
Tests run and passed; checked in as changeset 18451:b4c2e3d7f6ed. I'll open a new bug for any further changes. Gerv
Status: ASSIGNED → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
We want this into Firefox 3.0.x at some point. After it's had a week on the trunk I'll nominate it. Gerv
So we've decided to keep all those *.museum entries, even though some of them are plain domains rather than public suffixes?
Well, you put them there :-) I must confess I'd missed Jo's comment. I know the guy who runs .museum - I can send him an email. Gerv
Pam: what makes you think we've done .museum wrong? It looks like we took this list: http://index.museum/ and removed all the domains with an asterisk, thereby leaving the two-label names which "may be included in a number of independent three-label names". Is that wrong? Gerv
I was going by Jo's comment #12, thinking that I'd left the non-TLD ones in by mistake, but I now recall that I removed them. According to the site, clinton.museum and film.museum are actually eTLDs (and film.museum has two independent domains registered). I retract my comment #23. Maintenance is still a potential issue, of course.