Closed Bug 1018056 Opened 10 years ago Closed 6 years ago

PSL: Adding new TLDs - proposed process change: add ALL remaining new ICANN TLDs

Categories

(Core Graveyard :: Networking: Domain Lists, defect, P5)

x86_64
Windows 7
defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jothan, Assigned: weppos)

Details

(Whiteboard: [necko-would-take])

Attachments

(5 files)

This is a request for feedback.  The PSL maintainers would like to alter the PSL process as it relates to how new ICANN gTLDs are added.

There has been a request from ICANN to add the remainder of the ICANN new TLDs that have completed and passed evaluation.

We have been under the practice of incrementally adding the new TLDs by following the new gTLD contracting announcements as they happen, trailing shortly behind them in adding them to the PSL.  

This incremental process was established prior to the evaluation of the TLDs, and it already represented a faster pace of changes, adopted in anticipation of ICANN's new TLD program.

The incremental release process currently used would change slightly.  

Rather than continue to make incremental updates as the contracting announcements are made by ICANN, there would be one final edit, provided by ICANN, to include all remaining nTLDs that have completed evaluation, and then any incremental changes to the PSL from the new TLDs would be limited to any TLDs that need updates or require deletion.

This change would be beneficial as it would reduce the administrative burden on the maintainers of the PSL; it would also have the beneficial outcome of helping to ensure software that uses static snapshots of the PSL can recognize any of the nTLDs without requiring frequent refreshes.

While this reduces overhead logistics for the maintainers, we want to ensure there is not any problem introduced by the change prior to implementation.  We respectfully request any concerns be identified or questions for clarification be raised as comments to this bug.

We would appreciate any feedback or questions.
Do you have a proposed patch?

Previously (Chrome 35 and earlier), a large addition would have bloated our binary size significantly. We recently changed how we use the PSL, so it would be good to know what the size impact would be.

Two other downsides from our use of the PSL - it would affect our Omnibox handling if, say, flowers was added to the PSL even though it wasn't delegated, as we would prefer to treat it as a user query (until it actually is delegated). The other is that we use the PSL for restrictive security (eg: limits on publicly trusted certs). Additions to the PSL would imply a level of delegation/process that isn't desirable.

I suppose to that extent, I would prefer the PSL stick to the announcements - especially since we have incorporated respecting those announcements in other fora (like CA/Browser Forum)
Jothan: 

* Is it possible to know how many TLDs would be involved, and therefore how big the patch would be?

* Do we know when roughly (e.g. what year or month) ICANN expects or hopes to have finished the process of delegating all of the TLDs which passed evaluation?

* Can you expand more upon the problems with Safari that you mentioned by email? Are they using a static list and not updating it?

Gerv
I think that adding new gTLDs after contracting is a bit too early. Most of the new registries take their time to launch the TLD. Wouldn't it be better to add a new TLD after delegation, since no one can use it before then anyway?

For new gTLDs ICANN has a list of new delegated strings: http://newgtlds.icann.org/en/program-status/delegated-strings

Up until now there should be about 300 new gTLDs delegated. There should be another 500 to 700 coming up in the next 2 years.

It is safe to say that we will probably see another round of new gTLDs in the coming years (after 2017).
Opened bug 1024514 for the next round of additions while we discuss this.

Gerv
I'll respond in-thread to Ryan, Gerv, and Tobias individually, and I have added Steve to the cc list.

ICANN's concept was to generate, all at once, the whole new TLD list and then generate a revised one as changes occur, which in theory would put them in charge of withdrawals.

This works conceptually until one of the delegated TLDs introduces third-level (or deeper) entries, as .WED recently requested (which BTW has thousands of sub-domain entries and makes the .jp or .museum PSL lists look very short). At that point the new TLD would need to move out of the ICANN-supplied list and into the rest of the PSL, alongside the existing TLDs.

Apple appears to be working with a time frozen snapshot of the PSL or other list that is current as of the time they do their release of Safari, and it is also the case that that snapshot timing is different between the different devices (so the results vary by platform).  The objective would be to get them the entire list of ICANN TLDs in the new GTLD funnel so at least if they are snapshotting it they have the whole list.

The idea of making one change and then trickling out incremental changes was attractive because it was thought to remedy Safari's 'time lag': its snapshots are only comprehensive as of the most recent software version, so newer TLDs go unrecognized and are routed to search.

Many people take the time to navigate Apple's bug reporting process to report that a domain in a new TLD is not working (and the TLD in question is a changing cast of characters over time).  That investment of time is then summarily met with the soul-crushing and maddening 'this is a duplicate of another bug and is being closed', with no ETA for resolution. I get earfuls about this all the time from different people, and then the batch of registries experiencing the problem is satisfied when the next release of Safari addresses THEIR problem.

The next batch of registries then rotates in for the next round of the same problem, and they email me, complain to ICANN, etc.   This spiral would continue until the current pipeline of TLDs is delegated or the PSL is filled up with those names.   The full-list option was quixotically intended to fast-forward past that pain loop.

Between the RSEP request that .WED has in to ICANN, the unintentionally adverse impacts Ryan mentions, and in reasoning the change out with other people who are downstream users of PSL or are in our consumption chain, it sounds like it might be wise to stick to the current path/process.

@Ryan ( in-thread "}}")
}} Do you have a proposed patch?

I am sure we can get one generated by Steve (added to cc list) to test

}} Previously (Chrome 35 and earlier), a large 
}} addition would have bloated our binary size 
}} significantly. We recently changed how we 
}} use the PSL, so it would be good to know 
}} what the size impact would be.

Without having you disclose the methodology used, the patch file might help measure this.  Steve can you generate one (AND LOOK AT THE NEXT IN-LINE RESPONSE)?

}} Two other downsides from our use of the 
}} PSL - it would affect our Omnibox handling 
}} if, say, flowers was added to the PSL even 
}} though it wasn't delegated, as we would 
}} prefer to treat it as a user query (until 
}} it actually is delegated). The other is that 
}} we use the PSL for restrictive security 
}} (eg: limits on publicly trusted certs). 
}} Additions to the PSL would imply a level of 
}} delegation/process that isn't desirable.

One thought here after a conversation with one of your co-workers ("W.K.") at Nanog about the delegation timing was to see if ICANN would be able to annotate entries to indicate those which have and have not been delegated with some special method, like adding the delegation state as another component of the comment line.  Perhaps if the word "UNDELEGATED" were on the comment line, it could allow for you (or others) to program logic that could ignore the TLD on the next line.  Once the TLD is added to the root and delegated, it would lose that annotation.
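
A minimal sketch of how a consumer might honor such an annotation, assuming (hypothetically) that the marker is the literal word UNDELEGATED somewhere on the `//` comment line directly above the rule. The PSL has no such annotation today; `load_rules` and the sample entries are made up for illustration.

```python
def load_rules(psl_text, skip_undelegated=True):
    """Return the rules in psl_text, optionally dropping annotated ones.

    Assumption: a comment line containing "UNDELEGATED" applies only to
    the single rule on the next non-blank line.
    """
    rules = []
    skip_next = False
    for raw in psl_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("//"):
            # Remember the annotation for the rule that follows.
            skip_next = "UNDELEGATED" in line
            continue
        if skip_next and skip_undelegated:
            skip_next = False
            continue
        skip_next = False
        rules.append(line)
    return rules


# Placeholder entries, not real PSL content.
SAMPLE = """\
// flowers : Example Registry A UNDELEGATED
flowers
// wed : Example Registry B
wed
"""

print(load_rules(SAMPLE))                          # ['wed']
print(load_rules(SAMPLE, skip_undelegated=False))  # ['flowers', 'wed']
```

A consumer that only cares about cookie scoping could pass `skip_undelegated=False`, while one that uses the list to tell domains from search queries could keep the default.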

Still, this means frequent patching, but it MIGHT make it slightly more manageable.  So would adding some volunteers, though, and with Tobias and Stefan on deck volunteering to aid in patch efforts we might be able to keep up.

}} I suppose to that extent, I would prefer the 
}} PSL stick to the announcements - especially 
}} since we have incorporated respecting those 
}} announcements in other fora 
}} (like CA/Browser Forum)

AGREE though I would like to do what is possible regarding Safari, so if you could tolerate introducing a process to parse the status of the TLD from ICANN and perhaps omit conditionally, could that work?

@Gerv ( in-thread "]]" )
]] * Is it possible to know how many TLDs would be 
]] involved, and therefore how big the patch would be?

Tobias' estimation may be right, but we can get the 
exact number of names that would be involved from 
Steve (who I added to the cc list).

]] * Do we know when roughly (e.g. what year or month) 
]] ICANN expects or hopes to have finished the process 
]] of delegating all of the TLDs which passed 
]] evaluation?

An estimation could be 24 months, perhaps.  Estimation might be as good as it gets.

Reason being, ICANN are really producing the strings at an impressive pace and their operations and gTLD teams deserve some praise for this, but timing and schedule are likely difficult for them to control beyond being efficient with their piece of it. They are not always in control over the schedule, as there are portions of the release process that are upon the applicant or third parties, legal review, or other gating factors.

]] * Can you expand more upon the problems with Safari 
]] that you mentioned by email? Are they using a 
]] static list and not updating it?

See my rant above.  It looks like they use a static list per product. For example, Safari on iOS for iPhone, Safari on iOS for iPad, Safari for Windows, Safari for Mountain Lion - all of them "Think Different" about if they should send a domain typed without http:// to search or DNS.  This seems to track to what is contained in PSL at or near the time of their release, and then remains frozen until next update.

@Tobias ( in-thread ";;" )
;; I think that adding new gTLDs after contracting 
;; is a bit too early. Most of the new registries 
;; take their time to launch the TLD. Wouldn't 
;; it be better to add a new TLD after delegation, 
;; because no one can use it before anyway?

Tobias, this is not always true, as the Certification Authorities (CAs) have been given 120 days to deal with wildcard certs or certs issued outside of compliance with ICP-3.  The ability to test this in browsers may be important for verifying that the revocation process works.

;; For new gTLDs ICANN has a list of new delegated 
;; strings: http://newgtlds.icann.org/en/program-status/delegated-strings

We are aware of this resource but thank you for making it available for others that might review this bug.

;; Up until now there should be about 300 new 
;; gTLDs delegated. There should be another 
;; 500 to 700 coming up in the next 2 years.

Yes, sir.

;; It is safe to say that we will probably see 
;; another round of new gTLDs in the next 
;; years (>2017).

I don't necessarily agree with you about this.   18-24 months has also been thrown around.
This is the referenced registry change that was submitted by .wed and is under comment period where community comments can be made about the request
This is the list of names that .WED requested be available for third-level registration below, and is intended to go along with the registry request.
I have added the .WED RSEP request for reference.  THIS WAS NOT A REQUEST FOR INCLUSION IN PSL. 

I included the request only for context.
Jothan, thanks for the summary of the current situation.

The Safari / Apple thing is quite an issue. There would be some ways to change that, but that depends on Apple. For instance they could add an update function in iOS / OSX to pull a new snapshot instead of waiting for the next release cycle of their software, couldn't they?

(In reply to Jothan Frakes from comment #5)
> One thought here after a conversation with one of your co-workers ("W.K.")
> at Nanog about the delegation timing was to see if ICANN would be able to
> annotate entries to indicate those which have and have not been delegated
> with some special method, like adding the delegation state as another
> component of the comment line.  Perhaps if the word "UNDELEGATED" were on
> the comment line, it could allow for you (or others) to program logic that
> could ignore the TLD on the next line.  Once the TLD is added to the root
> and delegated, it would lose that annotation.

We could add all new gTLDs to the list and we could parse ICANN's data (http://newgtlds.icann.org/en/program-status/delegated-strings) to change the status of the TLD from UNDELEGATED to DELEGATED. So this would run automatically.
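
A rough sketch of that automatic step, assuming the set of delegated strings has already been extracted from ICANN's page (the fetching and parsing are not shown) and that the annotation is a literal UNDELEGATED on the comment line directly above the rule. `mark_delegated` and the sample entries are hypothetical.

```python
def mark_delegated(psl_text, delegated):
    """Flip UNDELEGATED to DELEGATED for every TLD found in `delegated`."""
    lines = psl_text.splitlines()
    out = []
    for i, line in enumerate(lines):
        # The rule a comment refers to is assumed to be the very next line.
        next_rule = lines[i + 1].strip() if i + 1 < len(lines) else ""
        if (line.startswith("//")
                and "UNDELEGATED" in line
                and next_rule in delegated):
            line = line.replace("UNDELEGATED", "DELEGATED")
        out.append(line)
    return "\n".join(out)


# Placeholder entries, not real PSL content.
SAMPLE = """\
// flowers : Example Registry A UNDELEGATED
flowers
// wed : Example Registry B UNDELEGATED
wed"""

print(mark_delegated(SAMPLE, {"wed"}))
```

Only the `wed` comment changes in this run; `flowers` keeps its UNDELEGATED marker until it shows up in the delegated set.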

> Still, this means frequent patching, but it MIGHT make it slightly more
> manageable.  So would adding some volunteers, though, and with Tobias and
> Stefan on deck volunteering to aid in patch efforts we might be able to keep
> up.

Another thought is what I sent you off-list: to set up a website that allows registries to manage their data themselves. Any changes made by them would trigger a request. Registries could be identified automatically by their IANA data, or could request login credentials.
(In reply to Jothan Frakes from comment #5)
> }} Previously (Chrome 35 and earlier), a large 
> }} addition would have bloated our binary size 
> }} significantly. We recently changed how we 
> }} use the PSL, so it would be good to know 
> }} what the size impact would be.
> 
> Without having you disclose the methodology used, the patch file might help
> measure this.  Steve can you generate one (AND LOOK AT THE NEXT IN-LINE
> RESPONSE)?

Our methodology is open source; it was contributed by Opera after they were particularly feeling the pain of the PSL.

http://src.chromium.org/viewvc/chrome/trunk/src/net/tools/tld_cleanup/

has the new version, the SVN history has the older version.

Note that this is simply one example. Other applications, such as Guava, also parse the entire list. In older versions, the creation of the trie used to be done on the fly; now, it uses a radix trie ( https://code.google.com/p/guava-libraries/source/browse/guava/src/com/google/thirdparty/publicsuffix/TrieParser.java )
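
For readers unfamiliar with the approach, here is a minimal sketch of a label trie for suffix lookup, keyed from the TLD inward. It is not the Chromium or Guava code, which use more compact encodings; the nested-dict layout and the function names are assumptions made for illustration.

```python
def build_trie(rules):
    """Build a nested-dict trie keyed on labels, rightmost label first."""
    root = {}
    for rule in rules:
        node = root
        for label in reversed(rule.split(".")):
            node = node.setdefault(label, {})
        node["$"] = True  # marks the end of a complete rule
    return root


def longest_suffix(trie, domain):
    """Return the longest rule that is a suffix of domain, or None."""
    labels = domain.lower().split(".")
    node, best = trie, None
    for depth, label in enumerate(reversed(labels), start=1):
        node = node.get(label)
        if node is None:
            break
        if "$" in node:
            best = ".".join(labels[-depth:])
    return best


trie = build_trie(["com", "jp", "co.jp", "uk", "co.uk"])
print(longest_suffix(trie, "www.example.co.jp"))  # co.jp
print(longest_suffix(trie, "example.com"))        # com
print(longest_suffix(trie, "www.flowers"))        # None
```

The lookup cost depends on the number of labels in the queried name rather than on the size of the list, which is why building the structure ahead of time helps as the list grows.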

> One thought here after a conversation with one of your co-workers ("W.K.")
> at Nanog about the delegation timing was to see if ICANN would be able to
> annotate entries to indicate those which have and have not been delegated
> with some special method, like adding the delegation state as another
> component of the comment line.  Perhaps if the word "UNDELEGATED" were on
> the comment line, it could allow for you (or others) to program logic that
> could ignore the TLD on the next line.  Once the TLD is added to the root
> and delegated, it would lose that annotation.

I have requested this of ICANN in the past. After a recent meeting, this does appear like it may become part of the data feed that ICANN/IANA provide.

However, it doesn't really solve the problem you're trying to solve if applications (Safari or otherwise) respect DELEGATED vs UNDELEGATED, for presumably the same reasons.

> AGREE though I would like to do what is possible regarding Safari, so if you
> could tolerate introducing a process to parse the status of the TLD from
> ICANN and perhaps omit conditionally, could that work?

We already parse the PSL and convert it. I don't know if it's that reasonable to change the PSL format to accommodate this, but it's a possibility.

> See my rant above.  It looks like they use a static list per product. For
> example, Safari on iOS for iPhone, Safari on iOS for iPad, Safari for
> Windows, Safari for Mountain Lion - all of them "Think Different" about if
> they should send a domain typed without http:// to search or DNS.  This
> seems to track to what is contained in PSL at or near the time of their
> release, and then remains frozen until next update.

Have you (or Steve) engaged Apple on this? Are we sure this is a problem that the PSL should be trying to solve, or is this an issue with one vendor's use?

If Chrome was not updating its copy of the PSL regularly (... which it didn't, in the past), I would treat it as a Chrome bug, not a PSL bug.

> Tobias, this is not always true...   as the Certification Authorities (CA) have
> been provided 120 days to deal with wildcard certs or certs issued outside
> of compliance with ICP-3.  Having the ability to test this in browsers may
> be important for testing that the revoking process works.

I don't understand this remark. Testing revocation is orthogonal to the PSL. No browser yet enforces a PSL-based revocation; the PSL does not even have enough data to support this (hence the request to ICANN for a data feed that does, above).

I don't think this should be seen as a use case/argument for the inclusion.
Can anyone explain what Safari is doing with the list that is problematic? Do they have a Chrome-like Omnibox and are therefore searching when they should navigate?

Take a TLD like "flowers". I would expect typing "flowers" to do a search in all browsers for ever, because ICANN has (rightly) come out against A records at the top level. However, www.flowers should do a navigation. (And it does today, in Firefox.) Is that not happening in Safari?

My understanding is that, because the PSL algorithm defaults to treating unknown TLDs as if they were in the list as flat top-level domains (i.e. like .com), it is only applications like Chrome's omnibox, which try to use the list to distinguish domain names from search terms (not an originally designed use for the PSL, although we try to support it), that have a significant problem with lots of domains being missing. Clearly, if a domain has sub-structure like .wed, that is a problem for everyone, but we've not seen many (or any?) new TLDs coming to us to report a sub-structure.
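
To make the default-rule point concrete, here is a small sketch of suffix matching with exact rules only; wildcard and exception rules are left out, and the rule set is a made-up subset rather than the real list.

```python
def public_suffix(domain, rules):
    """Return the public suffix of domain using exact-match rules only.

    If no rule matches, the default rule "*" applies, so the rightmost
    label is treated as the public suffix.
    """
    labels = domain.lower().rstrip(".").split(".")
    # Try the longest candidate suffix first.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in rules:
            return candidate
    return labels[-1]  # default rule "*"


def registrable_domain(domain, rules):
    """Return the public suffix plus one label, or None if there is none."""
    labels = domain.lower().rstrip(".").split(".")
    n = len(public_suffix(domain, rules).split("."))
    if len(labels) <= n:
        return None
    return ".".join(labels[-(n + 1):])


RULES = {"com", "uk", "co.uk"}  # "flowers" is intentionally absent

print(registrable_domain("www.example.co.uk", RULES))  # example.co.uk
print(registrable_domain("www.shop.flowers", RULES))   # shop.flowers
print(registrable_domain("flowers", RULES))            # None
```

Adding "flowers" to `RULES` would not change any of these results, which is why a missing flat TLD only matters to consumers that use list membership itself as a signal.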

Gerv
A possible compromise would be to add TLDs that have answered ICANN CIRs (Contract Information Request). This means the TLD is going to be signed, and that would allow more time for slower software development cycles to adapt. To be clear of what that means, it means the TLD has:
1) Been approved in Initial Evaluation
2) Is not in contention with other applicants
3) Is not under any objection or other accountability process
4) Has been asked by ICANN to sign a contract
5) Has answered ICANN positively by the primary point of contact that they will sign a contract.
rubensk: is there a way of finding out which TLDs are in that state?

How long does it normally take from entering that state until contract signing? I.e. how much extra lead time would that change give us?

Gerv
rubensk: ping (on comment 13)?

Gerv
Gerv, Ruben has a newborn baby, which I suspect is why we've not heard a response, but let me drive us toward a conclusion.

I believe we have two paths available: 
a] Status Quo: to stick to the state of contracting as we have been doing, or 

b] work from the full list as a dump from ICANN every fortnight or so where they send the whole list but designate in the comment line if the name is contracted (inferring it will be live shortly if it is not already).
(In reply to Gervase Markham [:gerv] from comment #13)
> rubensk: is there a way of finding out which TLDs are in that state?

Not for now. It would require ICANN to have a new level of transparency of the progress of the TLD in their pipeline.

> How long does it normally take from entering that state until contract
> signing? I.e. how much extra lead time would that change give us?


About 6 weeks, so considering the 5-month PSL-to-release cycle of Apple Desktop Software, or 6-month PSL-to-release cycle of Apple Mobile Software, it would only partially cover the time where users can get names on a TLD but cannot use them. 

Note that, perhaps within 2 weeks, new delegations will require controlled interruption (a wildcard record returning the IP address 127.0.53.53) for 90 days after delegation. The possible use of a TLD, which today starts 120 days after contract signing, will therefore be postponed by about a month, which would make the above suggestion almost totally effective.

That said, I don't know if ICANN is willing to provide the transparency for that to happen...
ICANN (Steve Sheng) has just made me aware of this list which is updated daily and stored in a publicly accessible spot on the ICANN server for downloading.

This is the text file csv (it contains UTF-8 Characters).
Sample PSL Formatted output from the python script
Attached file analyse-tlds.py
Python Script from Steve Sheng and Haya Shulman at ICANN to parse the csv text file into PSL output.
Hi-

This summarizes the last three comments from attachments I just uploaded to this bug.  Steve Sheng at ICANN (who is cc'd on this bug) has furnished us with a CSV download of the TLDs which have contracted with ICANN, along with the date of contracting and the other information we use in our spreadsheet in Google Docs to track the new TLDs as they launch.  ADDITIONALLY, they have included a field which is the date of root zone addition, which, if blank, means that the TLD is between contracting and being made live in the root.

According to Steve, this is updated twice daily.
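
As a rough illustration of the conversion (this is not the attached analyse-tlds.py), the sketch below turns such a CSV into alphabetically sorted PSL-style entries and flags rows whose root zone date is blank. The column names, the UNDELEGATED marker, and the sample rows are all assumptions; the real file's headers may differ.

```python
import csv
import io


def csv_to_psl(csv_text):
    """Turn CSV rows into PSL-style "// comment" plus rule line pairs."""
    rows = sorted(csv.DictReader(io.StringIO(csv_text)),
                  key=lambda r: r["tld"])
    out = []
    for row in rows:
        comment = "// {} : {} {}".format(
            row["tld"], row["contract_date"], row["registry"])
        if not row["delegation_date"].strip():
            # Blank root zone date: contracted but not yet live in the root.
            comment += " UNDELEGATED"
        out.append(comment)
        out.append(row["tld"])
    return "\n".join(out)


# Made-up placeholder rows; hypothetical column names.
SAMPLE_CSV = """\
tld,registry,contract_date,delegation_date
wed,Example Registry B,2014-01-02,2014-02-03
flowers,Example Registry A,2014-01-01,
"""

print(csv_to_psl(SAMPLE_CSV))
```

Sorting by TLD mirrors the alphabetical output described above; a chronological section could be produced by sorting on the contract date instead.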

The output is sorted alphabetically, which is different than the chronological sort we've been using, but from what I understand of our current update process (at least where Simone is making the updates) he swaps out the whole new gTLD section, so this might represent an enhancement.

@Simone could you attempt a d/l of the csv from the ICANN url in comment 18, and then run the analyse-tlds.py (attached to comment 19) against it (you'll need to update the working dir in the script)?  

If this can be used to automate or expedite updates, it would be a great step forward for us on maintaining pace with ICANN's contracting rhythm.
As a follow-up, I have updated the google docs spreadsheet that we have been using to track the new gTLDs so that it can be triggered to pull the ICANN CSV and then it assembles the PSL entries.
BTW, why can't we solve the .wed problem with:

*.wed

?
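
A small sketch of what that wildcard rule would do under the PSL matching rules, with exception rules omitted. The function is illustrative, not any browser's implementation, and `example.wed` is a placeholder name.

```python
def public_suffix(domain, rules):
    """Return the public suffix, supporting exact and wildcard rules.

    Exception rules (!foo) are omitted from this sketch. With no match,
    the default rule "*" makes the rightmost label the suffix.
    """
    labels = domain.lower().split(".")
    # Longest candidate first, so the rule with the most labels wins.
    for i in range(len(labels)):
        candidate = labels[i:]
        exact = ".".join(candidate)
        # A wildcard rule matches any single label in the leftmost position.
        wildcard = ".".join(["*"] + candidate[1:])
        if exact in rules or (len(candidate) > 1 and wildcard in rules):
            return exact
    return labels[-1]


print(public_suffix("shop.example.wed", {"wed"}))           # wed
print(public_suffix("shop.example.wed", {"wed", "*.wed"}))  # example.wed
```

The wildcard turns every second-level label under .wed into a public suffix, not only the names on the registry's list, so whether it fits depends on whether .wed also has ordinary second-level registrations, which this thread does not settle.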

Gerv
Whiteboard: [necko-would-take]
Bulk change to priority: https://bugzilla.mozilla.org/show_bug.cgi?id=1399258
Priority: -- → P5
We currently have a process (see e.g. https://github.com/publicsuffix/list/issues/643) to incorporate new gTLD changes.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Assignee: nobody → weppos
Product: Core → Core Graveyard