Closed Bug 1063724 Opened 10 years ago Closed 10 years ago

Recommending introduction of a header line in effective_tld_names.dat with date/time of generation and/or version number

Categories

(Core Graveyard :: Networking: Domain Lists, defect)

x86
Windows 8.1
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: jothan, Unassigned)

Details

User Story

Add a header line (line 4, after the MPL notice) to the "Public Suffix List" / effective_tld_names.dat that contains the date/time of last generation.
The PSL is now used far beyond its original scope, and many of those uses embed a static copy of it in derivative works.

Historically, the TLD system changed at a glacial pace, and most changes to the PSL involved refining or adding detail within ccTLDs or the private namespace. With the introduction of new gTLDs by ICANN, TLDs are being added to the root weekly (up to 1000/year) until the current pool of 2012 applicants has all been added.

For software libraries, developers, and other derivative uses that incorporate static snapshots of effective_tld_names.dat, there is no clear indicator of age or version that would tell them (or developers of derivative works, etc.) when to update their copy.

This small change, introducing another header line after the MPL notice, could help reduce confusion, make navigation more reliable, and ultimately improve the user experience.

I am proposing we add this, but want to find out if there is any reason this could introduce problems or if there is resistance.
User Story: (updated)
Summary: Recommending a header line with date of generation and → Recommending introduction of a header line in effective_tld_names.dat with date/time of generation and/or version number
I see the advantages of this; the only difficulty is how we can make it happen in a consistent (and therefore automated) fashion. Manually updating it each time would be a real pain, as it would need to be in patches, and patches don't get checked in in the order they get made.

When we have our own repo (bug 991749) then we might be able to add a commit hook to do it.

Gerv
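A commit hook along the lines mentioned above could, as a rough sketch, rewrite a single datestamp line each time the file is committed, so that nobody has to touch it in patches. The marker text `// GENERATED: `, the function name `stamp_header`, and the assumption that the license notice occupies exactly the first three lines are all hypothetical and not part of any existing PSL tooling.

```python
from datetime import datetime, timezone

# Hypothetical marker; the real file defines no such line.
STAMP_PREFIX = "// GENERATED: "
# Zero-based index of "line 4", assuming a three-line license notice.
STAMP_LINE_INDEX = 3


def stamp_header(text: str, when: datetime) -> str:
    """Return text with a UTC datestamp on line 4.

    `when` must be a timezone-aware datetime. An existing stamp line is
    replaced in place; otherwise a new line is inserted.
    """
    lines = text.split("\n")
    stamp = STAMP_PREFIX + when.astimezone(timezone.utc).strftime(
        "%Y-%m-%dT%H:%M:%SZ"
    )
    has_stamp = (
        len(lines) > STAMP_LINE_INDEX
        and lines[STAMP_LINE_INDEX].startswith(STAMP_PREFIX)
    )
    if has_stamp:
        lines[STAMP_LINE_INDEX] = stamp
    else:
        lines.insert(STAMP_LINE_INDEX, stamp)
    return "\n".join(lines)
```

Because the hook would generate the stamp at commit time, the line never appears in a submitted patch, which sidesteps the out-of-order check-in problem. Whether a given repository setup can run such a hook is a separate question.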
For Chrome, we've just always referred to the commit-hash. This is complicated by the publicsuffix.org redirecting to the raw output, which means I end up digging through mxr/hgweb to get the right hash, but at least it's fully unambiguous.
But I don't think it's possible for a file to contain its own commit hash for its latest version...

The issue Jothan is raising is not "how can someone unambiguously refer to a particular version of the PSL?", it is "given a copy of the PSL, how do you tell which version/how old it is?". He is suggesting putting a datestamp in the file so this is clear to all readers.

It's worth observing, though, that many PSL consumers, including both Google and Mozilla, "compile" the PSL to a more compact representation before using it. If e.g. Apple does the same, this change would not make it any easier to examine their shipped binaries and tell which PSL version they used.

Gerv
Jothan: do you have comments on any of the above?

Gerv
Flags: needinfo?(jothan)
@Gerv thanks for the nudge - In thinking this through, I reckon the larger developers/integrators of the PSL do compress or compile the list before incorporating it.

The suggestion here was aimed at the large pool of developers, integrators, and other consumers of the PSL who download and incorporate a snapshot. The objective of the datestamp would be to give them a means to understand how recent their copy is.

As Ryan mentioned, the hash can be a means to check whether a copy is current, but the challenge, I think, is knowing HOW current.

Historically the pace of change in gTLD land was fairly glacial, to the point where someone could theoretically have had a fairly comprehensive solution by updating their snapshot of the list annually or quarterly. With the current expansion, ICANN and IANA are adding names to the root zone file each week. We are doing updates more frequently than in the past: not weekly, but semimonthly to monthly where we can, to keep pace.

Here is an example of the problem I was hoping we could solve with this:
Many of the libraries out there which rely on the PSL include a copy taken directly from publicsuffix.org at the time of install, or at the time the developer / author incorporated it. With no time stamp in place, there's no clear means for developers who use those libraries to know that the bundled PSL might be an older version, or how old it is. As a result, the latest version often does not get incorporated.

The basis of the request was to give integrators/developers more clarity on the age of their PSL snapshot, in the interest of having folks use the latest and greatest.
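As an illustration of how an integrator might use such a datestamp, here is a minimal sketch that reads a stamp from the top of a snapshot and reports its age in days. The marker `// GENERATED: `, its ISO-style format, and the function name `snapshot_age_days` are assumptions for the example; no such line exists in the file today.

```python
from datetime import datetime, timezone
from typing import Optional

# Hypothetical marker and format, assumed for this sketch only.
STAMP_PREFIX = "// GENERATED: "
STAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def snapshot_age_days(text: str, now: datetime) -> Optional[int]:
    """Return the snapshot's age in whole days, or None if unstamped.

    Only the first ten lines are searched, since the stamp is assumed
    to live in the file header. `now` must be timezone-aware.
    """
    for line in text.splitlines()[:10]:
        if line.startswith(STAMP_PREFIX):
            raw = line[len(STAMP_PREFIX):].strip()
            stamped = datetime.strptime(raw, STAMP_FORMAT).replace(
                tzinfo=timezone.utc
            )
            return (now - stamped).days
    return None
```

A library could call this at build or install time and warn when the result exceeds some threshold of its own choosing.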
Flags: needinfo?(jothan)
If someone can propose a fairly pain-free technical way of doing this, I might bite, but manually having to update a datestamp with each checkin sounds like a bit of a pain to me.

Hey: if we wanted to be able to test which version of the list any particular implementation was using, we could do something like add N.publicsuffix.org as a public suffix, where N was an incrementing integer. Then, to see which list version is in use, just examine the behaviour for all N until you get an anomaly.

Gerv
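The N.publicsuffix.org idea could be probed roughly as follows. This is only a sketch: `is_public_suffix` stands in for whatever lookup the implementation under test exposes, and `detect_list_version` and `max_n` are hypothetical names. It scans every N up to a limit and reports the largest marker found, so it works whether each release keeps earlier markers or replaces them.

```python
from typing import Callable, Optional


def detect_list_version(
    is_public_suffix: Callable[[str], bool], max_n: int = 1000
) -> Optional[int]:
    """Return the largest N for which N.publicsuffix.org is a suffix.

    Returns None if no marker in 1..max_n is treated as a public
    suffix, i.e. the list predates the marker scheme.
    """
    found = None
    for n in range(1, max_n + 1):
        if is_public_suffix(f"{n}.publicsuffix.org"):
            found = n
    return found
```

For example, with a toy rule set `{"com", "1.publicsuffix.org", "2.publicsuffix.org"}` and `rules.__contains__` as the predicate, the function returns 2.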
Re: Hey
I think a version number is helpful, but the problem I was hoping to solve is one where someone benefits more from understanding how old their version might be than from just knowing that it is old.

The volume of change is the driving factor here.  When the root zone and list of TLDs didn't have much change happening, this was not really as important, but while IDN ccTLDs and the new gTLDs are being approved and added to the root zone at the current pace, it is important we have date stamps so that there is some clear "human readable" indicator of age of the list someone might include in a derivative library or list.

I think that if it were always the 4th line in the file, the unidiff line would be fairly simple to maintain, but I may be trivializing something far more complicated than it seems.
There isn't an automated way of doing this that I've found. And I don't want to have to manage version numbers in the file manually (you keep getting merge conflicts if you apply patches out of order). 

If you want to find out what version of the PSL an app is using, you'll simply need to test various domains and correlate that with the dates their TLDs were added.

Gerv
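The test-and-correlate approach described above might look something like the sketch below. It assumes the tester already has a table of suffixes and the dates they were added to the PSL; the suffix names and dates in the usage note are made-up placeholders, and `estimate_snapshot_window` is a hypothetical name.

```python
from datetime import date
from typing import Callable, Iterable, Optional, Tuple


def estimate_snapshot_window(
    is_public_suffix: Callable[[str], bool],
    additions: Iterable[Tuple[str, date]],
) -> Tuple[Optional[date], Optional[date]]:
    """Bracket a snapshot's date using known suffix addition dates.

    Returns (newest_known, oldest_missing): the latest addition date
    among suffixes the app recognises, and the earliest addition date
    among suffixes it does not. The snapshot was presumably taken
    between the two.
    """
    newest_known: Optional[date] = None
    oldest_missing: Optional[date] = None
    for suffix, added in additions:
        if is_public_suffix(suffix):
            if newest_known is None or added > newest_known:
                newest_known = added
        elif oldest_missing is None or added < oldest_missing:
            oldest_missing = added
    return newest_known, oldest_missing
```

With placeholder additions `tld-a` (2014-01-01), `tld-b` (2014-03-01) and `tld-c` (2014-06-01), an app that recognises only the first two would be bracketed between 2014-03-01 and 2014-06-01. The estimate is only as fine-grained as the gaps between additions.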
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WONTFIX
Product: Core → Core Graveyard