Closed Bug 1155581 Opened 5 years ago Closed 5 years ago

Create alias URL for downloading the Public Suffix List

Categories

(Infrastructure & Operations Graveyard :: WebOps: Other, task)

x86
Linux
task
Not set

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: gerv, Assigned: nmaul)

Details

(Whiteboard: [kanban:https://webops.kanbanize.com/ctrl_board/2/1195] )

The IT team maintains hosting for publicsuffix.org, a basic static website served out of SVN.

There is a special URL for downloads of the Public Suffix List:
https://publicsuffix.org/list/effective_tld_names.dat

This is not stored in the SVN repo; I believe this is on a CDN for performance reasons. In bug 926891 fubar set up a cron job to pull it every so often and put it wherever it needs to be.

The request is: can we please have another URL:
https://publicsuffix.org/list/public_suffix_list.dat

which serves exactly the same file from the same place, without an HTTP-level redirect? The old URL needs to continue to work.

The name "Effective TLD" was retired a long time ago and remains only in the name of the file. I want to retire it some more.

Thanks,

Gerv
I maintain the publicsuffix package in Debian, and I agree that the name of the file should be changed.  I propose to change

 /usr/share/publicsuffix/effective_tld_names.dat

to:

 /usr/share/publicsuffix/public_suffix_list.dat

instead.  I'll probably keep around a symlink from the old name for compatibility with older tools for a release cycle or two.
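
The rename with a compatibility symlink could be sketched roughly as follows. This is only an illustration and not the actual Debian packaging code: the function name rename_with_compat_symlink is hypothetical, and the directory is passed as an argument rather than hard-coded to /usr/share/publicsuffix.

```shell
#!/bin/sh
# Hypothetical helper (not taken from the Debian package): rename the list
# file to its new name and leave the old name behind as a relative symlink.
rename_with_compat_symlink() {
    dir=$1
    old=effective_tld_names.dat
    new=public_suffix_list.dat

    # Move the file only if the old name is still a regular file,
    # i.e. it has not already been replaced by the symlink.
    if [ -f "$dir/$old" ] && [ ! -L "$dir/$old" ]; then
        mv "$dir/$old" "$dir/$new" || return 1
    fi

    # The link target is relative, so it stays valid if the directory moves.
    ln -sf "$new" "$dir/$old"
}
```

Calling rename_with_compat_symlink /usr/share/publicsuffix would leave both names readable. Dropping the compatibility name later only requires removing the symlink.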

If there are any concerns with that approach, please let me know.
I would wait until this change has been made, but otherwise, no, no problems with that.

Gerv
I don't see a reason to delay the file rename in the package (given the legacy symlink), unless you think your suggestion here is going to be rejected for some reason.  But the request has been open for about a month and no one has objected, so it seems unlikely to me that this sensible request is going to be rejected.

Do you plan to change the name of the file in hg as well (i.e. from netwerk/dns/effective_tld_names.dat to netwerk/dns/public_suffix_list.dat)?

(Also: feel free to cc me in the future on any PSL-related issues that arise here that might have relevance to Debian packaging.)
I wasn't planning to change it in hg, as that might cause unforeseen breakage. We could try changing it and see :-)

Gerv
Assignee: nobody → server-ops-webops
Component: MOC: Service Requests → WebOps: Other
QA Contact: lypulong → smani
Whiteboard: [kanban:https://webops.kanbanize.com/ctrl_board/2/1195]
So there was no response for a month because this was in the wrong queue ;) 

Now that it's in the right queue -

Option 1 :

If you want to keep the old URL and file as they are, with no change on your side, we can add an Alias to the Apache config to serve the old file when the new name is requested

aka 

Alias /list/public_suffix_list.dat /path/to/effective_tld_names.dat

Option 2 :

You go ahead and change the filename and we can Alias the old URL to the new file

aka

Alias /list/effective_tld_names.dat /path/to/public_suffix_list.dat

You guys let me know what works and we'll push this change. 

Note that if there are special CDN configs (and I haven't looked) changing the file name might have more consequences and I'll have to do some more digging before making the change.
(In reply to Shyam Mani [:fox2mike] from comment #5)
> So there was no response for a month because this was in the wrong queue ;) 

Can you explain how I was supposed to know what the right queue was?

Also, why is it that people triaging a queue and finding a bug in the wrong place don't at least add a comment saying "this is in the wrong place", or better still, move it?

> If you want to keep the old URL as is, no change we can add an Alias to the
> apache config to serve the old filename when the new one is requested

I don't have the power to change the filename of the file on the webserver disk; AIUI the file is pulled from Mercurial by a script, it's not part of the uploaded website. So the script would need to be changed.

Whatever we do, both URLs would need to keep working.

We can change the name of the file in Mercurial, but the changes here would need to be coordinated with that, so the script didn't break.

> Note that if there are special CDN configs (and I haven't looked) changing
> the file name might have more consequences and I'll have to do some more
> digging before making the change.

I believe that this file is indeed served from a CDN, because it has high traffic (for unknown reasons). This is why I filed a bug to look into what the consequences would be of changing the name.

Gerv
Assignee: server-ops-webops → smani
Assignee: smani → nmaul
publicsuffix.org is currently hosted locally in our PHX1 datacenter (on the 'static' cluster), so no CDN to worry about here. It will be moving to AWS in the next couple months most likely, but I don't feel that we need to wait until then to do something on this.

It was previously being served out of MXR, where it was causing problems because of the dynamic/CGI environment it was being served from. In its current location it is heavily cached (at the load balancer) and completely static, so there's very little performance impact from it. Usage is high, but nothing like AUS or Snippets, which are both also locally hosted right now.


In any case, I've set this up:

https://publicsuffix.org/list/public_suffix_list.dat
https://publicsuffix.org/list/effective_tld_names.dat


For reference, here's how it works:

wget -q -O effective_tld_names.dat http://hg.mozilla.org/mozilla-central/raw-file/default/netwerk/dns/effective_tld_names.dat

cp -f effective_tld_names.dat public_suffix_list.dat


In words, we fetch that one file from Hg (as we always have) and then copy it to the new name, so that both names exist.

I did it this way instead of using an Apache Alias or a symlink for two reasons. First, it should be simpler to flip the update script over to the new name if/when that change happens in mozilla-central (please let us know). Second, it should remain possible to host this easily out of AWS S3 when we migrate it in a few weeks, rather than needing an EC2 instance to run Apache. S3 can redirect, but can't handle Apache directives or symlinks, so this keeps things slightly simpler for the future.
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Please use https://hg.mozilla.org/ instead of http://hg.mozilla.org/ -- there is no reason to do a cleartext fetch here, right?
Very good point, I hadn't even noticed (that part of the script has been there since this was originally set up a long time ago). Changed to https, still seems to work.
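
With the https change applied, the fetch-and-copy step described above might look something like the sketch below. This is not the actual production script. The function names publish_psl and update_psl, the empty-file check, and the write-to-temp-then-rename step are all assumptions added for illustration; only the wget options, the Hg URL, and the two file names come from this bug.

```shell
#!/bin/sh
# Source URL from this bug, switched to https.
PSL_URL="https://hg.mozilla.org/mozilla-central/raw-file/default/netwerk/dns/effective_tld_names.dat"

# publish_psl SRC DEST_DIR (hypothetical helper)
# Install one downloaded file under both the old and the new name.
publish_psl() {
    src=$1
    dest=$2

    # Assumption: an empty download should never replace a good copy.
    [ -s "$src" ] || return 1

    for name in effective_tld_names.dat public_suffix_list.dat; do
        # Copy to a temporary name first, then rename, so readers never
        # see a partially written file.
        cp -f "$src" "$dest/$name.tmp" || return 1
        mv -f "$dest/$name.tmp" "$dest/$name" || return 1
    done
}

# update_psl DEST_DIR (hypothetical helper)
# Fetch the list from Hg, then publish it under both names.
update_psl() {
    tmp=$(mktemp) || return 1
    if wget -q -O "$tmp" "$PSL_URL"; then
        publish_psl "$tmp" "$1"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}
```

A cron job could then call update_psl with the web root's list directory as its argument. Because both names are plain files with identical content, the same layout would also work on a static host that has no Alias or symlink support.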
jakem: that's all great, thanks.

Gerv
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard