Closed Bug 1083971 Opened 6 years ago Closed 11 months ago

Auto-update the public suffix list out-of-Firefox-release-band

Categories

(Core :: Networking: DNS, defect, P5)


Tracking


RESOLVED FIXED
mozilla69
Tracking Status
firefox69 --- fixed

People

(Reporter: Gijs, Assigned: arpit73)

References

(Blocks 1 open bug)

Details

(Whiteboard: [necko-would-take])

Attachments

(1 file, 2 obsolete files)

The PSL being out of date has bitten us a few times now, particularly with respect to ESR releases, because it means we can't accurately determine which public domain suffixes are in use. We're also planning to use it to decide whether URL bar input is likely to be a real domain rather than a search, which makes keeping it up to date still more important. We should fetch updates to it every now and again (roughly every week sounds sensible to me) instead of having it be fixed per-Firefox-release.
Weekly sounds fine. One issue is that the PSL is "compiled" before being shipped so that it's more compact and quicker to access. This is done by a build script: netwerk/dns/prepare_tlds.py. You'd need to decide whether to build that compilation into Firefox, or ship the compiled version (which I believe is C++, so that would be complex), or invent a new intermediate format.

Gerv
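To make the compiled-list discussion above concrete, here is a rough Python sketch of the kind of lookup the eTLD service performs: a naive longest-match over the plain-text list. This is purely illustrative — it ignores exception ("!") rules, the function names are made up, and the shipped code compiles the list into a more compact form via prepare_tlds.py rather than matching strings like this.

```python
# Naive PSL lookup sketch (illustrative only; not Mozilla's implementation).

def parse_psl(text):
    """Return the set of suffix rules, skipping blank lines and comments."""
    rules = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("//"):
            continue
        rules.add(line)
    return rules

def public_suffix(domain, rules):
    """Longest matching rule wins; a '*.' rule matches one extra label."""
    labels = domain.lower().split(".")
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        wildcard = "*." + ".".join(labels[i + 1:])
        if candidate in rules or wildcard in rules:
            return candidate
    return labels[-1]  # default rule: the rightmost label

rules = parse_psl("// comment\ncom\nco.uk\n")
print(public_suffix("example.co.uk", rules))  # co.uk
```

The real data structure needs to answer exactly this question, just much faster and with far less memory — hence the compilation step.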
Gavin, this isn't exactly a frontend bug... do you think we have time to pick this up in one of the 36 iterations so we can pick up bug 1080682 in one of the 37 ones? (or, alternatively, can we convince one of the core folks in this area to pick this up?)
Flags: needinfo?(gavin.sharp)
This sounds like something that needs to be broken down - is that what you're suggesting?
Flags: needinfo?(gavin.sharp)
(In reply to :Gavin Sharp [email: gavin@gavinsharp.com] from comment #3)
> This sounds like something that needs to be broken down - is that what
> you're suggesting?

I meant that the other bug being blocked by this (bug 1080682) probably should be fixed after this one.

As for breaking it up, I'm not sure if it makes sense for front-end to pick up any of the pieces, but I suppose breaking it up makes sense, at least to split the list data structure off in such a way that it can be updated and isn't compiled into the binary, and then more steps to actually update it - or to support reading updates on top of the compiled-in list or something. Still not sure if front-end is the best to decide the architecture here and/or do the breakdown, though...

Gavin/Patrick, thoughts on this?
Flags: needinfo?(mcmanus)
Flags: needinfo?(gavin.sharp)
All I'm saying is that it seems like a large enough undertaking that we need to break down the multiple steps required. I agree that someone closer to the core bits of this would be helpful in that process.
Flags: needinfo?(gavin.sharp)
I think Jason was the last person to touch this code, so I'll pass the buck for his opinion.

this is starting to sound like the phishing list, and the tracker list.. maybe we can reuse that infrastructure? (cc monica)
Flags: needinfo?(jduell.mcbugs)
Flags: needinfo?(mcmanus) → needinfo?(mmc)
Hey folks,

The phishing list and tracker list use the safebrowsing infrastructure, which fetches updates every 45 minutes or so from a pre-determined server. That sounds like it may be overkill for PSL, unless it thrashes a lot. For other lists related to security stuff, we use buildbot which runs weekly and updates Nightly and Aurora, but not Beta. That would basically mean that the PSL list needs to be live for 14 weeks when it hits Beta. The buildbot work for HPKP and HSTS was done in

https://bugzilla.mozilla.org/show_bug.cgi?id=836097
https://bugzilla.mozilla.org/show_bug.cgi?id=1004279
Flags: needinfo?(mmc)
(In reply to [:mmc] Monica Chew (please use needinfo) from comment #7)
> Hey folks,
> 
> The phishing list and tracker list use the safebrowsing infrastructure,
> which fetches updates every 45 minutes or so from a pre-determined server.
> That sounds like it may be overkill for PSL, unless it thrashes a lot. For
> other lists related to security stuff, we use buildbot which runs weekly and
> updates Nightly and Aurora, but not Beta. That would basically mean that the
> PSL list needs to be live for 14 weeks when it hits Beta. The buildbot work
> for HPKP and HSTS was done in
> 
> https://bugzilla.mozilla.org/show_bug.cgi?id=836097
> https://bugzilla.mozilla.org/show_bug.cgi?id=1004279

every 45 minutes /would/ be overkill, but we're looking for something that's updatable out-of-release-bands, so the buildbot work isn't a great fit either, unfortunately.
(In reply to Patrick McManus [:mcmanus] from comment #6)
> this is starting to sound like the phishing list, and the tracker list..
> maybe we can reuse that infrastructure? (cc monica)

I keep saying we need a generic local-info-updating infrastructure!

As Gijs says, this needs to happen at non-release times, and needs to keep happening even when we stop making updates to a particular release. Every 45 minutes would be OK, if that's all we have, as long as the server can return Not Modified, and can cope with that many hits.

The PSL tends to change approximately monthly, but it can be more frequent. However, a month's delay from checkin to full deployment would still be much, much better than what we have now.

Gerv
wget recently got PSL support built-in (so they now have the Firefox problem with possibly stale PSL information for a long time) and we're looking at doing the same for curl/libcurl soonish.

If this turns out to be a decent generic binary PSL download solution, there might be more projects out there that could benefit... By that I don't necessarily mean sharing the hosting infrastructure for downloads, but rather a decent binary format and associated code for doing PSL operations. I'm sure Tim over at libpsl will be interested: https://github.com/rockdaboot/libpsl
Flags: needinfo?(jduell.mcbugs)
Whiteboard: [necko-would-take]
Bulk change to priority: https://bugzilla.mozilla.org/show_bug.cgi?id=1399258
Priority: -- → P5
Blocks: 1416247
Hi,

Honestly, right now, my number one frustration with Firefox is bug 1080682 which is blocked by this.  I'm very interested in getting involved with open source development.  What can I do to help get this started?  It looks like (at a high level) the following steps would need to happen:

1. Download PSL from https://publicsuffix.org/list/public_suffix_list.dat, leveraging caching
2. Store it somewhere semi-permanent (not sure where, maybe same place as cache?)
3. Use https://github.com/rockdaboot/libpsl or similar library to decode the list
4. Redownload weekly

These changes should either be made in a way that clients of the existing built-in PSL can use it, or if more appropriate, clients of the existing PSL should be modified to use this new implementation.

Any thoughts, suggestions?  I'm willing to put in the work on this because bug 1080682 makes me want to punch Firefox in the face on a regular basis.
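The steps above (download with caching, redownload weekly) can be sketched quite simply. This is a hedged illustration under the assumption that the server honours ETag/Last-Modified for conditional requests; the helper names are made up and this is not what eventually landed.

```python
# Sketch of "redownload weekly, leveraging caching": decide when a refresh
# is due, and build conditional headers so an unchanged list costs a 304.
import email.utils
import time

WEEK_SECONDS = 7 * 24 * 3600

def due_for_refresh(last_fetch_epoch, now_epoch, interval=WEEK_SECONDS):
    """True if at least `interval` seconds have passed since the last fetch."""
    return (now_epoch - last_fetch_epoch) >= interval

def conditional_headers(etag=None, last_modified_epoch=None):
    """Headers for a conditional GET against the PSL server."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified_epoch is not None:
        headers["If-Modified-Since"] = email.utils.formatdate(
            last_modified_epoch, usegmt=True)
    return headers

now = time.time()
print(due_for_refresh(now - 8 * 24 * 3600, now))  # True: over a week old
print(conditional_headers(etag='"abc123"'))
```

As the later comments discuss, the thread moved away from direct downloads of the .dat file toward Remote Settings, which handles scheduling and caching itself.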
(In reply to Gervase Markham [:gerv] (not reading bugmail) from comment #9)
> (In reply to Patrick McManus [:mcmanus] from comment #6)
> > this is starting to sound like the phishing list, and the tracker list..
> > maybe we can reuse that infrastructure? (cc monica)
> 
> I keep saying we need a generic local-info-updating infrastructure!

We've got one of these now, it's called "remote settings" (formerly/also known as kinto). Mathieu, what would it take to use that for the PSL updates? Especially considering the PSL code right now is largely written in C++, not JS... I guess we could do a similar thing to what the gfx blocklist does right now and have JS code to handle the updates and have that broadcast to C++ with any changes or something.

(In reply to Connor from comment #12)
> Hi,
> 
> Honestly, right now, my number one frustration with Firefox is bug 1080682
> which is blocked by this.  I'm very interested in getting involved with open
> source development.  What can I do to help get this started?  It looks like
> (at a high level) the following steps would need to happen:
> 
> 1. Download PSL from https://publicsuffix.org/list/public_suffix_list.dat,
> leveraging caching

I think we'd probably want to use a more structured format than a text file, ideally with diffing and/or push notification support so we don't have to redownload the entire list, as well as strong security/integrity guarantees (stronger than "just" downloading it over https). I believe kinto offers all of this. We already use it for the onecrl list of distrusted intermediate certificates (ie certs that sit somewhere between root certificates and site/endpoint certificates).

> 2. Store it somewhere semi-permanent (not sure where, maybe same place as
> cache?)

The data would be stored in the user's Firefox profile, but this is something the existing remotesettings JS client takes care of already - you wouldn't need to be concerned with the exact specifics of this.

> Any thoughts, suggestions?  I'm willing to put in the work on this because
> bug 1080682 makes me want to punch Firefox in the face on a regular basis.

This is really encouraging, thanks for offering to help with this bug.
Flags: needinfo?(mathieu)
This would indeed be a perfect use-case for RemoteSettings.

Depending how often it gets updated and how big it is, there are several strategies.
For example, the two obvious ones would be:
- have one record per suffix (optimal sync but tedious to edit)
- have one record with the list as attachment (redownloaded completely on each server side update)

Then, regarding its load by the Firefox code, we also have several approaches available.
For example, in the "sync" event callback we could either:
- write a dump on disk and send a signal for the C++ to reload it [0]
- serialize the list and send it to C++ as string in the message payload [1]
- define a C++ interface and call it from JS [2]

[0] https://searchfox.org/mozilla-central/rev/ef51c56995c72e21683b1db390f920fedd93a91c/services/common/blocklist-clients.js#309-328
[1] https://searchfox.org/mozilla-central/rev/d2966246905102b36ef5221b0e3cbccf7ea15a86/toolkit/mozapps/extensions/Blocklist.jsm#1206-1232 
[2] https://searchfox.org/mozilla-central/rev/d2966246905102b36ef5221b0e3cbccf7ea15a86/services/common/blocklist-clients.js#69-104
Flags: needinfo?(mathieu)
I don't know what the work administration-wise would be for updating this, so I'm not sure whether one record per suffix vs one record for the whole list is better.  For context, the list right now has 12661 lines and 7849 suffixes (the other lines are blank or comments). The copy in Firefox's source [0] hasn't really ever been systematically updated, but the "official source" on github [1] is updated ranging from multiple times a week to every other week.  I propose that the optimal solution would use one record per suffix, and that a script could be used to add and remove records based on changes to the file.  Should I move forward with implementing it that way (one record per suffix)?

[0] https://hg.mozilla.org/mozilla-central/filelog/28ad9a9e95d518e1163e550ae19c972aabb44df5/netwerk/dns/effective_tld_names.dat
[1] https://github.com/publicsuffix/list/commits/master/public_suffix_list.dat
Flags: needinfo?(mathieu)
(In reply to Connor from comment #15)
> I don't know what the work administration-wise would be for updating this,
> so I'm not sure whether one record per suffix vs one record for the whole
> list is better.  For context, the list right now has 12661 lines and 7849
> suffixes (the other lines are blank or comments). The copy in Firefox's
> source [0] hasn't really ever been systematically updated, but the "official
> source" on github [1] is updated ranging from multiples times a week to
> every other week.  I propose that the optimal solution would use one record
> per suffix, and that a script could be used to add and remove records based
> on changes to the file.  Should I move forward with implementing it that way
> (one record per suffix)?

Is it possible for the list to be preprocessed on the server at all, and then Firefox could just download the new blob of data structure, rather than doing a bunch of munging client side?  IIUC, this would require one record for the whole list, so we might be downloading many largish updates (sorry, not terribly familiar with the proposed update mechanism, apologies for sounding ignorant), but I think that's OK, if we can get the list in some sort of thing that's easily indexable.
On closer examination, the current implementation requires that it be a DAFSA [0], so unless there's a compelling reason to change away from that, I will implement it with that expectation.  The DAFSA should probably be generated server side, which means it will be one record for the entire serialized data structure.  As it stands now, the raw serialized DAFSA is around 35k.  I think this is a perfectly reasonable size for updates, especially knowing that it will only download when changed.

[0] https://en.wikipedia.org/wiki/Deterministic_acyclic_finite_state_automaton
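To illustrate why a DAFSA-style structure stays small (and why ~35k for ~7849 suffixes is plausible), here is a toy comparison of flat storage versus a shared-prefix trie. This is not Mozilla's make_dafsa — a real DAFSA additionally merges common endings (like the repeated ".uk" tails), compressing further than the trie shown here.

```python
# Toy illustration: shared prefixes collapse into shared nodes, so the
# structure grows much slower than the raw character count.

def trie_node_count(words):
    """Build a character trie and count its nodes (including the root)."""
    root = {}
    for w in words:
        node = root
        for ch in w:
            node = node.setdefault(ch, {})

    def count(node):
        return 1 + sum(count(child) for child in node.values())

    return count(root)

suffixes = ["com", "co.uk", "org.uk", "ac.uk", "co.jp"]
flat_chars = sum(len(s) for s in suffixes)
print(flat_chars, trie_node_count(suffixes))  # flat chars vs trie nodes
```

On the full list, the savings from shared prefixes and (in a true DAFSA) shared suffixes are what keep the serialized form down to tens of kilobytes.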
That sounds ideal, thank you!
From its description:

> A "public suffix" is one under which Internet users can (or historically could) directly register names.

It looks like it has fully degenerated into a list of subdomain providers.
https://github.com/publicsuffix/list/commit/65ddeb3eca4cfd9f436f7b2fed49df57624d40f7#diff-7a8a497c39dadd4b04d30f5e8e679bf8

Good luck with that.
FYI: at least Fedora is already shipping a package called "publicsuffix-list" which is exactly the PSL as a DAFSA file. So it seems to be a general consensus to be "the way" to do it. curl will use that file to load PSL dynamically and allow it to be independently updated.
So, a possible approach would be:

1. have a script that builds the DAFSA file from the latest data 
2. publish the file on Remote Settings server using the REST API
3. have someone in charge of signing off the change
4. let the client download and ingest the file using the Remote Settings client API


For 1, I can't help much ;)

For 2, we can use the DEV server to build the prototype. Publishing the record with the attached file would just consist of running something like this Gist [0]

Once the prototype is done, we'll have to set up STAGE/PROD with a new collection, signoff, VPN access, etc. (the full procedure is on Mana [1])

Step 3. only makes sense once using STAGE/PROD.

For step 4. the official API documentation is here: https://searchfox.org/mozilla-central/source/services/common/docs/RemoteSettings.rst
In order to use the DEV server, a tutorial is being published [2].

Ping me or send me an email if you need more info ;)

[0] https://gist.github.com/leplatrem/b67c3465321d61aa05e3f07f8f3ca05a
[1] https://mana.mozilla.org/wiki/pages/viewpage.action?pageId=66655528
[2] https://github.com/mozilla/remote-settings/pull/66
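For step 2 above, a minimal sketch of the publish call might look like the following. The endpoint shape follows the Kinto attachments plugin that Remote Settings uses; the bucket/collection names are made up for illustration, and the linked Gist [0] is the authoritative example.

```python
# Sketch of publishing the DAFSA file as a record attachment on a
# Kinto/Remote Settings server (illustrative names, not real config).
import uuid

def attachment_request(server, bucket, collection, record_id=None):
    """Return the (method, url) pair for uploading a record attachment."""
    record_id = record_id or str(uuid.uuid4())
    url = (f"{server}/buckets/{bucket}/collections/{collection}"
           f"/records/{record_id}/attachment")
    return "POST", url

method, url = attachment_request(
    "https://kinto.dev.mozaws.net/v1", "main-workspace", "public-suffix-list")
print(method, url)

# An actual upload would then send the file as multipart form data,
# e.g. requests.post(url, files={"attachment": open("psl.bin", "rb")}, auth=...)
```

With the "one record with the list as attachment" strategy, each server-side update replaces this single attachment, and clients redownload the whole ~35k blob.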
Flags: needinfo?(mathieu)
The tutorials for Remote Settings were published:
https://remote-settings.readthedocs.io

And I added one about attachments:
https://remote-settings.readthedocs.io/en/latest/tutorial-attachments.html 

Have fun :)
Connor,

Did you make some progress? Can I help you in some way?

I updated the docs with new tutorials, screencasts etc. https://remote-settings.readthedocs.io
Flags: needinfo?(connor.hewitt)
Mathieu,

Yep, I've been making progress.  Since this is my first contribution, I've been taking some time to familiarize myself with the Firefox source.  I will check out the new tutorials, and I'll let you know if I have any questions!

Thanks!
Flags: needinfo?(connor.hewitt)
Hi Connor,

Are you still interested to work on this? Can I help in some way?

Let us know ;)
Hi Mathieu,

Unfortunately school has started again, and while I keep hoping I'll find time to work on this, I don't think I will for the foreseeable future.  I haven't made any significant progress, so if someone else wants to work on it, they should!

Thanks!

(In reply to Mathieu Leplatre [:leplatrem] from comment #22)
> The tutorials for Remote Settings were published:
> https://remote-settings.readthedocs.io
> 
> And I added one about attachments:
> https://remote-settings.readthedocs.io/en/latest/tutorial-attachments.html
> 
> Have fun :)

Hello Mathieu, there hasn't been any development on this issue in a while, and I see it is mentioned in the GSoC 2019 brainstorming document.
Will you be mentoring that project? I'll go through the resources you've linked to and get more familiar with this code. Do you have any further information about how this project will proceed in GSoC?

Hi, Mathieu, I would also like to contribute here.
(In reply to Arpit Bharti from comment #27)
> (In reply to Mathieu Leplatre [:leplatrem] from comment #22)
> > The tutorials for Remote Settings were published:
> > https://remote-settings.readthedocs.io
> > 
> > And I added one about attachments:
> > https://remote-settings.readthedocs.io/en/latest/tutorial-attachments.html
> > 
> > Have fun :)
> 
> Hello Mathieu, there hasn't been any development on this issue in a while, and I see it is mentioned in the GSoC 2019 brainstorming document.
> Will you be mentoring that project? I'll go through the resources you've linked to and get more familiar with this code. Do you have any further information about how this project will proceed in GSoC?

In addition to what Arpit asked, I would also like to know of any mini tasks or starter issues relevant to this one, in order to get started here.

Flags: needinfo?(mathieu)

I can't really spend time on this right now, but once the project gets approved we'll start by writing a detailed roadmap.

In the mean time, if you've gone through all the tutorials mentioned above and want to dig more stuff, there's also some easy-pick issues in the ecosystem that powers Remote Settings: https://github.com/issues?q=is%3Aopen+is%3Aissue+archived%3Afalse+user%3AKinto+label%3Aeasy-pick

Thanks for your interest and motivation

Flags: needinfo?(mathieu)

Hello everyone, this bug will be worked on as a project under Google's Summer of Code program. I am Arpit from Delhi, India and I will be working under the mentorship of Mathieu[:leplatrem] for the next three months to submit patches for this bug.
We have come up with a strategy detailed in this blueprint document:
https://docs.google.com/document/d/1kxlAhu87MQtATxYfBdfRO-WjMHVNo1jA9Gr5mdVBnN8/edit?usp=sharing

Depends on D34331

Bug 1083971 - fixed lint warnings, added comment for to_bin()

Mathieu, do you know if the PSL is only loaded in the parent process? One of the benefits of having the list baked directly into the executable is that it can be shared b/w all processes with no overhead.

Flags: needinfo?(mathieu)

> Mathieu, do you know if the PSL is only loaded in the parent process? One of the benefits of having the list baked directly into the executable is that it can be shared b/w all processes with no overhead.

That's a good question, and I hadn't thought of this.

In our plan, the list would still be baked into the executable, for new profiles and to avoid disk i/o on startup. It's only soon after startup that we would read it from the profile folder.
Do you have other ideas? We haven't started the C++ part so far :)

Flags: needinfo?(mathieu)

(In reply to Mathieu Leplatre [:leplatrem] from comment #35)
> > Mathieu, do you know if the PSL is only loaded in the parent process? One of the benefits of having the list baked directly into the executable is that it can be shared b/w all processes with no overhead.
> 
> That's a good question, and I hadn't thought of this.
> 
> In our plan, the list would still be baked into the executable, for new profiles and to avoid disk i/o on startup. It's only soon after startup that we would read it from the profile folder.
> Do you have other ideas? We haven't started the C++ part so far :)

Not a C++ expert, but I'd suspect that you could share the additional data through shared memory from the parent to the child. Until that data is available, we could potentially fall back on the builtin list. Though in general this is all quite interesting - PSL information affects a lot of things, and having it change while Firefox is running could cause "interesting" issues (e.g. cookies sent with one request but not the other). We already have some of these issues and haven't come up with a good solution yet (bug 1365892). That said, from a performance perspective we /really/ don't want to wait for all our http requests / database opening until the updated list has loaded from disk...

One other possible solution here would be to have consumers of the etld service "opt in" to getting the up-to-date information to avoid conflicting behaviour for consumers that don't expect it to change...

Attachment #9070986 - Attachment description: Bug 1083971 - Created to_bin(), words_to_bin() functions in make_dafsa and modified prepare_tlds.py to deal with different number of arguments → Bug 1083971 - Add an option to output a binary file for the PSL data
Pushed by mleplatre@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/822cb68b6ab7
Add an option to output a binary file for the PSL data r=leplatrem,erahm
Pushed by dvarga@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/27de3a352a39
Added a new line in xpcom/ds/tools/make_dafsa.py to fix lint failure
Status: NEW → RESOLVED
Closed: 11 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla69
Assignee: nobody → arpitbharti73

Is there a new bug tracking the C++ part? The bugs that this one was blocking would depend on that; even though this bug is fixed, they aren't actionable yet. There should likely be a tracking bug for the feature that we can depend on.

Resolution: FIXED → INCOMPLETE

No, patches landed here, so "FIXED" is the right resolution. What Marco is saying is that we need a new bug for the remaining parts. Mathieu, can you help?

Flags: needinfo?(mathieu)
Resolution: INCOMPLETE → FIXED
Attachment #9070939 - Attachment is obsolete: true
Attachment #9070932 - Attachment is obsolete: true

Indeed, that's a pity, and that's my fault.
I didn't realize that the Python script patch was attached here. We should have created a dedicated bug as a blocker for this one.

Flags: needinfo?(mathieu)

(In reply to :Gijs (back Aug 12; he/him) from comment #36)
> Not a C++ expert, but I'd suspect that you could share the additional data through shared memory from the parent to the child. Until that data is available, we could potentially fall back on the builtin list. Though in general this is all quite interesting - PSL information affects a lot of things, and having it change while Firefox is running could cause "interesting" issues (e.g. cookies sent with one request but not the other). We already have some of these issues and haven't come up with a good solution yet (bug 1365892). That said, from a performance perspective we /really/ don't want to wait for all our http requests / database opening until the updated list has loaded from disk...

We should be able to handle this by just storing the new data in a binary format that we can mmap and use directly. If we mmap it early, with the correct flags, then the data should ideally be available by the time we need it. And if it's already mmapped in the parent, child processes should get it more or less for free (particularly if we just send them an open file descriptor rather than having them open the file themselves).
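The mechanism described in that comment can be sketched in a few lines — the real code would be C++, so this is only a toy Python illustration of mapping the binary blob read-only and consuming it without copying (the blob contents here are obviously fake).

```python
# Toy sketch of the mmap idea: write the binary PSL blob once, map it
# read-only, and read bytes straight out of the page cache.
import mmap
import os
import tempfile

blob = b"\x00dafsa-bytes-go-here\x00"

# In Firefox this would be the dafsa file in the profile folder.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(blob)

with open(path, "rb") as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
        # The mapping is backed by the page cache; a child process handed
        # this file descriptor would share the same physical pages.
        print(mapped[:6])
        assert bytes(mapped) == blob

os.unlink(path)
```

Passing an open file descriptor to child processes, as suggested above, means every process maps the same pages rather than each reading its own copy from disk.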

Hello everyone, I'm almost done with the dafsa reloading part of the project, which is being worked on here: https://phabricator.services.mozilla.com/D40058

I will be getting started with the tests soon (one issue left before I move on to it)
So far I have considered the following strategy:

  1. Build a fake dafsa with the suffix .xpcshelltest
  2. Send a signal with the location of said dafsa
  3. Assert that our fake suffix is now known

The third step is where I want some input: how do I check that the new dafsa is loaded and that the .xpcshelltest suffix exists?

(looks like comment #45 is being resolved in bug 1563246 and the associated phab revision https://phabricator.services.mozilla.com/D40058 ).

The revision for reloading the dafsa is here: https://phabricator.services.mozilla.com/D42470

Re-initializing mGraph causes Firefox to crash for now.
Here's the crash log from when it segfaults after sending a signal: https://pastebin.com/X41QwYiu
Here's the backtrace output in gdb: https://pastebin.com/yhX21z6H

The code compiles successfully, but beyond that I'm unable to pinpoint why the segfault occurs.
Can anyone familiar with the dafsa C++ code look into it?

/home/arpit/Development/moz/mozilla-central/xpcom/ds/Dafsa.cpp:33 is triggering the segfault, if that helps.

No longer blocks: 1597337