Open Bug 1600043 Opened 5 years ago Updated 6 months ago

Figure out a way to load the updated etld on startup

Categories

(Core :: Networking: DNS, enhancement, P3)

enhancement

Tracking

()

People

(Reporter: valentin, Unassigned)

References

Details

(Whiteboard: [necko-triaged])

Bug 1563246 and bug 1563226 added a mechanism to update the etld suffix list at runtime using the Remote Settings component.

However, in bug 1582647 we discovered this has the potential to make the base domain of principals that have already been created be incorrect, so we preffed off the update mechanism (we only use the PSL that is shipped with Firefox).

It seems like the only safe way to load the updated PSL would be at startup, before we start creating any principals. However, this would mean blocking IO on the main thread, which we normally want to avoid. Or maybe there is a better way to do this?

(In reply to Valentin Gosu [:valentin] (he/him) from comment #0)

It seems like the only safe way to load the updated PSL would be at startup, before we start creating any principals. However, this would mean blocking IO on the main thread, which we normally want to avoid. Or maybe there is a better way to do this?

I don't know of one. If we start the IO early enough we could avoid doing it on the mainthread, and just pass over ownership of the in-memory structure once netwerk needs it, reducing the blocking as much as possible. Florian, do you have better ideas? For context, we need principal objects to make pretty much any requests.

I'm also confused - discussion in bug 1582647 suggests that we'd reload the list at runtime, but it doesn't talk about what happens on startup.

Finally, how do these updates affect stored principals in e.g. session restore?

Flags: needinfo?(valentin.gosu)
Flags: needinfo?(florian)

Unless I'm missing something, this data should only be needed before we do network requests, not before opening chrome:// files, right? If so, I think this should be done off main thread, and block the first network request. Do we also need this before creating content processes, or can we share this data with them after they are started?

Flags: needinfo?(florian)

Do we want to do update in background and only apply it on the next restart?

(In reply to Dragana Damjanovic [:dragana] from comment #3)

Do we want to do update in background and only apply it on the next restart?

I think the point is that we don't load it from file right now, we load it from data that's in the binary. There's no way to directly update that data.

(In reply to Florian Quèze [:florian] from comment #2)

Unless I'm missing something, this data should only be needed before we do network requests, not before opening chrome:// files, right? If so, I think this should be done off main thread, and block the first network request.

No, principals can be created anywhere, not necessarily for network requests but also for things like the permissions database, for deserializing session restore state, etc. It'd have to block the main thread, because that's where we create and use those objects. They're used for security checks that happen sync before even starting the network request, so we can't very well make them lazy or async...

Do we also need this before creating content processes, or can we share this data with them after they are started?

I don't think it matters, we'll need this data well before we first create a content process.

Flags: needinfo?(florian)

Why can we not make the http/https/ftp codebase principals adapt? That is, if they encounter a non-matching baseDomain where they trigger the crash that we're talking about, why can't they enforce re-parsing the origin using the updated PSL? I assume we keep enough info in the principal to make this work...

(In reply to :Gijs (he/him) from comment #5)

Why can we not make the http/https/ftp codebase principals adapt? That is, if they encounter a non-matching baseDomain where they trigger the crash that we're talking about, why can't they enforce re-parsing the origin using the updated PSL? I assume we keep enough info in the principal to make this work...

That's a good idea. I think that would work. Christoph?

We also have to consider if there's a way that checks performed pre-update would cause wrong behaviour post-update. Such as third-party checks. I suspect this is more difficult than updating the principals with the info from the updated PSL.

Flags: needinfo?(valentin.gosu) → needinfo?(ckerschb)

(In reply to Valentin Gosu [:valentin] (he/him) from comment #6)

(In reply to :Gijs (he/him) from comment #5)

Why can we not make the http/https/ftp codebase principals adapt? That is, if they encounter a non-matching baseDomain where they trigger the crash that we're talking about, why can't they enforce re-parsing the origin using the updated PSL? I assume we keep enough info in the principal to make this work...

That's a good idea. I think that would work. Christoph?

Yeah, I think creating a new principal sounds fine to me in that case. Please flag me (or :jkt) for review - thanks!

Flags: needinfo?(ckerschb)

P1 bug, assigning to... Valentin?

Assignee: nobody → valentin.gosu

(In reply to :Gijs (he/him) from comment #5)

Why can we not make the http/https/ftp codebase principals adapt? That is, if they encounter a non-matching baseDomain where they trigger the crash that we're talking about, why can't they enforce re-parsing the origin using the updated PSL? I assume we keep enough info in the principal to make this work...

This makes me uncomfortable. Updating the principals would mean the subsumption checks that previously passed in some places would suddenly start failing, and vice versa. That already sort-of happens with ConsideringDomain subsumption checks, but those are kind of a special case, with all callers already expecting to have to deal with it. I'm honestly not sure what side-effects that may cause. It would also have the potential side-effect of sometimes putting two realms with incompatible principals in the same compartment... which I think is an invariant that we definitely don't want to break...

Also, we really want to make principals immutable, so they can be accessed off-main-thread, which this would complicate.

In any case, I don't think we should worry too much about having to map the PSL file at startup. It would be almost exactly equivalent to mapping a shared library, which we certainly already do enough of without worrying. And none of the data would actually be read until the pages were touched, which is the same case with the list compiled in libxul. The only difference would be where exactly the pages in question would be read in from.

And if it turns out to matter, we can also do some hackery to tell the kernel to try to pre-load those pages in the background, so that they're available by the time they're needed, which could conceivably make the whole process faster than the current compiled-in approach.

Oh, and in the case of Fission, it would actually change which processes documents are expected to be loaded in, since we group things into processes based on their eTLD+1. I don't even know what kind of chaos changing that might cause...

(In reply to Kris Maglione [:kmag] from comment #9)

(In reply to :Gijs (he/him) from comment #5)

Why can we not make the http/https/ftp codebase principals adapt? That is, if they encounter a non-matching baseDomain where they trigger the crash that we're talking about, why can't they enforce re-parsing the origin using the updated PSL? I assume we keep enough info in the principal to make this work...

This makes me uncomfortable. Updating the principals would mean the subsumption checks that previously passed in some places would suddenly start failing, and vice versa. That already sort-of happens with ConsideringDomain subsumption checks, but those are kind of a special case, with all callers already expecting to have to deal with it. I'm honestly not sure what side-effects that may cause. It would also have the potential side-effect of sometimes putting two realms with incompatible principals in the same compartment... which I think is an invariant that we definitely don't want to break...

I don't understand this, sorry. We're minting a new principal anyway. The new principal will have the same URI but a different toplevel / etld+1, and the consequences will be approximately the same as the ConsideringDomain case (ie origin changes by 1 host-part in either direction when things get added / removed vs. before/after document.domain gets applied). Why does that suddenly make things worse?

Also, we really want to make principals immutable, so they can be accessed off-main-thread, which this would complicate.

Ditto - we're minting a new principal, so it could still be immutable?

(In reply to Kris Maglione [:kmag] from comment #10)

Oh, and in the case of Fission, it would actually change which processes documents are expected to be loaded in, since we group things into processes based on their eTLD+1. I don't even know what kind of chaos changing that might cause...

But the change would affect the entire process (which would transition from origin A to A'), so why would that make a difference? :-\

As for preloading, I don't see how that would suddenly make the IO free... I'm also not sure how much we can really tell the kernel given that the files live in the profile, so we don't know what we're loading until we select a profile. I could also see us needing some principal information before the profile loads, so there's a chicken/egg problem there too...

Flags: needinfo?(kmaglione+bmo)
See Also: → 1365892

This bug is connected to bug 1365892. We need o think about this as a whole.
How often is this happening? I expect we do not update PSL that often? This is will not be a small task.

I wonder if we could build the etld list as a separate shared library. Updating the list would just mean updating the library, so we don't have a big startup delay, nor do we do it while we are running
However, there might be a few problems with this approach too:

  1. Does this need to be handled by the Firefox updater, and would that be hard to do?
  2. Is there a chance the update could fail and leave the library in a bad state, thus preventing Firefox from loading?

Otherwise we could simply uplift all changes to the PSL to release, and ship them with every dot release.

I'm not actively working on this right now.

Assignee: valentin.gosu → nobody
Severity: normal → S3
Priority: P1 → P2

Necko is unlikely to work on this anytime soon.
Please let us know if you are blocked by this in any way and we can work with you on potential solutions.

Priority: P2 → P3
Flags: needinfo?(florian)
Flags: needinfo?(kmaglione+bmo)
You need to log in before you can comment on or make changes to this bug.