Open Bug 1650089 Opened 10 months ago Updated 1 day ago

Fix process allocation for null principal

Categories

(Core :: DOM: Navigation, task, P3)

Tracking

Fission Milestone M8

People

(Reporter: neha, Assigned: nika)

References

(Blocks 2 open bugs)

Details

From our Fission meeting discussions:
We currently use a random process for null principal. We should instead allocate the process after DocumentChannel is done resolving the final principal.

Blocks: fission
Severity: -- → N/A
Type: defect → task
Fission Milestone: --- → M6b
Priority: -- → P3

I think it is usually written as "null principal".

Summary: Fix process allocation for NULL principal → Fix process allocation for null principal
Fission Milestone: M6b → M6c

Doesn't need to block Fission Nightly. Tracking for our Fission Beta experiment (M7).

Fission Milestone: M6c → M7
Depends on: 1671983
Fission Milestone: M7 → MVP

We should do this before starting the Release experiment in M8.

Assignee: nobody → nika
Fission Milestone: MVP → M8

I had some conversations with :annevk about the difficulty of deciding on a process selection strategy for null principals on Slack earlier today. Leaving a ni? for any further thoughts on the issues next week.

One issue is that we likely cannot afford to mint a new process for every single null principal. We probably need some strategy for grouping these loads into shared processes. This matches what Chrome already does: they appear to track the true site origins of null principals over time, so that when they're loaded in the future, they're loaded within the site which created them. For something like a data: URI, this would be the site which initiated the load (TriggeringPrincipal here), and for something like a sandboxed frame, it would be the unsandboxed result principal.

We could try to follow this strategy as well. However, it can become quite complicated in some situations, as doing so might require our null principal representation to track the origin which created that null principal. We don't do anything like that right now, and it seems like a strange addition to the null principal type, especially when it's only intended to help with process selection. Alternatively, we could do some form of best-effort tracking, and use a fallback process when all else fails.

The most secure option, of course, would be to use a distinct process for every null principal which we load, but that is likely untenable in MVP for memory use and performance reasons.

Flags: needinfo?(annevk)

My current idea is something along the lines of the following. It ignores the potential issues around reusing a process for both sandboxed and unsandboxed loads; it could potentially be adapted to handle that situation by adding an extra sandbox flag which is used during process selection:

  1. If another global already exists in the BCG with this exact null principal, use the same process as that global.
  2. If the load is a sandboxed load:
    • If the unsandboxed principal is a content principal, use that principal to perform process selection.
    • If another global exists in the BCG with the unsandboxed principal, use the same process as that global.
  3. If the load is a data URI load:
    • If the principal to be inherited (as-if it was about:blank) was a content principal, use that principal to perform process selection.
    • If another global exists in the BCG with the principal to be inherited, use the same process as that global.
  4. If none of the previous steps successfully found a process to perform the load in, perform the load in a web content process.
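The steps above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the Gecko implementation: `Principal`, `Load`, `BrowsingContextGroup`, `process_for_site`, the `"web"` name, and the `site:` key format are all hypothetical stand-ins. It also assumes that within steps 2 and 3 the content-principal bullet is tried before the BCG lookup.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

WEB_PROCESS = "web"  # hypothetical name for the catch-all web content process


@dataclass(frozen=True)
class Principal:
    """Hypothetical principal model, hashable so it can key a dict."""
    kind: str  # "content" or "null"
    key: str   # site for content principals, opaque id for null principals


@dataclass
class Load:
    result_principal: Principal                         # null principal of the new document
    sandboxed: bool = False
    unsandboxed_principal: Optional[Principal] = None   # consulted in step 2
    is_data_uri: bool = False
    principal_to_inherit: Optional[Principal] = None    # consulted in step 3


@dataclass
class BrowsingContextGroup:
    # Maps each principal that has a live global in this BCG to its process.
    live_globals: Dict[Principal, str] = field(default_factory=dict)

    def process_for(self, principal: Principal) -> Optional[str]:
        return self.live_globals.get(principal)


def process_for_site(principal: Principal) -> str:
    """Stand-in for ordinary site-based process selection."""
    return f"site:{principal.key}"


def _select_via(bcg: BrowsingContextGroup, principal: Optional[Principal]) -> Optional[str]:
    """Shared logic for steps 2 and 3: content principal first, then BCG lookup."""
    if principal is None:
        return None
    if principal.kind == "content":
        return process_for_site(principal)
    return bcg.process_for(principal)


def select_process(bcg: BrowsingContextGroup, load: Load) -> str:
    # Step 1: reuse the process of a global with this exact null principal.
    existing = bcg.process_for(load.result_principal)
    if existing is not None:
        return existing
    # Step 2: sandboxed loads follow the unsandboxed principal.
    if load.sandboxed:
        chosen = _select_via(bcg, load.unsandboxed_principal)
        if chosen is not None:
            return chosen
    # Step 3: data URI loads follow the principal that would be inherited.
    if load.is_data_uri:
        chosen = _select_via(bcg, load.principal_to_inherit)
        if chosen is not None:
            return chosen
    # Step 4: nothing matched, fall back to a web content process.
    return WEB_PROCESS
```

In this sketch, a load whose creator has no live global in the BCG falls through every lookup and lands in step 4.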

I think this should catch most cases, but will miss some, such as history navigations to documents created by other documents with null principals. In those cases, the creator document may have been destroyed, so looking up its null principal for process selection would fail, and we'd end up in the catchall web content process. We could include details like process selection decisions in history entries to work around this in that specific case, but I wouldn't want to persist this information in sessionstore, as I would prefer not to stabilize process selection decisions or the remote type format.

This also has an odd edge case: if two loads with the exact same null principal are somehow triggered at the same time with different triggering principals, there is a short window between the process selection decision and the document being created during which we could make a different process selection decision for the second load. I can't think of a way this would happen in the wild though, and the worst case scenario is just that these two documents cannot communicate when they should be able to.

Can the final fallback not be a process for the null principal in question? It seems quite easy to end up in 4.

The one alternative I came up with that I'd like us to consider (and maybe urge Chrome to also adopt) is that each site has a parallel sandboxed site that we use for null principals. So https://example.com gets a process and if https://example.com ended up sandboxed or did a data URL load, those would end up in the sandboxed https://example.com process instead. While still not ideal, it seems a lot better than all of that ending up in the same process.
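A minimal sketch of how the parallel sandboxed site idea might key processes. The function name, the `site:` and `sandboxed:` prefixes, and the `"web"` fallback are hypothetical and not an existing remote type format.

```python
from typing import Optional


def process_key(creator_site: Optional[str], is_null_principal: bool) -> str:
    """Return a hypothetical process key for a load.

    creator_site is the site that was sandboxed or that initiated the
    data URL load, or None when no such site is known.
    """
    if creator_site is None:
        return "web"  # assumed shared fallback when the creator is unknown
    if is_null_principal:
        # Null-principal documents go to the site's parallel sandboxed process.
        return f"sandboxed:{creator_site}"
    return f"site:{creator_site}"
```

Under this keying, `process_key("https://example.com", True)` and `process_key("https://example.com", False)` give different keys, so the sandboxed documents of a site never share a process with the site itself or with the sandboxed documents of another site.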

Having some telemetry available here would also be helpful.

Flags: needinfo?(annevk)

(In reply to Anne (:annevk) from comment #6)

> Can the final fallback not be a process for the null principal in question? It seems quite easy to end up in 4.

Possibly. There's a risk of ending up with an absurd number of processes if we end up in 4 too often, but it might be OK? Perhaps it can be added as a tweakable preference.
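One way the tweakable fallback could look, as a sketch only. The `FallbackPrefs` fields are hypothetical and not existing Firefox preferences, and the cap on dedicated processes is an assumption added here to bound the process count; it was not proposed in this bug.

```python
from dataclasses import dataclass
from typing import Set


@dataclass
class FallbackPrefs:
    # Hypothetical switch: give each null principal its own process in step 4.
    isolate_null_principals: bool = False
    # Hypothetical cap (an assumption) to avoid an absurd number of processes.
    max_dedicated_processes: int = 8


def fallback_process(null_id: str, prefs: FallbackPrefs, dedicated: Set[str]) -> str:
    """Pick the step 4 process for a null principal identified by null_id.

    dedicated holds the ids of null principals that already own a process
    and is updated in place when a new dedicated process is granted.
    """
    if prefs.isolate_null_principals:
        if null_id in dedicated:
            return f"null:{null_id}"
        if len(dedicated) < prefs.max_dedicated_processes:
            dedicated.add(null_id)
            return f"null:{null_id}"
    # Preference off or cap reached: shared web content process.
    return "web"
```

With the preference off, every step 4 load shares the `"web"` process as in the original proposal. With it on, loads get dedicated processes until the cap is reached, after which they share again.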

> The one alternative I came up with that I'd like us to consider (and maybe urge Chrome to also adopt) is that each site has a parallel sandboxed site that we use for null principals. So https://example.com gets a process and if https://example.com ended up sandboxed or did a data URL load, those would end up in the sandboxed https://example.com process instead. While still not ideal, it seems a lot better than all of that ending up in the same process.

Yeah, this is what the future steps around sandboxed loads would probably end up looking like for process selection. I think adding support for that is a follow-up issue though, so I'm not focusing on it right now.
