Bug 1836707 Comment 3 Edit History

Note: The actual edited comment in the bug view page will always show the original commenter’s name and original timestamp.

Original comment by

Andrew Sutherland [:asuth] (he/him)

on 2023-06-05 14:33:58 PDT

(In reply to Julian Descottes [:jdescottes] from comment #2)
> Right, it sounds useful especially for the requests performed to download the script. For those scenarios it might make more sense to have something on the request itself in order to link it to the browsing context which called `register`? For regular workers using PFetch, I think Eden made it so that it could work, but maybe this is different? A nsIServiceWorkerManager API that would also handle this use case sounds harder to design.

An additional complication is that because service workers has a [job queue](https://w3c.github.io/ServiceWorker/#dfn-job-queue) with [coalescing semantics in step 6 of schedule job](https://w3c.github.io/ServiceWorker/#schedule-job-algorithm) we end up with a situation where potentially multiple browsing contexts could be associated with a single ServiceWorker register() job and the resulting new registration.  Like if you open 5 tabs in parallel and they all call register, they would all end up with promises waiting on the same single register job.

And for completeness, some related edge cases:
- A ServiceWorker registered for a scope that doesn't actually exist for interception purposes (ex: "/this-url-does-not-exist"), but does exist for push notification purposes or some other rationale where a page would call `getRegistrations()` to locate the registration whose scope does not match the page, and then postMessage() the SW.
- A page "/foo/bar" is controlled by the SW with scope "/".  Another tab registers a new SW with scope "/foo" that claims the pre-existing "/foo/bar" page (which did not call register).  This is extreme (and probably a bad idea for the site), but possible.
- The [update check in step 7 of the update algorithm](https://w3c.github.io/ServiceWorker/#update-algorithm) is not performed by the script loader (which would re-do any loads if the update check finds changes, although those might then be cached).

Maybe a sketch for an API that could handle the various permutations would be:
- Add `nsIServiceWorkerManager::CreateBrowsingContextFilter` method that takes a browsing context id and some flags to scope level of interest and the method returns a new class/interface `nsIServiceWorkerFilter`.
- `nsIServiceWorkerFilter::FilterLoadInfo(nsILoadInfo)` is the key method that can be used to efficiently check if the fetch is coming from an involved ServiceWorker, and it returns an enum or bitmask describing the relationship between the browsing context and the SW.  I propose integer enum values below in order of decreasing precedence where 0 would be that the fetch isn't interesting or related:
  - Controlling: The SW is controlling the page.  This could perhaps be further broken down into:
    - ControllingLoaded: The page has been controlled by the SW since its load.  (Note that a SW that does not handle "fetch" events is still notionally controlling a page, so this does not distinguish if the page load actually went to the network or not.)
    - ControllingClaimed: The page was claimed by the SW (and may or may not have been previously claimed by a different SW).
  - PageRegistered: The page issued a register call to this SW.
  - RegistrationUpdating: The page didn't call register() but an update check for the registration resulting from a functional event or handle fetch algorithm ran and dediced we need to update the SW and the SW is somewhere in the evaluating/installing/waiting lifecycle, with "active" corresponding to the ControllingClaimed value above.
  - RegistrationUpdateCheck: There isn't actually a SW yet, we're just in the update-check as part of the update algorithm, so there's no script loader active yet.  This is the predecessor state of RegistrationUpdating above.
  - ScopeMatches: The page's URL will match the scope of the SW but the page didn't call register, but this inherently would be the result of another page issuing a fresh call to register().
  - PageMessaged: The page used `ServiceWorker.postMessage` to talk to the SW.  (A page can find all SWs via `getRegistrations()`.)
  - ServiceWorkerMessaged: The ServiceWorker messaged the page by using the Clients API.  (Both this and the above could be true, which might be an argument for a bit-mask or separate flags.)
- We could have a bunch of (less performant) helpers that could be used after the (faster) filter check above to return the ServiceWorkerID and/or the nsIServiceWorkerInfo for the ServiceWorker (although there would be neither of these for an update check fetch).
- Medium-term future: The filter could let devtools indicate that it wants all the ServiceWorkers to effectively have `setDebuggerReady(false)` called on them as they're created and allow for a listener callback that gets invoked with the corresponding (new) `nsIWorkerDebugger`.  I guess this would also implicitly call attachDebugger on the nsIServiceWorkerInfo.
  - The context is that, as part of our attempt to improve ServiceWorker performance by not having them be owned by the main thread, we will likely move to having their exposed nsIWorkerDebuggers be on the parent process main thread rather than on the main thread of the content process where they live.  This would also end up happening for SharedWorkers too.  But it means we can have the association between nsIServiceWorkerInfo and nsIWorkerDebugger be much more direct because they will be exposed in the same process.

This would represent a shift of more decision-making into ServiceWorker C++ logic, but this is desirable because a number of the notable changes for a SW becoming relevant to a browsing context already happen on the PBackground thread, and in the future it's quite possible we could move even more of them to a background thread and we could update the state more atomically with this API so that fetches don't get lost, whereas if we had to message devtools for it to update its own predicates first, it might miss a bunch of fetches, especially if/as we shard PBackground into multiple threads/task queues and start doing more toplevel IPC endpoint stuff.

Revision 1 by

Andrew Sutherland [:asuth] (he/him)

on 2023-12-13 13:36:39 PST

edit note: I have changed previous use of "browsing context id" to [BrowserId](https://searchfox.org/mozilla-central/rev/655f49c541108e3d0a232aa7173fbcb9af88d80b/docshell/base/BrowsingContext.h#158-165) for clarity.

(In reply to Julian Descottes [:jdescottes] from comment #2)
> Right, it sounds useful especially for the requests performed to download the script. For those scenarios it might make more sense to have something on the request itself in order to link it to the browsing context which called `register`? For regular workers using PFetch, I think Eden made it so that it could work, but maybe this is different? A nsIServiceWorkerManager API that would also handle this use case sounds harder to design.

An additional complication is that because service workers has a [job queue](https://w3c.github.io/ServiceWorker/#dfn-job-queue) with [coalescing semantics in step 6 of schedule job](https://w3c.github.io/ServiceWorker/#schedule-job-algorithm) we end up with a situation where potentially multiple browsing contexts could be associated with a single ServiceWorker register() job and the resulting new registration.  Like if you open 5 tabs in parallel and they all call register, they would all end up with promises waiting on the same single register job.

And for completeness, some related edge cases:
- A ServiceWorker registered for a scope that doesn't actually exist for interception purposes (ex: "/this-url-does-not-exist"), but does exist for push notification purposes or some other rationale where a page would call `getRegistrations()` to locate the registration whose scope does not match the page, and then postMessage() the SW.
- A page "/foo/bar" is controlled by the SW with scope "/".  Another tab registers a new SW with scope "/foo" that claims the pre-existing "/foo/bar" page (which did not call register).  This is extreme (and probably a bad idea for the site), but possible.
- The [update check in step 7 of the update algorithm](https://w3c.github.io/ServiceWorker/#update-algorithm) is not performed by the script loader (which would re-do any loads if the update check finds changes, although those might then be cached).

Maybe a sketch for an API that could handle the various permutations would be:
- Add `nsIServiceWorkerManager::CreateBrowsingContextFilter` method that takes a (edited:) [BrowserId](https://searchfox.org/mozilla-central/rev/655f49c541108e3d0a232aa7173fbcb9af88d80b/docshell/base/BrowsingContext.h#158-165) and some flags to scope level of interest and the method returns a new class/interface `nsIServiceWorkerFilter`.
- `nsIServiceWorkerFilter::FilterLoadInfo(nsILoadInfo)` is the key method that can be used to efficiently check if the fetch is coming from an involved ServiceWorker, and it returns an enum or bitmask describing the relationship between the browsing context and the SW.  I propose integer enum values below in order of decreasing precedence where 0 would be that the fetch isn't interesting or related:
  - Controlling: The SW is controlling the page.  This could perhaps be further broken down into:
    - ControllingLoaded: The page has been controlled by the SW since its load.  (Note that a SW that does not handle "fetch" events is still notionally controlling a page, so this does not distinguish if the page load actually went to the network or not.)
    - ControllingClaimed: The page was claimed by the SW (and may or may not have been previously claimed by a different SW).
  - PageRegistered: The page issued a register call to this SW.
  - RegistrationUpdating: The page didn't call register() but an update check for the registration resulting from a functional event or handle fetch algorithm ran and dediced we need to update the SW and the SW is somewhere in the evaluating/installing/waiting lifecycle, with "active" corresponding to the ControllingClaimed value above.
  - RegistrationUpdateCheck: There isn't actually a SW yet, we're just in the update-check as part of the update algorithm, so there's no script loader active yet.  This is the predecessor state of RegistrationUpdating above.
  - ScopeMatches: The page's URL will match the scope of the SW but the page didn't call register, but this inherently would be the result of another page issuing a fresh call to register().
  - PageMessaged: The page used `ServiceWorker.postMessage` to talk to the SW.  (A page can find all SWs via `getRegistrations()`.)
  - ServiceWorkerMessaged: The ServiceWorker messaged the page by using the Clients API.  (Both this and the above could be true, which might be an argument for a bit-mask or separate flags.)
- We could have a bunch of (less performant) helpers that could be used after the (faster) filter check above to return the ServiceWorkerID and/or the nsIServiceWorkerInfo for the ServiceWorker (although there would be neither of these for an update check fetch).
- Medium-term future: The filter could let devtools indicate that it wants all the ServiceWorkers to effectively have `setDebuggerReady(false)` called on them as they're created and allow for a listener callback that gets invoked with the corresponding (new) `nsIWorkerDebugger`.  I guess this would also implicitly call attachDebugger on the nsIServiceWorkerInfo.
  - The context is that, as part of our attempt to improve ServiceWorker performance by not having them be owned by the main thread, we will likely move to having their exposed nsIWorkerDebuggers be on the parent process main thread rather than on the main thread of the content process where they live.  This would also end up happening for SharedWorkers too.  But it means we can have the association between nsIServiceWorkerInfo and nsIWorkerDebugger be much more direct because they will be exposed in the same process.

This would represent a shift of more decision-making into ServiceWorker C++ logic, but this is desirable because a number of the notable changes for a SW becoming relevant to a browsing context already happen on the PBackground thread, and in the future it's quite possible we could move even more of them to a background thread and we could update the state more atomically with this API so that fetches don't get lost, whereas if we had to message devtools for it to update its own predicates first, it might miss a bunch of fetches, especially if/as we shard PBackground into multiple threads/task queues and start doing more toplevel IPC endpoint stuff.

Back to Bug 1836707 Comment 3

Bugzilla

Quick Search

Bug 1836707 Comment 3 Edit History