Closed Bug 1293277 Opened 7 years ago Closed 6 years ago

service worker Client interface and APIs won't work with multiple content processes

Categories

(Core :: DOM: Service Workers, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla59
Iteration:
54.3 - Mar 6
Tracking Status
relnote-firefox --- -
firefox59 --- fixed

People

(Reporter: bkelly, Assigned: bkelly)

References

(Depends on 1 open bug, Blocks 4 open bugs)

Details

(Keywords: dev-doc-needed, Whiteboard: [e10s-multi:+])

Attachments

(9 files, 223 obsolete files)

11.48 KB, patch
baku
: review+
Details | Diff | Splinter Review
2.88 KB, patch
baku
: review+
Details | Diff | Splinter Review
6.13 KB, patch
baku
: review+
Details | Diff | Splinter Review
28.89 KB, patch
baku
: review+
Details | Diff | Splinter Review
13.38 KB, patch
baku
: review+
Details | Diff | Splinter Review
1.56 KB, patch
baku
: review+
Details | Diff | Splinter Review
1.01 KB, patch
bkelly
: review+
Details | Diff | Splinter Review
25.28 KB, patch
bkelly
: review+
Details | Diff | Splinter Review
92.70 KB, patch
bkelly
: review+
Details | Diff | Splinter Review
Service workers includes an API like this:

  clients.matchAll({ includeUncontrolled }).then(list => {
    // list is ordered such the most recently focused window is first
    list.forEach(client => {
      // snapshot information about each client
      console.log('Found client - url:' + client.url +
                  ' visible:' + client.visibilityState +
                  ' focused:' + client.focused);

      // can also operate on the client
      if (url === myWindowURL) {
        client.navigtate(newURL);
      } else if (url === interestingURL) {
        client.focus();
      } else if (url === messageBuddy) {
        client.postMessage(msg);
      }
    });
  });

We implement this today, but our current solution will only work in non-e10s or e10s with a single child process.

To make this work for multiple content processes we need:

1) A way to find all windows (top level and nested) for a particular origin across the entire browser.
2) A way to navigate/focus each window
3) A way to postMessage to each window

In the future we will need to support Workers as clients as well, but today we only support windows.
Jim, this is one where we could probably use some help.  We're not sure whats available currently for locating and operating on windows across different child processes.  What do you think?
Flags: needinfo?(jmathies)
(In reply to Ben Kelly [:bkelly] from comment #0)
> To make this work for multiple content processes we need:
> 
> 1) A way to find all windows (top level and nested) for a particular origin
> across the entire browser.
> 2) A way to navigate/focus each window
> 3) A way to postMessage to each window

As far as I know, we don't have anything like this right now. It would have to be implemented fresh.
I don't know of anything. The multi e10s team might be able to help, adding to their triage list.
Flags: needinfo?(jmathies)
Whiteboard: e10s-multi:?
I have some ideas about how I want to implement this.  I'm going to take it for now.
Assignee: nobody → bkelly
Status: NEW → ASSIGNED
Attached patch wip (obsolete) — Splinter Review
Snapshot of my work-in-progress.  Nothing functional yet.  Its mainly just IPC boilerplate scaffolding so far.

Basic structure:

* There is a thread-local-singleton called ClientManager.
* The ClientManager creates a PClientManager actor using PBackground to reach the real service singleton.
* The ClientManager can be used by windows/workers to create a ClientSource.
* The ClientSource is a ref-counted object representing the client entity.  When it is destroyed the client is considered removed from the system.
* The ClientManager can be used to query what clients are available.  These are returned as ClientHandle objects.
* The ClientHandle can be used to operate on the client.  It will be notified when the underlying client is destroyed.
* Since these are PBackground actors they are locked to their creation threads.
* ClientManager will support serializing ClientHandles to ClientToken objects.
* A ClientToken can be passed cross thread and process boundaries and use to re-create a ClientHandle on the other side.
Attached patch wip (obsolete) — Splinter Review
Updated work-in-progress that creates PClientSource actors for windows and workers.  I verified via debug that the actors are correctly created and destroyed as you navigate around.  I also verified that they are properly destroyed and re-created when going into and out of bfcache.

This still uses the "document is a Client" definition for windows, but I plan to fix that to conform to the spec.  I need to check the spec a bit more closely.
Attachment #8812928 - Attachment is obsolete: true
Attachment #8818696 - Attachment is patch: true
Note, the current WIP changes the plan in comment 5 such that ClientSource is now a UniquePtr<> instead of ref-counted.  This helps avoid accidentally keeping the RAII style object alive too long.

Also, a design note:

The ClientSource could be separated from ClientManager.  The actors really could live directly under PBackground right now.  I am keeping the hierarchy with ClientManager, though, so that in the future we can more easily support an optimization where a ClientHandle directly references a ClientSource in the same process and thread.  This is only possibly if we have a child-side place to do maintain a cache of sources, though.  That place is the ClientManager object.
Some notes so I might remember what I am doing next week:

In order to do the reserved Client stuff correctly we'll need to create the Client earlier than we do today.  Right now we don't make a document "visible" as a Client until nsDocument::SetScriptGlobalObject().  We need to create the reserved Client, however, *before* the nsIChannel is created in nsDocShell.  We then need to propagate the Client through until we have an active window/document.  So this means somewhere in nsDocShell::InternalLoad() or nsDocShell::DoURILoad().

For workers we probably will need to create the Client in or before ScriptLoader.  Workers have an added problem, however, in that they do not have their principal or URL fully set at that point.  We may need to initialize a request principal/URL and then updated them after we complete the script nsIChannel.  This has a spec issue too:

https://github.com/w3c/ServiceWorker/issues/1034

When a window and its workers are frozen to be put into the bfcache we will need to "hide" the associated Client.  We should not destroy the Client, however, as we will want to re-use the same Client and its ID if the window/workers are thawed.

Finally, there are some problems with the spec around reserved clients during redirects:

https://github.com/w3c/ServiceWorker/issues/1031
Blocks: 1032521
Blocks: 1183625
Blocks: 1231211
Blocks: 1231218
Depends on: 1333573
Attached patch wip (obsolete) — Splinter Review
Updated work-in-progress that works with the patches in bug 1333573.  We can now reserve the Client before the initial worker script network request.

Next I'll make sure we can do the same for window Client objects as well.  This will mean disentangling where in the nsGlobalWindow/nsDocShell/nsDocument web the Client needs to live.  Right now we perform Client creation/destruction in nsDocument, but thats probably not right.

I think we will probably want to move it to nsGlobalWindow.  For example, if a page is put into the bfcache and then comes back out it would be nice to maintain the exact same Client object and ID.  This suggests we want it to live on nsGlobalWindow.
Attachment #8818696 - Attachment is obsolete: true
Attachment #8832648 - Attachment is patch: true
Note to self, this is the best comment describing how the various Clients work during a network request:

https://github.com/w3c/ServiceWorker/issues/870#issuecomment-245559217
Boris suggests that we get the ClientSource from the NS_NewChannel() location to the creation of the inner window via the nsILoadInfo.
(In reply to Ben Kelly [:bkelly] from comment #12)
> Boris suggests that we get the ClientSource from the NS_NewChannel()
> location to the creation of the inner window via the nsILoadInfo.

Is ClientSource IPC serializable?  IINM this would require serializing it and sending it to the parent as part of the load info serialization.
(In reply to :Ehsan Akhgari from comment #13)
> Is ClientSource IPC serializable?  IINM this would require serializing it
> and sending it to the parent as part of the load info serialization.

We will send an nsID representing the Client.  ClientSource itself cannot be passed because PClientSource is a PBackground actor and the channel is a PContent actor.
Depends on: 1337439
Depends on: 1337522
Whiteboard: e10s-multi:? → [e10s-multi:?]
Attached patch wip (obsolete) — Splinter Review
Updated work-in-progress that also creates the ClientSource before the docshell load and passes the reserved client on to the resulting inner window.
Attachment #8832648 - Attachment is obsolete: true
Depends on: 1339587
No longer depends on: 1339587
Depends on: 1290936
Depends on: 1339844
No longer depends on: 1290936
Blocks: 1264177
Depends on: 1340201
Attached patch wip (obsolete) — Splinter Review
Update work-in-progress with code to set the window Client ExecutionReady.
Attachment #8837307 - Attachment is obsolete: true
Attached patch wip (obsolete) — Splinter Review
Update work-in-progress to reflect how mutable client state will be propagated.

My original idea was to push this information to the parent as it happen.  Given this information includes visibility data, though, I think this might be quite excessive.  Scrolling through twitter where there are tons of iframes would create a lot of IPC traffic.

So for now, lets plan to ping each ClientSource for its state before we return it to the Clients API.
Attachment #8838268 - Attachment is obsolete: true
Other thoughts recently:

I think we should expose a snapshot ClientInfo structure.  This can be attached to nsILoadInfo or nsIChannel to reflect Client related information like principal, creation URL, etc.  These could be produced equally from window and worker instead of our current nsIDocument vs WorkerPrivate nightmare.  Since its a snapshot and not a live actor it can be safely passed across threads.

Over time we could add things like referrer-policy and CSP to the ClientInfo structure.  The network stack could also get its triggering principal from the ClientInfo, etc.

Also, since the Clients API only exposes snapshots of Clients it would be adequate for implementing that API.

In order to replace nsIDocument in ServiceWorkerManager (bug 1231218), however, we will need a live actor and not just a snapshot.  The ServiceWorkerManager will need the actor destruction message to know when Clients go away.
Depends on: 1266747
Attached patch wip (obsolete) — Splinter Review
Incorporate focus time now that bug 1266747 added it to the document.
Attachment #8839708 - Attachment is obsolete: true
Depends on: 1343308
Attached patch wip (obsolete) — Splinter Review
Update to set the Client as controlled.  So far this only handles window clients.  I need to figure out a way to mark the Worker Client object as controlled as well.  Its tricky because the network request and SWM all happen on a different thread.
Attachment #8842042 - Attachment is obsolete: true
(In reply to Ben Kelly [not reviewing due to deadline][:bkelly] from comment #20)
> Update to set the Client as controlled.  So far this only handles window
> clients.  I need to figure out a way to mark the Worker Client object as
> controlled as well.  Its tricky because the network request and SWM all
> happen on a different thread.

So, I was thinking I could make a WorkerClientSourceMainThreadProxy object that gets attached to the LoadInfo, but that will stop working once we move interception to the parent process.

Instead I think I should make the interception code use ClientManager its IPC actors to mark the Client controlled.  This could be a method on PClientManager or a method on a separate PClientHandle actor.  The separate actor may be overkill here, but we will eventually need it for other things in order to complete the multi-e10s refactor.
Attached patch wip (obsolete) — Splinter Review
Starting the ClientHandle here.  I am duplicating a lot of code with the ClientSource.  I'll try to de-duplicate that once things are stable.
Attachment #8842606 - Attachment is obsolete: true
Iteration: --- → 54.3 - Mar 6
Attached patch wip (obsolete) — Splinter Review
Attachment #8843084 - Attachment is obsolete: true
Attached patch wip (obsolete) — Splinter Review
Ok, this handles marking Client objects controlled from the ServiceWorkerManager using a ClientHandle actor.  This means it works for Window clients and the cases where we currently support controlling Worker clients.

I say "where we currently support", because our coverage for controlling worker clients sucks.  For example, clients.claim() and skipWaiting() will not start controlling a worker today.
Attachment #8843422 - Attachment is obsolete: true
Attached patch wip (obsolete) — Splinter Review
Updated to actually attached the ClientHandle to the ClientSource in the parent service backend.  The "controlled" message is now completely routed from the handle back to the source.

Note, we probably still need to mark nsIChannels as controlled when we implement more of the e10s refactor.  These changes are not enough because the CLientHandle-to-ClientSource path can be too slow in order to know we need to intercept an immediate <script> or importScripts() request.  With debugging today I can see we mark workers execution ready before we get the controlled message to worker ClientSource (perhaps because of the loading sync loop).
Attachment #8843484 - Attachment is obsolete: true
Attached patch wip (obsolete) — Splinter Review
This updates the patch to set mark the nsIChannel's load info with the controlling service workers scope and ID.  The window and worker then get this info out and update the ClientSource immediately.
Attachment #8843527 - Attachment is obsolete: true
Attached patch wip (obsolete) — Splinter Review
Makes navigator.serviceWorker.controller use the window's ClientSource->GetControlled() data instead of the document-to-registration map.

This almost passes all WPT tests, but fails:

/service-workers/service-worker/skip-waiting-installed.https.html
-----------------------------------------------------------------
FAIL Test skipWaiting when a installed worker is waiting
/service-workers/service-worker/skip-waiting-using-registration.https.html
--------------------------------------------------------------------------
TIMEOUT [Parent]
Attachment #8843594 - Attachment is obsolete: true
(In reply to Ben Kelly [not reviewing due to deadline][:bkelly] from comment #27)
> /service-workers/service-worker/skip-waiting-installed.https.html
> -----------------------------------------------------------------
> FAIL Test skipWaiting when a installed worker is waiting
> /service-workers/service-worker/skip-waiting-using-registration.https.html
> --------------------------------------------------------------------------
> TIMEOUT [Parent]

These are likely due to ServiceWorkerClients caching the controller object and we don't clear it when ClientSource::Controlled() is called.
Attached patch wip (obsolete) — Splinter Review
This triggers the controller change logic in ServiceWorkerContainer when the ClientSource::Controlled() method is called.  It also avoids the old paths that trigger the controller change directly from SWM.

This lets the patch pass both WPT and mochitests locally.
Attachment #8843598 - Attachment is obsolete: true
Note to self about work remaining:

1) Replace ClientControlledArgs with a ServiceWorkerDescriptor IPC type.  (ServiceWorkerInfo is already taken....)
2) Change the signals from Controlled(ClientControlledArgs) to SetController(ServiceWorkerDescriptor).
3) Create an actor under PClientManager to represent querying for a list of ClientInfo objects.
  a) Query params like matchAll() options in actor constructore
  b) Find list of matching ClientSourceParent objects
  c) SendRequestState() to each ClientSourceParent
  d) Wait for RecvUpdateState() or actor destruction for each ClientSourceParent
  e) Sort list of ClientInfo+ClientState objects
  f) Send list back in actor __delete__ message
4) Finish implementing Freeze/Thaw of Clients
  a) Make sure we call Freeze/Thaw in appropriate places
  b) Hide frozen actors from ClientHandle and matchAll query
  c) Don't think we need to do anything about client controller here
5) Implement infrastructure for postMessage()
6) Implement infrastructure for focus()
7) Implement infrastructure for navigate()
8) Implement infrastructure for openWindow()
9) Start implementing Clients API on top of the infrastructure
10) Clean up patches for review

Steps (1), (2), (3), (4), and (9) could probably be done within the week.

But I can't skipp steps (5), (6), (7), and (8).  Those may take some thinking to figure out.  Difficult to say how close we are.
More notes to self:

Things like focus(), navigate(), openWindow(), and matchAll() all require resolving a promise after the operation actually completes.  My initial thought was just to various async messages for this, but since we need to match things up at the end we should probably us actors to represent the operations.

So I need to build an "operation" actor system similar to Cache API's system:

https://dxr.mozilla.org/mozilla-central/source/dom/cache/PCacheOp.ipdl

I will need a PClientHandleOp that goes from child-to-parent and a PCacheSourceOp that goes from parent-to-child.

Another unrelated thought:

Currently the wip patch maintains the WorkerHolder at the ClientManager level and depends on the PClientManager destroying all its child actors before giving up the WorkerHolder.  Because of this we may need to check if the PClientManagerParent is tearing down before creating a PClientSourceOp.  This may just be handled correctly by the teardown semantics, though.  I'll need to check.
Attached patch wip (obsolete) — Splinter Review
Rough-in all the boilerplate needed to add two more actors.
Attachment #8843617 - Attachment is obsolete: true
Attached patch wip (obsolete) — Splinter Review
This mostly replaces the "control a Client" path with the new operation actor system.  It uses MozPromise to handle the result resolve/reject of the operation.  This is nice because it lets me fix the problem with the last patch where we did not wait for the Clients to actually be controlled before resolving the Clients.claim() promise.

The current patch has a bug, though, where it does not keep the ClientHandle alive if there is an operation in progress.  This means that when code creates a handle, starts an op, and then drops the handle ref it ends up canceling the operation.  To solve this I need to delay ClientHandle::StartTeardown() if any child operation has a populated mPromiseHolder.

I'll fix this tomorrow and then do some cleanup before moving to the other operations.  The good news is that once this is proved out the other operations should be significantly easier.  I just have to define args in, do the operation in the ClientSource, and define result out.
Attachment #8843745 - Attachment is obsolete: true
Attached patch wip (obsolete) — Splinter Review
This patch keeps the ClientHandle object alive until the pending remote op is complete.  It once again passes service worker tests locally.

Note, the dom/workers/tests/serviceworkers/test_claim.html test is a bit broken and had to be fixed here.
Attachment #8844303 - Attachment is obsolete: true
Depends on: 1345132
Attached patch wip (obsolete) — Splinter Review
Add focus operation to the infrastructure.  Can't test yet since I don't have the DOM API implemented yet.
Attachment #8844482 - Attachment is obsolete: true
Depends on: 1345251
Attached patch wip (obsolete) — Splinter Review
Start implementing Navigate.  I had to refactor a bit more, though, since I realized I needed async operation all the way into ClientSource.

Also, this revealed a problem with MozPromise in worker thread.  I need to fix bug 1345251.
Attachment #8844567 - Attachment is obsolete: true
Attached patch wip (obsolete) — Splinter Review
This implements the Client.navigate() op infrastructure.  I have low confidence that the nsIWebProgressListener I copied from the existing code is adequate here.  We need wait to resolve the promise until the new Client is ExecutionReady.

I'll most likely have some rework to do once I implement the DOM layer and can test this.
Attachment #8844660 - Attachment is obsolete: true
In order to get Client.postMessage() right I need to do bug 1311324 first.  Right now we are lacking some WPT test coverage because we are not to spec any more.  Also, some of our postMessage() code looks wrong to me at the moment.
Depends on: 1311324
Attached patch wip (obsolete) — Splinter Review
Partial implementation of postMessage().  Mostly just the receive-side done so far.  I still need to do the structured clone on the send-side.

Also, as mentioned I need to sort out bug 1311324 to make sure I'm implementing this right.  I think our current implementation is both of an old spec and bugged in a few ways.  For example, we set the evt.source to the controller of the client, but thats pretty clearly wrong.  I want to get the WPT tests green so I'm more confident I'm not regressing something.
Attachment #8844726 - Attachment is obsolete: true
Depends on: 1345943
Attached patch wip (obsolete) — Splinter Review
Update the patch based on the two MessageEvent dependent bugs.  Also, implement the ServiceWorkerDescriptor type and use it.
Attachment #8845151 - Attachment is obsolete: true
Attached patch wip (obsolete) — Splinter Review
Attachment #8846081 - Attachment is obsolete: true
Last patch mostly finishes infrastructure for postMessage().  I took a bit of a short-cut on the sending side by assuming the actor is already setup.  That will be the case for the ServiceWorker caller, but ultimately might need to get fixed if we expose this on main thread.
Attached patch wip (obsolete) — Splinter Review
Add boilerplate for the Clients.matchAll() query.  This is a separate actor on the ClientManager since PClientOp is specific to individual Client(Handle|Source) objects.
Attachment #8846204 - Attachment is obsolete: true
Attached patch wip (obsolete) — Splinter Review
Continue to flesh out the query path.  Still needs some work on the requesting side and in the service.  I need to convert ClientManager to use some of the convenience stuff I wrote for ClientHandle/ClientSource.
Attachment #8846750 - Attachment is obsolete: true
I need the changes in bug 1340163 in order to get the origin out of a PrincipalInfo() OMT.
Depends on: 1340163
Whiteboard: [e10s-multi:?] → [e10s-multi:+]
Attached patch wip (obsolete) — Splinter Review
Implemented the service side of the matchAll() query.  Still needs some cleanup, but might work.
Attachment #8846823 - Attachment is obsolete: true
Attached patch wip (obsolete) — Splinter Review
Start factoring out the async task execution code so it can be shared among ClientSource/ClientHandle/ClientManager via a base class.  Still some work to do, but I finally figured out the right sacrifices to appease the template gods.
Attachment #8847280 - Attachment is obsolete: true
Attached patch wip (obsolete) — Splinter Review
This refactors things further so I can use the MaybeExecute() helpers in ClientManager.  It also implements the ClientManager::Query() logic.

Next step is to implement the DOM layer of the Clients API.  It should be a fairly thin wrapper around the infrastructure elements.  Then test, fix bugs, cleanup.
Attachment #8847385 - Attachment is obsolete: true
Attached patch wip (obsolete) — Splinter Review
Actually include the important ClientThing.h header.
Attachment #8847788 - Attachment is obsolete: true
Blocks: 1348082
Attached patch wip (obsolete) — Splinter Review
Stub out the Clients and Client DOM classes.  This was a bit harder than I expected due to having to unlink all uses of the old classes.
Attachment #8847790 - Attachment is obsolete: true
Started filling in the guts of the DOM classes and realized I need some more infrastructure for:

1) Clients.get()
2) Clients.claim()
3) Clients.openWindow()
Attached patch wip (obsolete) — Splinter Review
Attachment #8848237 - Attachment is obsolete: true
Attached patch wip (obsolete) — Splinter Review
This implements the DOM layer for Client and part of Clients.  Unfortunately there some things on Clients I still need infrastructure for.  The first two should not be too bad:

1) Clients.claim() could probably be implemented on top of ClientManager::Query and ClientHandle::Control().  I need to add some code to notify ServiceWorkerManager of the change in control, though.
2) Clients.get() should just be a variant of the Query code I added for Clients.MatchAll().

The third though:

3) Clients.openWindow() could be very very hard.  Its unclear to me yet if we have infrastructure to do what I need here.

Clients.openWindow() needs:

1) Open a window in another child process since we don't want windows in the worker process.
2) Detect when window load is complete
3) Communicate the Client UUID back to the remote process that called openWindow() so we can construct a Client DOM object

I may need to build a new set of actors to do this.
Attachment #8848575 - Attachment is obsolete: true
Blake tells me there is a C++ API that I can use from the parent process that will help with openWindow();  ContentParent::GetNewOrUsedBrowserProcess():

https://dxr.mozilla.org/mozilla-central/source/dom/ipc/ContentParent.cpp#796
Attached patch wip (obsolete) — Splinter Review
This converts MatchAll to use a more generic Op mechanism.  It also implements Clients.claim() on top of that mechanism.

Left to do:

* Clients.get()
* Clients.openWindow()
* Lots of cleanup work

The openWindow() changes might take a couple days.
Attachment #8848681 - Attachment is obsolete: true
Attached patch wip (obsolete) — Splinter Review
This adds most of the guts for Clients.get().  Still a bit more to do on the initiation side to get the principal and ID correctly.

I'll probably try to get this stuff passing tests before I tackle openWindow().
Attachment #8849693 - Attachment is obsolete: true
Attached patch wip (obsolete) — Splinter Review
Updated to implement the rest of clients.get().

I also started trying to get this test to pass:

  service-workers/service-worker/clients-get.https.html

I found and fixed a few bugs from running this.  I also realized I'm missing a change.

The new design depends on a ClientInfo being stored in the nsIChannel's nsILoadInfo.  I have code that sets the reserved client's ClientInfo, but not the initiating ClientInfo.  So I need to modify NS_NewChannel() to do that.
Attachment #8849729 - Attachment is obsolete: true
I think for loads with an associated nsINode I can automatically set the ClientInfo here:

https://dxr.mozilla.org/mozilla-central/source/netwerk/base/nsIOService.cpp#880

I'm not sure how to set the ClientInfo for network requests on Workers, though.  Workers are not well integrated into NS_NewChannel().  I will probably have to do a bunch of lame post-creation modifications to the nsILoadInfo for all the different places we initiated network from workers.  This will be error prone, unfortunately.
Attached patch wip (obsolete) — Splinter Review
This gets the clients-get.https.html test to pass.  It did reveal some potential issues in the spec, though:

https://github.com/whatwg/html/issues/2456
https://github.com/w3c/ServiceWorker/issues/1091

I implemented what makes sense given the current spec, but I imagine it might change in ways that could be hard to implement.
Attachment #8850193 - Attachment is obsolete: true
Attached patch wip (obsolete) — Splinter Review
This gets clients-matchall.https.html passing.
Attachment #8850267 - Attachment is obsolete: true
Attached patch wip (obsolete) — Splinter Review
This gets postmessage-to-client.https.html to pass.  Turns out I forgot to implement Client::PostMessage(), although most of the infrastructure was there.
Attachment #8850644 - Attachment is obsolete: true
Attached patch wip (obsolete) — Splinter Review
Make claim-fetch.https.html pass.
Attachment #8850676 - Attachment is obsolete: true
These are the WPT tests still failing:

Unexpected Results
==================

/service-workers/service-worker/claim-using-registration.https.html
-------------------------------------------------------------------
FAIL Test for the waiting worker claims a client which is using the the same registration
/service-workers/service-worker/client-navigate.https.html
----------------------------------------------------------
TIMEOUT Frame location should update on successful navigation
NOTRUN Frame location should not be accessible after redirect
NOTRUN Frame location should not update on failed navigation
TIMEOUT [Parent]
/service-workers/service-worker/clients-matchall-client-types.https.html
------------------------------------------------------------------------
TIMEOUT expected FAIL Verify matchAll() with window and sharedworker client types
ERROR [Parent]
/service-workers/service-worker/clients-matchall-order.https.html
-----------------------------------------------------------------
FAIL Clients.matchAll() returns controlled windows in focus order.  Case 1.
FAIL Clients.matchAll() returns controlled windows in focus order.  Case 2.
FAIL Clients.matchAll() returns non-focused uncontrolled windows in creation order.
FAIL Clients.matchAll() returns uncontrolled windows in focus order.  Case 1.
FAIL Clients.matchAll() returns uncontrolled windows in focus order.  Case 2.
FAIL Clients.matchAll() returns controlled windows and frames in focus order.
/service-workers/service-worker/fetch-event.https.html
------------------------------------------------------
FAIL Service Worker responds to fetch event with an existing client id
/service-workers/service-worker/navigate-window.https.html
----------------------------------------------------------
FAIL Clients.matchAll() should not show an old window as controlled after it navigates.
FAIL Clients.matchAll() should not show an old window after it navigates.
Attached patch wip (obsolete) — Splinter Review
This gets navigate-window.https.html test to pass.  I had forgotten to complete the Freeze/Thaw implementation and bfcache Clients were showing up in the list.
Attachment #8850686 - Attachment is obsolete: true
Attached patch wip (obsolete) — Splinter Review
This gets clients-matchall-order.https.html to pass.  The sorting code was wrong.

This also gets fetch-event.https.html.  The test incorrectly checks to make sure a non-subresource fetch event has a null client ID.  This is not correct according to the current spec.  I will probably break this out into a separate bug.
Attachment #8850707 - Attachment is obsolete: true
Attached patch wip (obsolete) — Splinter Review
This gets clients-matchall-client-types.https.html passing.  There were a few issues here:

1) The test timed out when it had a test assertion failure.  I fixed that to propagate the error.
2) We were failing because the shared worker Client.url was wrong.  It was not resolved to the full path.  I fixed Client.url to use the location.href for workers.
3) We then passed the test for the first time, so I removed the failure expectation.
Attachment #8850779 - Attachment is obsolete: true
The clients-navigate.https.html failures are due to a bigger problem.  Currently navigate is designed to work like this:

1) Client::Navigate() effectively creates a PClientHandleOp that is owned by a PClientHandle
2) The PClientHandleOp actor is then connected to a PClientSource and creates a PClientSourceOp
3) The PClientSourceOp then calls into the ClientSource representing the target window to navigate
4) ClientSource::Navigate() returns a MozPromise to the PClientSourceOp.  When the MozPromise resolves the op actors will be destroyed returning the op result.
5) ClientSource::Navigate() sets up a nsIWebProgressListener to wait for document load and then starts the load.  When the progress listener detect the load is complete it resolves the MozPromise.

The problem is that in step 5 we navigate the window.  This *destroys* the ClientSource that is in use.  This automatically aborts all operations in progress for the ClientSource.  This causes the Client.navigate() operation to fail.

I need to create some kind of operation actor associated with the PClientManager or PBackground itself.  I then need to get this new actor to work with the ClientSource in question to start the navigate, wait for the new window to load, and return info about the new ClientSource.

There is some overlap here with openWindow() in that its not an operation on the PClientSource actor.  I'm not sure I can really get any code re-use out of the two, though.  The openWindow functionality needs to work when there are no windows yet which means it must work without a PClientManager already existing.  I think openWindow will require a PContent actor for this.

So what I will probably do here is create a PClientManager navigation operation actor.  The target PClientSource actor can be passed as an argument to this new operation actor.
Attached patch wip (obsolete) — Splinter Review
I did fix some other stupid bugs in the navigate path, though.
Attachment #8850787 - Attachment is obsolete: true
This error:

/service-workers/service-worker/claim-using-registration.https.html
-------------------------------------------------------------------
FAIL Test for the waiting worker claims a client which is using the the same registration

Is due to not rejecting the clients.claim() promise if the function is called on an inactive service worker.

Once the SWM is fully migrated to the Client and multi-e10s architecture I think this might be easier to solve.  At the moment, though, I will need to write some kludgy compat code to pull it out of the ServiceWorkerManager on the current main thread.
(In reply to Ben Kelly [not reviewing due to deadline][:bkelly] from comment #69)
> Once the SWM is fully migrated to the Client and multi-e10s architecture I
> think this might be easier to solve.  At the moment, though, I will need to
> write some kludgy compat code to pull it out of the ServiceWorkerManager on
> the current main thread.

Actually, we will eventually need this information synchronously in the service worker context anyway.  We will need it to implement self.registration.active.state which is a sync getter.  So perhaps I should just push this information to the WorkerPrivate now.
Current failures with the latest patch:

/service-workers/service-worker/claim-using-registration.https.html
-------------------------------------------------------------------
FAIL Test for the waiting worker claims a client which is using the the same registration
/service-workers/service-worker/client-navigate.https.html
----------------------------------------------------------
FAIL Frame location should update on successful navigation
FAIL Return value should be instance of WindowClient
FAIL Redirecting to another origin should resolve with null
FAIL Navigating to about:blank should reject with TypeError

And of course openWindow() is still completely unimplemented.
Depends on: 1350398
Depends on: 1350433
Attached patch wip (obsolete) — Splinter Review
Updated patch includes code to store a ServiceWorkerDescriptor on the WorkerPrivate.  This is only partly done.  I still need to propagate state changes from the ServiceWorkerInfo through the ServiceWorkerPrivate and up to the WorkerPrivate.

Once this is done I can check the state when skipWaiting() is called immediately to reject if necessary.
Attachment #8850811 - Attachment is obsolete: true
Attached patch wip (obsolete) — Splinter Review
This gets claim-using-registration.https.html to pass.  The only service worker WPT test failing now is:

/service-workers/service-worker/client-navigate.https.html
----------------------------------------------------------
FAIL Frame location should update on successful navigation
FAIL Return value should be instance of WindowClient
FAIL Redirecting to another origin should resolve with null
FAIL Navigating to about:blank should reject with TypeError

I need to do the complication navigation re-design to fix that, though.  I also need to implement openWindow().

I will be on travel for the next couple weeks, unfortunately.  I will try to work on them as I can, but may not make much progress until I return.  That might make fitting this in FF55 difficult since the merge is only another week after that.  I'll do my best, though.
Attachment #8851108 - Attachment is obsolete: true
We also fail at least this mochitest:

dom/workers/test/serviceworkers/test_match_all_client_properties.html
Depends on: 1351935
Depends on: 1351959
Attached patch wip (obsolete) — Splinter Review
This fixes client.navigate().  All WPT service worker tests now pass.

I still need to implement openWindow(), do clean up, and then flag for review.
Attachment #8851180 - Attachment is obsolete: true
Blake pointed me to this in order to determine if Clients.openWindow() should potentially create a new child process:

http://searchfox.org/mozilla-central/source/xpcom/system/nsIXULRuntime.idl#104
Attached patch wip (obsolete) — Splinter Review
This fixes test_match_all_client_properties.html.  The test verifies we strip the {} characters off the UUID.  I fixed that in this update.

I also fixed a bug in focus(), but we still fail this test:

dom/workers/test/serviceworkers/test_notificationclick_focus.html

Its failing because focus() should fail if there hasn't been a notification click within a given time frame.  I need to add code to make that check.
Attachment #8852800 - Attachment is obsolete: true
Attached patch wip (obsolete) — Splinter Review
This fixes test_notificationclick_focus.html.

The next failing test is test_postmessage_advanced.html.  This one crashes.  I think the problem is that it tries to serialize a PBlob actor managed by PContent across the PClient's PBackground actor.

I'm not really sure how to handle that, unfortunately.
Attachment #8853337 - Attachment is obsolete: true
Attached patch wip (obsolete) — Splinter Review
This fixes test_postmessage_advanced.html.  The problem was actually with how the clients code bridges the postMessage across the parent process.  When going from one PBackground manager to another we need rebuild the StructuredCloneData in the IPC args.

With this change the only remaining mochitest failure is in test_openwindow.html.  So implementing clients.openWindow() will be next.
Attachment #8853373 - Attachment is obsolete: true
Attached patch wip (obsolete) — Splinter Review
Rebase and update for corrected client-navigate.https.html test case in bug 1351935.
Attachment #8853387 - Attachment is obsolete: true
I have some more work in progress for the openWindow() case, but nothing working yet.  I think its safe to say this will not be done in time for the merge next week.  Sorry.
Attached patch openwindow-wip (obsolete) — Splinter Review
Work-in-progress on the openWindow() code.  This pipes from the Clients.openWindow() call to the main thread on the parent process.  Next step is to create a PContent op to bridge to the child process.
Attached patch openwindow-wip (obsolete) — Splinter Review
This adds the PContent actor to bridge to the child process.  It also handles the "are we on e10s" check and possibly creating the child process if its not there yet.

Next I need to:

1) Copy the existing window opening code into a helper that can be called either from the parent or the child process.
2) Figure out how to gracefully handle browser shutdown with this setup.
Attachment #8857206 - Attachment is obsolete: true
Attached patch openwindow-wip (obsolete) — Splinter Review
This is a first cut at moving the openWindow() code over to the new model.  Currently crashes on tests.  Also I haven't even tried to support the android case yet.
Attachment #8857669 - Attachment is obsolete: true
Depends on: 1313096
Attached patch wip (obsolete) — Splinter Review
Rebase.
Attachment #8856710 - Attachment is obsolete: true
Attached patch openwindow-wip (obsolete) — Splinter Review
This gets about half of test_openWindow.html passing.  I had to fix the crashes, an infinite loop in the parent process, and a few issues in the test.  I think the remaining issues are related to not handling edge cases like about:blank correctly.

Note, a significant blocker to landing this work will be getting a Clients.openWindow() test on fennec.  I'm not familiar with the android test infrastructure at all.  It would be very helpful if we could get someone familiar with that side of things to write the test.  Its somewhat bad that we've been shipping this feature without a test on android.
Attachment #8858402 - Attachment is obsolete: true
Depends on: 1357463
Attached patch openwindow-wip (obsolete) — Splinter Review
This gets test_openWindow.html to pass in non-e10s.

It also passes all of the test cases in e10s, but gets confused about its final condition.  Its using global state which is broken by multi-e10s and our current setup where we spawn different service workers.  I still need to adjust the test to work around this.
Attachment #8858927 - Attachment is obsolete: true
Attached patch openwindow-wip (obsolete) — Splinter Review
This fixes test_openWindow.html in e10s mode.  I ended up adding a preference that makes openWindow() favor the current process.  This allows the tests to work correctly with its global state.  I did this since the same compat problem would probably break real sites as well.  In the future we can remove this code and use the normal remote process code instead.
Attachment #8859335 - Attachment is obsolete: true
Attached patch openwindow-wip (obsolete) — Splinter Review
Slightly better code for the "same process" case.

Next I will try to figure out android.
Attachment #8859647 - Attachment is obsolete: true
I resurrected one of my fxos devices enough to run test_openWindow.html on an android device.  The good news is that it passes.  The bad news is it looks like we won't have any automation for background or not-running openWindow cases.  I'll have to manually validate those.

I'm still working on getting a good developer setup for android.
Attached patch android.patch (obsolete) — Splinter Review
Start of the android open window patch.
Depends on: 1358622
Attached patch android.patch (obsolete) — Splinter Review
This gets the last part of openWindow working on android.
Attachment #8860161 - Attachment is obsolete: true
I have to do a bunch of rework because we apparently have all kinds of restrictions on where IPC types can be used.  I didn't hit this while developing because we only trigger these failures on windows platform.
Depends on: 1359230
Attached patch grr (obsolete) — Splinter Review
This fixes many of the windows.h issues, but there are still more.  I am going to see if I can fix it in the IPC layer instead in bug 1359230.
No longer depends on: 1359230
Depends on: 1361051
Depends on: 1361166
Depends on: 1361722
Depends on: 1362033
Note to self:

We are getting leaks in:

devtools/client/projecteditor/test/browser_projecteditor_rename_file_01.js

Because this:

devtools/client/projecteditor/lib/stores/local.js

Creates chrome workers that it abandons, but does not terminate.  This means its depending on CC to clean up these workers in a timely manner.

With the clients patches, though, we have a WorkerHolder alive for the PClientManager actor.  This increases the busy count preventing the cycle collector from terminating the worker.

So the workers stay alive keeping the window alive until shutdown.
Thing to try:  Call ModifyBusyCountFromWorker() in ClientManager code to force the busy count back down.
Depends on: 1362444
There are some windows specific failures.  Fixes for that, but no talos in this build:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=0f9985dd2cabf76be9b9341452a0ff45f93e8133
Depends on: 1364276
Depends on: 1364277
So, my changes to handle initial about:blank iframe document replacement correctly break test_openWindow.html.

In particular, the redirect cases within that test depend on either one of the following happening:

1. The controlled status of the channel being propagated across a redirect.  For a navigation this shouldn't happen, but we do today AFAICT.  So if URL1 is intercepted, but then redirects to URL2 which is not, the final document is controlled.  That is wrong.
2. Or, the tests would work if we re-intercepted the redirected requests.  So URL1 is intercepted and URL2 is intercepted.  This is the correct thing to do, but we don't do it.  Bug 1363848 is filed to fix this.

Since my patches here fix (1), I need to fix (2) so that the tests will continue to pass with these patches.  Its also a pretty important compat bug, so its worthy fixing anyway.
Depends on: 1363848
Depends on: 1366089
Attached patch wip (obsolete) — Splinter Review
Rebased and folded patch queue.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=41e1db0f46dc1837f86ad6e49adf3afb34673e58
Attachment #8858925 - Attachment is obsolete: true
Attachment #8859652 - Attachment is obsolete: true
Attachment #8861032 - Attachment is obsolete: true
Attachment #8861184 - Attachment is obsolete: true
Attached patch wip (obsolete) — Splinter Review
Fix a bug in the ClientChannelHelper code where we incorrectly always assumed we could use the initial ClientInfo.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=324501e4ba67f94216636b24a811509e04f69307

I still need to write a test for various about:blank initial document cases.  It seems we don't actually share the Client with the initial about:blank in many cases because the initial about:blank is created after we start the main document load.
Attachment #8876908 - Attachment is obsolete: true
Depends on: 1372962
Attached patch wip (obsolete) — Splinter Review
Rebase around some MozPromise changes.  Try with bug 1372962:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=eb9711799995d818404bb4d3e5b18fc831aef8b9
Attachment #8877356 - Attachment is obsolete: true
Attached patch wip (obsolete) — Splinter Review
This adds a WPT test for the client handling of about:blank replacement.  I posted it to WPT repo as well:

https://github.com/w3c/web-platform-tests/pull/6304

Let's see if I broke anything on try:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=6f4cc3bce5aed094364e73fe48d1556c9d7905db
Attachment #8877668 - Attachment is obsolete: true
Next task here is to make sure we are doing the right thing with bfcache.  We need to evict the window from bfcache if any operations are performed on its Client.  For example, if you try to postMessage() it, clients.matchAll() would have matched it, etc.  We should also evict if its controlled by a service worker that becomes redundant.
Attached patch wip (obsolete) — Splinter Review
Store the browser-wide Client data in a hashtable.  I was using a relatively stupid nsTArray as a placeholder before.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=e61ecf339cf9013b879a1dbb92bd5c1f4bfcf852
Attachment #8880542 - Attachment is obsolete: true
Depends on: 1379243
Trying to get a build working without AbstractThread and proper runnable labeling:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=953d07c894e777755f343daef9a7392f7bcf66a5
Attached patch wip (obsolete) — Splinter Review
This patch removes AbstractThread() in favor of nsIGlobalObject::EventTargetFor().  This should ensure runnables are labeled correctly.  Depends on bug 1379243.
Attachment #8880849 - Attachment is obsolete: true
Attached patch wip (obsolete) — Splinter Review
Rebase
Attachment #8885452 - Attachment is obsolete: true
Priority: -- → P2
Attached patch wip (obsolete) — Splinter Review
Rebased.  It was surprisingly not bad to rebase and compiled relatively easily.

Lets see how green try still is:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=c33a322acd9a6f6b2ab833b5e825dd4d9753e17c

Given I have been working on this for 11 months, I think I need to land what I have if try is green.  There will be TODO comments and things to polish, but this bug is blocking other e10s service worker fixes.

I hope to grace :baku with the joy of reviewing it.
Attachment #8891524 - Attachment is obsolete: true
Attached patch wip (obsolete) — Splinter Review
Fix some orange:

1. Follow new clients.openWindow() code to get proper container id.
2. Avoid referencing SWM off the main thread in a case a new WPT triggered.
3. Fix nsIDocShell.idl and nsILoadInfo.idl to use [noscript, nostdcall, notxpcom] instead of c++ virtual methods.  This was confusing the docshell vtbl.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=b9978e766a2b012d4a65564df345d07d16c0e2f2
Attachment #8918004 - Attachment is obsolete: true
I will work on adding comments and patch splitting monday.
Looks like one orange still left to fix:

/service-workers/service-worker/client-navigate.https.html | Frame location should update on successful navigation
assert_unreached: unexpected rejection: assert_equals: expected "https://web-platform.test:8443/service-workers/service-worker/resources/client-navigated-frame.html" but got "" Reached unreachable code 

Only on non-e10s.
Attached patch wip (obsolete) — Splinter Review
The LoadInfo::mClientInfo was being lost across the internal redirect introduced by InterceptedHttpChannel.  We should copy this value in the LoadInfo copy constructor.
Attachment #8918457 - Attachment is obsolete: true
Attached patch wip (obsolete) — Splinter Review
Spent most of today fixing a non-e10s failure.  There was some subtle stuff broken by the extra InterceptedHttpChannel internal redirect.  The code now uses the ClientChannelHelper for both window and worker nsIChannels to handle redirects properly.

I would do another try build, but the tree has been closed most of today.
Attachment #8918937 - Attachment is obsolete: true
https://treeherder.mozilla.org/#/jobs?repo=try&revision=c01841e999e1d2ca4e9da39ed336acc69079a7fb

One thing I'm curious about is how we are passing this assertion when InternalHttpChannel does its internal redirect:

https://searchfox.org/mozilla-central/source/dom/workers/ScriptLoader.cpp#687

I'll investigate that tomorrow.
Try is green.  Going to start patch splitting for review.
Attached patch wip (obsolete) — Splinter Review
Attachment #8919288 - Attachment is obsolete: true
Attached patch wip (obsolete) — Splinter Review
Attachment #8919908 - Attachment is obsolete: true
Attached patch wip (obsolete) — Splinter Review
Attachment #8920286 - Attachment is obsolete: true
Attached patch wip (obsolete) — Splinter Review
Attachment #8920341 - Attachment is obsolete: true
Attachment #8920695 - Attachment description: P1 Respect LOAD_CALL_CONTENT_SNIFFERS when a channel is intercepted by a ServiceWorker. r=valentin → P1 Add a ref-counted WorkerHolderToken() class. r=baku
Comment on attachment 8920695 [details] [diff] [review]
P1 Add a ref-counted WorkerHolderToken() class. r=baku

Andrea, this patch adds a convenience class for using WorkerHolder objects.  It creates a ref-counted token that functions as a holder.  I find this easier to reason about and is a pattern I've used in dom/cache, etc.  Rather than implement this over and over I'd like to incorporate it in dom/workers.
Attachment #8920695 - Flags: review?(amarchesini)
Comment on attachment 8920284 [details] [diff] [review]
P2 Expose a "parsed" ServiceWorkerState value. r=baku

This patch exposes a "parsed" ServiceWorkerState.  This is not strictly in the spec yet, but I've raised an issue here:

https://github.com/w3c/ServiceWorker/issues/1162

And it (or something like it) will be necessary if we ever do:

https://github.com/w3c/ServiceWorker/issues/1077

Note, this should be somewhat transparent to existing code since its not currently possible to get a ServiceWorker object that is in this state.  I need some initial value, though, since this bug is going to make us track the state from the beginning of the creation of the WorkerPrivate.
Attachment #8920284 - Flags: review?(amarchesini)
Andrea, this patch adds a new ServiceWorkerDescriptor type.  This is a simple structure that contains enough information to construct a ServiceWorker DOM object.  Its threadsafe and can be passed across process boundaries.

Note, this is not a replacement for ServiceWorkerInfo, but merely a way for us to reference a ServiceWorkerInfo in a separate process without a strict actor involved.  We can create the ServiceWorker DOM object, then build the actor to the real ServiceWorkerInfo, and update the state value if necessary.

The intent here is that nsGlobalWindow and WorkerPrivate will have a Maybe<ServiceWorkerDescriptor> to represent their controller.  This will then be set on the nsILoadInfo to indicate that the channel is controlled by a service worker.

Also note, I originally intended this to be a straight ipdl generated struct.  I had to ultimately build a wrapper class around the IPC type, though, because of windows.h pollution.  The ipdl generated code includes windows.h which makes the bindings code refuse to compile.  This was a major pain and this is the best solution I came up with after struggling with it for a couple weeks.
Attachment #8920700 - Attachment is obsolete: true
Attachment #8920701 - Flags: review?(amarchesini)
Comment on attachment 8920340 [details] [diff] [review]
P4 Use the ServiceWorkerDescriptor in existing WorkerPrivate and ServiceWorkerInfo code. r=baku

This patch replaces various id and scope values in existing code in favor of the new ServiceWorkerDescriptor type.  This is partly to reduce duplication, but will also make it easier when we want to just copy "the service worker" around in later patches.  We can just grab the descriptor and go.
Attachment #8920340 - Flags: review?(amarchesini)
Attachment #8920375 - Attachment is obsolete: true
Comment on attachment 8920704 [details] [diff] [review]
P5 Add ClientInfo and ClientState types. r=baku

Andrea, this patch adds a few more structure types similar to ServiceWorkerDescriptor.

The ClientInfo is used to reference a global somewhere in the browser.  It uses a UUID to uniquely reference the global, but does not have a live actor associated with it.  The ClientInfo is designed to be a lightweight token that can be passed around.  Later patches will introduce an actor system for using the ClientInfo to attach to the global to control it (navigate, etc).

Note, the comments talk about some things happening at "execution ready" time.  This refers to this part of the spec:

https://html.spec.whatwg.org/#concept-environment-execution-ready-flag

Later patches will make this more clear where these values are actually set.

The ClientState, ClientWindowState, and ClientWorkerState are also lightweight structures.  The intent is to pass around a ClientState which can then be unpacked itno either a ClientWindowState or ClientWorkerState.

The ClientWindowState basically represents the readable attributes on WindowClient:

https://w3c.github.io/ServiceWorker/#client-interface

The ClientWorkerState does not really contain anything right now, but I'd like to keep it as a placeholder for completeness.  Its a bit easier to reason about various things in common code if we always have a state available.

In later patches ClientInfo will be used in a number of ways:

1. It will be available on nsGlobalWindow inner windows and WorkerPrivate.
2. It will be set on nsILoadInfo to represent the global triggering the network request.  This will be used to set FetchEvent.clientId for now, but later can be used to replace a lot of things we currently require nsIDocument to passed to NS_NewChannel() for.  For example, we could place CSP, referrer policy, etc on the ClientInfo.  It could also replace triggering principal as a separate argument.
3. The ClientInfo and a ClientState combined will be used to create a the dom Client or WindowClient object.
4. An internal manager API will be provided to query ClientInfo objects for a given principal anywhere in the system.  This is PBackground based and can be used anywhere that is available.
Attachment #8920704 - Flags: review?(amarchesini)
Oh, I should also explain the directory structure I intend to use here:

 dom / client / api     - The DOM Clients API bits.
              / manager - An internal manager interface for tracking globals cross-process
Attachment #8920284 - Flags: review?(amarchesini) → review+
Comment on attachment 8920701 [details] [diff] [review]
P3 Add a ServiceWorkerDescriptor type to represent a thread-safe snapshot of a ServiceWorkerInfo. r=baku

Review of attachment 8920701 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/workers/ServiceWorkerDescriptor.cpp
@@ +65,5 @@
> +ServiceWorkerDescriptor::operator==(const ServiceWorkerDescriptor& aRight) const
> +{
> +  return mData->id() == aRight.mData->id() &&
> +         mData->scope() == aRight.mData->scope() &&
> +         mData->principalInfo() == aRight.mData->principalInfo();

no mData->state()? This could be confusing.
Attachment #8920701 - Flags: review?(amarchesini) → review+
Comment on attachment 8920340 [details] [diff] [review]
P4 Use the ServiceWorkerDescriptor in existing WorkerPrivate and ServiceWorkerInfo code. r=baku

Review of attachment 8920340 [details] [diff] [review]:
-----------------------------------------------------------------

good cleanup of the code.
Attachment #8920340 - Flags: review?(amarchesini) → review+
Comment on attachment 8920704 [details] [diff] [review]
P5 Add ClientInfo and ClientState types. r=baku

Review of attachment 8920704 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/clients/manager/ClientInfo.h
@@ +1,5 @@
> +/* -*- Mode: C++; tab-width: 8; indent-tabs-mode: nil; c-basic-offset: 2 -*- */
> +/* vim: set ts=8 sts=2 et sw=2 tw=80: */
> +/* This Source Code Form is subject to the terms of the Mozilla Public
> + * License, v. 2.0. If a copy of the MPL was not distributed with this
> + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */

a new line here?

::: dom/clients/manager/ClientState.h
@@ +1,5 @@
> +/* -*- Mode: C++; tab-width: 8; indent-tabs-mode: nil; c-basic-offset: 2 -*- */
> +/* vim: set ts=8 sts=2 et sw=2 tw=80: */
> +/* This Source Code Form is subject to the terms of the Mozilla Public
> + * License, v. 2.0. If a copy of the MPL was not distributed with this
> + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */

empty line here?

@@ +38,5 @@
> +  operator=(ClientWindowState&& aRight);
> +
> +  ~ClientWindowState();
> +
> +  explicit ClientWindowState(const IPCClientWindowState& aData);

can you move this near the other CTORs?

@@ +57,5 @@
> +// This class defines the mutable worker state we support querying
> +// through the ClientManagerService.  It is a snapshot of the state and
> +// is not live updated.  Right now, we don't actually providate any
> +// worker specific state values, but we may in the future.  This
> +// class also servces as a placeholder that the state is referring

services

@@ +73,5 @@
> +
> +  ClientWorkerState&
> +  operator=(const ClientWorkerState& aRight);
> +
> +  ClientWorkerState(ClientWorkerState&& aRight);

here as well?

@@ +102,5 @@
> +
> +  ClientState&
> +  operator=(const ClientState& aRight) = default;
> +
> +  ClientState(ClientState&& aRight);

here too.

@@ +109,5 @@
> +  operator=(ClientState&& aRight);
> +
> +  ~ClientState();
> +
> +  explicit ClientState(const IPCClientWindowState& aData);

I don't understand the order of the CTORs.
Attachment #8920704 - Flags: review?(amarchesini) → review+
I want to see how you use WorkerHolderToken before reviewing patch 1. I keep the r?.
(In reply to Andrea Marchesini [:baku] from comment #156)
> ::: dom/workers/ServiceWorkerDescriptor.cpp
> @@ +65,5 @@
> > +ServiceWorkerDescriptor::operator==(const ServiceWorkerDescriptor& aRight) const
> > +{
> > +  return mData->id() == aRight.mData->id() &&
> > +         mData->scope() == aRight.mData->scope() &&
> > +         mData->principalInfo() == aRight.mData->principalInfo();
> 
> no mData->state()? This could be confusing.

I'll add a comment.  I don't include state() here since its not part of the identity of the SW.
(In reply to Ben Kelly [:bkelly] from comment #160)
> (In reply to Andrea Marchesini [:baku] from comment #156)
> > ::: dom/workers/ServiceWorkerDescriptor.cpp
> > @@ +65,5 @@
> > > +ServiceWorkerDescriptor::operator==(const ServiceWorkerDescriptor& aRight) const
> > > +{
> > > +  return mData->id() == aRight.mData->id() &&
> > > +         mData->scope() == aRight.mData->scope() &&
> > > +         mData->principalInfo() == aRight.mData->principalInfo();
> > 
> > no mData->state()? This could be confusing.
> 
> I'll add a comment.  I don't include state() here since its not part of the
> identity of the SW.

Actually, this operation isn't really used, so I might as well make it a full comparison.
Attachment #8920704 - Attachment is obsolete: true
Attachment #8921618 - Flags: review+
Attached patch wip (obsolete) — Splinter Review
Remove nsIIPCBackgroundChildCreateCallback usage since its going away.
Attachment #8920379 - Attachment is obsolete: true
Attached patch wip (obsolete) — Splinter Review
Try to remove some more dead code after the PBackground changes.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=d7d7d7c9ecdf2400ebd2196b1d48f03f42dc68ef
Attachment #8921624 - Attachment is obsolete: true
Depends on: 1411746
Fix a mis-typed logic condition.  I swapped (oldState == newState) for (oldState != newState).
Attachment #8920340 - Attachment is obsolete: true
Attachment #8922084 - Flags: review+
Attached patch wip (obsolete) — Splinter Review
The claim() code was using ServiceWorkerDescriptor::operator==().  I decided to just change this comparison to explicitly use Id() instead of burying it in the operator==() impl like before.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=0db89b21e36dcea41f99ab2f9d1ed5dadd67741a
Attachment #8921644 - Attachment is obsolete: true
I'm going to land some patches as I go to avoid bitrot and to allow more parallel work to happen.  So as things are r+'d here I may move them to a separate bug to land.  To start I'll file a bug to land ServiceWorkerDescriptor and a bug to land ClientInfo/ClientState.
No longer depends on: 1313096
Blocks: 1337439
No longer depends on: 1337439
Blocks: 1412856
No longer depends on: 1357463
Depends on: 1412858
Comment on attachment 8920284 [details] [diff] [review]
P2 Expose a "parsed" ServiceWorkerState value. r=baku

Moved to bug 1412858.
Attachment #8920284 - Attachment is obsolete: true
Attachment #8921617 - Attachment is obsolete: true
Attachment #8922084 - Attachment is obsolete: true
Depends on: 1412864
Comment on attachment 8921618 [details] [diff] [review]
P5 Add ClientInfo and ClientState types. r=baku

Moved to bug 1412864.
Attachment #8921618 - Attachment is obsolete: true
Attached patch wip (obsolete) — Splinter Review
Attachment #8922085 - Attachment is obsolete: true
Attached patch wip (obsolete) — Splinter Review
I split the actor boilerplate out into P3.  It was harder than I expected.  Lets see what I broke:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=30a8c54a65109bc66d2242de4d1a841a79a94fab
Attachment #8923528 - Attachment is obsolete: true
Attachment #8923527 - Attachment is obsolete: true
Some new orange, but it doesn't look like its from my code splitting.  New test here:

https://searchfox.org/mozilla-central/source/dom/workers/test/serviceworkers/test_bug1408734.html

Has the same problem as bug 1411746.
Depends on: 1413056
Attachment #8923579 - Attachment is obsolete: true
Attached patch wip (obsolete) — Splinter Review
Attachment #8923580 - Attachment is obsolete: true
Comment on attachment 8923581 [details] [diff] [review]
P2 Add ClientThing base class. r=baku

Andrea, the code in later patches is designed around the idea of an underlying IPC actor and an outer owning object.  This patch introduces a templated base class for the outer owning objects to extend.  It mainly manages the activation and shutdown steps between the owner and actor.  They each have a weak reference to each other that must be cleared when either one goes away.

I understand that you prefer to implement the owner and the actor in one class.  Personally I prefer to separate them because I find it easier to think about their life cycles separately.

In addition to preference, though, one of the Client owner objects cannot be unified easily with the actor.  The ClientSource is designed to be owned as a UniquePtr<> and is not reference counted.  This means I cannot use the trick of having the IPC code add a ref, etc.

In regards to naming, I could change this to something more verbose like ClientActorOwner if you want.  I kind of like ClientThing, though.

As a side note, this class used to also handle pending lambda operations for cases where the actor was not created yet.  That is no longer necessary since the PBackground actor is now synchronously available.  Thanks!
Attachment #8923581 - Flags: review?(amarchesini)
Comment on attachment 8923854 [details] [diff] [review]
P3 Add IPC actor structure and boilerplate. r=baku

This sizeable patch stubs out the various IPC actors.  There are some long living actors that form this structure:

* PBackground
  * PClientManager
    * PClientSource
    * PClientHandle

Each of these corresponds to an outer owning object.

* ClientManager is going to be a ref-counted per-thread singleton.  It will be a factory for creating ClientSource and ClientHandle objects.  It will also provide methods for performing operations to query Clients, navigate Clients, etc.
* ClientSource will be an RAII style object that each window/worker owns.  While it exists the global is considered alive.  It should be destroyed when the global is destroyed.  This is how we map globals to the Client concept.
* ClientHandle is a way to attach to an existing Client.  It is a ref-counted object.  When creating a ClientHandle the ClientManager will match up the provided ClientInfo to a corresponding ClientSource.  Messages like postMessage(), focus(), etc will then be forwarded to the correct ClientSource.  If the ClientSource goes away, then these operations will simply reject.  The ClientHandle can out live the ClientSource.

In addition to this basic structure, there are a number of "operation" type actors here.  I wrote all of this before the async RPC style ipdl support was added.  I considered going back and rewriting this stuff, but I opted not to for now.  The async RPC support requires slightly different MozPromise usage than I am using here.  It also has some subtle differences on life cycle management that I would have to hunt down.  Given that this bug is way behind I decided not to tackle that for now.

The operations are:

* PBackground
  * PClientManager
    * PClientManagerOp: various child-to-parent operations
    * PClientNavigateOp: A parent-to-child operation.
  * PClientSource
    * PClientSourceOp: various parent-to-child operations
  * PClientHandle
    * PClientHandleOp: various child-to-parent operations

* PContent
  * PClientOpenWindowOp: A parent-to-child operation for opening a new window cross-process with an async return that provides corresponding ClientInfo.

For the most part this patch is just boilerplate.  The actors and types don't have any useful information in them.  I thought getting this out of the way in a separate patch would make it easier to read later patches where I add functionality.
Attachment #8923854 - Flags: review?(amarchesini)
Attachment #8923581 - Flags: review?(amarchesini) → review+
Attachment #8923891 - Attachment is obsolete: true
Attached patch wip (obsolete) — Splinter Review
Attachment #8923897 - Attachment is obsolete: true
Comment on attachment 8923896 [details] [diff] [review]
P4 Add the ClientManager class. r=baku

Andrea, this patch adds the ClientManager class.  Its a per-thread singleton that acts as a factory for ClientSource and ClientHandle.  It also provides methods for querying the list of clients, etc.

There is a Startup() method that must be called early during process creation to setup the TLS constants.

There is a StartOp() utility method that creates a PClientManagerOp and ties it back to a MozPromise.  Future patches will make use of this method.

Again, this is just getting the structure in place.  Future patches will add the factory methods, etc.
Attachment #8923896 - Flags: review?(amarchesini)
Comment on attachment 8923962 [details] [diff] [review]
P5 Add ClientManagerService and reference it from parent actors. r=baku

This patch adds the parent-side service object that the majority of the actors connect to.  This will be the object that maintains the list of active globals/clients in the system.  When a ClientHandle is created it will be matched to a ClientSource in the service's list.

For now this patch just adds the service and adds a reference to the relevant actors.

This class lives on PBackground thread.
Attachment #8923962 - Flags: review?(amarchesini)
Comment on attachment 8923984 [details] [diff] [review]
P6 Add the ClientSource class and hook it into ClientManager/ClientManagerService. r=baku

Andrea, this patch adds the ClientSource object.  Its an RAII style object designed to be held via a UniquePtr<>.  I say "RAII" since creating it will register the existence of the Client in the ClientManagerService and destruction will remove the entry from ClientManagerService.

I went ahead an included the bits here that integrate ClientSource with ClientManager and ClientManagerService.

Creation starts when nsGlobalWindow, nsDocShell, or WorkerPrivate call one of the ClientManager::CreateSource() methods.  (These call sites will be added in a later patch.). The ClientManager then allocates the ClientSource which triggers the creation of the PClientSource actor.  The parent-side of the actor then calls AddSource on the service.

The service stores its list of Clients in a hash table keyed on the Client's UUID.  (The UUID is originally created in the ClientManager::CreateSourceInternal() method).  I chose this data structure because I wanted to make things as fast as possible for adding and removing Clients to minimize overhead from pages with many iframes, etc.  Looking up by UUID is also important so that ClientHandle can relatively quickly attach to the right source.  The main operation that will still be O(n) is the matchAll(), but that is ok because its O(n) today.

Note, I use the ContentPrincipalInfo.originNoSuffix() value to match principals off the main thread.  This is very handy as I was doing some evil things with the spec in previous versions.  Thanks for adding it!
Attachment #8923984 - Flags: review?(amarchesini)
Comment on attachment 8923854 [details] [diff] [review]
P3 Add IPC actor structure and boilerplate. r=baku

Review of attachment 8923854 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/clients/manager/ClientManagerChild.cpp
@@ +76,5 @@
> +}
> +
> +mozilla::ipc::IPCResult
> +ClientManagerChild::RecvPClientNavigateOpConstructor(PClientNavigateOpChild* aActor,
> +                                   const ClientNavigateOpConstructorArgs& aArgs)

indentation here
Attachment #8923854 - Flags: review?(amarchesini) → review+
Comment on attachment 8920695 [details] [diff] [review]
P1 Add a ref-counted WorkerHolderToken() class. r=baku

Review of attachment 8920695 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/workers/WorkerHolderToken.cpp
@@ +14,5 @@
> +already_AddRefed<WorkerHolderToken>
> +WorkerHolderToken::Create(WorkerPrivate* aWorkerPrivate, Status aShutdownStatus,
> +                          Behavior aBehavior)
> +{
> +  MOZ_ASSERT(aWorkerPrivate);

Can you assert in which thread this code runs?
Attachment #8920695 - Flags: review?(amarchesini) → review+
Attachment #8920695 - Attachment is obsolete: true
Attachment #8924200 - Flags: review+
Attachment #8923854 - Attachment is obsolete: true
Attachment #8924201 - Flags: review+
Depends on: 1413604
Depends on: 1413606
Attachment #8924200 - Attachment is obsolete: true
Attachment #8923581 - Attachment is obsolete: true
Attachment #8924201 - Attachment is obsolete: true
Attached patch wip (obsolete) — Splinter Review
Attachment #8923988 - Attachment is obsolete: true
Comment on attachment 8924306 [details] [diff] [review]
P7 Add ClientHandle class and make it attach to the correct ClientSource. r=baku

This adds the ClientHandle owner object.  It also adds the code that makes it attach to the ClientSource via the ClientManagerService.
Attachment #8924306 - Flags: review?(amarchesini)
Attached patch validate.patch (obsolete) — Splinter Review
While trying to split the next patch I realized I never implemented the principal/url validation code.  This patch adds it using MozURL for OMT support.  This should help harden the clients PBackground ipdl protocol against spoofed values (a bit).

https://treeherder.mozilla.org/#/jobs?repo=try&revision=2130ecd4975c30658a23d44ee4781d7318f486d2
Attached patch validate.patch (obsolete) — Splinter Review
Lots of orange.  Looks like file:/// urls were not passing validation.  Fix that and see if anything else is broken:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=c3de71a12578a138b5b9e212d6716ecc48020fe3
Attachment #8924627 - Attachment is obsolete: true
Attached patch validate.patch (obsolete) — Splinter Review
Allow javascript: URLs.

Also, add some validation that we don't overwrite an existing ClientSourceParent.  There were assertions for this, but we should use a runtime check since we are dealing with potentially spoofed IPC traffic.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=efc50f2b4784d31ce3bd0a9e88ae169dcbf99e2d
Attached patch validate.patch (obsolete) — Splinter Review
Allow about:srcdoc.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=cab639e30dbe6f983e7e4b201349d7a91189bb5d
Attachment #8924676 - Attachment is obsolete: true
Attachment #8924712 - Attachment is obsolete: true
Attached patch validate.patch (obsolete) — Splinter Review
Handle jar, moz-icon, and wyciwyg URLs.

Also, fix a bug in the WorkerPrivate integration where I was marking a worker execution ready even though the load failed due to CSP violation.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=a1004e858f4afd084dd927f85d4973ad629a077d
Attachment #8924728 - Attachment is obsolete: true
I like this validate patch.  Its a bit brittle using MozURL right now, but I like that it caught this subtle bug in the WorkerPrivate code.  Also gives me more confidence things are working correctly across the system.
Attached patch