Closed Bug 1280759 Opened 8 years ago Closed 4 years ago

about:debugging service worker start button always creates worker thread in parent process

Categories

(DevTools :: about:debugging, defect, P2)

defect

Tracking

(Not tracked)

RESOLVED INVALID

People

(Reporter: bkelly, Unassigned)

References

(Blocks 2 open bugs)

Details

Attachments

(1 file, 1 obsolete file)

In e10s mode the about:debugging page can create a worker thread in the parent process if a user presses "start" on a service worker while there is no child process.

STR:

1) Open fresh browser instance
2) Open about:memory
3) Open about:debugging#workers in new tab
4) Click start on a service worker entry
5) Measure in about:memory
6) Observe there is only a main process and the service worker thread is running there.

I think this might be expected due to the way we are working around the ServiceWorkerManager in e10s mode, but I couldn't find it filed as a bug.  I just wanted to file this as a known problem.
Actually, this seems worse:

1) Open fresh browser instance
2) Open mozilla.org in a tab to create a child process
3) Open about:debugging#workers in a tab
4) Start a service worker for a site
5) Debug the service worker and set a break point in the fetch event
6) Open a the site for the service worker in a new tab
7) Notice that the breakpoint does not trigger.
8) Now click debug in about:debugging for the SW again
9) In the new debugger window set the same breakpoint
10) Refresh the site
11) Observe that the new breakpoint in the second debugger window triggers

It seems if the SW is started via about:debugging, then its started in the parent and does not receive any of the events generated by the page in the child process.

The about:debugging page should really be creating the worker thread in the child process if one exists.  Ideally this would be handled by ServiceWorkerManager, but we're not there yet.  We need something sooner, though, because start+debug is broken and quite confusing for users.

Jan, Alexandre, what do you guys think?
Flags: needinfo?(poirot.alex)
Flags: needinfo?(janx)
Summary: about:debugging service worker start button can create worker thread in parent process → about:debugging service worker start button always creates worker thread in parent process
See Also: → 1280737
The "start" function is injected via service-worker-child.js into every process (parent process and child processes).

Upon receiving a "serviceWorkerRegistration:start" message, the script attempts to ensure it's in the correct process before actually starting a service worker for the given scope. It does so by checking for the existence of an active worker in that process (note: active as in "successfully installed one day", not as in "currently running") for that scope [0].

It appears that this no longer works: `registration.activeWorker` now seems defined in every process, causing "start" to spawn service workers in all processes, even in the main process.

Upon load or refresh, about:debugging associates all live service workers from all processes to a registration, matching by scope [1]. The current code causes about:debugging to keep track of only the last worker (represented by a devtools workerActor) in the list if several workers match the same registration. As a consequence of this behavior in e10s, using "start" seems to spawn workers into every process, but since a child process worker is generally last in the list, about:debugging usually forgets about the worker in any parent process worker (except when there is no child process).

To fix the "start" method, we go back to the original question: how to properly spawn a service worker on demand?

a) We could bail out from a "start" function if its process is not a child process (in e10s):
- If there is no content process, "start" won't work at all (better than spawning a main process worker, but it's probably worth figuring out how to create a content process when none is available).
- If there are multiple content process, "start" will spawn as many workers (but hopefully we'll probably be able to switch to the refactored ServiceWorkerManager before multiple content processes see widespread adoption).

b) Maybe there is a better way to cause a service worker to start running than to call `attachDebugger()` and hope you're in the right process? Or maybe that's not possible until service workers are refactored for e10s.

[0] https://dxr.mozilla.org/mozilla-central/source/devtools/server/service-worker-child.js#24
[1] https://dxr.mozilla.org/mozilla-central/source/devtools/client/aboutdebugging/components/workers/panel.js#85
Flags: needinfo?(janx)
Flags: needinfo?(bkelly)
Flags: needinfo?(bchabod)
In my opinion, we won't be able to handle all the cases until the ServiceWorkerManager refactoring is over. However, as Ben pointed out, there is still a lot of work to get there and we need a workaround to fix most of our problems before it happens. I think the top priority if the start+debug functionality. This is what most people are likely to use: write a service worker, start it, set breakpoints there and there and load the page.

(In reply to Jan Keromnes [:janx] from comment #2)
> a) We could bail out from a "start" function if its process is not a child
> process (in e10s):
> - If there is no content process, "start" won't work at all (better than
> spawning a main process worker, but it's probably worth figuring out how to
> create a content process when none is available).
> - If there are multiple content process, "start" will spawn as many workers
> (but hopefully we'll probably be able to switch to the refactored
> ServiceWorkerManager before multiple content processes see widespread
> adoption).

I agree! If there is no child process, we must either find a way to create one, or tell the user he needs to open a remote tab.
This way, when the user does start+debug, he can debug the right SW (in the content process) and set breakpoints.
Currently, hitting start and debug opens a toolbox on the SW spawned in the main process... I believe this is the problem in Bug 1280737.

However, if there are multiple child processes, the fact that "start" spawns workers in each of them is hard to repair until the SW refactoring happens. As you said, about:debugging only shows one of these workers (the last one in the list).
If the user happens to debug the wrong one, he might think his SW doesn't work, so in terms of UI, perhaps about:debugging should list all of them? Or maybe we should ignore this case for the moment.

> b) Maybe there is a better way to cause a service worker to start running
> than to call `attachDebugger()` and hope you're in the right process? Or
> maybe that's not possible until service workers are refactored for e10s.

Calling attachDebugger()/detachDebugger() to force the worker to start running works because ServiceWorkerPrivate calls spawnWorkerIfNeeded() if the worker is dead [1]. Maybe we should make this method available in the ServiceWorkerInfo object, to provide a cleaner way to start it on the JS side?
But this doesn't help at all with the process problem.

[1]: https://dxr.mozilla.org/mozilla-central/source/dom/workers/ServiceWorkerPrivate.cpp#1826
Flags: needinfo?(bchabod)
Blocks: 1164829
(In reply to Benoit Chabod [:bigben] from comment #3)
> In my opinion, we won't be able to handle all the cases until the
> ServiceWorkerManager refactoring is over. However, as Ben pointed out, there
> is still a lot of work to get there and we need a workaround to fix most of
> our problems before it happens. I think the top priority if the start+debug
> functionality. This is what most people are likely to use: write a service
> worker, start it, set breakpoints there and there and load the page.
> 
> (In reply to Jan Keromnes [:janx] from comment #2)
> > a) We could bail out from a "start" function if its process is not a child
> > process (in e10s):
> > - If there is no content process, "start" won't work at all (better than
> > spawning a main process worker, but it's probably worth figuring out how to
> > create a content process when none is available).
> > - If there are multiple content process, "start" will spawn as many workers
> > (but hopefully we'll probably be able to switch to the refactored
> > ServiceWorkerManager before multiple content processes see widespread
> > adoption).
> 
> I agree! If there is no child process, we must either find a way to create
> one, or tell the user he needs to open a remote tab.

+1 Telling the user to open a tab should be enough.
What would be great could be to have a grayed start button with a tooltip explaining why.
Starting a child process is complex as it would be great to stop it at some point and managing it's lifecycle would be hard. Keep it mind it should be a edge case. I imagine you would always have a content process when e10s is turned on.

> This way, when the user does start+debug, he can debug the right SW (in the
> content process) and set breakpoints.
> Currently, hitting start and debug opens a toolbox on the SW spawned in the
> main process... I believe this is the problem in Bug 1280737.
> 
> However, if there are multiple child processes, the fact that "start" spawns
> workers in each of them is hard to repair until the SW refactoring happens.

I think we can just ignore multiple child processes scenario until SW have been refactored.
May be we could drop a warning based on the pref value...

You told me once about having different behaviors if e10s is turned on/off.
I believe you can setup a pretty good behavior based on that.
Check for this pref "browser.tabs.remote.autostart", if that true, e10s is turned on.

> Calling attachDebugger()/detachDebugger() to force the worker to start
> running works because ServiceWorkerPrivate calls spawnWorkerIfNeeded() if
> the worker is dead [1]. Maybe we should make this method available in the
> ServiceWorkerInfo object, to provide a cleaner way to start it on the JS
> side?
> But this doesn't help at all with the process problem.

+1 if spawnWorkerIfNeeded is not enough, that would be great to have something less hacky!


So with all that:
- ignore multiple content processes
- hide/disable start button when e10s is turned on and there is no child process
- branch whenever is necessary to have different codepath/behavior when e10s is on or off

Do you think you can draw a decent story around start and debug?
Flags: needinfo?(poirot.alex)
Note that in my testing I did have a tab open with a child process.  Start still created in the parent.  But maybe that is something we think is fixable?  Sorry if I missed that nuance in the above comments.
Flags: needinfo?(bkelly)
Yes, that's because ServiceWorkerRegistrationActor.start assumes loadFrameScript executes only in the child process whereas it seems to also execute in the parent. It should be easily fixable with some e10s checks. In my previous comment 4, I tried to summarize to try to move forward on all these issues.

Benoit, I think you have a good picture of this, do not hesitate to start throwing patches!
Alright, this is a small patch to demonstrate the behaviour Alexandre proposed in comment 4: if we are in e10s and there is no content process available, the "start" button is disabled and a tooltip explains that the user should start a content process himself.

But maybe it would be more convenient in terms of UX to spawn the child process ourself if the user really wants to start a service worker. As discussed in other bugs, we haven't found a proper way to do that yet, but you can force it to start opening a dummy remote iframe [1].

What do you guys think?

I also added a simple check in the service worker frame script loaded by the start function in every process, and it won't do anything if it's inside the parent process. This way, no more service workers spawned in the main process when e10s is enabled, yay!

[1]: https://dxr.mozilla.org/mozilla-central/source/devtools/server/tests/mochitest/test_getProcess.html#67
Priority: -- → P2
(In reply to Benoit Chabod [:bigben] from comment #7)
> But maybe it would be more convenient in terms of UX to spawn the child
> process ourself if the user really wants to start a service worker. As
> discussed in other bugs, we haven't found a proper way to do that yet, but
> you can force it to start opening a dummy remote iframe [1].
> 
> What do you guys think?

Would a content window opened after one of these dummy remote iframes end up in the same child process?  If so, then it sounds good to me!

Do you have a good place to clean up the iframe?
Here's another patch proposal, where I spawn a content process for the service worker using an empty remote iframe added to the about:debugging#workers document.

(In reply to Ben Kelly [:bkelly] from comment #8)
> Would a content window opened after one of these dummy remote iframes end up
> in the same child process?  If so, then it sounds good to me!

They do end up in the same child process if there is only one!
However, if the user overrides the dom.ipc.processCount preference and enables multiple content processes, this may not work. But we can't really fix it until service workers are refactored.

> Do you have a good place to clean up the iframe?

Well, I didn't find a proper way to do that yet, so in this current version of the patch the iframe is appended to the document's body and dies only when about:debugging is closed. 

I tried to add a listener for the "workerListChanged" event and destroy the iframe when the SW has been spawned, but this "unexpected death" of the content process caused some crashes in other parts of the devtools, and it could lead to race conditions.

I guess for now we have to choose between this solution and the previous patch (disable the start button).
Assignee: nobody → bchabod
Status: NEW → ASSIGNED
Ben, what do you think of these two solution?

I've discussed this with Alexandre and perhaps playing with remote iframes isn't the right thing to do.
It does spawn a content process, but perhaps some parts of Firefox code aren't expecting a content process to appear for no reason in an about:* page. And I didn't find a way to clean it properly afterwards.
Therefore, we think disabling the start button when there is no content process spawned is the safest way to go before SW get refactored.
Flags: needinfo?(bkelly)
Sounds good.  Thanks!
Flags: needinfo?(bkelly)
Comment on attachment 8766257 [details] [diff] [review]
Patch - prevent start button from spawning SW in E10S main process

Jan, can you review this patch? :)
As discussed here, preventing service workers to spawn in the main process and disabling the "start" button when there is no content process available is the best solution for now.
Attachment #8766257 - Flags: review?(janx)
Attachment #8767689 - Attachment is obsolete: true
Comment on attachment 8766257 [details] [diff] [review]
Patch - prevent start button from spawning SW in E10S main process

Review of attachment 8766257 [details] [diff] [review]:
-----------------------------------------------------------------

Thank you Benoit! Your approach makes sense, and your patch looks good to me.

Please rebase it (and fix an easy merge conflict), then address the few nits below, and if you can, please add a few steps to devtools/client/aboutdebugging/test/browser_service_workers_start.js to verify the desired button state in relevant edge cases.

::: devtools/client/aboutdebugging/components/workers/service-worker-target.js
@@ +107,5 @@
>      let { pushSubscription } = this.state;
>      let isRunning = this.isRunning();
>  
> +    // Until service workers are refactored for E10S, starting a worker
> +    // should only be possible if there is a child process alive

Nit: Please end this comment with a full stop ('.').

@@ +109,5 @@
>  
> +    // Until service workers are refactored for E10S, starting a worker
> +    // should only be possible if there is a child process alive
> +    let isE10S = Services.appinfo.browserTabsRemoteAutostart;
> +    let contentSpawned = Services.ppmm.childCount > 1;

Shouldn't the ServiceWorkerTarget component listen for process changes in order to re-render?

I couldn't find a scenario where the button stays in a wrong state after a process has dis/appeared, but I suspect that's because a chrome worker dis/appears at the same time as the content process, forcing a re-render. This means we might run into update problems if we decide to stop displaying chrome workers one day.

::: devtools/client/locales/en-US/aboutdebugging.properties
@@ +4,5 @@
>  
>  debug = Debug
>  push = Push
>  start = Start
> +startDisabled = Please start a content process to spawn service workers

Nit: This will probably sound very confusing to web developers who aren't familiar with Firefox' internals.

Maybe add something like "Hint: You can do this by opening any website in a separate tab." ?

Or maybe rephrase to something like "Starting a Service Worker currently requires a Content Process to exist. Please open any website in a separate tab to create one." ?
Attachment #8766257 - Flags: review?(janx) → feedback+
Product: Firefox → DevTools

This bug has not been updated in the last 6 months. Resetting the assignee field.
Please, feel free to pick it up again and add a comment outlining your plans for it if you do still intend to work on it.
This is just trying to clean our backlog of bugs and make bugs available for people.

Assignee: be.chabod → nobody
Status: ASSIGNED → NEW

I am assuming this issue is void after parent-intercept landed. Please re-open if this persists to be relevant.

Status: NEW → RESOLVED
Closed: 4 years ago
Flags: needinfo?(jdescottes)
Resolution: --- → INVALID

Sounds good, thanks!

Flags: needinfo?(jdescottes)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: