Closed Bug 1242482 Opened 8 years ago Closed 8 years ago

Service worker "Unregister" is not propagated in e10s

Categories

(Core :: DOM: Service Workers, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla47
Tracking Status
e10s - ---
firefox47 + fixed

People

(Reporter: janx, Assigned: bkelly)

References

Details

Attachments

(4 files, 2 obsolete files)

If you apply the 2 patches from bug 1212797, which make about:debugging use `ServiceWorkerManager` to show service worker registrations, listing them with `SWM.getAllRegistrations()`, and listening for changes using `SWM.addListener(this)`, it appears that "Unregister" messages are not getting propagated correctly in e10s.

Steps to reproduce:
1. Apply the 2 patches from bug 1212797
2. Run `./mach mochitest devtools/client/aboutdebugging/test/browser_service_workers.js` (this test works in non-e10s).
3. Now run `./mach mochitest devtools/client/aboutdebugging/test/browser_service_workers.js --e10s`

Expected:
- The test should works as well in e10s

Actual result:
- The test times out because in the parent process, SWM.getAllRegistrations() still returns a previously unregistered service worker registration.

More details:

- In e10s, ServiceWorkerManagerParent.cpp::RecvRegister() seems to successfully trigger ServiceWorkerManagerChild.cpp::RecvNotifyRegister()

- However, still in e10s, ServiceWorkerManagerParent.cpp::RecvUnregister() fails to trigger ServiceWorkerManagerChild.cpp::RecvNotifyUnregister(). Instead, the log shows this line right after RecvUnregister() was called:

###!!! [Parent][OnMaybeDequeueOne] Error: Channel closing: too late to send/recv, messages will be lost

- In non-e10s, ServiceWorkerManagerParent.cpp::RecvRegister() is never called, however ServiceWorkerManagerParent.cpp::RecvUnregister() seems to be called. (Both Child methods are never called either).
I mostly copied whatever we already do in RegisterServiceWorkerCallback.
It looks like we miss this bit in order to correctly pipe the unregistration from the child to the parent.

Jan here wasn't getting the nsIServiceWorkerManager to call its listeners (onUnregister callback)
while we obviously see most of the service worker unregister-related methods being called.
I think they were all correctly called, but only in the child?
*I think* as I don't really know this codebase.
Attachment #8711689 - Flags: review?(bkelly)
Is linux64 opt+debug running xpcshell+mochitests enough to cover service worker code changes?
Jan says that it introduces infinite loop. And try is quite unhappy with this patch.
I tried to look around RegisterServiceWorkerCallback versus UnregisterServiceWorkerCallback to see any possible difference, but can't see any. I don't get why it "register" events works fine while "unregister" doesn't.
This call to SendUnregister in middle of UnregisterJob is suspicious.
This is another difference with Register.
With Register, we call SendRegister from StoreRegistration, which
is not directly called from RegisterJob, but seems to be only called
if the registration isn't already registered.
I think we can remove it with the previous modification I made,
that ensure we call SendUnregister when running the UnregisterServiceWorkerCallback.

But again, I badly miss the whole picture.
Attachment #8712085 - Flags: review?(bkelly)
Attachment #8711689 - Attachment is obsolete: true
Attachment #8711689 - Flags: review?(bkelly)
Try is much greener, but there is still some failures like:
  dom/workers/test/serviceworkers/test_third_party_iframes.html
    Timeout
  dom/workers/test/serviceworkers/test_claim_oninstall.html
    Fail: line 44: Claim should reject if the worker is not active
The whole propagation mechanism is a mess.  We're in the process of refactoring our e10s support which will fix a lot of it.  Is it possible to run about:debugging in the child process until then?  I expect this is probably the opposite of what we discussed previously, though. :-\

To fix it now, though, we probably need something like this:

1) Run SWM::Unregister() in the child process.
2) Propagate the removal of the registration from the child-to-parent.

An approximation of this would be:

1) Leave current code paths alone.
2) Add a new chrome-only unregister method that can be called from parent.
3) New method just calls SWM::unregister if run in non-e10s mode.
4) If run in e10s mode, then look for for child process with matching appID.
4a) Then send an IPC message to that child to run SWM::unregister.
4b) If there is no matching child process then just remove the registration from the store.

Unfortunately I won't be able to look at this until Friday.  I have an all day meeting today and then on PTO.
Flags: needinfo?(poirot.alex)
Comment on attachment 8712085 [details] [diff] [review]
Propagate Service worker unregistration to the parent process

Review of attachment 8712085 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks a lot Alex! Your patch fixes all the Unregister-related issues in bug 1212797:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=321043ee37d2&selectedJob=15944212
(only problem remaining is now the GC crash from bug 1241841)

Ben, given our time constraints on SW devtools, it would be awesome to land this small fix instead of waiting for an e10s refactor.
Attachment #8712085 - Flags: feedback+
(In reply to Jan Keromnes [:janx] from comment #10)
> Comment on attachment 8712085 [details] [diff] [review]
> Propagate Service worker unregistration to the parent process
> 
> Review of attachment 8712085 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Thanks a lot Alex! Your patch fixes all the Unregister-related issues in bug
> 1212797:
> 
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=321043ee37d2&selectedJob=15944212
> (only problem remaining is now the GC crash from bug 1241841)

This is surprising! Why do I get failure on my push to try and not on yours (comment 8)?!
Flags: needinfo?(poirot.alex)
(In reply to Alexandre Poirot [:ochameau] from comment #11)
> This is surprising! Why do I get failure on my push to try and not on yours
> (comment 8)?!

I missed your comment about these failures, but my try run only has devtools mochitests (not dom/workers/*).
tracking-e10s: --- → -
Comment on attachment 8712085 [details] [diff] [review]
Propagate Service worker unregistration to the parent process

Review of attachment 8712085 [details] [diff] [review]:
-----------------------------------------------------------------

The unregister propagation code is broken, but this patch is broken in other ways.

What we really need is a message from child-to-parent when the registration is removed, not when unregister is called.  Right now we propagate it in the unregister job before we know if we can actually remove the registration.  So we might wipe it from disk even though a client is keeping the registration alive.

I really think we just need to change the propagation to be for the removal of the registration.  The child propagates to the parent, which then sends to all the other child processes.  If a child gets a remove registration message then it just summarily kills the service worker and removes the registration.

::: dom/workers/ServiceWorkerManager.cpp
@@ -2481,5 @@
>      RefPtr<ServiceWorkerManager> swm = ServiceWorkerManager::GetInstance();
>  
> -    // Could it be that we are shutting down.
> -    if (swm->mActor) {
> -      swm->mActor->SendUnregister(principalInfo, NS_ConvertUTF8toUTF16(mScope));

We can't just remove the child-to-parent notification because then when content script calls unregister() it won't actually be removed from disk in e10s mode.
Attachment #8712085 - Flags: review?(bkelly) → review-
Also, regarding the assumptions in comment 0:  "unregister" does not immediately remove a worker if there is an existing client.  So if you have a controlled window open then the registration will not be removed.
[Tracking Requested - why for this release]: Blocking new Service Worker & Push features that we'd like to announce with Developer Edition 47.
I'll see if I can write a patch for this some time this week or next.
Assignee: nobody → bkelly
Status: NEW → ASSIGNED
Thanks Ben!
There are some more serious compat issues here.  We incorrectly remove the service worker registration even if a window is still controlled by the registration.
Actually, I think things are mostly fine compat-wise.  We do have one bug, but I broke it out as bug 1247436.
No longer blocks: ServiceWorkers-compat
So the way about:serviceworkers does this it to call nsIServiceWorkerManager.propagateUnregister() instead of the registration.unregister():

  https://dxr.mozilla.org/mozilla-central/source/toolkit/content/aboutServiceWorkers.js#172

Can you do something similar in your about:debugging code?
Flags: needinfo?(janx)
I did see this code in about:serviceworkers, and I intend to use it for implementing an "Unregister" button in about:debugging.

However, this doesn't fix the problem described in this bug. What happens is that, when a Service Worker unregisters itself, about:debugging continues to show it, because it never hears about the unregistration (so it wouldn't even know when to call `propagateUnregister()`).

If you're suggesting that I change the test files to call `nsIServiceWorkerManager.propagateUnregister()` instead of `registration.unregister()`, this could indeed make the tests pass, but it would be working around the issue that about:debugging never hears about "Unregister" events, and continues to display unregistered service workers.
Flags: needinfo?(janx)
Attachment #8712085 - Attachment is obsolete: true
Comment on attachment 8711689 [details] [diff] [review]
Propagate Service worker unregistration to the parent process.

This patch is good, but requires some follow-up patches I'm working on to avoid pathological bouncing between the processes.
Attachment #8711689 - Attachment is obsolete: false
Attachment #8711689 - Flags: review+
This breaks the infinite loop by only calling SendUnregister() if we have a live registration.

There are still a few process bounces happening before it dies out, though.  Those could cause problems for code doing an unregister() immediately followed by a register().  I'm working on a 3rd patch to address that.
Attachment #8719023 - Flags: review?(amarchesini)
This causes unregister jobs in the child triggered from a parent notification to not send back to the parent.  The parent has already removed the registration already.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=3522c85bad06
Attachment #8719037 - Flags: review?(amarchesini)
Attachment #8719023 - Flags: review?(amarchesini) → review+
Comment on attachment 8719037 [details] [diff] [review]
P3 Don't send unregister messages when triggered from a PropagateUnregister(). r=baku

Review of attachment 8719037 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/workers/ServiceWorkerManager.cpp
@@ +2592,5 @@
> +                                       const nsAString& aScope)
> +{
> +  AssertIsOnMainThread();
> +
> +  if (!aPrincipal) {

Can this really happen?

@@ +2604,5 @@
> +#ifdef DEBUG
> +  nsCOMPtr<nsIURI> scopeURI;
> +  rv = NS_NewURI(getter_AddRefs(scopeURI), aScope, nullptr, nullptr);
> +  if (NS_WARN_IF(NS_FAILED(rv))) {
> +    return NS_ERROR_DOM_SECURITY_ERR;

why don't you want to expose the rv value?
Attachment #8719037 - Flags: review?(amarchesini) → review+
Attachment #8719592 - Flags: review?(amarchesini) → review+
Update with review feedback.
Attachment #8719037 - Attachment is obsolete: true
Attachment #8721662 - Flags: review+
See try build in comment 25.  This may need rebasing if bug 1237992 lands first.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: