New Register/Update tasks should terminate existing installing worker

RESOLVED FIXED in Firefox 44, Firefox OS v2.5

Status

RESOLVED FIXED
Component: Testing :: web-platform-tests
Reported: 2 years ago
Last updated: 2 years ago

People

(Reporter: noemi, Assigned: catalinb)

Tracking

Version: Trunk
Target Milestone: mozilla45
Points: ---

Firefox Tracking Flags: firefox42 wontfix, firefox43 wontfix, firefox44 fixed, firefox45 fixed, b2g-v2.5 fixed


Attachments

(6 attachments, 12 obsolete attachments)

40 bytes, text/x-review-board-request (baku: review+)
6.94 KB, patch
12.80 KB, patch
30.98 KB, patch
1.84 KB, patch
4.49 KB, patch (Away for a while: review+)
(Reporter)

Description

2 years ago
A test run such as |./mach web-platform-tests _mozilla/service-workers/service-worker/register-wait-forever-in-install-worker.https.html| fails:


Harness status: Timeout

Found 1 tests
1 Timeout
* register worker that calls waitUntil with a promise that never resolves in oninstall -> timed out
Traces: https://pastebin.mozilla.org/8838598
According to the spec, a new Register/Update task should terminate an existing installing worker. We do not do this yet, leading to the test failure. The actual implementation is going to need some thinking.

The easiest thing to do would be to immediately expire the timeout on the installing worker (once the timers patch lands). Then override Cancel() on the lifecycle tasks so that when they get canceled, or when they get destroyed 'uncleanly' (due to the worker shutting down), they dispatch a runnable to the main thread that indicates the task failed, so that the RegisterJob can continue as if the install failed.
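As a rough model of that approach, here is a plain-JavaScript sketch (illustrative names only, not Gecko's actual classes): the lifecycle task's promise may never settle, and a cancel path reports failure so the register job can continue as if the install had failed.

```javascript
// Hypothetical sketch: a lifecycle task wraps the install promise so that
// cancellation (worker termination) resolves the task as a failure instead
// of leaving the job queue stuck forever.
function lifecycleTask(installPromise) {
  let cancel;
  const cancelled = new Promise((_, reject) => {
    cancel = () => reject(new Error("worker terminated"));
  });
  return {
    // done resolves true on success, false if the task was cancelled.
    done: Promise.race([installPromise.then(() => true), cancelled])
      .catch(() => false),
    cancel,
  };
}

// A worker that waits forever in oninstall:
const task = lifecycleTask(new Promise(() => {}));
// A new Register/Update job terminates the installing worker:
task.cancel();
task.done.then(ok => console.log("install succeeded:", ok)); // false
```

This mirrors the idea of overriding Cancel() to dispatch a "task failed" notification: the RegisterJob observing `done` can proceed without ever finding out why the install promise itself never settled.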
(Clarifying title a bit)
Summary: "Harness status: Timeout" when running "register-wait-forever-in-install-worker.https.html" test → New Register/Update tasks should terminate existing installing worker
https://treeherder.mozilla.org/#/jobs?repo=try&revision=0329cdac1c7e
Created attachment 8665641 [details]
MozReview Request: Bug 1189659 - Fail lifecycle task when worker terminates with pending promise. r?baku

Bug 1189659 - Fail lifecycle task when worker terminates with pending promise. r?baku

Dispatches a runnable to the main thread to indicate a failed lifecycle task if
the worker starts shutting down and the lifecycle task promise has not
resolved.

The ServiceWorkerRegisterJob constructors terminate an existing installing
worker. These two together ensure that a new register() or update() can proceed
even if an old job is blocked on installation.

Update web-platform-tests expected data
Attachment #8665641 - Flags: review?(amarchesini)
Comment on attachment 8665641 [details]
MozReview Request: Bug 1189659 - Fail lifecycle task when worker terminates with pending promise. r?baku

https://reviewboard.mozilla.org/r/20281/#review18289

::: testing/web-platform/meta/subresource-integrity/subresource-integrity.html.ini:17
(Diff revision 1)
> +

if you remove empty lines at the end of any .ini file, don't add it here.
Attachment #8665641 - Flags: review?(amarchesini) → review+
https://treeherder.mozilla.org/#/jobs?repo=try&revision=caf016fc1d62

Comment 7

2 years ago
Created attachment 8679199 [details] [diff] [review]
I rebased Nikhil's patch.  It fixes the bug for me locally and all of the tests seem to pass, but on TreeHerder there is quite a bit of breakage: https://treeherder.mozilla.org/#/jobs?repo=try&revision=6a64843e0023

It seems like there are some race conditions which manifest as intermittent test failures.  Note the difference between the debug and opt test runs of mochitest-4.  I haven't really looked into it in detail.

Please ignore the failures in dom/push/test/test_serviceworker_lifetime.html, those are caused by something else in my patch queue.

Comment 8

2 years ago
Bumping this.  I think we want to do something at least to make sure that |event.waitUntil(new Promise(()=>{}));| doesn't create a zombie worker.
Blocks: 1059784

Updated

2 years ago
Blocks: 1189023
No longer blocks: 1180622

Updated

2 years ago
Assignee: nobody → catalin.badea392
(Assignee)

Comment 9

2 years ago
(In reply to Ehsan Akhgari (don't ask for review please) [Away Nov 3-19] from comment #8)
> Bumping this.  I think we want to do something at least to make sure that
> |event.waitUntil(new Promise(()=>{}));| doesn't create a zombie worker.

The zombie worker bug happens because our job queue in ServiceWorkerManager.cpp expects each life cycle promise to be resolved or rejected at some point before the queue can continue.
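A plain-JavaScript sketch of that hazard (illustrative, not the real ServiceWorkerManager code): the queue awaits each lifecycle promise before advancing, so a promise that never settles blocks every later job unless a timeout forces a result.

```javascript
// Toy job queue: each job's promise is awaited before the next job runs,
// so a never-settling promise would block the queue forever without the
// timeout race below.
async function drainQueue(jobs, timeoutMs) {
  const results = [];
  for (const job of jobs) {
    const timedOut = new Promise(resolve =>
      setTimeout(() => resolve("timed-out"), timeoutMs));
    // Without this race, a never-settling job would block here forever.
    results.push(await Promise.race([job(), timedOut]));
  }
  return results;
}

// A job that waits forever in "install", followed by one that would
// otherwise never get a chance to run:
drainQueue([() => new Promise(() => {}), async () => "ok"], 10)
  .then(results => console.log(results)); // ["timed-out", "ok"]
```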

Comment 10

2 years ago
Note: for install events we need to fail the installation if the worker times out, rejects the waitUntil promise, or otherwise fails.

For the activate event, though, we need to allow the SW to proceed to become the active worker.  This is spec'd but was not immediately obvious to me.  See:

  https://github.com/slightlyoff/ServiceWorker/issues/776
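The asymmetry can be sketched like this (illustrative names, assuming only the behavior described above): a rejected or timed-out waitUntil fails installation, but an activating worker becomes active regardless of how its 'activate' promise settles.

```javascript
// Sketch of the install/activate asymmetry; names are illustrative.
async function runLifecycle(eventName, waitUntilPromise) {
  const ok = await waitUntilPromise.then(() => true, () => false);
  if (eventName === "install") {
    // Timeout, rejection, or any other failure aborts the installation.
    return ok ? "installed" : "redundant";
  }
  // "activate": the worker becomes active either way.
  return "activated";
}

// A rejected install waitUntil fails the job...
runLifecycle("install", Promise.reject(new Error("boom")))
  .then(s => console.log(s)); // "redundant"
// ...but a rejected activate waitUntil does not block activation.
runLifecycle("activate", Promise.reject(new Error("boom")))
  .then(s => console.log(s)); // "activated"
```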

Updated

2 years ago
Status: NEW → ASSIGNED
(Assignee)

Comment 11

2 years ago
Created attachment 8689182 [details] [diff] [review]
Part 1 - Continue service worker job queue when life cycle events expire.
(Assignee)

Comment 12

2 years ago
Created attachment 8689678 [details] [diff] [review]
Part 1 - Continue service worker job queue when life cycle events expire.
Attachment #8679199 - Attachment is obsolete: true
Attachment #8689182 - Attachment is obsolete: true
(Assignee)

Comment 13

2 years ago
Created attachment 8689679 [details] [diff] [review]
Part 2 - Remove set of scopes being updated from ServiceWorkerManager.
(Assignee)

Comment 14

2 years ago
Created attachment 8689711 [details] [diff] [review]
Part 3 - Use separate synchronization queues for service worker register jobs and install jobs.

Comment 15

2 years ago
Please note the changes in bug 1186856 that I intend to land later tonight.  That test times out with these patches applied.
Flags: needinfo?(catalin.badea392)
(Assignee)

Comment 16

2 years ago
(In reply to Ben Kelly [:bkelly] from comment #15)
> Please note the changes in bug 1186856 that I intend to land later tonight. 
> That test times out with these patches applied.

Thanks for the heads-up. It's a bug I introduced into the patches during some crazy git rebases.
Flags: needinfo?(catalin.badea392)
(Assignee)

Comment 17

2 years ago
Created attachment 8689789 [details] [diff] [review]
Part 1 - Continue service worker job queue when life cycle events expire.
Attachment #8689678 - Attachment is obsolete: true
Attachment #8689789 - Flags: review?(bkelly)
(Assignee)

Comment 18

2 years ago
Created attachment 8689790 [details] [diff] [review]
Part 2 - Remove set of scopes being updated from ServiceWorkerManager.
Attachment #8689679 - Attachment is obsolete: true
Attachment #8689790 - Flags: review?(bkelly)
(Assignee)

Comment 19

2 years ago
Created attachment 8689792 [details] [diff] [review]
Part 3 - Use separate synchronization queues for service worker register jobs and install jobs.
Attachment #8689711 - Attachment is obsolete: true
Attachment #8689792 - Flags: review?(bkelly)

Comment 20

2 years ago
Comment on attachment 8689789 [details] [diff] [review]
Part 1 - Continue service worker job queue when life cycle events expire.

Review of attachment 8689789 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with comments addressed.

::: dom/workers/ServiceWorkerPrivate.cpp
@@ +355,5 @@
> +  NS_IMETHOD
> +  Cancel() override
> +  {
> +    mCallback->SetResult(false);
> +    MOZ_ALWAYS_TRUE(NS_SUCCEEDED(NS_DispatchToMainThread(mCallback)));

It seems this should call ReportResult(mWorkerPrivate->GetJSContext(), false)?  Right now Cancel() is not setting mDone.

@@ +438,5 @@
> +
> +  void
> +  ReportResult(JSContext* aCx, bool aResult)
> +  {
> +    if (mDone) {

Please call mWorkerPrivate->AssertIsOnWorkerThread() here to make it clear that mDone handling is thread-safe.

@@ +449,5 @@
> +      NS_RUNTIMEABORT("Failed to dispatch life cycle event handler.");
> +    }
> +
> +    mWorkerPrivate->RemoveFeature(aCx, this);
> +    mDone = true;

To minimize the potential for races, I think it's better to set mDone immediately after checking mDone.  That way you can be sure that if anything you call in this method triggers re-entry, it won't double-execute.

@@ +480,5 @@
>      }
>  
>      RefPtr<xpc::ErrorReport> xpcReport = new xpc::ErrorReport();
>      xpcReport->Init(report.report(), report.message(),
> +                    /* aIsChrome = */ false, mWorkerPrivate->WindowID());

This needs to be rebased.  I just removed this error reporting code in bug 1225470.  It has not merged to central yet.

@@ +529,2 @@
>    } else {
> +    watcher->ReportResult(aCx, false);

Is this necessary?  It seems the watcher will get de-referenced and report the failure in its destructor.  The comment above seems to suggest this is what you want to happen.
Attachment #8689789 - Flags: review?(bkelly) → review+

Comment 21

2 years ago
Comment on attachment 8689789 [details] [diff] [review]
Part 1 - Continue service worker job queue when life cycle events expire.

Review of attachment 8689789 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/workers/ServiceWorkerPrivate.cpp
@@ +413,5 @@
> +    MOZ_ASSERT(mWorkerPrivate);
> +    mWorkerPrivate->AssertIsOnWorkerThread();
> +    JSContext* cx = mWorkerPrivate->GetJSContext();
> +
> +    if (NS_WARN_IF(!mWorkerPrivate->AddFeature(cx, this))) {

Can you add a comment about why a feature is needed?  It's not immediately obvious to me what this is protecting against.  Doesn't the ServiceWorkerPrivate hold the worker open while the lifecycle is progressing?

Comment 22

2 years ago
Comment on attachment 8689790 [details] [diff] [review]
Part 2 - Remove set of scopes being updated from ServiceWorkerManager.

Review of attachment 8689790 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with comments.

::: dom/workers/ServiceWorkerPrivate.cpp
@@ +126,5 @@
>    bool
>    WorkerRun(JSContext* aCx, WorkerPrivate* aWorkerPrivate) override
>    {
>      aWorkerPrivate->AssertIsOnWorkerThread();
> +    mCallback->SetResult(aWorkerPrivate->WorkerScriptExecutedSuccessfully());

Nice!  I was not aware of WorkerScriptExecutedSuccessfully.  Much cleaner since we simply need to reject promises with a TypeError and not the specific script error.

@@ +139,5 @@
> +  {
> +    mCallback->SetResult(false);
> +    MOZ_ALWAYS_TRUE(NS_SUCCEEDED(NS_DispatchToMainThread(mCallback)));
> +
> +    mDone = true;

nit: It would be nice if the callback SetResult(), dispatch to main thread, and mDone = true were all done in a single method called from both WorkerRun() and Cancel().  It minimizes the chance we miss some cleanup step.
Attachment #8689790 - Flags: review?(bkelly) → review+
(Assignee)

Comment 23

2 years ago
(In reply to Ben Kelly [:bkelly] from comment #21)
> Comment on attachment 8689789 [details] [diff] [review]
> Part 1 - Continue service worker job queue when life cycle events expire.
> 
> Review of attachment 8689789 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: dom/workers/ServiceWorkerPrivate.cpp
> @@ +413,5 @@
> > +    MOZ_ASSERT(mWorkerPrivate);
> > +    mWorkerPrivate->AssertIsOnWorkerThread();
> > +    JSContext* cx = mWorkerPrivate->GetJSContext();
> > +
> > +    if (NS_WARN_IF(!mWorkerPrivate->AddFeature(cx, this))) {
> 
> Can you add a comment about why a feature is needed?  Its not immediately
> obvious to me what this is protecting against.  Doesn't the
> ServiceWorkerPrivate hold the worker open while the lifecycle is progressing?

It does, but we can still get stuck when registering just one service worker that waits forever in its install handler. In that case we wait for the keepAliveToken to expire and terminate the worker, so we need the feature to catch this and keep the job queue moving by failing the job.
(Assignee)

Comment 24

2 years ago
(In reply to Ben Kelly [:bkelly] from comment #20)
> Comment on attachment 8689789 [details] [diff] [review]
> Part 1 - Continue service worker job queue when life cycle events expire.
> 
> Review of attachment 8689789 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> r=me with comments addressed.
> 
> ::: dom/workers/ServiceWorkerPrivate.cpp
> @@ +355,5 @@
> > +  NS_IMETHOD
> > +  Cancel() override
> > +  {
> > +    mCallback->SetResult(false);
> > +    MOZ_ALWAYS_TRUE(NS_SUCCEEDED(NS_DispatchToMainThread(mCallback)));
> 
> It seems this should call ReportResult(mWorkerPrivate->GetJSContext(),
> false)?  Right now Cancel() is not setting mDone.
Well, ReportResult is a member of LifeCycleEventWatcher, which is only created if the runnable is executed (WorkerRun -> DispatchLifeCycleEvent).
The assumption is that WorkerRun and Cancel are never both called for the same runnable. Should I still make this change?

> @@ +529,2 @@
> >    } else {
> > +    watcher->ReportResult(aCx, false);
> 
> Is this necessary?  It seems the watcher will get de-referenced and report
> the failure in its destructor.  The comment above seems to suggest this is
> what you want to happen.
I usually don't like to rely on destructor logic because the cycle collector may delay things, but I think
you're right.

Comment 25

2 years ago
Comment on attachment 8689792 [details] [diff] [review]
Part 3 - Use separate synchronization queues for service worker register jobs and install jobs.

Review of attachment 8689792 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with comments addressed.  Thanks!

::: dom/workers/ServiceWorkerManager.cpp
@@ +153,5 @@
>    NS_DECL_ISUPPORTS
>  
> +  enum QueueType
> +  {
> +    // Adding a new entry here requires updating ServiceWorkerQueue::GetQueue.

Do we have a bug to add an activation queue as defined in the spec?  Or do we handle those semantics in a different way?

@@ +155,5 @@
> +  enum QueueType
> +  {
> +    // Adding a new entry here requires updating ServiceWorkerQueue::GetQueue.
> +    RegistrationQueue,
> +    InstallationQueue

We already have a JobType enumeration in the ServiceWorkerRegisterJob.  Can we pull that out to a public enumeration and then add installing to it?

Then we can have a single type used for both the queue and the jobs.  The JobQueue could internally map register/update to the registration queue.

I think having multiple enumerations that partly overlap, but don't fully line up will be confusing to maintain in the future.

@@ +201,5 @@
> +        NS_WARNING("Pending/running jobs still around on shutdown!");
> +      }
> +    }
> +
> +    nsTArray<RefPtr<ServiceWorkerJob>> mJobs;

I assume this is only accessed on main thread, but the JobQueue code seems to be missing thread assertions.  Can you add AssertIsOnMainThread() throughout the JobQueue methods?

@@ +239,5 @@
>    void
> +  CancelJobs(QueueData& aQueue);
> +
> +  QueueData&
> +  GetQueue(ServiceWorkerJob::QueueType aType)

Rather than expose an owned internal data structure that then gets operated on, I think it would be cleaner to simply add the type code to public Append() and Pop() methods.  Something like:

  mJobQueue->Append(mJobType, aJob);

And

  mJobQueue->Pop(mJobType);

This makes the public interface of the JobQueue clear and doesn't require multi-step state to be handled properly by the caller.  All the checking can be enforced internal to the JobQueue itself.

@@ +954,5 @@
> +  // Done() with the failure code instead.
> +  // Callers MUST hold a strong ref before calling this!
> +  void
> +  Fail(ErrorResult& aRv)
> +  {

Are there any changes to this method?  Or was it just moved into the base class? I'm going to just assume it was moved without changes for now.

@@ +1127,5 @@
> +      mRegistration->mWaitingWorker->WorkerPrivate()->TerminateWorker();
> +      mRegistration->mWaitingWorker->UpdateState(ServiceWorkerState::Redundant);
> +
> +      nsresult rv = serviceWorkerScriptCache::PurgeCache(mRegistration->mPrincipal,
> +                                                         mRegistration->mWaitingWorker->CacheName());

nit: I know you just moved this line, but line wrapping.

@@ +1425,5 @@
>  void
>  ServiceWorkerJobQueue::CancelJobs()
>  {
> +  CancelJobs(mRegistrationJobQueue);
> +  CancelJobs(mInstallationJobQueue);

Does ordering matter here?  If so, it might be good to comment on that.  I assume canceling the registration queue first is best so as not to fill the installation queue after it's been cleared.

@@ -1529,2 @@
>    ServiceWorkerJobQueue* queue = GetOrCreateJobQueue(originSuffix, cleanedScope);
> -  MOZ_ASSERT(queue);

It seems this assertion is dropped, but queue is still used below even if it's nullptr?  It's hard to tell if another patch changed that because it's outside the splinter diff context.
Attachment #8689792 - Flags: review?(bkelly) → review+

Comment 26

2 years ago
(In reply to Cătălin Badea (:catalinb) from comment #23)
> It does, but we might get stuck when register just 1 service worker and that
> service workers waits
> forever in the install handler. In this case, we wait for the keepAliveToken
> to expire and terminate the
> worker, so we need the feature to catch this and continue the job, well, to
> fail the job.

Ok, I think a comment to that effect would be useful.

(In reply to Cătălin Badea (:catalinb) from comment #24)
> Well, ReportResult is a member of LifeCycleEventWatcher, which is only
> created if the runnable is executed (WorkerRun -> DispatchLifeCycleEvent).
> The assumption is that WorkerRun and Cancel are never both called for the
> same runnable. Should I still make this change?

Oh, good point.  I missed that they were different objects.  You can forget this one.

> > Is this necessary?  It seems the watcher will get de-referenced and report
> > the failure in its destructor.  The comment above seems to suggest this is
> > what you want to happen.
> I usually don't like to rely on destructor logic because the cycle collector
> may delay things, but I think
> you're right.

Either way is fine.  This one was more of a nit.
(Assignee)

Comment 27

2 years ago
(In reply to Ben Kelly [:bkelly] from comment #25)
> Comment on attachment 8689792 [details] [diff] [review]
> Part 3 - Use separate synchronization queues for service worker register
> jobs and install jobs.
> 
> Review of attachment 8689792 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> r=me with comments addressed.  Thanks!
> 
> ::: dom/workers/ServiceWorkerManager.cpp
> @@ +153,5 @@
> >    NS_DECL_ISUPPORTS
> >  
> > +  enum QueueType
> > +  {
> > +    // Adding a new entry here requires updating ServiceWorkerQueue::GetQueue.
> 
> Do we have a bug to add an activation queue as defined in the spec?  Or do
> we handle those semantics in a different way?
We handle that part by racing :). I couldn't think of a situation where this has a negative
side effect, though. Filed bug 1226441 to investigate it later.

> @@ +155,5 @@
> > +  enum QueueType
> > +  {
> > +    // Adding a new entry here requires updating ServiceWorkerQueue::GetQueue.
> > +    RegistrationQueue,
> > +    InstallationQueue
> 
> We already have a JobType enumeration in the ServiceWorkerRegisterJob.  Can
> we pull that out to a public enumeration and then add installing to it?
> 
> Then we can have a single type used for both the queue and the jobs.  The
> JobQueue could internally map register/update to the registration queue.
> 
> I think having multiple enumerations that partly overlap, but don't fully
> line up will be confusing to maintain in the future.
Good point!
 
> @@ +954,5 @@
> > +  // Done() with the failure code instead.
> > +  // Callers MUST hold a strong ref before calling this!
> > +  void
> > +  Fail(ErrorResult& aRv)
> > +  {
> 
> Are there any changes to this method?  Or was it just moved into the base
> class? I'm going to just assume it was moved without changes for now.
There is a small change: the previous version would check if mRegistration is null and either
use mRegistration->mScope/mScriptSpec or mScope/mScriptSpec for error reporting:
https://dxr.mozilla.org/mozilla-central/source/dom/workers/ServiceWorkerManager.cpp?from=ServiceWorkerManager.cpp#1121

However, mRegistration is never actually null when we call Fail(), this was already enforced by a "hidden" assert in MaybeRemoveRegistration which is always called from Fail().
https://dxr.mozilla.org/mozilla-central/source/dom/workers/ServiceWorkerManager.cpp?from=ServiceWorkerManager.cpp#1153
I added an assert to make this obvious and changed it to always use mRegistration when constructing the ErrorResult. ServiceWorkerJobBase doesn't have the mScope/mScriptSpec members.

The other change is that Fail is now a protected member of ServiceWorkerJobBase, which I think is better.
The reason it was public before was to be accessible from HandleError.
 

> @@ +1425,5 @@
> >  void
> >  ServiceWorkerJobQueue::CancelJobs()
> >  {
> > +  CancelJobs(mRegistrationJobQueue);
> > +  CancelJobs(mInstallationJobQueue);
> 
> Does ordering matter here?  If so, it might be good to comment on that.  I
> assume canceling registration queue first is best so not to fill the
> installation queue after its been cleared.
It doesn't really matter; Cancel() just sets a flag on the job objects. I'll add a comment.
 
> @@ -1529,2 @@
> >    ServiceWorkerJobQueue* queue = GetOrCreateJobQueue(originSuffix, cleanedScope);
> > -  MOZ_ASSERT(queue);
> 
> It seems this assertion is dropped, but queue is still used below even if
> its nullptr?  Its hard to tell if another patch changed that because its off
> the splinter diff context.
This change was made by mistake.

Thanks for the review!

Comment 28

2 years ago
Unfortunately these patches seem to break the updates triggered by navigation or registration.

Go to this page:

  https://adriftwith.me/sandbox/service_worker_reload/go.html

Open the web console and refresh a few times.

Without these patches we get an install message with a new time on every refresh.  This is due to the SW updating.

With these patches we don't get the console message.  In addition, checking about:serviceworkers shows that the service worker is not updating.  The uuid remains unchanged.

Can you please investigate what is going on here?  I don't think we can include these patches in the release if they break updates.
Flags: needinfo?(catalin.badea392)
(Assignee)

Comment 29

2 years ago
(In reply to Ben Kelly [:bkelly] from comment #28)
> Unfortunately these patches seem to break the updates triggered by
> navigation or registration.
> 
> Go to this page:
> 
>   https://adriftwith.me/sandbox/service_worker_reload/go.html
> 
> Open the web console and refresh a few times.
> 
> Without these patches we get an install message with a new time on every
> refresh.  This is due to the SW updating.
> 
> With these patches we don't get the console message.  In addition, checking
> about:serviceworkers shows that the service worker is not updating.  The
> uuid remains unchanged.
> 
> Can you please investigate what is going on here?  I don't think we can
> include these patches in the release if they break updates.

I'm still investigating this, but it kind of works for me.

Testing with yesterday's m-c + my patches:
Refreshing the web page logs the install and activate messages in the browser console (not the web console).
They appear twice for each refresh: 2 install events and 2 activate events. Is this because we first try a soft update and then register() performs another update?
Next, refreshing about:serviceworkers after each refresh of the demo page reveals a new active worker cache UUID and a new waiting worker as well.

Testing with today's inbound + my patches:
Refreshing the web page again logs to the browser console, but only once this time (1 install + 1 activate message).
Refreshing about:serviceworkers again reveals new UUIDs for the active and waiting workers.

What's the patch queue you're using?
Flags: needinfo?(catalin.badea392)

Comment 30

2 years ago
You see it on each refresh?  I saw the events on the first time the page loads, but not when hitting refresh repeatedly.

Comment 31

2 years ago
> What's the patch queue you're using?

Just the patches from this bug.  Do you have local changes compared to what is uploaded here?
(Assignee)

Comment 32

2 years ago
Created attachment 8690130 [details] [diff] [review]
Part 1 - Continue service worker job queue when life cycle events expire.
Attachment #8689789 - Attachment is obsolete: true
(Assignee)

Comment 33

2 years ago
Created attachment 8690131 [details] [diff] [review]
Part 2 - Remove set of scopes being updated from ServiceWorkerManager.
Attachment #8689790 - Attachment is obsolete: true
(Assignee)

Comment 34

2 years ago
Created attachment 8690132 [details] [diff] [review]
Part 3 - Use separate synchronization queues for service worker register jobs and install jobs.
Attachment #8689792 - Attachment is obsolete: true
(Assignee)

Comment 35

2 years ago
Created attachment 8690144 [details] [diff] [review]
Part 4 - Fix race in test_install_event.html.

The test does the following:
1. Registers a service worker and waits for the updatefound event.
2. Registers a new service worker that throws in the install handler.
3. Calls getRegistration and checks that there's an active worker.

The problem is that if the first service worker doesn't finish installing by
the time the second registration reaches step 4.2 of [[Update]], it will be
terminated and never reach the activating state.
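A minimal model of the race (illustrative, not the actual test harness): if the second register() runs while the first worker is still installing, step 4.2 of [[Update]] terminates it, so the first worker never reaches the activating state.

```javascript
// Toy state machine for the race described above; names are illustrative.
function makeWorker() {
  return { state: "installing" };
}

// Install completes only if the worker was not terminated in the meantime.
function finishInstall(worker) {
  if (worker.state === "installing") worker.state = "activated";
}

// Models [[Update]] step 4.2: terminate an existing installing worker.
function registerNew(existing) {
  if (existing.state === "installing") existing.state = "redundant";
  return makeWorker();
}

// Racy ordering: the second registration arrives before install finishes.
const first = makeWorker();
registerNew(first);
finishInstall(first); // too late: first was already terminated
console.log(first.state); // "redundant" - it never becomes "activated"
```

Serializing the steps (letting the first install finish before the second register) avoids the race, which is what the test fix does.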

Comment 36

2 years ago
Updates look good with latest patches.  Sorry for comment 28.  I must have been running with the patches that had the bad git rebase.
(Assignee)

Updated

2 years ago
Attachment #8690144 - Flags: review?(bkelly)
(Assignee)

Comment 37

2 years ago
(In reply to Cătălin Badea (:catalinb) from comment #35)
> Created attachment 8690144 [details] [diff] [review]
> Part 4 - Fix race in test_install_event.html.
> 
> The test does the following:
> 1. registers a service worker and waits for the updatefound event.
> 2. Registers a new service worker that throws in the install handler
> 3. Next it calls getRegistration and checks that there's an active worker
> 
> The problem is that if the first service worker doesn't finish installing by
> the time the second registration reaches step 4.2 from [[Update]] it will be
> terminated and never reach activating state.

This is the try run with the mochitest failing.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=8a2c0b180c6b

Updated

2 years ago
Attachment #8690144 - Flags: review?(bkelly) → review+

Updated

2 years ago
Blocks: 1226441

Comment 38

2 years ago
Catalin, what's the status of this bug?
Flags: needinfo?(catalin.badea392)
(Assignee)

Comment 39

2 years ago
Just need to land bug 1227932 first.
Depends on: 1227932
Flags: needinfo?(catalin.badea392)
(Assignee)

Comment 40

2 years ago
Created attachment 8692143 [details] [diff] [review]
Part 1 - Continue service worker job queue when life cycle events expire.

rebased.
Attachment #8690130 - Attachment is obsolete: true
(Assignee)

Comment 41

2 years ago
Created attachment 8692144 [details] [diff] [review]
Part 2 - Remove set of scopes being updated from ServiceWorkerManager.

rebased
Attachment #8690131 - Attachment is obsolete: true
(Assignee)

Comment 42

2 years ago
Created attachment 8692145 [details] [diff] [review]
Part 3 - Use separate synchronization queues for service worker register jobs and install jobs.

rebased.
Attachment #8690132 - Attachment is obsolete: true
(Assignee)

Comment 43

2 years ago
Created attachment 8692146 [details] [diff] [review]
Part 4 - Fix race in test_install_event.html.
Attachment #8690144 - Attachment is obsolete: true
(Assignee)

Comment 44

2 years ago
Created attachment 8692480 [details] [diff] [review]
Part 5 - Fix race in skip-waiting.https.html and add some logging for SkipWaitingFlag in ServiceWorkerManager.

In test-skip-waiting.html it is possible for all of the skipWaiting() calls
to run before the worker reaches the "installing" state, in which case all of
them fail. This can cause the test to time out. Filed 1189659 to fix the
actual bug, and tweaked the test to always pass for now.
Attachment #8692480 - Flags: review?(ehsan)

Updated

2 years ago
Attachment #8692480 - Flags: review?(ehsan) → review+

Comment 45

2 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/48ea5c3f2214
https://hg.mozilla.org/integration/mozilla-inbound/rev/8a417a349a61
https://hg.mozilla.org/integration/mozilla-inbound/rev/a4376ddc87b5
https://hg.mozilla.org/integration/mozilla-inbound/rev/7a96276122dc
https://hg.mozilla.org/integration/mozilla-inbound/rev/9424e969d623
(Assignee)

Comment 46

2 years ago
https://treeherder.mozilla.org/#/jobs?repo=try&revision=00c64507dfcb&selectedJob=14148485
(Assignee)

Comment 47

2 years ago
Comment on attachment 8692143 [details] [diff] [review]
Part 1 - Continue service worker job queue when life cycle events expire.

Approval Request Comment
Request for entire patch series (1-5)
[Feature/regressing bug #]: Service Workers
[User impact if declined]: SW will ship in 44 and this fixes a bug in common usage scenarios.
[Describe test coverage new/current, TreeHerder]: covered by web platform tests
[Risks and why]: Low, SW targeted change.
[String/UUID change made/needed]: None
Attachment #8692143 - Flags: approval-mozilla-aurora?

Comment 48

2 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/48ea5c3f2214
https://hg.mozilla.org/mozilla-central/rev/8a417a349a61
https://hg.mozilla.org/mozilla-central/rev/a4376ddc87b5
https://hg.mozilla.org/mozilla-central/rev/7a96276122dc
https://hg.mozilla.org/mozilla-central/rev/9424e969d623
Status: ASSIGNED → RESOLVED
Last Resolved: 2 years ago
status-firefox45: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla45

Updated

2 years ago
status-firefox44: --- → affected
Comment on attachment 8692143 [details] [diff] [review]
Part 1 - Continue service worker job queue when life cycle events expire.

SW is planned for FF44 and these patches have been in Nightly for a few days, seems safe to uplift to Aurora44.
Attachment #8692143 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment on attachment 8692144 [details] [diff] [review]
Part 2 - Remove set of scopes being updated from ServiceWorkerManager.

Aurora44+
Attachment #8692144 - Flags: approval-mozilla-aurora+
Comment on attachment 8692145 [details] [diff] [review]
Part 3 - Use separate synchronization queues for service worker register jobs and install jobs.

Aurora44+
Attachment #8692145 - Flags: approval-mozilla-aurora+
Comment on attachment 8692146 [details] [diff] [review]
Part 4 - Fix race in test_install_event.html.

Aurora44+
Attachment #8692146 - Flags: approval-mozilla-aurora+
Comment on attachment 8692480 [details] [diff] [review]
Part 5 - Fix race in skip-waiting.https.html and add some logging for SkipWaitingFlag in ServiceWorkerManager.

Aurora44+
Attachment #8692480 - Flags: approval-mozilla-aurora+
Problems during uplift of part 2:

grafting 317032:8a417a349a61 "Bug 1189659 - Part 2 - Remove set of scopes being updated from ServiceWorkerManager. r=bkelly"
merging dom/workers/ServiceWorkerManager.cpp
merging dom/workers/ServiceWorkerPrivate.cpp
merging dom/workers/ServiceWorkerPrivate.h
warning: conflicts while merging dom/workers/ServiceWorkerManager.cpp! (edit, then use 'hg resolve --mark')
Flags: needinfo?(catalin.badea392)
(Assignee)

Comment 55

2 years ago
remote:   https://hg.mozilla.org/releases/mozilla-aurora/rev/fd9f740bc3f9
remote:   https://hg.mozilla.org/releases/mozilla-aurora/rev/fb97e67dde7d
remote:   https://hg.mozilla.org/releases/mozilla-aurora/rev/e0ad70c2b1ee
remote:   https://hg.mozilla.org/releases/mozilla-aurora/rev/eacf4b25bd8a
remote:   https://hg.mozilla.org/releases/mozilla-aurora/rev/d57fc3a1dda5
status-firefox42: affected → wontfix
status-firefox43: --- → wontfix
status-firefox44: affected → fixed
Flags: needinfo?(catalin.badea392)

Comment 56

2 years ago
bugherder uplift
https://hg.mozilla.org/releases/mozilla-b2g44_v2_5/rev/fd9f740bc3f9
https://hg.mozilla.org/releases/mozilla-b2g44_v2_5/rev/fb97e67dde7d
https://hg.mozilla.org/releases/mozilla-b2g44_v2_5/rev/e0ad70c2b1ee
https://hg.mozilla.org/releases/mozilla-b2g44_v2_5/rev/eacf4b25bd8a
https://hg.mozilla.org/releases/mozilla-b2g44_v2_5/rev/d57fc3a1dda5
status-b2g-v2.5: --- → fixed