New Register/Update tasks should terminate existing installing worker

RESOLVED FIXED in Firefox 44, Firefox OS v2.5

Status

RESOLVED FIXED
Component: Testing :: web-platform-tests
Reported: 2 years ago
Last updated: 2 years ago

People

(Reporter: noemi, Assigned: catalinb)

Tracking

Version: Trunk
Target Milestone: mozilla45
Points: ---

Firefox Tracking Flags: firefox42 wontfix, firefox43 wontfix, firefox44 fixed, firefox45 fixed, b2g-v2.5 fixed


Attachments

(6 attachments, 12 obsolete attachments)

40 bytes, text/x-review-board-request (baku: review+)
6.94 KB, patch
12.80 KB, patch
30.98 KB, patch
1.84 KB, patch
4.49 KB, patch (Away for a while: review+)
(Reporter)

Description

2 years ago
A test run such as |./mach web-platform-tests _mozilla/service-workers/service-worker/register-wait-forever-in-install-worker.https.html| fails:


Harness status: Timeout

Found 1 tests
1 Timeout
* register worker that calls waitUntil with a promise that never resolves in oninstall -> timed out
Traces: https://pastebin.mozilla.org/8838598
According to the spec, a new Register/Update task should terminate an existing installing worker. We do not do this yet, leading to the test failure. The actual implementation is going to need some thinking.

The easiest thing to do would be to immediately expire the timeout on the installing worker (once the timers patch lands). Then override Cancel() on the lifecycle tasks so that when they get canceled, or when they get destroyed 'uncleanly' (due to the worker shutting down), they dispatch a runnable to the main thread that indicates the task failed, so that the RegisterJob can continue as if the install failed.
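As a rough model of that approach, here is a plain-JavaScript sketch (illustrative names only, not Gecko's actual classes): the lifecycle task's promise may never settle, and a cancel path reports failure so the register job can continue as if the install had failed.

```javascript
// Hypothetical sketch: a lifecycle task wraps the install promise so that
// cancellation (worker termination) resolves the task as a failure instead
// of leaving the job queue stuck forever.
function lifecycleTask(installPromise) {
  let cancel;
  const cancelled = new Promise((_, reject) => {
    cancel = () => reject(new Error("worker terminated"));
  });
  return {
    // done resolves true on success, false if the task was cancelled.
    done: Promise.race([installPromise.then(() => true), cancelled])
      .catch(() => false),
    cancel,
  };
}

// A worker that waits forever in oninstall:
const task = lifecycleTask(new Promise(() => {}));
// A new Register/Update job terminates the installing worker:
task.cancel();
task.done.then(ok => console.log("install succeeded:", ok)); // false
```

This mirrors the idea of overriding Cancel() to dispatch a "task failed" notification: the RegisterJob observing `done` can proceed without ever finding out why the install promise itself never settled.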
(Clarifying title a bit)
Summary: "Harness status: Timeout" when running "register-wait-forever-in-install-worker.https.html" test → New Register/Update tasks should terminate existing installing worker
https://treeherder.mozilla.org/#/jobs?repo=try&revision=0329cdac1c7e
Created attachment 8665641 [details]
MozReview Request: Bug 1189659 - Fail lifecycle task when worker terminates with pending promise. r?baku

Bug 1189659 - Fail lifecycle task when worker terminates with pending promise. r?baku

Dispatches a runnable to the main thread to indicate a failed lifecycle task if
the worker starts shutting down and the lifecycle task promise has not
resolved.

The ServiceWorkerRegisterJob constructors terminate an existing installing
worker. These two together ensure that a new register() or update() can proceed
even if an old job is blocked on installation.

Update web-platform-tests expected data
Attachment #8665641 - Flags: review?(amarchesini)
Comment on attachment 8665641 [details]
MozReview Request: Bug 1189659 - Fail lifecycle task when worker terminates with pending promise. r?baku

https://reviewboard.mozilla.org/r/20281/#review18289

::: testing/web-platform/meta/subresource-integrity/subresource-integrity.html.ini:17
(Diff revision 1)
> +

if you remove empty lines at the end of any .ini file, don't add it here.
Attachment #8665641 - Flags: review?(amarchesini) → review+
https://treeherder.mozilla.org/#/jobs?repo=try&revision=caf016fc1d62

Comment 7

2 years ago
Created attachment 8679199 [details] [diff] [review]
I rebased Nikhil's patch.  It fixes the bug for me locally and all of the tests seem to pass, but on TreeHerder there is quite a bit of breakage: https://treeherder.mozilla.org/#/jobs?repo=try&revision=6a64843e0023

It seems like there are some race conditions which manifest as intermittent test failures.  Note the difference between the debug and opt test runs of mochitest-4.  I haven't really looked into it in detail.

Please ignore the failures in dom/push/test/test_serviceworker_lifetime.html, those are caused by something else in my patch queue.

Comment 8

2 years ago
Bumping this.  I think we want to do something at least to make sure that |event.waitUntil(new Promise(()=>{}));| doesn't create a zombie worker.
Blocks: 1059784

Updated

2 years ago
Blocks: 1189023
No longer blocks: 1180622

Updated

2 years ago
Assignee: nobody → catalin.badea392
(Assignee)

Comment 9

2 years ago
(In reply to Ehsan Akhgari (don't ask for review please) [Away Nov 3-19] from comment #8)
> Bumping this.  I think we want to do something at least to make sure that
> |event.waitUntil(new Promise(()=>{}));| doesn't create a zombie worker.

The zombie worker bug happens because our job queue in ServiceWorkerManager.cpp expects each life cycle promise to be resolved or rejected at some point before the queue can continue.
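A plain-JavaScript sketch of that hazard (illustrative, not the real ServiceWorkerManager code): the queue awaits each lifecycle promise before advancing, so a promise that never settles blocks every later job unless a timeout forces a result.

```javascript
// Toy job queue: each job's promise is awaited before the next job runs,
// so a never-settling promise would block the queue forever without the
// timeout race below.
async function drainQueue(jobs, timeoutMs) {
  const results = [];
  for (const job of jobs) {
    const timedOut = new Promise(resolve =>
      setTimeout(() => resolve("timed-out"), timeoutMs));
    // Without this race, a never-settling job would block here forever.
    results.push(await Promise.race([job(), timedOut]));
  }
  return results;
}

// A job that waits forever in "install", followed by one that would
// otherwise never get a chance to run:
drainQueue([() => new Promise(() => {}), async () => "ok"], 10)
  .then(results => console.log(results)); // ["timed-out", "ok"]
```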

Comment 10

2 years ago
Note: for install events we need to fail the installation if the worker times out, rejects the waitUntil promise, or otherwise fails.

For the activate event, though, we need to allow the SW to proceed to become the active worker.  This is spec'd but was not immediately obvious to me.  See:

  https://github.com/slightlyoff/ServiceWorker/issues/776
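The asymmetry can be sketched like this (illustrative names, assuming only the behavior described above): a rejected or timed-out waitUntil fails installation, but an activating worker becomes active regardless of how its 'activate' promise settles.

```javascript
// Sketch of the install/activate asymmetry; names are illustrative.
async function runLifecycle(eventName, waitUntilPromise) {
  const ok = await waitUntilPromise.then(() => true, () => false);
  if (eventName === "install") {
    // Timeout, rejection, or any other failure aborts the installation.
    return ok ? "installed" : "redundant";
  }
  // "activate": the worker becomes active either way.
  return "activated";
}

// A rejected install waitUntil fails the job...
runLifecycle("install", Promise.reject(new Error("boom")))
  .then(s => console.log(s)); // "redundant"
// ...but a rejected activate waitUntil does not block activation.
runLifecycle("activate", Promise.reject(new Error("boom")))
  .then(s => console.log(s)); // "activated"
```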

Updated

2 years ago
Status: NEW → ASSIGNED
(Assignee)

Comment 11

2 years ago
Created attachment 8689182 [details] [diff] [review]
Part 1 - Continue service worker job queue when life cycle events expire.
(Assignee)

Comment 12

2 years ago
Created attachment 8689678 [details] [diff] [review]
Part 1 - Continue service worker job queue when life cycle events expire.
Attachment #8679199 - Attachment is obsolete: true
Attachment #8689182 - Attachment is obsolete: true
(Assignee)

Comment 13

2 years ago
Created attachment 8689679 [details] [diff] [review]
Part 2 - Remove set of scopes being updated from ServiceWorkerManager.
(Assignee)

Comment 14

2 years ago
Created attachment 8689711 [details] [diff] [review]
Part 3 - Use separate synchronization queues for service worker register jobs and install jobs.

Comment 15

2 years ago
Please note the changes in bug 1186856 that I intend to land later tonight.  That test times out with these patches applied.
Flags: needinfo?(catalin.badea392)
(Assignee)

Comment 16

2 years ago
(In reply to Ben Kelly [:bkelly] from comment #15)
> Please note the changes in bug 1186856 that I intend to land later tonight. 
> That test times out with these patches applied.

Thanks for the heads-up. It's a bug I introduced into the patches during some crazy git rebases.
Flags: needinfo?(catalin.badea392)
(Assignee)

Comment 17

2 years ago
Created attachment 8689789 [details] [diff] [review]
Part 1 - Continue service worker job queue when life cycle events expire.
Attachment #8689678 - Attachment is obsolete: true
Attachment #8689789 - Flags: review?(bkelly)
(Assignee)

Comment 18

2 years ago
Created attachment 8689790 [details] [diff] [review]
Part 2 - Remove set of scopes being updated from ServiceWorkerManager.
Attachment #8689679 - Attachment is obsolete: true
Attachment #8689790 - Flags: review?(bkelly)
(Assignee)

Comment 19

2 years ago
Created attachment 8689792 [details] [diff] [review]
Part 3 - Use separate synchronization queues for service worker register jobs and install jobs.
Attachment #8689711 - Attachment is obsolete: true
Attachment #8689792 - Flags: review?(bkelly)

Comment 20

2 years ago
Comment on attachment 8689789 [details] [diff] [review]
Part 1 - Continue service worker job queue when life cycle events expire.

Review of attachment 8689789 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with comments addressed.

::: dom/workers/ServiceWorkerPrivate.cpp
@@ +355,5 @@
> +  NS_IMETHOD
> +  Cancel() override
> +  {
> +    mCallback->SetResult(false);
> +    MOZ_ALWAYS_TRUE(NS_SUCCEEDED(NS_DispatchToMainThread(mCallback)));

It seems this should call ReportResult(mWorkerPrivate->GetJSContext(), false)?  Right now Cancel() is not setting mDone.

@@ +438,5 @@
> +
> +  void
> +  ReportResult(JSContext* aCx, bool aResult)
> +  {
> +    if (mDone) {

Please call mWorkerPrivate->AssertIsOnWorkerThread() here to make it clear that mDone handling is thread-safe.

@@ +449,5 @@
> +      NS_RUNTIMEABORT("Failed to dispatch life cycle event handler.");
> +    }
> +
> +    mWorkerPrivate->RemoveFeature(aCx, this);
> +    mDone = true;

To minimize the potential for races, I think it's better to set mDone immediately after checking mDone.  That way you can be sure that if anything you call in this method triggers re-entry, it won't double-execute.

@@ +480,5 @@
>      }
>  
>      RefPtr<xpc::ErrorReport> xpcReport = new xpc::ErrorReport();
>      xpcReport->Init(report.report(), report.message(),
> +                    /* aIsChrome = */ false, mWorkerPrivate->WindowID());

This needs to be rebased.  I just removed this error reporting code in bug 1225470.  It has not merged to central yet.

@@ +529,2 @@
>    } else {
> +    watcher->ReportResult(aCx, false);

Is this necessary?  It seems the watcher will get de-referenced and report the failure in its destructor.  The comment above seems to suggest this is what you want to happen.
Attachment #8689789 - Flags: review?(bkelly) → review+

Comment 21

2 years ago
Comment on attachment 8689789 [details] [diff] [review]
Part 1 - Continue service worker job queue when life cycle events expire.

Review of attachment 8689789 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/workers/ServiceWorkerPrivate.cpp
@@ +413,5 @@
> +    MOZ_ASSERT(mWorkerPrivate);
> +    mWorkerPrivate->AssertIsOnWorkerThread();
> +    JSContext* cx = mWorkerPrivate->GetJSContext();
> +
> +    if (NS_WARN_IF(!mWorkerPrivate->AddFeature(cx, this))) {

Can you add a comment about why a feature is needed?  It's not immediately obvious to me what this is protecting against.  Doesn't the ServiceWorkerPrivate hold the worker open while the lifecycle is progressing?

Comment 22

2 years ago
Comment on attachment 8689790 [details] [diff] [review]
Part 2 - Remove set of scopes being updated from ServiceWorkerManager.

Review of attachment 8689790 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with comments.

::: dom/workers/ServiceWorkerPrivate.cpp
@@ +126,5 @@
>    bool
>    WorkerRun(JSContext* aCx, WorkerPrivate* aWorkerPrivate) override
>    {
>      aWorkerPrivate->AssertIsOnWorkerThread();
> +    mCallback->SetResult(aWorkerPrivate->WorkerScriptExecutedSuccessfully());

Nice!  I was not aware of WorkerScriptExecutedSuccessfully.  Much cleaner since we simply need to reject promises with a TypeError and not the specific script error.

@@ +139,5 @@
> +  {
> +    mCallback->SetResult(false);
> +    MOZ_ALWAYS_TRUE(NS_SUCCEEDED(NS_DispatchToMainThread(mCallback)));
> +
> +    mDone = true;

nit: It would be nice if the callback SetResult(), dispatch to main thread, and mDone = true were all done in a single method called from both WorkerRun() and Cancel().  It minimizes the chance we miss some cleanup step.
Attachment #8689790 - Flags: review?(bkelly) → review+
(Assignee)

Comment 23

2 years ago
(In reply to Ben Kelly [:bkelly] from comment #21)
> Comment on attachment 8689789 [details] [diff] [review]
> Part 1 - Continue service worker job queue when life cycle events expire.
> 
> Review of attachment 8689789 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: dom/workers/ServiceWorkerPrivate.cpp
> @@ +413,5 @@
> > +    MOZ_ASSERT(mWorkerPrivate);
> > +    mWorkerPrivate->AssertIsOnWorkerThread();
> > +    JSContext* cx = mWorkerPrivate->GetJSContext();
> > +
> > +    if (NS_WARN_IF(!mWorkerPrivate->AddFeature(cx, this))) {
> 
> Can you add a comment about why a feature is needed?  Its not immediately
> obvious to me what this is protecting against.  Doesn't the
> ServiceWorkerPrivate hold the worker open while the lifecycle is progressing?

It does, but we can still get stuck when registering just one service worker that waits forever in its install handler. In that case we wait for the keepAliveToken to expire and terminate the worker, so we need the feature to catch this and keep the job queue moving by failing the job.
(Assignee)

Comment 24

2 years ago
(In reply to Ben Kelly [:bkelly] from comment #20)
> Comment on attachment 8689789 [details] [diff] [review]
> Part 1 - Continue service worker job queue when life cycle events expire.
> 
> Review of attachment 8689789 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> r=me with comments addressed.
> 
> ::: dom/workers/ServiceWorkerPrivate.cpp
> @@ +355,5 @@
> > +  NS_IMETHOD
> > +  Cancel() override
> > +  {
> > +    mCallback->SetResult(false);
> > +    MOZ_ALWAYS_TRUE(NS_SUCCEEDED(NS_DispatchToMainThread(mCallback)));
> 
> It seems this should call ReportResult(mWorkerPrivate->GetJSContext(),
> false)?  Right now Cancel() is not setting mDone.
Well, ReportResult is a member of LifeCycleEventWatcher, which is only created if the runnable is executed (WorkerRun -> DispatchLifeCycleEvent).
The assumption is that WorkerRun and Cancel are never both called for the same runnable. Should I still make this change?

> @@ +529,2 @@
> >    } else {
> > +    watcher->ReportResult(aCx, false);
> 
> Is this necessary?  It seems the watcher will get de-referenced and report
> the failure in its destructor.  The comment above seems to suggest this is
> what you want to happen.
I usually don't like to rely on destructor logic because the cycle collector may delay things, but I think
you're right.

Comment 25

2 years ago
Comment on attachment 8689792 [details] [diff] [review]
Part 3 - Use separate synchronization queues for service worker register jobs and install jobs.

Review of attachment 8689792 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with comments addressed.  Thanks!

::: dom/workers/ServiceWorkerManager.cpp
@@ +153,5 @@
>    NS_DECL_ISUPPORTS
>  
> +  enum QueueType
> +  {
> +    // Adding a new entry here requires updating ServiceWorkerQueue::GetQueue.

Do we have a bug to add an activation queue as defined in the spec?  Or do we handle those semantics in a different way?

@@ +155,5 @@
> +  enum QueueType
> +  {
> +    // Adding a new entry here requires updating ServiceWorkerQueue::GetQueue.
> +    RegistrationQueue,
> +    InstallationQueue

We already have a JobType enumeration in the ServiceWorkerRegisterJob.  Can we pull that out to a public enumeration and then add installing to it?

Then we can have a single type used for both the queue and the jobs.  The JobQueue could internally map register/update to the registration queue.

I think having multiple enumerations that partly overlap, but don't fully line up will be confusing to maintain in the future.

@@ +201,5 @@
> +        NS_WARNING("Pending/running jobs still around on shutdown!");
> +      }
> +    }
> +
> +    nsTArray<RefPtr<ServiceWorkerJob>> mJobs;

I assume this is only accessed on main thread, but the JobQueue code seems to be missing thread assertions.  Can you add AssertIsOnMainThread() throughout the JobQueue methods?

@@ +239,5 @@
>    void
> +  CancelJobs(QueueData& aQueue);
> +
> +  QueueData&
> +  GetQueue(ServiceWorkerJob::QueueType aType)

Rather than expose an owned internal data structure that then gets operated on, I think it would be cleaner to simply add the type code to public Append() and Pop() methods.  Something like:

  mJobQueue->Append(mJobType, aJob);

And

  mJobQueue->Pop(mJobType);

This makes the public interface of the JobQueue clear and doesn't require multi-step state to be handled properly by the caller.  All the checking can be enforced internal to the JobQueue itself.

@@ +954,5 @@
> +  // Done() with the failure code instead.
> +  // Callers MUST hold a strong ref before calling this!
> +  void
> +  Fail(ErrorResult& aRv)
> +  {

Are there any changes to this method?  Or was it just moved into the base class? I'm going to just assume it was moved without changes for now.

@@ +1127,5 @@
> +      mRegistration->mWaitingWorker->WorkerPrivate()->TerminateWorker();
> +      mRegistration->mWaitingWorker->UpdateState(ServiceWorkerState::Redundant);
> +
> +      nsresult rv = serviceWorkerScriptCache::PurgeCache(mRegistration->mPrincipal,
> +                                                         mRegistration->mWaitingWorker->CacheName());

nit: I know you just moved this line, but line wrapping.

@@ +1425,5 @@
>  void
>  ServiceWorkerJobQueue::CancelJobs()
>  {
> +  CancelJobs(mRegistrationJobQueue);
> +  CancelJobs(mInstallationJobQueue);

Does ordering matter here?  If so, it might be good to comment on that.  I assume canceling the registration queue first is best so as not to fill the installation queue after it's been cleared.

@@ -1529,2 @@
>    ServiceWorkerJobQueue* queue = GetOrCreateJobQueue(originSuffix, cleanedScope);
> -  MOZ_ASSERT(queue);

It seems this assertion is dropped, but queue is still used below even if it's nullptr?  It's hard to tell if another patch changed that because it's outside the splinter diff context.
Attachment #8689792 - Flags: review?(bkelly) → review+

Comment 26

2 years ago
(In reply to Cătălin Badea (:catalinb) from comment #23)
> It does, but we might get stuck when register just 1 service worker and that
> service workers waits
> forever in the install handler. In this case, we wait for the keepAliveToken
> to expire and terminate the
> worker, so we need the feature to catch this and continue the job, well, to
> fail the job.

Ok, I think a comment to that effect would be useful.

(In reply to Cătălin Badea (:catalinb) from comment #24)
> Well, ReportResult is a member of LifeCycleEventWatcher, which is only
> created if the runnable is executed (WorkerRun -> DispatchLifeCycleEvent).
> The assumption is that WorkerRun and Cancel are never both called for the
> same runnable. Should I still make this change?

Oh, good point.  I missed that they were different objects.  You can forget this one.

> > Is this necessary?  It seems the watcher will get de-referenced and report
> > the failure in its destructor.  The comment above seems to suggest this is
> > what you want to happen.
> I usually don't like to rely on destructor logic because the cycle collector
> may delay things, but I think
> you're right.

Either way is fine.  This one was more of a nit.
(Assignee)

Comment 27

2 years ago
(In reply to Ben Kelly [:bkelly] from comment #25)
> Comment on attachment 8689792 [details] [diff] [review]
> Part 3 - Use separate synchronization queues for service worker register
> jobs and install jobs.
> 
> Review of attachment 8689792 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> r=me with comments addressed.  Thanks!
> 
> ::: dom/workers/ServiceWorkerManager.cpp
> @@ +153,5 @@
> >    NS_DECL_ISUPPORTS
> >  
> > +  enum QueueType
> > +  {
> > +    // Adding a new entry here requires updating ServiceWorkerQueue::GetQueue.
> 
> Do we have a bug to add an activation queue as defined in the spec?  Or do
> we handle those semantics in a different way?
We handle that part by racing :). I couldn't think of a situation where this has a negative
side effect, though. Filed bug 1226441 to investigate it later.

> @@ +155,5 @@
> > +  enum QueueType
> > +  {
> > +    // Adding a new entry here requires updating ServiceWorkerQueue::GetQueue.
> > +    RegistrationQueue,
> > +    InstallationQueue
> 
> We already have a JobType enumeration in the ServiceWorkerRegisterJob.  Can
> we pull that out to a public enumeration and then add installing to it?
> 
> Then we can have a single type used for both the queue and the jobs.  The
> JobQueue could internally map register/update to the registration queue.
> 
> I think having multiple enumerations that partly overlap, but don't fully
> line up will be confusing to maintain in the future.
Good point!
 
> @@ +954,5 @@
> > +  // Done() with the failure code instead.
> > +  // Callers MUST hold a strong ref before calling this!
> > +  void
> > +  Fail(ErrorResult& aRv)
> > +  {
> 
> Are there any changes to this method?  Or was it just moved into the base
> class? I'm going to just assume it was moved without changes for now.
There is a small change: the previous version would check if mRegistration is null and either
use mRegistration->mScope/mScriptSpec or mScope/mScriptSpec for error reporting:
https://dxr.mozilla.org/mozilla-central/source/dom/workers/ServiceWorkerManager.cpp?from=ServiceWorkerManager.cpp#1121

However, mRegistration is never actually null when we call Fail(), this was already enforced by a "hidden" assert in MaybeRemoveRegistration which is always called from Fail().
https://dxr.mozilla.org/mozilla-central/source/dom/workers/ServiceWorkerManager.cpp?from=ServiceWorkerManager.cpp#1153
I added an assert to make this obvious and changed it to always use mRegistration when constructing the ErrorResult. ServiceWorkerJobBase doesn't have the mScope/mScriptSpec members.

The other change is that Fail is now a protected member of ServiceWorkerJobBase, which I think is better.
The reason it was public before was to be accessible from HandleError.
 

> @@ +1425,5 @@
> >  void
> >  ServiceWorkerJobQueue::CancelJobs()
> >  {
> > +  CancelJobs(mRegistrationJobQueue);
> > +  CancelJobs(mInstallationJobQueue);
> 
> Does ordering matter here?  If so, it might be good to comment on that.  I
> assume canceling registration queue first is best so not to fill the
> installation queue after its been cleared.
It doesn't really matter; Cancel() just sets a flag on the job objects. I'll add a comment.
 
> @@ -1529,2 @@
> >    ServiceWorkerJobQueue* queue = GetOrCreateJobQueue(originSuffix, cleanedScope);
> > -  MOZ_ASSERT(queue);
> 
> It seems this assertion is dropped, but queue is still used below even if
> its nullptr?  Its hard to tell if another patch changed that because its off
> the splinter diff context.
This change was made by mistake.

Thanks for the review!

Comment 28

2 years ago
Unfortunately these patches seem to break the updates triggered by navigation or registration.

Go to this page:

  https://adriftwith.me/sandbox/service_worker_reload/go.html

Open the web console and refresh a few times.

Without these patches we get an install message with a new time on every refresh.  This is due to the SW updating.

With these patches we don't get the console message.  In addition, checking about:serviceworkers shows that the service worker is not updating.  The uuid remains unchanged.

Can you please investigate what is going on here?  I don't think we can include these patches in the release if they break updates.
Flags: needinfo?(catalin.badea392)
(Assignee)

Comment 29

2 years ago
(In reply to Ben Kelly [:bkelly] from comment #28)
> Unfortunately these patches seem to break the updates triggered by
> navigation or registration.
> 
> Go to this page:
> 
>   https://adriftwith.me/sandbox/service_worker_reload/go.html
> 
> Open the web console and refresh a few times.
> 
> Without these patches we get an install message with a new time on every
> refresh.  This is due to the SW updating.
> 
> With these patches we don't get the console message.  In addition, checking
> about:serviceworkers shows that the service worker is not updating.  The
> uuid remains unchanged.
> 
> Can you please investigate what is going on here?  I don't think we can
> include these patches in the release if they break updates.

I'm still investigating this, but it kind of works for me.

Testing with yesterday's m-c + my patches:
Refreshing the web page logs the install and activate messages in the browser console (not the web console).
They appear twice for each refresh: 2 install events and 2 activate events. Is this because we first try a soft update and then register() performs another update?
Next, refreshing about:serviceworkers after each refresh of the demo page reveals a new active worker cache UUID and a new waiting worker as well.

Testing with today's inbound + my patches:
Refreshing the web page again logs to the browser console, but only once this time (1 install + 1 activate message).
Refreshing about:serviceworkers again reveals new UUIDs for the active and waiting workers.

What's the patch queue you're using?
Flags: needinfo?(catalin.badea392)

Comment 30

2 years ago
You see it on each refresh?  I saw the events on the first time the page loads, but not when hitting refresh repeatedly.

Comment 31

2 years ago
> What's the patch queue you're using?

Just the patches from this bug.  Do you have local changes compared to what is uploaded here?
(Assignee)

Comment 32

2 years ago
Created attachment 8690130 [details] [diff] [review]
Part 1 - Continue service worker job queue when life cycle events expire.
Attachment #8689789 - Attachment is obsolete: true
(Assignee)

Comment 33

2 years ago
Created attachment 8690131 [details] [diff] [review]
Part 2 - Remove set of scopes being updated from ServiceWorkerManager.
Attachment #8689790 - Attachment is obsolete: true
(Assignee)

Comment 34

2 years ago
Created attachment 8690132 [details] [diff] [review]
Part 3 - Use separate synchronization queues for service worker register jobs and install jobs.
Attachment #8689792 - Attachment is obsolete: true
(Assignee)

Comment 35

2 years ago
Created attachment 8690144 [details] [diff] [review]
Part 4 - Fix race in test_install_event.html.

The test does the following:
1. Registers a service worker and waits for the updatefound event.
2. Registers a new service worker that throws in the install handler.
3. Calls getRegistration and checks that there's an active worker.

The problem is that if the first service worker doesn't finish installing by
the time the second registration reaches step 4.2 of [[Update]], it will be
terminated and never reach the activating state.
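A minimal model of the race (illustrative, not the actual test harness): if the second register() runs while the first worker is still installing, step 4.2 of [[Update]] terminates it, so the first worker never reaches the activating state.

```javascript
// Toy state machine for the race described above; names are illustrative.
function makeWorker() {
  return { state: "installing" };
}

// Install completes only if the worker was not terminated in the meantime.
function finishInstall(worker) {
  if (worker.state === "installing") worker.state = "activated";
}

// Models [[Update]] step 4.2: terminate an existing installing worker.
function registerNew(existing) {
  if (existing.state === "installing") existing.state = "redundant";
  return makeWorker();
}

// Racy ordering: the second registration arrives before install finishes.
const first = makeWorker();
registerNew(first);
finishInstall(first); // too late: first was already terminated
console.log(first.state); // "redundant" - it never becomes "activated"
```

Serializing the steps (letting the first install finish before the second register) avoids the race, which is what the test fix does.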

Comment 36

2 years ago
Updates look good with latest patches.  Sorry for comment 28.  I must have been running with the patches that had the bad git rebase.
(Assignee)

Updated

2 years ago
Attachment #8690144 - Flags: review?(bkelly)
(Assignee)

Comment 37

2 years ago
(In reply to Cătălin Badea (:catalinb) from comment #35)
> Created attachment 8690144 [details] [diff] [review]
> Part 4 - Fix race in test_install_event.html.
> 
> The test does the following:
> 1. registers a service worker and waits for the updatefound event.
> 2. Registers a new service worker that throws in the install handler
> 3. Next it calls getRegistration and checks that there's an active worker
> 
> The problem is that if the first service worker doesn't finish installing by
> the time the second registration reaches step 4.2 from [[Update]] it will be
> terminated and never reach activating state.

This is the try run with the mochitest failing.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=8a2c0b180c6b

Updated

2 years ago
Attachment #8690144 - Flags: review?(bkelly) → review+

Updated

2 years ago
Blocks: 1226441

Comment 38

2 years ago
Catalin, what's the status of this bug?
Flags: needinfo?(catalin.badea392)
(Assignee)

Comment 39

2 years ago
Just need to land bug 1227932 first.
Depends on: 1227932
Flags: needinfo?(catalin.badea392)
(Assignee)

Comment 40

2 years ago
Created attachment 8692143 [details] [diff] [review]
Part 1 - Continue service worker job queue when life cycle events expire.

rebased.
Attachment #8690130 - Attachment is obsolete: true
(Assignee)

Comment 41

2 years ago
Created attachment 8692144 [details] [diff] [review]
Part 2 - Remove set of scopes being updated from ServiceWorkerManager.

rebased
Attachment #8690131 - Attachment is obsolete: true
(Assignee)

Comment 42

2 years ago
Created attachment 8692145 [details] [diff] [review]
Part 3 - Use separate synchronization queues for service worker register jobs and install jobs.

rebased.
Attachment #8690132 - Attachment is obsolete: true
(Assignee)

Comment 43

2 years ago
Created attachment 8692146 [details] [diff] [review]
Part 4 - Fix race in test_install_event.html.
Attachment #8690144 - Attachment is obsolete: true
(Assignee)

Comment 44

2 years ago
Created attachment 8692480 [details] [diff] [review]
Part 5 - Fix race in skip-waiting.https.html and add some logging for SkipWaitingFlag in ServiceWorkerManager.

In test-skip-waiting.html it is possible for all of the skipWaiting() calls
to run before the worker reaches the "installing" state, in which case all of
them fail. This can cause the test to time out. Filed 1189659 to fix the
actual bug, and tweaked the test to always pass for now.
Attachment #8692480 - Flags: review?(ehsan)

Updated

2 years ago
Attachment #8692480 - Flags: review?(ehsan) → review+

Comment 45

2 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/48ea5c3f2214
https://hg.mozilla.org/integration/mozilla-inbound/rev/8a417a349a61
https://hg.mozilla.org/integration/mozilla-inbound/rev/a4376ddc87b5
https://hg.mozilla.org/integration/mozilla-inbound/rev/7a96276122dc
https://hg.mozilla.org/integration/mozilla-inbound/rev/9424e969d623
(Assignee)

Comment 46

2 years ago
https://treeherder.mozilla.org/#/jobs?repo=try&revision=00c64507dfcb&selectedJob=14148485
(Assignee)

Comment 47

2 years ago
Comment on attachment 8692143 [details] [diff] [review]
Part 1 - Continue service worker job queue when life cycle events expire.

Approval Request Comment
Request for entire patch series (1-5)
[Feature/regressing bug #]: Service Workers
[User impact if declined]: SW will ship in 44 and this fixes a bug in common usage scenarios.
[Describe test coverage new/current, TreeHerder]: covered by web platform tests
[Risks and why]: Low, SW targeted change.
[String/UUID change made/needed]: None
Attachment #8692143 - Flags: approval-mozilla-aurora?

Comment 48

2 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/48ea5c3f2214
https://hg.mozilla.org/mozilla-central/rev/8a417a349a61
https://hg.mozilla.org/mozilla-central/rev/a4376ddc87b5
https://hg.mozilla.org/mozilla-central/rev/7a96276122dc
https://hg.mozilla.org/mozilla-central/rev/9424e969d623
Status: ASSIGNED → RESOLVED
Last Resolved: 2 years ago
status-firefox45: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla45

Updated

2 years ago
status-firefox44: --- → affected
Comment on attachment 8692143 [details] [diff] [review]
Part 1 - Continue service worker job queue when life cycle events expire.

SW is planned for FF44 and these patches have been in Nightly for a few days, seems safe to uplift to Aurora44.
Attachment #8692143 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment on attachment 8692144 [details] [diff] [review]
Part 2 - Remove set of scopes being updated from ServiceWorkerManager.

Aurora44+
Attachment #8692144 - Flags: approval-mozilla-aurora+
Comment on attachment 8692145 [details] [diff] [review]
Part 3 - Use separate synchronization queues for service worker register jobs and install jobs.

Aurora44+
Attachment #8692145 - Flags: approval-mozilla-aurora+
Comment on attachment 8692146 [details] [diff] [review]
Part 4 - Fix race in test_install_event.html.

Aurora44+
Attachment #8692146 - Flags: approval-mozilla-aurora+
Comment on attachment 8692480 [details] [diff] [review]
Part 5 - Fix race in skip-waiting.https.html and add some logging for SkipWaitingFlag in ServiceWorkerManager.

Aurora44+
Attachment #8692480 - Flags: approval-mozilla-aurora+
Problems during uplift of part 2:

grafting 317032:8a417a349a61 "Bug 1189659 - Part 2 - Remove set of scopes being updated from ServiceWorkerManager. r=bkelly"
merging dom/workers/ServiceWorkerManager.cpp
merging dom/workers/ServiceWorkerPrivate.cpp
merging dom/workers/ServiceWorkerPrivate.h
warning: conflicts while merging dom/workers/ServiceWorkerManager.cpp! (edit, then use 'hg resolve --mark')
Flags: needinfo?(catalin.badea392)
(Assignee)

Comment 55

2 years ago
remote:   https://hg.mozilla.org/releases/mozilla-aurora/rev/fd9f740bc3f9
remote:   https://hg.mozilla.org/releases/mozilla-aurora/rev/fb97e67dde7d
remote:   https://hg.mozilla.org/releases/mozilla-aurora/rev/e0ad70c2b1ee
remote:   https://hg.mozilla.org/releases/mozilla-aurora/rev/eacf4b25bd8a
remote:   https://hg.mozilla.org/releases/mozilla-aurora/rev/d57fc3a1dda5
status-firefox42: affected → wontfix
status-firefox43: --- → wontfix
status-firefox44: affected → fixed
Flags: needinfo?(catalin.badea392)

Comment 56

2 years ago
bugherder uplift
https://hg.mozilla.org/releases/mozilla-b2g44_v2_5/rev/fd9f740bc3f9
https://hg.mozilla.org/releases/mozilla-b2g44_v2_5/rev/fb97e67dde7d
https://hg.mozilla.org/releases/mozilla-b2g44_v2_5/rev/e0ad70c2b1ee
https://hg.mozilla.org/releases/mozilla-b2g44_v2_5/rev/eacf4b25bd8a
https://hg.mozilla.org/releases/mozilla-b2g44_v2_5/rev/d57fc3a1dda5
status-b2g-v2.5: --- → fixed