Closed Bug 1217367 Opened 4 years ago Closed 4 years ago

Service workers update algorithm optimization

Categories

(Core :: DOM: Service Workers, defect)


Tracking


RESOLVED FIXED
mozilla45
Tracking Status
firefox44 --- fixed
firefox45 --- fixed

People

(Reporter: dimi, Assigned: dimi)

References

Details

Attachments

(2 files, 5 obsolete files)

In bug 1207727 we reworked the service worker update algorithm. Updates are now mainly triggered in the following two scenarios:
1. After the end of a navigation fetch event
2. If the last update is more than 24 hours old, a functional event will trigger an update

For case 2, multiple fetch events can fire at the same time, each triggering its own update:
https://github.com/slightlyoff/ServiceWorker/issues/759

We should find a way to reduce the number of updates triggered at the same time, to improve performance.
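One way to reduce simultaneous updates is to coalesce them: if an update job is already in flight for a registration, later soft-update requests attach their callbacks to that job instead of starting a new one. A minimal sketch of the idea (plain JavaScript; class and method names are illustrative, not Gecko's actual C++ types):

```javascript
// Illustrative model of coalescing concurrent soft updates; names are
// made up and do not match Gecko's C++ classes.
class UpdateJob {
  constructor() {
    this.callbacks = [];
  }
  addCallback(cb) {
    this.callbacks.push(cb);
  }
  finish(result) {
    for (const cb of this.callbacks) {
      cb(result); // every coalesced caller is still notified
    }
  }
}

class Registration {
  constructor() {
    this.updateJob = null; // the in-flight job, if any
    this.jobsStarted = 0;
  }
  softUpdate(cb) {
    if (this.updateJob) {
      // Coalesce: attach to the running job instead of starting another.
      this.updateJob.addCallback(cb);
      return;
    }
    this.jobsStarted++;
    this.updateJob = new UpdateJob();
    this.updateJob.addCallback(cb);
  }
  completeUpdate(result) {
    const job = this.updateJob;
    this.updateJob = null; // allow the next update to start a fresh job
    job.finish(result);
  }
}
```

With this shape, a burst of N concurrent soft updates performs one network fetch but still fires N completion callbacks.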
Depends on: 1207727
The first few patches in bug 1207727 implemented this optimization approach, but it may have an issue, per bug 1207727 comment 24.
Assignee: nobody → dlee
Status: NEW → ASSIGNED
Maybe we should hold off on more work here until we observe that it's a problem in practice.
Dimi, any chance you could work this one now?  Developers are noticing the excessive number of updates running now that they can see worker threads in about:debugging.
Flags: needinfo?(dlee)
No longer blocks: 1220308
See Also: → 1220308
(In reply to Ben Kelly [:bkelly] from comment #3)
> Dimi, any chance you could work this one now?  Developers are noticing the
> excessive number of updates running now that they can see worker threads in
> about:debugging.

Sure!!!
Flags: needinfo?(dlee)
Attached patch WIP patch (obsolete) — Splinter Review
Will do more test tomorrow.
Try run to make sure it doesn't break anything:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=7f2bd740d42e
Hi Ben,
Could you help review this? Thanks!
Attachment #8686542 - Attachment is obsolete: true
Attachment #8687070 - Flags: review?(bkelly)
Comment on attachment 8687070 [details] [diff] [review]
Patch - update algorithm optimization v1

Review of attachment 8687070 [details] [diff] [review]:
-----------------------------------------------------------------

This looks good, but it needs to catch up to the ErrorResult changes to Fail() that I just landed on inbound.

::: dom/workers/ServiceWorkerManager.cpp
@@ +1306,5 @@
> +      // from another object like a stream loader or something.
> +      // UpdateFailed may do something with that, so hold a ref to ourself since
> +      // FailCommon relies on it.
> +      // FailCommon does check for cancellation, but let's be safe here.
> +      callback->UpdateFailed(aRv);

This needs to be rebased. I just landed bug 1217909, which refactored this code so there is only one Fail() method that takes an ErrorResult.

It may be tricky to account for, though. The ErrorResult is consumable when it carries a message, as for a TypeError, so we will need some way to clone the ErrorResult.
Attachment #8687070 - Flags: review?(bkelly)
Hi Ben,
This patch is based on your patch in bug 1225219
Attachment #8687070 - Attachment is obsolete: true
Attachment #8688912 - Flags: review?(bkelly)
Comment on attachment 8688912 [details] [diff] [review]
Patch - update algorithm optimization v2

Review of attachment 8688912 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with comments addressed.  Thanks!

::: dom/workers/ServiceWorkerManager.cpp
@@ +899,5 @@
>      , mJobType(REGISTER_JOB)
>      , mCanceled(false)
>    {
>      MOZ_ASSERT(mLoadGroup);
> +    mCallbacks.AppendElement(aCallback);

AssertIsOnMainThread();
MOZ_ASSERT(aCallback);

@@ +912,4 @@
>      , mJobType(UPDATE_JOB)
>      , mCanceled(false)
> +  {
> +    mCallbacks.AppendElement(aCallback);

AssertIsOnMainThread();
MOZ_ASSERT(aCallback);

@@ +923,5 @@
>  
>    void
> +  AppendCallback(ServiceWorkerUpdateFinishCallback* aCallback)
> +  {
> +    mCallbacks.AppendElement(aCallback);

AssertIsOnMainThread();
MOZ_ASSERT(aCallback);
MOZ_ASSERT(!mCallbacks.Contains(aCallback));

@@ +1111,5 @@
>    // Callers MUST hold a strong ref before calling this!
>    void
>    Fail(ErrorResult& aRv)
>    {
> +    MOZ_ASSERT(mCallbacks.Length());

AssertIsOnMainThread();

@@ +1148,5 @@
>  
> +    for (uint32_t i = 1; i < mCallbacks.Length(); ++i) {
> +      ErrorResult rv;
> +      aRv.CloneTo(rv);
> +      mCallbacks[i]->UpdateFailed(rv);

You need to rv.SuppressException() after calling UpdateFailed() in case the callback does not consume the ErrorResult.
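The clone-then-suppress pattern described above can be modeled in a small sketch (plain JavaScript standing in for Gecko's C++ ErrorResult; `Failure`, `failAll`, and `suppress` are illustrative names, not real APIs):

```javascript
// Toy model of why the failure must be cloned before fanning out:
// each callback may consume the error it receives, so the original
// can only be handed to one of them.
class Failure {
  constructor(msg) {
    this.msg = msg;
    this.consumed = false;
  }
  clone() {
    return new Failure(this.msg);
  }
  suppress() {
    this.consumed = true; // stands in for rv.SuppressException()
  }
}

function failAll(callbacks, failure) {
  // The first callback receives (and may consume) the original error.
  callbacks[0](failure);
  // Every other callback gets an independent clone, suppressed afterward
  // in case the callback did not consume it.
  for (let i = 1; i < callbacks.length; i++) {
    const copy = failure.clone();
    callbacks[i](copy);
    copy.suppress();
  }
}
```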

@@ +1311,5 @@
>  
>    void
>    Succeed()
>    {
> +    MOZ_ASSERT(mCallbacks.Length());

AssertIsOnMainThread();
Attachment #8688912 - Flags: review?(bkelly) → review+
Would it be possible to add a test case that queues up like 5 subsequent .update() calls in a row?  Just to try to exercise this code?
Flags: needinfo?(dlee)
(In reply to Ben Kelly [:bkelly] from comment #10)
> Would it be possible to add a test case that queues up like 5 subsequent
> .update() calls in a row?  Just to try to exercise this code?

Sure, I will write a testcase.
One quick question: do we prefer wpt tests over mochitests, or are there criteria for deciding which to use?
Flags: needinfo?(bkelly)
I guess I would use whatever already has a test case for updates, which I think is wpt.

Also, it would be nice to add a printf or something just to verify we enter this new path.
Flags: needinfo?(bkelly)
Attached patch Wpt testcase v1 (obsolete) — Splinter Review
Hi Ben,
This testcase does the following:
1. Trigger a single update event
2. Trigger multiple update events
3. Trigger a single update event again

I added a debug message locally and confirmed that step 2 does enter the new path.
Flags: needinfo?(dlee)
Attachment #8689391 - Flags: review?(bkelly)
Address Ben's comment
Attachment #8688912 - Attachment is obsolete: true
Attachment #8689392 - Flags: review+
Comment on attachment 8689391 [details] [diff] [review]
Wpt testcase v1

Review of attachment 8689391 [details] [diff] [review]:
-----------------------------------------------------------------

::: testing/web-platform/mozilla/tests/service-workers/service-worker/multiple-update.https.html
@@ +46,5 @@
> +      .then(function() {
> +          // Test triggering multiple update events at the same time.
> +          const burstUpdateCount = 10;
> +          for (var i = 0; i < burstUpdateCount; i++) {
> +            registration.update();

We should Promise.all() the result of these update() calls with the wait_for_update() below.  I tested this locally and it works.
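The suggested pattern, sketched with a stubbed registration object (real wpt tests would use the registration returned by navigator.serviceWorker.register()):

```javascript
// Sketch of the review suggestion: collect the burst of update()
// promises and Promise.all() them, so the test does not move on before
// every coalesced update has settled. `registration` is a stub here.
const registration = { update: () => Promise.resolve() }; // stub

const burstUpdateCount = 10;
const updates = [];
for (let i = 0; i < burstUpdateCount; i++) {
  updates.push(registration.update());
}
const allUpdatesSettled = Promise.all(updates);
```

In the actual test, `allUpdatesSettled` would also be combined with the wait_for_update() step the review mentions.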
Attachment #8689391 - Flags: review?(bkelly) → review+
Since we want to get uplifts this week, I'm going to land these patches today so they might make it to aurora tomorrow.
Dimi, it seems something here broke an existing update test:

  https://treeherder.mozilla.org/logviewer.html#?job_id=13923499&repo=try
Flags: needinfo?(dlee)
Updated with the Promise.all() I mentioned in the review feedback.
Attachment #8689391 - Attachment is obsolete: true
Attachment #8689602 - Flags: review+
Dimi, hopefully those test failures are just places where we need to explicitly wait for updates so that things don't get coalesced when we actually want multiple updates.  Sorry I didn't have a chance to investigate today.
Hi Ben,
I found that the error is a timing issue in the "update-after-oneday" testcase.
That testcase does the following:
1. Trigger a navigation fetch event
2. Wait for the onupdatefound event
3. Trigger a non-navigation fetch event
4. Wait for the onupdatefound event

But between step 2 and step 3 I didn't explicitly wait for the "installed" state change. So if the first update job is still in the middle of updating when the second update is triggered, it might end up somewhere like
https://dxr.mozilla.org/mozilla-central/source/dom/workers/ServiceWorkerManager.cpp#3537
https://dxr.mozilla.org/mozilla-central/source/dom/workers/ServiceWorkerManager.cpp#3544
and then step 4 will never be executed.

Running try now to see if everything works fine.
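The fix amounts to inserting an explicit wait between the two triggers. A generic sketch of such a helper (an assumption about the shape of the fix, not the exact wpt utility):

```javascript
// Hedged sketch: wait for a service worker to reach a given state
// before firing the next update trigger, so the second update is not
// dropped while the first is still in progress. `waitForState` is a
// generic helper, not the exact wpt test utility.
function waitForState(worker, state) {
  if (worker.state === state) {
    return Promise.resolve();
  }
  return new Promise(resolve => {
    worker.onstatechange = () => {
      if (worker.state === state) {
        resolve();
      }
    };
  });
}
```

The test would call something like `waitForState(registration.installing, 'installed')` between steps 2 and 3.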
Attachment #8689392 - Attachment is obsolete: true
Flags: needinfo?(dlee)
Attachment #8689954 - Flags: review?(bkelly)
(In reply to Dimi Lee[:dimi][:dlee] from comment #21)
> Created attachment 8689954 [details] [diff] [review]
> Patch - update algorithm optimization v4
> 
> Hi Ben,
> I found the error is kind of timing issue in testcase "update-after-oneday".
> In that testcase we will do the following:
> 1.Trigger an navigation fetch event
> 2.Wait onupdatefound event
> 3.Trigger an non-navigation fetch event
> 4.Wait onupdatefound event
> 
> But between step2 and step3 I didn't explicitly wait for "installed" state
> change. So if the first update job is still in the updating process and the
> second update is triggered. It might end up in somewhere like
> https://dxr.mozilla.org/mozilla-central/source/dom/workers/
> ServiceWorkerManager.cpp#3537
> https://dxr.mozilla.org/mozilla-central/source/dom/workers/
> ServiceWorkerManager.cpp#3544
> And then step4 will never be executed.
> 
> Running try now to see if everything works fine.

This patch only modifies the wpt testcase.
(In reply to Dimi Lee[:dimi][:dlee] from comment #21)
> Created attachment 8689954 [details] [diff] [review]
> Patch - update algorithm optimization v4
> 
> But between step2 and step3 I didn't explicitly wait for "installed" state
> change. So if the first update job is still in the updating process and the
> second update is triggered. It might end up in somewhere like
> https://dxr.mozilla.org/mozilla-central/source/dom/workers/
> ServiceWorkerManager.cpp#3537
> https://dxr.mozilla.org/mozilla-central/source/dom/workers/
> ServiceWorkerManager.cpp#3544
> And then step4 will never be executed.

Sorry, to correct one thing in my previous comment: it will end up at
https://dxr.mozilla.org/mozilla-central/source/dom/workers/ServiceWorkerManager.cpp#3537
NOT https://dxr.mozilla.org/mozilla-central/source/dom/workers/ServiceWorkerManager.cpp#3544
Attachment #8689954 - Flags: review?(bkelly) → review+
Try looks relatively green.  I'm going to try to land this morning for Dimi.
Comment on attachment 8689954 [details] [diff] [review]
Patch - update algorithm optimization v4

Approval Request Comment
[Feature/regressing bug #]: Service workers
[User impact if declined]: We are trying to ship service workers in 44.  This bug fixes a small issue with our service worker update code.
[Describe test coverage new/current, TreeHerder]: Existing update tests and a new test in this bug.
[Risks and why]: Minimal.  Only affects service workers.
[String/UUID change made/needed]: None.
Attachment #8689954 - Flags: approval-mozilla-aurora?
https://hg.mozilla.org/mozilla-central/rev/62f20f51f687
https://hg.mozilla.org/mozilla-central/rev/68f4222bbc8a
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla45
Comment on attachment 8689954 [details] [diff] [review]
Patch - update algorithm optimization v4

SWs are planned for FF44 and the patch has automated tests, which is great. Let's uplift to Aurora44.
Attachment #8689954 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Hi, this failed to apply to aurora:

grafting 316004:62f20f51f687 "Bug 1217367 - Service workers update algorithm optimization. r=bkelly"
merging dom/workers/ServiceWorkerManager.cpp
warning: conflicts during merge.
merging dom/workers/ServiceWorkerManager.cpp incomplete! (edit conflicts, then use 'hg resolve --mark')
merging dom/workers/ServiceWorkerManager.h
warning: conflicts during merge.
merging dom/workers/ServiceWorkerManager.h incomplete! (edit conflicts, then use 'hg resolve --mark')
abort: unresolved conflicts, can't continue
(use hg resolve and hg graft --continue)

could you take a look, thanks!
Flags: needinfo?(dlee)
Hello, I'm trying to understand this bug, since it overlaps considerably with the changes from bug 1189659.

First, why are we coalescing multiple updates? Shouldn't a sequence of say 5 registration.update() calls result in 5 different updates if the server returns different scripts for each request?

The spec and the referenced issue[1] mention that Soft Update shouldn't invoke Update if the registration's job queue is not empty, but registration.update() doesn't go through Soft Update at all. Note that in Gecko, update() calls into SWM::SoftUpdate()[2].

Next, I suspect update() won't resolve if called while the service worker is installing. Consider the following:
1. register("sw.js")
2. before the service worker's state becomes installing, we resolve the registration promise
3. registration.update()
4. At this point, SoftUpdate() will find registration->IsUpdating() true, since we don't unset mUpdateJob after calling Succeed() in ServiceWorkerRegisterJob::ContinueInstall()
... so update() never resolves.


[1] https://github.com/slightlyoff/ServiceWorker/issues/759
[2] https://dxr.mozilla.org/mozilla-central/source/dom/workers/ServiceWorkerRegistration.cpp?from=ServiceWorkerRegistration.cpp#269
Flags: needinfo?(ehsan)
Flags: needinfo?(dlee)
> 1. register("sw.js")
> 2. before the service worker's state becomes installing, we resolve the
> registration promise
> 3. registration.update()
> 4. At this point, SoftUpdate() will find the registration->IsUpdating()
> true, since we don't unset mUpdateJob after calling Succeed() in
> ServiceWorkerRegisterJob::ContinueInstall()
> .. update never resolves.
The example is wrong because register() doesn't set mUpdateJob, but this can still happen if at step 1 we call update() on an existing registration and then call update() again while the worker is "installing".
(In reply to Cătălin Badea (:catalinb) from comment #33)
> The example is wrong because register() doesn't set mUpdateJob, but this can
> still happen if at step 1
> we call update() on an existing registration and then call update again
> while the worker is "installing".
If we call |registration.update| while the worker is "installing", I think we will return at |if (registration->mInstallingWorker)|[1], which causes update() to never resolve. Maybe we should handle this case here?

[1]https://dxr.mozilla.org/mozilla-central/source/dom/workers/ServiceWorkerManager.cpp#3591
(In reply to Cătălin Badea (:catalinb) from comment #32)
> Hello, I'm trying to understand this bug, since it overlaps considerably
> with the changes from bug 1189659.
> 
> First, why are we coalescing multiple updates? Shouldn't a sequence of say 5
> registration.update() calls result in 5 different updates if the server
> returns different scripts for each request?
The reason to coalesce soft updates is that multiple updates are sometimes triggered at the same time by the new update algorithm (bug 1207727). For example, if the last update is more than 86400 seconds old, the next page refresh will trigger a soft update for each fetch event.

> The spec and the referred issue[1] mention that Soft Update shouldn't invoke
> Update if the registration
> queue is not empty, but registration.update() doesn't go through Soft Update
> at all. Note that in Gecko, update() calls into SWM::SoftUpdate()[2]. 
So you are saying that if the caller is registration.update(), we should not coalesce that update with updates triggered via SoftUpdate().
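The 24-hour check behind this is just a timestamp comparison, sketched here (constant and function names are assumed for illustration):

```javascript
// Illustrative check of when a functional event triggers a soft update:
// only if the last successful update is older than a day.
const UPDATE_INTERVAL_MS = 86400 * 1000;

function shouldSoftUpdate(lastUpdateTimeMs, nowMs) {
  return nowMs - lastUpdateTimeMs > UPDATE_INTERVAL_MS;
}
```

Once the interval has elapsed, every fetch event handled before the first update finishes passes this check, which is why a single page refresh can fan out into many update attempts.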
Flags: needinfo?(dlee)
(In reply to Dimi Lee[:dimi][:dlee] from comment #35)
> (In reply to Cătălin Badea (:catalinb) from comment #32)
> > Hello, I'm trying to understand this bug, since it overlaps considerably
> > with the changes from bug 1189659.
> > 
> > First, why are we coalescing multiple updates? Shouldn't a sequence of say 5
> > registration.update() calls result in 5 different updates if the server
> > returns different scripts for each request?
> The reason to coalesce soft update is sometimes multiple updates are
> triggered at the same time because of new update algorithm - bug 1207727.
> For example, if last update time is over 86400 sec, next time we refresh the
> page it will trigger a soft update for each fetch event.
> 
> > The spec and the referred issue[1] mention that Soft Update shouldn't invoke
> > Update if the registration
> > queue is not empty, but registration.update() doesn't go through Soft Update
> > at all. Note that in Gecko, update() calls into SWM::SoftUpdate()[2]. 
> So you are saying if the caller of update is registration.update, we should
> not coalesce this update with updates called by SoftUpdate.
Yes, registration.update() shouldn't be coalesced at all. And for Soft Update, we don't really need to keep track of its UpdateFinishedCallback: the spec doesn't require us to resolve a promise, and what we're currently doing is passing an empty callback, so there's no point in having an array of no-op callbacks. I think the correct solution would be: first, keep a flag on mRegistration indicating that it's currently being updated, and in SoftUpdate, do nothing if the flag is set. Then have registration.update() queue an update job without calling into SoftUpdate().

Does this make sense?^
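That proposal can be modeled in a few lines (illustrative JavaScript, not landed code; all names are made up):

```javascript
// Model of comment 36's proposal: SoftUpdate bails out when an update
// is already pending, while registration.update() always queues its
// own job and is never coalesced.
class RegistrationModel {
  constructor() {
    this.updating = false; // the proposed flag on mRegistration
    this.queuedJobs = 0;
  }
  softUpdate() {
    if (this.updating) {
      return; // Soft Update skips when a job is already pending
    }
    this.updating = true;
    this.queuedJobs++;
  }
  update() {
    // Explicit registration.update() always queues a job.
    this.updating = true;
    this.queuedJobs++;
  }
  finishJob() {
    this.updating = false;
  }
}
```

Under this model, bursts of soft updates collapse to one job, but every explicit update() call still gets its own job and its own promise resolution.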
Flags: needinfo?(ehsan)
(In reply to Cătălin Badea (:catalinb) from comment #36)
> Yes, registration.update() shouldn't be coalesced at all. And for Soft
> Update, we don't really
> need to keep track of its UpdateFinishedCallback. The spec doesn't require
> us to resolve a promise
> and what we're currently doing is passing an empty callback, so no point in
> having an array of noop callbacks. I think the correct solution would be:
> First, just keep a flag on mRegistration that it's currently being updated.
> In SoftUpdate, if the flag
> is set don't do anything. Have registration.update() queue an update job
> without calling into SoftUpdate().
> 
> Does this make sense?^
Yes, I think this is the right way.
Also, thanks for filing the bug and fixing it!