Closed Bug 1167553 Opened 6 years ago Closed 5 years ago

Timeout-related Service Worker shutdowns should be reported

Categories

(Core :: DOM: Service Workers, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla50
Tracking Status
firefox41 --- affected
firefox50 --- fixed

People

(Reporter: flaki, Assigned: asuth)

References

(Blocks 1 open bug)

Details

(Whiteboard: [tw-dom])

Attachments

(1 file)

Just as there should be a way to stop Service Workers (https://bugzilla.mozilla.org/show_bug.cgi?id=1164831), Gecko should be able to shut down long-running Service Workers, too.

Service Workers are shut down from time to time by the runtime, that is a natural and expected occurence, but certain events could extend the period (via waitUntil() promises) which request the runtime not to shut down that particular SW until their operation has finished.

There is ongoing discussion what the accepted/recommended lifetime limit to keeping Service Worker scripts alive should be (https://github.com/slightlyoff/ServiceWorker/issues/679), but eventually some scripts should be killed by the runtime.
It would be ideal, if these kinds of shutdowns were somehow logged & presented to the developer (as simply as maybe a standard log message in the developer console).

Relevant Chrome bug: https://code.google.com/p/chromium/issues/detail?id=457968
I don't think we do any kind of timeout for SWs with or without waitUntil() right now.  We need to fix that first.
We have a grace timer now and do indeed kill the service worker.  So we should look at reporting these timeouts to the console.
Whiteboard: [tw-dom]
Assignee: nobody → bugmail
Status: NEW → ASSIGNED
Builds on the bug 1267473 patch and the revised mime type one.  Log looks like:

Terminating ServiceWorker for scope 
‘http://mochi.test:8888/tests/dom/workers/test/serviceworkers/’ with
pending waitUntil/respondWith promises because of grace timeout.

Better phrasing very possible.  We could also easily include the mTokenCount to provide a figure on how many outstanding things were broken, although it's not clear that's super useful.  Some attempt to list the outstanding WakeUpReason values is possible but seems to require some new book-keeping.  Keeping the most recent WakeUpReason around as a best-effort explanation may be viable, however.  (And sufficiently correct based on how there's only a single timer used.)  However, since intercepted fetches do seem to properly log to the impacted document (the test in fact expects the InterceptionFailedWithURL from bug 1215140), I think those specific errors in conjunction with the scope-wide "hey we had to kill the ServiceWorker" that explains the why of those errors might be enough.
Attachment #8769043 - Flags: review?(bkelly)
Comment on attachment 8769043 [details] [diff] [review]
sw-timeout-logging-v1.diff

Review of attachment 8769043 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good!  r=me with comments addressed.

::: dom/workers/ServiceWorkerPrivate.cpp
@@ +1902,5 @@
> +  // ensure we don't get invoked even if the nsTimerEvent is in the event queue.
> +  ServiceWorkerManager::LocalizeAndReportToAllClients(
> +    serviceWorkerPrivate->mInfo->Scope(),
> +    "ServiceWorkerGraceTimeoutTermination",
> +    nsTArray<nsString> {});

What is the nsTArray<nsString> {} syntax here?  I would have expected () instead of curly braces for the default constructor.

::: dom/workers/test/serviceworkers/test_error_reporting.html
@@ +136,5 @@
> + * expiration of its idle timeout ("idle_timeout").
> + *
> + * This test case cribs from test_unresolved_fetch_interception.html, including
> + * reusing its service worker that claims this page and never resolves fetch
> + * events in order to keep the SW alive long enough to get killed by timeout.

Should we just remove test_unresolved_fetch_interception.html in favor of this test?  Seems this one is a complete superset of that test now.

::: dom/workers/test/serviceworkers/unresolved_fetch_worker.js
@@ +9,5 @@
>             })
>    );
>  
> +  // Never resolve, and keep it alive on our global so it can't get GC'ed and
> +  // make this test weird and intermittent.

Nice catch!
Attachment #8769043 - Flags: review?(bkelly) → review+
(In reply to Ben Kelly [:bkelly] from comment #4)
> ::: dom/workers/ServiceWorkerPrivate.cpp
> @@ +1902,5 @@
> > +  // ensure we don't get invoked even if the nsTimerEvent is in the event queue.
> > +  ServiceWorkerManager::LocalizeAndReportToAllClients(
> > +    serviceWorkerPrivate->mInfo->Scope(),
> > +    "ServiceWorkerGraceTimeoutTermination",
> > +    nsTArray<nsString> {});
> 
> What is the nsTArray<nsString> {} syntax here?  I would have expected ()
> instead of curly braces for the default constructor.

I was just trying to standardize on the cool initializer list syntax to ease cargo culting in the future, but you're right, this looks weirder than it needs to and () would have been better.  Happily this will turn back into { serviceWorkerPrivate->mInfo->Scope() } when we stop magically cramming that in.
 
> ::: dom/workers/test/serviceworkers/test_error_reporting.html
> Should we just remove test_unresolved_fetch_interception.html in favor of
> this test?  Seems this one is a complete superset of that test now.

I don't know.  If you regress the grace timer killbot, arguably it's more obvious what's going on if "test_unresolved_fetch_interception.html" breaks in addition to "test_error_reporting.html".  And it's not clear that the spec is going to change in a way that breaks the former to cause increased maintenance.

I could push the reusable bits of test_error_reporting.html into a common "error_reporting_test_helper.js" or something like that and then have the "grace_timeout_termination_with_interrupted_intercept" function replace the existing logic in test_unresolved_fetch_interception.html so we get the benefit of separate tests but no duplication.

(This is somewhat what I was getting at in bug 1229156 comment 2 where I wasn't sure what the right balance is.  I've erred on the side of super tests in the past with large, powerful helper libs that were very helpful to me in keeping code size down but somewhat terrifying to newcomers.)
Pushed by bugmail@asutherland.org:
https://hg.mozilla.org/integration/mozilla-inbound/rev/f3a1648af4e8
Timeout-related Service Worker shutdowns should be reported. r=bkelly
https://hg.mozilla.org/mozilla-central/rev/f3a1648af4e8
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla50
You need to log in before you can comment on or make changes to this bug.