Closed Bug 1217909 Opened 9 years ago Closed 9 years ago

exceptions during ServiceWorker fetch event handler are not logged to console

Categories

(Core :: DOM: Service Workers, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla45
Tracking Status
firefox43 --- wontfix
firefox44 --- fixed
firefox45 --- fixed
b2g-v2.5 --- fixed

People

(Reporter: bkelly, Assigned: bkelly)

References

(Blocks 1 open bug)

Details

Attachments

(11 files, 18 obsolete files)

10.61 KB, patch
bkelly
: review+
Details | Diff | Splinter Review
7.04 KB, patch
bkelly
: review+
Details | Diff | Splinter Review
16.21 KB, patch
catalinb
: review+
Details | Diff | Splinter Review
4.97 KB, patch
catalinb
: review+
Details | Diff | Splinter Review
2.54 KB, patch
catalinb
: review+
Details | Diff | Splinter Review
3.52 KB, patch
catalinb
: review+
Details | Diff | Splinter Review
1.21 KB, patch
catalinb
: review+
Details | Diff | Splinter Review
15.71 KB, patch
catalinb
: review+
Details | Diff | Splinter Review
5.04 KB, patch
catalinb
: review+
Details | Diff | Splinter Review
1.33 KB, patch
catalinb
: review+
Details | Diff | Splinter Review
1.71 KB, patch
catalinb
: review+
Details | Diff | Splinter Review
So, figuring out stuff like bug 1217501 would be a lot easier if we logged exceptions thrown during SW event handlers.  We should maybe do that.

I believe we manually need to do this in SW by checking the internal WidgetEvent's mExceptionHasBeenRisen here:

  https://dxr.mozilla.org/mozilla-central/source/dom/workers/ServiceWorkerPrivate.cpp?case=true&from=DispatchFetchEvent#1068
Assignee: nobody → bkelly
Status: NEW → ASSIGNED
Depends on: 1215140
Note, we need to handle exceptions during dispatch of ExtendableEvents as well.
Also, throwing an exception in the FetchEvent handler should cancel the network load.  We don't currently do that.
Attached patch work in progress (obsolete) — Splinter Review
So, it looks like there are no pending exception on the JSContext.  I assume this is because the event handler cleared it.

Boris, any ideas how I can keep the default event handling from reporting it into oblivion and then clearing the exception?

I'll dig through the event code tomorrow, but any suggestions welcome.  Thanks!
Flags: needinfo?(bzbarsky)
> I assume this is because the event handler cleared it.

Yes, of course.  That's what the spec for events says to do.  The exceptions are reported as soon as control returns to event dispatch.

> any ideas how I can keep the default event handling from reporting it into oblivion and
> then clearing the exception?

You can't prevent the latter.  Here's why.  Say you have an event target that has three listeners for the same event: a, b, c, in that order.  "a" and "c" throw, "b" does not.  The event is dispatched.  The correct behavior is to run all three listeners and report exceptions from "a" and "c".  That means the exception from "a" _must_ be reported and cleared off the JSContext before you start running "b".

The mExceptionHasBeenRisen [sic] flag is just there to tell you that one or more listeners in fact threw and got their exception reported.  That's all it can do for you.

What you _can_ do, I would hope, is to make the reporting actually log somewhere useful, not "into oblivion".  That means storing enough information in the data structures the error reporter gets to work with that it can do the right thing.
Flags: needinfo?(bzbarsky)
Oh, and I guess the point is that digging through the event code is not going to be useful here.

What is useful is figuring out how to associate the right information with the exceptions.  The problem, I guess, is that we have a single global here but the error reporting needs to go not to a "place" associated with that global (as it is everywhere else in the platform) but rather to a "place" associated with the code that dispatched the event, right?

Can this code reenter?  That is, can one of the event listeners do something that triggers another, nested, event dispatch?  Or anything else that will end up reporting errors?  Or can we assume that any error report during this event dispatch should in fact be reported on mInterceptedChannel?
Flags: needinfo?(bkelly)
Looks like this is related to the same problem as bug 1189681.
Attached patch work in progress (obsolete) — Splinter Review
This now fails the network request when FetchEvent handler throws.  It also logs the exception.

Needs cleanup, rebase after bug 1215140 review comments, and then logging for other SW events.
Attachment #8679770 - Attachment is obsolete: true
Flags: needinfo?(bkelly)
Attached patch work in progress (obsolete) — Splinter Review
Partial rebase on other bugs that are ahead of this in my queue.  Not fully working again as I want to rewrite how ConsoleReportCollector supports both localized and non-localized reports.
Attachment #8680251 - Attachment is obsolete: true
This patch goes in a different direction from my original WIP here.  I think its important to have a good catch-all solution for error reporting, so this patch makes any unhandled exception report to every controlled document's console.

So this will cover:

* throwing in a FetchEvent
* throwing in something like setTimeout()
* throwing in install event when updating and an existing client exists

I still need to deal with these cases:

* throwing in install event when there is no controlled document (first register)
* throwing in activate event

I will tackle these in follow-up patches here.
Attachment #8681501 - Attachment is obsolete: true
Attachment #8681646 - Flags: review?(catalin.badea392)
Updated patch to fix an error revealed by try.

New try:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=86905099c057
Attachment #8681646 - Attachment is obsolete: true
Attachment #8681646 - Flags: review?(catalin.badea392)
Attachment #8681671 - Flags: review?(catalin.badea392)
Fix a stale assert in the patch.

Also, this is still failing an openWindow() test.  Dropping flag until I can figure that out.  Sorry for the noise here.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=f141847c13ab
Attachment #8681671 - Attachment is obsolete: true
Attachment #8681671 - Flags: review?(catalin.badea392)
Comment on attachment 8681694 [details] [diff] [review]
P1 Report service worker exceptions to controlled documents. r=catalinb

Ok, the openWindow() error is gone in the last try.  Looks green to me.  I think it must have been something fixed in my update to latest m-c.

So flagging for review again.  Sorry for all the flag spam.
Attachment #8681694 - Flags: review?(catalin.badea392)
Blocks: 1220757
This will report errors back to the original .register() document.  With P1 it will give a duplicate error report, but I plan to fix that in a later patch.
Attachment #8682173 - Flags: review?(catalin.badea392)
So, it seems we are getting duplicate error types in the message.  Like:

  SyntaxError: SyntaxError: missing ) after argument list

This is because we get the Error value from the worker context, convert it to a message string, pass to main thread, and then convert it back to an Error value again.  The result is we end up with the type name applied twice.

This patch is kind of a hack to avoid this by stripping the type name before convert back to an Error object.  Open to better ways to avoid this.
Attachment #8682182 - Flags: review?(bzbarsky)
This reports if an error occurs during a soft update triggered from the main thread.  Note, this also correctly fires the rejection for an exception.  Previously we were just ignoring these errors completely!

Some code duplication here with the other callback object.  I will look into factoring it out in a later patch.
Attachment #8682188 - Flags: review?(catalin.badea392)
Comment on attachment 8681694 [details] [diff] [review]
P1 Report service worker exceptions to controlled documents. r=catalinb

Review of attachment 8681694 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm

::: dom/workers/ServiceWorkerManager.cpp
@@ +2378,5 @@
>    }
>  
> +  // Report errors to the consoles of every controlled document
> +  for (auto iter = mControlledDocuments.Iter(); !iter.Done(); iter.Next()) {
> +    ServiceWorkerRegistrationInfo *reg = iter.UserData();

nit: coding style
Attachment #8681694 - Flags: review?(catalin.badea392) → review+
Attached patch P1 interdiff 001 refactor (obsolete) — Splinter Review
A bit of refactoring to the P1 patch.
Attachment #8682696 - Flags: review?(catalin.badea392)
Attachment #8682696 - Flags: review?(catalin.badea392) → review+
Comment on attachment 8682188 [details] [diff] [review]
P4 Report exceptions during main thread soft update. r=catalinb

Review of attachment 8682188 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/workers/ServiceWorkerRegistration.cpp
@@ +659,5 @@
>    nsCOMPtr<nsIDocument> doc = GetOwner()->GetExtantDoc();
>    MOZ_ASSERT(doc);
>  
>    RefPtr<MainThreadUpdateCallback> cb =
> +    new MainThreadUpdateCallback(GetOwner(), promise);

We should check GetOwner() for null. This can return null if we call update on an object from a window that was closed:
open a window or an iframe, keep a reference to the object, close the window.

Filed bug 1221351 because at least register crashes this way.
Attachment #8682188 - Flags: review?(catalin.badea392) → review+
Attachment #8682173 - Flags: review?(catalin.badea392) → review+
(In reply to Cătălin Badea (:catalinb) from comment #19)
> We should check GetOwner() for null. This can return null if we call update
> on an object from a window that was closed:
> open a window or an iframe, keep a reference to the object, close the window.

In this particular case the ServiceWorkerRegistrationMainThread::Update() code already checks for nullptr GetOwner() with this code at the top of the method:

  nsCOMPtr<nsIGlobalObject> go = do_QueryInterface(GetOwner());
  if (!go) {
    aRv.Throw(NS_ERROR_FAILURE);
    return nullptr;
  }
Attachment #8681694 - Attachment is obsolete: true
Attachment #8682696 - Attachment is obsolete: true
Attachment #8683353 - Flags: review+
Comment on attachment 8682182 [details] [diff] [review]
P3 Prevent duplicate error type names in message by stripping existing error ty pe. r=bz

r=me I guess, but it seems like it would be better to avoid adding the prefix when creating the message string...
Attachment #8682182 - Flags: review?(bzbarsky) → review+
(In reply to Boris Zbarsky [:bz] from comment #22)
> r=me I guess, but it seems like it would be better to avoid adding the
> prefix when creating the message string...

Unfortunately the prefix is added many layers up the stack during worker error reporting.  I would either have to plumb a service worker specific path up through all these layers or duplicate the formatting code in many places (hoping I don't miss one).  I think this complicates the code less at this point.
Patch to properly handle exceptions during worker-triggered update() calls.  This is ugly.
I think I may change this around to use the console service support being added in bug 1213932.  That would avoid some extra round trips to the main thread and also make it more consistent which windows these exceptions show up in.
Depends on: 1213932
No longer depends on: 1213932
Updated not to log as part of .register() error handling.  I have a better way to do it consistently in ReportToAllClients().
Attachment #8682173 - Attachment is obsolete: true
Attachment #8684449 - Flags: review+
Remove error reporting during main thread .update().
Attachment #8682188 - Attachment is obsolete: true
Attachment #8684450 - Flags: review+
I will remove the code duplication for rejecting the promise with an ErrorObject in a later patch.
Attachment #8683893 - Attachment is obsolete: true
Attachment #8684451 - Flags: review?(catalin.badea392)
Track documents that call .register() for a particular scope so we can report subsequent service worker errors back to them.

This is necessary for a number reasons.  It handles error reporting for windows that registers SWs that won't control that window.  It also deals with the problem that the "activate" event handler essentially runs with no controlled documents and after .register() resolves.  That means for activate exceptions we can't report via controlled documents and we can't report via the update job either.

Whenever .register() is called again for a scope we check and remove dead documents from the list.
Attachment #8684455 - Flags: review?(catalin.badea392)
Blocks: 1222720
While writing this test I realized that any uncaught exception during script evaluation should cause the .update() to reject with TypeError, not the actual exception.  So I can remove a lot of code.

I'm going to drop the other flags until I can update the patches to reflect this.
Attachment #8684451 - Flags: review?(catalin.badea392)
The P7 patch is still reviewable, though.  Its unrelated to the issue in comment 32.
Comment on attachment 8684449 [details] [diff] [review]
P2 Report errors back to the document that called .register(). r=catalinb

Actually, this patch is completely empty after making .register() and .update() not report to console themselves.
Attachment #8684449 - Attachment is obsolete: true
(In reply to Ben Kelly [:bkelly] from comment #31)
> Created attachment 8684455 [details] [diff] [review]
> P7 Track registering documents as weak reference so SWM can report errors to
> th em. r=catalinb
> 
> Track documents that call .register() for a particular scope so we can
> report subsequent service worker errors back to them.
> 
> This is necessary for a number reasons.  It handles error reporting for
> windows that registers SWs that won't control that window.  It also deals
> with the problem that the "activate" event handler essentially runs with no
> controlled documents and after .register() resolves.  That means for
> activate exceptions we can't report via controlled documents and we can't
> report via the update job either.
> 
> Whenever .register() is called again for a scope we check and remove dead
> documents from the list.

Wouldn't it be better to just report errors to all documents that match the registration's scope,
regardless if they're controlled or not?
(In reply to Cătălin Badea (:catalinb) from comment #35)
> Wouldn't it be better to just report errors to all documents that match the
> registration's scope,
> regardless if they're controlled or not?

There are two issues with this:

1) It wouldn't show registration errors if foo.com/index.html calls register('foo.com/a/sw.js', { scope: './a').  The page calling register() is not in scope of the SW.
2) It would require iterating over all windows, frames, and workers to find potential clients.  The windows seems ok, but iterating over every frame in every window seems like overkill to me.
Attachment #8684455 - Flags: review?(catalin.badea392) → review+
Refactor the Job failure code to use ErrorResult instead of nsresult.  This lets the js exception handling code fail with a TypeError directly.

Also, convert other internal failure codes to TypeError as well.  Per the spec we should only be rejecting register()/update() with either a SecurityErr or TypeErr.

As a nice side effect, this patch makes the old P3, P4, P5, and P6 patches unnecessary.
Attachment #8682182 - Attachment is obsolete: true
Attachment #8684450 - Attachment is obsolete: true
Attachment #8684451 - Attachment is obsolete: true
Attachment #8684452 - Attachment is obsolete: true
Attachment #8684950 - Flags: review?(catalin.badea392)
Update the WPT tests to verify we reject .update() correctly.
Attachment #8684593 - Attachment is obsolete: true
Attachment #8684951 - Flags: review?(catalin.badea392)
Sorry, one more small patch to test when the fetch event handler throws.  That was something I fixed in P1.
Attachment #8684956 - Flags: review?(catalin.badea392)
Updated to fix some build bustage on non-windows:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=c0ad7ace636a
These two patches should fix the orange in the try.  The tests were expecting the wrong error types since the spec now says to through TypeError for any exceptions thrown during script evaluation.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=6d8ed19a9e50
Attachment #8685080 - Flags: review?(catalin.badea392)
I have one more case I need to solve here:

1) Window is on an uncontrolled origin
2) User navigates to a controlled document
3) The SW fetch event throws before respondWith() completes
4) The resulting console report is not shown because the window is not controlled until after the document content is first returned to nsDocShell.

I think the fix here might be to track in progress navigations for a given worker scope.  Unfortunately iterating every frame in every window is still seems too much to me.
P8 and P9 track intercepted navigations by scope in the SWM.  We can then report exceptions to these windows even though they don't contain a controlled document yet.

I also plan to use this data in bug 1223116.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=3c3b19be4883
Attachment #8685740 - Flags: review?(catalin.badea392)
Catalin, sorry for all these patches.  Please feel free to pass any of them off to Andrea.  Thanks!
Sorry, one more small patch to cleanup the nsTArray after the last registering document is removed for a particular scope.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=f83ddf45596e
Attachment #8685748 - Flags: review?(catalin.badea392)
A small patch to check IsCurrentActiveDocument() before reporting to a window.  This is necessary because a document can be attached to a window, but inactive in the bfcache.  We shouldn't spam the console just because a SW-controlled window is in the user's history.
Attachment #8686988 - Flags: review?(catalin.badea392)
Attachment #8684950 - Flags: review?(catalin.badea392) → review+
Attachment #8684951 - Flags: review?(catalin.badea392) → review+
Attachment #8684956 - Flags: review?(catalin.badea392) → review+
Attachment #8685078 - Flags: review?(catalin.badea392) → review+
Attachment #8685080 - Flags: review?(catalin.badea392) → review+
Attachment #8685737 - Flags: review?(catalin.badea392) → review+
Attachment #8685740 - Flags: review?(catalin.badea392) → review+
Attachment #8685748 - Flags: review?(catalin.badea392) → review+
Attachment #8686988 - Flags: review?(catalin.badea392) → review+
Comment on attachment 8683353 [details] [diff] [review]
P1 Report service worker exceptions to controlled documents. r=catalinb

For patches P1 to P11.  All the patches.

Approval Request Comment
[Feature/regressing bug #]: Service workers
[User impact if declined]: This bug is important for properly handling certain errors in service workers.  For example, if an exception is thrown during a fetch event we need to abort the network request.  That doesn't happen without these patches.  We also report exceptions back to the console so developers can figure out why their service worker isn't functioning.
[Describe test coverage new/current, TreeHerder]: Includes web-platform-tests for improved error handling.  Local testing for reporting errors to console.
[Risks and why]: Minimal.  Only affects service workers.
[String/UUID change made/needed]: None.
Attachment #8683353 - Flags: approval-mozilla-aurora?
Will wait for these patches to land on m-c before approving aurora uplift requests.
Comment on attachment 8683353 [details] [diff] [review]
P1 Report service worker exceptions to controlled documents. r=catalinb

SW is planned for FF44, taking it.
Attachment #8683353 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment on attachment 8684948 [details] [diff] [review]
P2 Track registering documents as weak reference so SWM can report errors to th em. r=catalinb

Aurora44+
Attachment #8684948 - Flags: approval-mozilla-aurora+
Comment on attachment 8684950 [details] [diff] [review]
P3 Refactor service worker register()/update() to reject only with SecurityErr or TypeErr. r=catalinb

Aurora44+
Attachment #8684950 - Flags: approval-mozilla-aurora+
Comment on attachment 8684951 [details] [diff] [review]
P4 Extend wpt tests to verify update() promise values for different script fail ures. r=catalinb

Aurora44+
Attachment #8684951 - Flags: approval-mozilla-aurora+
Comment on attachment 8684956 [details] [diff] [review]
P5 Add wpt test case for fetch event handlers that throw. r=catalinb

Aurora44+
Attachment #8684956 - Flags: approval-mozilla-aurora+
Comment on attachment 8685078 [details] [diff] [review]
P6 Fix wpt registration.https.html to expect TypeError for script evaluation errors. r=catalinb

Aurora44+
Attachment #8685078 - Flags: approval-mozilla-aurora+
Comment on attachment 8685080 [details] [diff] [review]
P7 Fix mochitest to expect TypeError when serviceWorker.register() rejects. r=catalinb

Aurora44+
Attachment #8685080 - Flags: approval-mozilla-aurora+
Comment on attachment 8685737 [details] [diff] [review]
P8 Track navigation interceptions per scope in ServiceWorkerManager. r=catalinb

Aurora44+
Attachment #8685737 - Flags: approval-mozilla-aurora+
Comment on attachment 8685740 [details] [diff] [review]
P9 Report exceptions to windows performing an intercepted navigation. r=catalinb

Aurora44+
Attachment #8685740 - Flags: approval-mozilla-aurora+
Comment on attachment 8685748 [details] [diff] [review]
P10 Remove stale nsTArray when the last registering document for a scope is removed. r=catalinb

Aurora44+
Attachment #8685748 - Flags: approval-mozilla-aurora+
Comment on attachment 8686988 [details] [diff] [review]
P11 Only report errors to documents that are active and not in the bfcache.  r=catalinb

Aurora44+
Attachment #8686988 - Flags: approval-mozilla-aurora+
I'm hitting conflicts uplifting this to aurora. Can we get rebased patches?
Flags: needinfo?(bkelly)
Yea, I'll push it to aurora.  Thanks!
I need bug 1220007 uplifted in order to uplift this.  Sorry, I didn't realize it hadn't been uplifted yet.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: