Closed Bug 1288768 Opened 3 years ago Closed 3 years ago

Investigate correctness of throwing ThrowDOMException for blocked resources within dom/workers/ScriptLoader.cpp

Categories

(Core :: DOM: Workers, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla54
Tracking Status
firefox54 --- fixed

People

(Reporter: ckerschb, Assigned: baku)

References

Details

Attachments

(2 files, 2 obsolete files)

When working on XCTO (Bug 471020) I realized that some web platform tests are failing because of our worker behavior. In particular:
> testing/web-platform/meta/fetch/nosniff/worker.html.ini

While the individual subtests within that test are passing, the overall state is still 'error'. This is due to this exception [1]. We are using 'expected: ERROR' for a lot of other worker related web platform tests where subtests are passing but the overall state of the harness encounters an error.

We should investigate that behavior and potentially fix if necessary.

[1] https://dxr.mozilla.org/mozilla-central/source/dom/workers/ScriptLoader.cpp#2169
Flags: needinfo?(amarchesini)
I'm happy to take this bug, but I wonder if bz has an opinion on it. bz, recently, you changed how errors are reported in workers, so I wonder if this an known issue and if you have a plan about how to fix it. What I see is that the error is reported twice: window.onerror and worker.onerror.
Flags: needinfo?(amarchesini)
Flags: needinfo?(bzbarsky)
The link in comment 0 is to ... nowhere useful.  It's not using a permalink form of URI, so presumably the code changed sometime in the last two weeks and dxr has finally reindexed?

In any case, looking at the nosniff/worker.html test it tries to load a worker with nosniff headers and some non-script type.  Looking at the spec, this corresponds to https://html.spec.whatwg.org/multipage/workers.html#run-a-worker which in step 9 attempts to obtain the script data.  The load fails, so the fetch returns null (if you dig through "Fetch a classic worker script"), so the spec says:

    If the algorithm asynchronously completes with null, queue a task to fire
    a simple event named error at worker, and abort these steps.

So we should get an error event on the worker for sure.  I don't think we should be getting one on the window.

Now the problem, presumably, is that we convert load failures to exceptions.  And exceptions inside a worker are treated _differently_ in the spec.  See https://html.spec.whatwg.org/multipage/workers.html#runtime-script-errors-2 which says:

    For dedicated workers, if the error is still not handled afterwards
    [after firing error on the global scope of the worker], the user agent
    must queue a task to fire a trusted event that uses the ErrorEvent
    interface, with the name error, that doesn't bubble and is cancelable,
    with its message, filename, lineno, colno, attributes initialised
    appropriately, and with the error attribute initialised to null,
    at the Worker object associated with the worker. If the event is
    not canceled, the user agent must act as if the uncaught runtime
    script error had occurred in the global scope that the Worker object
    is in, thus repeating the entire runtime script error reporting
    process one level up.

which is what we end up doing (though note that the spec is broken around this stuff right now; filed <https://github.com/whatwg/html/issues/1607>).  So the right answer if we want to follow the spec is presumably to stop converting load errors into exceptions and instead have a different mechanism for converting them into a single error event fired at the worker itself.

This is not a known issue in the sense that I knew about it; I assumed the general behavior was correct and simply refactored it to be a bit more explicit without really changing the "everything is converted to a JS exception" thing.  I do not have a concrete plan to fix it past what I described above.
Flags: needinfo?(bzbarsky)
I guess what we could do is that in CompileScriptRunnable::WorkerRun we could detect the case when rv is not a JS exception (which would correspond to some sort of load failure), and instead of putting it on the JSContext we could just directly post something to fire an error event on the worker.
Priority: -- → P2
(In reply to Andrea Marchesini [:baku] from comment #1)
> I'm happy to take this bug

Does your offer still stand given comments 2 and 3? :)
Flags: needinfo?(amarchesini)
Assignee: nobody → amarchesini
Flags: needinfo?(amarchesini)
Attached patch worker_sniff2.patch (obsolete) — Splinter Review
Attachment #8827556 - Flags: review?(bzbarsky)
Comment on attachment 8827556 [details] [diff] [review]
worker_sniff2.patch

>+      aMessage.Append(mMessage->mArgs.ElementAt(i));

That doesn't seem right.  The args are supposed to be interpolated into the string identified by the error code...

>+  MOZ_ASSERT(IsDOMException());

Why can you assert that?  We could just be some random nsresult, no?

>+    aMessage.Assign(NS_ConvertUTF8toUTF16(mDOMExceptionInfo->mMessage));

Did we decide we don't want to include the DOMException name (aka mDOMExceptionInfo->mRv) in this string?  Why not?

>+    aMessage.Assign(NS_ConvertUTF8toUTF16(mDOMExceptionInfo->mMessage));

  CopyUTF8toUTF16(mDOMExceptionInfo->mMessage, aMessage);

avoids some string copies there.

>+  aMessage.Assign(NS_ConvertUTF8toUTF16(mDOMExceptionInfo->mMessage));

This looks wrong.  Should't you use "message" somewhere here?

>+  // ErrorResult is containing an error (Failed()) and if it is not a JS
>+  // exception (!IsJSException()).

s/is containing an error (Failed())/is Failed()/

Also, we probably want to exclude IsJSContextException(), no?  I don't see how we can produce a sane string there.

>+  CreateAndDispatch(JSContext* aCx, WorkerPrivate* aWorkerPrivate,
>+                    ErrorResult& aRv)

aRv should be const.  And maybe named something different, because it's an _in_ parameter here, not the usual outparam to indicate failure.

>+    nsAutoString message;

Why bother, if we're just going to pass it to an nsString immediately?  Just make it nsString.

>+                             const nsAString& aMessage)

const nsString&, since we fully control the caller.

>+  PostDispatch(WorkerPrivate* aWorkerPrivate, bool aDispatchResult) override
>+    aWorkerPrivate->AssertIsOnWorkerThread();

I would think we would be on the parent thread there, no?

>+    if (aWorkerPrivate->GetParent()) {

I don't understand this test.  Why do we care whether we have a parent WorkerPrivate?

>+      AssertIsOnMainThread();

I don't see how we could be, in the GetParent() case.

>+      if (aWorkerPrivate->IsSharedWorker()) {

Don't see how it could be, in the GetParent() case.

>+      if (aWorkerPrivate->IsServiceWorker()) {

Likewise.

>+    RootedDictionary<ErrorEventInit> init(aCx);

No, this isn't right.  Again, spec (as of today; it's changed a bit) says:

  If the algorithm asynchronously completes with null, queue a task to fire an
  event named error at worker, and abort these steps.

where "fire an event" links to https://dom.spec.whatwg.org/#concept-event-fire which says to use the Event constructor in this case (because we didn't provide another one).  This should be constructed with default values for bubbles/cancelable, because nothing says to use non-default ones.

This clearly needs more tests for this last part.

>+    // reported after we return.  We want to propagate just JS exceptions,

I think you should have a comment about what exceptions are expected to be on the ErrorResult at this point and in which circumstances, and why we want to do whatever we do with each kind.
Attachment #8827556 - Flags: review?(bzbarsky) → review-
> >+  PostDispatch(WorkerPrivate* aWorkerPrivate, bool aDispatchResult) override
> >+    aWorkerPrivate->AssertIsOnWorkerThread();
> 
> I would think we would be on the parent thread there, no?

PostDispatch and PreDispatch run on the 'current' thread.

I applied all your comments but I don't understand what I have to do for the RootedDictionary. I hope my patch is fine.
I'll try to add some test in a separate patch. I didn't find a way to unify the format string use between JS and dom.
Attached patch worker_sniff2.patch (obsolete) — Splinter Review
Attachment #8827556 - Attachment is obsolete: true
Attachment #8828333 - Flags: review?(bzbarsky)
> PostDispatch and PreDispatch run on the 'current' thread.

Ah, I see.  OK.

> but I don't understand what I have to do for the RootedDictionary

The problem is not the RootedDictionary.  The problem is that you're creating the wrong sort of object to start with.  Per spec it should be an Event, not an ErrorEvent.

Which also means it shouldn't have any message, so maybe we shouldn't be worrying about that?  That would certainly simplify things.  But maybe we instead need to get the spec changed to fire an ErrorEvent so a message _can_ be included?

Do you know what other browsers do here?

I'm still not sure why, if we do include a message, we wouldn't include the exception name in it...
Attachment #8828333 - Flags: review?(bzbarsky) → review-
> Do you know what other browsers do here?

It seems that chrome doesn't support nosniff for workers. I wrote a test and it works (the worker is not loaded) in FF, but it doesn't work (the worker is executed) in Chrome. This test is a copy of the WPT. I cannot test safari nor edge.
Let's fix this bug in 2 steps: first we can release a patch with a simple Event (no message involved).
Then we can see if the spec will change. I have to file a spec bug...
Flags: needinfo?(bzbarsky)
OK, but what happens for, say, a 404?  That should end up on the same codepath, right?
And yes, just following the current spec sounds like a reasonable start.
Flags: needinfo?(bzbarsky)
Attachment #8828333 - Attachment is obsolete: true
Attachment #8829373 - Flags: review?(bzbarsky)
Comment on attachment 8829373 [details] [diff] [review]
worker_sniff.patch

I assume the shared/service worker bits here are correct...  Might be worth filing a followup to add tests for those.  Or spec links in the code here, or both.

That said, your code really doesn't seem to match the spec for shared workers, unless I'm missing something....  I haven't found the relevant pieces of the serviceworker spec.

>+    RootedDictionary<EventInit> init(aCx);

This doesn't actually need rooting.  You can just do:

  EventInit init;

Or even just pass a literal EventInit() as the third arg to Event::Constructor.  The reason it doesn't need rooting is that EventInit doesn't contain any gcthings.

>+    // because all the other errors are handled when the script is loaded.

Spec link would be a good idea here.

>+      rv.SuppressException();

That's ok, but don't we still want to return false here immediately after doing that?

>+      RootedDictionary<EventInit> eventInit(aCx);

Again, don't need to root that.

>+      eventInit.mCancelable = true;

Why?  Which spec are you implementing here?

r=me on the dedicated worker bits with the above nits.  Not sure about the rest of it...

Did you file any spec issues?
Attachment #8829373 - Flags: review?(bzbarsky) → review+
I had to fix several tests. Some of them are part of WPT. I need a feedback.
Attachment #8829853 - Flags: review?(bzbarsky)
Comment on attachment 8829853 [details] [diff] [review]
worker_sniff2.patch

Those wpt tests clearly don't test what the spec says to do...  So r=me for at least changing them to match the spec.

And maybe mention them in the spec issue, in case the spec changes?
Attachment #8829853 - Flags: review?(bzbarsky) → review+
Oh, and looks like we added that "instanceof ErrorEvent" thing ourselves, in bug 1218433 (but upstreamed via the wpt repo directly).  So it's totally your and sicking's thing.  ;)
Pushed by amarchesini@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/d8b8219c20d6
Better error reporting for network errors in workers, r=bz
https://hg.mozilla.org/integration/mozilla-inbound/rev/3361d527f683
Better error reporting for network errors in workers - WPT, r=bz
It's saying that we have a fundamentally broken lint in the tree.  I thought bug 1334297 was the extent of it, but clearly sometimes we run the lint even if MANIFEST.json didn't change?

What are those times when we run it?  Do we actually have to regenerate the SHA1s in the manifest any time we edit any WPT anything?
Flags: needinfo?(james)
Anyway, in practice this can be relanded if you update the hashes for the modified files in the manifest.  You can probably just copy the "correct" values from that failure log.  :(
I have a patch in bug 1334767 to fix the lint to not care about file hashes changing. Once that lands we will stop getting a lot of useless noise from the lint. In the meantime I don't mind if it gets hidden or whatever, although that does make it possible to land certain test changes that never get run (because the manifest gets out of date).
Flags: needinfo?(james)
Hidden, since we were about to back something else out over it.
The lint should now be fixed to be less annoying, so there's no reason not to go ahead and land this.
Pushed by amarchesini@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/7cf28e24da9d
Better error reporting for network errors in workers, r=bz
https://hg.mozilla.org/integration/mozilla-inbound/rev/11b1bbf8ba94
Better error reporting for network errors in workers - WPT, r=bz
https://hg.mozilla.org/mozilla-central/rev/7cf28e24da9d
https://hg.mozilla.org/mozilla-central/rev/11b1bbf8ba94
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla54
You need to log in before you can comment on or make changes to this bug.