Closed Bug 1249652 Opened 9 years ago Closed 9 years ago

Make the error propagation for ScriptExecutorRunnable saner

Categories

(Core :: DOM: Workers, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla47
Tracking Status
firefox47 --- fixed

People

(Reporter: bzbarsky, Assigned: bzbarsky)

References

Details

(Whiteboard: dom-triaged btpp-fixlater)

Attachments

(3 files)

The current setup goes as follows: 1) We enter LoadAllScripts in workers/ScriptLoader.cpp, with an ErrorResult& argument. 2) We create a ScriptLoaderRunnable which has an ErrorResult& member and dispatch it to the main thread. 3) This does some loading on the main thread, then creates a ScriptExecutorRunnable (or more than one sometimes) holding a reference to the ScriptLoaderRunnable and dispatches that back to the worker thread. 4) ScriptExecutorRunnable::WorkerRun runs the script. Note that it gets the JSContext* out of thin air (TLS). If execution succeeds, it sets mExecutionResult to true on the loadinfo. Otherwise it leaves it false, and leaves any exception thrown hanging out on aCx. 5) If this was the last ScriptExecutorRunnable for this ScriptLoaderRunnable, then in ScriptExecutorRunnable::PostRun we walk though all the loadinfos and check whether any of them has a false mExecutionResult. If it does, we set a boolean flag to false. Note that we completely ignore the boolean aRunResult passed to PostRun, because our WorkerRun never returns false. 6) We then call ShutdownScriptLoader, passing it a JSContext, the boolean we set in step 5, an nsresult that indicates what happend with the load of the last loadinfo that did not execute successfully and wasn't canceled. 7) ShutdownScriptLoader now takes the 3 "sources of error" it has: (boolean flag, nsresult, JSContext with exceptiom maybe pending) and converts them to two "sources of error": (JSContext, ErrorResult). This process is lossy: for example, if one of our scripts was terminated by the interrupt callback (worker shutting down?), we convert that from an uncatchable exception into an InvalidStateErr DOMException. 8) Now we unwind all the sync calls and stuff and return to binding code. This notices a failing ErrorResult, checks for exception on JSContext, and if it finds one there, throws it. Otherwise an exception is synthesized from the ErrorResult. This all seems really fragile to me, especially the totally non-local behavior of the exception on the JSContext. I would really like to move to a setup where we track at least all the JS stuff via the ErrorResult. Andrea, I guess the reason ScriptExecutorRunnable::PostRun walks through all the loadinfos instead of stopping at the first failing one is that we can end up in a situation where the load of the second script comes back from the server as failed so we cancel the first script, so hitting a failing loadinfo doesn't mean we have the right loadResult, right? Hence the walk to the first loadinfo with some other failing loadResult, assuming there is one?
Flags: needinfo?(amarchesini)
Er, no, a later load failure shouldn't affect earlier scripts per spec, afaict. What _is_ the point of that logic in PostRun?
Blocks: 1249673
Oh, for extra fun, for toplevel script loads (via CompileScriptRunnable and the equivalent for debugger scripts) we currently drop the ErrorResult on the floor, but convert a failure on it to a false return from WorkerRun, which then gets turned into a reported exception in PostRun because the exception is still hanging out on the JSContext. How awesome is that?
OK, talked to Andrea about that loop. That's trying to handle the case of us starting two loads, then newURI failing on the second load. That needs to kill off the first load (which it does, with NS_BINDING_ABORTED), but we want to report the failure from the newURI call, not the NS_BINDING_ABORTED. The spec has all the URI creations happen before anything else does, so this issue does not arise, but we're doing all this async in a weirdly interwingled way instead...
Flags: needinfo?(amarchesini)
Whiteboard: dom-triaged btpp-fixlater
Assignee: nobody → bzbarsky
Status: NEW → ASSIGNED
There is a bit of subtlety here with NS_BINDING_ABORTED. Before these changes, we would land in ReportLoadError, not do anything with NS_BINDING_ABORTED, and just return. If called from WorkerPrivate::Constructor we'd then go ahead and throw it on the ErrorResult, but I'm pretty sure we never ended up with NS_BINDING_ABORTED there. If called from ScriptExecutorRunnable::WorkerRun, we would proceed on to ScriptExecutorRunnable::PostRun and hence ShutdownScriptLoader where we would throw on the ErrorResult but NOT on the JSContext. Then we would unwind to our consumer and if that consumer was a toplevel script load we would suppress the exception on the ErrorResult. Otherwise we'd go ahead and throw the exception we ended up with to the caller. The upshot is that we used to not fire error events on a worker whose main script load was canceled with NS_BINDING_ABORTED. So we try to preserve that behavior explicitly for toplevel scripts.
Attachment #8721598 - Flags: review?(khuey)
Attachment #8721598 - Flags: review?(amarchesini)
Comment on attachment 8721597 [details] [diff] [review] part 2. ScriptExecutorRunnable::WorkerRun should immediately move JS exceptions to its ErrorResult instead of allowing them to linger on the JSContext Review of attachment 8721597 [details] [diff] [review]: ----------------------------------------------------------------- ::: dom/workers/ScriptLoader.cpp @@ +1891,5 @@ > + // 3) mScriptLoader.mRv succeeded and aLoadResult is also success. As far > + // as I can tell, this can only happen when loading the main worker > + // script and GetOrCreateGlobalScope() fails or if > + // ScriptExecutorRunnable::Cancel got called. Does it matter what we > + // throw in this case? I'm not sure... Maybe not ... there's no script running on this thread to observe it, so the only question is whether we propagate that error to Worker.onerror on the parent thread. ::: dom/workers/WorkerRunnable.h @@ +218,5 @@ > DispatchInternal() override; > }; > > // This runnable is identical to WorkerSyncRunnable except it is meant to be > +// created and dispatched on the main thread only. Its WorkerRun/PostRun will grammar nit: created on and dispatched from
Attachment #8721597 - Flags: review?(khuey) → review+
Attachment #8721597 - Flags: review?(amarchesini) → review+
Comment on attachment 8721598 [details] [diff] [review] part 3. Simplify way we handle canceling when ScriptLoaderRunnable::RunInternal fails by canceling things with its actual failure code, so we don't have to guess which failed loads are actual failures and which are just canceled via this mechanism Review of attachment 8721598 [details] [diff] [review]: ----------------------------------------------------------------- ::: dom/workers/ScriptLoader.cpp @@ +569,5 @@ > { > AssertIsOnMainThread(); > > + nsresult rv = RunInternal(); > + if (NS_FAILED(rv)) { NS_WARN_IF
Attachment #8721598 - Flags: review?(amarchesini) → review+
> so the only question is whether we propagate that error to Worker.onerror on the parent thread. I'm not sure how to test this; any ideas? Certainly either nothing in our test covers this case or I did not change behavior. Will address the rest of the comments.
Depends on: 1270783
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: