Baldr: implement async WebAssembly.compile

RESOLVED FIXED

Status

()

defect
RESOLVED FIXED
3 years ago
3 years ago

People

(Reporter: luke, Assigned: luke)

Tracking

Trunk
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox50 affected)

Details

Attachments

(11 attachments, 10 obsolete attachments)

2.52 KB, patch
jandem
: review+
Details | Diff | Splinter Review
3.77 KB, patch
sunfish
: review+
Details | Diff | Splinter Review
23.53 KB, patch
till
: review+
Details | Diff | Splinter Review
15.17 KB, patch
till
: review+
Details | Diff | Splinter Review
23.44 KB, patch
till
: review+
Details | Diff | Splinter Review
2.17 KB, patch
till
: review+
Details | Diff | Splinter Review
5.54 KB, patch
bkelly
: review+
Details | Diff | Splinter Review
7.59 KB, patch
bkelly
: review-
Details | Diff | Splinter Review
2.59 KB, patch
till
: review+
Details | Diff | Splinter Review
7.72 KB, patch
bkelly
: review+
Details | Diff | Splinter Review
1.25 KB, patch
till
: review+
Details | Diff | Splinter Review
Mostly, the challenge is to remove the 'cx' argument from wasm::Compile (and thus the entire compiler pipeline) so that wasm::Compile can be executed without any ExclusiveContext (avoiding the overhead of off-main-thread script parsing).  Furthermore, bug 1276029 will want to run wasm::Compile from a random (non-JS-helper) thread when deserializing a module whose machine code is invalid, so ideally we won't even have a JSRuntime.
Posted patch no-cx-compilation (obsolete) — Splinter Review
This removes the CompileRuntime for wasm's JitContext. In general, most of what CompileRuntime does is provide details about the JSRuntime to bake into code, but (due to serialization constraints) wasm doesn't want to do that.
Attachment #8767321 - Flags: review?(jdemooij)
Posted patch rm-compile-runtime (obsolete) — Splinter Review
Oops, wrong patch.
Attachment #8767321 - Attachment is obsolete: true
Attachment #8767321 - Flags: review?(jdemooij)
Attachment #8767322 - Flags: review?(jdemooij)
Posted patch rm-compile-runtime (obsolete) — Splinter Review
... and this patch is able to transitively remove a little more.  This disables trace logging for the moment, perhaps we can reenable after more is in place so we can see how things are supposed to fit together.
Attachment #8767322 - Attachment is obsolete: true
Attachment #8767322 - Flags: review?(jdemooij)
Attachment #8767327 - Flags: review?(jdemooij)
Posted patch empty-jcx (obsolete) — Splinter Review
The main compiler thread (on which ModuleGenerator lives) doesn't even need an allocator; it doesn't do any regalloc, only merging and some basic patching, so its JitContext can be completely empty (but still required to be present for various asserts).
Attachment #8767331 - Flags: review?(jdemooij)
Attachment #8767331 - Attachment description: rm-jcx → empty-jcx
Attachment #8767331 - Attachment filename: rm-jcx → empty-jcx
Another use of cx is to get to rt->canuseOffthreadIonCompilation().  Given that there are already a few other ways to disable parallel compilation for shell testing/fuzzing (--thread-count, --no-threads) and javascript.options.ion.offthread_compilation is more aimed at triaging JS JIT compilation bugs where compilation overlaps execution (unlike wasm, where there is an join before compilation completes), I think it's not worth trying to pipe this setting around.

On a side note, I noticed that JS_SetOffthreadIonCompilationEnabled() is only called for the main-thread JSRuntime, not workers.  In the future, perhaps a better place for some of these flags would be on HelperThreadState() so they could be accessed from anywhere and simultaneously disable all runtimes.
Attachment #8767364 - Flags: review?(jdemooij)
Depends on: 1284056
Attachment #8767327 - Attachment is obsolete: true
Attachment #8767327 - Flags: review?(jdemooij)
Attachment #8767331 - Attachment is obsolete: true
Attachment #8767331 - Flags: review?(jdemooij)
Attachment #8767364 - Flags: review?(jdemooij)
Attachment #8767364 - Attachment is obsolete: true
(Moving pre-req work to bug 1284056.)
Posted patch WIP (obsolete) — Splinter Review
Here's a WIP patch.  Still needed are:
 - a new WasmCompileTask that executes wasm::Compile off the main thread
 - a Gecko mechanism for, when the wasm::Compile task finishes, getting back to the main thread
I find myself wanting this somewhat often.
Attachment #8780776 - Flags: review?(jdemooij)
Posted patch tweak-kindSplinter Review
Tiny tweak, makes a later patch a bit nicer.
Attachment #8780777 - Flags: review?(sunfish)
Posted patch wasm-compileSplinter Review
This patch adds the basic WebAssembly.compile method that returns a promise that is already fulfilled before returning.  The subsequent patches will iterate on this towards being fully concurrent.

The patch also generalizes the 'filename' that is passed around to 'ScriptedCaller' which is file, line and column for the benefit of promise rejection.
Attachment #8780778 - Flags: review?(till)
This patch adds the JSAPI for allowing the SM to dispatch a runnable to a JSContext's owner thread.  The patch only contains a shell implementation; later patches will add main thread and browser.  This API is currently only used to dispatch from the JSContext's owner thread, but the next patch will dispatch from a helper thread.

The intention of the JSAPI is that it allows:
 - worker callbacks to create a WorkerHolder object to keep the worker alive until it's time to dispatch back
 - main thread callbacks to let dispatch fail (during shutdown)
but those will be in separate patches.
Attachment #8780782 - Flags: review?(till)
This patch performs the compilation on a helper thread and uses the JSAPI added in the last patch to get back to the JSContext owner thread afterwards.
Attachment #8780783 - Flags: review?(till)
Attachment #8772152 - Attachment is obsolete: true
Comment on attachment 8780778 [details] [diff] [review]
wasm-compile

Review of attachment 8780778 [details] [diff] [review]:
-----------------------------------------------------------------

Looks great. r=me with feedback addressed.

::: js/src/asmjs/WasmJS.cpp
@@ +208,5 @@
> +static bool
> +DescribeScriptedCaller(JSContext* cx, ScriptedCaller* scriptedCaller)
> +{
> +    JS::AutoFilename af;
> +    if (DescribeScriptedCaller(cx, &af, &scriptedCaller->line, &scriptedCaller->column)) {

Is the intent here to hide failure of the inner DescribeScriptedCaller call? Otherwise, there should be an `else return false` after this block, no? Or is the only possible failure condition here OOM, and we're ok with masking that? If so, please add a comment to that effect.

@@ +1172,5 @@
> +    if (!errorObj)
> +        return false;
> +
> +    RootedValue rejectionValue(cx, ObjectValue(*errorObj));
> +    return JS::RejectPromise(cx, promise, rejectionValue);

Given that you have the promise as a PromiseObject* here anyway, you might as well use promise->reject(cx, rejectionValue). Same for the resolve/reject

@@ +1210,5 @@
> +        RootedValue rejectionValue(cx);
> +        if (!cx->getPendingException(&rejectionValue))
> +            return false;
> +
> +        cx->clearPendingException();

Nit: could use js::GetAndClearException here.
Attachment #8780778 - Flags: review?(till) → review+
Comment on attachment 8780782 [details] [diff] [review]
jsapi-addition

Review of attachment 8780782 [details] [diff] [review]:
-----------------------------------------------------------------

r=me
Attachment #8780782 - Flags: review?(till) → review+
Comment on attachment 8780783 [details] [diff] [review]
use-helper-thread

Review of attachment 8780783 [details] [diff] [review]:
-----------------------------------------------------------------

Nice. r=me with or without feedback addressed.

::: js/src/asmjs/WasmJS.cpp
@@ +1187,5 @@
>      RootedValue resolutionValue(cx, ObjectValue(*moduleObj));
>      return JS::ResolvePromise(cx, promise, resolutionValue);
>  }
>  
> +struct CompilePromiseTask : PromiseTask

Does this need to have "Promise" in its name? It reads a bit confusing ("wait, this compiles promises?"), and it seems like the API contract makes it eminently clear that it uses a Promise.
Attachment #8780783 - Flags: review?(till) → review+
Comment on attachment 8780777 [details] [diff] [review]
tweak-kind

Review of attachment 8780777 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/asmjs/WasmCode.h
@@ +418,5 @@
>      MemoryUsage           memoryUsage;
>      uint32_t              minMemoryLength;
>      uint32_t              maxMemoryLength;
>  
> +    MetadataCacheablePod(ModuleKind kind) {

This constructor can use the explicit keyword.

@@ +440,5 @@
>  };
>  
>  struct Metadata : ShareableBase<Metadata>, MetadataCacheablePod
>  {
> +    Metadata(ModuleKind kind = ModuleKind::Wasm) : MetadataCacheablePod(kind) {}

This ctor can use the explicit keyword.
Attachment #8780777 - Flags: review?(sunfish) → review+
Comment on attachment 8780776 [details] [diff] [review]
unique-chars-string

Review of attachment 8780776 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/vm/String.cpp
@@ +1270,5 @@
>  template JSFlatString*
>  NewStringCopyNDontDeflate<NoGC>(ExclusiveContext* cx, const Latin1Char* s, size_t n);
>  
> +JSFlatString*
> +NewLatin1String(ExclusiveContext* cx, UniqueChars chars)

Maybe add a Z suffix, as in NewStringCopy{N,Z} ? NewString takes the length explicitly.
Attachment #8780776 - Flags: review?(jdemooij) → review+
(In reply to Till Schneidereit [:till] from comment #13)
> Is the intent here to hide failure of the inner DescribeScriptedCaller call?

Bizarrely, JS::DescribeScriptedCaller's return bool indicates only whether a scripted caller was returned (an error is never thrown).  Definitely deserves a comment, though, good point.
(In reply to Till Schneidereit [:till] from comment #15)
> Does this need to have "Promise" in its name? It reads a bit confusing
> ("wait, this compiles promises?"), and it seems like the API contract makes
> it eminently clear that it uses a Promise.

Agreed
Pushed by lwagner@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/532bf50ab2ae
Add NewLatin1StringZ (r=jandem)
https://hg.mozilla.org/integration/mozilla-inbound/rev/a96770350e80
Baldr: tidy how ModuleKind is set (r=sunfish)
https://hg.mozilla.org/integration/mozilla-inbound/rev/efd8f92aa5eb
Baldr: add initial main-thread WebAssembly.compile (r=till)
Pushed by lwagner@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/00ff86931c76
Add JSAPI for dispatching from helper threads back to a JSContext's owner thread (r=till)
https://hg.mozilla.org/integration/mozilla-inbound/rev/37a438d1a37d
Baldr: dispatch WebAssembly.compile tasks to a helper thread (r=till)
Add a bit of unwrapping so we can compile cross-global buffers.
Attachment #8782176 - Flags: review?(till)
Posted patch main-thread-compile (obsolete) — Splinter Review
This patch implements the two callbacks described in this just-added JSAPI:
  http://hg.mozilla.org/integration/mozilla-inbound/file/37a438d1a37d/js/src/jsapi.h#l4511
for the main-thread JSContext.
Attachment #8782177 - Flags: review?(bkelly)
Posted patch worker-compile (obsolete) — Splinter Review
And this patch implements them for worker JSContexts.
Attachment #8782178 - Flags: review?(bkelly)
Whiteboard: [leave open]
Comment on attachment 8782176 [details] [diff] [review]
unwrap-in-compile

Review of attachment 8782176 [details] [diff] [review]:
-----------------------------------------------------------------

r=me
Attachment #8782176 - Flags: review?(till) → review+
Comment on attachment 8782177 [details] [diff] [review]
main-thread-compile

Review of attachment 8782177 [details] [diff] [review]:
-----------------------------------------------------------------

Overall this looks reasonable, but I don't really understand the memory management here.  It seems AsyncTask is non-ref-counted object, but its ownership is not well defined.  Could we refactor this to pass a UniquePtr<AsyncTask, JS::FreePolicy> around instead of just AsyncTask*?  That would make it clearer who is responsible for deleting the object.

r- for now until I better understand the ownership system.

::: dom/base/nsJSEnvironment.cpp
@@ +2359,5 @@
>    return asmjscache::OpenEntryForWrite(principal, aInstalled, aBegin, aEnd,
>                                         aSize, aMemory, aHandle);
>  }
>  
> +#ifdef SPIDERMONKEY_PROMISE

Why is this dependent on SPIDERMONKEY_PROMISE?  It seems the test does not check if sm_promise is enabled.

@@ +2360,5 @@
>                                         aSize, aMemory, aHandle);
>  }
>  
> +#ifdef SPIDERMONKEY_PROMISE
> +class AsyncTaskRunnable final : public CancelableRunnable

Runnables dispatched to the main thread are not canceled AFAIK.  I think you can just inherit from Runnable here.

@@ +2364,5 @@
> +class AsyncTaskRunnable final : public CancelableRunnable
> +{
> +public:
> +  explicit AsyncTaskRunnable(JS::AsyncTask* aTask)
> +    : mTask(aTask)

MOZ_ASSERT(mTask)

@@ +2368,5 @@
> +    : mTask(aTask)
> +  {
> +  }
> +
> +  virtual ~AsyncTaskRunnable()

The destructor should be private and non-virtual.  It will be called on the Runnable base class through its Release() method.  No one should ever be able to call delete externally on a ref-counted class like this.

@@ +2377,5 @@
> +  NS_IMETHOD Run() override
> +  {
> +    MOZ_ASSERT(sContext == mTask->user);
> +    AutoJSAPI jsapi;
> +    jsapi.Init();

What is the AutoJSAPI here for?  AFAICT its not actually referenced.

@@ +2382,5 @@
> +    mTask->finish(sContext);
> +    return NS_OK;
> +  }
> +
> +  nsresult Cancel() override

And you can remove the Cancel() method.

@@ +2410,5 @@
> +  // AsyncTasks can finish during shutdown so cannot simply
> +  // NS_DispatchToMainThread.
> +  nsCOMPtr<nsIThread> mainThread = do_GetMainThread();
> +  if (!mainThread) {
> +    return false;

Returning false here leaks the AsyncTask, no?

@@ +2413,5 @@
> +  if (!mainThread) {
> +    return false;
> +  }
> +
> +  RefPtr<AsyncTaskRunnable> r = new AsyncTaskRunnable(aTask);

Our default new operator will crash if we exhaust memory.  So you do not need to check for returned nullptr.

@@ +2418,5 @@
> +  if (!r) {
> +    return false;
> +  }
> +
> +  mainThread->Dispatch(r.forget(), NS_DISPATCH_NORMAL);

I think it would be easier to just write this method like this:

  nsCOMPtr<nsIRunnable> r = new AsyncTaskRunnable(aTask);
  return NS_SUCCEEDED(NS_DispatchToMainThread(r.forget()));

@@ +2459,5 @@
>    };
>    JS::SetAsmJSCacheOps(sContext, &asmJSCacheOps);
>  
> +#ifdef SPIDERMONKEY_PROMISE
> +  JS::SetAsyncTaskCallbacks(sContext, StartAsyncTaskCallback, FinishAsyncTaskCallback);

Where is this function defined?  I don't see it in any of the patches.
Attachment #8782177 - Flags: review?(bkelly) → review-
Comment on attachment 8782178 [details] [diff] [review]
worker-compile

Review of attachment 8782178 [details] [diff] [review]:
-----------------------------------------------------------------

Same memory ownership comments as the main thread patch.  Also, I don't think you are using control runnables quite right here.  You need to use the control runnable's Dispatch() method.

::: dom/promise/tests/test_webassembly_compile.html
@@ +87,5 @@
> +function terminateCompileInWorker() {
> +    var w = new Worker(`data:text/plain,
> +      var fooModuleCode;
> +      function spawnWork() {
> +        const N = 100;

Why have a limit here?  This creates the possibility that the test completes the compilation before the terminate() is called.  We don't have any guarantees about ordering.  I think it would be better to infinitely loop.  Is that possible here?

::: dom/workers/RuntimeService.cpp
@@ +724,5 @@
> +
> +public:
> +  AsyncTaskRunnable(WorkerPrivate* aWorker, JS::AsyncTask* aTask)
> +    : WorkerControlRunnable(aWorker, WorkerThreadUnchangedBusyCount),
> +      mTask(aTask)

MOZ_ASSERT(mTask)

@@ +730,5 @@
> +
> +  void Dispatch()
> +  {
> +    RefPtr<WorkerControlRunnable> self(this);
> +    MOZ_ALWAYS_SUCCEEDS(Worker()->Dispatch(self.forget()));

Why do you need to override Dispatch()?  This loses the benefits of using a control runnable since this Dispatch() can fail.  I think you just want to use the Dispatch() from WorkerControlRunnable.

If this is to make sure you are on the WorkerHolder's worker thread, I think you should instead do something like:

MOZ_ASSERT(WorkerHolder::mWorkerPrivate == WorkerControlRunnable::mWorkerPrivate);

@@ +739,5 @@
> +    MOZ_ASSERT(aWorkerPrivate == Worker());
> +    MOZ_ASSERT(aCx == Worker()->GetJSContext());
> +
> +    AutoJSAPI jsapi;
> +    jsapi.Init();

Again, why is there a AutoJSAPI here when we use a different JSContext?  I thought we only needed this if we needed a JSContext.

@@ +756,5 @@
> +  }
> +
> +  bool Notify(Status aStatus) override
> +  {
> +    return true;

Add a comment indicating that the process is guaranteed to complete in bounded time and there is not a clean way to cancel it.

@@ +785,5 @@
> +  RefPtr<AsyncTaskRunnable> runnable =
> +    already_AddRefed<AsyncTaskRunnable>(
> +      static_cast<AsyncTaskRunnable*>(aTask->user));
> +
> +  runnable->Dispatch();

MOZ_ALWAYS_TRUE(runnable->Dispatch());
Attachment #8782178 - Flags: review?(bkelly) → review-
(In reply to Ben Kelly [:bkelly] from comment #27)
Thanks!

> It seems AsyncTask is non-ref-counted object, but its
> ownership is not well defined.  Could we refactor this to pass a
> UniquePtr<AsyncTask, JS::FreePolicy> around instead of just AsyncTask*? 
> That would make it clearer who is responsible for deleting the object.

AsyncTask is fully owned by SM (allocated and deleted) which is why there is no UniquePtr (and no delete calls anywhere for the JS::AsyncTask).  Now, to implement this ownership, SM requires Gecko to implement the contract described in jsapi.h (calling finish()/cancel()), but that's just a simple contract with no ownership.  This is actually necessary for SM to ensure some internal destroy-on-the-right-thread invariants and so the lifetime isn't something Gecko can control.

> > +#ifdef SPIDERMONKEY_PROMISE
> 
> Why is this dependent on SPIDERMONKEY_PROMISE?  It seems the test does not
> check if sm_promise is enabled.

I did this because the promise hooks in CycleCollectedJSRuntime::Initialize were also #ifdef SPIDERMONKEY_PROMISE.  However, I guess that isn't strictly necessary: if SPIDERMONKEY_PROMISE is not defined, there is a SM-internal #ifdef which will simply not install WebAssembly.compile.  So I can remove these.

> > +    AutoJSAPI jsapi;
> > +    jsapi.Init();
> 
> What is the AutoJSAPI here for?  AFAICT its not actually referenced.

IIUC, that's just the rule: you have to have an AutoJSAPI on the stack before you call any JSAPI (see AutoJSAPI comment).  Practically, it includes the JSAutoRequest which is necessary to avoid asserts inside SM.

> > +  nsCOMPtr<nsIThread> mainThread = do_GetMainThread();
> > +  if (!mainThread) {
> > +    return false;
> 
> Returning false here leaks the AsyncTask, no?

This is a well-defined case according to the contract described in jsapi.h.  SM explicitly expects this case and does its own internal magic to make sure the AsyncTask is (eventually) destroyed on the right thread.

> > +  mainThread->Dispatch(r.forget(), NS_DISPATCH_NORMAL);
> 
> I think it would be easier to just write this method like this:
> 
>   nsCOMPtr<nsIRunnable> r = new AsyncTaskRunnable(aTask);
>   return NS_SUCCEEDED(NS_DispatchToMainThread(r.forget()));

I'm writing it this way to handle shutdown: apparently it's not valid to call NS_DispatchToMainThread() late in shutdown (and this callback can be called very late (as late as the JS_ShutDown() call: http://searchfox.org/mozilla-central/source/xpcom/build/XPCOMInit.cpp#1007).  This is what nfroyd suggested, pointing to:
  https://dxr.mozilla.org/mozilla-central/source/dom/promise/Promise.cpp#2598

> Where is this function defined?  I don't see it in any of the patches.

Sorry, I merged the two setters into one function.  They've landed now so I'd just read jsapi.h on m-i:
  http://hg.mozilla.org/integration/mozilla-inbound/file/tip/js/src/jsapi.h#l4560
(In reply to Ben Kelly [:bkelly] from comment #28)
Thanks again, new patches up in a second after I recompile and rerun tests.

> Why have a limit here?  This creates the possibility that the test completes
> the compilation before the terminate() is called.  We don't have any
> guarantees about ordering.  I think it would be better to infinitely loop. 
> Is that possible here?

This is actually an iloop: see the `.then(spawnWork)` at the end which does the whole thing again if it manages to finish.  The reason I didn't just do an infinite spawn cycle is that I was worried the might cause various kinds of OOM that would make the test less reliable.  This scheme bounds the number of outstanding promises.  

> Why do you need to override Dispatch()?

Oh hah, that's actually just a coincidence: I just randomly picked this name as a helper method; I didn't know there was already a WorkerControlRunnable::Dispatch() :)

> Again, why is there a AutoJSAPI here when we use a different JSContext?  I
> thought we only needed this if we needed a JSContext.

Same answer as above.  Maybe you're thinking of AutoSafeJSContext?
Updated with comments addressed.  Hopefully my explanation above clears up the ownership question.
Attachment #8782177 - Attachment is obsolete: true
Attachment #8782708 - Flags: review?(bkelly)
Posted patch worker-compile (obsolete) — Splinter Review
Updated with comments addressed.
Attachment #8782178 - Attachment is obsolete: true
Attachment #8782709 - Flags: review?(bkelly)
Comment on attachment 8782708 [details] [diff] [review]
main-thread-compile

Review of attachment 8782708 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks!  r=me with comments addressed.

::: dom/base/nsJSEnvironment.cpp
@@ +2371,5 @@
> +
> +protected:
> +  NS_IMETHOD Run() override
> +  {
> +    MOZ_ASSERT(sContext == mTask->user);

Please add a MOZ_ASSERT(NS_IsMainThread()) since this is only intended to be run from main thread.

@@ +2374,5 @@
> +  {
> +    MOZ_ASSERT(sContext == mTask->user);
> +    AutoJSAPI jsapi;
> +    jsapi.Init();
> +    mTask->finish(sContext);

It seems mTask may be a garbage pointer after this call.  I think we should note this in a comment and clear mTask.

We should probably also MOZ_ASSERT(!mTask) in the destructor to ensure that we always pass effective ownership back to the js engine.

@@ +2401,5 @@
> +    return false;
> +  }
> +
> +  RefPtr<AsyncTaskRunnable> r = new AsyncTaskRunnable(aTask);
> +  mainThread->Dispatch(r.forget(), NS_DISPATCH_NORMAL);

Please wrap this in MOZ_ALWAYS_SUCCEEDS().
Attachment #8782708 - Flags: review?(bkelly) → review+
Comment on attachment 8782709 [details] [diff] [review]
worker-compile

Review of attachment 8782709 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with comments addressed.

::: dom/workers/RuntimeService.cpp
@@ +736,5 @@
> +
> +    AutoJSAPI jsapi;
> +    jsapi.Init();
> +
> +    mTask->finish(Worker()->GetJSContext());

As with the other patch, mTask may be garbage after this call, right?  Lets clear it.

In this case, though, we might get destroyed without calling Run() or Cancel() if the HoldWorker() call fails.  So we probably need this in the destructor:

MOZ_ASSERT_IF(WorkerHolder::mWorkerPrivate, !mTask);

And I guess we should MOZ_ASSERT(mTask) in run/cancel here and in the other patch if we are clearing it.  Just to verify we don't call run twice, etc.

@@ +742,5 @@
> +  }
> +
> +  nsresult Cancel() override
> +  {
> +    AutoJSAPI jsapi;

We should probably do the MOZ_ASSERT(aWorkerPrivate == Worker()) here as well.

@@ +746,5 @@
> +    AutoJSAPI jsapi;
> +    jsapi.Init();
> +
> +    mTask->cancel(Worker()->GetJSContext());
> +    return WorkerRunnable::Cancel();

Maybe add a comment here that you are explicitly opting out of the WorkerControlRunnable::Cancel() since you don't want to its default behavior.  Here you want to call mTask->cancel() instead of triggering the same ::Run() method that the control runnable would normally do.  At first glance I thought this was a typo.

@@ +777,5 @@
> +
> +static bool
> +FinishAsyncTaskCallback(JS::AsyncTask* aTask)
> +{
> +  // Match the forget().take() in StartAsyncTaskCallback.

What thread does this run on?  Can we assert it or at least write a comment to provide context?
Attachment #8782709 - Flags: review?(bkelly) → review+
(In reply to Ben Kelly [:bkelly] from comment #35)
Thanks for the reviews!

> > +  nsresult Cancel() override
> 
> We should probably do the MOZ_ASSERT(aWorkerPrivate == Worker()) here as
> well.

Can't: no aWorkerPrivate argument :)

> > +FinishAsyncTaskCallback(JS::AsyncTask* aTask)
> 
> What thread does this run on?  Can we assert it or at least write a comment
> to provide context?

Either worker thread or random JS-internal helper thread (so can't even MOZ_ASSERT(!worker thread)).  Will comment.
Pushed by lwagner@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/8cda3478ffd4
Baldr: handle cross-global typed arrays in JS API (r=till)
https://hg.mozilla.org/integration/mozilla-inbound/rev/694300732586
Implement AsyncTask hooks for main thread JSContext (r=bkelly)
https://hg.mozilla.org/integration/mozilla-inbound/rev/b6cf6d7a95f9
Implement AsyncTask hooks for worker thread JSContexts (r=bkelly)
Whiteboard: [leave open]
Unfortunately, it seems like 1 particular run failed in the worker test I added (not the one that terminate()s either, just the plain "does it work" test):
  https://archive.mozilla.org/pub/firefox/tinderbox-builds/mozilla-inbound-win32/1471635194/mozilla-inbound_win7_ix_test-mochitest-e10s-3-bm111-tests1-windows-build373.txt.gz

So I'll backout just the last cset and see if that's the only failure which could help isolate whether this is a worker-specific problem or just a rare race:
  https://hg.mozilla.org/integration/mozilla-inbound/rev/5107e97e2f02

In case anyone has dealt with this before, the error is "TypeError: fulfillReactions is null at data" which seems to come from within the self-hosted Promise.js impl and, scanning the code, seems like it'd happen if a promise gets rejected or fulfilled more than once.  I'll have to poke around more next week.
Whiteboard: [leave open]
Investigating this with try server logging over the weekend, it appears that the problem is indeed just with the worker impl here: I'm using a WorkerControlRunnable to dispatch to the worker thread and, unlike normal WorkerRunnables, the control runnables *interrupt* what is currently running on the worker thread leading to obvious race conditions like we have here.  To confirm, switching to a WorkerRunnable makes the patch green over 10 runs.

The only challenge is that we need a reliable way to release our WorkerHolder on the worker thread and WorkerRunnables will fail to dispatch to the worker thread during worker shutdown.  So it seems like either:
 1. we allow WorkerRunnables to take the Status that gates their dispatch (similar to HoldWorker())
 2. we use something like NS_ReleaseOnMainThread() except for the WorkerThread (maybe *that* could be a WorkerControlRunnable)

Of course open to other options too.
Flags: needinfo?(bkelly)
The correct way to do this right now is to:

1) Try to dispatch a WorkerRunnable.
2) If that fails, then dispatch a WorkerControlRunnable (since this works during shutdown).

You can't just do (2) because it bypasses the event loop ordering.  You can't just do (1), because that fails during worker thread shutdown.

For an example, see:

https://dxr.mozilla.org/mozilla-central/source/dom/fetch/Fetch.cpp#413
Flags: needinfo?(bkelly)
And yes, it would be nice if we had a more convenient way to perform this kind of operation.  I suggest filing a follow-up bug to add an API.
Righto, this redoes the patch to be more like the fetch impl, separating out the WorkerHolder object from the Worker(Control)Runnable objects which share a base class.
Attachment #8782709 - Attachment is obsolete: true
Attachment #8783692 - Flags: review?(bkelly)
While reading logs I noticed that we were (misleadingly, but harmlessly) reporting out-of-memory in some cases when attempting to start an async task during shutdown.  I think the better and symmetric (with finishAsyncTaskCallback failure) thing to do is silently drop the error since, on shutdown, promise handlers aren't necessarily called.
Attachment #8783696 - Flags: review?(till)
Comment on attachment 8783696 [details] [diff] [review]
dont-report-start-failure

Review of attachment 8783696 [details] [diff] [review]:
-----------------------------------------------------------------

Makes sense.
Attachment #8783696 - Flags: review?(till) → review+
Comment on attachment 8783692 [details] [diff] [review]
worker-compile v.2

Review of attachment 8783692 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for reworking this for me!  This is mostly correct, but we need to be a bit more careful where the holder ref-counted object is destroyed.

::: dom/workers/RuntimeService.cpp
@@ +699,5 @@
>                                         aSize, aMemory, aHandle);
>  }
>  
> +class AsyncTaskWorkerHolder final
> +  : public RefCounted<AsyncTaskWorkerHolder>, public WorkerHolder

RefCounted<> is legacy and should not be used in new code.  Please see NS_INLINE_DECL_REFCOUNTING instead.  It is also safer since its assertions would have caught the cross-thread refcounting use in this patch.

@@ +771,5 @@
> +    AutoJSAPI jsapi;
> +    jsapi.Init();
> +
> +    mTask->finish(mWorkerPrivate->GetJSContext());
> +    mTask = nullptr;  // mTask may delete itself

You must explicitly clear mHolder here. The destructor may run on the dispatching thread due to races.

@@ +784,5 @@
> +    AutoJSAPI jsapi;
> +    jsapi.Init();
> +
> +    mTask->cancel(mWorkerPrivate->GetJSContext());
> +    mTask = nullptr;  // mTask may delete itself

You must explicitly clear mHolder here. The destructor may run on the dispatching thread due to races.

@@ +811,5 @@
> +
> +  bool WorkerRun(JSContext* aCx, WorkerPrivate* aWorkerPrivate) override
> +  {
> +    // Nothing to do but release the AsyncTaskWorkerHolder in the destructor.
> +    // See comment in FinishAsyncTaskCallback.

You must explicitly clear mHolder here.  The destructor may run on the dispatching thread due to races.

@@ +849,5 @@
> +  // report failure back to the JS engine but make sure to release the
> +  // WorkerHolder on the worker thread using a control runnable. Control
> +  // runables aren't suitable for calling AsyncTask::finish() since they are run
> +  // via the interrupt callback which breaks JS run-to-completion.
> +  RefPtr<AsyncTaskRunnable> r = new AsyncTaskRunnable(holder, aTask);

Sorry, but I don't think this is thread safe.

1) You are passing the holder ref-counted object to the runnable to get AddRef()'d on this other thread.  Your ref-counting is not declared atomic or threadsafe.
2) Even if the ref-counting was threadsafe you would then have a race as to where the destructor runs.  Its possible for the worker thread to complete before the holder stack variable is destructed here.

I think you need to make AsyncTaskRunnable() take an already_AddRefed<AsyncTaskWorkerHolder>.  So you would then do:

  RefPtr<AsyncTaskRunnable> r = new AsyncTaskRunnable(holder.forget(), aTask);

And then if it fails to dispatch you need a StealHolder() call to get the ref out of there:

  RefPtr<AsyncTaskControlRunnable> cr = new AsyncTaskControlRunnable(r->StealHolder())

And then you need to clear mHolder explicitly on the target worker thread.  Again, the stack RefPtr<> for the runnable here may not be destroyed until after the worker thread runs.
Attachment #8783692 - Flags: review?(bkelly) → review-
Oof, good catch.  This reminds me why I don't like ref-counting to express mostly-single-ownership.  I should've just used a UniquePtr which then makes this all a lot more explicit instead of reasoning about when the last ref is dropped.
Posted patch worker-compile v.2.1 (obsolete) — Splinter Review
Attachment #8784122 - Flags: review?(bkelly)
Oops, forgot to qref.
Attachment #8784122 - Attachment is obsolete: true
Attachment #8784122 - Flags: review?(bkelly)
Attachment #8784126 - Flags: review?(bkelly)
Comment on attachment 8784126 [details] [diff] [review]
worker-compile v.2.1

Review of attachment 8784126 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good.  Thanks!

::: dom/workers/RuntimeService.cpp
@@ +760,5 @@
> +  }
> +
> +  void DestroyHolder()
> +  {
> +    mHolder.reset();

nit: You could MOZ_ASSERT(mHolder) here since it shouldn't be called twice.

@@ +820,5 @@
> +
> +    return WorkerRunnable::Cancel();
> +  }
> +
> +  bool Dispatch()

override

@@ +853,5 @@
> +{
> +  WorkerPrivate* worker = GetWorkerPrivateFromContext(aCx);
> +  worker->AssertIsOnWorkerThread();
> +
> +  auto runnable = MakeUnique<AsyncTaskWorkerHolder>();

nit: runnable is not a good variable name here any more.

@@ +858,5 @@
> +  if (!runnable->HoldWorker(worker, Status::Closing)) {
> +    return false;
> +  }
> +
> +  // Matched by an already_AddRefed in FinishAsyncTaskCallback which, by

nit: This comment needs to be updated since ref-counting isn't used any more.
Attachment #8784126 - Flags: review?(bkelly) → review+
(In reply to Ben Kelly [:bkelly] from comment #50)
> > +  bool Dispatch()
> 
> override

I would, but the base method isn't virtual so compiler rejects.
(In reply to Luke Wagner [:luke] from comment #51)
> I would, but the base method isn't virtual so compiler rejects.

Actually, I think the right thing to do here is to throw away mTask in PostDispatch() if aDispatchResult is false.  That method is virtual and intended to be overriden for this sort of thing.

Can you do that instead of shadowing the base method?
Flags: needinfo?(luke)
Ah, cool, yes, will do.
Flags: needinfo?(luke)
The worker changes allow me to reliably test the "finish after the JSRuntime started shutting down" path in SM which, I find, has a pretty silly bug.
Attachment #8784620 - Flags: review?(till)
Comment on attachment 8784620 [details] [diff] [review]
promise-task-unique-ptr

Review of attachment 8784620 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with a proper commit message.
Attachment #8784620 - Flags: review?(till) → review+
Pushed by lwagner@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/61b3877a8a6e
Store a UniquePtr in the PromiseTaskPtrVector (r=till)
https://hg.mozilla.org/integration/mozilla-inbound/rev/3661b79df89f
Don't report error for failure to start async task during shutdown (r=till)
https://hg.mozilla.org/integration/mozilla-inbound/rev/88097fd715ef
Implement AsyncTask hooks for worker thread JSContexts (r=bkelly)
Luke, should this be marked fixed, or are there remaining patches to add here?
Flags: needinfo?(luke)
Oops, you're right, fixed!
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Whiteboard: [leave open]
Flags: needinfo?(luke)
You need to log in before you can comment on or make changes to this bug.