Closed Bug 1406421 Opened 7 years ago Closed 7 years ago

Baldr: actual streaming compilation of WebAssembly.compileStreaming

Categories

(Core :: JavaScript Engine: JIT, enhancement, P3)

enhancement

Tracking

()

RESOLVED FIXED
mozilla58
Tracking Status
firefox58 --- fixed

People

(Reporter: luke, Assigned: luke)

References

Details

(Whiteboard: [js:tech-debt])

Attachments

(6 files, 1 obsolete file)

... instead of just buffering it up in an ArrayBuffer.
Priority: -- → P3
Whiteboard: [js:tech-debt]
This patch extends ExclusiveData with a ConditionVariable giving ExclusiveWaitableData.  This is more useful in the last patch which uses 3 of them, making it nice to bundle them together.  The only downside is you lose the variable name of the condvar which I was using to indicate the logical condition; I fixed that by adding a comment in the wait()/notify() parens.
Attachment #8918513 - Flags: review?(lhansen)
This patch both simplifies ModuleGenerator a bit (removing MG::startFuncDefs()) and allows parallel function compilation to go until MG::finishX() which allows parallel compilation to overlap data segment download (in the last patch, with streaming).
Attachment #8918514 - Flags: review?(bbouvier)
This little patch adds a utility function, StartsCodeSection(), that is critical in the next patch (see .h for careful description).  Even before streaming, this patch is able to use StartsCodeSection() in all non-streaming compilations (which is nice because it found some surprising corner cases) to allow CompileInitialTier() to know the size of the code section *before* DecodeModuleEnvironment, allowing ModuleEnvironment to go back to having tier/mode/debug be const fields.

The patch also factors out InitialCompilerFlags() so that it can be reused in the next patch.
Attachment #8918518 - Flags: review?(lhansen)
Attached patch actual-streaming (obsolete) — Splinter Review
And, finally, this patch does the actual streaming compilation, changing CompileStreamTask from buffering up all bytes and doing a compilation at the end to actually feeding the bytes to the ModuleGenerator as they come in.

This was actually pretty fun; I was able to find an interface that kept streaming relatively self-contained and the changes minimal.  I'd start by reading the wasm::CompileStreaming comment in WasmCompile.h.
Attachment #8918520 - Flags: review?(lhansen)
And of course, I should've checked to see that Windows had the same race condition that POSIX did that I fixed in bug 1347644.  This one is harder to fix by simply reordering, so I just introduced a Mutex and used it uniformly so that Linux and Windows are symmetric.
Attachment #8919120 - Flags: review?(till)
Right now this assert trivially fails if you run the shell with 'js -h', or any other way to fail early in main().
Attachment #8919123 - Flags: review?(till)
Comment on attachment 8919120 [details] [diff] [review]
fix-windows-thread-race

Review of attachment 8919120 [details] [diff] [review]:
-----------------------------------------------------------------

I'm not too fond of the added duplication between the Posix and Windows threads code, but I see how extracting that would be annoying.
Attachment #8919120 - Flags: review?(till) → review+
Comment on attachment 8919123 [details] [diff] [review]
no-shutdown-assert

Review of attachment 8919123 [details] [diff] [review]:
-----------------------------------------------------------------

r=me
Attachment #8919123 - Flags: review?(till) → review+
Comment on attachment 8918514 [details] [diff] [review]
tweak-module-generator

Review of attachment 8918514 [details] [diff] [review]:
-----------------------------------------------------------------

Indeed simpler, thank you for the patch!

::: js/src/wasm/WasmGenerator.cpp
@@ +903,5 @@
>      MOZ_ASSERT(finishedFuncDefs_);
>  
> +    while (outstanding_ > 0) {
> +        if (!finishOutstandingTask())
> +            return false;

nit: return nullptr;

::: js/src/wasm/WasmGenerator.h
@@ -191,4 @@
>      DebugOnly<bool>                 finishedFuncDefs_;
>  
>      bool allocateGlobalBytes(uint32_t bytes, uint32_t align, uint32_t* globalDataOff);
> -

light nit: I'd keep this \n, or move this function somewhere else: this function isn't a logical step in the generation process, as all the other functions below are.
Attachment #8918514 - Flags: review?(bbouvier) → review+
Pushed by lwagner@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/47fc470d7a93
Fix js::Thread race on windows (r=till)
https://hg.mozilla.org/integration/mozilla-inbound/rev/712aa74b91ce
Remove invalid assert from ~BufferStreamState() (r=till)
https://hg.mozilla.org/integration/mozilla-inbound/rev/e550d0a1bfd8
Baldr: allow parallel compilation to proceed until finish() (r=bbouvier)
Keywords: leave-open
Comment on attachment 8918513 [details] [diff] [review]
exclusive-waitable

Review of attachment 8918513 [details] [diff] [review]:
-----------------------------------------------------------------

I like this.
Attachment #8918513 - Flags: review?(lhansen) → review+
Comment on attachment 8918518 [details] [diff] [review]
starts-code-section

Review of attachment 8918518 [details] [diff] [review]:
-----------------------------------------------------------------

Nice to be rid of the mutable fields in ModuleEnvironment.
Attachment #8918518 - Flags: review?(lhansen) → review+
Pushed by lwagner@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/cf08b78f36ee
Baldr: add ExclusiveWaitableData subclass of ExclusiveData and use it (r=lth)
https://hg.mozilla.org/integration/mozilla-inbound/rev/073bcf5a01d5
Baldr: add and use StartsCodeSection (r=lth)
Comment on attachment 8918520 [details] [diff] [review]
actual-streaming

Review of attachment 8918520 [details] [diff] [review]:
-----------------------------------------------------------------

The compilation logic is fine and nicely contained, but I'm concerned about the (lack of) deletion protocol for the CompileStreamTask object, see comments below.

::: js/src/wasm/WasmCompile.h
@@ +83,5 @@
> +
> +// Compile the given WebAssembly module which has been broken into three
> +// partitions:
> +//  - envBytes contains a complete ModuleEnvironment that have already been
> +//    copied in from the stream.

Grammar nit: something wrong in that sentence, 'have' should perhaps be 'has'.

::: js/src/wasm/WasmJS.cpp
@@ +2179,5 @@
> +    // Until StartOffThreadPromiseHelperTask succeeds, we are responsible for
> +    // dispatching ourselves back to the JS thread in cases of failure.
> +    //
> +    // Warning: After this function returns, 'this' can be deleted at any time, so the
> +    // caller must immediately return from the stream callback.

If that can be true after the method returns then it must also be true the instant the ExclusiveData::Guard is destroyed, and the Guard is scoped to the assignment statement that updates the state, I think, so technically isn't the call to dispatchResolveAndDestroy risky?

@@ +2330,5 @@
> +          }
> +        }
> +        // At this point, the streamState_ is Closed, but we can't assert that
> +        // because, in the Closed state, we might have already been deleted if
> +        // execute() and resolve() completed already!

(Continued from above.)

I think this is safe only on the assumption that the compiler does not generate any code that touches this object after the release of the lock following setting the state to Closed.  I suspect the compiler is allowed to generate such code and that technically this is a race condition.

There's also the possibility that eg the mutex object may have code (an assertion say) that it executes following an unlock.  I've looked around and not found any such code yet, but there's no reason why an update to the Mutex object would not consider such an addition legal.
(In reply to Lars T Hansen [:lth] from comment #16)
> > +    // Warning: After this function returns, 'this' can be deleted at any time, so the
> > +    // caller must immediately return from the stream callback.
> 
> If that can be true after the method returns then it must also be true the
> instant the ExclusiveData::Guard is destroyed, and the Guard is scoped to
> the assignment statement that updates the state, I think, so technically
> isn't the call to dispatchResolveAndDestroy risky?

Before the helper thread starts, setting the state to Closed is only done to satisfy the assert in resolve().  Rather, it is the call to dispatchResolveAndDestroy() which, as the name implies, is intended for this purpose and has the same "must immediately return" design.

> I think this is safe only on the assumption that the compiler does not
> generate any code that touches this object after the release of the lock
> following setting the state to Closed.  I suspect the compiler is allowed to
> generate such code and that technically this is a race condition.
> 
> There's also the possibility that eg the mutex object may have code (an
> assertion say) that it executes following an unlock.  I've looked around and
> not found any such code yet, but there's no reason why an update to the
> Mutex object would not consider such an addition legal.

Agreed with concern.  We discussed offline, but I think the basic reason for keeping as is is:
 - we can't use ref-counting to hold 'this' alive b/c the PersistentRooted held by CompileStreamTask and OffThreadPromiseTask must be destroyed on a JSContext thread
 - seqcst barriers should prevent the compiler from moving anything after the lock/atomic
 - ~Mutex shouldn't assume 'this' is valid after unlock: let's say the ~Mutex was used to implement a (slow) atomic ref-count: after we've dec-ref'd and unlocked, another thread can dec-ref and delete this

Trying to think of how this could be less hazardous, I stuffed "AndDestroy" into the names of the 4 methods that do this, matching the pre-existing OffThreadPromiseTask::dispatchResolveAndDestroy().  I also tweaked streamClosed() so that an "AndDestroy" method is always immediately followed by return.
Attached patch actual-streamingSplinter Review
Attachment #8918520 - Attachment is obsolete: true
Attachment #8918520 - Flags: review?(lhansen)
Attachment #8921241 - Flags: review?(lhansen)
(In reply to Luke Wagner [:luke] from comment #17)
> (In reply to Lars T Hansen [:lth] from comment #16)
> 
> > I think this is safe only on the assumption that the compiler does not
> > generate any code that touches this object after the release of the lock
> > following setting the state to Closed.  I suspect the compiler is allowed to
> > generate such code and that technically this is a race condition.
> > 
> > There's also the possibility that eg the mutex object may have code (an
> > assertion say) that it executes following an unlock.  I've looked around and
> > not found any such code yet, but there's no reason why an update to the
> > Mutex object would not consider such an addition legal.
> 
> Agreed with concern.  We discussed offline, but I think the basic reason for
> keeping as is is:
>  - we can't use ref-counting to hold 'this' alive b/c the PersistentRooted
> held by CompileStreamTask and OffThreadPromiseTask must be destroyed on a
> JSContext thread
>  - seqcst barriers should prevent the compiler from moving anything after
> the lock/atomic

The barriers should prevent the compiler from moving *some things* after the lock/atomic.  But if the object in question has a const field, say, that is read before the barrier but whose value is used after, then I would normally expect the compiler to move the read of the value until after the barrier as that yields better code, unless it has reason to believe that the 'this' object could have been destroyed.

The problem is intertwined with issues around the scope of 'this'.  That's discussed here: http://en.cppreference.com/w/cpp/language/this. The implication of that language is that code must cope with the deletion of the this object, which would tend to restrict what the C++ compiler can do.  However, in this case the deletion is from a different thread, which gives the compiler freedom to assume that there are no races, and also makes it believe that there is no deletion statement.

>  - ~Mutex shouldn't assume 'this' is valid after unlock: let's say the
> ~Mutex was used to implement a (slow) atomic ref-count: after we've
> dec-ref'd and unlocked, another thread can dec-ref and delete this

I also find it hard to accept as a general principle that this is reasonable behavior :)

> Trying to think of how this could be less hazardous, I stuffed "AndDestroy"
> into the names of the 4 methods that do this, matching the pre-existing
> OffThreadPromiseTask::dispatchResolveAndDestroy().  I also tweaked
> streamClosed() so that an "AndDestroy" method is always immediately followed
> by return.

It's a welcome clarification but doesn't really address my concerns.  I will however r+ the patch because I sense that we are at an impasse.
Attachment #8921241 - Flags: review?(lhansen) → review+
I think the compiler should need to model deletion as a mutating operation and thus assume that even a const member variables can be "mutated" at any time.  Thus, if there is a happens-before relation between an unlock() and a delete, the compiler would be introducing a data race to move a read of a const field from before to after an unlock().
Pushed by lwagner@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/4104997aea0c
Baldr: actual streaming compilation (r=lth)
Keywords: leave-open
https://hg.mozilla.org/mozilla-central/rev/4104997aea0c
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla58
Depends on: 1411545
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: