Closed Bug 1360211 Opened 3 years ago Closed 2 years ago

Merge WasmActivation into JitActivation

Categories

(Core :: JavaScript Engine: JIT, enhancement)

enhancement
Not set

Tracking

()

RESOLVED FIXED
mozilla58
Tracking Status
firefox58 --- fixed

People

(Reporter: bbouvier, Assigned: bbouvier)

References

Details

Attachments

(13 files, 16 obsolete files)

2.05 KB, patch
jandem
: review+
Details | Diff | Splinter Review
3.19 KB, patch
jandem
: review+
Details | Diff | Splinter Review
27.27 KB, patch
Details | Diff | Splinter Review
13.14 KB, patch
luke
: review+
Details | Diff | Splinter Review
20.74 KB, patch
jandem
: review+
Details | Diff | Splinter Review
110.46 KB, patch
luke
: review+
jandem
: review+
Details | Diff | Splinter Review
15.43 KB, patch
luke
: review+
Details | Diff | Splinter Review
14.28 KB, patch
jandem
: review+
luke
: review+
Details | Diff | Splinter Review
20.28 KB, patch
jandem
: review+
luke
: review+
Details | Diff | Splinter Review
185.81 KB, patch
Details | Diff | Splinter Review
43.73 KB, patch
luke
: review+
Details | Diff | Splinter Review
12.17 KB, patch
luke
: review+
Details | Diff | Splinter Review
12.21 KB, patch
jandem
: review+
Details | Diff | Splinter Review
The first step is to merge WasmActivation into JitActivation, merging fields where appropriate, and introducing new ones when it's needed. We'll keep the WasmFrameIterator at this point. Calling into wasm behavior doesn't change (yet): we'll still put two JitActivations on the stack (including one inactive for fast wasm->ion calls).
Preparatory patches as I read the code: this method CalleeStackOffset is unused.
Attachment #8862425 - Flags: review?(jdemooij)
EntryFrameLayout is unused as well: it's just a renaming of JitFrameLayout.
Attachment #8862426 - Flags: review?(jdemooij)
Depends on: 1360248
Depends on: 1360254
Depends on: 1360263
Attachment #8862425 - Flags: review?(jdemooij) → review+
Comment on attachment 8862426 [details] [diff] [review]
2.remove-entryframelayout.patch

Review of attachment 8862426 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry for the delay, and thanks!

::: js/src/jit/JitFrames.h
@@ +96,5 @@
>  // dependent.
>  //
> +// Two special frame types exist:
> +// - Entry frames begin an ion activation, and therefore there is exactly one
> +// per activation of jit::EnterIon. These reuse JitFrameLayout.

All this code used to be Ion-specific but it no longer is, so please rename ion activation to JitActivation (or JIT activation) and s/EnterIon/EnterIon or EnterBaseline/.
Attachment #8862426 - Flags: review?(jdemooij) → review+
Keywords: leave-open
Depends on: 1362357
Depends on: 1364520
A few notes:

- since we want the JitActivation and WasmActivation to be merged together (only keeping the JitActivation), thus having only a single JitActivation when entering wasm code, it means the JitFrameIterator must know how to unwind wasm frames.
- There's already a wasm::FrameIterator, so we'll just embed a wasm::FrameIterator (WFI) within the JitFrameIterator (JFI). We'll need "markers" to go from one to the other.
- usually, these markers are the FrameType: http://searchfox.org/mozilla-central/source/js/src/jit/JitFrameIterator.h#28 and using a common frame beginning structure, the JitFrameIterator decides how to unwind a frame based on this FrameType.
- this common structure (the CommonFrameLayout) contains the return address and the frame descriptor (which contains the frame type), *in that order*. It means that the last thing pushed is the return address, second-to-last is the frame descriptor.
- we introduce a frame type (JitFrame_Wasm) and a frame layout (WasmFrameLayout, inheriting from the CommonFrameLayout). These frames might be inserted only for the purpose of the JitFrameIterator, to serve as markers between the worlds: so we're only interested by entries and exits.
- when calling out (exit to C++/ion), we'd need an ExitFrameLayout with a descriptor type := Wasm. The issue is that wasm::Frames don't quite resemble JIT frames, even not the CommonFrameLayout. For instance, a call to C++ looks like:

   fp, arguments..., (ret_addr pushed by call)

while the CommonFrameLayout wants (fp, ret_addr) at sp just after the call. So from the JitFrameIterator, we can't look at the FrameType and decide that it's a wasm frame (because we might be reading wasm arguments). Adding a fake JIT frame just before the call wouldn't work, because the arguments would need to have an offset.
- when calling in (entry to wasm from C++), we can create a frame with WasmFrameLayout and descriptor type := Entry, but where to put? The entry stub dynamically aligns sp (for SIMD), the places the arguments. I think we can put such a frame just before the arguments (the wasm::FrameIterator would know that when it can't unwind FP anymore, it can compute the next FP by adding the frame arg stack size). Even in this case, the ret_addr contained in the CommonFrameLayout wouldn't make any sense, since it's a fake JIT frame.

Another idea is to just set aside the FrameType for wasm, and instead have a boolean callingWasm in the JitActivation. When this boolean is set to true, it means we need to use a WFI, otherwise we can just use a JFI.

- Entries would just set this boolean to true, and their epilogues would set it to false.
- C++ exits would keep this boolean to true, since we're still calling from wasm (other jit activations can be pushed but they won't clobber the value of callingWasm).
- Ion exits would push a real Ion frame (as done right now) *and* set the boolean to false.

Of course, this is still problematic, because in some rare cases in profiling mode, we might profile at the tipping point when the boolean was not set yet but the frame was already created (and conversely); the wrong frame iter would then be used, likely crashing.

(Another completely different path is to just rewrite entirely wasm frames so they match the layout of ion frames, but that's much more involved and complicated)
An alternative idea: encode information in the low bits of the frame pointer (since it should be aligned on a stack boundary) to indicate whether a frame is a wasm frame or an ion frame. One path would keep 0 as the low bit, the other path would have to manually OR the low bit (meaning a very small speed hit). In this case, since wasm is much less frequently used, it'll be the slow one here. This would have to be done only for exits (to ion/C++), for non-profiling frame iteration: the JitFrameIterator could use this bit to know if we're in a wasm or ion frame, at construction time, selecting the Jit way of unwinding frames (= the JitFrameIterator currently) or the wasm way (wasm::FrameIterator).

For profiling frame iteration, to determine whether we're in wasm or jit code, we can find a wasm instance based on PC lookup (in the wasm compartment). If none is found, we're in a jit; otherwise, we're in wasm and can use a wasm::ProfilingFrameIterator.

The naive way to implement this would be to add a "mode" (JS or wasm) in the JitFrameIterator, and assert that each of its methods handles both cases properly (sometimes with if(wasm){}else{}, but most of the time MOZ_ASSERTing(!wasm) since wasm doesn't use most JitFrameIterator methods). Another cleaner way is to split the interface:

- introduce a JitSegmentIterator, that selects the underlying frame iterator according to the FP low bit (non profiling mode) or the PC lookup (profiling mode)
- keep on using JitFrameIterator for unwinding JIT frames
- keep on using wasm::FrameIterator for unwinding wasm frames
- JitSegmentIterator would have {is,as}{Jit,Wasm} for giving access to methods used only by one of the two frame iterators
- JitSegmentIterator would jump from one iterator to the other, at boundaries (entries/exits).

The interface splitting could be done in a first patch, reducing the amount of shuffling in the second patch.
Attached patch 3.wip-segmentiter.patch (obsolete) — Splinter Review
A first glance (WIP) at what the JitSegmentIter could look like. With this, we should be able to merge the FrameIter::{WASM,ION} states into a single FrameIter::JIT enum value, having each FrameIter::method either decide if it wants to have a look at asJit()/asWasm(), or the code in the FrameIter::methods could be buried into the JitSegmentIter API instead.
Attached patch 3.jitsegmentiter.wip.patch (obsolete) — Splinter Review
Jan, asking for early feedback to know if it's worth continuing or not. I've got a JitSegmentIter structure now, that's a variant between a JitFrameIterator/wasm::FrameIterator, and a few methods implemented for each of them.

I wanted to get rid of the FrameIter STATE enum values JIT/WASM, combining them into a JITSEGMENT value instead. Each time we'd see the JITSEGMENT state, we'd call into one method of the JitSegmentIter.

It introduces a bit of boilerplate, but because of the mozilla::Variant and the match() mechanism, this ensures we never forget a case. (Maybe we could use it at the FrameIter level itself, instead of DONE/INTERP/JITSEGMENT?)

What I thought to be a promising approach looked a bit dirty when it came to most methods, which for the JIT use the FrameIter::script() function. This function uses the FrameIter::ionInlineFrames_ attribute, which the JitSegmentIter doesn't have access to. So every time we'd want to call one such method, we'd have to pass a pointer to the ionInlineFrames_ attribute, only for the JIT case (it's obviously unused by wasm).

This allows a nice symmetry and blackboxing of the underlying used iterator, but it's a bit messy to pass the InlineFrameIterator for each method where the JitFrameIterator *could* use the inlineFrameIter. See for instance JitSegmentIter::filename() or JitSegmentIter::isEvalFrame().

What do you think? Is it worth going on in this direction, or should we keep it restricted to the operator++/done interface?
Attachment #8887587 - Attachment is obsolete: true
Attachment #8888006 - Flags: feedback?(jdemooij)
Comment on attachment 8888006 [details] [diff] [review]
3.jitsegmentiter.wip.patch

Review of attachment 8888006 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry for the delay, I always overlook feedback requests.

Some questions:

* Wasm frames will show up in JitFrameIterator as a single JItFrame_Wasm, or will JItFrameIterator skip wasm frames completely?
* Places in the JITs that only care about Baseline and/or Ion frames will still be able to use JitFrameIterator directly, right? 

The patch makes sense. Alternative is to add a Maybe<WasmFrameIterator> to FrameIter::Data and instantiate that when we see a Wasm frame. I think which approach is nicer depends on the questions above?
Attachment #8888006 - Flags: feedback?(jdemooij) → feedback+
Great questions.

    * Wasm frames will show up in JitFrameIterator as a single JItFrame_Wasm, or
    will JItFrameIterator skip wasm frames completely?

I think skipping all the wasm frames at once implies to store the value of SP
at (wasm) entry somewhere, but because it needs to be reentrant (because of
e.g. a sequence jit->wasm->jit->wasm), it wouldn't be convenient to find such a
place.

    * Places in the JITs that only care about Baseline and/or Ion frames will still
    be able to use JitFrameIterator directly, right? 

Instead, it'd be cleaner if the JitSegmentIterator was used in place of the
JitFrameIterator everywhere (JitFrameIterator and wasm::FrameIterator would be
almost exclusively used by the JitSegmentIter). Then we can implement filters
that skip all the wasm frames (like what JitActivationIterator is to
ActivationIterator), and use the filtered iterators in the locations
where we're using the JitFrameIterator now.

Does it make sense?

The hard question is then: what do we call the JitSegmentIterator only for JIT
(i.e. non wasm) code?

Maybe what's currently JitSegmentIterator could be renamed CodeSegmentIterator (or EmittedCodeIterator? CodeIterator?),
and the JIT-only filtering would be JitCodeSegmentIterator? (still a bit
unfortunate because there's a thing called wasm::CodeSegment, although the risk
of confusion is pretty low in my opinion)
Flags: needinfo?(jdemooij)
(In reply to Benjamin Bouvier [:bbouvier] from comment #11)
> I think skipping all the wasm frames at once implies to store the value of SP
> at (wasm) entry somewhere, but because it needs to be reentrant (because of
> e.g. a sequence jit->wasm->jit->wasm), it wouldn't be convenient to find
> such a
> place.

Couldn't we just store SP when we enter wasm code and use it to compute the "combined wasm frame size" when we leave wasm code? We would have to restore it when we return to wasm code though.

Is the alternative to have the wasm and jit iterators completely independent but to interleave their frames somehow? That seems more complicated for the frame iterators...
Flags: needinfo?(jdemooij)
Depends on: 1384683
Comment on attachment 8888006 [details] [diff] [review]
3.jitsegmentiter.wip.patch

Moved this patch to bug 1384683 instead.
Attachment #8888006 - Attachment is obsolete: true
Attached patch 3.merge.wip (obsolete) — Splinter Review
(WIP patch)
Attachment #8898861 - Attachment is patch: true
Attached patch 3.merge.wip.patch (obsolete) — Splinter Review
(Jan: NI you for question in bullet 5)

An updated WIP, which almost works on x64:

1. one thing that doesn't work is when there's an error thrown from JS jit called from wasm (assertion in HandleException). The good news is that this kind of issues should be not too hard to spot, because the code was trying to ignore all JS frames until it saw a wasm one and asserted.

2. ProfilingFrameIterator hasn't been touched yet.

3. To know when to pass from js jit frames to wasm frames, a new JitFrame type is introduced: JitFrame_Wasm and a corresponding WasmFrameLayout (which is just a JitFrameLayout atm). When the JSJitFrameIter sees the JitFrame_Wasm, it considers itself done(); then JitFrameIter::done() can do the switch to the WasmFrameIter.

4. The wasm->JSJit stub didn't quite put the right numbers in the frame descriptor etc. (it was accounting for the JitFrameLayout twice). This is fixed here, allowing us to use jitFrame().prevFP() to determine wasm exitFP.

5. When the JitFrameIter switches from JSJit to wasm, it just replaces exitFP in the JitActivation. I assumed it would be wrong first, because then other concurrent JitFrameIter wouldn't see the jit frames below anymore (which can be an issue for GC, iiuc), but then I saw that when unwinding frames after an exception, this is what we did already (with EnsureBareExitFrame). Is that correct?

6. JitActivation can tell isWasm(), if and only if exitFP is defined (since we're using the low bit for that purpose). Some functions were looking for a wasm JitActivation where an exitFP wasn't set (e.g. OOB), but it appears that calls to compartment.wasm.lookupCode could be placed before we actually get the activation; once lookupCode and other checks have returned true, we're sure that cx->activation() is a JitActivation. So we can entirely drop ActivationIfInnermost.

Preliminary results: the following benchmark gets a ~30% speedup (I'll compare with v8 and maybe-jsc once the patch has stabilized).

(import $js "main" "func" (func (result i32)))

(func (export "benchmark") (local $i i32)
  i32.const 1000000000
  set_local $i

  loop $top
    call $js

    get_local $i
    i32.const 1
    i32.sub
    tee_local $i

    i32.const 0
    i32.gt_s
    br_if $top
    drop
 end
)
Attachment #8898861 - Attachment is obsolete: true
Flags: needinfo?(jdemooij)
(answered on irc: in HandleThrow it's correct to reset exitFP because we never get back to the unwound frames, but not otherwise)
Flags: needinfo?(jdemooij)
Attached patch wip.patch (obsolete) — Splinter Review
Still a WIP patch, but still I won't be around for the next two weeks, it'd be nice to have an high-level feedback pass. See previous comments explaining a few design choices. (Again if you see XXX comments, they're personal reminders and I'll make sure to remove them once I fix them)

The main difference with the previous patches: i
- now compiles on x64/x86/arm
- can recover from an exception thrown by the jits when the innermost frame is wasm
- passes most tests

Still remaining to do:
- profiling frame iterator
- some subtle edge cases
Attachment #8900358 - Attachment is obsolete: true
Attachment #8901245 - Flags: feedback?(luke)
Attachment #8901245 - Flags: feedback?(jdemooij)
Comment on attachment 8901245 [details] [diff] [review]
wip.patch

Review of attachment 8901245 [details] [diff] [review]:
-----------------------------------------------------------------

Generally looks great!

::: js/src/vm/SavedStacks.cpp
@@ +159,5 @@
>          activation(activation)
>      {
>          MOZ_ASSERT(source);
>          MOZ_ASSERT_IF(framePtr.isSome(), activation);
> +        MOZ_ASSERT_IF(framePtr.isSome() && !(activation->isJit() && activation->asJit()->isWasm()),

nit: perhaps add Activation::isWasm() as shorthand for isJit() && asJit->isWasm?

::: js/src/vm/Stack.cpp
@@ +569,5 @@
> +
> +        wasm::Frame* prevFP = (wasm::Frame*) jitFrame.prevFp();
> +        iter_.destroy();
> +        iter_.construct<wasm::WasmFrameIter>(act_, prevFP);
> +        MOZ_ASSERT(!asWasm().done());

It seems to me that done() shouldn't perform a mutating operation.  Rather, done() should only be evaluated when an iterator has settled on a valid jit or wasm frame and it's operator++/settle that move the iterator across the divide.

::: js/src/vm/Stack.h
@@ +1488,1 @@
>      uint8_t* exitFP_;

Can you rename this 'packedExitFP_' and then add a jsExitFP() symmetric to wasmExitFP()?  I think this will help catch bugs by forcing every use to consider both cases.

@@ +1659,5 @@
> +        MOZ_ASSERT(isWasm());
> +        return (wasm::Frame*)(intptr_t(exitFP()) & ~0x1);
> +    }
> +    void setWasmExitFP(const wasm::Frame* fp) {
> +        exitFP_ = (uint8_t*)(intptr_t(fp) | 0x1);

nit: can you assert low bit of 'fp' is not already set?

@@ +1865,5 @@
>      // Unlike ScriptFrameIter itself, ScriptFrameIter::Data can be allocated on
>      // the heap, so this structure should not contain any GC things.
>      struct Data
>      {
> +        JSContext* cx_;

nit: since you're touching it, can you align the column of cx_?

::: js/src/wasm/WasmFrameIter.cpp
@@ +105,5 @@
>  WasmFrameIter::popFrame()
>  {
>      Frame* prevFP = fp_;
>      fp_ = prevFP->callerFP;
> +    MOZ_ASSERT((uintptr_t(fp_) & 0x1) == 0);

Could we introduce a symbolic constant JitActivation::WasmBit used here and below in place of '0x1'?

::: js/src/wasm/WasmFrameIter.h
@@ +77,5 @@
>      void operator++();
>      bool done() const;
>      const char* filename() const;
>      const char16_t* displayURL() const;
> +    const Frame* fp() const { return fp_; }

I'm reticent to hand out fp(); from past experience, it often ends up causing the Frame to be used directly in a lot of ways that breaks encapsulation of the iterators and inhibits future changes.  I see one use for calling  cx->activation()->asJit()->setWasmExitFP(iter->fp()) in JitFrames.cpp.  I was wondering about this separately: it seems like maybe we should have the iterator do this itself directly, either when constructed or when unwinding is enabled.  I was actually wondering how JSJitFrameIter deals with the problem of solved by Unwind::True and if we make the two iterators do the same thing.

::: js/src/wasm/WasmInstance.cpp
@@ +675,5 @@
>  
>      {
> +        // Push an active JitActivation to describe the wasm frames we're about
> +        // to push when running this module. It is used for both the wasm frames
> +        // and subsequent fast calls to Ion.

nit: I think this is what you'd expect/hope, so no comment needed.

::: js/src/wasm/WasmSignalHandlers.cpp
@@ -1342,5 @@
>      AutoSetHandlingSegFault handling(cx);
>  
> -    WasmActivation* activation = ActivationIfInnermost(cx);
> -    if (!activation)
> -        return false;

I think it's still valuable to have a short-circuit if there isn't an active JitActivation since this reduces risk of crashing while handling an unrelated/real C++ crash.
Attachment #8901245 - Flags: feedback?(luke) → feedback+
Comment on attachment 8901245 [details] [diff] [review]
wip.patch

Review of attachment 8901245 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry for the delay.

Looks good to me! It would be great if we could split this up for review though as it's touching complicated code.

::: js/src/vm/Stack.h
@@ +1650,5 @@
>          lastProfilingCallSite_ = ptr;
>      }
> +
> +    // WebAssembly specific methods.
> +    bool isWasm() const {

Nit: I think I'd prefer hasWasmExitFP() or something. isWasm() implies the whole activation is wasm but that's not true IIUC.

@@ +1865,5 @@
>      // Unlike ScriptFrameIter itself, ScriptFrameIter::Data can be allocated on
>      // the heap, so this structure should not contain any GC things.
>      struct Data
>      {
> +        JSContext* cx_;

Nit: FWIW I don't really agree with aligning field names - I'm always the one who adds a new field with a long type name and then I have to either fix everything up or leave a mess :)
Attachment #8901245 - Flags: feedback?(jdemooij) → feedback+
Thanks for the feedback, will address them. Just answering one question:

> I was wondering about this separately: it seems like maybe we should have the iterator do this itself directly, either when constructed or when unwinding is enabled.  I was actually wondering how JSJitFrameIter deals with the problem of solved by Unwind::True and if we make the two iterators do the same thing.

This is done in HandleException by http://searchfox.org/mozilla-central/source/js/src/jit/JitFrames.cpp#730-737, so not at the iterator level but above. The logic is different enough that it justifies not trying to common out this code, since for JS we overwrite the frame footer.
Update for the benchmark in comment 15: still preliminary results, but on the same benchmark:

(the results are the medians of 3 runs)

- spidermonkey (baseline): 6477ms
- spidermonkey (patched): 4233ms
- v8: 4390ms

So we're at least catching up with v8 here and getting slightly better.
Attached patch wip.patch (obsolete) — Splinter Review
(updated patch addressing comments)

Names I've chosen:
- JitActivation::hasWasmExitFP() (instead of isWasm()) is the closest to reality and most descriptive.
- instead of the Activation::isWasm() shorthand, for the same reason Jan explained in his feedback, and after discussion on IRC with him, we've come to the conclusion that Activation::hasWasmExitFP() was the most descriptive (it checks isJit() and asJit().hasWasmExitFP()).

Things remaining:
- still profiling mode
- a few asm.js cases still not passing and needing investigation, including...
- ... a very edgy case with asm.js: after an optimized jit import function call, if the return value has to go through an OOL conversion, and the OOL conversion path contains a call to a JS jitted function (valueOf, yayyyyyy), then the Activation::packedExitFP we read in wasm::HandleThrow is the one from the jit call, not the wasm one (that's been clobbered). This issue is a headache, I'll sleep on it. (We cannot just have a thunk for the CoerceInPlace* functions, because then the exitFP would be unset just after the CoerceInPlace call, right before the jump to wasm::HandleThrow)
- split the patch in smaller chunks, to facilitate reviewing
Attachment #8901245 - Attachment is obsolete: true
(In reply to Benjamin Bouvier [:bbouvier] from comment #22)
> - ... a very edgy case with asm.js:

Interesting!  So thinking about it a bit, unlike all the other exits, the Jit exit is not actually an exit!  It's just a Callable (as in GenerateCallablePrologue()) which should have ExitReason::None() and thus not set exitFp (which is great, that should remove a non-trivial amount of insns from prologue/epilogue which should boost our score on comment 21!).  But if we throw, I think that means we need to store to exitFP right before jumping to the throw stub.  Probably we could jump to a Label at the end of the Jit stub that sets this and then jumps to the throw label.

This makes me think also: wasm currently has the invariant that exitFP is cleared when returning from an exit so exitFP is null in normal code.  Thus, exitFP, if non-null, is never stale and this is assumed by the branch on exitFP in ProfilingFrameIterator().  I think we need Ion to preserve that invariant as well if we call directly into Ion code.
Some report on the work made last week, to try to preserve the wasm invariant:

> exitFP, if non-null, is never stale

which means clearing exitFP (setting it to nullptr) in all the JIT paths too. This would slow down the VM calls just a bit by adding a few instructions on the exit path(load activation, pointer chasing and setting exitFP to 0). I've started doing this, but it is rather complicated for a few reasons:

- most of the time, the JIT doesn't care about leaving a stale exitFP; so while there are APIs in the masm which would be the clear cutting point for this (leaveExitFrame which is the parallel of enterExitFrame/enterFakeExitFrame), leaveExitFrame is barely called after an enterExitFrame.
- there are many paths in the JIT where we create a new frame, set the exitFP, then make a decision upon a value (read from a register or the stack) to decide where to jump next (e.g. in bailout tail). So the leaveExitFrame would need to be scattered into all the places that are jumped to, plus we'd be wasting every time a jumper isn't setting the enterExitFrame first.
- the instructions for clearing JitActivation::exitFP need a scratch register; live register analysis has to be done in all these JIT paths epilogues and it would be quite easy to do a mistake there, especially when paths diverge or special purpose registers are used. We could introduce a new register, NonArgReturnVolatileRegister and make sure it isn't used for any other purpose.


Reframing the problem: what we want is for the profiling iterator to have a way to know "are we called from wasm or from ion?" based on exitFP and the current machine state. We thought about another way to do this while only weakening slightly the invariant:

- if we're in wasm code, we can just look up based on the value of pc
- if we've exited to C++ from wasm, we can just look if exitFP's low bit is tagged
- in all the other cases, we're in ion or we've exited from ion

This way seems simpler, less invasive and presents a weaker new invariant: exitFP's low bit is set if and only if we've exited *from wasm*. The good thing about it is that we don't even need to worry about setting/unsetting exitFP when calling into the JIT and back. Most of the time, there won't be any live wasm module (until it gets very popular), so the cost of looking up PC will be very low. We could even straighten the invariant again in the future by having the JIT set exitFP to null when we're in JIT code, but that does not seem to show any other benefit than enforcing the initial wasm invariant. I'll update the patch.
> The good thing about it is that we don't even need to worry about setting/unsetting exitFP when calling into the JIT and back.

New results for the benchmark in comment 15 (same method as in comment 21):

- spidermonkey baseline: 6507 ms
- spidermonkey patched: 3224 ms (50.4% speedup)
- v8: 4438 ms
Attached patch wip.patch (obsolete) — Splinter Review
(including first changes to have the profiling frame iterator working too; WIPy)
Attachment #8907833 - Attachment is obsolete: true
Attached patch wip.patch (obsolete) — Splinter Review
An interesting problem came out, for profiling mode: now, the jit exit doesn't set exitFP in the JitActivation anymore (because it isn't a regular exit anymore). So the profiling iterator will use much more wasm::StartUnwinding, and we need to consider if the values of FP are valid for each instruction in wasm::GenerateImportJitExit (excluding the prologue/epilogue which are already handled).

Fortunately, FP is *mostly* reliable: there's just a small set of instructions where FP can have been clobbered by the JIT and is unreliable: http://searchfox.org/mozilla-central/source/js/src/wasm/WasmStubs.cpp#761-765
In this patch, I call this set of instructions the "untrusted FP" zone, and store the start/end offsets of this zone in the importJitExit CodeRange (there's enough space, considering the CodeRange::Function use so much). Then, in wasm::StartUnwinding, we can pattern match these instructions and just ignore the stack in this case, which is better than crashing.

It's unfortunate because it means that profiling will lose all stack information in the untrusted FP range (6 instructions on ARM), but it is the best we can do without resorting to having the JIT restoring wasm's exitFP.
Attachment #8910372 - Attachment is obsolete: true
I am working on profiling now, and there are some cases where we will lose an entire profiling stack for a few instructions, because of transitory stubs. I wanted to document that here, considering I've spent enough time wondering if the patch was just doing something wrong and losing some information, or if there was something else.

For instance, I've analyzed the profiling stack for a simple program: a function named JS calls into wasm which calls into a JS function named CALLEE.

In the following array of profiling captured stacks, > means wasm entry, any number means a wasm function name, < means a wasm exit, any name means a JS function. The prefix is (index, number of ARM simulated instructions in that range).

The interesting parts are where we suddenly lose some function display in the profiling stack, at indexes 5 and 7. All the JS functions are run enough to be baseline-compiled and running in baseline (but not ion). The wasm function is run enough times so that it uses the optimized jit exit.

The fundamental idea here is that the wasm profiling iterator uses JitActivation::exitFP, while the jit profiling iterator uses JitActivation::lastProfilingFrame.


1, 299: JS
2, 15: >,JS
3, 17: 1,>,JS
4, 24: <,1,>,JS
5, 9: JS              ----> losing stack info here: when calling into the baseline
                      script, in this instruction range, we're not in wasm code
                      (pc isn't in wasm code, and wasm exitFP isn't set). So we
                      use the jit frame iterator, which uses
                      JitActivation::lastProfilingFrame as the reference FP for
                      unwinding. It's not set yet, because we're in
                      BaselineCompiler::emitProfilerEnterFrame [1], which sets
                      lastProfilingFrame in 9 instructions.
6, 710: CALLEE,1,>,JS
7, 8: JS              ----> here, we've just set lastProfilingFrame to nullptr and
                      are about to ret in the baseline code [2]. Then, there's the
                      dynamic stack alignment assertion in debug mode and a few
                      instructions to reload FP in the wasm-to-jit exit (untrusted 
                      FP zone, mentioned in the previous comment).
8, 19: <,1,>,JS
9, 7: 1,>,JS
10, 12: >,JS
11, 57: JS


I think what happens at 7 is inevitable unless JitActivation::exitFP gets properly reset by the JIT, which is something we've excluded. What happens at 5 could be mitigated by having wasm set the JitActivation::lastProfilingFrame to the wasm-to-js transition frame, but that would add a few instructions to the wasm-to-jit call path, making the exit slower.

In both cases, considering the number of instructions in which we don't have profiling information is fixed and so low (I expect it to be even smaller in the x86 and x64 cases), we could agree that it's acceptable to lose this information in favor of performance. Of course, the same analysis needs to be done for Ion, but I expect to come to the same conclusions. Luke, does this sound reasonable?

[1] call of emitProfilerEnterFrame in emitPrologue http://searchfox.org/mozilla-central/source/js/src/jit/BaselineCompiler.cpp#362
[2] http://searchfox.org/mozilla-central/source/js/src/jit/arm/Trampoline-arm.cpp#1435-1436
Flags: needinfo?(luke)
Fwiw, the Ion analysis was trivial to do, because the setting of lastProfilingFrame happens in CodeGenerator::generatePrologue/generateEpilogue: the numbers at indexes 5 and 7 are respectively 6 and 8, for Ion. (the difference by 3 instructions in the case of index 5 is because baseline does a dynamic stack alignment check + toggled jump)
Thanks for the thorough investigation and writeup.  I think you're right that, due to the limited number of instructions here, we probably won't be materially hurting our profiling results.  If this does become a problem, there are a number of mitigations I can imagine, but it's probably just not worth our time right now.
Flags: needinfo?(luke)
To wit: in comment 28, at index 6 (we're sampling from the JS call), we're losing the jit exit information.

When transitioning from the JS frame to the wasm frame, we pass the last value of FP read by the JSJit profiling frame iterator, which points at the first wasm frame: the jit exit frame. Since we can only read the return address from this FP (in wasm::ProfilingIterator::initFromFP) to get a value of pc to get the unwinding starting, we skip the exit frame and end up not seeing it in the profile.

I've had an idea to fix this: when transitioning, we can actually recover the return address as well (from the JitProfilingFrameIterator), so we could in theory reconstruct a RegisterState that would contain only FP and PC (set to the return address). That's safe to have null SP and LR, because the instruction at the return address isn't in the critical statically known prologue/epilogue ranges that are scanned by wasm::StartUnwinding.

Unfortunately, the return address is the first instruction *after* the call, of course: so it's right in the "untrusted FP zone" defined before. The obvious way would be, instead of using RA, to use RA-sizeof(call instruction), but that means introducing a platform-specific function to get that info, and that makes things a bit confusing...

--- (an hour later)

Fortunately, I realized after writing this comment that we can still use a pretend innerframe for the jit exit in this case (during the writing of this patch, I had removed it). With a special case in wasm::GenerateCallablePrologue to *not* set FP for the jit exit. That works like a charm and we don't lose the jit exit info anymore \o/
Attached patch wip.patch (obsolete) — Splinter Review
(close-to-final WIP patch: needs to be split)
Attachment #8910427 - Attachment is obsolete: true
Doh! Now the ARM simulator has to look up PC for each memory access (be it in JSJit or wasm code), because we don't have a strong signal for "are we running in wasm code" anymore (before, that would have been the presence of WasmActivation on the stack).

http://searchfox.org/mozilla-central/source/js/src/jit/arm/Simulator-arm.cpp#1590

So I had this wild idea to use the real signal handling stuff for the ARM simulator. Everything went pretty well... until I realized we needed a way to advance PC by one instruction on the host. This needs a disassembler per possible host architecture, which seems unfortunate. I've tried adding the x86 disassembler to the ARM simulator build, but it resulted in a death by a thousand cuts (many assembler concepts mismatches).

I'll try something else.

(shelving patch here)
Attached patch full.patch (obsolete) — Splinter Review
Full patch, if people prefer it for getting a better idea of the bigger picture.

I've split the patch into smaller patches, but it's sometimes a bit hard because there are so many dependencies between all of them; I don't expect every single smaller patch to build individually, so they'll get folded back into a big patch after reviews.

(The arm-simulator fix for the issue mentioned in the previous comment hasn't been found and isn't included yet)
Attachment #8911993 - Attachment is obsolete: true
This renames JitActivation::exitFP to JitActivation::packedExitFP, so it makes more sense. Also introduces a variant called jsExitFP(), that's getting a nice assertion about tagging later.
Attachment #8912854 - Flags: review?(jdemooij)
Attached patch b.merge.patch (obsolete) — Splinter Review
The biggest part:

- merges WasmActivation into JitActivation
- uses exitFP tagging to know whether we've exited from wasm or the jits
- (normal) frame iteration (not profiling, not handling errors yet)
Attachment #8912855 - Flags: review?(luke)
Attachment #8912855 - Flags: review?(jdemooij)
Attached patch c.untrustedfp.patch (obsolete) — Splinter Review
Handle the untrusted FP zone, as explained in one of the above comments.
Attachment #8912856 - Flags: review?(luke)
Attached patch d.handle-failures.patch (obsolete) — Splinter Review
(r? luke to explain the variant ctors/methods added in the biggest patch, but most of the code is in the JIT)

Adds error handling for the situation where wasm calls the jit, and the jit throws/bails out.
Attachment #8912857 - Flags: review?(luke)
Attachment #8912857 - Flags: review?(jdemooij)
Attached patch e.profiling.patch (obsolete) — Splinter Review
Adds profiling iteration handling.

I've looked into having JS::ProfilingFrameIterator contain a mozilla::MaybeOneOf<jit profiling frame iter, wasm profiling frame iter, but because the ProfilingFrameIterator is in JS and exposed publicly, that would mean make a lot of the jit/wasm code public as well. Instead, we need to explicitly remember in which mode after construction, to properly unwind.
Attachment #8912858 - Flags: review?(luke)
Attachment #8912858 - Flags: review?(jdemooij)
Attached patch f.tests.patchSplinter Review
(final patch) Tests! With comments!
Attachment #8912859 - Flags: review?(luke)
Just re-benchmarked, for final numbers. Same benchmark as in comment 25:

before the patch: 6485ms
after the patch: 2965ms -> 54% speedup
(didn't measure v8)
Blocks: 1403873
Comment on attachment 8912854 [details] [diff] [review]
3a.rename-exitfp-packedexitfp.patch

Review of attachment 8912854 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/vm/Stack.cpp
@@ +1112,5 @@
>          if (jsJitFrame().isBaselineJS()) {
>              jit::BaselineFrame* frame = jsJitFrame().baselineFrame();
>              jit::JitActivation* activation = data_.activations_->asJit();
>  
> +            // activation's exit FP may be invalid, so create a new

self nit: reverted this change.

::: js/src/vm/Stack.h
@@ +1492,5 @@
>  class JitActivation : public Activation
>  {
> +    // If Baseline, Ion or Wasm code is on the stack, and has called into C++,
> +    // this will be aligned to an ExitFrame. The last bit indicates if it's a
> +    // wasm frame (bit set to WasmBit) or not (bit set to !WasmBit).

self nit: that second sentence should belong to another commit...

::: js/src/wasm/WasmBuiltins.cpp
@@ +177,5 @@
> +    // JitActivation::wasmExitFP() once each time WasmFrameIter is incremented,
> +    // ultimately leaving exit FP null when the WasmFrameIter is done().  This
> +    // is necessary to prevent a DebugFrame from being observed again after we
> +    // just called onLeaveFrame (which would lead to the frame being re-added
> +    // to the map of live frames, right as it becomes trash).

patch splitting issue: will put back where it was

::: js/src/wasm/WasmStubs.cpp
@@ +1023,5 @@
>  
>  // Generate a stub which is only used by the signal handlers to handle out of
>  // bounds access by experimental SIMD.js and Atomics and unaligned accesses on
>  // ARM. This stub is executed by direct PC transfer from the faulting memory
> +// access and thus the stack depth is unknown. Since JitActivation::exitFP() is

self nit: packedExitFP
Comment on attachment 8912855 [details] [diff] [review]
b.merge.patch

Review of attachment 8912855 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/BaselineBailouts.cpp
@@ +1514,5 @@
>  
>      // The caller of the top frame must be one of the following:
>      //      IonJS - Ion calling into Ion.
>      //      BaselineStub - Baseline calling into Ion.
> +    //      (Wasm,)Entry - Interpreter or other (wasm) calling into Ion.

self nit: Entry / WasmToJSJit

::: js/src/jit/JitFrames.cpp
@@ +2723,5 @@
>              prevFrameSize = frameSize;
>              frameSize = callerFp - calleeFp;
>  
> +            lastFrameType = frames.prevType();
> +            // The entry frames and the entry frame always overlap.

self nit: lol. Should be:

The entry frame and the first Jit frames always overlap.

::: js/src/jit/MacroAssembler.cpp
@@ +3107,5 @@
>  void
>  MacroAssembler::wasmAssertNonExitInvariants(Register activation)
>  {
>  #ifdef DEBUG
> +    // packedExitFP should not be tagged as wasm. Note this includes nullptr.

nit: includes/takes care of/

::: js/src/vm/Stack.h
@@ +1492,3 @@
>      // If Baseline, Ion or Wasm code is on the stack, and has called into C++,
>      // this will be aligned to an ExitFrame. The last bit indicates if it's a
>      // wasm frame (bit set to WasmBit) or not (bit set to !WasmBit).

self nit: (that was the comment that needed to move from the previous patch)

::: js/src/wasm/WasmBuiltins.cpp
@@ +172,2 @@
>  
> +    // Live wasm code on the stack is kept alive (in TraceWasmFrame) by

self nit: TraceJitActivation

::: js/src/wasm/WasmFrameIter.cpp
@@ +43,5 @@
>      MOZ_ASSERT(fp_);
>  
> +    // Normally, execution exits wasm code via an exit stub which sets exit fp
> +    // to the exit stub's frame. Thus, in this case, we want to start iteration
> +    // at the caller of the exit frame, whose Code, CodeRange and CallSite are

self nit: will revert this change

@@ +52,5 @@
>          MOZ_ASSERT(!done());
>          return;
>      }
>  
> +    // When asynchronously interrupted, exit FP is set to the interrupted frame

self nit: ditto

@@ +89,5 @@
>      // given frame, that frame must not be visible to subsequent stack iteration
>      // (or it could be added as a "new" frame just as it becomes garbage).
> +    // When the frame is "interrupted", then exit FP is included in the
> +    // callstack (otherwise, it is skipped, as explained above). So to unwind
> +    // the innermost frame, we just clear the interrupt state.

ditto
Comment on attachment 8912856 [details] [diff] [review]
c.untrustedfp.patch

Review of attachment 8912856 [details] [diff] [review]:
-----------------------------------------------------------------

I'll also add a comment in the OOL path stating that FP can be trusted between the call() instruction and the end, because FP is non-volatile in the native ABI, by definition.
Comment on attachment 8912858 [details] [diff] [review]
e.profiling.patch

Review of attachment 8912858 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/vm/Stack.cpp
@@ +1993,5 @@
> +    // or not. To determine this, there are three possibilities:
> +    // - we've exited to C++ from wasm, in which case the activation
> +    //   exitFP low bit is tagged and we can test hasWasmExitFP().
> +    // - we're in wasm code, so we can do a lookup on PC.
> +    // - in all the other cases, we're not in wasm or from wasm.

self nit: we're not in wasm or not exiting from wasm.
Attachment #8912854 - Attachment is obsolete: true
Attachment #8912854 - Flags: review?(jdemooij)
Attachment #8913172 - Flags: review?(jdemooij)
Attached patch 3b.merge.patchSplinter Review
Attachment #8912855 - Attachment is obsolete: true
Attachment #8912855 - Flags: review?(luke)
Attachment #8912855 - Flags: review?(jdemooij)
Attachment #8913173 - Flags: review?(luke)
Attachment #8913173 - Flags: review?(jdemooij)
Attachment #8912856 - Attachment is obsolete: true
Attachment #8912856 - Flags: review?(luke)
Attachment #8913174 - Flags: review?(luke)
Attachment #8912857 - Attachment is obsolete: true
Attachment #8912857 - Flags: review?(luke)
Attachment #8912857 - Flags: review?(jdemooij)
Attachment #8913175 - Flags: review?(luke)
Attachment #8913175 - Flags: review?(jdemooij)
Attachment #8912858 - Attachment is obsolete: true
Attachment #8912858 - Flags: review?(luke)
Attachment #8912858 - Flags: review?(jdemooij)
Attachment #8913176 - Flags: review?(luke)
Attachment #8913176 - Flags: review?(jdemooij)
Attached patch 3.full.patchSplinter Review
(rebased and addressed self nits)
Attachment #8912852 - Attachment is obsolete: true
The previous set of patches make the ARM simulator very slow (see previous comments). This fixes it by replacing wasm::Compartment::lookupCode by wasm::Compartment::lookupCodeSegment, and keeping a dynamic vector of code segments sorted by base pointer in the current wasm compartment.

Note I have reused (un)registerInstance for this purpose instead of introducing (un)registerCodeSegment, so we need a basic counting of CodeSegments since Instances can share them.

This makes all the ARM simulator tests run much faster, as a necessary side-effect. So fast we can even remove the wasm_code_ caching in the Simulators.
Attachment #8913357 - Flags: review?(luke)
Regarding patch 4: I've looked into having (un)registerCodeSegment, but there doesn't seem to exist an easy way to get back from an CodeSegment to a wasm::Compartment, so we couldn't have unregisterCodeSegment in the CodeSegment dtor easily. Abandoning this approach and keeping the one from comment 52, which works nicely and is simple and easy.
Attached patch 5.codesegment.outparams.patch (obsolete) — Splinter Review
A cleanup after the previous patch: all the lookup functions in wasm::Code don't need the CodeSegment outparam anymore, since we always already have a CodeSegment around.
Attachment #8913604 - Flags: review?(luke)
We can even remove a couple of code()->containsCodePC, because there are release asserts above doing the same check and we already have the CodeSegment. Also, I *think* the exceptional case for Windows where pc points to the interrupt code can be simplified...
Attachment #8913604 - Attachment is obsolete: true
Attachment #8913604 - Flags: review?(luke)
Attachment #8913607 - Flags: review?(luke)
Attachment #8913172 - Flags: review?(jdemooij) → review+
Comment on attachment 8913173 [details] [diff] [review]
3b.merge.patch

Review of attachment 8913173 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM. I skipped most of the wasm-specific code as I'm less familiar with it.

::: js/src/vm/Stack.cpp
@@ +521,5 @@
>  {
> +    act_ = act;
> +    MOZ_ASSERT(act->hasExitFP(), "packedExitFP is used to determine if JSJit or wasm");
> +    uint32_t actualWasmBit = intptr_t(act->packedExitFP()) & jit::JitActivation::WasmBitMask;
> +    if (actualWasmBit == (1 - jit::JitActivation::WasmBit)) {

Can we use |if (act->hasJSExitFP())| here? The packing is a JitActivation implementation detail that we shouldn't rely on here IMO.

@@ +1830,5 @@
>      return *this;
>  }
>  
> +bool
> +Activation::hasWasmExitFP() const

None of the callers are hot? If some potentially are, it would be nice to define this in Stack.h (after the JitActivation definition) or in Stack-inl.h

::: js/src/vm/Stack.h
@@ +1485,5 @@
>  class JitActivation : public Activation
>  {
> +  public:
> +    static const uint32_t WasmBitMask = 0x1;
> +    static const intptr_t WasmBit = 0x1;

Nit: maybe ExitFpWasmBit and ExitFpWasmBitMask? It's a bit more explicit about how/where these values are used.

@@ +1667,5 @@
> +
> +    // WebAssembly specific attributes.
> +    bool hasWasmExitFP() const {
> +        MOZ_ASSERT(packedExitFP_, "FP must be set to determine if this is a wasm JitActivation");
> +        return (intptr_t(packedExitFP_) & WasmBitMask) == WasmBit;

Can this be just return intptr_t(packedExitFp_) & WasmBit; ? Wondering if we can remove WasmBitMask.

Also a nit: personally I prefer uintptr_t over intptr_t, because it avoids weirdness with the sign bit, signed overflow, etc.
Attachment #8913173 - Flags: review?(jdemooij) → review+
Comment on attachment 8913175 [details] [diff] [review]
3d.handle-failures.patch

Review of attachment 8913175 [details] [diff] [review]:
-----------------------------------------------------------------

I assume we have good tests for exception handling across asm.js/wasm frames?

::: js/src/wasm/WasmBinaryConstants.h
@@ +505,5 @@
>  static const unsigned MaxMemoryMaximumPages  = 65536;
>  static const unsigned MaxModuleBytes         = 1024 * 1024 * 1024;
>  static const unsigned MaxFunctionBytes       =         128 * 1024;
>  
> +// A magic value of the FramePointer to indicate after an return to the entry

Nit: s/an/a/
Attachment #8913175 - Flags: review?(jdemooij) → review+
Comment on attachment 8913176 [details] [diff] [review]
3e.profiling.patch

Review of attachment 8913176 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, but Luke is more familiar with the wasm profiling bits.

::: js/src/vm/Stack.cpp
@@ +2014,5 @@
>      MOZ_ASSERT(!done());
> +    MOZ_ASSERT(activation_->isJit());
> +
> +    jit::JitActivation* activation = activation_->asJit();
> +    MOZ_ASSERT(activation->isActive());

Can we remove isActive() now?
Attachment #8913176 - Flags: review?(jdemooij) → review+
Yes, we can remove JitActivation::active now! Thanks.
Attachment #8913709 - Flags: review?(luke)
Attachment #8913709 - Flags: review?(luke) → review?(jdemooij)
Comment on attachment 8913709 [details] [diff] [review]
6.remove-active.patch

Review of attachment 8913709 [details] [diff] [review]:
-----------------------------------------------------------------

\o/
Attachment #8913709 - Flags: review?(jdemooij) → review+
Comment on attachment 8913173 [details] [diff] [review]
3b.merge.patch

Review of attachment 8913173 [details] [diff] [review]:
-----------------------------------------------------------------

Nice!  r+, with comments addressed.

::: js/src/jit/Bailouts.h
@@ +103,5 @@
>  // This address is a magic number made to cause crashes while indicating that we
>  // are making an attempt to mark the stack during a bailout.
> +// Don't set the low bit because it could be misinterpreted as a low-bit tagged
> +// wasm exit FP.
> +static uint8_t* const FAKE_EXITFP_FOR_BAILOUT = reinterpret_cast<uint8_t*>(0xba2);

Could you add a static_assert using the wasm exitFP bit symbolic constant?

::: js/src/jit/JSJitFrameIter.h
@@ +178,5 @@
>      bool isBareExit() const;
>      template <typename T> bool isExitFrameLayout() const;
>  
> +    static bool isEntry(FrameType type) {
> +        return type == JitFrame_Entry || type == JitFrame_WasmToJSJit;

I'm worried people will forget about WasmToJSJit and use JitFrame_Entry accidentally in the future.  To prevent this and make sure people really do mean "only the C++->JSJit entry", could you rename JitFrame_Entry to JitFrame_CppToJSJit or something similarly evocative?

::: js/src/jit/JitFrames.cpp
@@ +2713,4 @@
>          size_t prevFrameSize = 0;
>          size_t frameSize = 0;
> +        uint8_t* lastFP = nullptr;
> +        FrameType lastFrameType = JitFrame_Entry;

Instead of switching to an OnlyJSJitFrameIter, I think we could assert an even stronger condition if you added a level of nesting to this loop: activations -> (JSJit | Wasm) -> JSJitFrameIter.  The end of each JSJit iteration (the new mid-level loop) could then assert those isEntry/JitStackAlignment asserts below without needing lastFP/lastFrameType.  Also maybe we could find something meaningful to assert about wasm?

::: js/src/jit/arm/Simulator-arm.h
@@ -41,5 @@
>  #include "wasm/WasmCode.h"
>  
>  namespace js {
>  
> -class WasmActivation;

nit: can you remove \n between namespace js and jit?

::: js/src/jscntxtinlines.h
@@ +567,5 @@
>          return nullptr;
>  
>      if (act->isJit()) {
> +        if (act->hasWasmExitFP())
> +            return nullptr;

Can this explicit test be removed?  I would think GetPcScript() should do the right thing anyway (or should be fixed to).

::: js/src/vm/Stack.cpp
@@ +521,5 @@
>  {
> +    act_ = act;
> +    MOZ_ASSERT(act->hasExitFP(), "packedExitFP is used to determine if JSJit or wasm");
> +    uint32_t actualWasmBit = intptr_t(act->packedExitFP()) & jit::JitActivation::WasmBitMask;
> +    if (actualWasmBit == (1 - jit::JitActivation::WasmBit)) {

Agreed with Jan.  Actually, by any chance can we remove packedExitFP() as a public method so everyone is using the appropriate queries?

@@ +546,5 @@
>  {
>      if (!isSome())
>          return true;
>      if (isJSJit())
> +        return asJSJit().done() && asJSJit().type() != jit::JitFrame_WasmToJSJit;

Looking at operator++, we seem to implicitly skip over JitFrame_WasmToJSJit so is there some other path (maybe the ctor) which does leave us in this state?  Seems like if we had the classic Ctor() { settle() } / op++ { inc(); settle(); } design it would avoid this case entirely so done() wouldn't need to change.  Can we do that?

@@ +580,5 @@
> +        // popped.
> +
> +        wasm::Frame* prevFP = (wasm::Frame*) jitFrame.prevFp();
> +        iter_.destroy();
> +        iter_.construct<wasm::WasmFrameIter>(act_, prevFP);

It's kindof beautiful that, with everything else in place, this is all it takes to iterate from one to the other.

::: js/src/vm/Stack.h
@@ +1667,5 @@
> +
> +    // WebAssembly specific attributes.
> +    bool hasWasmExitFP() const {
> +        MOZ_ASSERT(packedExitFP_, "FP must be set to determine if this is a wasm JitActivation");
> +        return (intptr_t(packedExitFP_) & WasmBitMask) == WasmBit;

Can we remove the MOZ_ASSERT(packedExitFP_) assert?  When packedExitFP_ is null, the predicate will return the logically-correct result.

@@ +1675,5 @@
> +        return (wasm::Frame*)(intptr_t(packedExitFP_) & ~WasmBit);
> +    }
> +    void setWasmExitFP(const wasm::Frame* fp) {
> +        if (fp) {
> +            packedExitFP_ = (uint8_t*)(intptr_t(fp) | WasmBit);

nit: can you assert WasmBit isn't set in fp?

::: js/src/wasm/WasmBuiltins.cpp
@@ +66,4 @@
>  CallingActivation()
>  {
>      Activation* act = TlsContext.get()->activation();
> +    MOZ_ASSERT(act->asJit()->isActive(), "WasmCall pushes an active JitActivation");

With builtin thunks etc, can we assert here the activation hasWasmExitFP()?

@@ +156,5 @@
>      return true;
>  }
>  
> +void*
> +wasm::HandleThrow(JSContext* cx, WasmFrameIter& iter)

Could you keep the a modified version of the original comment?

@@ -167,3 @@
>  {
> -    WasmActivation* activation = CallingActivation();
> -    JSContext* cx = activation->cx();

Can you MOZ_ASSERT(CallingActivation() == iter.activation())?

::: js/src/wasm/WasmCompartment.h
@@ +24,5 @@
>  namespace js {
>  
> +namespace jit {
> +class JitActivation;
> +} // namespace JitActivation

I think we may no longer need this friendship (I think it was from when we tracked number-of-live-wasm-activations in the wasm::Compartment).  If we do need something, I think I'd rather make it public and treat JitActivation as a client instead of private extension.

::: js/src/wasm/WasmFrameIter.cpp
@@ +114,5 @@
>          codeRange_ = nullptr;
>          callsite_ = nullptr;
>  
>          if (unwind_ == Unwind::True) {
> +            // TODO bug 1319203: there may be other JIT frames above.

nit: to not scare anyone, could you change to "TODO: with bug 1319203, there may be other JIT frames above"?

@@ +367,5 @@
>              *tierEntry = masm.currentOffset();
>      }
>  
> +    // To make the jit exit faster, we don't set exitFP although we set the
> +    // exit reason: the profiling iteration knows how to recover FP.

We're not setting the exit reason, so I don't think the comment is quite right.

@@ +368,5 @@
>      }
>  
> +    // To make the jit exit faster, we don't set exitFP although we set the
> +    // exit reason: the profiling iteration knows how to recover FP.
> +    if (!reason.isNone() && !(reason.isFixed() && reason.fixed() == ExitReason::Fixed::ImportJit)) {

Could GenerateImportJitExit instead pass ExitReason::None() so we don't need this special case?

@@ +660,5 @@
>  
> +    // The frame pointer might be in the process of tagging/untagging; make
> +    // sure it's untagged.
> +    bool taggedFP = intptr_t(registers.fp) & intptr_t(JitActivation::WasmBit);
> +    Frame* const fp = (Frame*) (intptr_t(registers.fp) - (taggedFP ? 1 : 0));

nit: I think you can express this more succinctly as:
  Frame* const fp = (Frame*)(uintptr_t(registers.fp) & ~WasmBit);

@@ +820,5 @@
>      exitReason_(ExitReason::Fixed::None)
>  {
>      // In the case of ImportJitExit, the fp register may be temporarily
>      // clobbered on return from Ion so always use activation.fp when it is set.
> +    if (activation.hasExitFP() && activation.hasWasmExitFP()) {

With the other suggested change, you wouldn't need both hasExitFP() and hasWasmExitFP().

::: js/src/wasm/WasmInstance.cpp
@@ -679,5 @@
> -        // the optimized wasm-to-Ion FFI call path (which we want to be very
> -        // fast) can avoid doing so. The JitActivation is marked as inactive so
> -        // stack iteration will skip over it.
> -        WasmActivation activation(cx);
> -        JitActivation jitActivation(cx, /* active */ false);

\o/  IIUC, you can now completely kill the JitActivation field and a bunch of !active-skipping logic.

::: js/src/wasm/WasmStubs.cpp
@@ +710,5 @@
>      // the JIT ABI requires that sp be JitStackAlignment-aligned *after* pushing
>      // the return address.
>      static_assert(WasmStackAlignment >= JitStackAlignment, "subsumes");
>      unsigned sizeOfRetAddr = sizeof(void*);
> +    unsigned sizeOfHeader = WasmFrameLayout::Size() - sizeOfRetAddr;

nit: "Header" is a bit vague.  "FrameMinusRetAddr" is a bit long.  How about "PreFrame"?

@@ +838,5 @@
> +        // call by restoring our tagged exitFP.
> +        masm.orPtr(Imm32(JitActivation::WasmBit), FramePointer);
> +        LoadActivation(masm, scratch);
> +        masm.storePtr(FramePointer, Address(scratch, JitActivation::offsetOfPackedExitFP()));
> +        masm.andPtr(Imm32(~JitActivation::WasmBit), FramePointer);

Instead of exposing LoadActivation, how about factoring out SetExitFP() and ClearExitFP() which are called both here and by GenerateCallable(Prologue|Epilogue)?
Attachment #8913173 - Flags: review?(luke) → review+
Comment on attachment 8912859 [details] [diff] [review]
f.tests.patch

Review of attachment 8912859 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit-test/tests/asm.js/testFFI.js
@@ +126,5 @@
>  for (var i = 0; i < 15; i++)
>      assertEq(f(i), i+10);
>  
>  // Test the throw path
> +function ffiThrow(n) { if (n == 14) throw new Error('yolo'); }

Why is this change necessary?

@@ +151,5 @@
> +    valueToConvert = {valueOf: function () { throw new Error("FAIL"); }};
> +    try {
> +        f(40);
> +    } catch(e) {
> +        assertEq(e.message, "FAIL");

Could this test not stay the same?  It seems to test a weaker condition now (viz., the test will pass if it doesn't throw).

::: js/src/jit-test/tests/wasm/basic.js
@@ +125,5 @@
>  wasmEvalText('(module (import "a" "" (result f32)))', {a:{"":()=>{}}});
>  wasmEvalText('(module (import "a" "" (result f64)))', {a:{"":()=>{}}});
>  wasmEvalText('(module (import $foo "a" "" (result f64)))', {a:{"":()=>{}}});
>  
> +(function testFFIJitExit() {

Could you move this test to wasm/import-export.js and name it testImportJitExit ("ffi" is an asm.js term).
Attachment #8912859 - Flags: review?(luke) → review+
Comment on attachment 8913174 [details] [diff] [review]
3c.untrustedfp.patch

Review of attachment 8913174 [details] [diff] [review]:
-----------------------------------------------------------------

Sad we have to do this, but perfectly done given that we do!  Maybe one day we'll enable fp for all JIT code and then fp will be preserved by the caller...

::: js/src/wasm/WasmFrameIter.cpp
@@ +769,5 @@
>              AssertMatchesCallSite(activation, fixedPC, fixedFP);
>          } else {
> +            if (codeRange->kind() == CodeRange::ImportJitExit) {
> +                // The jit exit contains a range where the value of FP can't be
> +                // trusted.

Could you add "Technically, we could recover fp from sp, but since the range is so short, for now just drop the stack."

::: js/src/wasm/WasmTypes.h
@@ +1130,5 @@
> +    }
> +
> +    // ImportJitExit have a particular range where the value of FP can't be
> +    // trusted for profiling and thus must be ignored.
> +    uint32_t jitExitUntrustedFPStart() const {

nit: can you add \n after comment?
Attachment #8913174 - Flags: review?(luke) → review+
Comment on attachment 8913175 [details] [diff] [review]
3d.handle-failures.patch

Review of attachment 8913175 [details] [diff] [review]:
-----------------------------------------------------------------

This was surprisingly clean :)

::: js/src/jit/JitFrames.cpp
@@ +637,5 @@
>      DebugModeOSRVolatileJitFrameIter iter(cx);
> +    while (!iter.done()) {
> +        if (iter.isWasm()) {
> +            HandleExceptionWasm(cx, &iter.asWasm(), rfe);
> +            continue;

Although I think this is fine for now because there can't be JSJit frames after wasm frames, in the next patch I think, after iter.asWasm() is unwound, we'll need to trigger some sort of operation on 'iter' here to advance to the next frame group or be done().
Attachment #8913175 - Flags: review?(luke) → review+
Comment on attachment 8913176 [details] [diff] [review]
3e.profiling.patch

Review of attachment 8913176 [details] [diff] [review]:
-----------------------------------------------------------------

Nice!

::: js/src/vm/Stack.cpp
@@ +1949,5 @@
>          settle();
>          return;
>      }
>  
>      ++jitIter();

As a follow-up, could jit::JitProfilingFrameIterator be renamed to JSJitProfilingFrameIterator and this method to jsJitIter()?  rs=me :)

@@ +1952,5 @@
>  
>      ++jitIter();
> +
> +    // Handle transition frames (see comment in JitFrameIter::operator++).
> +    if (!jitIter().done() && jitIter().frameType() == jit::JitFrame_WasmToJSJit) {

Same question as in the earlier patch: could jitIter() just be done() here and settle() normalizes to a wasm frame group if necessary.

@@ +1995,5 @@
> +    //   exitFP low bit is tagged and we can test hasWasmExitFP().
> +    // - we're in wasm code, so we can do a lookup on PC.
> +    // - in all the other cases, we're not in wasm or we haven't exited from
> +    //   wasm.
> +    if ((activation->hasExitFP() && activation->hasWasmExitFP()) ||

Same comment as earlier with not needing hasExitFP().

@@ +2019,5 @@
>  
> +    // The same reasoning as in the above iteratorConstruct variant applies
> +    // here, except that it's even simpler: since this activation is higher up
> +    // on the stack, it can only have exited to C++, through wasm or ion.
> +    if (activation->hasExitFP() && activation->hasWasmExitFP()) {

ditto
Attachment #8913176 - Flags: review?(luke) → review+
Comment on attachment 8913357 [details] [diff] [review]
4.fast.lookupcode.patch

Review of attachment 8913357 [details] [diff] [review]:
-----------------------------------------------------------------

Great to have this optimized in general!

::: js/src/wasm/WasmCode.cpp
@@ +251,5 @@
>  
>      if (!cs->initialize(tier, Move(codeBytes), codeLength, bytecode, linkData, metadata))
>          return nullptr;
>  
> +    return UniqueCodeSegment(cs.release());

I think this can now just be 'return cs' (C++ does the conversion to rvalue for you b/c it's a return statement).

::: js/src/wasm/WasmCode.h
@@ +78,5 @@
>      uint8_t*        interruptCode_;
>      uint8_t*        outOfBoundsCode_;
>      uint8_t*        unalignedAccessCode_;
>  
> +    const Code*     code_;

nit: could you put this field right above 'tier_' (no \n between)?

@@ +118,5 @@
> +                                    const ShareableBytes& bytecode,
> +                                    const LinkDataTier& linkData,
> +                                    const Metadata& metadata);
> +
> +    void attach(const Code* code) {

nit: 'attach' sounds like something more meaningful is happening; how about initCode()?

@@ +138,5 @@
>      }
>  
> +    const Code* code() const {
> +        MOZ_ASSERT(code_, "should have been attached");
> +        return code_;

nit: could you move this to the line before tier() and make it on one line (removing "should have been attached" since I think that's clear enough)?

::: js/src/wasm/WasmCompartment.h
@@ +34,5 @@
>  
> +// A KISS ref-counting for the vector of code segments in the Compartment,
> +// since they can be reused by different instances (via Code sharing).
> +
> +struct SharedCodeSegment {

Is this ref-counting wrapper really necessary?  I would think a sorted list of CodeSegments would tolerate duplicates without any special trouble.  The reference counting scheme does reduce the size of codeSegments_ in few-module-many-instance scenarios, but that seems somewhat more rare and I'd expect the savings not to be terribly significant even in that case.

@@ +91,5 @@
>  
>      const InstanceVector& instances() const { return instances_; }
>  
> +    // This methods returns the wasm::CodeSegment containing the given pc, if
> +    // any exists in the compartment.

nit: s/exists/exit/

::: js/src/wasm/WasmFrameIter.cpp
@@ +62,5 @@
>      // instead.
>  
>      code_ = &fp_->tls->instance->code();
> +    MOZ_ASSERT(code_ ==
> +        activation->compartment()->wasm.lookupCodeSegment(activation->wasmUnwindPC())->code());

nit: How about keeping lookupCode() as a trivial wrapper that returns lookupCodeSegment(pc)->code()?  Maybe half the changes from lookupCode() to lookupCodeSegment() then be reverted (including this ugly line-wrapping).
Attachment #8913357 - Flags: review?(luke) → review+
Comment on attachment 8913607 [details] [diff] [review]
5.codesegment.outparams.patch

Review of attachment 8913607 [details] [diff] [review]:
-----------------------------------------------------------------

Nice simplifications!
Attachment #8913607 - Flags: review?(luke) → review+
Thanks a lot for all the reviews! I'll address the remarks as best as I can. Answering a few questions already:

From comment 61:

> @@ +367,5 @@
> >              *tierEntry = masm.currentOffset();
> >      }
> >  
> > +    // To make the jit exit faster, we don't set exitFP although we set the
> > +    // exit reason: the profiling iteration knows how to recover FP.
> 
> We're not setting the exit reason, so I don't think the comment is quite right.
> 
> @@ +368,5 @@
> >      }
> >  
> > +    // To make the jit exit faster, we don't set exitFP although we set the
> > +    // exit reason: the profiling iteration knows how to recover FP.
> > +    if (!reason.isNone() && !(reason.isFixed() && reason.fixed() == ExitReason::Fixed::ImportJit)) {
> 
> Could GenerateImportJitExit instead pass ExitReason::None() so we don't need this special case?

We actually do set exitReason in the prologue for getting a pretend frame in
profiling mode! See comment 31. (Otherwise, in profiling mode, we lose the
exit in stacks, as in comment 28 index 6)

Since in all the cases we push something onto the stack (even if it's 0 to say
no exit reason), it seemed worth it to push the correct exit reason to get the
fake frame in profiling mode.

From comment 62:

> ::: js/src/jit-test/tests/asm.js/testFFI.js
> @@ +126,5 @@
> >  for (var i = 0; i < 15; i++)
> >      assertEq(f(i), i+10);
> >  
> >  // Test the throw path
> > +function ffiThrow(n) { if (n == 14) throw new Error('yolo'); }
> 
> Why is this change necessary?
> 
> @@ +151,5 @@
> > +    valueToConvert = {valueOf: function () { throw new Error("FAIL"); }};
> > +    try {
> > +        f(40);
> > +    } catch(e) {
> > +        assertEq(e.message, "FAIL");
> 
> Could this test not stay the same?  It seems to test a weaker condition now (viz., the test will pass if it doesn't throw).

Both these tests found more interesting issues with an Error (because the Error
creates a stack thus uses the iterators, even if it's not read later). I can
keep both tests as they were previously and add two copies using the Error
ctor instead.


From comment 66:

> ::: js/src/wasm/WasmCompartment.h
> @@ +34,5 @@
> >  
> > +// A KISS ref-counting for the vector of code segments in the Compartment,
> > +// since they can be reused by different instances (via Code sharing).
> > +
> > +struct SharedCodeSegment {
> 
> Is this ref-counting wrapper really necessary?  I would think a sorted list of CodeSegments would tolerate duplicates without any special trouble.  The reference counting scheme does reduce the size of codeSegments_ in few-module-many-instance scenarios, but that seems somewhat more rare and I'd expect the savings not to be terribly significant even in that case.

So we'd have copies of the same CodeSegment*? Fair enough (the issue I was
fighting with was that an instance couldn't just remove the CodeSegment* from
the list, if the CodeSegment* was shared by many instances).

> @@ +91,5 @@
> >  
> >      const InstanceVector& instances() const { return instances_; }
> >  
> > +    // This methods returns the wasm::CodeSegment containing the given pc, if
> > +    // any exists in the compartment.
> 
> nit: s/exists/exit/

Maybe my initial sentence is incorrect, but not sure "exit" is the right word
here. I wanted to mean:

This method returns the wasm::CodeSegment containing the given pc, given that
there is such a one in the compartment.

(it still sounds a bit clumsy)
(In reply to Benjamin Bouvier [:bbouvier] from comment #68)
> We actually do set exitReason in the prologue for getting a pretend frame in
> profiling mode!

Ah hah, gotcha.

> Both these tests found more interesting issues with an Error (because the
> Error creates a stack thus uses the iterators, even if it's not read later). I can
> keep both tests as they were previously and add two copies using the Error
> ctor instead.

Ah good, so this is just to make the test more strenuous, makes sense.

> So we'd have copies of the same CodeSegment*?

Exactly.

> > > +    // This methods returns the wasm::CodeSegment containing the given pc, if
> > > +    // any exists in the compartment.
> > 
> > nit: s/exists/exit/
> 
> Maybe my initial sentence is incorrect, but not sure "exit" is the right word
> here.

Oops, I meant s/exists/exist/, just a tiny grammar tweak
> Also maybe we could find something meaningful to assert about wasm?

Just unwinding all the wasm frames one by one runs a bunch of asserts in the wasm::WasmFrameIter methods, so still good to have. Thanks.

> ::: js/src/wasm/WasmBuiltins.cpp
> @@ +66,4 @@
> >  CallingActivation()
> >  {
> >      Activation* act = TlsContext.get()->activation();
> > +    MOZ_ASSERT(act->asJit()->isActive(), "WasmCall pushes an active JitActivation");
> 
> With builtin thunks etc, can we assert here the activation hasWasmExitFP()?

No, because if we're unwinding an exception in jit::HandleExceptions, exitFP may be marked as JS, so the assertion faults when called in the new assertion suggested in wasm::HandleThrow (CallingActivation() == iter.activation()).

> ::: js/src/wasm/WasmCode.cpp
> @@ +251,5 @@
> >  
> >      if (!cs->initialize(tier, Move(codeBytes), codeLength, bytecode,
> >      linkData, metadata))
> >          return nullptr;
> >  
> > +    return UniqueCodeSegment(cs.release());
> 
> I think this can now just be 'return cs' (C++ does the conversion to rvalue for
> you b/c it's a return statement).

clang 4.0 couldn't resolve the right ctor variant, so it had to be kept explicit.
Pushed by bbouvier@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/aac9140f624f
Rename JitActivation::exitFP to JitActivation::packedExitFP; r=jandem
https://hg.mozilla.org/integration/mozilla-inbound/rev/12639b7cafe8
Merge WasmActivation into JitActivation and make wasm->jit calls faster; r=luke, r=jandem
https://hg.mozilla.org/integration/mozilla-inbound/rev/1ce73128f135
Tests; r=luke
https://hg.mozilla.org/integration/mozilla-inbound/rev/d857c934cbad
Make WasmCompartment code lookup blazingly fast; r=luke
https://hg.mozilla.org/integration/mozilla-inbound/rev/e7d47205fc1d
Remove all the CodeSegment outparams in wasm::Code lookup functions; r=luke
https://hg.mozilla.org/integration/mozilla-inbound/rev/0196626330ad
Remove JitActivation::active_; r=jandem
After discussing from IRC, regarding the new assertions, we decided that jit::HandleExceptionWasm would explicitly maintain the wasm invariant (setWasmExitFp(getJsExitFp())) so we always have exitFP marked as wasm when entering wasm::HandleThrow.
Keywords: leave-open
Blocks: 1406041
Depends on: 1406879
No longer blocks: 1406041
Depends on: 1406041
Depends on: 1410192
Depends on: 1414852
You need to log in before you can comment on or make changes to this bug.