Closed Bug 1319388 Opened 8 years ago Closed 7 years ago

Wasm baseline: Generate function prologue in-line

Categories

(Core :: JavaScript Engine: JIT, defect, P3)

defect

Tracking

()

RESOLVED FIXED
mozilla53
Tracking Status
firefox53 --- fixed

People

(Reporter: lth, Assigned: lth)

References

Details

Attachments

(2 files, 2 obsolete files)

Right now the code emitted at the start of the function jumps to the prologue code which is emitted at the end of the function, which then jumps back to the body if the stack overflow check worked out OK.

This means more static code plus more jumping around in every function.  Depending on how you count, it is one or two jumps extra, but since we can probably jump OOL for the stack overflow and flow through the conditional jump if no overflow, then really it's closer to two than one.

It would be nice to generate the code in-line in the prologue.  However, for that to happen, we have to backpatch the stack overflow check because the size of the frame won't be known until all code has been generated for the function.  Initially, therefore, we'd have to emit space for a subtract-constant-from-register that allows for the largest possible frame, and patch it in endFunction().  (At that point we can choose to leave a SUB r, 0 in place or replace it with nops.)

See https://bug1316803.bmoattachments.org/attachment.cgi?id=8812902 for an example of what the current code looks like, though in that case the additional frame size adjustment is zero so the subtraction in the out-of-line prologue code is not emitted.  Imagine it present just after the RSP is being read.
Attached patch bug1319388-prologue.patch (obsolete) — Splinter Review
Roughly like this, but the patching machinery needs more work (both to move it out of the compiler and into the assembler, and to make it work properly), see comments in the code.
Testing shows that many functions are small: bug 1320374 comment 8.  This optimization may be quite useful.
Priority: P5 → P3
Assignee: nobody → lhansen
Status: NEW → ASSIGNED
Attached patch bug1319388-inline-prologue.patch (obsolete) — Splinter Review
This is general cleanup, really, but it seems worth doing because it's not hard.  It will probably yield a very small performance improvement across platforms and probably a code size improvement on x86 and x64.  It's the sibling of 1319358 (generate return code in-line when we can).
Attachment #8813164 - Attachment is obsolete: true
Attachment #8817750 - Flags: review?(luke)
Yury: with this patch, and assuming the stack depth gets aligned to WasmStackAlignment (which I'll check), the debug trap handler won't have to dynamically align the stack.
Comment on attachment 8817750 [details] [diff] [review]
bug1319388-inline-prologue.patch

Review of attachment 8817750 [details] [diff] [review]:
-----------------------------------------------------------------

Great idea!  Before landing, could you run the MacroAssembler addition by a JIT person (perhaps Jan) to make sure this is the idiomatic way to express a patchable add?  (E.g., should this be in MacroAssembler.h?)

::: js/src/wasm/WasmBaselineCompile.cpp
@@ +592,5 @@
>      ////////////////////////////////////////////////////////////
>      //
> +    // Assembler functionality (should move).
> +
> +    CodeOffset sub32FromStackPointerWithPatch(Register r) {

nit: should this be named add32ToStackPointerWithPatch?  E.g., this makes the AboveOrEqual branch condition after the add make sense.  With this change, a few other names/comments would need to be updated.

@@ +2218,4 @@
>  
>          MOZ_ASSERT(maxFramePushed_ >= localSize_);
>  
> +        patchSub32FromStackPointer(stackSubOffset_, int32_t(maxFramePushed_ - localSize_));

Could you round up the stack alignment so the stack is always WasmStackAlignment-aligned?  With that change, I believe you'll be able to remove FunctionCall::frameAlignAdjustment (since it'll always be zero).
Attachment #8817750 - Flags: review?(luke) → review+
Comment on attachment 8817750 [details] [diff] [review]
bug1319388-inline-prologue.patch

Jan, see comment 5: Are you OK with these masm changes, which are pretty minimal but not cross-platform?

The alternative is perhaps to abstract a pair of functions like add32ToPtrRegWithPatch() / patchAdd32ToPtrReg() and export those functions from the MacroAssembler cross-platform.  Which I can do, if you think it has utility.
Attachment #8817750 - Flags: review?(jdemooij)
Comment on attachment 8817750 [details] [diff] [review]
bug1319388-inline-prologue.patch

Review of attachment 8817750 [details] [diff] [review]:
-----------------------------------------------------------------

(In reply to Lars T Hansen [:lth] (On PTO until 27 Dec) from comment #6)
> The alternative is perhaps to abstract a pair of functions like
> add32ToPtrRegWithPatch() / patchAdd32ToPtrReg() and export those functions
> from the MacroAssembler cross-platform.  Which I can do, if you think it has
> utility.

Yeah I think this would be nice for consistency. It might also make life a little easier for the MIPS people, as they (hopefully) won't have to touch WasmBaselineCompile.cpp for this when they get to it.
Attachment #8817750 - Flags: review?(jdemooij)
The more general add32ToPtrRegWithPatch() sounds good; in that case it could subsume the existing x86-shared addlWithWithPatch() in Assembler-x86-shared.h.
As discussed, abstracts add32ToPtrWithPatch(src,dest) and patchAdd32ToPtr(offset,value) at the MacroAssembler layer and implements on all platforms (stubs on arm64 and mips).
Attachment #8822135 - Flags: review?(jdemooij)
(In reply to Luke Wagner [:luke] from comment #5)
> Comment on attachment 8817750 [details] [diff] [review]
> bug1319388-inline-prologue.patch
> 
> Review of attachment 8817750 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> @@ +2218,4 @@
> >  
> >          MOZ_ASSERT(maxFramePushed_ >= localSize_);
> >  
> > +        patchSub32FromStackPointer(stackSubOffset_, int32_t(maxFramePushed_ - localSize_));
> 
> Could you round up the stack alignment so the stack is always
> WasmStackAlignment-aligned?  With that change, I believe you'll be able to
> remove FunctionCall::frameAlignAdjustment (since it'll always be zero).

I think that may be a misunderstanding brought on by the names I had chosen for the helper functions (sub32FromStackPointerWithPatch and patchSub32FromStackPointer), as those functions do not actually update the stack pointer but compute temp=SP+patchedValue.  The compiler still uses naked push/pop stack manipulation everywhere; the stack frame is still dynamically sized.

Or if you were thinking of something else: what were you thinking of?
Flags: needinfo?(luke)
Updated compiler patch, carrying r+.
Attachment #8817750 - Attachment is obsolete: true
Attachment #8822136 - Flags: review+
Ah yes, you're right, some wires got crossed in my head, so n/m.  In the future, though, we could have a constant stack depth and change all the pushing/popping to access sp[K] (so a push is K++).  The benefit would be no need to maintain a virtual (memory) fp register (or fine-grained stack-depth pc ranges) to support async interrupt + stack-walk.
Flags: needinfo?(luke)
Comment on attachment 8822135 [details] [diff] [review]
bug1319388-patchable-add.patch

Review of attachment 8822135 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks!
Attachment #8822135 - Flags: review?(jdemooij) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/e0e05d0bc34b19e5c034de40f1435cde9a803226
Bug 1319388 - Add add32ToPtrWithPatch and patchAdd32ToPtr to MacroAssembler. r=jandem

https://hg.mozilla.org/integration/mozilla-inbound/rev/0de5f500516b0d576fd77f342ac2570b536e8115
Bug 1319388 - wasm baseline: Generate function prologue in-line by patching. r=luke
https://hg.mozilla.org/mozilla-central/rev/e0e05d0bc34b
https://hg.mozilla.org/mozilla-central/rev/0de5f500516b
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla53
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: