Closed
Bug 1319388
Opened 8 years ago
Closed 7 years ago
Wasm baseline: Generate function prologue in-line
Categories
(Core :: JavaScript Engine: JIT, defect, P3)
Core
JavaScript Engine: JIT
Tracking
()
RESOLVED
FIXED
mozilla53
Tracking | Status | |
---|---|---|
firefox53 | --- | fixed |
People
(Reporter: lth, Assigned: lth)
References
Details
Attachments
(2 files, 2 obsolete files)
13.17 KB,
patch
|
jandem
:
review+
|
Details | Diff | Splinter Review |
5.90 KB,
patch
|
lth
:
review+
|
Details | Diff | Splinter Review |
Right now the code emitted at the start of the function jumps to the prologue code which is emitted at the end of the function, which then jumps back to the body if the stack overflow check worked out OK. This means more static code plus more jumping around in every function. Depending on how you count, it is one or two jumps extra, but since we can probably jump OOL for the stack overflow and flow through the conditional jump if no overflow, then really it's closer to two than one. It would be nice to generate the code in-line in the prologue. However, for that to happen, we have to backpatch the stack overflow check because the size of the frame won't be known until all code has been generated for the function. Initially, therefore, we'd have to emit space for a subtract-constant-from-register that allows for the largest possible frame, and patch it in endFunction(). (At that point we can choose to leave a SUB r, 0 in place or replace it with nops.) See https://bug1316803.bmoattachments.org/attachment.cgi?id=8812902 for an example of what the current code looks like, though in that case the additional frame size adjustment is zero so the subtraction in the out-of-line prologue code is not emitted. Imagine it present just after the RSP is being read.
Assignee | ||
Comment 1•8 years ago
|
||
Roughly like this, but the patching machinery needs more work (both to move it out of the compiler and into the assembler, and to make it work properly), see comments in the code.
Assignee | ||
Comment 2•8 years ago
|
||
Testing shows that many functions are small: bug 1320374 comment 8. This optimization may be quite useful.
Priority: P5 → P3
Assignee | ||
Updated•8 years ago
|
Assignee: nobody → lhansen
Status: NEW → ASSIGNED
Assignee | ||
Comment 3•8 years ago
|
||
This is general cleanup, really, but it seems worth doing because it's not hard. It will probably yield a very small performance improvement across platforms and probably a code size improvement on x86 and x64. It's the sibling of 1319358 (generate return code in-line when we can).
Attachment #8813164 -
Attachment is obsolete: true
Attachment #8817750 -
Flags: review?(luke)
Comment 4•8 years ago
|
||
Yury: with this patch, and assuming the stack depth gets aligned to WasmStackAlignment (which I'll check), the debug trap handler won't have to dynamically align the stack.
Comment 5•8 years ago
|
||
Comment on attachment 8817750 [details] [diff] [review] bug1319388-inline-prologue.patch Review of attachment 8817750 [details] [diff] [review]: ----------------------------------------------------------------- Great idea! Before landing, could you run the MacroAssembler addition by a JIT person (perhaps Jan) to make sure this is the idiomatic way to express a patchable add? (E.g., should this be in MacroAssembler.h?) ::: js/src/wasm/WasmBaselineCompile.cpp @@ +592,5 @@ > //////////////////////////////////////////////////////////// > // > + // Assembler functionality (should move). > + > + CodeOffset sub32FromStackPointerWithPatch(Register r) { nit: should this be named add32ToStackPointerWithPatch? E.g., this makes the AboveOrEqual branch condition after the add make sense. With this change, a few other names/comments would need to be updated. @@ +2218,4 @@ > > MOZ_ASSERT(maxFramePushed_ >= localSize_); > > + patchSub32FromStackPointer(stackSubOffset_, int32_t(maxFramePushed_ - localSize_)); Could you round up the stack alignment so the stack is always WasmStackAlignment-aligned? With that change, I believe you'll be able to remove FunctionCall::frameAlignAdjustment (since it'll always be zero).
Attachment #8817750 -
Flags: review?(luke) → review+
Assignee | ||
Comment 6•8 years ago
|
||
Comment on attachment 8817750 [details] [diff] [review] bug1319388-inline-prologue.patch Jan, see comment 5: Are you OK with these masm changes, which are pretty minimal but not cross-platform? The alternative is perhaps to abstract a pair of functions like add32ToPtrRegWithPatch() / patchAdd32ToPtrReg() and export those functions from the MacroAssembler cross-platform. Which I can do, if you think it has utility.
Attachment #8817750 -
Flags: review?(jdemooij)
Comment 7•8 years ago
|
||
Comment on attachment 8817750 [details] [diff] [review] bug1319388-inline-prologue.patch Review of attachment 8817750 [details] [diff] [review]: ----------------------------------------------------------------- (In reply to Lars T Hansen [:lth] (On PTO until 27 Dec) from comment #6) > The alternative is perhaps to abstract a pair of functions like > add32ToPtrRegWithPatch() / patchAdd32ToPtrReg() and export those functions > from the MacroAssembler cross-platform. Which I can do, if you think it has > utility. Yeah I think this would be nice for consistency. It might also make life a little easier for the MIPS people, as they (hopefully) won't have to touch WasmBaselineCompile.cpp for this when they get to it.
Attachment #8817750 -
Flags: review?(jdemooij)
Comment 8•7 years ago
|
||
The more general add32ToPtrRegWithPatch() sounds good; in that case it could subsume the existing x86-shared addlWithWithPatch() in Assembler-x86-shared.h.
Assignee | ||
Comment 9•7 years ago
|
||
As discussed, abstracts add32ToPtrWithPatch(src,dest) and patchAdd32ToPtr(offset,value) at the MacroAssembler layer and implements on all platforms (stubs on arm64 and mips).
Attachment #8822135 -
Flags: review?(jdemooij)
Assignee | ||
Comment 10•7 years ago
|
||
(In reply to Luke Wagner [:luke] from comment #5) > Comment on attachment 8817750 [details] [diff] [review] > bug1319388-inline-prologue.patch > > Review of attachment 8817750 [details] [diff] [review]: > ----------------------------------------------------------------- > > @@ +2218,4 @@ > > > > MOZ_ASSERT(maxFramePushed_ >= localSize_); > > > > + patchSub32FromStackPointer(stackSubOffset_, int32_t(maxFramePushed_ - localSize_)); > > Could you round up the stack alignment so the stack is always > WasmStackAlignment-aligned? With that change, I believe you'll be able to > remove FunctionCall::frameAlignAdjustment (since it'll always be zero). I think that may be a misunderstanding brought on by the names I had chosen for the helper functions (sub32FromStackPointerWithPatch and patchSub32FromStackPointer), as those functions do not actually update the stack pointer but compute temp=SP+patchedValue. The compiler still uses naked push/pop stack manipulation everywhere; the stack frame is still dynamically sized. Or if you were thinking of something else: what were you thinking of?
Flags: needinfo?(luke)
Assignee | ||
Comment 11•7 years ago
|
||
Updated compiler patch, carrying r+.
Attachment #8817750 -
Attachment is obsolete: true
Attachment #8822136 -
Flags: review+
Comment 12•7 years ago
|
||
Ah yes, you're right, some wires got crossed in my head, so n/m. In the future, though, we could have a constant stack depth and change all the pushing/popping to access sp[K] (so a push is K++). The benefit would be no need to maintain a virtual (memory) fp register (or fine-grained stack-depth pc ranges) to support async interrupt + stack-walk.
Flags: needinfo?(luke)
Comment 13•7 years ago
|
||
Comment on attachment 8822135 [details] [diff] [review] bug1319388-patchable-add.patch Review of attachment 8822135 [details] [diff] [review]: ----------------------------------------------------------------- Thanks!
Attachment #8822135 -
Flags: review?(jdemooij) → review+
Assignee | ||
Comment 14•7 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=28dfc715a7e5
Assignee | ||
Comment 15•7 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/e0e05d0bc34b19e5c034de40f1435cde9a803226 Bug 1319388 - Add add32ToPtrWithPatch and patchAdd32ToPtr to MacroAssembler. r=jandem https://hg.mozilla.org/integration/mozilla-inbound/rev/0de5f500516b0d576fd77f342ac2570b536e8115 Bug 1319388 - wasm baseline: Generate function prologue in-line by patching. r=luke
Comment 16•7 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/e0e05d0bc34b https://hg.mozilla.org/mozilla-central/rev/0de5f500516b
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
status-firefox53:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla53
You need to log in
before you can comment on or make changes to this bug.
Description
•