Closed Bug 932627 Opened 11 years ago Closed 7 years ago

Create VMFunctions wrappers under CodeGenerator::link.

Categories

(Core :: JavaScript Engine: JIT, defect, P3)

defect

Tracking

()

RESOLVED WONTFIX

People

(Reporter: nbp, Unassigned)

References

Details

Attachments

(1 file, 3 obsolete files)

Currently all VMFunction wrappers are generated ahead of time, on a nightly build we have around 188 VMFunctions which is both taking time at the initialization of the binary and time at the creation of the wrappers.

Creating wrappers when they are used for the first time will reduce the number of generated wrappers, and the need for an initialization list.

The initialization list is currently the only blocker behind Bug 905210 & Bug 894092 which basically mean that we can move all the statics back into the CodeGen functions, which will reduce the cold_startup of B2G applications.

The trick is that we would need to associate to each return address to the corresponding VMFunction, probably by adding it to the SafepointIndex, such as we can identify the VMFunction for a given call.

Note that adding a VMFunction pointer to the SafepointIndex would be redundant with the ExitFrame footer.  So we should probably remove the exit frame footer and always use the SafepointIndex to identify calls.  This will reduce the time needed to make VMFunction / DOM calls :)  [we should either do that as part of this bug or open a follow-up bug]
Component: JavaScript Engine → JavaScript Engine: JIT
As we discussed, instead of jumping through hoops to move compilation costs around at runtime, why don't we just include the VMFunction wrappers as part of the binary? This would also enable sharing the wrappers across IonRuntimes.
This patch handles the x64 version, as it was easier to test.

This add a new stub which is charged to emulate the stack in case of a GC, and to allocate a Stub in the atoms compartment for the VM wrapper, when it is required.

Stubs are eagerly generated for Baseline, because baseline is not compiled out of the main thread and because most of the callVM uses are made by the ICs of baseline which are already generated on-demand.

This patch also handles PJS, but we need to pay attention that PJS does not stop the world. Which means that multiple thread can enter the generic wrapper, as well as execute the jump/call which is patched.  I will open another bug to make the patching thread safe.

I am only asking for feedback, as this patch does not include the x86 / ARM version, but they would mostly be similar.
Attachment #826543 - Flags: feedback?(hv1989)
Depends on: 934314
Before going too deep into the review.

So there are 188 VMFunctions that need to get initialised. This is done only for every IonRuntime. So this is more a problem for b2g, since they use a new context for every app, right.

1) Can you give numbers on how many VMFunctions now get initialized? (with lazy initialisation)
2) What is the overhead of lazy init. ?
3) What are the pros and cons vs comment 1. Sharing across IonRuntimes seems better. Since we remove all initialisations during creation of IonRuntime. So why didn't you go that way?
(In reply to Hannes Verschore [:h4writer] from comment #3)
> Before going too deep into the review.
> 
> So there are 188 VMFunctions that need to get initialised. This is done only
> for every IonRuntime. So this is more a problem for b2g, since they use a
> new context for every app, right.
> 
> 1) Can you give numbers on how many VMFunctions now get initialized? (with
> lazy initialisation)
> 2) What is the overhead of lazy init. ?

I will answer these in 2 weeks.

> 3) What are the pros and cons vs comment 1. Sharing across IonRuntimes seems
> better. Since we remove all initialisations during creation of IonRuntime.
> So why didn't you go that way?

The goal of VM wrappers was to avoid manually writing tons of tiny C++ wrappers (1), while reducing the code size (2) needed for making such VM function calls.

Having C++ functions to act as wrappers would involve handling a different stack mechanism in C++, and do the marking from there.  Also, the fact of having them manually written in C++ was one of the mistake made in JM.  Note, having these wrapper manually written in inlined-assembly would be both error-prone and compiler-dependent.

Also, I think the approach suggested in comment 1 does not consider the current constraint, and it would be hard to make it because of the dependency on the Runtime.  Currently we depend on the Runtime to set the ionTop and to give the correct argument to the VMFunction.

I think we would first need to remove the dependence on the runtime, which work against (2), and then we can think of generating them once or having them statically generated (as opposed to (1)).  And I think this is way harder than making a lazy creation of VM wrappers, which is why I chose this approach.
Comment on attachment 826543 [details] [diff] [review]
x64 - Compile lazify the VM Functions used by Ion.

Review of attachment 826543 [details] [diff] [review]:
-----------------------------------------------------------------

So this does two things:
1) Make it possible to create vm wrappers on request, instead on the creation of the jitRuntime.
2) Only create VM wrappers on first call.

As I'm reading it, the complexity is in (2). Do we need (2)? And why? Since I think I could argue only (1) is needed and as soon as we want to make a vmcall, we know the function. So we could create the VM wrapper during compilation and already bake in the right pointers etc... The subset of VM wrappers that get created during compilation will also much less than all VM wrappers and will almost be the same as the ones that do get called...

So what am I missing here? Or why is it needed to also have (2)
Attachment #826543 - Flags: feedback?(hv1989)
(In reply to Hannes Verschore [:h4writer] from comment #5)
> Comment on attachment 826543 [details] [diff] [review]
> x64 - Compile lazify the VM Functions used by Ion.
> 
> Review of attachment 826543 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> So this does two things:
> 1) Make it possible to create vm wrappers on request, instead on the
> creation of the jitRuntime.
> 2) Only create VM wrappers on first call.
> 
> As I'm reading it, the complexity is in (2). Do we need (2)? And why? Since
> I think I could argue only (1) is needed and as soon as we want to make a
> vmcall, we know the function.

I agree, what we want is (1), (2) is the mean to achieve (1).

> So we could create the VM wrapper during
> compilation and already bake in the right pointers etc...

The problem is that we have no JSContext when we are generating the VMFunction calls from the CodeGenerator of IonMonkey.  The main reason why the VMWrapper creation has been moved to the the IonRuntime was to prevent the dependence on the JSContext and prevent locking the JSContext to allocate the VMWrapper.

At the time, I suggested to Brian to use the link phase to create all needed VMFunctions and to substitute their pointers in the code.  As we are going to have everything out off the main thread, one approach is to avoid depending on the JSContext when we do not have it, and to be even more lazy on the creation of VMWrappers, as in the current patch.

Otherwise, we could just go for the solution of doing a substitution during the link phase, which prevent any issue related to make the repatchable jumps thread safe. (Bug 934314)

> The subset of VM
> wrappers that get created during compilation will also much less than all VM
> wrappers and will almost be the same as the ones that do get called...

Some VMWrapper are only used as fallback solutions, but I can agree that this is not a big deal for the moment and this does not justify (2).

> So what am I missing here? Or why is it needed to also have (2)

The reason of (2) is mostly (1) & Parallel link-phase.

Note: I just asked Brian, and he told me that we are not going to have parallel link.  So, I will just go for the solution which allocates the VMWrappers during the link phase.
Thanks for the explanation.

(In reply to Nicolas B. Pierron [:nbp] from comment #6)
> Otherwise, we could just go for the solution of doing a substitution during
> the link phase, which prevent any issue related to make the repatchable
> jumps thread safe. (Bug 934314)

Yes, doing that during the link phase seems cleaner for me
Summary: Lazily create VMFunctions wrappers on the first call. → Create VMFunctions wrappers under CodeGenerator::link.
Attached patch lazy-vm-wrappers-2.patch (obsolete) — Splinter Review
This patch move the generation of the VM function wrappers under CodeGenerator::link function for IonMonkey and inlined for Baseline.

I tested this patch on x64/x86.  I still have to test it on ARM.
Attachment #826543 - Attachment is obsolete: true
Attachment #8334646 - Flags: review?(hv1989)
(In reply to Nicolas B. Pierron [:nbp] from comment #4)
> (In reply to Hannes Verschore [:h4writer] from comment #3)
> > Before going too deep into the review.
> > 
> > So there are 188 VMFunctions that need to get initialised. This is done only
> > for every IonRuntime. So this is more a problem for b2g, since they use a
> > new context for every app, right.
> > 
> > 1) Can you give numbers on how many VMFunctions now get initialized? (with
> > lazy initialisation)
> > 2) What is the overhead of lazy init. ?
> 
> I will answer these in 2 weeks.
Needinfo nbp to help him remember to answer this question when his current work has finished ;)
Flags: needinfo?(nicolas.b.pierron)
(In reply to Hannes Verschore [:h4writer] from comment #3)
> 1) Can you give numbers on how many VMFunctions now get initialized? (with
> lazy initialisation)

With the latest patch, we have:

on.js:  40 VMWrappers
no-jit.js: 8 VMWrappers (haha)
high.js: 40 VMWrappers

Before this patch this was 2x 198 VMWrappers.

> 2) What is the overhead of lazy init. ?

With the latest patch the overhead, we should have almost no extra overhead for baseline, and a bit of overhead for patching the calls in Ion.
Flags: needinfo?(nicolas.b.pierron)
Ok, I do not understand the metric reported by b2gperf, as it is still reporting the same results …

After checking the b2gperf harness, I still get the same results as before.  What I did not noticed before is that high.js has the same results as on.js, which basically means that we are eagerly compiling functions for the sake of type inference.

I will try to time Bug 932039 to see how much time we are losing by doing these unused compilation results.

In the mean time, this patch should still remove unused VMWrappers and improve the memory needed by the JSRuntime as well as the time needed to initialize them.
Comment on attachment 8334646 [details] [diff] [review]
lazy-vm-wrappers-2.patch

Review of attachment 8334646 [details] [diff] [review]:
-----------------------------------------------------------------

Lgtm

::: js/src/jit/CodeGenerator.cpp
@@ +5843,5 @@
>          ionScript->copyPatchableBackedges(cx, code, patchableBackedges_.begin());
>  
> +    // Generate VM wrappers needed by the current Ion Script.
> +    if (linkVMWrappers_.length() > 0)
> +        ionScript->linkVMWrappers(cx, code, linkVMWrappers_.begin(), linkVMWrappers_.end(), masm);

All other functions here use .begin() and length(). So instead of passing .end(), please pass .length().

::: js/src/jit/Ion.cpp
@@ +291,5 @@
>  
> +    IonSpew(IonSpew_Codegen, "# Emitting Unreachable Trap");
> +    unreachableTrap_ = generateUnreachableTrap(cx);
> +    if (!unreachableTrap_)
> +        return false;

This only makes sense for Debug builds, right? Would it make sense to not generate this for Optimized builds?

::: js/src/jit/JitCompartment.h
@@ +214,5 @@
>      // If signal handlers are installed, this contains all loop backedges for
>      // IonScripts in the runtime.
>      InlineList<PatchableBackedge> backedgeList_;
>  
> +    // Thunk that is not supposed to be called, but crash the application if it

s/, but crash/ and crashes/
Attachment #8334646 - Flags: review?(hv1989) → review+
(In reply to Hannes Verschore [:h4writer] from comment #13)
> ::: js/src/jit/Ion.cpp
> @@ +291,5 @@
> >  
> > +    IonSpew(IonSpew_Codegen, "# Emitting Unreachable Trap");
> > +    unreachableTrap_ = generateUnreachableTrap(cx);
> > +    if (!unreachableTrap_)
> > +        return false;
> 
> This only makes sense for Debug builds, right? Would it make sense to not
> generate this for Optimized builds?

No, it also makes sense for optimized builds, such as if a call is not registered, we will fail properly and not with a security issue.  On the other hand, I agree that this supposed to be dead code, but I don't think we can have static GC objects, so we need to allocate an IonCode to please the MacroAssembler which is encoding the call.
Attached patch lazy-vm-wrappers-2.patch (obsolete) — Splinter Review
I am asking again for review since I did quite a few changes to the previous patch.

Delta:
 - Remove functionWrapper lookup from Trampoline-arm.
 - Remove unused headers from VMFunction.h
 - Rebase: Use Linker::newCode<NoGC> for VMWrappers.
 - Use reportOutOfMemory in getVMWrapper if the newCode<NoGC> is failing.
 - Propagate OOMs correctly to prevent label assertions.
 - Split LinkVMWrapper into a Compile & Link, to simplify the error handling and locks.
Attachment #8334646 - Attachment is obsolete: true
Attachment #8339363 - Flags: review?(hv1989)
Comment on attachment 8339363 [details] [diff] [review]
lazy-vm-wrappers-2.patch

Review of attachment 8339363 [details] [diff] [review]:
-----------------------------------------------------------------

I just remember that all CodeOffsetLabel/CodeOffsetJump need a label.fixup(masm) for ARM. For the offset of "lastPatchableCall" we don't do that. Isn't that a potential fault?

(All other comments are mostly renaming requests)

::: js/src/jit/CodeGenerator.cpp
@@ +5797,5 @@
>  }
>  
> +
> +bool
> +CodeGenerator::compileVMWrappers(JSContext *cx)

initUsedVMWrappers?

@@ +5801,5 @@
> +CodeGenerator::compileVMWrappers(JSContext *cx)
> +{
> +    // VM Wrappers are shared stubs created in the atoms compartment.
> +    AutoLockForExclusiveAccess atomsLock(cx);
> +    JSRuntime::AutoLockForOperationCallback lock(cx->runtime());

Remove this.

@@ +5852,5 @@
>                       patchableBackedges_.length());
>      if (!ionScript)
>          return false;
>  
> +    if (!compileVMWrappers(cx)) {

move this under "JSRuntime::AutoLockForOperationCallback lock(cx->runtime());"

and remove the "JSRuntime::AutoLockForOperationCallback lock(cx->runtime());" in compileVMWrappers.

@@ +5945,5 @@
>          ionScript->copyCallTargetEntries(callTargets.begin());
>      if (patchableBackedges_.length() > 0)
>          ionScript->copyPatchableBackedges(cx, code, patchableBackedges_.begin());
>  
> +    // Generate VM wrappers needed by the current Ion Script.

Link VM wrapp...

@@ +5946,5 @@
>      if (patchableBackedges_.length() > 0)
>          ionScript->copyPatchableBackedges(cx, code, patchableBackedges_.begin());
>  
> +    // Generate VM wrappers needed by the current Ion Script.
> +    if (linkVMWrappers_.length() > 0)

s/linkVMWrappers_/patchableVMWrappers/

@@ +5947,5 @@
>          ionScript->copyPatchableBackedges(cx, code, patchableBackedges_.begin());
>  
> +    // Generate VM wrappers needed by the current Ion Script.
> +    if (linkVMWrappers_.length() > 0)
> +        ionScript->linkVMWrappers(cx, code, linkVMWrappers_.begin(),

s/linkVMWrappers/patchVMWrappers

::: js/src/jit/shared/CodeGenerator-shared.cpp
@@ +670,1 @@
>          return false;

I find it strange to re-use markSafepointAt to save the wrapper info. I would like to see it the other way around:

Create function markVMWrapper where we add the wrapper to linkVMWrappers_ and call markSafePointAt.
Attachment #8339363 - Flags: review?(hv1989)
(In reply to Hannes Verschore [:h4writer] from comment #17)
> Comment on attachment 8339363 [details] [diff] [review]
> lazy-vm-wrappers-2.patch
> 
> Review of attachment 8339363 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I just remember that all CodeOffsetLabel/CodeOffsetJump need a
> label.fixup(masm) for ARM. For the offset of "lastPatchableCall" we don't do
> that. Isn't that a potential fault?

The fixup is done as part of linkVMWrappers.

> @@ +5801,5 @@
> > +CodeGenerator::compileVMWrappers(JSContext *cx)
> > +{
> > +    // VM Wrappers are shared stubs created in the atoms compartment.
> > +    AutoLockForExclusiveAccess atomsLock(cx);
> > +    JSRuntime::AutoLockForOperationCallback lock(cx->runtime());
> 
> Remove this.
> 
> @@ +5852,5 @@
> >                       patchableBackedges_.length());
> >      if (!ionScript)
> >          return false;
> >  
> > +    if (!compileVMWrappers(cx)) {
> 
> move this under "JSRuntime::AutoLockForOperationCallback
> lock(cx->runtime());"
> 
> and remove the "JSRuntime::AutoLockForOperationCallback
> lock(cx->runtime());" in compileVMWrappers.

As discuss over IRC, we need to lock the operation callback to allocate the VMWrappers in the atoms compartments and the order of the lock matters, so we cannot move the exclusive lock under the operation callback lock.
 
> @@ +5946,5 @@
> >      if (patchableBackedges_.length() > 0)
> >          ionScript->copyPatchableBackedges(cx, code, patchableBackedges_.begin());
> >  
> > +    // Generate VM wrappers needed by the current Ion Script.
> > +    if (linkVMWrappers_.length() > 0)
> 
> s/linkVMWrappers_/patchableVMWrappers/

patchableVMCalls ?
This is not the VMWrappers which are patched.

> @@ +5947,5 @@
> >          ionScript->copyPatchableBackedges(cx, code, patchableBackedges_.begin());
> >  
> > +    // Generate VM wrappers needed by the current Ion Script.
> > +    if (linkVMWrappers_.length() > 0)
> > +        ionScript->linkVMWrappers(cx, code, linkVMWrappers_.begin(),
> 
> s/linkVMWrappers/patchVMWrappers

Same here?
Follow renaming nits.
Attachment #8339363 - Attachment is obsolete: true
Attachment #8340333 - Flags: review?(hv1989)
Comment on attachment 8340333 [details] [diff] [review]
lazy-vm-wrappers-2.patch

Review of attachment 8340333 [details] [diff] [review]:
-----------------------------------------------------------------

Can you also rename LinkVMWrapper to the new name, PatchableVMCall?

Thanks and now hope this does indeed help FFOS startup time enough ;)

::: js/src/jit/CodeGenerator.h
@@ +432,5 @@
>  #if defined(JS_ION_PERF)
>      PerfSpewer perfSpewer_;
>  #endif
> +
> +    // Compile all VM Wrappers which are waiting in the linkVMWrappers list.  We

s/linkVMWrappers/patchableVMCals

@@ +434,5 @@
>  #endif
> +
> +    // Compile all VM Wrappers which are waiting in the linkVMWrappers list.  We
> +    // need to compile them before we link them the prevent failing the
> +    // compilation among the last operations.

Can you mention:
"Make sure all VM Wrappers used in this script are compiled or where already compiled."
Attachment #8340333 - Flags: review?(hv1989) → review+
nbp, this was backed out three years ago.  Is the patch still relevant?  If so please give priority.
Flags: needinfo?(nicolas.b.pierron)
The problem still exists, and we should probably fix it.

After discussing with Jan and Hannes about other ideas, I think comment 1 would make more sense iff we can make this a general principle.

The idea would be to use our macro assembler to generate the assembly code, that we can include into a C file, and later statically link against the assembly code produced by the generateVMWrapper, and maybe other trampolines.

This way we keep the flexibility of our MacroAssembler, while avoiding writing inlined assembly code for each compiler.
This work is at the moment important for faster start-up of Firefox for Android.

If I recall correctly we loose 50ms on the start-up of Firefox / Workers, by creating all the VMWrappers during the JitRuntime initialization, which happens when the first function gets Baseline-compiled.
Priority: -- → P3
Flags: needinfo?(nicolas.b.pierron)
Flags: needinfo?(nicolas.b.pierron)
Flags: needinfo?(nicolas.b.pierron)
Recent changes causes all VMFunction to be part of a single JitCode ""segment"", which invalidates the idea of generating them lazily. (Bug 1416572)
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: