Closed Bug 1334212 Opened 3 years ago Closed 3 years ago

Handle multiple contexts per runtime in compiled JIT code

Categories

(Core :: JavaScript Engine: JIT, defect, P3)

defect

Tracking

()

RESOLVED FIXED
mozilla54
Tracking Status
firefox54 --- fixed

People

(Reporter: bhackett1024, Assigned: bhackett1024)

References

Details

Attachments

(1 file, 1 obsolete file)

Attached patch patch (obsolete) — Splinter Review
Currently the JIT bakes in internal pointers into JSContexts in compiled code.  If the code can run on multiple different threads, with multiple different JSContexts, at different times in its existence, then baking in these pointers is incorrect.  The attached patch changes Ion, Baseline, and irregexp compilation to load the current context when it needs to be accessed.  Normally this is just a load from an internal pointer to the zone group being compiled for, except for jitcode in JitRuntime stubs which are not associated with a zone group and need to use TLS (this differentiation is handled transparently by MacroAssembler::loadJSContext).

The main effect this has on the compiled code is that accessing the context now takes an extra dereference.  The main impact this is likely to have is on overrecursion and interrupt checks.  I'm hoping that the aggressive inlining and implicit interrupt checks which Ion performs will ameliorate this.

This patch also fixes a few remaining places where the JIT assumes that there is only a single context / zone group in each runtime.
Attachment #8830833 - Flags: review?(hv1989)
Assignee: nobody → bhackett1024
Brian, what's the plan for the VMFunction wrappers? A loadJSContext that does an extra call + a TLS lookup *on every call* is unacceptable. A few extra loads we could live with but some of our VMFunctions are really hot...
Hmm, I forgot that some of these trampolines are really hot.  I think the only simple way to fix this is to make the VMFunction wrappers and other trampolines to be per-zone-group, rather than per-runtime.  They would still live in the atoms zone but would only be marked during GC for zone groups which are being collected from.  Once the zone group goes away the atoms will stop being marked and can be collected.  How does this sound?

A related issue is that if we move all the stubs to JitZoneGroup then the only thing left in JitRuntime are the executable allocators.  If we do this movement then we can move the executable allocators to JitZoneGroup and JitZoneGroup will be the exact same thing as JitRuntime pre-1323066, and JitRuntime will cease to exist.  Moving the executable allocators would have some nice benefits: not only can they then be accessed lock free, but using things like AutoWritableJitCode wouldn't try to change memory protection for jitcode in active use by other threads (as things stand now I'll have to change writable jitcode to be RWX instead of RW, and it would be nice to not have to do that).
Attachment #8830833 - Flags: review?(hv1989)
(In reply to Brian Hackett (:bhackett) from comment #2)
> Hmm, I forgot that some of these trampolines are really hot.  I think the
> only simple way to fix this is to make the VMFunction wrappers and other
> trampolines to be per-zone-group, rather than per-runtime.

We have a lot of these wrappers unfortunately and creating them takes a while :/ Once per ZoneGroup/tab would be measurable overhead I'm afraid. Could we allocate them per-JSContext/thread, since that's what we really care about for these wrappers? Then we could just bake in the JSContext*. We now also have a per-process executable code quota and some people have tons of tabs open...

> Moving the executable allocators would have
> some nice benefits: not only can they then be accessed lock free, but using
> things like AutoWritableJitCode wouldn't try to change memory protection for
> jitcode in active use by other threads (as things stand now I'll have to
> change writable jitcode to be RWX instead of RW, and it would be nice to not
> have to do that).

+1 for moving these allocators to JitZoneGroup. That's also great for code locality, memory fragmentation, etc. The only worry with that is the per-process executable code quota we now have, because these different allocators combined will have more unused space than a single allocator would have, but the benefits outweigh that IMO.
(In reply to Jan de Mooij [:jandem] from comment #3)
> Could we allocate them per-JSContext/thread, since that's what we really
> care about for these wrappers?

Well nvm, that doesn't work because other code will have pointers to these wrappers. Maybe we should allocate the VMFunction code lazily again?
Well, maybe it would be best to focus for now on what is easy to implement and won't impact our current users.  Initially everyone will have only one zone group per runtime / off main thread parse task (which don't do any pretty JIT compilation), and when Quantum DOM is further along we'll be able to measure and see the memory/etc. characteristics of having multiple content zone groups (I'm assuming the multiple-zone-groups will be behind a pref for a while).  If the JitZoneGroup wrapper overhead is too much we could create them lazily, or look into consolidating them, or go back to having a single set of wrappers in the runtime/process and restructure them so that per-zone-group baseline/ion code loads the JSContext from a static location and then passes the context into the wrapper.
Could we continue to use a global location for the context? Initially Quantum DOM will use cooperative scheduling, so it will be easy to swap this global value as we switch between threads. Once we get preemptive scheduling inside the JS engine, we'll have to fix this. But that's further off.
(In reply to Bill McCloskey (:billm) from comment #6)
> Could we continue to use a global location for the context? Initially
> Quantum DOM will use cooperative scheduling, so it will be easy to swap this
> global value as we switch between threads. Once we get preemptive scheduling
> inside the JS engine, we'll have to fix this. But that's further off.

Sure, I can put a JSContext* on the runtime for the thread that is currently executing using cooperative scheduling.  That would let us keep this patch almost as is, and just change loadJSContext to load from the runtime field instead of using TLS.
(In reply to Brian Hackett (:bhackett) from comment #7)
> Sure, I can put a JSContext* on the runtime for the thread that is currently
> executing using cooperative scheduling.  That would let us keep this patch
> almost as is, and just change loadJSContext to load from the runtime field
> instead of using TLS.

That sounds like a good short-term solution to me.
Attached patch patchSplinter Review
This patch always loads the JSContext from a static location in the macro assembler.  For trampolines the runtime's active context is used, otherwise the zone group's context is used.  (It would be fine to just always use the runtime's active context, but it's kind of nice to minimize dependencies on the active context, as all uses of it will need to be fixed to get preemptive multithreading later on.)
Attachment #8830833 - Attachment is obsolete: true
Attachment #8831484 - Flags: review?(hv1989)
Depends on: 1325050, 1334837
Priority: -- → P3
Comment on attachment 8831484 [details] [diff] [review]
patch

Review of attachment 8831484 [details] [diff] [review]:
-----------------------------------------------------------------

The patch looks good.
I'm needinfo'ing Jan since he raised a question in comment 3, which you answered but he didn't react on it yet.
If he is satisfied with the answer. This is an r+.

::: js/src/jit/CodeGenerator.cpp
@@ +7667,5 @@
>  
>      AllocatableGeneralRegisterSet regs(GeneralRegisterSet::Volatile());
>      Register temp0 = regs.takeAny();
>  
> +    masm.enterFakeExitFrame(temp0,LazyLinkExitFrameLayoutToken);

Nit: space after ,

::: js/src/jit/Ion.cpp
@@ +640,1 @@
>  }

Should we just remove the function?

::: js/src/jit/MacroAssembler.cpp
@@ +2261,4 @@
>  
>      CodeOffset label = masm.movWithPatch(ImmWord(uintptr_t(-1)), reg);
> +    masm.loadPtr(jscontext, reg2);
> +    masm.loadPtr(Address(reg2, offsetof(JSContext, profilingActivation_)), reg2);

Why not using loadJSContext here?
Attachment #8831484 - Flags: review?(hv1989) → review+
@Jan: can you let us know if you agree with Brian his explanation?
Flags: needinfo?(jdemooij)
(In reply to Hannes Verschore [:h4writer] from comment #11)
> @Jan: can you let us know if you agree with Brian his explanation?

Yes that SGTM. Thanks for the NI.
Flags: needinfo?(jdemooij)
Pushed by bhackett@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/85849dc84129
Handle multiple contexts per runtime in compiled JIT code, r=h4writer.
https://hg.mozilla.org/mozilla-central/rev/85849dc84129
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla54
Depends on: 1345423
You need to log in before you can comment on or make changes to this bug.