Closed Bug 1325050 Opened 8 years ago Closed 8 years ago

Structure reorganization for multithreaded runtimes

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla54
Tracking Status
firefox53 --- affected
firefox54 --- fixed

People

(Reporter: bhackett1024, Assigned: bhackett1024)

References

Details

Attachments

(10 files, 1 obsolete file)

Attached patch WIP (obsolete) — Splinter Review
The concept of a single threaded runtime is rather deeply ingrained in the JS engine's data structures. For example, JSRuntime includes many members that are only relevant to a single thread's execution state, like JS stack information. In order to allow the runtime to be multithreaded, we first need to reorganize the core threading data structures to respect the following future constraints: - JSRuntime members can be accessed by multiple threads concurrently. - JSContext members are specific to a single OS thread. - ZoneGroup members are specific to the set of zones in the group. Throughout the group's lifetime it may be accessed on different threads at different points, but at any one point at most one thread has ownership of the group. The attached WIP adds a ZoneGroup class, removes ExclusiveContext and PerThreadData, and shuffles members around for the above constraints. It doesn't (or shouldn't) change the possible behaviors in the engine or relax existing invariants --- the runtime still has a main thread and exclusive zones, and has pointers to the main thread's JSContext and the (for now) single zone group. Future bugs will address these issues, and can be a lot smaller and simpler if the organizational work is done up front.
Assignee: nobody → bhackett1024
After these patches, iirc, we'll start having more than one JSContext and the idea was that JSContext would only contain a very small number of activation-related fields (compared to the huge number of fields it contains now). Assuming that "JSContext" is the name of the per-thread type, does that mean that the type that is, in this patch, named JSContext would get renamed to something else like "JSRuntimeMainThread" (or something more succinct)?
(In reply to Luke Wagner [:luke] from comment #1) > After these patches, iirc, we'll start having more than one JSContext and > the idea was that JSContext would only contain a very small number of > activation-related fields (compared to the huge number of fields it contains > now). Assuming that "JSContext" is the name of the per-thread type, does > that mean that the type that is, in this patch, named JSContext would get > renamed to something else like "JSRuntimeMainThread" (or something more > succinct)? Well, most of the fields that are in JSContext after this patch are directly tied to information about the thread's stack: JS activation info, things like the current compartment/zone, and a grab bag of TLS like |suppressGC| and |ionCompiling|. Fields that are not directly tied to the current thread should be moved to either ZoneGroup (where they can be accessed lock free) or JSRuntime (where locks are required), but I don't know what adding a fourth class would help with: after bug 1323066 is done there won't be a notion in the JS engine anymore of a main thread, other than requiring single threaded execution during engine startup and teardown.
Ah, I see, makes sense. I forgot that there basically ceases to be any concept of "main thread".
Attached patch patchSplinter Review
This patch reshuffles the core threading data structures (JSContext, JSRuntime, Zone, ZoneGroup) per the constraints in comment 1. It also adds a ProtectedData class which describes and provides #ifdef DEBUG checks for the constraints under which each piece of data in each of these structures may be accessed. Some initialization and teardown code has needed changes as a result, but almost all of this patch consists of simple changes and there shouldn't be any new behavior resulting from this patch --- the runtime has a pointer to its main thread context and single zone group, so pointers can be chased to move between any of these data structures. As a bit of background, I worked from an earlier version of this patch to decouple the runtime from the main thread context and zone group and allow off main thread parsing to use a separate zone group. I then unwound those changes (which required more restructuring that I didn't expect) to get back to this point. So, moving forward from here should be easier to do without a lot more data shuffling, and can be done incrementally with smaller patches. Caveat: I didn't get as far as allowing multiple foreground threads to run on the same runtime. There are still some issues to deal with there like handling the large amount of data in JSRuntime and GCRuntime that assumes a single foreground thread, and a few more complex issues like allowing threads to cancel in progress incremental GCs in zone groups they can't access. Fixing these might require more reorganization but at least with the ProtectedData class we'll be able to progress on making the runtime multithreaded in a way that avoids data races and provides checkable documentation for what protects what. I'll break this up for review later today.
Attachment #8820701 - Attachment is obsolete: true
Changes to the JSContext and JSRuntime classes, and lots of tiny changes to other parts of the VM to accommodate them. I'm sorry about the size of this patch; the great majority of it can just be skimmed.
Attachment #8828592 - Flags: review?(jdemooij)
Changes to GC related structures, and lots of tiny changes affected by them. As with the first patch, most of this can be skimmed.
Attachment #8828597 - Flags: review?(jcoppeard)
The ProtectedData class used by the first two patches.
Attachment #8828598 - Flags: review?(jdemooij)
Attachment #8828598 - Flags: review?(jcoppeard)
Attached patch JIT changesSplinter Review
JIT changes, mostly tiny changes necessitated by the first two patches but with some restructuring of JitRuntime.
Attachment #8828600 - Flags: review?(hv1989)
Attached patch Wasm changesSplinter Review
Web Assembly changes necessitated by the previous patches. I think these are all tiny changes (there will be a more substantial patch next week to rework the interrupt mechanism for a multithreaded runtime world).
Attachment #8828601 - Flags: review?(luke)
Attached patch Futex changesSplinter Review
In the first patch FutexRuntime was moved to JSContext. This patch renames it to FutexThread, which seems to better describe its role in a multithreaded runtime world, and has tiny changes necessary as a result. I don't think further changes are actually necessary for this to work with multiple foreground threads in a runtime, since the Futex code is, well, already multithreaded.
Attachment #8828603 - Flags: review?(lhansen)
Some changes to GDB scripts that seem necessary. I don't know how to test these.
Attachment #8828605 - Flags: review?(nicolas.b.pierron)
Attachment #8828603 - Flags: review?(lhansen) → review+
Comment on attachment 8828605 [details] [diff] [review] GDB integration changes Review of attachment 8828605 [details] [diff] [review]: ----------------------------------------------------------------- You can find documentation for running the tests here: http://searchfox.org/mozilla-central/source/js/src/gdb/README#135 The way it works is by creating a gdb instead, seting up a breakpoint and then running the gdb-tests program with the name of the fragment to test. You can find the logic for each test in the corresponding python file.
Attachment #8828605 - Flags: review?(nicolas.b.pierron) → review+
Just a heads up, I'll get to the reviews this week.
Why do we often have to do the "cx->runtime()->contextFromMainThread()" dance? What is the difference between cx and cx->runtime()->contextFromMainThread() ?
(In reply to Hannes Verschore [:h4writer] from comment #14) > Why do we often have to do the "cx->runtime()->contextFromMainThread()" > dance? What is the difference between cx and > cx->runtime()->contextFromMainThread() ? Later patches for bug 1323066 will be removing all the contextFromMainThread() calls and similar functions for the zoneGroup from the engine, as part of changing the one-to-one context/runtime ratio to be many-to-one. Right now there is no difference between cx and cx->runtime()->contextFromMainThread(), but the latter is something that is easy to grep for later and a mark that a call is in need of fixing. More specifically, with multiple contexts per runtime the JIT cannot bake in internal pointers to JSContext fields, since JIT code may run on multiple different threads at different points in its lifetime, each of which has a different JSContext. So, these cx->runtime()->contextFromMainThread() uses in the JIT won't be fixed by just changing them to cx, and will need some more work to get the context for the thread that is running when the jitcode executes (I have a patch for this and will post it later this week).
(In reply to Brian Hackett (:bhackett) from comment #15) > More specifically, with multiple contexts per runtime the JIT cannot bake in > internal pointers to JSContext fields, since JIT code may run on multiple > different threads […] 1. Simultaneously? 2. Does the JIT depends on JSContext fields or on JSRuntime fields?
(In reply to Nicolas B. Pierron [:nbp] from comment #16) > (In reply to Brian Hackett (:bhackett) from comment #15) > > More specifically, with multiple contexts per runtime the JIT cannot bake in > > internal pointers to JSContext fields, since JIT code may run on multiple > > different threads […] > > 1. Simultaneously? No. At any given time, at most one thread has exclusive access to the contents of a zone group. > 2. Does the JIT depends on JSContext fields or on JSRuntime fields? I'm not sure what you are asking. Jitcode might need to access data in the JSContext, JSRuntime, ZoneGroup, Zone, etc. The only pointers here it can't bake in are those inside the JSContext.
Brian, do you think there's value in keeping a separate ExclusiveContext-like thing that's used for threads that don't run JS? With multiple JSContexts it makes sense to make the parse thread just another one of these threads, but there are things in JSContext that aren't relevant for it (the activation stack, error handling differences, etc).
Comment on attachment 8828592 [details] [diff] [review] JSContext/JSRuntime changes Review of attachment 8828592 [details] [diff] [review]: ----------------------------------------------------------------- This is difficult to review due to the size, but you're right that most of it is renaming stuff. I don't feel very confident being the only one approving a large and engine-wide change like this. I think we should send an overview of these changes (especially where and how JSContext will be used in the new world) to the js-internals list, so more people will be aware of this and have a chance to comment. r=me assuming there's no serious disagreement. ::: js/src/vm/HelperThreads.cpp @@ +1157,5 @@ > { > AutoUnlockHelperThreadState parallelSection(locked); > gc::AutoSetThreadIsPerformingGC performingGC; > uint64_t timeStart = PRMJ_Now(); > + TlsContext.get()->heapState = JS::HeapState::MajorCollecting; Here (and below) we could use cx->heapState IIUC. @@ +1819,5 @@ > > currentTask.emplace(HelperThreadState().gcHelperWorklist(locked).popCopy()); > GCHelperState* task = gcHelperTask(); > > + JSContext cx(task->runtime(), JS::ContextOptions()); What is sizeof(JSContext)? Wondering if it'd make sense to heap allocate this. At least these backgrounds threads should have an almost-empty stack so it's probably fine to keep it like this.
Attachment #8828592 - Flags: review?(jdemooij) → review+
Comment on attachment 8828592 [details] [diff] [review] JSContext/JSRuntime changes Review of attachment 8828592 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/vm/ArrayBufferObject.cpp @@ +298,2 @@ > if (!p) > ReportOutOfMemory(cx); Is this ReportOutOfMemory call unnecessary now that we use cx->pod_malloc instead of cx->runtime()->pod_malloc?
(In reply to Jan de Mooij [:jandem] from comment #18) > Brian, do you think there's value in keeping a separate > ExclusiveContext-like thing that's used for threads that don't run JS? With > multiple JSContexts it makes sense to make the parse thread just another one > of these threads, but there are things in JSContext that aren't relevant for > it (the activation stack, error handling differences, etc). I don't think there's a lot of value here. ExclusiveContext was originally added because we could only have one thread in a runtime running JS and needed a way for other threads to interact with the VM. When we have preemptively scheduled multithreaded runtimes this restriction is gone and there is no reason that parse threads can't run JS in the zone group they have exclusive access to. Eventually I'd like to remove the special error handling code used for parse threads, so that after parsing we just merge any generated error object into the target compartment (like we do with the parsed scripts/etc. themselves) and then throw it.
(In reply to Jan de Mooij [:jandem] from comment #19) > Comment on attachment 8828592 [details] [diff] [review] > JSContext/JSRuntime changes > > Review of attachment 8828592 [details] [diff] [review]: > ----------------------------------------------------------------- > > This is difficult to review due to the size, but you're right that most of > it is renaming stuff. > > I don't feel very confident being the only one approving a large and > engine-wide change like this. I think we should send an overview of these > changes (especially where and how JSContext will be used in the new world) > to the js-internals list, so more people will be aware of this and have a > chance to comment. r=me assuming there's no serious disagreement. Sure, I'll send an email. Thanks for the review! > What is sizeof(JSContext)? Wondering if it'd make sense to heap allocate > this. At least these backgrounds threads should have an almost-empty stack > so it's probably fine to keep it like this. sizeof(JSContext) is 3264 on my x64 machine. Since it's constructor is pretty simple I think it's ok to allocate on the stack for each new task for now, but it would be better if we did heap- or stack-allocate a single context for the lifetime of each helper thread. The main blocker for this is that JSContext::runtime_ is an immutable part of the context. The only reason I think this is needed right now is that the two persistent roots on the context are associated with the runtime. I just realized these are buggy (they will add persistent roots to the runtime from off the main thread, which is not yet threadsafe) and will attach a patch to this bug in a bit for that. With that fixed I think we could change the runtime for helper thread contexts to reflect the current task.
(In reply to Jan de Mooij [:jandem] from comment #20) > ::: js/src/vm/ArrayBufferObject.cpp > @@ +298,2 @@ > > if (!p) > > ReportOutOfMemory(cx); > > Is this ReportOutOfMemory call unnecessary now that we use cx->pod_malloc > instead of cx->runtime()->pod_malloc? Yeah, it's unnecessary. I can remove it if you want, but it's also not harming anything and I was trying to keep the number of non-trivial changes to a minimum.
Blocks: 1334837
As noted in comment 22, the use of persistent roots in JSContext is not threadsafe now that JSContexts are created on additional threads. This patch changes these persistent roots to be lazily initialized, which is sufficient to avoid the race. Note that the patches here protect JSRuntime::heapRoots with UnprotectedData, which, well, doesn't do any checking when the field is accessed (it's a placeholder to allow grepping for data that needs fixing for multithreaded runtimes and to allow adding protection for the field without churning a bunch of other code). Bug 1334837 changes this field's protection to ActiveThreadData, which is sufficient to catch this persistent root bug.
Attachment #8831479 - Flags: review?(jdemooij)
Some of the ProtectedData classes check GCHelperState::onBackgroundThread() to see if an access is permitted. This call itself is not threadsafe, though, since this looks at state on the GCHelperState which is not atomic and might change at any time. This patch changes GCHelperState::onBackgroundThread() to use TLS state on the JSContext, so that it may be called at any time. This change allows the related fields to use ProtectedData classes that ensure thread safety, whereas earlier GCHelperState::thread used UnprotectedData.
Attachment #8831480 - Flags: review?(jcoppeard)
Blocks: 1334845
Blocks: 1334194
Blocks: 1334212
Blocks: 1334880
Blocks: 1334885
Blocks: 1334927
Comment on attachment 8828601 [details] [diff] [review] Wasm changes Review of attachment 8828601 [details] [diff] [review]: ----------------------------------------------------------------- Oops, sorry for taking so long on this. ::: js/src/wasm/AsmJS.cpp @@ +1634,5 @@ > typedef HashMap<PropertyName*, AsmJSAtomicsBuiltinFunction> AtomicsNameMap; > typedef HashMap<PropertyName*, SimdOperation> SimdOperationNameMap; > typedef Vector<ArrayView> ArrayViewVector; > > + JSContext* cx_; can you fix column alignment here? @@ +1859,5 @@ > > return true; > } > > + JSContext* cx() const { return cx_; } and here @@ +2918,5 @@ > ret_(ExprType::Limit) > {} > > ModuleValidator& m() const { return m_; } > + JSContext* cx() const { return m_.cx(); } and here
Attachment #8828601 - Flags: review?(luke) → review+
Attachment #8831479 - Flags: review?(jdemooij) → review+
Blocks: 1335095
Comment on attachment 8828598 [details] [diff] [review] ProtectedData class Review of attachment 8828598 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/threading/ProtectedData.h @@ +71,5 @@ > + DECLARE_BOOL_OPERATOR(>=) > + DECLARE_BOOL_OPERATOR(<) > + DECLARE_BOOL_OPERATOR(>) > + > +#undef DECLARE_BOOL_OPERATOR Should this use DECLARE_BOOL_OPERATORS defined above?
Attachment #8828598 - Flags: review?(jcoppeard) → review+
Comment on attachment 8828597 [details] [diff] [review] Zone/ZoneGroup and GC changes Review of attachment 8828597 [details] [diff] [review]: ----------------------------------------------------------------- Sorry it took so long to get to this. Overall it looks good, and I like having the extra checks. There are a lot of UnprotectedData ones at the moment - I take it the plan is to fix those next? ::: js/public/RootingAPI.h @@ +765,3 @@ > } > + inline RootedListHeads& rootLists(JSContext* cx) { > + return rootLists(RootingContext::get(cx)); It looks like this change moves the root lists back to the context from the zone. I'm fine with this since it didn't buy us much and was causing some problems (e.g. bug 1329247). ::: js/src/gc/Allocator.cpp @@ +301,3 @@ > { > // A GC may be happening on the main thread, but zones used by exclusive > // contexts are never collected. This comment needs updating to talk about helper threads rather than exclusive contexts. ::: js/src/gc/GCRuntime.h @@ +992,3 @@ > > + // List of all zone groups (protected by the GC lock). > + UnprotectedData<ZoneGroupVector> groups; This is UnprotectedData but the comment says it's protected by the GC lock. Did you find it is not always accessed with the lock held? ::: js/src/gc/Verifier.cpp @@ +174,5 @@ > { > if (verifyPreData || isIncrementalGCInProgress()) > return; > > + if (IsIncrementalGCUnsafe(rt) != AbortReason::None || TlsContext.get()->keepAtoms || rt->exclusiveThreadsPresent()) Isn't keepAtoms a runtime-wide property? ::: js/src/jsgc.cpp @@ +1694,4 @@ > bool > GCRuntime::isCompactingGCEnabled() const > { > + return compactingEnabled && TlsContext.get()->compactingDisabledCount == 0; I'm not sure why this is now on the context rather than being a runtime-wide configuration. @@ +3025,5 @@ > #endif > > if (zone->isAtomsZone()) { > /* We can't do a zone GC of the atoms compartment. */ > + if (TlsContext.get()->keepAtoms) { Runtime::keepAtoms() used to check exclusiveThreadsPresent as well. Do we need to check that here as well? Also, isn't keepAtoms a runtime-wide rather than a per-context property? @@ +6871,5 @@ > { > js::CancelOffThreadIonCompile(fop->runtime()); > for (ZonesIter zone(fop->runtime(), SkipAtoms); !zone.done(); zone.next()) { > zone->setPreservingCode(false); > + zone->discardJitCode(fop->runtime()->defaultFreeOp()); Why use the default free op here? ::: js/src/vm/MallocProvider.h @@ -34,5 @@ > * > * - gc::Zone: Automatically triggers zone GC. > * - JSRuntime: Automatically triggers full GC. > - * - ThreadsafeContext > ExclusiveContext > JSContext: > - * Dispatches directly to the runtime. Thanks for removing this very outdated reference :)
Comment on attachment 8828600 [details] [diff] [review] JIT changes Review of attachment 8828600 [details] [diff] [review]: ----------------------------------------------------------------- Big patch, but looks good. Thanks for the explanation about the cx and cx->runtime()->contextFromMainThread() difference. Makes sense. ::: js/src/jit/BaselineIC.cpp @@ +3972,5 @@ > EmitPreBarrier(masm, BaseIndex(holderReg, scratch, TimesOne), MIRType::Value); > masm.storeValue(R1, BaseIndex(holderReg, scratch, TimesOne)); > if (holderReg != objReg) > regs.add(holderReg); > + if (cx->nursery().exists()) { Is this a shorthand for "zone()->group()->nursery()" ? In that case it makes sense to me. Else I'm confused between the difference. ::: js/src/jit/CompileWrappers.cpp @@ +297,5 @@ > JitCompileOptions::JitCompileOptions(JSContext* cx) > { > cloneSingletons_ = cx->compartment()->creationOptions().cloneSingletons(); > + spsSlowAssertionsEnabled_ = cx->runtime()->spsProfiler().enabled() && > + cx->runtime()->spsProfiler().slowAssertionsEnabled(); Style nit: Can you align this again? ::: js/src/jit/Ion.cpp @@ -215,5 @@ > > JitRuntime::~JitRuntime() > { > - js_delete(functionWrappers_); > - freeOsrTempData(); Shouldn't the JSContext call this now on destruction?
Attachment #8828600 - Flags: review?(hv1989) → review+
(In reply to Jon Coppeard (:jonco) from comment #28) > Comment on attachment 8828597 [details] [diff] [review] > Zone/ZoneGroup and GC changes > > Review of attachment 8828597 [details] [diff] [review]: > ----------------------------------------------------------------- > > Sorry it took so long to get to this. > > Overall it looks good, and I like having the extra checks. There are a lot > of UnprotectedData ones at the moment - I take it the plan is to fix those > next? Yes, these will be fixed in future patches under bug 1323066. Bug 1334837 fixes most of these to use ActiveThreadData (which ensures thread safety in a cooperatively threaded runtime). > ::: js/src/gc/GCRuntime.h > @@ +992,3 @@ > > > > + // List of all zone groups (protected by the GC lock). > > + UnprotectedData<ZoneGroupVector> groups; > > This is UnprotectedData but the comment says it's protected by the GC lock. > Did you find it is not always accessed with the lock held? I'll fix the comment, there are places where this is accessed without the GC lock held. With a cooperatively multithreaded runtime this doesn't need a lock to protect it, and bug 1334837 changes this away from UnprotectedData. > ::: js/src/gc/Verifier.cpp > @@ +174,5 @@ > > { > > if (verifyPreData || isIncrementalGCInProgress()) > > return; > > > > + if (IsIncrementalGCUnsafe(rt) != AbortReason::None || TlsContext.get()->keepAtoms || rt->exclusiveThreadsPresent()) > > Isn't keepAtoms a runtime-wide property? The JSContext/JSRuntime changes patch moves keepAtoms from the runtime to JSContext. This is a field that is set by an Auto* class, and TLS state like this generally belongs on the context: it describes the state of this particular thread, not that of the runtime as a whole. In a cooperatively scheduled multithreaded runtime, the distinction doesn't matter that much. In a preemptively scheduled multithreaded runtime, though, keeping |keepAtoms| on the runtime means that one thread's behavior could be inadvertently affected by another thread entering or leaving these keepAtoms regions of code. That said, initially we'll be having a cooperatively scheduled runtime, and permitting the GC verifier (and all other GC activity) to operate in whatever zones it wants, except for those used by parse threads. It would be best I think for this code to iterate over all contexts in the runtime participating in cooperative scheduling, and check |keepAtoms| on each one. I can file a bug to fix this and compactingEnabled below, and audit the other GC related TLS state for similar issues (these changes can't be done here because they depend on bug 1334927, and since there is still only a single context per runtime here they aren't necessary for this bug). > ::: js/src/jsgc.cpp > @@ +1694,4 @@ > > bool > > GCRuntime::isCompactingGCEnabled() const > > { > > + return compactingEnabled && TlsContext.get()->compactingDisabledCount == 0; > > I'm not sure why this is now on the context rather than being a runtime-wide > configuration. The same issues here apply as for keepAtoms. |compactingEnabled| is a runtime wide property, but |compactingDisabledCount| is set for a thread based on an Auto* class. > @@ +3025,5 @@ > > #endif > > > > if (zone->isAtomsZone()) { > > /* We can't do a zone GC of the atoms compartment. */ > > + if (TlsContext.get()->keepAtoms) { > > Runtime::keepAtoms() used to check exclusiveThreadsPresent as well. Do we > need to check that here as well? Oops, yes, we do. > @@ +6871,5 @@ > > { > > js::CancelOffThreadIonCompile(fop->runtime()); > > for (ZonesIter zone(fop->runtime(), SkipAtoms); !zone.done(); zone.next()) { > > zone->setPreservingCode(false); > > + zone->discardJitCode(fop->runtime()->defaultFreeOp()); > > Why use the default free op here? Umm, I'll fix this, this change must have gone through a few revisions.
Attachment #8831480 - Flags: review?(jcoppeard) → review+
Blocks: 1335130
Comment on attachment 8828598 [details] [diff] [review] ProtectedData class Review of attachment 8828598 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/threading/ProtectedData.h @@ +204,5 @@ > +#ifdef DEBUG > + ZoneGroup* group; > + > + public: > + CheckZoneGroup(ZoneGroup* group) : group(group) {} |explicit|, also some other classes in this file maybe.
Attachment #8828598 - Flags: review?(jdemooij) → review+
Attachment #8828597 - Flags: review?(jcoppeard) → review+
Pushed by bhackett@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/d2758f635f72 Structure reorganization for multithreaded runtimes, r=jandem,jonco,h4writer,luke,lhansen,nbp.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla54
Blocks: 1336603
See Also: → 1558131
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: