Last Comment Bug 774253 - IonMonkey: off thread compilation
: IonMonkey: off thread compilation
Status: RESOLVED FIXED
[ion:t]
:
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: Other Branch
: x86 Mac OS X
: -- normal with 8 votes (vote)
: ---
Assigned To: Brian Hackett (:bhackett)
:
Mentors:
Depends on: 864756 767223 784906 785755 787016
Blocks: 767238 785494 785761 785762 785905 813559
  Show dependency treegraph
 
Reported: 2012-07-16 06:12 PDT by Brian Hackett (:bhackett)
Modified: 2013-04-23 07:01 PDT (History)
19 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
WIP (8ea86b9020a2) (132.21 KB, patch)
2012-07-17 14:07 PDT, Brian Hackett (:bhackett)
no flags Details | Diff | Review
WIP (fdf520ad7dbc) (162.69 KB, patch)
2012-07-18 16:49 PDT, Brian Hackett (:bhackett)
no flags Details | Diff | Review
patch (41f66d0e46b3) (169.68 KB, patch)
2012-07-26 13:31 PDT, Brian Hackett (:bhackett)
no flags Details | Diff | Review
decouple JSContext from Ion backend (79.61 KB, patch)
2012-07-26 13:38 PDT, Brian Hackett (:bhackett)
dvander: review+
Details | Diff | Review
everything else (90.71 KB, patch)
2012-07-26 13:43 PDT, Brian Hackett (:bhackett)
no flags Details | Diff | Review
clean up script->ion handling (92.15 KB, patch)
2012-07-27 06:35 PDT, Brian Hackett (:bhackett)
no flags Details | Diff | Review
patch (41f66d0e46b3) (170.66 KB, patch)
2012-07-27 06:36 PDT, Brian Hackett (:bhackett)
no flags Details | Diff | Review
patch (b2361e15b665) (57.08 KB, patch)
2012-08-07 09:18 PDT, Brian Hackett (:bhackett)
no flags Details | Diff | Review
patch (9df5b5265826) (77.85 KB, patch)
2012-08-21 07:23 PDT, Brian Hackett (:bhackett)
dvander: review+
Details | Diff | Review
restore previous compilation-triggering behavior when !js_IonOptions.parallelCompilation (4.42 KB, patch)
2012-08-24 08:44 PDT, Brian Hackett (:bhackett)
dvander: review+
Details | Diff | Review
tweak recompilation check behavior (3.64 KB, patch)
2012-08-24 19:54 PDT, Brian Hackett (:bhackett)
dvander: review+
Details | Diff | Review

Description Brian Hackett (:bhackett) 2012-07-16 06:12:24 PDT
IonMonkey's design makes it pretty straightforward to compile scripts off thread at the same time as VM operations.  Running IonBuilder to construct the intiial MIR would still be serial, as this needs to query and add constraints to type information and query the VM state, but the remaining (much more expensive) stages of compilation should be completely divorced from the rest of the VM's behavior.  (This contrasts with JM, which queries the VM and adds type constraints throughout compilation, and can't really run in parallel with a VM thread.)

Bug 767223 adds infrastructure so that analysis and JM compilation can happen in parallel with each other.  Once that gets in, it can be extended so that IM compilation jobs can be run while the main thread is doing VM stuff.  A few extensions would be needed, primarily so that the main thread can quickly abort the compilation thread in case of, e.g. a GC or new types invalidating the script being compiled.

Currently, the compilation workflow we have is:

- Interpret until the script hits 40 uses (calls or loop backedges)
- Compile with JM+TI on main thread
- Run JM+TI until the script hits 10,000 uses
- Compile with Ion on main thread

When multiple cores are available, the new workflow would be:

- Interpret until the script hits 40 uses (calls or loop backedges)
- Compile with JM+TI on main thread
- Run JM+TI until the script hits N uses (maybe 1,000, subject to tuning etc.)
- Run IonBuilder, hand off MIR to another thread to finish compilation
- Continue running JM+TI code on main thread.  When Ion compilation finishes, the thread sets an interrupt on the runtime, the main thread runs the interrupt handler and incorporates the compiled code.
Comment 1 Nicolas B. Pierron [:nbp] 2012-07-16 10:08:38 PDT
(In reply to Brian Hackett (:bhackett) from comment #0)
> - Continue running JM+TI code on main thread.  When Ion compilation
> finishes, the thread sets an interrupt on the runtime, the main thread runs
> the interrupt handler and incorporates the compiled code.

How would that work with OSR? Currently we use CanEnterAtBranch to emit the OSR basic block for the requested code.
Comment 2 Brian Hackett (:bhackett) 2012-07-16 10:35:22 PDT
(In reply to Nicolas B. Pierron [:pierron] from comment #1)
> How would that work with OSR? Currently we use CanEnterAtBranch to emit the
> OSR basic block for the requested code.

Ion is less flexible than JM here, which compiles OSR entry points for every loop in the script.  The main benefit of OSR is with long running loops in a single script activation.  So we can just use the osrPC for the Ion compilation that we were at when triggering the compile.  If we're still in that loop when compilation finishes we can just discard the JM code, force things into the interpreter and enter the Ion code at the start of the next iteration.  If we're not in the function, or in a different loop in the function, the JM script's entry point can be disabled (bug 773655) and will enter the Ion code the next time the script is called.

This might not be ideal in the case where there are nested loops in the script, as we might not enter the Ion code until the next invoke even if we hit the OSR loop head many more times.  In that case we could patch the JM code to take an interrupt the next time the OSR loop is hit.  Hoping this won't be necessary off the bat.
Comment 3 David Anderson [:dvander] 2012-07-16 13:30:46 PDT
Looking at the JM+TI patch, for IM it would be good to separate out the necessary refactorings from the actual code which performs compilation off-thread, to make review+landing easier.

For example IM does some thread-unsafe stuff past MIR generation (like allocating template objects during lirgen and maybe codegen) and moving that around could be landed early.
Comment 4 Brian Hackett (:bhackett) 2012-07-17 14:07:04 PDT
Created attachment 643139 [details] [diff] [review]
WIP (8ea86b9020a2)

Actually, most of the changes in bug 767223 are about the AnalysisContext stuff, so just took the bits relevant to IM parallel compilation and moving forward with that.  This patch allows IM compilation to always happen off thread (currently always happens in JS_THREADSAFE builds with at least two cores available and no eager compilation).  Places during LIR generation / codegen where the VM was being queried (creating template objects, getting lazily initialized stuff, etc.) are moved up into IonBuilder, and places during codegen where code is allocated or lazy blocks are fetched are synchronized using the lock which controls parallel analysis.

Not safe wrt GCs or invalidation, not able to abort compilations partway through, works on a simple example and promptly crashes on ss-3d-raytrace.
Comment 5 Brian Hackett (:bhackett) 2012-07-18 16:49:27 PDT
Created attachment 643670 [details] [diff] [review]
WIP (fdf520ad7dbc)

Working patch, stable on SS.  Currently slows down SS though, needs investigating.  Assuming the compilation threads are actually running on different cores from the main thread (haven't confirmed this on my test system), this should be a strict improvement over JM+TI if the IM code is faster.  So something stupid is probably happening somewhere like spending too much time in the interpreter or spamming a stub call.
Comment 6 Brian Hackett (:bhackett) 2012-07-26 13:31:54 PDT
Created attachment 646295 [details] [diff] [review]
patch (41f66d0e46b3)

Combined patch for off thread Ion compilation.  Running a thread safe build with my two core macbook, I get a 6-7% improvement on SS and a 1-2% slowdown on kraken and v8.  I investigated the kraken one, which is a preexisting issue due to a heuristics problem.  A script's compilation starts, it is invalidated during compilation, recompiled, then hits a bailout which basically poisons it for eternity, and we run the remainder of the test in JM.  Without off thread compilation, the bailout still happens but is wiped clean after the (unrelated) invalidation.  I think v8 has similar issues but haven't confirmed.  Since other changes which affect these heuristics are also due soon (bug 775818) and these heuristics are so fiddly anyways, I'd rather just get this in and work on tuning afterwards.  Builds with parallel compilation turned off (non-threadsafe and with --ion-parallel-compile=off) should be unaffected.

This is the whole patch, splitting into two parts for review.
Comment 7 Brian Hackett (:bhackett) 2012-07-26 13:38:43 PDT
Created attachment 646300 [details] [diff] [review]
decouple JSContext from Ion backend

The approach in this bug is structured so that IonBuilder runs on the main thread, and all later compilation phases happen off thread.  To that end, those other threads cannot read any state which could be mutated by the VM, and can't do anything which could modify the VM state itself.  This patch structures things so that IonBuilder gets all information from the VM up front which will be needed later on during compilation, and prevents later phases from being able to use a JSContext.
Comment 8 Brian Hackett (:bhackett) 2012-07-26 13:43:23 PDT
Created attachment 646302 [details] [diff] [review]
everything else

This adds pared down helper thread stuff from bug 767223 for doing analysis work off thread and allows those threads to pick up IonBuilders and finish compilations.  The tricky bits are the points where the helper threads and main thread need to interact --- invalidating off thread compilations due to GC or new type information, allowing helper threads to allocate IonCode and executable blocks from the non-threadsafe GC and executable allocators, and integrating compiled code in after compilation finishes.
Comment 9 Brian Hackett (:bhackett) 2012-07-27 06:35:17 PDT
Created attachment 646555 [details] [diff] [review]
clean up script->ion handling

In the previous patch script->ion could be set off thread without locking and race with the main thread, which is gross and makes it hard to reason about script->ion when working on the main thread.  This changes things so that all assignments to script->ion occur on the main thread, and cleans up some related stuff around handing off finished/cancelled compilations.
Comment 10 Brian Hackett (:bhackett) 2012-07-27 06:36:28 PDT
Created attachment 646556 [details] [diff] [review]
patch (41f66d0e46b3)
Comment 11 David Anderson [:dvander] 2012-07-27 18:37:25 PDT
Comment on attachment 646300 [details] [diff] [review]
decouple JSContext from Ion backend

Review of attachment 646300 [details] [diff] [review]:
-----------------------------------------------------------------

This patch looks good, it'd be nice if we could land it ahead of the off-thread compilation part, but it's not clear that it's standalone yet?

::: js/src/ion/IonBuilder.cpp
@@ +39,5 @@
> +    pc = info->startPC();
> +}
> +
> +void
> +IonBuilder::clearForBackEnd()

What is this used for?

@@ +289,5 @@
>      current->makeStart(MStart::New(MStart::StartType_Default));
> +    if (instrumentedProfiling()) {
> +        SPSProfiler *profiler = &cx->runtime->spsProfiler;
> +        const char *string = profiler->profileString(cx, script, script->function());
> +        if (string == NULL)

nit: if (!string)

@@ +408,5 @@
>      // Flag the entry into an inlined function with a special MStart block
> +    if (instrumentedProfiling()) {
> +        SPSProfiler *profiler = &cx->runtime->spsProfiler;
> +        const char *string = profiler->profileString(cx, script, script->function());
> +        if (string == NULL)

nit: if (!string)

@@ +1018,5 @@
>        case JSOP_SETPROP:
>        case JSOP_SETNAME:
>        {
> +          if (pc - script->code == 127)
> +              printf("FUBAR\n");

nit: Don't forget to remove this.

::: js/src/ion/IonCode.h
@@ +232,5 @@
>    public:
>      // Do not call directly, use IonScript::New. This is public for cx->new_.
>      IonScript();
>  
> +    static IonScript *New(uint32 frameLocals, uint32 frameSize,

It looks like the implementation in Ion.cpp didn't change, so how does this work?

::: js/src/ion/MIRGenerator.h
@@ +27,5 @@
>  
>  class MIRGenerator
>  {
>    public:
> +    MIRGenerator(JSCompartment *compartment, TempAllocator *temp, MIRGraph *graph, CompileInfo *info);

Why was the MIRGraph change needed? It looks like the call to ::IonBuilder wasn't updated in this patch.

::: js/src/jsapi.h
@@ +2688,5 @@
> +{
> +    if (v.isNumber())
> +        return v.toNumber();
> +    double out;
> +    js::ToNumberSlow(NULL, v, &out);

ToNumberSlow's implementation deserves a comment explaining that cx can be NULL.
Comment 12 David Anderson [:dvander] 2012-07-27 18:37:52 PDT
(Sorry, I ran out of time to get to part 2 today, if I don't get to it this weekend I will on Monday.)
Comment 13 Brian Hackett (:bhackett) 2012-07-27 18:45:21 PDT
Yeah, these patches aren't standalone, but are chopped up from the larger diff and a little rough around the edges.  I'd also like to land the decoupling patch first, it shouldn't be hard to get it working independently from the second patch.
Comment 14 David Anderson [:dvander] 2012-07-30 18:20:18 PDT
Comment on attachment 646555 [details] [diff] [review]
clean up script->ion handling

Review of attachment 646555 [details] [diff] [review]:
-----------------------------------------------------------------

Still working on this, will have more questions tomorrow -

::: js/src/ion/Ion.cpp
@@ +156,5 @@
> +        // This is a compilation thread, don't try to allocate an IonCode or an
> +        // executable code block, as neither allocator is thread safe. Instead,
> +        // post the requested job, trigger an interrupt and wait for the main
> +        // thread to perform the allocation.
> +        types::AnalysisHelperThread *thread = rt->analysisHelperState.currentThread();

I wonder if there are alternatives to this mechanism, since it seems complicated. One is that we could perform code generation on the main thread (it should be... kind of fast, compared to the rest of the pipeline). We could also guarantee that trampolines are created up-front, but that sounds tricky for call wrappers. We could make call wrapper generation (or really any trampoline generation) deferred, and have the linking step for MacroAssembler occur on the main thread. Doing that might have a nice refactoring, like:

class StubGenerator : public TempObject ...

class VMWrapperGenerator : public StubGenerator ...

masm.call(new VMWrapperGenerator(signature));

::: js/src/ion/IonSpewer.cpp
@@ +36,5 @@
> +    if (!spewer) {
> +        spewer = OffTheBooks::new_<IonSpewer>();
> +        spewer->init();
> +        DebugOnly<PRStatus> status = PR_SetThreadPrivate(SpewerTLSIndex, spewer);
> +        JS_ASSERT(status == PR_SUCCESS);

It looks like in both thread-safe and non-threadsafe builds, |spewer| gets leaked (also the allocation isn't checked).

In non-debug builds, can we avoid creating any spewer, and also avoid making the TLS key?

::: js/src/ion/shared/Assembler-shared.h
@@ +231,5 @@
>      {
>          // Note: the condition is a hack to avoid this assert when OOM testing,
> +        // see bug 756614. Parallel compilation may cause code block allocation
> +        // to fail as in an OOM, so similarly skip the assert.
> +        JS_ASSERT_IF(OOM_counter < OOM_maxAllocations && !js_IonOptions.parallelCompile, !used());

Why would parallel compilation cause this whereas non-parallel doesn't?
Comment 15 David Anderson [:dvander] 2012-07-30 18:29:58 PDT
Oh, we could also just pre-generate wrappers for *all* vm calls, drop the weak refs in IonCompartment, etc. Not sure how much memory that would use.
Comment 16 Brian Hackett (:bhackett) 2012-07-30 18:36:53 PDT
(In reply to David Anderson [:dvander] from comment #14)
> Comment on attachment 646555 [details] [diff] [review]
> clean up script->ion handling
> 
> Review of attachment 646555 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Still working on this, will have more questions tomorrow -
> 
> ::: js/src/ion/Ion.cpp
> @@ +156,5 @@
> > +        // This is a compilation thread, don't try to allocate an IonCode or an
> > +        // executable code block, as neither allocator is thread safe. Instead,
> > +        // post the requested job, trigger an interrupt and wait for the main
> > +        // thread to perform the allocation.
> > +        types::AnalysisHelperThread *thread = rt->analysisHelperState.currentThread();
> 
> I wonder if there are alternatives to this mechanism, since it seems
> complicated. One is that we could perform code generation on the main thread
> (it should be... kind of fast, compared to the rest of the pipeline). We
> could also guarantee that trampolines are created up-front, but that sounds
> tricky for call wrappers. We could make call wrapper generation (or really
> any trampoline generation) deferred, and have the linking step for
> MacroAssembler occur on the main thread. Doing that might have a nice
> refactoring, like:

That should be fine, for the first cut I like the idea of doing code generation on the main thread the most.  The main thread allocation is indeed pretty ugly.  Deferred linking could be good, depending on how much time codegen takes, but I'd like to avoid doing anything too invasive to the pipeline for the initial landing.

> ::: js/src/ion/shared/Assembler-shared.h
> @@ +231,5 @@
> >      {
> >          // Note: the condition is a hack to avoid this assert when OOM testing,
> > +        // see bug 756614. Parallel compilation may cause code block allocation
> > +        // to fail as in an OOM, so similarly skip the assert.
> > +        JS_ASSERT_IF(OOM_counter < OOM_maxAllocations && !js_IonOptions.parallelCompile, !used());
> 
> Why would parallel compilation cause this whereas non-parallel doesn't?

This is tied into the IonCode allocation stuff above.  In some cases the main thread won't perform the requested allocation, particularly when it is waiting for the compilation to finish for a GC or invalidation, and in these cases the allocation will be NULL and we'll hit this assert while unwinding the stack.
Comment 17 David Anderson [:dvander] 2012-07-31 19:00:38 PDT
Comment on attachment 646300 [details] [diff] [review]
decouple JSContext from Ion backend

Review of attachment 646300 [details] [diff] [review]:
-----------------------------------------------------------------

r=me w/ nits picked, would be great if this could land ahead of time, since this patch cleans up some stuff that shouldn't be in CodeGenerator anyway.
Comment 18 Luke Wagner [:luke] 2012-07-31 21:42:41 PDT
Why are the four scattered WaitForOffThreadCompileToFinish calls necessary?  Ideally, the engine at large wouldn't have to care about what scripts were being concurrently compiled.  It would be really good to avoid another global "don't forget to check we aren't in X" state (the old JS_ON_TRACE; the current ExpandInlineFrames), especially since this one is non-deterministic.
Comment 19 Brian Hackett (:bhackett) 2012-08-01 04:42:29 PDT
(In reply to Luke Wagner [:luke] from comment #18)
> Why are the four scattered WaitForOffThreadCompileToFinish calls necessary? 
> Ideally, the engine at large wouldn't have to care about what scripts were
> being concurrently compiled.  It would be really good to avoid another
> global "don't forget to check we aren't in X" state (the old JS_ON_TRACE;
> the current ExpandInlineFrames), especially since this one is
> non-deterministic.

Waiting for off thread compilation to finish is necessary in cases where Ion needs to reason about all code that has been compiled for a script, so that it can invalidate it (addPendingRecompile, InvalidateAll), modify it (ToggleBarriers) or trace it (IonCompartment::mark).

Only the last one is really strictly necessary, as the IonBuilder used and manipulated off thread refers to (immutable) GC things, and if we GC on the main thread we could cause the compilation thread to crash.  If we use it in the one place, why not use it in places with similar, yet less dire, use cases?
Comment 20 Luke Wagner [:luke] 2012-08-01 09:06:46 PDT
(In reply to Brian Hackett (:bhackett) from comment #19)
> If we use it in the one place, why not use it in places with similar, yet less
> dire, use cases?

My question is whether there is some pinchpoint so that it wasn't necessary for it to be *explicit* in all these cases.  For example, StackIter means we don't have to explicitly ExpandInlineFrames in 20 different places.  I was wanting to know if a similar thing existed here.
Comment 21 Brian Hackett (:bhackett) 2012-08-01 09:15:38 PDT
I don't think this is quite the same as ExpandInlineFrames or JS_ON_TRACE, where the JIT's behavior affects other parts of the VM.  The calls here are in ion/, except addPendingRecompile which is Ion specific and probably should be in ion/ (boundary between TI and the JITs here is kind of fuzzy).
Comment 22 Brian Hackett (:bhackett) 2012-08-01 12:24:36 PDT
Decoupling patch:

https://hg.mozilla.org/projects/ionmonkey/rev/15dc7dc4243a
Comment 23 Brian Hackett (:bhackett) 2012-08-07 09:18:25 PDT
Created attachment 649669 [details] [diff] [review]
patch (b2361e15b665)

Factor things so that code generation occurs on the main thread, considerably simplifying things.
Comment 24 David Anderson [:dvander] 2012-08-10 16:15:10 PDT
Comment on attachment 649669 [details] [diff] [review]
patch (b2361e15b665)

Review of attachment 649669 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry for the delay. Still have to review Ion.cpp and jsinfer.cpp changes, initial comments:

::: js/src/ion/Ion.h
@@ +173,5 @@
>          slowCallLimit(512)
> +    {
> +#ifdef JS_THREADSAFE
> +        if (GetCPUCount() > 1)
> +            parallelCompilation = true;

I would like to land this pref'd off, since we are close to landing and don't fully understand the performance and stability impact yet, and are still tracking down problems on TBPL.

The shell flag should probably be:
  --ion-parallel-compile=off    ; Never compile in parallel.
  --ion-parallel-compile=on     ; Compile if CPU count > 1
  --ion-parallel-compile=always ; Compile no matter what the CPU count is

This would also help the fuzzers since we could accurately determine whether a failure is specifically due to off-thread compilation.

::: js/src/ion/IonCompartment.h
@@ +70,5 @@
> +    // Any scripts for which off thread compilation has successfully finished,
> +    // failed, or been cancelled. All off thread compilations which are started
> +    // will eventually appear in this list asynchronously. Protected by the
> +    // runtime's analysis lock.
> +    Vector<IonBuilder*,0,SystemAllocPolicy> finishedOffThreadCompilations;

nit: spaces after ,

::: js/src/ion/IonSpewer.cpp
@@ +43,5 @@
>  void
>  ion::IonSpewNewFunction(MIRGraph *graph, JSScript *function)
>  {
> +    if (!js_IonOptions.parallelCompilation)
> +        ionspewer.beginFunction(graph, function);

This is okay - being off-thread shouldn't change what the spew is. If we need it back we can implement off-thread spew.

::: js/src/jscntxt.cpp
@@ +932,5 @@
>  
>      if (rt->gcIsNeeded)
>          GCSlice(rt, GC_NORMAL, rt->gcTriggerReason);
>  
> +    ion::OnOperationCallback(cx);

I'd prefer a more direct name here, like "AttachFinishedCompilations" or something.

::: js/src/jsscript.h
@@ +623,5 @@
>      void *padding_;
>  #endif
>  
>      bool hasIonScript() const {
> +        return ion && ion != ION_DISABLED_SCRIPT && ion != ION_COMPILING_SCRIPT;

Uses of ION_DISABLED_SCRIPT in JIT code (should be in CodeGenerator.cpp, MonoIC.cpp) need to be changed to be < ION_COMPILING_SCRIPT instead. Or just make ION_COMPILING_SCRIPT=1 and ION_DISABLED_SCRIPT=2.

::: js/src/methodjit/Compiler.cpp
@@ +945,5 @@
> +        return false;
> +
> +    // If the script has been compiled by ion but can't be entered at this loop,
> +    // compile the script again with JM.
> +    if (script->hasIonScript() && pc && script->code != pc && script->ion->osrPc() != pc)

Does this prevent us from ever entering at the Ion OSR pc, or might we still get a chance as we keep running?

::: js/src/methodjit/StubCalls.cpp
@@ +784,5 @@
>          THROW();
>  }
>  
>  void JS_FASTCALL
> +stubs::TriggerIonCompile(VMFrame &f)

Does this mean JM won't recompile for inlining anymore?

@@ +789,4 @@
>  {
>      JSScript *script = f.script();
>  
> +    if (script->ion)

nit: Comment here why the check is not hasIonScript(), so it doesn't get search+replaced. Or, add separate accessor(s) to JSScript (really, ->ion should have been private from the beginning...)
Comment 25 David Anderson [:dvander] 2012-08-10 18:03:00 PDT
Comment on attachment 649669 [details] [diff] [review]
patch (b2361e15b665)

Review of attachment 649669 [details] [diff] [review]:
-----------------------------------------------------------------

Comments on jsinfer.cpp:

::: js/src/jsinfer.cpp
@@ +5932,5 @@
> +
> +    if (!state.threads && !state.initThreads(cx->runtime))
> +        return false;
> +
> +    state.lock();

This should use RAII.

@@ +5968,5 @@
> +
> +    if (!state.threads)
> +        return;
> +
> +    state.lock();

Use RAII here.

@@ +5981,5 @@
> +                    state.ionWorklist.popBack();
> +                else
> +                    state.ionWorklist[i] = state.ionWorklist.popCopy();
> +                state.unlock();
> +                return;

If I'm reading this right: this works because, we're guaranteeing that an IonBuilder for a script is either in the process of being compiled, or it's in the "finished" queue, but never both, right?

@@ +5995,5 @@
> +
> +    /* Wait for in progress entries to finish up. */
> +    while (true) {
> +        bool finished = true;
> +        for (size_t i = 0; i < state.numThreads; i++) {

nit: This block would read a little better as:

for (...) {
    if (...) {
        state.notify();
        state.join();
        break;
    }
}

(or similarly by inverting the way finished is set)

@@ +6004,5 @@
> +        if (finished)
> +            break;
> +
> +        state.notify(AnalysisHelperState::HELPER);
> +        state.wait(AnalysisHelperState::MAIN);

Just so I understand how this works: None of the threads will be locked because we already acquired the lock. This will atomically unlock, and wake up any threads waiting on the condvar. Then each thread will one by one attempt to acquire the lock, upon doing so, get the next item in the queue, unlock (deferring to another thread), re-acquire the lock, and then notify main.

What happens if we wake up in the worker, and the queue is empty? Will it crash trying to popCopy?

What happens if the worker is not waiting, but is in CompileBackEnd?

Will main wake up if any one worker notifies it?

@@ +6009,5 @@
> +    }
> +
> +    /* Cancel code generation for any completed entries. */
> +    ion::IonCompartment *ion = compartment->ionCompartment();
> +    if (ion) {

Is it possible to bail out much earlier on this condition, near !state.threads? If we don't have an IonCompartment we shouldn't have started any compilations.

@@ +6049,5 @@
> +    }
> +
> +    uint32_t cpuCount = GetCPUCount();
> +    if (cpuCount <= 1)
> +        return false;

Does it make sense, for testing, to base this on the shell flag instead?

@@ +6154,5 @@
> +AnalysisHelperThread::destroy()
> +{
> +    AnalysisHelperState &state = runtime->analysisHelperState;
> +
> +    if (thread) {

Can you invert this test to unindent the rest of the code?

@@ +6175,5 @@
> +void
> +AnalysisHelperThread::threadLoop()
> +{
> +    AnalysisHelperState &state = runtime->analysisHelperState;
> +    state.lock();

I'm not too familiar with threading: Why is it okay to hold this lock while waiting on the helper? Does waiting on the condvar atomically release the lock, and take it again upon waking up?

@@ +6238,5 @@
> +
> +AnalysisHelperState::~AnalysisHelperState() {}
> +
> +void
> +AnalysisHelperState::lock() {}

If it's possible these will be called in non-JS_THREADSAFE builds, could you reformat them so GDB won't get confused? Like:

void
AnalysisHelperState::lock()
{
}

::: js/src/jsinfer.h
@@ +1225,5 @@
> +
> +void CancelOffThreadCompile(JSCompartment *compartment, JSScript *script);
> +
> +/* Per-runtime state for parallel analysis and compilation. */
> +struct AnalysisHelperState

The name "AnalysisHelperState" is kind of vague. How about "ParallelCompilationState" or something?

@@ +1233,5 @@
> +    size_t numThreads;
> +
> +    enum CondVar {
> +        MAIN,
> +        HELPER

These names confused me at first, "MAIN" seems fine but maybe "HELPER" should be "WORKER"?

@@ +1261,5 @@
> +    /*
> +     * Lock protecting all mutable shared state accessed by helper threads, and
> +     * used by all condition variables.
> +     */
> +    PRLock *lock_;

Could we suffix all member variables with _? (That's Ion style.. I guess we're not in Ion here but it's weird seeing it on just one member)

@@ +1290,5 @@
> +    /*
> +     * If non-zero, this thread is compiling a script and is currently waiting
> +     * for the main thread to allocate an IonCode of the specified size.
> +     */
> +    size_t ionCodeNeeded;

Is this needed yet?

@@ +1293,5 @@
> +     */
> +    size_t ionCodeNeeded;
> +
> +    /* The result of the main thread IonCode allocation. */
> +    ion::IonCode *ionCode;

Or this?

@@ +1297,5 @@
> +    ion::IonCode *ionCode;
> +
> +    void destroy();
> +
> +    static void threadMain(void *arg);

nit: ThreadMain
Comment 26 Brian Hackett (:bhackett) 2012-08-10 19:44:09 PDT
(In reply to David Anderson [:dvander] from comment #24)
> ::: js/src/ion/IonSpewer.cpp
> @@ +43,5 @@
> >  void
> >  ion::IonSpewNewFunction(MIRGraph *graph, JSScript *function)
> >  {
> > +    if (!js_IonOptions.parallelCompilation)
> > +        ionspewer.beginFunction(graph, function);
> 
> This is okay - being off-thread shouldn't change what the spew is. If we
> need it back we can implement off-thread spew.

The problem here is that there is a single global IonSpewer, and it isn't thread safe --- it keeps track of the currently compiled script and if a different thread compiling a different script tries to use it then things will crash.

> ::: js/src/methodjit/Compiler.cpp
> @@ +945,5 @@
> > +        return false;
> > +
> > +    // If the script has been compiled by ion but can't be entered at this loop,
> > +    // compile the script again with JM.
> > +    if (script->hasIonScript() && pc && script->code != pc && script->ion->osrPc() != pc)
> 
> Does this prevent us from ever entering at the Ion OSR pc, or might we still
> get a chance as we keep running?

With this patch, yes.  If Ion compilation finishes, we throw away the JM code, and are then forced to recompile JM code because we haven't hit the OSR PC, that frame will finish fully in JM.  This will need to get fixed in order to deal well with long running frames with nested loops (kraken-gaussian-blur is a major example) but that shouldn't be too hard.
Comment 27 Brian Hackett (:bhackett) 2012-08-10 19:54:39 PDT
(In reply to David Anderson [:dvander] from comment #25)
> @@ +5981,5 @@
> > +                    state.ionWorklist.popBack();
> > +                else
> > +                    state.ionWorklist[i] = state.ionWorklist.popCopy();
> > +                state.unlock();
> > +                return;
> 
> If I'm reading this right: this works because, we're guaranteeing that an
> IonBuilder for a script is either in the process of being compiled, or it's
> in the "finished" queue, but never both, right?

Right.  When we start an off thread compilation we set script->ion to ION_COMPILING_SCRIPT, which will disable any future attempts to compile the script.  script->ion will only be changed when the script is removed from the finished list.

> @@ +6004,5 @@
> > +        if (finished)
> > +            break;
> > +
> > +        state.notify(AnalysisHelperState::HELPER);
> > +        state.wait(AnalysisHelperState::MAIN);
> 
> Just so I understand how this works: None of the threads will be locked
> because we already acquired the lock. This will atomically unlock, and wake
> up any threads waiting on the condvar. Then each thread will one by one
> attempt to acquire the lock, upon doing so, get the next item in the queue,
> unlock (deferring to another thread), re-acquire the lock, and then notify
> main.
> 
> What happens if we wake up in the worker, and the queue is empty? Will it
> crash trying to popCopy?

No, the structure of the wait loop ensures this.  If the worker wakes up and the worklist is empty, it just resumes waiting.

> What happens if the worker is not waiting, but is in CompileBackEnd?

Once CompileBackEnd finishes, the worker reaquires the analysis lock, notifies the main thread (via condvar and runtime interrup) and goes back into the wait loop.

> Will main wake up if any one worker notifies it?

Yes.

> @@ +6175,5 @@
> > +void
> > +AnalysisHelperThread::threadLoop()
> > +{
> > +    AnalysisHelperState &state = runtime->analysisHelperState;
> > +    state.lock();
> 
> I'm not too familiar with threading: Why is it okay to hold this lock while
> waiting on the helper? Does waiting on the condvar atomically release the
> lock, and take it again upon waking up?

Yes, this is how the condvars work; each one is associated with a lock, which must be held when waiting or notifying the condvar.
Comment 28 David Anderson [:dvander] 2012-08-16 00:13:31 PDT
(In reply to Brian Hackett (:bhackett) from comment #27)
> No, the structure of the wait loop ensures this.  If the worker wakes up and
> the worklist is empty, it just resumes waiting.

Thanks for explaining, I missed that.

> > Will main wake up if any one worker notifies it?
> 
> Yes.

Is that a problem, if main wakes up, but other workers are still processing?
Comment 29 David Anderson [:dvander] 2012-08-16 00:27:03 PDT
Comment on attachment 649669 [details] [diff] [review]
patch (b2361e15b665)

Review of attachment 649669 [details] [diff] [review]:
-----------------------------------------------------------------

Patch looks good, though I would like to take one last sweep over the final version before r+'ing. Promise 24hr turnaround on the next one.

::: js/src/ion/Ion.cpp
@@ +179,5 @@
> +    for (size_t i = 0; i < finishedOffThreadCompilations.length(); i++) {
> +        IonBuilder *builder = finishedOffThreadCompilations[i];
> +        FinishOffThreadBuilder(builder);
> +    }
> +    finishedOffThreadCompilations.clear();

nit: This member variable should be suffixed with _

@@ +832,5 @@
>      return true;
>  }
>  
> +static void
> +RemoveJITScript(JSContext *cx, JSScript *script)

Could this placed into MethodJIT.cpp (or jsinfer.cpp) instead?

@@ +874,2 @@
>  /* static */ bool
> +TestCompiler(IonBuilder &builder, MIRGraph &graph, AutoDestroyAllocator &autoDestroy)

Could you change TestCompiler to take an IonBuilder * instead? It's not clear from reading the call to StartOffThreadCompile whether IonBuilder is on the stack or not, * might make it clearer.

@@ +931,5 @@
> +            IonContext ictx(cx, cx->compartment, &builder->temp());
> +
> +            CodeGenerator codegen(builder, *builder->lir);
> +            if (!codegen.generate())
> +                cx->clearPendingException();

Could you comment why cx->clearPendingException was necessary? Hopefully the CG doesn't do throw anything other than OOM...

@@ +933,5 @@
> +            CodeGenerator codegen(builder, *builder->lir);
> +            if (!codegen.generate())
> +                cx->clearPendingException();
> +            else if (script->hasIonScript())
> +                RemoveJITScript(cx, script);

This is a little unfortunate, but it makes sense given that we can't OSR from JM into Ion.
Comment 30 Brian Hackett (:bhackett) 2012-08-21 07:23:20 PDT
Created attachment 653742 [details] [diff] [review]
patch (9df5b5265826)

Updated and rebased with comments.  A couple changes from the previous patch:

- This moves the threading code out of jsinfer.* and into new jsworkers.* files, with generic WorkerThread names.  The intent here is that we should be able to consolidate work that is done off thread using a common mechanism, to make it easier to add new off thread work in the future and reduce the number of OS threads used.

- Fixes racy/incorrect usage of CompilerRoot by attaching the list to TempAllocator (in which the roots are allocated) rather than the runtime.
Comment 31 David Anderson [:dvander] 2012-08-21 17:47:03 PDT
Comment on attachment 653742 [details] [diff] [review]
patch (9df5b5265826)

Review of attachment 653742 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/ion/MIR.cpp
@@ +257,5 @@
>  MConstant *
>  MConstant::New(const Value &v)
>  {
> +    if (v.isString())
> +        JS_ASSERT(v.toString() != (JSString*)0xdadadada);

Heh... if this was meant to be checked in, could we assert(toString()->compartment()) instead?
Comment 32 Brian Hackett (:bhackett) 2012-08-21 18:00:38 PDT
(In reply to David Anderson [:dvander] from comment #31)
> Heh... if this was meant to be checked in, could we
> assert(toString()->compartment()) instead?

Whoops, I'm getting lazy with leaving debug code around.  This caught a case where we were trying to compile code while the JSRuntime was being torn down on the main thread.
Comment 33 Brian Hackett (:bhackett) 2012-08-22 18:01:45 PDT
https://hg.mozilla.org/projects/ionmonkey/rev/4225ee7e35a0
Comment 34 David Anderson [:dvander] 2012-08-22 22:28:06 PDT
Backed out as https://hg.mozilla.org/projects/ionmonkey/rev/2a3e2f6288b7 - unfortunately jit-tests started failing on TBPL preventing additional patches from landing
Comment 35 Brian Hackett (:bhackett) 2012-08-23 06:12:08 PDT
Per bug 784906 I can't reproduce this and the failures don't show up on try either, so I'm going to try fixing in place.  I see a potential race that could cause the process to hang during shutdown if the worker threads never run at all before the runtime is torn down, but going to confirm this is a problem with the threads themselves.  This push sets numThreads == 0, regardless of GetCPUCount().

https://hg.mozilla.org/projects/ionmonkey/rev/f42381e2760d
Comment 36 Brian Hackett (:bhackett) 2012-08-23 07:19:31 PDT
The previous patch seems to be green (hard to be completely sure, since pgo builds aren't being run?).  This push fixes the race and turns thread creation back on.

https://hg.mozilla.org/projects/ionmonkey/rev/f01d30041e4e
Comment 37 Brian Hackett (:bhackett) 2012-08-23 15:46:23 PDT
A couple followup fixes to hopefully fix a timeout that started showing up on jit-tests/tests/basic/math-jit-tests in Windows debug builds.  The problem seems to be that the LIRGraph constructor, which took a MIRgraph by reference, is miscompiled by Windows when that constructor is used in a templated x->new_<Y> form.

https://hg.mozilla.org/projects/ionmonkey/rev/af96913013d8
https://hg.mozilla.org/projects/ionmonkey/rev/8be94866f330
Comment 38 David Anderson [:dvander] 2012-08-23 20:11:42 PDT
Unfortunately we're taking a 2-3% sunspider and v8 hit on AWFY from this patch and I think we have to back it out again.
Comment 39 Brian Hackett (:bhackett) 2012-08-23 20:46:57 PDT
Is this on AWFY or in a browser?  I'll compare opt browsers tomorrow, can you hang on until then?  Testing this in the shell, I saw no SS difference and no V8 difference except maybe 500-1000pts on v8-crypto.  This seemed to be due to additional compilations on a script that was already being compiled over and over and over again (some ForceInvalidation bailout?).  I'm hoping to investigate this soon but can only do if this is in the tree.  Along with backing things out due to phantom regressions, backing things out because they randomly perturbed some broken heuristic is another theme with a long, sad history.
Comment 40 David Anderson [:dvander] 2012-08-23 22:05:03 PDT
Yeah I understand, unfortunately it's fairly critical that IonMonkey has stable TBPL and performance results over the next few days, since we want to land on mozilla-inbound, and for the past 24 hours it hasn't been clear what state the tree is in.
Comment 41 Brian Hackett (:bhackett) 2012-08-24 08:44:24 PDT
Created attachment 655024 [details] [diff] [review]
restore previous compilation-triggering behavior when !js_IonOptions.parallelCompilation

Testing in the browser was inconclusive, I get times between 280ms and 320ms for SS both before and after this patch (???).  I instrumented the shell to track interpreted opcodes, compilation events and JIT entering events for both compilers, and there are some differences here due to the changes to how Ion compilation is triggered from JM.  This change restores the old behavior when parallel compilation is off; I now get the same score in v8-crypto.

This leaves the only changes to behavior when parallel compilation is off as:

- JM will no longer inline functions if it is compiling above a high use threshold.
- Overhead to take an uncontended lock when e.g. invalidating code.

I don't think either of these is a credible reason for a performance difference on SS/v8.
Comment 42 Brian Hackett (:bhackett) 2012-08-24 13:03:30 PDT
https://hg.mozilla.org/projects/ionmonkey/rev/d73db1016da1
Comment 43 Brian Hackett (:bhackett) 2012-08-24 19:54:15 PDT
Created attachment 655258 [details] [diff] [review]
tweak recompilation check behavior

Continuing the litany of sorrow, the last patch screwed up tuning on v8-deltablue somehow.  Restoring the earlier behavior of not checking script->ion before calling out to the recompilation stub (which will destroy the calling JM code) fixes this.
Comment 44 Brian Hackett (:bhackett) 2012-08-25 05:30:17 PDT
https://hg.mozilla.org/projects/ionmonkey/rev/01854460aa68
Comment 45 Paul Wright 2012-11-16 11:09:06 PST
Is there more to do here?
Comment 46 Brian Hackett (:bhackett) 2012-11-16 11:46:46 PST
This still needs some fuzz and performance testing before enabling it in the browser, but the baseline stuff is done.

Note You need to log in before you can comment on or make changes to this bug.