Closed Bug 750834 Opened 12 years ago Closed 12 years ago

JM+TI: don't throw away code on every GC

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla15
blocking-kilimanjaro +

People

(Reporter: dmandelin, Assigned: bhackett1024)

References

Details

(Whiteboard: [k9o:p1:fx15])

Attachments

(2 files, 2 obsolete files)

Currently we throw away TI constraints, invalidation info, and native code on GC. Having to recompile afterward can cause long pauses. This needs to be fixed to get smooth perf on apps and games.

Is it possible to finish this by the Fx15 merge (5 weeks from now)?
blocking-kilimanjaro: --- → +
I have a WIP patch to take a crack at this.  A little slow going because of newly introduced very-hard-to-debug GC hazards (looking at you, RegExpShared).  Should have something later this week.
Blocks: 744546
(In reply to Brian Hackett (:bhackett) from comment #1)
> I have a WIP patch to take a crack at this.  A little slow going because of
> newly introduced very-hard-to-debug GC hazards (looking at you,
> RegExpShared).  Should have something later this week.

That would be much appreciated. Let me know if you want any help with the GC hazards.
Attached patch WIP (46647a0711e9) (obsolete) — Splinter Review
Works pretty effectively to cut visible pause times (at least from my perspective) on the no-comply demo.  Still crashy, and still needs heuristics for when to keep vs. discard JIT code (currently it always keeps the code).
Blocks: 747656
No longer depends on: 747656
Attached patch patch (46647a0711e9) (obsolete) — Splinter Review
Fixes two bugs turned up by running jit-tests with GC zeal on (thanks for the tip Bill), and now I no longer get crashes with or without IGC on.  Other stuff:

- I also tested the Score Rush game and the hitches there seem to be gone.

- On both NoComply and Score Rush we keep around about 50 MB of analysis data and 30 MB of jitcode.  This seems to be fairly stable over time but I didn't check closely.

- I added a "Mark Types" phase to the GC statistics, to count the amount of time we spend doing the marking necessary to keep analysis data and jitcode around.  This is a subset of the total root marking time.  On the Score Rush game I spend about 20 ms marking types during GCs, out of about 25 ms total root marking time.

- Type marking time could be decreased (not sure by how much) by ensuring that objects with singleton types are allocated in non-background-finalizable arenas, which include far fewer objects than background-finalizable arenas.

- Type marking also seems like it could be done as part of incremental slices, and not up front as is required with stack root marking.

- Still needs heuristics for when to discard vs. keep jitcode.
Assignee: general → bhackett1024
Attachment #620071 - Attachment is obsolete: true
Naive thought: perhaps there are things you could add to TelemetryHistogram.h to track in order to get broader data about how this is doing in the wild in order to tune the heuristics over time?
Also, this is exciting that the hitches are gone; nice work! :-)
This looks great! Thanks for doing the write barriers as well. There were two things I didn't understand. Why do you need to change needsBarrier_ around MarkRuntime in ValidateIncrementalMarking? Also, why do you need to ExpandInlineFrames in ResetIncrementalGC?
(In reply to Bill McCloskey (:billm) from comment #7)
> This looks great! Thanks for doing the write barriers as well. There were
> two things I didn't understand. Why do you need to change needsBarrier_
> around MarkRuntime in ValidateIncrementalMarking? Also, why do you need to
> ExpandInlineFrames in ResetIncrementalGC?

Both of these are because scripts can now have both normal and barriered code compiled, and if there is jitcode on the stack then the compartment's needBarrier_ value needs to reflect the state of that jitcode (or else stack patching won't work).  It might be better to have ValidateIncrementalMarking just ExpandInlineFrames in each compartment before doing the MarkRuntime.
Some more fixes and cleanup, and add heuristics for when to keep vs. purge JIT code.  The strategy is to keep JIT code around if the window has requested an animation frame within the last second.  This works for the two games I tested, NoComply and Score Rush.  The problem with this approach is that if the page is manufacturing scripts, or somehow performing a lot of compilation activity, the memory used by the page can grow without being collected.  I thought about adding some glue and tape to watch for this but it seems better to just watch for memory pressure notifications from the browser and collect only under those circumstances.  Not sure if such a thing exists, if not then I can go the hack route in a followup patch/bug.

Also, VerifyBarriers will always discard JIT code, how often is this triggered and does it impede the ability to test this functionality in debug builds?  Is there an easy fix (it's not obvious how this code works).
Attachment #620285 - Attachment is obsolete: true
Attachment #620696 - Flags: review?(wmccloskey)
Also, the above patch adds a gcPreserveCode() function to the shell which causes JIT code to always be preserved by GCs afterwards.  Fuzzing with the patch applied and that function called would be great.
I'm on it now :)
Green on LangFuzz so far. I'll let it run until tomorrow but I assume that if at all, it won't be a huge number of bugs popping up.
(In reply to Brian Hackett (:bhackett) from comment #9)
> Created attachment 620696 [details] [diff] [review]
> patch (46647a0711e9)

Green on jsfunfuzz except:

function f(code) {
  code.replace(/\s/g, " ");
  return {
    allowExec: g(code)
  };
}
function g(code) {
  code.replace(/\s/g, " ");
}
function h(code) {
  f("")
}
h();
h();
quit()

causes a leak on m-c changeset b4d2343fff9d with this patch at:

==14378== 144 (64 direct, 80 indirect) bytes in 1 blocks are definitely lost in loss record 564 of 698
==14378==    at 0xC743: malloc (vg_replace_malloc.c:266)
==14378==    by 0x100183418: js::RegExpCompartment::get(JSContext*, JSAtom*, JSAtom*, js::RegExpFlag, js::RegExpCompartment::Type, js::RegExpGuard*) (Utility.h:184)
==14378==    by 0x100181FBC: js::RegExpObject::createShared(JSContext*, js::RegExpGuard*) (jsval.h:675)
==14378==    by 0x1001B2932: js::mjit::Compiler::jsop_regexp() (Compiler.cpp:6728)
==14378==    by 0x10019A54D: js::mjit::Compiler::generateMethod() (Compiler.cpp:3114)
==14378==    by 0x100194F4B: js::mjit::Compiler::performCompilation() (Compiler.cpp:550)
==14378==    by 0x100194C64: js::mjit::Compiler::compile() (Compiler.cpp:147)
==14378==    by 0x1001A1372: js::mjit::CanMethodJIT(JSContext*, JSScript*, unsigned char*, bool, js::mjit::CompileRequest) (Compiler.cpp:994)
==14378==    by 0x100078926: js::Interpret(JSContext*, js::StackFrame*, js::InterpMode) (jsinterp.cpp:2813)
==14378==    by 0x10018AFD1: js::mjit::JaegerShot(JSContext*, bool) (MethodJIT.cpp:1079)
==14378==    by 0x10006C083: js::RunScript(JSContext*, JSScript*, js::StackFrame*) (jsinterp.cpp:477)
==14378==    by 0x10008071E: js::ExecuteKernel(JSContext*, JSScript*, JSObject&, JS::Value const&, js::ExecuteType, js::StackFrame*, JS::Value*) (jsinterp.cpp:679)
==14378== 
==14378== LEAK SUMMARY:
==14378==    definitely lost: 64 bytes in 1 blocks
==14378==    indirectly lost: 80 bytes in 1 blocks
==14378==      possibly lost: 0 bytes in 0 blocks
==14378==    still reachable: 128 bytes in 2 blocks
==14378==         suppressed: 326,366 bytes in 1,014 blocks
==14378== Reachable blocks (those to which a pointer was found) are not shown.
==14378== To see them, rerun with: --leak-check=full --show-reachable=yes
==14378== 
==14378== For counts of detected and suppressed errors, rerun with: -v
==14378== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)


This leak does not occur without the patch on m-c changeset b4d2343fff9d.
(In reply to Brian Hackett (:bhackett) from comment #4)
> - I also tested the Score Rush game and the hitches there seem to be gone.

\o/
Comment on attachment 620696 [details] [diff] [review]
patch (46647a0711e9)

Review of attachment 620696 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jscompartment.cpp
@@ +468,5 @@
> +        for (int constructing = 0; constructing <= 1; constructing++) {
> +            for (int barriers = 0; barriers <= 1; barriers++) {
> +                mjit::JITScript *jit = script->getJIT((bool) constructing, (bool) barriers);
> +                if (jit)
> +                    jit->trace(trc);

Can you move this so that it's called from the script's trace hook? It seems a little less hacky that way, and it might cut down the time for the first GC slice a little bit. However, you'll have to use MarkObjectUnbarriered instead of MarkObjectRoot inside the trace method. I think this should be fine, since the data being traced is immutable.

::: js/src/jsgc.cpp
@@ +2299,1 @@
>              c->markTypes(trc);

I just noticed that neither AutoPhase nor markTypes is safe to use unless IS_GC_MARKING_TRACER(trc) holds. Could you add this condition to the if statement?

@@ +3258,5 @@
> +ShouldPreserveJITCode(JSCompartment *c, int64_t currentTime)
> +{
> +    if (c->rt->alwaysPreserveCode)
> +        return true;
> +    if (c->lastAnimationTime + (PRMJ_USEC_PER_MSEC * 1000) >= currentTime)

You can just use PRMJ_USEC_PER_SEC.

Also, it seems like we might want to incorporate something here about whether we're shutting down. Maybe, at the top of the function, return false if !rt->hasContexts()?

::: js/src/methodjit/Compiler.cpp
@@ +6721,5 @@
>       * Force creation of the RegExpShared in the script's RegExpObject so that
> +     * we grab it in the getNewObject template copy. A strong reference to the
> +     * RegExpShared will be added when the jitcode is created. Any GC activity
> +     * between now and construction of that jitcode could purge the shared
> +     * info, but such activity will also abort compilation.

Could you remove the branch above that tests if |cx->runtime->gcIncrementalState == gc::MARK|?

::: js/src/shell/js.cpp
@@ +1215,5 @@
>           * with METHODJIT_ALWAYS recompilation can happen and discard the
>           * script's jitcode.
>           */
>          if (!cx->typeInferenceEnabled() &&
> +            !cx->fp()->script()->getJIT(cx->fp()->isConstructing(), cx->compartment->needsBarrier())) {

Could this just be cx->fp()->jit()?

::: js/src/vm/RegExpObject.h
@@ +206,5 @@
>   *      been traced through before the new RegExpShared was installed, in which
>   *      case deleting the RegExpShared would turn the RegExpObject's reference
>   *      into a dangling pointer
>   *
>   * The activeUseCount and gcNumberWhenUsed fields are used to track these two

Could you fix up this sentence now that there are three things?
Attachment #620696 - Flags: review?(wmccloskey) → review+
-    for (CompartmentsIter c(rt); !c.done(); c.next()) {
-        if (c->activeAnalysis)
-            return IncrementalSafety::Unsafe("activeAnalysis set");
-    }

I thought I had convinced myself that this was safe, but now I'm thinking it's not. Say you do the first slice of an incremental GC with activeAnalysis false. Then in the last slice it's true. Then we'll keep the TI info but we won't mark the stuff it depends on.

In fact, I think that the gcPreserveCode flag has the same problem. I would suggest setting it in BeginMarkPhase (which happens in the first slice) and then leaving it with the same value throughout the GC. You could unset it in EndMarkPhase, or just leave it be. For the activeAnalysis flag, it seems like the code we have now should be left in.

Also, the reason Start/EndVerifyBarriers throws away JIT code is to mimic what the GC is doing and to cause the barriers to be enabled in the JIT. I think that code should be removed in this patch.
I found two possible regressions here with the fuzzing patch. Ping me on IRC if you still need those :)
As discussed on IRC, here's the second regression I found. The first is best checked on the machine as it's not deterministic.

Backtrace of this crash:

Program received signal SIGSEGV, Segmentation fault.
0x0000000000455757 in js::mjit::JITScript::chunkDescriptor (this=0x0, i=0) at /home/ownhero/homes/mozilla/repos/mozilla-central/js/src/methodjit/MethodJIT.h:834
834         unsigned chunkIndex(jsbytecode *pc) {
(gdb) bt
#0  0x0000000000455757 in js::mjit::JITScript::chunkDescriptor (this=0x0, i=0) at /home/ownhero/homes/mozilla/repos/mozilla-central/js/src/methodjit/MethodJIT.h:834
#1  0x00000000006f5af9 in js::mjit::Compiler::Compiler (this=0x7fffffff7f50, cx=0xbdcaa0, outerScript=0x7ffff0913240, chunkIndex=0, isConstructing=true)
    at /home/ownhero/homes/mozilla/repos/mozilla-central/js/src/methodjit/Compiler.cpp:132
#2  0x00000000006f9cda in js::mjit::CanMethodJIT (cx=0xbdcaa0, script=0x7ffff0913240, pc=0xbff85c "\202", construct=true, request=js::mjit::CompileRequest_Interpreter)
    at /home/ownhero/homes/mozilla/repos/mozilla-central/js/src/methodjit/Compiler.cpp:993
[...]
(gdb) x /i $pc
=> 0x455757 <js::mjit::JITScript::chunkDescriptor(unsigned int)+19>:    mov    0x38(%rax),%eax
(gdb) info reg rax
rax            0x0      0
We need to make this work for pages that animate using setTimeout, which is still the common case on the Web.
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #20)
> We need to make this work for pages that animate using setTimeout, which is
> still the common case on the Web.

Definitely.  Is there a good known way to detect such pages, without too high a false positive rate?  (False positives can cause code and analysis info to be kept around longer than necessary).
(In reply to Brian Hackett (:bhackett) from comment #21)
> (In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #20)
> > We need to make this work for pages that animate using setTimeout, which is
> > still the common case on the Web.
> 
> Definitely.  Is there a good known way to detect such pages, without too
> high a false positive rate?  (False positives can cause code and analysis
> info to be kept around longer than necessary).

Let's get a followup bug on these.
Depends on: 753630
https://hg.mozilla.org/mozilla-central/rev/fbff86190de6
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla15
In bug 738153, bug 739512, and bug 742163, I shrank JSScript from 128 bytes to 96 bytes on 32-bit platforms, and from 208 bytes to 144 bytes on 64-bit platforms.  So it's frustrating to see it grow again to 104 and 160 bytes just for barriers, esp. when we so rarely need both the normal and ctor handles.  Hmm.
(In reply to Nicholas Nethercote [:njn] from comment #25)
> In bug 738153, bug 739512, and bug 742163, I shrank JSScript from 128 bytes
> to 96 bytes on 32-bit platforms, and from 208 bytes to 144 bytes on 64-bit
> platforms.  So it's frustrating to see it grow again to 104 and 160 bytes
> just for barriers, esp. when we so rarely need both the normal and ctor
> handles.  Hmm.

Yeah, early on I tried to just have a dynamically managed array[4] of JITScript handles pointed to by the script, non-NULL iff the script has any handles at all.  That would be pretty easy to do but I decided to split it off into a separate bug which I never filed.  Will look into that this week.
Depends on: 755639
Depends on: 758613
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: