Add gczeal setting to GC on every allocation.

RESOLVED FIXED

Status

()

defect
RESOLVED FIXED
8 years ago
8 years ago

People

(Reporter: bhackett, Assigned: billm)

Tracking

(Blocks 2 bugs)

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: fixed-in-tracemonkey)

Attachments

(1 attachment, 2 obsolete attachments)

(Reporter)

Description

8 years ago
Currently, gczeal(2) only causes us to garbage collect when free lists for a given thing kind are emptied and a new page/arena is needed.  This makes it harder to find and isolate GC bugs that can be triggered only within a small execution window, as the stars have to be aligned so that a freelist empties within that window.  From cursory examination of the code, gczeal also doesn't seem to induce GCs after mallocs or JSArenaPool allocations.

It would be nicer (and, possibly, the historical behavior?) to have a setting that will trigger GC every time we try to allocate.  If changing gczeal(2) to do this would break/slow-down tests too much, maybe do this with a gzeal(3) instead.

Comment 1

8 years ago
Brian:

Indeed, the historical behaviour of gczeal(2) is as you described.  This must have been changed less than a year ago, as I was just using it on a JSAPI of that vintage the other day.

You are right, the historical behaviour is very helpful for find GC bugs, or rooting bugs in the embedding!
I second the need for this, it should always be possible to get a GC at every possible time where it could also happen in reality. Sometimes this is the only way to reproduce fragile GC problems.
Assignee: general → wmccloskey
Posted patch patch (obsolete) — Splinter Review
I realized that the parameter to GC zeal is really treated as a boolean. Doing gczeal(1) is no different than gczeal(2). So I decided not to muck with this--with the patch, we GC on every allocation for gczeal(n), n>0. I don't see any tests breaking in the shell. I'll tryserver it when that's open again.

My hope is that bug 640265 will ensure that we never GC inside malloc/calloc/etc. So I didn't add anything there.

I don't know much about arenas, but I don't see any obvious GC points there.
Attachment #527835 - Flags: review?(anygregor)
(Reporter)

Comment 4

8 years ago
Looking at the arena usage again, I think arena-heavy stuff like the parser and type inference would get very unhappy anyways if we GC'ed on every single allocation.  Can you cc me on bug 640265?

Comment 5

8 years ago
Bill - historically, gcZeal has not been treated like a boolean, although the difference between 1 and 2 has never been well-documented.

When JS_SetGCZeal() was introduced, 1 was equivalent to the old -DTOO_MUCH_GC and 2 was equivalent to -DWAY_TOO_MUCH_GC.   

The effect was
cx->gcZeal=1 
  - cause JS_MaybeGC() to run JS_GC() every time
  - run GC every time we make a new GC thing and rt->gcPoke is set
  - rt->gcPoke is set whenever we do something GC-related, like removing a root
  
cx->gcZeal = 2
  - cause JS_MaybeGC() to run JS_GC() every time
  - run GC every time we make a new GC thing
  
Embedders, at least, like the two options because #1 increases GC frequency without making apps so slow they are unusable.
You are right that the old implementation does not support different levels of gczeal. I am a little bit worried that with the new implementation it gets impossible to perform any kind of gczeal tests in the browser. Right now it takes a long time to start the browser but if we perform a GC on every allocation this might become impossible. 
We should add another level that allows us to perform less GCs. Jesse was suggesting for a long time now that he would love to see an additional parameter that indicates after how many allocations we perform a GC. so gcZeal(50) would perform a GC after 50 allocations.
I'd rather not change the current meaning of GC zeal too much. So how about this: gczeal(1) means "just GC for MaybeGC". gczeal(2) means "GC for MaybeGC and all allocations". I think this is pretty much what Wes wants.

Then we could add some new environment variables that would allow you to do stuff like:
- Start GCing after i allocations
- GC every j'th allocation
- ... or GC with probability 1/k on every allocation

What do you guys think?
Yeah the problem is that the current meaning is not the current implementation...

I am missing the current behavior with the new patch where it is still possible to start the browser if you get a cup of coffee and run "make test" in the shell over night. 
MaybeGC is pretty much useless to find real rooting bugs. A GC on every allocation would probably be a 50 - 100x slowdown?
I vote for an additional setting for the current behavior where we GC on every arena switch.

I don't want to block on this if other people think we should go with this new behavior.

Maybe we could also get a gczeal test for the jsapi tests on tinderbox but this is definitely a different story :)
The purpose of the "GC every j'th allocation" parameter I suggested is to solve this problem. We could use it to balance test coverage versus time to run the tests. With the old "GC every Refill", you're basically stuck with a fixed frequency.

Comment 10

8 years ago
(In reply to comment #7)
> Then we could add some new environment variables that would allow you to do
> stuff like:
> - Start GCing after i allocations
> - GC every j'th allocation
> - ... or GC with probability 1/k on every allocation

For fuzzing, I'd prefer the ability to say "GC 10 allocations from now".  That needs to be a shell API, not an environment variable.  See bug 569451.
(In reply to comment #10)
> For fuzzing, I'd prefer the ability to say "GC 10 allocations from now".  That
> needs to be a shell API, not an environment variable.  See bug 569451.

OK, thanks for pointing that out. I can fold it into the next patch once we agree on something. I think it's orthogonal to this bug.
BillM: If it's possible to preserve (well, reintroduce) the gcPoke semantics, that would be cool -- where rt->gcPoke goes true for GC-related operations and triggers GC zeal on the next allocation.  That was also key for cx->gcZeal=1.  Merely turning maybeGC into GC can be easily done by the embedder and gains us little (especially when most embedders never call JS_MaybeGC).

GWagner: the behaviour I described *is* the old gcZeal behaviour.  It just got lost somewhere along the way.  I described it by reading the source code to the JS 1.8.0-rc1 source release. A quick grep of my source tree shows that it was still present in ~1.8.2 as well (I am on the rev before fatvals landed, 21e90d198613).

Jesse's idea is interesting, I see it as JS_SetGCCountdown(count, callback) where we decrement a counter in the current compartment until it hits zero and then we GC and after we GC we run the optional callback.  That would give him a bunch of fuzzing power if were exposed right up to the shell including the callback. Definitely another bug, though.
Wes, can you explain why the gcPoke behavior was helpful? We're planning on eliminating gcPoke in bug 589771.
In the context of gc-debugging, things often "go wrong" around operations like root removal, property removal, etc.  Triggering a GC soon after these helps trigger the embedding bug close to the error-site (the whole point of zealous GC), without incurring the sizeable performance penalty of gcZeal=2.

So, in this context, gcZeal=1 is using rt->gcPoke to tell it was is going on in the gc/engine, rather than its true purpose (which actually eludes me, I think it has something to do with making sure we loop to get all the garbage, and don't consider something reachable when it is not).

If we had the JS_SetGCCountdown(cx, count, callback) interface I described above, we could get the gcZeal-related gcPoke functionality back with something like

#ifdef JS_GC_ZEAL
# define GC_POKE(cx, v) \
  do { \
    if (cx->gcZeal == 1 && !cx->gcCountdown && !cx->gcCountdownCallback) { \
      cx->gcCountdown = 1; \
    } \
  } while(0)
#else
# define GC_POKE (JS_TRUE)
#endif

Although I guess picking a better name/interface might be preferable to the macro-mess.
OK, that's helpful. It sounds like we want four possibilities:

1. GC on operations that are likely to cause problems for embedders, like root removal.
2. GC on every allocation
3. GC on every k'th allocation
4. Schedule a GC after n allocations

I'll put a patch together soon that does all this.
Sounds like you're right on target Bill.

#4, I assume you mean "Schedule a GC after n allocations from now", which is exactly the functionality Jesse needs in the shell.

Although I still think the option to run a callback afterwards would be neat, too -- if only to make debugging tests easier.

Wes
One more thing we need: the ability to trigger per-compartment GCs as well as full GCs.
Oooh! Yes!  But not just as part of gcZeal, through JSAPI in general

I currently have the following scenario:
 - worker threads (shared nothing)
 - "async" thread
 - normal JS thread

Each thread has it's own cx.  The worker threads are well-understood, it would be nice if these would GC regularly without stopping the world (I am on older JSAPI, I hope new will just magically do this based on GC threshold at least).

Truthfully, I don't remember what compartment the "async" thread is in, but it's job is to catch POSIX signals and other similar events, and to relay them to the "normal" thread via thread-safe C methods and to trigger activity on the "normal" thread via the operation callback. I think the async thread is on the same compartment as the "normal JS" thread, because it doesn't call into JS, ever: it uses the cx for JS_malloc, JS_GC, etc.

The 'async' thread also has one other job, and that is to call JS_MaybeGC every few seconds  (that should probably really be happening on a dedicated thread). Anyhow, the MaybeGC is called in this way in the hopes that the async thread will be the one to run the global GC while the "normal JS" thread is blocked on I/O (which happens a lot in my environment), via the suspend/resume request mechanism.

It would be sweet if the MaybeGC described above could cause a per-compartment GC while the thread that the compartment is running on was blocking for I/O.
Well, I don't totally understand that. Keep in mind that compartment GCs still cause all JS code to block. However, if they're not in a request (and it sounds like they're not) then I think you'll be okay.

I'm thinking that this new mechanism would trigger a GC in the compartment of the context that's passed in. Would that work for you?

Comment 20

8 years ago
> I'm thinking that this new mechanism would trigger a GC in the compartment of
> the context that's passed in. Would that work for you?

So a function I can call from chrome code that can take a content-window-global as an argument? Sounds good. (For workers, I guess I'd pass the Worker object?)
(In reply to comment #20)
> So a function I can call from chrome code that can take a content-window-global
> as an argument? Sounds good. (For workers, I guess I'd pass the Worker object?)

Right now I'm thinking you would pass in the compartment directly. So I think you could just do this:
  js_GC(cx, global->compartment())
However, I don't know much about this stuff. I'm pretty sure we can make it work, though.

Comment 22

8 years ago
Oh. I'm interested in the API exposed to chrome JS, not the lower-level JSAPI.
Posted patch updated patch (obsolete) — Splinter Review
I think this does pretty much what has been discussed. I used a command line param rather than an environment var for the GC frequency, though. I also added the ability to ask for a GC in a given compartment in the shell. You just pass it an object from the right compartment.
Attachment #527835 - Attachment is obsolete: true
Attachment #527835 - Flags: review?(anygregor)
Attachment #529201 - Flags: review?(anygregor)
I tried the patch and it introduces a crash with this code:

this.__defineSetter__('x', gc);
x=null;
Wow, 10 points for creativity, Christian!
(In reply to comment #25)
> Wow, 10 points for creativity, Christian!

I'll pass that on to my fuzzer, it'll be happy ;)

Comment 27

8 years ago
Would it make sense for schedulegc to let me specify a compartment?
Posted patch patch v3Splinter Review
OK, new patch.

Thanks for finding that bug, decoder. It was a mismatch between the semantics of Value::isObject and JSVAL_IS_OBJECT.

I added some new options for fuzzing compartment GCs. First, you can now ask for compartment GCs during GC zeal. This is done either by running with -Z zeal,frequency,1 or by gczeal(zeal, frequency, true). Second, schedulegc also takes a boolean saying whether the scheduled GC should be a compartment GC (meaning it GCs whatever the current compartment is when the GC happens).

I though about allowing you to explicitly specify a compartment for schedulegc. However, this is a lot more complicated to implement and I'm not sure it's worth it. Hopefully what's in the patch will be good enough.
Attachment #529201 - Attachment is obsolete: true
Attachment #529201 - Flags: review?(anygregor)
Attachment #529774 - Flags: review?(anygregor)
Comment on attachment 529774 [details] [diff] [review]
patch v3

>From: Bill McCloskey <wmccloskey@mozilla.com>
>
>diff --git a/js/src/jsapi.cpp b/js/src/jsapi.cpp
>--- a/js/src/jsapi.cpp
>+++ b/js/src/jsapi.cpp
>@@ -2586,24 +2586,31 @@ JS_DumpHeap(JSContext *cx, FILE *fp, voi
> 
> extern JS_PUBLIC_API(JSBool)
> JS_IsGCMarkingTracer(JSTracer *trc)
> {
>     return IS_GC_MARKING_TRACER(trc);
> }
> 
> JS_PUBLIC_API(void)
>-JS_GC(JSContext *cx)
>+JS_CompartmentGC(JSContext *cx, JSCompartment *comp)
> {
>     LeaveTrace(cx);
> 
>     /* Don't nuke active arenas if executing or compiling. */
>     if (cx->tempPool.current == &cx->tempPool.first)
>         JS_FinishArenaPool(&cx->tempPool);
>-    js_GC(cx, NULL, GC_NORMAL);
>+
>+    js_GC(cx, comp, GC_NORMAL);
>+}

We don't support it for the atomsCompartment.
Assert or comment?


>gc([compartment])        Run the garbage collector",

The argument is an object right? You should add this information somehow.


We also have the option that disables per-compartment GCs. (JSGC_MODE_GLOBAL)
They seem to conflict with each other. We should document the behavior or fix it.
I don't want to block on this so followup bug.

The "common" way to use GCZeal is still with the level and no arguments.
So it would be nice to have the information about the behavior explicit:
level 1: ...
level 2: ...
(right now it's hidden in your long comment)

It would also make sense to enable this behavior in about:config right?
Maybe another followup.

I guess there is also a MDC page on gcZeal that needs to be changed.

Great patch!
Attachment #529774 - Flags: review?(anygregor) → review+
Blocks: GC
I would *love* to see this on tracemonkey as soon as possible. Bill?

Updated

8 years ago
Blocks: 661469
Follow-up patch:
http://hg.mozilla.org/tracemonkey/rev/b4ddabc55995

It turns out this was triggering spurious GCs when we set the GC zeal to 0 in between jstest runs. Hopefully this will fix the timeouts.
You need to log in before you can comment on or make changes to this bug.