Closed Bug 558861 Opened 14 years ago Closed 14 years ago

Compartmental GC

Categories

(Core :: JavaScript Engine, defect)

Other Branch
defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: jorendorff, Assigned: gwagner)

References

Details

(Whiteboard: [evang-wanted][compartments][ETA: 9/24] fixed-in-tracemonkey[suspect-regress-ts_cold_generated_med][suspect-regress-dromaeo_css][suspect-regress-dromaeo_jslib])

Attachments

(3 files, 25 obsolete files)

8.73 KB, text/plain
Details
250.69 KB, patch
Details | Diff | Splinter Review
3.56 KB, patch
Details | Diff | Splinter Review
The plan is to divide the JSRuntime into compartments (subheaps). Each compartment will contain some global objects and everything parented to them. We'll impose a write barrier on all references across compartments.

Then we'll support GC on an individual compartment, treating all references pointing into that compartment from outside as roots. This will not collect cyclic garbage that spans compartments; so we will also continue to support JSRuntime-wide GC.
Would it be feasible to have one global per compartment? It would save us the parent link per object. Just thinking out loud.
(In reply to comment #1)
> Would it be feasible to have one global per compartment? It would save us the
> parent link per object. Just thinking out loud.

We can amortize that without making a compartment per global, but it's a good question how this would perform.

Jason: for a write barrier across compartments, were you thinking of exposing reference to compartment A to compartment B code? I woulda thunk we would impose a proxy, not just a barrier. This bears on the cost of compartment-per-global vs. real web cross-frame workload.

/be
Depends on: 553671
Jason, you gonna take this bug? I can help review patches in bug 553671.

/be
We probably get some memory fragmentation. We have about 6 different finalization types that each need a separate arena, so it would be about 6 * 4k / 2 = 12k wasted memory on average per global. How many globals do we have in a browser session?
(In reply to comment #4)
> We probably get some memory fragmentation. We have about 6 different
> finalization types that each need a separate arena, so it would be about 6 * 4k
> / 2 = 12k wasted memory on average per global. How many globals do we have in a
> browser session?

Depends on whether you mean an inner window. Let's say not. We have one outer window per chrome window, one for each tab, one for each frame or iframe. The hard part is covering the # of open tabs. See the testpilot results:

https://testpilot.mozillalabs.com/testcases/tab-open-close/aggregated-data.html

/be
> How many globals do we have in a browser session?

Your typical modern web page will have somewhere between 1 and 20 (depending on how it does its ads).  So if we ignore stuff like xul and xbl prototype globals and the globals for our XUL windows (basically one per window), we're probably talking between 1 (for someone who never has multiple tabs open and only uses one window) to several thousand (for someone with a large session).

Fwiw, I think 14k per global is an ok cost.  ccing jst to see what he thinks.
Assignee: general → jorendorff
(In reply to comment #5)
> (In reply to comment #4)
> > We probably get some memory fragmentation. We have about 6 different
> > finalization types that each need a separate arena, so it would be about 6 * 4k
> > / 2 = 12k wasted memory on average per global. How many globals do we have in a
> > browser session?
> 
> Depends on whether you mean an inner window. Let's say not. We have one outer
> window per chrome window,

Correction -- jst reminded me this one is an nsXULWindow, no DOM window jazz here.

> one for each tab, one for each frame or iframe.

I'm thinking per-outer-window to start, see how that goes. Same origin too? We have cross-origin wrappers already.

/be
Whiteboard: [evang-wanted]
> nsXULWindow, no DOM window jazz here.

Sorry, no.  We still have a DOM window for the chrome document, in addition to the nsXULWindow.

> Same origin too?

That seems like it introduces complications like which heap should be used for the JSObject for the outer window itself (which needs to more or less persist across origin changes, though maybe only the wrapper needs to persist....)
Depends on: 560471
Depends on: 561364
Assignee: jorendorff → anygregor
Depends on: compartments-api
Depends on: 563106
blocking2.0: --- → final+
Grabbing this while gregor is on vacation.
Assignee: anygregor → gal
Attached patch patch (obsolete) — Splinter Review
Incomplete. This patch is for discussion with gregor only. Do not review. Do not comment on it.
Attached patch WIP: gluing things together. (obsolete) — Splinter Review
work in progress....
Attachment #452833 - Attachment is obsolete: true
Attached patch WIP2 (obsolete) — Splinter Review
gluing 2: passing trace-tests and jsapi tests. fail some jstests.
Attachment #454115 - Attachment is obsolete: true
Attached patch WIP (obsolete) — Splinter Review
passes all reftests. still fails during browser startup.
needs more refactoring once the fatval patch is in.
Attachment #455060 - Attachment is obsolete: true
Attached patch patch (obsolete) — Splinter Review
new version. browser crash fixed.
Attachment #456349 - Attachment is obsolete: true
Assignee: gal → anygregor
Attached patch patch (obsolete) — Splinter Review
new verison. bug fixes
Attachment #456988 - Attachment is obsolete: true
Attached patch patch (obsolete) — Splinter Review
new version. some cleanup
Attachment #457162 - Attachment is obsolete: true
If this makes the GC faster it will help us on Splay.js (and probably other v8 benchmarks).
Blocks: JaegerSpeed
Attached patch patch (obsolete) — Splinter Review
WiP: merging + reordering for inlining.
Attachment #457440 - Attachment is obsolete: true
Attached patch patch (obsolete) — Splinter Review
current version. still asserts in delayed marking
Attachment #458555 - Attachment is obsolete: true
Comment on attachment 459145 [details] [diff] [review]
patch

>diff -r f93be2cc64d8 js/src/Y.js
>--- a/js/src/Y.js	Tue Jul 20 21:10:57 2010 -0700
>+++ b/js/src/Y.js	Wed Jul 21 13:55:49 2010 -0700
>@@ -1,19 +1,9 @@
>-// The Y combinator, applied to the factorial function
>-
>-function factorial(proc) {
>-    return function (n) {
>-        return (n <= 1) ? 1 : n * proc(n-1);
>-    }
>-}
>-
>-function Y(outer) {
>-    function inner(proc) {
>-        function apply(arg) {
>-            return proc(proc)(arg);
>-        }
>-        return outer(apply);
>-    }
>-    return inner(inner);
>-}
>-
>-print("5! is " + Y(factorial)(5));
>+var x = ({});
>+y = x;
>+for (var j = 0; j < 1000; ++j)
>+	for (var i = 0; i < 100; ++i) {
>+		x.next = ({});
>+		x = x.next;
>+	}
>+for (i = 0; i < 10000; ++i)
>+    	gc();

Undo.
> JS_PUBLIC_API(void)
> JS_CallTracer(JSTracer *trc, void *thing, uint32 kind)
> {
>     JS_ASSERT(thing);
>-    Mark(trc, thing, kind);
>+    MarkUntyped(trc, thing, kind);
> }

MarkKind is better I think.

>             /*
>              * Empty the GC free lists to trigger a last-ditch GC when any GC
>              * thing is allocated later on this thread. This makes unnecessary
>              * to check for the memory pressure on the fast path of the GC
>              * allocator. We cannot touch the free lists on other threads as
>              * their manipulation is not thread-safe.
>              */
>-            JS_THREAD_DATA(this)->gcFreeLists.purge();
>+            for (JSCompartment **c = runtime->compartments.begin(); c != runtime->compartments.end(); ++c) {
>+                (*c)->freeLists.purge();
>+            }

No { }. the comment doesn't match the code here any more. The code here seems to purge free lists to hasten the next check
for malloc pressure. However, its going to happen anyway eventually (we will run out of things on some freelist, at 
most 4k / 16 bytes later). I think this is strictly over-engineering here. Lets remove all of it in a separate patch
on TM right now (so we can tryserver it).

>             js_TriggerGC(this, true);

I guess we can keep the TriggerGC but this all seems very weird and unnatural. Lets discuss this offline.

>         }
>     }
> }
>                                                                               \
>-        if (!JS_CHECK_STACK_SIZE(cx, stackDummy_)) {                          \
>+        if (!CheckStackSize(cx->stackLimit, &stackDummy_)) {                          \

The \ is out of whack.

>+    gc::MarkIdRange(trc, idArray->length, idArray->vector, "JSAutoIdArray.idArray");

Either always use gc:: or never.

>+#ifdef DEBUG
>+template <typename T>
>+bool
>+Arena<T>::assureInThingsArray(T *thing)
>+{
>+    for (T *curr = &t.things[0]; curr < &t.things[ThingsPerArena]; curr++) {
>+        if (thing == curr)
>+            return true;
>+    }
>+    return false;
>+}
>+#endif

WTF?? assureInThingsArray() { return curr >= t.things[0] && curr < things[ThingsPerArray]. }


>+    if (thing > &t.things[ThingsPerArena-1] || thing < &t.things[0]) {
>+        return;
>+    }

No {}

> static void
> UpdateArenaStats(JSGCArenaStats *st, uint32 nlivearenas, uint32 nkilledArenas,
>                  uint32 nthings)
> {
>     size_t narenas;
> 
>     narenas = nlivearenas + nkilledArenas;
>-    JS_ASSERT(narenas >= st->livearenas);
>+    //JS_ASSERT(narenas >= st->livearenas);

?


>+    for (Chunk **c = rt->gcChunks.begin(); c != rt->gcChunks.end(); ++c) {
>+        ReleaseGCChunk(rt, *c);
>+    }

{}

>+//js_DumpGCStats(rt, stdout);

?
>+    //inline bool isMarked(size_t bit) {
>+     //   uintptr_t *word = &bitmap[bit / BitsPerWord];
>+     //   JS_ASSERT(word < &bitmap[JS_ARRAY_LENGTH(bitmap)]);
>+     //   return *word & (uintptr_t(1) << (bit % BitsPerWord));
>+    //}

?

>+    /*inline bool markIfUnmarked(size_t bit) {
>+        uintptr_t *word = &bitmap[bit / BitsPerWord];
>+        JS_ASSERT(word < &bitmap[JS_ARRAY_LENGTH(bitmap)]);
>+        uintptr_t mask = (uintptr_t(1) << (bit % BitsPerWord));
>+        if (*word & mask)
>+            return false;
>+        *word |= mask;
>+        return true;
>+    }*/

?
Attached patch patch (obsolete) — Splinter Review
new version. passes reftests.
Attachment #459145 - Attachment is obsolete: true
Attached patch patch (obsolete) — Splinter Review
fixed browser build, down to tryserver leaks.
Attachment #459246 - Attachment is obsolete: true
Attached patch rebase (obsolete) — Splinter Review
Attachment #459894 - Attachment is obsolete: true
Still leaking?
Attached patch patch (obsolete) — Splinter Review
no more leaks. tryserver looks green. looking at performance numbers.
Attachment #460702 - Attachment is obsolete: true
Blocks: compartments
Everything sounds good. How's the performance? Can we land it?
Blocks: 585787
blocking2.0: final+ → beta5+
Attached patch patch (obsolete) — Splinter Review
New version. Still have some troubles with locking in multithreaded environment.

First performance numbers for v8 benchmarks in single-threaded shell for Mac 10.6:
current tip:
host-5-167:v8 idefix2$ ../OPT.OBJ/js -j run.js
Richards: 5003
DeltaBlue: 2482
Crypto: 2904
RayTrace: 737
EarleyBoyer: 755
----
Score: 1821

with patch:
host-5-167:v8 idefix2$ ../OPT.OBJ/js -j run.js
Richards: 5013
DeltaBlue: 2642
Crypto: 3296
RayTrace: 782
EarleyBoyer: 756
----
Score: 1916
Attachment #462636 - Attachment is obsolete: true
Would be good to test on a newer v8, esp one that includes splay.
An email discussion with Andreas from yesterday:

On Aug 11, 2010, at 12:13 AM, Gregor Wagner wrote:

I found some wired behavior that doesn't go along with the current GC scheme.
I am debugging multithreaded mochitest in the browser.
I crash in finalizeArenaList because the freelist is not returned to the arena header and therefore I am finalizing also free objects.
The reason for not returning the freelist to the header comes from the fact that different contexts allocate arenas into the defaultCompartment (at the same time). I guess that is some race condition because I can't reproduce the crash all the time.

I added some printf functions to log when arenas are allocated for a compartment.

The default compartment has following address:
(gdb) x cx->runtime->defaultCompartment 
0x1054426c0

In my log i see:
context: 0x11d9b5520
arena 0x1215d4000, type: 0, comp: 0x1054426c0
context: 0x124b15970
arena 0x1215ce000, type: 0, comp: 0x1054426c0

So here 2 contexts allocated for the same Type 0 (JSObject) an arena for the default compartment.
I also log when an arena becomes full but this is not the case here.  

The next lines of the log show that the compartments are changed for the contexts:
context: 0x124b15970
arena 0x11da28000, type: 0, comp: 0x124bdd210
context: 0x11d9b5520
arena 0x11da2d000, type: 0, comp: 0x11db08790

So is this behavior a race condition when switching compartments or has this something to do with allocating new compartments?
We can't let different contexts using the default compartment at the same time.



Andreas replied:

Yeah, this is definitely true. We don't have a look in the allocator path again, so only one thread can access the default compartment at a time. We could have one default compartment per thread.

We should probably paste this discussion into a bug so it can be tracked.

Andreas
Depends on: 586534
Is this just because Gecko isn't fully compartmentalized yet (i.e., we are currently doing stuff in the default compartment that we don't really want to do there)?

Or is it because some runtime-wide atoms (which we put in the default compartment on purpose) are lazily allocated?

The latter issue seems pretty easy to knock down.
Whiteboard: [evang-wanted] → [evang-wanted][compartments]
Depends on: 586773
Blocks: mt-wrappers
Depends on: 587898
I added per compartment heap statistics. We still put a lot of gcthings into the defaultCompartment.
For a browser startup we allocate 29 object arenas and 56 function arenas.
Attached patch patch (obsolete) — Splinter Review
rebase + adding per compartment arena stats.
Attachment #464615 - Attachment is obsolete: true
Depends on: 589216
Attached patch patch (obsolete) — Splinter Review
new version.
Attachment #466883 - Attachment is obsolete: true
Depends on: 590319
Depends on: 590325
blocking2.0: beta5+ → beta6+
In the bug 584688 I have seen rather nice GC wins from moving finalization of GC things to the background thread. Suppose now that we can move finalization of practically all GC things to the background thread. Would the compartmental GC be still beneficial? AFAICS on a dual-core CPU it would not obviously speedup the finalization while not contributing to the faster marking cycle of the global GC/cycle collector. So perhaps we should focus on making the global GC mark parallel with using different threads to mark the compartments?
Blocks: 584789
Compartmentalization helps when marking a mostly live heap. We don't have many cross-compartment wrappers.

That brings up the main point of compartmentalization, in my view: wrapper-only refs among compartments.

We're not backing off on compartments, but we need more than a sweep-time win to have a competitive GC.

/be
Compartments can have all sorts of benefits for sweep time. Most importantly, with some more pieces in place we can collect all compartments that are not currently operated on in the background.
(In reply to comment #36)
> Compartments can have all sorts of benefits for sweep time.

The global GC also can have very tiny sweep time since most of the finalization can be done in the background with relatively small changes. So this cannot be an argument for the compartment GC.

> Most importantly,
> with some more pieces in place we can collect all compartments that are not
> currently operated on in the background.

But how that would benefit the marking time of the global GC that we would still need to run to collect cross-compartments references? That would be the biggest contributor to the pause time so the question is can the compartment GC help with that?
I really don't think this discussion is a good use of time.
> can the compartment GC help with that?

Yes, by making the global GC rarer.
Depends on: 594455
Attached patch patch (obsolete) — Splinter Review
After going through the code with Luke I changed some parts.
Attachment #467999 - Attachment is obsolete: true
Attachment #473966 - Flags: review?(gal)
I talked to Brendan about a possible defaultCompartment optimization today. If we don't leak other objects in there anymore we can avoid marking and sweeping the defaultCompartment during each GC (there are only atoms in there).
Attached file numbers
Some numbers for this patch in the attachment.
(In reply to comment #42)
> Created attachment 473977 [details]
> numbers
> 
> Some numbers for this patch in the attachment.

Are this for the browser or shell run?
(In reply to comment #42)
> Created attachment 473977 [details]
> numbers

The numbers from the bug 595048 indicates that SS win is most likely attributed to the inlining of js_NewFinalizableGCThing. This suggests to rebase the patch here on that bug and do the measurements again to have a clear picture of the performance impact.
(In reply to comment #44)
> (In reply to comment #42)
> > Created attachment 473977 [details] [details]
> > numbers
> 
> The numbers from the bug 595048 indicates that SS win is most likely attributed
> to the inlining of js_NewFinalizableGCThing. This suggests to rebase the patch
> here on that bug and do the measurements again to have a clear picture of the
> performance impact.

Why? Because that would delay everything for a few more days? I guess Brendan for the Scope patch or Luke for the fatval patch would have been happy if I tell them I want to land 10% of their patch a few days before they land their actual patch.


The tryserver turned orange from 3 days ago to yesterday. 
It seems I missed something with my last merge yesterday. Debugging....
Attached patch patch (obsolete) — Splinter Review
Tryserver is happy again.
Attachment #473966 - Attachment is obsolete: true
Attachment #474138 - Flags: review?(gal)
Attachment #473966 - Flags: review?(gal)
Regarding the numbers and merge pain: if the patch would use JS_NEVER_INLINE, not JS_ALWAYS_INLINE, for NewFinalizableGCThing, then it would allow to measure the real impact of *just the compartment GC* on the benchmark performance.
Why do we care?  We don't split other performance-improving patches up to see where the wins come, nor should we.  Take the win, then reprofile and look for whatever win is the next best bang-for-buck under the target load.
(In reply to comment #43)
> (In reply to comment #42)
> > Created attachment 473977 [details] [details]
> > numbers
> > 
> > Some numbers for this patch in the attachment.
> 
> Are this for the browser or shell run?

that was single-threaded shell

Browser numbers for the current patch:
(v8.googlecode.com/svn/data/benchmarks/v5/run.html)

current tip:
Score: 2522
Richards: 7291
DeltaBlue: 2823
Crypto: 2297
RayTrace: 1454
EarleyBoyer: 2518
RegExp: 1682
Splay: 2230

with this patch:
Score: 2821
Richards: 7378
DeltaBlue: 2986
Crypto: 2283
RayTrace: 1536
EarleyBoyer: 3248
RegExp: 1912
Splay: 2964
Ups these are the numbers with the different GC heuristics to avoid the big GC in the shell. I still have to look how affective our current GC heuristics with the browser version of v8 is.
Completely concur with #48. Don't rebase. This patch is ready for review, and should land asap. Its blocking final and a bunch of compartment bugs.
Comment on attachment 474138 [details] [diff] [review]
patch

>+                    cx->compartment = cx->runtime->defaultCompartment;
>                     key = js_NewString(cx, str->flatChars(), str->flatLength());
>+                    cx->compartment = comp;

An auto helper would be nicer here. JSAutoEnterCompartment ac(cx, cx->runtime->defaultCompartment) { key = ...; }

>+                    cx->compartment = cx->runtime->defaultCompartment;
>                     key = js_NewStringCopyN(cx, str->flatChars(), str->flatLength());
>+                    cx->compartment = comp;

And here.

>+#ifdef JS_THREADSAFE
>+    bool                defaultCompartmentIsLocked;
>+#endif

Isn't this DEBUG only?

>+#ifdef JS_THREADSAFE
>+        cx->runtime->defaultCompartmentIsLocked = true;
>+#endif

Debug only?

>+            for (JSCompartment **c = trc->context->runtime->compartments.begin(); c != trc->context->runtime->compartments.end(); ++c) {

This is a little long, even for my taste.

Awesome work Gregor. Can't wait to see this land as soon as the last defaultCompartment bug is fixed.
Attachment #474138 - Flags: review?(gal) → review+
(In reply to comment #48)
> Why do we care?  We don't split other performance-improving patches up to see
> where the wins come, nor should we.

The reason for the compartment GC was that it supposed to decrease our GC pause in the browser. It would be nice to see some numbers that confirm this like the average/worst GC times for a typical browser session exposing the true value of the work done.

Fundamentally this patch should not affect SS/V8 timing since GC does not happen during SS and for V8 it would be the single compartment that is responsible for for almost all garbage. So any benchmarks wins or loses should be investigated to see what exactly is going on as result changes could be due to the bugs.

That is why I wanted to see this rebased on top of the bug 595048 after the patch showed strong wins especially in SS. But if that is difficult, then this patch could just disable the inlining leaving that to a followup. After 10 minutes of doing that I have the following number in js shell:

SS - no essential changes (this is good!)
V8:

** TOTAL **:      -                 2213.2ms +/- 2.7%   2189.3ms +/- 2.6% 

=============================================================================

  v8:             -                 2213.2ms +/- 2.7%   2189.3ms +/- 2.6% 
    crypto:       *1.106x as slow*   307.4ms +/- 3.3%    339.8ms +/- 3.4%     significant
    deltablue:    *1.091x as slow*   361.0ms +/- 3.3%    393.7ms +/- 3.2%     significant
    earley-boyer: ??                 342.7ms +/- 2.6%    345.2ms +/- 2.7%     not conclusive: might be *1.007x as slow*
    raytrace:     ??                 324.9ms +/- 3.0%    326.6ms +/- 2.9%     not conclusive: might be *1.005x as slow*
    regexp:       *1.117x as slow*   242.4ms +/- 2.8%    270.9ms +/- 2.6%     significant
    richards:     *1.090x as slow*   231.1ms +/- 3.0%    251.8ms +/- 2.9%     significant
    splay:        1.55x as fast      403.7ms +/- 2.6%    261.2ms +/- 2.3%     significant

I.e. no total change (which is good), with some loses spread over benchmarks and a big win in splay.
Attached patch rebased (obsolete) — Splinter Review
This doesn't quite compile, because of some header dependencies.

I also had to move Brendan's emptyShape tracing into MarkChildren.

Then, I noticed that

MarkChildren(JSTracer *trc, JSObject *obj)
and
MarkChildren(JSTracer *trc, JSFunction *fun)

have mostly identical bodies. lame.

In file included from /Users/sayrer/dev/tracemonkey/js/src/jsgc.h:60,
                 from /Users/sayrer/dev/tracemonkey/js/src/jscntxt.h:63,
                 from /Users/sayrer/dev/tracemonkey/js/src/jsapi.cpp:60:
/Users/sayrer/dev/tracemonkey/js/src/jsscope.h: In static member function ‘static js::EmptyShape* js::EmptyShape::create(JSContext*, js::Class*)’:
/Users/sayrer/dev/tracemonkey/js/src/jsscope.h:621: error: ‘JS_PROPERTY_TREE’ was not declared in this scope
In file included from /Users/sayrer/dev/tracemonkey/js/src/jscntxt.h:63,
                 from /Users/sayrer/dev/tracemonkey/js/src/jsapi.cpp:60:
/Users/sayrer/dev/tracemonkey/js/src/jsgc.h: In function ‘void js::gc::MarkObject(JSTracer*, JSObject&, const char*)’:
/Users/sayrer/dev/tracemonkey/js/src/jsgc.h:1254: error: no matching function for call to ‘Mark(JSTracer*&, JSObject&)’
/Users/sayrer/dev/tracemonkey/js/src/jsgc.h: At global scope:
/Users/sayrer/dev/tracemonkey/js/src/jsgc.h:1014: warning: inline function ‘void js::gc::MarkObject(JSTracer*, JSObject*, const char*)’ used but never defined
make[1]: *** [jsapi.o] Error 1
Attached patch rebased, files added. (obsolete) — Splinter Review
missed added files last time.
Attachment #475952 - Attachment is obsolete: true
Attachment #474138 - Attachment is obsolete: true
Attachment #475953 - Attachment is obsolete: true
(In reply to comment #56)
> Created attachment 476092 [details] [diff] [review]
> compiles, but missing brendan's emptyShape check in MarkObject

I meant missing brendan's emptyShape check in /MarkChildren/
Attached patch rebased again (obsolete) — Splinter Review
Attachment #476092 - Attachment is obsolete: true
Attached patch patch (obsolete) — Splinter Review
Attachment #476330 - Attachment is obsolete: true
Attached patch patch (obsolete) — Splinter Review
Attachment #476342 - Attachment is obsolete: true
Attached patch patch (obsolete) — Splinter Review
Fix browser build. Tryserver is green. Is it safe to push?
Attachment #476347 - Attachment is obsolete: true
Blocks: 595574
Whiteboard: [evang-wanted][compartments] → [evang-wanted][compartments][ETA: 9/24]
Currently seeing a regression on the first half of v8:

v8-v5
               trunk      gc-patch
-----                     
Score:         3059       3080
Richards:      7556       7466       2% slower
DeltaBlue:     3705       3381       9% slower
Crypto:        3833       3048      20% slower
RayTrace:      1816       1737       5% slower
EarleyBoyer:   3553       3358      same
RegExp:        1572       1955      25% faster
Splay:         2301       2996      30% faster



v8-v6
               trunk      gc-patch
-----
Score:         1389       1317       
Richards:      4468       4152       8% slower
DeltaBlue:     1652       1454      12% slower
Crypto:        2139       1341      38% slower
RayTrace:      1092       1019       7% slower
EarleyBoyer:   1311       1282      same
RegExp:         961       1086      13% faster
Splay:          458        600      30% faster
Sounds to me we schedule the GC at different points.
(In reply to comment #63)
> Sounds to me we schedule the GC at different points.

I am on my way back to the US. I will take a look.
(In reply to comment #62)
> Currently seeing a regression on the first half of v8:
> 

I can't see the regression. I also use 10.6.
I added to my mozconfig:
ac_add_options --enable-gctimer
export CFLAGS="-g -gfull"
export CXXFLAGS="-g -gfull"

My numbers are for v8 v5(with a clean browser build):
trunk:
Score: 2736
Richards: 7316
DeltaBlue: 3249
Crypto: 2551
RayTrace: 1500
EarleyBoyer: 2959
RegExp: 1761
Splay: 2419

with patch:
Score: 3006
Richards: 7347
DeltaBlue: 3292
Crypto: 2561
RayTrace: 1741
EarleyBoyer: 3584
RegExp: 1854
Splay: 3100

I noticed very different numbers for the very first browser start after a  build. So I always restart the browser twice before I run the benchmarks.
Attached patch patchSplinter Review
rebase
Attachment #476760 - Attachment is obsolete: true
This is for references a patch to disable inlined allocation on top of the main patch. This is to see the effect of just the compartment GC on benchmarks so it is easier to spot an extra GC runs during benchmarks compared with the base.

Gregor, could you for references posts the scores with this patch?
(In reply to comment #65)
> (In reply to comment #62)
> > Currently seeing a regression on the first half of v8:
> > 
> 
> I can't see the regression. I also use 10.6.
> I added to my mozconfig:
> ac_add_options --enable-gctimer
> export CFLAGS="-g -gfull"
> export CXXFLAGS="-g -gfull"
> 
> My numbers are for v8 v5(with a clean browser build):
> trunk:
> Score: 2736
> Richards: 7316
> DeltaBlue: 3249
> Crypto: 2551
> RayTrace: 1500
> EarleyBoyer: 2959
> RegExp: 1761
> Splay: 2419
> 
> with patch:
> Score: 3006
> Richards: 7347
> DeltaBlue: 3292
> Crypto: 2561
> RayTrace: 1741
> EarleyBoyer: 3584
> RegExp: 1854
> Splay: 3100
> 
> I noticed very different numbers for the very first browser start after a 
> build. So I always restart the browser twice before I run the benchmarks.

The V8 benchmarks are extremely sensitive to how often GCs occur and when they do.  These numbers are showing a difference only for the regressions that do GCs for me in the shell: earley-boyer, raytrace and splay.  I'm running into the same issues in bug 584917, which also has a big effect on when we GC (I'll land that after this bug).  Would it make sense to land this, land 584917, and then (in a separate bug) tune the GC for the browser tests?  (the sunspider --v8-suite which AWFY is tracking is not too relevant, especially on splay).
Blocks: 584917
No longer depends on: 584197
(In reply to comment #68)
>   Would it make sense to land this, land 584917, and
> then (in a separate bug) tune the GC for the browser tests?  (the sunspider
> --v8-suite which AWFY is tracking is not too relevant, especially on splay).

Yes absolutely. The next thing I want to do is GC heuristics.
http://hg.mozilla.org/tracemonkey/rev/1c913526c597
Whiteboard: [evang-wanted][compartments][ETA: 9/24] → [evang-wanted][compartments][ETA: 9/24] fixed-in-tracemonkey
Blocks: 599503
http://hg.mozilla.org/tracemonkey/rev/a6232ac986de - followup to add missing explicit template instantiations for assureThingIsAligned. Without this fix GCC 4.4 on Linux gives link errors in DEBUG builds.
(In reply to comment #71)
> http://hg.mozilla.org/tracemonkey/rev/a6232ac986de - followup to add missing
> explicit template instantiations for assureThingIsAligned. Without this fix GCC
> 4.4 on Linux gives link errors in DEBUG builds.

Will making assureThingIsAligned an inline function accomplish this as well?  getAlignedThing seems to avoid these explicit template instantiations, and it's inline (I'm trying to get rid of explicit instantiations for bug 584917, which would otherwise double the number required).
I strongly discourage explicit instantiations, and would prefer to make the template bodies available (even if they aren't inline). For clarity, we can separate out the bodies into another file (SomeHeader-inl.h is the google standard that we're also following).
Depends on: 599773
Gregor: in the new world the arena header is allocated together with arena. What is the reason not to use separated array? Previously it was separated to avoid touching extra pages when working over arena lists which was visible in synthetic benchmarks or callgrind.
Gregor: before the patch js_AtomizeString unlocked the atom state when creating a new string. With the landed patch that change is gone so strings are allocated under the compartment lock. That may lead to a deadlock when string allocation triggers a last-ditch GC or when OOM is reported. Should I file a new bug on this?
Another question about atomization: why js_AtomizeString uses the default compartment when allocating new strings while already pre-allocated strings may live in any compartment? I.e. an operation like obj[String.fromCharCode(100, 101, 102, 103)] will put into the atoms table the string from the current compartment while eval("obj["+String.fromCharCode(100, 101, 102, 103)+"]") will use the default compartment. 

This non-default compartment atomization creates a permanent compartment leak if, for example, a later initialized chrome script or extension will have "defg" string embedded into into. The atom lookup will see that this string is atomized and will put into the compiled script the string from non-default compartment preventing that compartment from GC. 

To avoid that I think the atomization should dup strings from non-default compartments to ensure that everything lives in the default one.
Depends on: 600172
Depends on: 600173
(In reply to comment #75)
> Gregor: before the patch js_AtomizeString unlocked the atom state when creating
> a new string. With the landed patch that change is gone so strings are
> allocated under the compartment lock. That may lead to a deadlock when string
> allocation triggers a last-ditch GC or when OOM is reported. Should I file a
> new bug on this?

We release the lock once we trigger a last-ditch GC. 

There is still the defaultCompartmentIsLocked check involved. It is needed because we are chasing the last defaultcompartment mixups and it could otherwise lead to a deadlock if we allocate other things in the defaultCompartment. This should go soon.
(In reply to comment #77)
> We release the lock once we trigger a last-ditch GC. 

There is also an issue of reporting OOM under the lock. That should be avoided and for this reason in the patch for bug 600173 I have added AutoUnlockDefaultCompartment.
Blocks: 600234
No longer blocks: 600234
Depends on: 600234
Depends on: 600402
http://hg.mozilla.org/mozilla-central/rev/1c913526c597
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Depends on: 600593
No longer depends on: 600593
Depends on: 600687
Whiteboard: [evang-wanted][compartments][ETA: 9/24] fixed-in-tracemonkey → [evang-wanted][compartments][ETA: 9/24] fixed-in-tracemonkey[suspect-regress-ts_cold_generated_med]
A changeset from this bug was associated with a XP Ts, Cold MED Dirty Profile regression on Firefox. boo-urns :(

  Previous: avg 420.356 stddev 3.679 of 30 runs up to a9d1ad0bc386
  New     : avg 430.832 stddev 1.905 of 5 runs since a60414d076b5
  Change  : +10.475 (2.49% / z=2.847)
  Graph   : http://mzl.la/bZFaB3

The regression occurred from changesets in the following range:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=a9d1ad0bc386&tochange=a60414d076b5

The tag [suspect-regress-ts_cold_generated_med] has been added to the status whiteboard;
please remove it only once you have confirmed this bug is not the cause
of the regression.
Whiteboard: [evang-wanted][compartments][ETA: 9/24] fixed-in-tracemonkey[suspect-regress-ts_cold_generated_med] → [evang-wanted][compartments][ETA: 9/24] fixed-in-tracemonkey[suspect-regress-ts_cold_generated_med][suspect-regress-dromaeo_css]
A changeset from this bug was associated with a Win7 Dromaeo (CSS) regression on Firefox. boo-urns :(

  Previous: avg 2014.419 stddev 40.480 of 30 runs up to a9d1ad0bc386
  New     : avg 1901.610 stddev 12.432 of 5 runs since a60414d076b5
  Change  : -112.809 (-5.6% / z=2.787)
  Graph   : http://mzl.la/9gFu4n

The regression occurred from changesets in the following range:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=a9d1ad0bc386&tochange=a60414d076b5

The tag [suspect-regress-dromaeo_css] has been added to the status whiteboard;
please remove it only once you have confirmed this bug is not the cause
of the regression.
Whiteboard: [evang-wanted][compartments][ETA: 9/24] fixed-in-tracemonkey[suspect-regress-ts_cold_generated_med][suspect-regress-dromaeo_css] → [evang-wanted][compartments][ETA: 9/24] fixed-in-tracemonkey[suspect-regress-ts_cold_generated_med][suspect-regress-dromaeo_css][suspect-regress-dromaeo_jslib]
A changeset from this bug was associated with a Win7 Dromaeo (jslib) regression on Firefox. boo-urns :(

  Previous: avg 127.610 stddev 4.222 of 30 runs up to a9d1ad0bc386
  New     : avg 116.384 stddev 0.751 of 5 runs since a60414d076b5
  Change  : -11.226 (-8.8% / z=2.659)
  Graph   : http://mzl.la/bZu5UP

The regression occurred from changesets in the following range:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=a9d1ad0bc386&tochange=a60414d076b5

The tag [suspect-regress-dromaeo_jslib] has been added to the status whiteboard;
please remove it only once you have confirmed this bug is not the cause
of the regression.
A changeset from this bug was associated with a XP Dromaeo (CSS) regression on Firefox. boo-urns :(

  Previous: avg 2045.275 stddev 49.676 of 30 runs up to a9d1ad0bc386
  New     : avg 1936.120 stddev 13.937 of 5 runs since a60414d076b5
  Change  : -109.155 (-5.34% / z=2.197)
  Graph   : http://mzl.la/b0dlou

The regression occurred from changesets in the following range:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=a9d1ad0bc386&tochange=a60414d076b5

The tag [suspect-regress-dromaeo_css] has been added to the status whiteboard;
please remove it only once you have confirmed this bug is not the cause
of the regression.
A changeset from this bug was associated with a XP Dromaeo (jslib) regression on Firefox. boo-urns :(

  Previous: avg 129.703 stddev 4.099 of 30 runs up to a9d1ad0bc386
  New     : avg 117.954 stddev 0.660 of 5 runs since a60414d076b5
  Change  : -11.749 (-9.06% / z=2.866)
  Graph   : http://mzl.la/avZij4

The regression occurred from changesets in the following range:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=a9d1ad0bc386&tochange=a60414d076b5

The tag [suspect-regress-dromaeo_jslib] has been added to the status whiteboard;
please remove it only once you have confirmed this bug is not the cause
of the regression.
Depends on: compartmentGC
This is a tracker bug. It should not block b7. A dependent bug blocks b8.
Depends on: 606057
No longer depends on: 606057
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Not blocking beta7 per gal's last comment.
blocking2.0: beta7+ → beta8+
Closing this bug. Its dependencies will keep it alive for tracking purposes.
Status: REOPENED → RESOLVED
Closed: 14 years ago14 years ago
Resolution: --- → FIXED
blocking2.0: beta8+ → ---
I removed the blocking flag. There is no action that is planned or associated with this bug. If there is a regression or something similar that you want fixed, please file a new bug and re-nominate.
Depends on: 616927
No longer depends on: compartmentGC
Depends on: compartmentGC
No longer depends on: 600234
No longer depends on: 616927
You need to log in before you can comment on or make changes to this bug.