Closed Bug 505315 Opened 13 years ago Closed 13 years ago

constructing GC free lists during finalization

Categories

(Core :: JavaScript Engine, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: gal, Assigned: igor)

References

Details

(Whiteboard: fixed-in-tracemonkey)

Attachments

(1 file, 7 obsolete files)

This has 2 goals:

- Make ST and MT builds more similar in behavior (easier to maintain and test).
- Allow inlining of the fast path (take item from local free list), whereas the slow path will not be inline (refill local free list). This will be a separate patch.

The patch doesn't slow down ST builds, and the extra memory use is negligible. MT builds are essentially binary-identical.
Attached patch patch (obsolete) — Splinter Review
Assignee: general → gal
Attachment #389538 - Flags: review?(brendan)
Attachment #389538 - Flags: review?(brendan) → review?(igor)
I think this is premature without an explicit second patch to guess the amount of extra work to maintain threadsafe/non-threadsafe builds.
What do you mean with extra work?
Igor, what do you mean with extra work? The patch removes differences between ST and MT builds and should make things easier to maintain.
Comment on attachment 389538 [details] [diff] [review]
patch

The current implementation of local free lists pools relies on JS_EndRequest that calls js_RevokeGCLocalFreeLists to release the local free lists and make them available to other contexts and threads. Single-threaded embedding does not call JS_BeginRequest/JS_EndRequest so there is no way to release the free lists for a context that is no longer active.

I think a better way to merge single and multi-threaded code paths would be to remove the local free lists pool altogether and put the local free lists into JSThreadData. The reason for introduction of the pool was the need to support some corner cases in multi-threaded, multi-runtime embeddings when the JSThread was shared among multiple JSRuntime instances. Since this is no longer the case, the free lists can be put into JSThreadData.
Attachment #389538 - Flags: review?(igor) → review-
Yeah, I agree. This should go into JSThreadData. We can mirror the values into cx when threads attach to a context if we are concerned about the indirection.
Attached patch patch (obsolete) — Splinter Review
Attachment #389538 - Attachment is obsolete: true
Attachment #390051 - Flags: review?(igor)
(In reply to comment #7)
> Created an attachment (id=390051) [details]
> patch

I meant to remove the pool completely and store the heads of free lists in JSThreadData. I will post here shortly the patch that removes exactly that. It should minimize the amount of differences between single and multi threaded builds.
Attached patch removal of thread-local pool (obsolete) — Splinter Review
The patch removes the thread-local list pools and stores the heads of the free lists directly in JSThread. With the patch the difference between single and multi threaded builds becomes rather minimal.

Moreover, the allocator can be made more uniform if the sweep phase of the GC under multi-threaded build will assemble for a particular arena not the free list but rather the lists of lists. In this way the allocator will avoid the extra scanning of the global free lists to turn its beginning into thread-local lists. Also AFAICS  this schema can use compare-and-swap, not the full lock, to grab new thread local lists from the global list-of-lists.
Great. I will hand over the bug to you then. I want 0 difference between ST and MT, as long it doesn't cost ST too much. My patch is a minimal speedup for ST, so I am pretty sure that's doable.
Assignee: gal → igor
Depends on: 506243
Depends on: 506315
Depends on: 506125
No longer depends on: 506315
Attachment #390051 - Flags: review?(igor)
Depends on: 521390
Attached patch free list per arena (obsolete) — Splinter Review
The patch makes the free list to span over the whole GC arena. This way the free list is assembled during the GC and stored in the arena's header. The patch also eliminates the free flag. Instead to distinguish during finalization phase between a free GC cell and an about-to-be-finalized GC tjing the patch consults the free list. If the thing is not there, it should be finalized.

With this change the allocation code does not need to update the flags and just needs to pick up the thing from the free list. This code is fully shared between single- and multi-threaded builds.

The patch is untested work in progress and depends on the patch from the bug 521390 comment 0.
Attachment #390051 - Attachment is obsolete: true
Attachment #390310 - Attachment is obsolete: true
Attached patch free list per arena v2 (obsolete) — Splinter Review
The new patch fixes various bugs in v1 so it now passes the shell tests. With it I observed 4% speedup with v8 benchmark.
Attachment #405576 - Attachment is obsolete: true
No longer depends on: 521390
Attached patch free list per arena v3 (obsolete) — Splinter Review
Here is a version of the patch that does not depend on the bug 521390 - that bug requires more work. 

Since the patch makes the free list to span the whole arena and the allocation does not need to access thing's flags, the patch no longer uses template specialization for GC thing allocation code.
Attachment #405788 - Attachment is obsolete: true
Attachment #405901 - Flags: review?(brendan)
Comment on attachment 405901 [details] [diff] [review]
free list per arena v3

>     union {
>-        jsuword     untracedThings;     /* bitset for fast search of marked
>-                                           but not yet traced things */
>-        bool        hasMarkedDoubles;   /* the arena has marked doubles */
>+        struct {
>+            JSGCThing   *freeList;
>+            jsuword     untracedThings;     /* bitset for fast search of marked
>+                                               but not yet traced things */
>+        } finalizable;
>+
>+        bool            hasMarkedDoubles;   /* the arena has marked doubles */
>     } u;

Could lose the 'u' name now -- C++ anonymous unions ftw.

>-const uint8 GCF_MARK   = JS_BIT(0);
>-const uint8 GCF_FINAL  = JS_BIT(1);
>-const uint8 GCF_LOCK   = JS_BIT(2);    /* lock request bit in API */
>+/* GC flag definitions, must fit in 8 bits. */
>+const uint8 GCF_MARK            = JS_BIT(0);
>+const uint8 GCF_LOCK            = JS_BIT(1); /* lock request bit in API */
>+const uint8 GCF_CHILDREN        = JS_BIT(2); /* GC things with children to be
>+                                                marked later. */
..12345678901234567890123456789012345678901234567890123456789012345678901234567890

Nit: too much horizontal whitespace before the =, crunching on the right into the comment. Suggest using 1 mod 4 column number virtual tabstops.

> #define THING_TO_ARENA(thing)                                                 \
>     (JS_ASSERT(!JSString::isStatic(thing)),                                   \
>      (JSGCArenaInfo *)(((jsuword) (thing) | GC_ARENA_MASK)                    \
>                        + 1 - sizeof(JSGCArenaInfo)))
> etc.

Would welcome a separate patch using inlines.

>+JSGCFreeLists::purge()
>+{
>+    /*
>+     * Return the free list back to the arena so the GC finalization would

s/would/will/

>+     * treat its GC things as free and will not run the finalizers over
>+     * unitialized bytes.

Could just lose "would treat its GC things as free and" without loss of meaning.

>      * Since the initial value of the gcLastBytes parameter is not equal to
>      * zero (see the js_InitGC function) the return value is false when
>      * the gcBytes value is close to zero at the JS engine start.

Pre-existing nit: comment wants rewrapping.

>+        while (!!(a = arenaList->cursor)) {

Why !! here? It's not needed AFAIK and it is not house style. != NULL after the parenthesized assignment is prevailing style, though. If you want to just double-parenthesize we should talk -- that's possible to lose any extra noise of the !! or != NULL kind, although I think exceptional style rules deserve attention-grabbing extra notation (we don't nest assignments overmuch as a general style rule).

>+#if defined JS_GC_ZEAL && defined JS_TRACER
>+        if (rt->gcZeal >= 1 && JS_TRACE_MONITOR(cx).useReservedObjects) {
>+            goto testReservedObjects;
>+        }

Nit: overbraced.

>+    JSGCThing **tailp = &head;
>+    jsuword thingptr = ARENA_INFO_TO_START(a);

Nit: thingp a better name?

>+   /*
>+    * Ensure at least nobjects objects are in the list. fslots[1] of each
>+    * object on the reservedObjects list is the length of the list from there.

Pre-existing nit: "to there" or "to this object", not "from there".

>+    */
>+   JSObject *&head = JS_TRACE_MONITOR(cx).reservedObjects;
>+   size_t i = head ? JSVAL_TO_INT(head->fslots[1]) : 0;
>+   while (i < nobjects) {
>+       JSObject *obj = js_NewGCObject(cx);
>+       if (!obj)
>+           return JS_FALSE;
>+       memset(obj, 0, sizeof(JSObject));
>+       /* The class must be set to something for finalization. */

Pre-existing nit: blank line before comment.

> #ifdef JS_GC_ASSUME_LOW_C_STACK

Have we tested with this enabled lately? We should rig up some automated tests if we don't have 'em. Or does the testsuite hit the normal stack limit?

[deleting - lines below:]
>     for (;;) {
>+        JS_ASSERT(a->list == arenaList);
>         JS_ASSERT(a->prevUntracedPage == 0);
>+        JS_ASSERT(a->u.finalizable.untracedThings == 0);
> 
>+        JSGCThing *freeList = NULL;
>+        JSGCThing **tailp = &freeList;
>         bool allClear = true;
>+        uint8 *flagp = THING_FLAGP(a, 0);
>+        JSGCThing *thing =
>+            reinterpret_cast<JSGCThing *>(ARENA_INFO_TO_START(a));
>+        JSGCThing *thingsend =
>+            reinterpret_cast<JSGCThing *>(ARENA_INFO_TO_START(a) +
>+                                          THINGS_PER_ARENA(sizeof(T)) *
>+                                          sizeof(T));
>+        JSGCThing* nextfree = a->u.finalizable.freeList
>+                              ? a->u.finalizable.freeList
>+                              : thingsend;

Nit: thingsEnd and nextFree to match freeList and allClear? The trailing p-for-pointer is the exception.

>+/*
>+ * Allocates a new GC thing. After a successful allocation
>+ * the caller must fully initialize the thing before calling any function that
>+ * can potentially trigger GC. This will ensure that GC tracing never sees junk
>+ * values stored in the partially initialized thing.

Comment wants rewrapping.

Looks great otherwise -- r=me with nits addressed. Thanks,

/be
Attachment #405901 - Flags: review?(brendan) → review+
Attached patch free list per arena v4 (obsolete) — Splinter Review
The patch addresses the nits from the comment 14.

(In reply to comment #14)
> > #ifdef JS_GC_ASSUME_LOW_C_STACK
> 
> Have we tested with this enabled lately? We should rig up some automated tests
> if we don't have 'em.

There are few tests that constructs an object graph that would contain a million or so stack frames if the GC would use recursion. So presumably we do have the test coverage with the default 512K stack limit.
Attachment #405901 - Attachment is obsolete: true
Attachment #406409 - Flags: review+
I change the title of the bug to reflect the essence of the changes in the patch.
Summary: TM: use cx-local freelists in the ST build as well → constructing GC free lists during finalization
http://hg.mozilla.org/tracemonkey/rev/487b81c753c0
Whiteboard: fixed-in-tracemonkey
Did you benchmark this patch? I am seeing a 20ms or so performance loss on the bench script. Its pretty noisy though. I will run a full SS test tomorrow.
(In reply to comment #17)
> http://hg.mozilla.org/tracemonkey/rev/487b81c753c0

I backed out the patch due to talos fails
Whiteboard: fixed-in-tracemonkey
(In reply to comment #18)
> Did you benchmark this patch? 

Yes - no difference with SS and 4% speedup with v8.
Here is the test case for the regression that caused Talos failure:

gcparam("maxMallocBytes", 1e6);

function thisTest() {
    
    var str = "";

    for ( var i = 0; i < 16384; i++ )
        str += 'a';
    gc();
    
    for ( var j = 0; j < 150; j++ ) {
        str.slice( 12000, -1 );
    }
    gc();
}

thisTest();

The reason for the crash is bad free list management with jit enabled and the following patch fixes it. But this also demonstrates that the shell test suite should be run with both jit and non jit shelss for tests that do not explicitly require jit/nonjit. Otherwise a Lot of coverage is lost.
Please add the test in comment 21 as js/src/trace-test/tests/basic/bug505315.js or some such name. Easy and it gets -j coverage.

/be
The new version fixes the bug in the free list management. The comments in the following delta should explain all:

     METER(cx->runtime->gcStats.arenaStats[thingKind].alloc++);
 
     JSGCThing **freeListp =
         JS_THREAD_DATA(cx)->gcFreeLists.finalizables + thingKind;
     JSGCThing *thing = *freeListp;
     JSRuntime *rt = cx->runtime;
-#ifdef JS_THREADSAFE
-    bool tooMuchMalloc = (rt->gcMaxMallocBytes - rt->gcMallocBytes <=
-                          JS_THREAD_DATA(cx)->gcMallocBytes);
-#else
-    bool tooMuchMalloc = (rt->gcMaxMallocBytes <= rt->gcMallocBytes);
-#endif
 #ifdef JS_TRACER
     bool fromTraceReserve = false;
 #endif
 
     for (;;) {
-        if (thing && !tooMuchMalloc) {
-            *freeListp = thing->link;
-            METER(astats->localalloc++);
-            break;
+        if (thing) {
+#ifdef JS_THREADSAFE
+            bool tooMuchMalloc = (rt->gcMaxMallocBytes - rt->gcMallocBytes <=
+                                  JS_THREAD_DATA(cx)->gcMallocBytes);
+#else
+            bool tooMuchMalloc = (rt->gcMaxMallocBytes <= rt->gcMallocBytes);
+#endif
+            if (!tooMuchMalloc || JS_ON_TRACE(cx)) {
+                *freeListp = thing->link;
+                METER(astats->localalloc++);
+                break;
+            }
+
+            /*
+             * We will try to run the GC in RefillFinalizableFreeList and need
+             * to put the free list starting in thing back to the arena.
+             * Without this, if the GC will be canceled, we will loose this
+             * free list when assigning after RefillFinalizableFreeList
+             * returns and eventually run finalizars on free GC things.
+             */
+            RestoreGCArenaFreeList(freeListp);
         }
 
 #if defined JS_GC_ZEAL && defined JS_TRACER
         if (rt->gcZeal >= 1 && JS_TRACE_MONITOR(cx).useReservedObjects)
             goto testReservedObjects;
 #endif
 
         thing = RefillFinalizableFreeList(cx, thingKind);
         if (thing) {
+            JS_ASSERT(!*freeListp);
             *freeListp = thing->link;
             break;
         }
 
 #ifdef JS_TRACER
         if (JS_TRACE_MONITOR(cx).useReservedObjects) {
 #ifdef JS_GC_ZEAL
           testReservedObjects:
 #endif
             JS_ASSERT(JS_ON_TRACE(cx));
-            JS_ASSERT(thingKind == FINALIZE_OBJECT ||
-                      thingKind == FINALIZE_FUNCTION);
+            JS_ASSERT(thingKind == FINALIZE_OBJECT);
             JSTraceMonitor *tm = &JS_TRACE_MONITOR(cx);
             thing = (JSGCThing *) tm->reservedObjects;
             JS_ASSERT(thing);
             tm->reservedObjects = JSVAL_TO_OBJECT(tm->reservedObjects->fslots[0]);
             fromTraceReserve = true;
             break;
         }
 #endif
Attachment #406409 - Attachment is obsolete: true
Attachment #406515 - Flags: review?(brendan)
Comment on attachment 406515 [details] [diff] [review]
free list per arena v5

>+            /*
>+             * We will try to run the GC in RefillFinalizableFreeList and need
>+             * to put the free list starting in thing back to the arena.

s/back to the arena/back into its arena/.

>+             * Without this, if the GC will be canceled, we will loose this

s/will/would/
s/loose/lose/

>+             * free list when assigning after RefillFinalizableFreeList
>+             * returns and eventually run finalizars on free GC things.

"would run" if it fits (rewrapping may help).

/be
Attachment #406515 - Flags: review?(brendan) → review+
landed with nits addressed - https://hg.mozilla.org/tracemonkey/rev/876738eb9cf0
Whiteboard: fixed-in-tracemonkey
http://hg.mozilla.org/mozilla-central/rev/487b81c753c0
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Depends on: 522775
(In reply to comment #18)
> Did you benchmark this patch? I am seeing a 20ms or so performance loss on the
> bench script. Its pretty noisy though. I will run a full SS test tomorrow.

The tinderbox shows a perf win on OS 10.5 after igor landed the follow up fix in bug 522775.
This bug causes GC crashes on the old trace-tests.js. Backing it out.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
I have a real hard time backing this patch out. There are 3 other patches by Igor on top of it that are interwoven. I might have to back out all 4.
Or do a hot fix -- what is the crash signature? Any diagnosis?

/be
We did a fix that hopefully is correct, but this patch was in there and broken for a week. Thats really not good. I wasn't able to back it out due to so much stacking of patches on top of it. If our fuzzer heroes find a massive GC safety regression in a landed patch, it has to be fixed right away, or backed out. Letting it sink in for a week is really not a good way to go about it.
Status: REOPENED → RESOLVED
Closed: 13 years ago13 years ago
Resolution: --- → FIXED
(In reply to comment #29)
> This bug causes GC crashes on the old trace-tests.js. Backing it out.

I suppose this is not the patch in this bug but rather the patch for the bug 523370 which lead to bug 523947. That bug has not made into m-c.
You need to log in before you can comment on or make changes to this bug.