Closed Bug 331966 Opened 14 years ago Closed 13 years ago

GC: Allocating initial slots together with object

Categories

(Core :: JavaScript Engine, enhancement)

enhancement
Not set

Tracking

()

RESOLVED FIXED

People

(Reporter: igor, Assigned: igor)

References

Details

Attachments

(3 files, 14 obsolete files)

52.30 KB, patch
igor
: review+
Details | Diff | Splinter Review
1.94 KB, text/plain
Details
718 bytes, patch
igor
: review+
Details | Diff | Splinter Review
AFAICS each object always allocates at least JS_INITIAL_NSLOTS slots. Moreover, as follows from statistics from bug 331770, the vast majority of objects allocates exactly JS_INITIAL_NSLOTS slots. Thus the idea is to extend JSObjects struct with fixed-size array to hold JS_INITIAL_NSLOT slots and a pointer to extra dynamically-sized array. 

For the price of slightly more complex access to slots this would improve cache locality and number of allocations for most of objects.
Attached patch Proof of concept (obsolete) — Splinter Review
This is something that allows to browse and collect stats. The patch does not deal with the fact that with fat objects gcBytes is significantly higher.
Attached patch Updated patch (obsolete) — Splinter Review
Assignee: general → igor.bukanov
Attachment #216766 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attached patch Patch update (obsolete) — Splinter Review
Attachment #244205 - Attachment is obsolete: true
Attachment #247261 - Flags: review?(brendan)
Attached patch Implementation v1 (obsolete) — Splinter Review
The previous patch was empty
Attachment #247261 - Attachment is obsolete: true
Attachment #247262 - Flags: review?(brendan)
Attachment #247261 - Flags: review?(brendan)
Attached patch Implementation v1 (obsolete) — Splinter Review
Attachment #247262 - Attachment is obsolete: true
Attachment #247263 - Flags: review?(brendan)
Attachment #247262 - Flags: review?(brendan)
The patch did not change the start-up time for the optimized build of the browser from the current trunk. With and without the patch the best time was the same: 1.56 seconds.   
Depends on: 362668
Attached patch Implementation v2 (obsolete) — Splinter Review
This is a patch version built on top of the patch for bug 362668. This is untested so treat it as using bugzilla-as-a-backup.
Attachment #247263 - Attachment is obsolete: true
Attachment #247263 - Flags: review?(brendan)
Attached patch Ignore this (obsolete) — Splinter Review
The patch update to switch few more places to use STOBJ_ macros.
Attachment #247479 - Attachment is obsolete: true
Attachment #247548 - Flags: review?(brendan)
Comment on attachment 247548 [details] [diff] [review]
Ignore this

The right attachment for the wrong bug :(
Attachment #247548 - Attachment is obsolete: true
Attachment #247548 - Flags: review?(brendan)
Attachment #247548 - Attachment description: Implementation v3 → Ignore this
Attached patch Implementation v3 (obsolete) — Splinter Review
[The patch update that assumes that the patch from bug 362668 is applied.]

In this version I changed jsobj.c to always allocate extra slots using malloc and OBJ_GET_CLASS is always defined as STOBJ_GET_CLASS. The latter is possible as the class is immune and comes from the fixed slot.

With this patch the best start up time for the optimized browser build went from 0.66 to 0.59 s (10% improvement) on my Ubuntu 6.10 Pentium 1.1 GHz laptop.

The patch is generated using quilt and I will attach cvs diff if the patch from bug 362668 would be committed.
Attachment #247733 - Flags: review?(brendan)
Attached patch Implementation v4 (obsolete) — Splinter Review
[The patch update that assumes that the patch from bug 362668 is applied.]

In this version the code does not assume that the map->nslots is at least JS_INITIAL_SLOTS.
Attachment #247733 - Attachment is obsolete: true
Attachment #247743 - Flags: review?(brendan)
Attachment #247733 - Flags: review?(brendan)
Comment on attachment 247743 [details] [diff] [review]
Implementation v4


>+static uintN
>+GetGCListIndex(uintN type, uintN thingSize)
>+{
>+    JS_ASSERT(type < GCX_NTYPES);
>+    JS_ASSERT(thingSize < GC_NBYTES_MAX);
>+    JS_ASSERT(thingSize % sizeof(JSGCThing) == 0);
>+
>+    if (type == GCX_OBJECT) {
>+        JS_ASSERT(thingSize == sizeof(JSObject));
>+        return 0;
>+    }
>+
>+    return thingSize / sizeof(JSGCThing);
>+}

Why is this special case needed?  IOW, why can't the size classification be independent of GC-thing type, as it was with the macro before (to wit: (thingSize / sizeof(JSGCThing)) - 1)?

Nit: use size_t consistently throughout for sizes.

>@@ -2974,22 +2996,22 @@ restart:
> 
>     /*
>      * Sweep phase.
>      *
>      * Finalize as we sweep, outside of rt->gcLock but with rt->gcRunning set
>      * so that any attempt to allocate a GC-thing from a finalizer will fail,
>      * rather than nest badly and leave the unmarked newborn to be swept.
>      *
>-     * Finalize smaller objects before larger, to guarantee finalization of
>-     * GC-allocated obj->slots after obj.  See FreeSlots in jsobj.c.
>+     * JSObject are finalized first to guarantee finalization of GC-allocated
>+     * obj->slots after obj. See FreeSlots in jsobj.c.

obj->dslots.  But as this patch uses malloc/free (JS_ veneered), the final paragraph above is no longer true.

More detailed review in a bit, but would appreciate enlightenment on size to freelist special case for GCX_OBJECT.  If it's related to the obsolete comment about finalization order, it can be removed and the macro restored.

/be
Great to see the Ts numbers!  SpiderMonkey is finally getting some performance love, in time for Gecko 1.9.  More yet to come, too.

/be
(In reply to comment #12)
> Why is this special case needed?  IOW, why can't the size classification be
> independent of GC-thing type, as it was with the macro before (to wit:
> (thingSize / sizeof(JSGCThing)) - 1)?

Then it is necessary to finalize the list with object size first making an ugly code like in the initial patch:

     for (i = 0; i < GC_NUM_FREELISTS; i++) {
-        arenaList = &rt->gcArenaList[i];
-        nbytes = GC_FREELIST_NBYTES(i);
+        /*
+         * A hack to ensure that list containg objects is finalized first
+         * before all other lists.
+         */
+        arenaList = &rt->gcArenaList[i == 0
+                                     ? GC_FREELIST_INDEX(sizeof(JSObject))
+                                     : i == GC_FREELIST_INDEX(sizeof(JSObject))
+                                     ? 0
+                                     : i];
+        nbytes = arenaList->thingSize;
 
So I decided to use a separated list for objects since it also make exploiting GC hazards slightly harder since now it is less trivial to override the object storage with other memory. But if that is too controversial I will use the code from the initial patch. 

> 
> Nit: use size_t consistently throughout for sizes.
> 
> >@@ -2974,22 +2996,22 @@ restart:
> > 
> >     /*
> >      * Sweep phase.
> >      *
> >      * Finalize as we sweep, outside of rt->gcLock but with rt->gcRunning set
> >      * so that any attempt to allocate a GC-thing from a finalizer will fail,
> >      * rather than nest badly and leave the unmarked newborn to be swept.
> >      *
> >-     * Finalize smaller objects before larger, to guarantee finalization of
> >-     * GC-allocated obj->slots after obj.  See FreeSlots in jsobj.c.
> >+     * JSObject are finalized first to guarantee finalization of GC-allocated
> >+     * obj->slots after obj. See FreeSlots in jsobj.c.
> 
> obj->dslots.  But as this patch uses malloc/free (JS_ veneered), the final
> paragraph above is no longer true.

Right, but there is another problem not mentionned in comments: JSFunction is allocated as GCX_PRIVATE and its finalization depends on the fact that JSObject wrapping it is finalized first.
Ok, so new patch with accurate comments on why freelist 0 is special and only for object GC-things?

/be
I made a stupid mistake when running the start up tests. I had a FF 2.0 window opened on another desktop so the numbers show how fast is to open new window in FF 2.0 with a particular browsing history through starting a new trunk build process.

With no browsers running on empty desktop I got not so roisy pictire. The fastest startup time did drop from 1.54 to 1.51s, a 2% improvement. Then I repeat the tests with FF 2.0 running again, this time with a single frame/single tab config. With the trunk build process was able to open the window faster: the fastest time decreased with the patch from .39 to 0.37s or 5% improvement. 

But the numbers is nowhere near of those 10%... 
Attached patch Implementation v5 (obsolete) — Splinter Review
[The patch update that assumes that the patch from bug 362668 is applied.]

This is the version that does not use a separate list just for GCX_OBJECT but rather returns to the initial idea of finalizing the list with objects first for the minimality of changes.
Attachment #247743 - Attachment is obsolete: true
Attachment #247836 - Flags: review?(brendan)
Attachment #247743 - Flags: review?(brendan)
Comment on attachment 247836 [details] [diff] [review]
Implementation v5

>+/*
>+ * Ensure that GC-allocated JSFunction and JSObject would go to different
>+ * lists so we can easly finalize JSObject before JSFunction. See comments
>+ * in js_GC

Period at end of last sentence.

>     /*
>      * Sweep phase.
>      *
>      * Finalize as we sweep, outside of rt->gcLock but with rt->gcRunning set
>      * so that any attempt to allocate a GC-thing from a finalizer will fail,
>      * rather than nest badly and leave the unmarked newborn to be swept.
>      *
>-     * Finalize smaller objects before larger, to guarantee finalization of
>-     * GC-allocated obj->slots after obj.  See FreeSlots in jsobj.c.
>+     * Here we need to ensure that JSObject instances are finalized before GC-
>+     * allocated JSFunction so fun_finalize from jsfun.c can get the proper

s/JSFunction/&s/ -- or s/JSFunction/& instances/ ?

>+     * result from the call to js_IsAboutToBeFinalized. For that we simply
>+     * finalize the list containin JSObject first since the static assert at

typo for "containing"

>+     * the beginning of the file guarantee that JSFunction instances are

"guarantees"

>+     * allocated from a different list.
>      */
>     for (i = 0; i < GC_NUM_FREELISTS; i++) {
>-        arenaList = &rt->gcArenaList[i];
>-        nbytes = GC_FREELIST_NBYTES(i);
>+        arenaList = &rt->gcArenaList[i == 0
>+                                     ? GC_FREELIST_INDEX(sizeof(JSObject))
>+                                     : i == GC_FREELIST_INDEX(sizeof(JSObject))
>+                                     ? 0
>+                                     : i];

Scratching my head for a shorter way to compute finalization index here... Nothing comes to mind except table-driven alternative.

>+    /*
>+     * Always initialize slots beyond oslots so the GC would not see junk
>+     * values when it scans the array.
>+     */
>+    i = obj->nslots;
>+    obj->nslots = nslots;
>+    for (; i < nslots; i++)
>+        STOBJ_SET_SLOT(obj, i, JSVAL_VOID);

Put i = obj->nslots in for (; head and put obj->nslots = nslots after, for fewer source lines and just as good compiled code?

>     /*
>      * Root obj to prevent it from being collected out from under this call.
>-     * to js_NewObject.  AllocSlots can trigger a finalizer from a last-ditch
>+     * to js_NewObject.  ReallocSlots can trigger a finalizer from a last-ditch
>      * GC calling JS_ClearNewbornRoots. There's also the possibilty of things
>      * happening under the objectHook call-out further below.

Extra . at end of first line, or extra "to js_NewObject." at start of second.

Sentence starting in second line with "ReallocSlots can trigger..." now lies, strike it.

>@@ -2405,37 +2403,35 @@ js_NewObject(JSContext *cx, JSClass *cla
>         /* Leave parent alone.  Allocate a new map for obj. */
>         map = ops->newObjectMap(cx, 1, ops, clasp, obj);
>         if (!map)
>             goto bad;
>         obj->map = map;
> 
>         /* Let ops->newObjectMap set nslots so as to reserve slots. */
>         nslots = map->nslots;
>+        JS_ASSERT(nslots >= JSSLOT_PRIVATE);
>+        if (nslots > JS_INITIAL_NSLOTS) {
>+            obj->dslots
>+                = JS_malloc(cx, (nslots - JS_INITIAL_NSLOTS) * sizeof(jsval));

Prevailing style puts = on first line (visible in ReallocSlots part of patch).


> /*
>- * In the original JS engine design, obj->slots pointed to a vector of length
>- * JS_INITIAL_NSLOTS words if obj->map was shared with a prototype object,
>- * else of length obj->map->nslots.  With the advent of JS_GetReservedSlot,
>- * JS_SetReservedSlot, and JSCLASS_HAS_RESERVED_SLOTS (see jsapi.h), the size
>- * of the minimum length slots vector in the case where map is shared cannot
>- * be constant.  This length starts at JS_INITIAL_NSLOTS, but may advance to
>- * include all the reserved slots.
>+ * JSObject needs to record explicitly the number of available slots as
>+ * obj->map->nslot can not be used for that. With the advent of

obj->map->nslots -- and you might say why not, just to be clear: because obj->map may be a prototype object's map, shared by obj, where the prototype has many slots allocated but the unmutated proto-child has no dynamic slots of its own.

>+ * JS_GetReservedSlot, JS_SetReservedSlot, and JSCLASS_HAS_RESERVED_SLOTS
>+ * (see jsapi.h), the size of the minimum length slots vector in the case
>+ * where map is shared cannot be constant.
>  *
>- * Therefore slots must be self-describing.  Rather than tag its low order bit
>- * (a bit is all we need) to distinguish initial length from reserved length,
>- * we do "the BSTR thing": over-allocate slots by one jsval, and store the
>- * *net* length (counting usable slots, which have non-negative obj->slots[]
>- * indices) in obj->slots[-1].  All code that sets obj->slots must be aware of
>- * this hack -- you have been warned, and jsobj.c has been updated!
>+ * FIXME: since JSClass is stored in JSObject.fslots, one can alsways extract
>+ * the number of reserved in the case of the shared map. Thus it should be
>+ * possible to remove JSObject.nslots.

Hmm, not sure what to do about this.  We want to pack JSObject.  We'd also like to avoid recomputing (clasp->reserveSlots might be non-trivial in cost; anyway calling it more often begs questions about cost and invariance).  Maybe this is not a FIXME yet?

>+#define STOBJ_SLOTP_IMPL(obj, slot)                                           \
>+    (JS_ASSERT((uint32)(slot) < (obj)->nslots),                               \
>+     ((slot) < JS_INITIAL_NSLOTS                                              \
>+      ? (obj)->fslots + (slot)                                                \
>+      : (obj)->dslots + ((slot) - JS_INITIAL_NSLOTS)))

Unless we have a FOO, I say we don't need a FOO_IMPL.  In this case STOBJ_SLOTP is enough.

>+/*
>+ * Class is immune and comes from the fixed slot. Thus no locking is neccesary

s/immune/invariant/ (not immutable, but JSClass is immutable in practice, too -- but I think you mean invariant here).

"necessary" misspelled.

>+    /*
>+     * Null-check of i.script is required since the parser sets interpreted
>+     * very early.
>+     *
>+     * Here js_IsAboutToBeFinalized works since obj is finalized before

s/since/because/

>+     * JSFunction. See comments in js_GC() before the finalization loop.

Mega-nit: "js_GC before ..." is the prevailing style, no extra "()" needed.

>+     */
>     if (FUN_INTERPRETED(fun) && fun->u.i.script &&
>         js_IsAboutToBeFinalized(cx, fun))
>     {
>         script = fun->u.i.script;
>         fun->u.i.script = NULL;
>         js_DestroyScript(cx, script);
>     }
> }

/be
(In reply to comment #18)
> >+    i = obj->nslots;
> >+    obj->nslots = nslots;
> >+    for (; i < nslots; i++)
> >+        STOBJ_SET_SLOT(obj, i, JSVAL_VOID);
> 
> Put i = obj->nslots in for (; head and put obj->nslots = nslots after, for
> fewer source lines and just as good compiled code?

I did initially like that. The problem is that STOBJ_SET_SLOT asserts that i <= nslots. I think I just inline the macro here without the assert since the function manipulates the slots directly in any case. 

> >+ * FIXME: since JSClass is stored in JSObject.fslots, one can alsways extract
> >+ * the number of reserved in the case of the shared map. Thus it should be
> >+ * possible to remove JSObject.nslots.
> 
> Hmm, not sure what to do about this.  We want to pack JSObject.  We'd also like
> to avoid recomputing (clasp->reserveSlots might be non-trivial in cost; anyway
> calling it more often begs questions about cost and invariance).  Maybe this is
> not a FIXME yet?

Right, I forgot about JSClass.reservedSlots.

> 
> >+#define STOBJ_SLOTP_IMPL(obj, slot)                                           \
....
> Unless we have a FOO, I say we don't need a FOO_IMPL.  In this case STOBJ_SLOTP
> is enough.

I will drop STOBJ_SLOTP altogether.

> 
> >+/*
> >+ * Class is immune and comes from the fixed slot. Thus no locking is neccesary
> 
> s/immune/invariant/ (not immutable, but JSClass is immutable in practice, too
> -- but I think you mean invariant here).
> 

Right. Too bad that the same non-locking access can be provided for parent as there are public API to change it explicitly.
Attached patch Implementation v6 (obsolete) — Splinter Review
In this version I removed JSObject.nslots and instead store the number of allocated slots in JSObject.dslots[-1] in the same way as JSObject.slots does it currently. When in JSObject.dslots is null, the code assumes that for an object with a shared map the number of slots in use is JS_INITIAL_NSLOTS which the patch bumps to 6.

The only drawback as far as I can see is that for objects that use a shared map GC has to scan an extra fixed slot that would be most likely unused. 

Another chnage is that the patch updates gcBytes counting in jsgc.c so a fat objects still contribute to gcBytes in the same way as current thin objects do.
Attachment #247836 - Attachment is obsolete: true
Attachment #248163 - Flags: review?(brendan)
Attachment #247836 - Flags: review?(brendan)
Attached patch Implementation v7 (obsolete) — Splinter Review
In this version compared with v6 I mostly replaced the explicit references to obj->slots in comments by generic terms.
Attachment #248163 - Attachment is obsolete: true
Attachment #248165 - Flags: review?(brendan)
Attachment #248163 - Flags: review?(brendan)
It turned out that fatter objects caused an extra GC run during the browser startup. In particular, JS_MaybeGC triggers with the last patch GC 4 times, not 3 times as was before the patch. In all cases this triggered by "bytes > lastBytes + lastBytes / 3" condition from JS_MaybeGC: 

    if ((bytes > 8192 && bytes > lastBytes + lastBytes / 3) ||
        rt->gcMallocBytes >= rt->gcMaxMallocBytes) {
        JS_GC(cx);
    }

The condition comes from the observation that the arena fragmentation measured as the free list density (number of bytes in the free list / total number of bytes in GC arenas, see bug 312238 and comments in MaybeGC for details) is about 20%. This density is dominated by object arenas. 

With fatter objects the fragmentation drops. In particular, compiling with JS_METERGC and making sure that the stats is dumped on each GC, I got the following stats after the browser startup until it shows a window without tabs with a document without any JavaScript: 

GC allocation statistics:
ARENA LIST 0 (thing size 8):
                     arenas: 10
                 max arenas: 10
                     things: 9536
                 max things: 7416
                  free list: 355
          free list density: 3.5%
  average free list density: 1.9%
                   recycles: 20
        recycle/alloc ratio: 0.00
ARENA LIST 1 (thing size 16): NEVER USED
ARENA LIST 2 (thing size 24):
                     arenas: 14
                 max arenas: 14
                     things: 4429
                 max things: 3183
                  free list: 110
          free list density: 2.3%
  average free list density: 7.4%
                   recycles: 59
        recycle/alloc ratio: 0.01
ARENA LIST 3 (thing size 32):
                     arenas: 33
                 max arenas: 33
                     things: 7477
                 max things: 5072
                  free list: 558
          free list density: 6.6%
  average free list density: 13.0%
                   recycles: 154
        recycle/alloc ratio: 0.02
ARENA LIST 4 (thing size 40): NEVER USED
ARENA LIST 5 (thing size 48): NEVER USED
ARENA LIST 6 (thing size 56): NEVER USED
ARENA LIST 7 (thing size 64): NEVER USED
ARENA LIST 8 (thing size 72): NEVER USED
ARENA LIST 9 (thing size 80): NEVER USED
TOTAL STATS:
            bytes allocated: 526224
             alloc attempts: 24698
        alloc without locks: 21945
            total GC things: 21442
        max total GC things: 15671
             GC things size: 421848
allocation retries after GC: 0
        allocation failures: 0
         things born locked: 4
           valid lock calls: 417
         valid unlock calls: 25
       mark recursion depth: 0
     maximum mark recursion: 24
     mark C recursion depth: 0
   maximum mark C recursion: 14
      delayed scan bag adds: 0
   maximum GC nesting level: 0
potentially useful GC calls: 4
           useless GC calls: 0
  thing arenas freed so far: 0
     stack segments scanned: 3
stack segment slots scanned: 10
reachable closeable objects: 0
    max reachable closeable: 0
      scheduled close hooks: 0
  max scheduled close hooks: 0


The important numbers here is averages for the free list density:

ARENA LIST 0 (thing size 8):
                     arenas: 10
                 max arenas: 10
          free list density: 3.5%
  average free list density: 1.9%
ARENA LIST 2 (thing size 24):
                     arenas: 14
                 max arenas: 14
          free list density: 2.3%
  average free list density: 7.4%
ARENA LIST 3 (thing size 32):
                     arenas: 33
                 max arenas: 33
          free list density: 6.6%
  average free list density: 13.0%

where the list 3 is the list containing fat objects and the list 2 contains JSFunction allocations.

This is definitely less then 20% so the condition requires updating.
Attached patch Implementation v8 (obsolete) — Splinter Review
In this version of the patch I removed gcPrivateBytes. The fat objects change GC behaviour sufficiently so it is pointless to try to keep compatibility with embeddings that did not change their GC allocatiuon limits. Moreover, the checks in JS_MayBe that guard the GC call in a typical browser run,

    if ((bytes > 8192 && bytes > lastBytes + lastBytes /5) ||
        rt->gcMallocBytes >= rt->gcMaxMallocBytes) {
        JS_GC(cx);
    }

trigger the GC in almost all cases due to bytes > lastBytes + lastBytes /5 condition, not rt->gcMallocBytes >= rt->gcMaxMallocBytes. Thus to preserve compatibility it is necessary to alter that condition, not allocation size boundaries.

After some experimenting I have found out that changing that to bytes > lastBytes + lastBytes /3 ensures that MaybeGC calls GC on brpwser's startup the same 3 times with fat objects as with bytes > lastBytes + lastBytes /5 and thin objects. This is independently confirmed by stats (see bellow) from JS_GCMETER runs. The new stats show lesser free list density of 11-15% then 20-30% without the patch.

The patch also adds JS_GCMETER printing of the free list density averaged for all arena lists and fixes  JS_GCMETER to properly update JSGCArenaStats.(nthings|maxnthings) on each allocation. Currently allocations from the thrtead-local pool does not change   nthings|maxnthings leading to eroneous stats. 

With the same number of GC calls on the startup the patch changes the startup time for build from the current tip from 1.58 to 1.54 secons or 2.5% improvment.

The stats with the patch on the startup into single window/single tab configuration:

GC allocation statistics:
ARENA LIST 0 (thing size 8):
                     arenas: 8
                 max arenas: 8
                     things: 7109
                 max things: 7123
                  free list: 21
          free list density: 0.3%
  average free list density: 1.3%
                   recycles: 17
        recycle/alloc ratio: 0.00
ARENA LIST 1 (thing size 16): NEVER USED
ARENA LIST 2 (thing size 24):
                     arenas: 10
                 max arenas: 10
                     things: 3015
                 max things: 3043
                  free list: 36
          free list density: 1.1%
  average free list density: 9.5%
                   recycles: 55
        recycle/alloc ratio: 0.02
ARENA LIST 3 (thing size 32):
                     arenas: 20
                 max arenas: 20
                     things: 4705
                 max things: 4782
                  free list: 79
          free list density: 1.5%
  average free list density: 15.8%
                   recycles: 146
        recycle/alloc ratio: 0.02
ARENA LIST 4 (thing size 40): NEVER USED
ARENA LIST 5 (thing size 48): NEVER USED
ARENA LIST 6 (thing size 56): NEVER USED
ARENA LIST 7 (thing size 64): NEVER USED
ARENA LIST 8 (thing size 72): NEVER USED
ARENA LIST 9 (thing size 80): NEVER USED
TOTAL STATS:
            bytes allocated: 350816
             alloc attempts: 17075
        alloc without locks: 15179
            total GC things: 14829
        max total GC things: 14948
             GC things size: 279792
            total GC arenas: 38
    total free list density: 1.1%
  average free list density: 11.4%
allocation retries after GC: 0
        allocation failures: 0
         things born locked: 4
           valid lock calls: 327
         valid unlock calls: 23
       mark recursion depth: 0
     maximum mark recursion: 14
     mark C recursion depth: 0
   maximum mark C recursion: 12
      delayed scan bag adds: 0
   maximum GC nesting level: 0
potentially useful GC calls: 3
           useless GC calls: 0
  thing arenas freed so far: 0
     stack segments scanned: 0
stack segment slots scanned: 0
reachable closeable objects: 0
    max reachable closeable: 0
      scheduled close hooks: 0
  max scheduled close hooks: 0


The stats for multiple window/tabs after the browser runs (I entering this into this instance of the browser):
GC allocation statistics:
ARENA LIST 0 (thing size 8):
                     arenas: 43
                 max arenas: 48
                     things: 28918
                 max things: 48678
                  free list: 14681
          free list density: 33.3%
  average free list density: 28.2%
                   recycles: 52356
        recycle/alloc ratio: 0.10
ARENA LIST 1 (thing size 16):
                     arenas: 1
                 max arenas: 1
                     things: 3
                 max things: 3
                  free list: 6
          free list density: 1.2%
  average free list density: 1.2%
                   recycles: 0
        recycle/alloc ratio: 0.00
ARENA LIST 2 (thing size 24):
                     arenas: 56
                 max arenas: 56
                     things: 18328
                 max things: 18572
                  free list: 251
          free list density: 1.3%
  average free list density: 2.7%
                   recycles: 1886
        recycle/alloc ratio: 0.05
ARENA LIST 3 (thing size 32):
                     arenas: 160
                 max arenas: 169
                     things: 35349
                 max things: 41690
                  free list: 4331
          free list density: 10.6%
  average free list density: 12.3%
                   recycles: 23073
        recycle/alloc ratio: 0.09
ARENA LIST 4 (thing size 40): NEVER USED
ARENA LIST 5 (thing size 48): NEVER USED
ARENA LIST 6 (thing size 56): NEVER USED
ARENA LIST 7 (thing size 64): NEVER USED
ARENA LIST 8 (thing size 72): NEVER USED
ARENA LIST 9 (thing size 80): NEVER USED
TOTAL STATS:
            bytes allocated: 2400320
             alloc attempts: 901907
        alloc without locks: 809966
            total GC things: 82598
        max total GC things: 108943
             GC things size: 1802432
            total GC arenas: 260
    total free list density: 12.3%
  average free list density: 12.7%
allocation retries after GC: 0
        allocation failures: 0
         things born locked: 4
           valid lock calls: 5283
         valid unlock calls: 3841
       mark recursion depth: 0
     maximum mark recursion: 55
     mark C recursion depth: 0
   maximum mark C recursion: 41
      delayed scan bag adds: 0
   maximum GC nesting level: 0
potentially useful GC calls: 94
           useless GC calls: 0
  thing arenas freed so far: 80
     stack segments scanned: 7
stack segment slots scanned: 22
reachable closeable objects: 0
    max reachable closeable: 0
      scheduled close hooks: 0
  max scheduled close hooks: 0
Attachment #248165 - Attachment is obsolete: true
Attachment #248244 - Flags: review?(brendan)
Attachment #248165 - Flags: review?(brendan)
Comment on attachment 248244 [details] [diff] [review]
Implementation v8

English nits in JS_MaybeGC comments:
  Need "to" between "According" and "the stats from bug 331770".
  Want a comma between "bug 331770" and "F1 is about ..."

Misspelled "easly" in comment before jsgc.c JS_STATIC_ASSERT

jsscope.h comment on |uint32 slot| member of JSScopeProperty wants "abstract" before "obj slots vector".

Net number of slots grows [6, 9, 13, 19, 28, ...] while size of the malloc'ed obj->dslots grows [0, 4, 8, 14, 23, ...] words (check my math here).  Is it better to use binary exponential growth on gross number of dslots? [0, 4, 8, 16, 32, ...] e.g.

I will gladly r+ a nit-picked patch, esp. if it has re-rationalized dynamic slot allocation growth.

/be
Here for refrences the stats after the browser run for 16 hours with multiple windows/tabs:

GC allocation statistics:
ARENA LIST 0 (thing size 8):
                     arenas: 67
                 max arenas: 113
                     things: 32963
                 max things: 113959
                  free list: 35011
          free list density: 51.0%
  average free list density: 41.6%
                   recycles: 850393
        recycle/alloc ratio: 0.10
ARENA LIST 1 (thing size 16):
                     arenas: 1
                 max arenas: 1
                     things: 3
                 max things: 25
                  free list: 24
          free list density: 4.7%
  average free list density: 3.0%
                   recycles: 1
        recycle/alloc ratio: 0.04
ARENA LIST 2 (thing size 24):
                     arenas: 76
                 max arenas: 78
                     things: 22364
                 max things: 26015
                  free list: 3172
          free list density: 12.2%
  average free list density: 12.7%
                   recycles: 34602
        recycle/alloc ratio: 0.10
ARENA LIST 3 (thing size 32):
                     arenas: 221
                 max arenas: 266
                     things: 42102
                 max things: 65858
                  free list: 12706
          free list density: 22.5%
  average free list density: 24.4%
                   recycles: 464722
        recycle/alloc ratio: 0.11
ARENA LIST 4 (thing size 40): NEVER USED
ARENA LIST 5 (thing size 48):
                     arenas: 0
                 max arenas: 1
                     things: 0
                 max things: 1
                  free list: 0
          free list density: 0.0%
  average free list density: 0.0%
                   recycles: 0
        recycle/alloc ratio: 0.00
ARENA LIST 6 (thing size 56): NEVER USED
ARENA LIST 7 (thing size 64): NEVER USED
ARENA LIST 8 (thing size 72):
                     arenas: 0
                 max arenas: 1
                     things: 0
                 max things: 7
                  free list: 0
          free list density: 0.0%
  average free list density: 0.0%
                   recycles: 0
        recycle/alloc ratio: 0.00
ARENA LIST 9 (thing size 80): NEVER USED
TOTAL STATS:
            bytes allocated: 3369680
             alloc attempts: 14320411
        alloc without locks: 12876436
            total GC things: 97432
        max total GC things: 205865
             GC things size: 2147752
            total GC arenas: 365
    total free list density: 25.5%
  average free list density: 24.5%
allocation retries after GC: 0
        allocation failures: 0
         things born locked: 4
           valid lock calls: 82464
         valid unlock calls: 81164
       mark recursion depth: 0
     maximum mark recursion: 57
     mark C recursion depth: 0
   maximum mark C recursion: 43
      delayed scan bag adds: 0
   maximum GC nesting level: 0
potentially useful GC calls: 1429
           useless GC calls: 0
  thing arenas freed so far: 1144
     stack segments scanned: 10
stack segment slots scanned: 32
reachable closeable objects: 0
    max reachable closeable: 0
      scheduled close hooks: 0
  max scheduled close hooks: 0

That is, after the last GC JS heap was about 3MB with 25% of that not returned to the system due to a fragmentation. 

Now I need to shutdown it to test a patch upadate.
Attached patch Implementation v8b (obsolete) — Splinter Review
This version just addresses the comment nits.

I need to think what about that grouth changing strategy. The idea is to make sure that the total size of dslots in bytes is power of 2 so various modern malloc implemntations would allpcate the dynamic slots more tightly, righjt?
Attachment #248244 - Attachment is obsolete: true
Attachment #248244 - Flags: review?(brendan)
(In reply to comment #26)
> I need to think what about that growth changing strategy. The idea is to make
> sure that the total size of dslots in bytes is power of 2 so various modern
> malloc implementations would allocate the dynamic slots more tightly, right?

Right.  The old 1.5**n growth was not particularly well-founded.  Do you want to land this and address the growth function in a separate patch?

/be
Comment on attachment 248324 [details] [diff] [review]
Implementation v8b

>@@ -1925,31 +1925,31 @@ JS_MaybeGC(JSContext *cx)
>      * cell density currently and Sl and Fl are the size and the density
>      * right after GC. The density by definition is memory taken by free
>      * cells divided by total amount of memory. In other words, B and Bl
>      * are bytes and lastBytes with the new meaning and B*(1-F) and
>      * Bl*(1-Fl) are bytes and lastBytes with the original meaning.
>      *
>      * Our task is to exclude F and Fl from the last statement. According
>-     * the stats from bug 331770 Fl is about 20-30% for GC allocations
>-     * that contribute to S and Sl for a typical run of the browser. It
>-     * means that the original condition implied that we did not run GC
>-     * unless we exhausted the pool of free cells. Indeed if we still
>-     * have free cells, then B == Bl since we did not yet allocated any
>-     * new arenas and the condition means
>+     * to the stats from bug 331966 comment 23 Fl is about 11-15% for a

I would still favor a comma between "comment 23" and "F1 is about ...".

r+me with that.

/be
Attachment #248324 - Flags: review+
(In reply to comment #28)
> >-     * new arenas and the condition means
> >+     * to the stats from bug 331966 comment 23 Fl is about 11-15% for a
> 
> I would still favor a comma between "comment 23" and "F1 is about ...".

Sorry about it.

(In reply to comment #27)
> (In reply to comment #26)

> Do you want
> to land this and address the growth function in a separate patch?

That is an idea.
Blocks: 363557
Patch to commit with the last nit addressed.
Attachment #248324 - Attachment is obsolete: true
Attachment #248372 - Flags: review+
I committed the patch from comment 30 to the trunk:

Waiting for Emacs...Done
Checking in jsapi.c;
/cvsroot/mozilla/js/src/jsapi.c,v  <--  jsapi.c
new revision: 3.294; previous revision: 3.293
done
Checking in jscntxt.h;
/cvsroot/mozilla/js/src/jscntxt.h,v  <--  jscntxt.h
new revision: 3.130; previous revision: 3.129
done
Checking in jsdbgapi.c;
/cvsroot/mozilla/js/src/jsdbgapi.c,v  <--  jsdbgapi.c
new revision: 3.82; previous revision: 3.81
done
Checking in jsfun.c;
/cvsroot/mozilla/js/src/jsfun.c,v  <--  jsfun.c
new revision: 3.173; previous revision: 3.172
done
Checking in jsgc.c;
/cvsroot/mozilla/js/src/jsgc.c,v  <--  jsgc.c
new revision: 3.191; previous revision: 3.190
done
Checking in jsinterp.c;
/cvsroot/mozilla/js/src/jsinterp.c,v  <--  jsinterp.c
new revision: 3.307; previous revision: 3.306
done
Checking in jslock.c;
/cvsroot/mozilla/js/src/jslock.c,v  <--  jslock.c
new revision: 3.59; previous revision: 3.58
done
Checking in jsobj.c;
/cvsroot/mozilla/js/src/jsobj.c,v  <--  jsobj.c
new revision: 3.305; previous revision: 3.304
done
Checking in jsobj.h;
/cvsroot/mozilla/js/src/jsobj.h,v  <--  jsobj.h
new revision: 3.59; previous revision: 3.58
done
Checking in jspubtd.h;
/cvsroot/mozilla/js/src/jspubtd.h,v  <--  jspubtd.h
new revision: 3.75; previous revision: 3.74
done
Checking in jsregexp.h;
/cvsroot/mozilla/js/src/jsregexp.h,v  <--  jsregexp.h
new revision: 3.24; previous revision: 3.23
done
Checking in jsscope.h;
/cvsroot/mozilla/js/src/jsscope.h,v  <--  jsscope.h
new revision: 3.40; previous revision: 3.39
done
Checking in jsutil.h;
/cvsroot/mozilla/js/src/jsutil.h,v  <--  jsutil.h
new revision: 3.13; previous revision: 3.12
done
Checking in liveconnect/jsj_JavaObject.c;
/cvsroot/mozilla/js/src/liveconnect/jsj_JavaObject.c,v  <--  jsj_JavaObject.c
new revision: 1.43; previous revision: 1.42
done
Checking in xpconnect/src/xpcdebug.cpp;
/cvsroot/mozilla/js/src/xpconnect/src/xpcdebug.cpp,v  <--  xpcdebug.cpp
new revision: 1.17; previous revision: 1.16
done
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Attached file Trace of assert/crash
After updating to trunk, I crashed at startup. This bug looks suspicious. I'm running OS X 10.4.8, Firefox. I'll attach a stack trace here FYI.

It starts up, and ends with this output before being killed:

<snip>
*** Registering xpconnect components (all right -- a generic module!)
*** Registering xpconnect_test components (all right -- a generic module!)
*** Registering nsSoftwareUpdate components (all right -- a generic module!)
Assertion failure: oslots > JS_INITIAL_NSLOTS, at /Users/hakan/Programmering/mozilla/firefox/mozilla/js/src/jsobj.c:2269
Program received signal:  "SIGTRAP".
Looks like all OS X trunk tinderboxes are orange after this change as well.
My bad. Why Linux tolerates this???
(In reply to comment #33)
> Looks like all OS X trunk tinderboxes are orange after this change as well.
> 

A stupid mistake tolerated by Linux. Do you have a chance to check the patch within 5-10 minutes?  
Comment on attachment 248385 [details] [diff] [review]
Fix for the orange tinderboxes

I committed the fix as well.
Attachment #248385 - Flags: review+
The reference for the CVS commit of the last patch:

Checking in jsobj.c;
/cvsroot/mozilla/js/src/jsobj.c,v  <--  jsobj.c
new revision: 3.306; previous revision: 3.305
done
Flags: in-testsuite-
You need to log in before you can comment on or make changes to this bug.