Status

()

Core
JavaScript Engine
--
enhancement
RESOLVED FIXED
13 years ago
12 years ago

People

(Reporter: Igor Bukanov, Assigned: Igor Bukanov)

Tracking

Trunk
Points:
---
Bug Flags:
in-testsuite -

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(2 attachments, 11 obsolete attachments)

24.19 KB, patch
brendan
: review+
Details | Diff | Splinter Review
24.18 KB, patch
Details | Diff | Splinter Review
(Assignee)

Description

13 years ago
The current implementation of GC uses JSArena API to allocate chunks of memory
for GC things. I believe in many cases it causes unnecessary memory overhead. 

For example, GC often picks up arenas that contains hundreds of extra bytes and
locks them for long periods. In addition there is overhead of generic arena
structures which requires 4 words plus alignment overhead when GC implementation
effectively uses only one word and does not need any particular alignment
requirements.

Given that I suggest to replace JAArena calls by standard malloc/free to
minimize memory overhead and to release unused memory to the rest of system as
early as possible.
(Assignee)

Comment 1

13 years ago
Created attachment 183837 [details] [diff] [review]
Implementation: patch against CVS head from 2005-05-17

The patch adds JSGCAreaList and JSGCArea to implement functionality previously
provided by JSArena API. JSGCArea represents GC chunk and besides the data
itself contains just one field which is next pointer to implement area list.
JSGCAreaList holds list head of JSGCArea, free list of GC thinks and lastLimit
field. That field holds the limit of of GC things in the last arena.

The rest of changes is a straightforward adaptation of the existing code to use
new structures and malloc/free for memory management.
(Assignee)

Comment 2

13 years ago
Created attachment 191130 [details] [diff] [review]
Patch update to reflect HEAD changes
Attachment #183837 - Attachment is obsolete: true
(Assignee)

Comment 3

13 years ago
Comment on attachment 191130 [details] [diff] [review]
Patch update to reflect HEAD changes

In addition to saving memory and more informative  printouts with JS_GCMETER
defined, the patch would also simplify the patch from bug 280844 attachment
177334 [details] [diff] [review].

The later is out-of-sync and I would like to know if this patch would be
accepted before bringing attachment 177334 [details] [diff] [review] up to the CVS HEAD.

And the patch also fixes bug 285580.
Attachment #191130 - Flags: review?(shaver)
Attachment #191130 - Flags: review?(shaver) → review?(brendan)
Igor, sorry I haven't had time for these patches.  I'm thinking we should do
something that avoids spending 1 flag byte per JSGCThing (8 bytes typically). 
Your point about jsarena.h not adding value here is well-taken, but I would
raise your bet and argue that we should use a radix tree to map arbitrary
malloc'ed areas to flag/type/etc. bookkeeping.  I would like to do that for 1.9.

/be
Assignee: general → brendan
(Assignee)

Comment 5

13 years ago
(In reply to comment #4)
> I'm thinking we should do
> something that avoids spending 1 flag byte per JSGCThing (8 bytes typically). 

But all the bits in the flag byte are currently used. I do not know how frequent
2 lock bits are accessed, but the rest 6 AFAICS are all accessed very often. Or
do you mean that the current implementation uses one extra byte per each
sizeof(JSGCThing) bytes even if gc arena stores bigger then sizeof(JSGCThing)?

> Your point about jsarena.h not adding value here is well-taken, but I would
> raise your bet and argue that we should use a radix tree to map arbitrary
> malloc'ed areas to flag/type/etc. bookkeeping.

But would digital trees effectively imitate the system page tables that already
maps virtual address to the physical one? Why not to take advantage of that and
simply use mmap (or VirtualAlloc on Windows) to allocate system pages? Since the
pages are always aligned on the page boundary the current code to use 1K GC_PAGE
to locate arena start would be unnecessary and one can use pretty arbitrary
layout of flags compressing them if necessary.

Another advantages of mmap is that when the page unmapped any access to it would
trigger access violation making exposing bugs earlier. It also eliminates the
overhead of malloc and would return unused memory to the system. 

The price for this is extra system calls, but at least on Linux these are fast.
(Assignee)

Comment 6

13 years ago
Created attachment 196248 [details] [diff] [review]
Proof of concept: mmap for GC chunks

The patch replaces the gc arena by using single system page per chunk of GC
things. This is a proof of concept implementation without much comments that
implements system page allocation only for Posix systems. It uses one flag byte
per each allocated GC thing no matter what its size is.


For code like:

var N = 100*1000;

for (var step = 0; step != 3; ++step) {
	for (var i = 0; i != N; ++i) {
		new Object();
	}
	gc();
}

on my old notebook with Linux the overhead of the patch is about 3-5% depending
on the system load, but it does removed about 200K from 40MB Firefox image.

Comments?

Updated

13 years ago
Flags: testcase-
(Assignee)

Comment 7

13 years ago
Comment on attachment 196248 [details] [diff] [review]
Proof of concept: mmap for GC chunks

This is too preliminary. Lets consider arena-less GC first.
Attachment #196248 - Attachment is obsolete: true
(Assignee)

Comment 8

13 years ago
Created attachment 208311 [details] [diff] [review]
Updated patch to reflect CVS from 2006-01-12

Besides tracking CVS changes this version of the patch uses JS_malloc/JS_free for GC pages.

With this patch in place it would be simpler to update the patch from bug 280844 and to allow to find the runtime pointer from the pointer to GC thing so deflated_string_cache can be moved to JSRuntime without breaking API compatibility.
Assignee: brendan → igor.bukanov
Attachment #191130 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #208311 - Flags: review?(brendan)
Attachment #191130 - Flags: review?(brendan)
(In reply to comment #8)
> and to allow to find the runtime pointer from the pointer to GC thing so
> deflated_string_cache can be moved to JSRuntime without breaking API
> compatibility.

What's the motivation for this?  Either way, a lock is needed.  Most embeddings have one runtime per process, so there's no reduction in contention potential in the typical case.  The global deflated_string_cache could, on the plus side, help a multi-runtime embedding share deflated strings across runtimes.

/be
(Assignee)

Comment 10

13 years ago
(In reply to comment #9)
> What's the motivation for this?  Either way, a lock is needed.  Most embeddings
> have one runtime per process, so there's no reduction in contention potential
> in the typical case.

With per-runtime string cache the code can use rt->gcLock thus during string finalization no extra lock would be taken. It would also allow to GC-allocate hash entries and perhasps even deflated bytes avoiding malloc/free overhead in most cases. 



  The global deflated_string_cache could, on the plus side,
> help a multi-runtime embedding share deflated strings across runtimes.
> 
> /be
> 

(Assignee)

Comment 11

13 years ago
(In reply to comment #9)
> The global deflated_string_cache could, on the plus side,
> help a multi-runtime embedding share deflated strings across runtimes.

How this sharing can be organized? Given that deflated bytes are hashed based on string address, one has to share JSString itself for that to succeed. This require that string allocation should be somehow controlled to look for shared string with the same bytes instead of allocating new JSString instances and I just do not see how this can be done.

Or is it just about atomized strings when at some point one enumerates atoms from one runtime to lock and then atomize corresponding string into another runtime? Has anybody done it? 

> 
> /be
> 
I whiffed that answer -- you're right, no sharing unless someone takes extra steps, since the key in the deflated string cache is the JSString *.  So there's no benefit, just the cost of extra locking during finalization.  That is a lose, although it hasn't popped out on profiles, and again most embeddings have exactly one runtime.

/be
(Assignee)

Comment 13

13 years ago
Created attachment 208523 [details] [diff] [review]
Updated patch to reflect CVS from 2006-01-14
Attachment #208311 - Attachment is obsolete: true
Attachment #208523 - Flags: review?(brendan)
Attachment #208311 - Flags: review?(brendan)
Comment on attachment 208523 [details] [diff] [review]
Updated patch to reflect CVS from 2006-01-14

One suggestion to reduce the size of the patch: continue to use "arena" in comments, GC_ARENA_SIZE, etc. in macro names.  Just change the type to JSGCArenaPool from JSArenaPool, or JSGCArenaList if that's a better name.

>- * The overhead of this scheme for most platforms is (16+8*(8+1))/(16+9K) or
>- * .95% (assuming 16 byte JSArena header size, and 8 byte JSGCThing size).
>+ * The overhead of this scheme for most platforms is (4+8*(8+1))/(4+9K) or
>+ * 0.82% (assuming 4 byte JSGCArea header size, and 8 byte JSGCThing size).

Of course as you note the average overhead now is worse, since unusable flag bytes are allocated to all things of size > the minimum thing size.  But I agree we should conquer that problem in a subsequent patch.

>+struct JSGCArea
>+{

Nit: open brace on same line as struct tag.

>+    a->next = areaList->last;
>+    areaList->last = a;

Nit: prev instead of next, since the list links to previously allocated areas from the last (i.e., the most recent) one?

>+static void
>+destroy_gc_area(JSContext *cx, JSGCAreaList *areaList, JSGCArea **ap)

Nit: prevailing style here and below would name these static functions gc_..._area, not ..._gc_area.

>+{
>+    JSGCArea *a;
>+
>+    a = *ap;
>+    JS_ASSERT(a);
>+    METER(--areaList->stats.nareas);
>+    if (a == areaList->last) {

Minor nit: prevailing style does not brace single-statement-on-single-line then and else clauses.

>+        areaList->lastLimit = (a->next) ? GC_THINGS_SIZE : 0;
>+    }
>+    *ap = a->next;

[snip...]

>+static void
>+finish_gc_area_list(JSGCAreaList *areaList)
>+{
>+    while (areaList->last) {
>+        destroy_gc_area(NULL, areaList, &areaList->last);
>+    }
>+    areaList->freeList = NULL;
>+}

Same nits as above.

>+
>+static JSGCThing *
>+gc_new_thing(JSContext *cx, JSGCAreaList *areaList, size_t nbytes)
>+{
>+    JSGCArea *a;
>+    size_t limit;
>+    JSGCThing *thing;
>+
>+    a = areaList->last;
>+    if (!a || GC_THINGS_SIZE == areaList->lastLimit) {
>+        /*
>+         * Allocate new area if there are no areas or the last area
>+         * is fully used.
>+         */
>+        a = gc_new_area(cx, areaList, nbytes);
>+        if (a == NULL)

Nit: prevailing style tests if (!a).

>+            goto fail;

No need for fail: code and this goto, since code at fail: is just return NULL.

>+        JS_ASSERT(areaList->last == a);
>+    }
>+    limit = areaList->lastLimit;
>+    if ((limit & GC_PAGE_MASK) == 0) {
>+        /*
>+         * Skip JSGCPageInfo record located at GC_PAGE_SIZE
>+         * boundary.
>+         */
>+        limit += PAGE_THING_GAP(nbytes);
>+    }

Nit: extra blank line here.

>+    /*
>+     * Assert that at this point space for things should exist.
>+     */
>+    JS_ASSERT(limit + nbytes <= GC_THINGS_SIZE);
>+    areaList->lastLimit = limit + nbytes;
>+    thing = (JSGCThing *)(FIRST_THING_PAGE(a) + limit);
>+
>+    METER(++areaList->stats.nthings);
>+    METER(areaList->stats.maxthings
>+          = JS_MAX(areaList->stats.nthings, areaList->stats.maxthings));
>+
>     return thing;
>+
>+fail:
>+    return NULL;
> }

[snip...]

>@@ -351,21 +436,37 @@ js_InitGC(JSRuntime *rt, uint32 maxbytes
> JS_FRIEND_API(void)
> js_DumpGCStats(JSRuntime *rt, FILE *fp)
> {
>-    uintN i;
>+    unsigned i;
>+    unsigned long totalThinks, totalMaxThinks, totalBytes;

s/Think/Thing/g here and below.

[snip...]

>@@ -383,7 +484,7 @@ js_DumpGCStats(JSRuntime *rt, FILE *fp)
>     fprintf(fp, "   maximum GC nesting level: %lu\n", ULSTAT(maxlevel));
>     fprintf(fp, "potentially useful GC calls: %lu\n", ULSTAT(poke));
>     fprintf(fp, "           useless GC calls: %lu\n", ULSTAT(nopoke));
>-    fprintf(fp, "  thing arenas freed so far: %lu\n", ULSTAT(afree));
>+    fprintf(fp, "  thing areas freed so far: %lu\n", ULSTAT(afree));

Given the arena=>area name change, this last fprintf needs another leading space to make the : line up with other lines.  But if you restore the "arena" to keep change minimized, and co-opt the fine Arena name by prefixing it with GC (as in JSGCArena, JSGCArenaList), then this line shouldn't be patched at all.

Sorry for the nits, I think this is an improvement.  But, as I've noted before, I wonder whether we ought not make a bigger change.  Here's a sketch:

/*
 * 32-bit address interpretation
 * +---------+----------+----------+---+
 * |sn[31:23]|pn[22:13] |tn[12:t+1]|t:0|
 * +---------+----------+----------+---+
 * sn: segment number, 9 bits
 * pn: page number, 10 bits
 * tn: thing number, (12-t) bits
 * (t+1) tag bits
 *
 * So the segment table is 2K on 32-bit architectures (XXX 64-bit is TBD).
 * It is sparse, with an entry i allocated only when one or more GC-things
 * have segment number i, and unallocated entries containing null.  Each  * JSGCSegment's pageInfo array is likewise sparse with null holes.
 *
 * XXX can't assume coherent update on SMP, need PR_AtomicSet or similar...
 */
struct JSGCSegment {
    JSGCPageInfo *pageInfo[GC_PAGES_PER_SEGMENT];
};

JSGCSegment *js_SegmentTable[GC_NUM_SEGMENTS];

struct JSGCPageInfo {
    JSRuntime   *runtime;
    uint16      thingsize;
    uint16      nthings;
    uint8       flags[1];       /* JS_HOWMANY(GC_PAGE_SIZE, thingsize) */
};

uint8 *
js_GetGCThingFlags(void *thing)
{
    jsuword addr;
    JSGCSegment *seg;
    JSGCPageInfo *pi;

    /*
     * shift | load
     * shift | mask | add | load
     * mask | (shift or div) | add
     */
    addr = (jsuword) thing;
    seg = &js_SegmentTable[addr >> GC_SEGMENT_SHIFT];
    pi = seg->pageInfo[(addr >> GC_PAGE_SHIFT) & GC_PAGE_MASK];
    return pi->flags + ((addr & GC_PAGE_MASK) / pi->thingsize);
}

Idea follows your lead: use mmap or equivalent to allocate true OS page multiples, in 8K GC_PAGE_SIZE units.  Use sparse segment and page info tables to associate things in pages with their 1-byte flags.  Keep compatibility with APIs, allow multiple runtimes/GC-heaps in one address space, pave way for GC'd deflated string table.  Thoughts?

/be
(In reply to comment #0)
> For example, GC often picks up arenas that contains hundreds of extra bytes and
> locks them for long periods.

Is this still true?  JS_ArenaAllocate was changed a while ago to do exact fit allocation from its freelist.  Of course, we should get rid of the layering on jsarena.h here, I agree.  But is it still the case that hundreds of extra bytes can be wasted?

/be
(Assignee)

Comment 16

13 years ago
(In reply to comment #15)
> Is this still true?  JS_ArenaAllocate was changed a while ago to do exact fit
> allocation from its freelist.  Of course, we should get rid of the layering on
> jsarena.h here, I agree.  But is it still the case that hundreds of extra bytes
> can be wasted?

That came from printing the size of allocated arena which allocated for some reason sometimes 100+ bytes indeed despite the changed freelist code. I did not know at that time that it was not JSArea overhead per see but rather a bug during GC initialization containing:

     for (i = 0; i < GC_NUM_FREELISTS; i++) {
         JS_InitArenaPool(&rt->gcArenaPool[i], "gc-arena", GC_ARENA_SIZE,
                        GC_FREELIST_NBYTES(i));
     }

That forced arena code to allocate GC_ARENA_SIZE aligned on GC_FREELIST_NBYTES(i) boundary that can be up to 10*sizeof(JSGCThing) bytes forcing arena->mask to be 127 or (255 on 64 bits platforms). A simple fix replacing GC_FREELIST_NBYTES(i) by 1 would address it.
(Assignee)

Comment 17

13 years ago
Created attachment 208556 [details] [diff] [review]
Addressing the above nits
Attachment #208523 - Attachment is obsolete: true
Attachment #208556 - Flags: review?(brendan)
Attachment #208523 - Flags: review?(brendan)
(Assignee)

Comment 18

12 years ago
(In reply to comment #14)
> But, as I've noted before,
> I wonder whether we ought not make a bigger change.

IMO it is better to have a series of smaller bugs/patches if each can bring benefits on its own. Plus additional GC changes require non-arena allocation and this self-contained patch besides reducing arena overhead would make the subsequent patches smaller.

>  Here's a sketch:
...

What about using typed areas instead? Currently from 8 bits of GC flags 4 are used for thing type and 2 for locks which are applicable only for doubles and non-dependent strings. Thus if objects would be allocated from a separated list, only 2 GC-related bits are necessary. 

For GC-allocated object slots even those 2 bits are not necessary. Since the slot array is not aliased, when the object itself is freed the array can simply be added to a free list immediately. Ideally one need just a pointer to the list of free objects per arena to free the arena when the list becomes empty.

An extra benefit of typed arenas is that it would much harder to exploit unrooted-gc-things bugs since an evil script would not be able to re-interpret string as doubles and vice-versa. 

On the other hand typed GC arenas would require more arenas lists leading to waste of memory when a particular allocation/GC sequence would place one thing of particular type per arena. So a smaller arena should be used with this scheme.

But all of this would mean that one would not able to use the current trick of allocating 9K and using 8 1K-aligned pages for things and the rest for flags so aligned/paged allocation must be used. But that is for a separated bug.
(In reply to comment #16)
> A simple fix replacing GC_FREELIST_NBYTES(i) by 1 would address it.

Argh, that was supposed to be sizeof(JSGCThing)!  I should fix that ASAP on the trunk and for JS1.6RC1.  I've filed bug 323529 and will attach a minimal patch there.

/be

(In reply to comment #18)
> But all of this would mean that one would not able to use the current trick of
> allocating 9K and using 8 1K-aligned pages for things and the rest for flags so
> aligned/paged allocation must be used.

We could benefit from the PR_MapFile porting work from NSPR, but we would not want to unfork SpiderMonkey's NSPR files (too many embedders do not want any other library dependency, and some don't want JS_THREADSAFE).  Just a thought.

Yes, freelists are segregated by size class currently, which is not the same as GC-thing type as you note.  Fragmentation internal to each class's arena or area, underutilizing memory, is a possible problem.  But the literature is full of data on this general approach.

The current 9K=(1K flags + 8K things) trick is pretty tight, especially in terms of instructions in js_GetGCThingFlags (except for that darn branch -- there are some Henry Massalin type hacks I could use there on some architectures to avoid the if statement).  It seems from your pioneering work here (attachment 196248 [details] [diff] [review]) that we'll have some performance penalty with any change.  We'll have to evaluate as we go -- another reason to patch incrementally.

I'll review the reduced patch you attached shortly.

/be
Comment on attachment 208556 [details] [diff] [review]
Addressing the above nits

>+++ src/jsgc.c
>@@ -41,17 +41,16 @@
>  * JS Mark-and-Sweep Garbage Collector.
>  *
>  * This GC allocates only fixed-sized things big enough to contain two words
>- * (pointers) on any host architecture.  It allocates from an arena pool (see
>- * jsarena.h).  It uses an ideally parallel array of flag bytes to hold the
>- * mark bit, finalizer type index, etc.
>+ * (pointers) on any host architecture.  It allocates from a special GC arena
>+ * pool with each arena allocated using JS_malloc.  It uses an ideally parallel

s/JS_malloc/malloc/ (see below).  The first sentence of this paragraph has been out of date since I added more GC-thing types.  I'll fixit unless you beat me to it ;-).

>+#ifdef JS_GCMETER
>+
>+#define METER(x) x
>+
>+#else /* !JS_GCMETER */
>+
>+#define METER(x) ((void) 0)
>+
>+#endif

Nit: no need for blank lines here, or for the /* !JS_GCMETER */ comment which adds clutter that motivates the blank lines -- typical style adds such comments only to #else and #endif lines that are widely (screenful or more) apart.

Nit against the existing code, which moved: I've slowly started indenting the controlled #defines using old-style C preprocessor indentation:

#if FOO
# define BAR 1
#endif

Feel free to do that here :-).

>+static JSGCArena *
>+NewGCArena(JSContext *cx, JSGCArenaList *arenaList, size_t nbytes)

(Hey, good naming!  The gc_long_winded style is so passe' in SpiderMonkey.)

>+    a = JS_malloc(cx, offsetof(JSGCArena, base) + GC_ARENA_SIZE);
>+    if (!a)
>         return NULL;

This should use malloc, to avoid double OOM reporting (js_NewGCThing calls JS_ReportOutOfMemory on failure, and should).  Using malloc lets NewGCArena do without the cx parameter entirely, I hope.

>+static void
>+DestroyGCArena(JSContext *cx, JSGCArenaList *arenaList, JSGCArena **ap)
>+{
>+    JSGCArena *a;
>+
>+    a = *ap;
>+    JS_ASSERT(a);
>+    METER(--arenaList->stats.narenas);
>+    if (a == arenaList->last)
>+        arenaList->lastLimit = (a->prev) ? GC_THINGS_SIZE : 0;

Nit: typically member and primary expressions as condition of ?: do not need to be parenthesized in prevailing style.

>+    *ap = a->prev;
>+
>+#ifdef DEBUG
>+    memset(a, JS_FREE_PATTERN, offsetof(JSGCArena, base) + GC_ARENA_SIZE);
>+#endif
>+    JS_free(cx, a);

Use free for symmetry, and drop the cx param to DestroyGCArena too.  Also, not sure how valuable the memset is without jsarena.[ch], since the OS malloc may either fill with its own pattern.  But if not, I suppose it doesn't hurt for us to lay down our own pattern.

>+static JSGCThing *
>+AllocGCThing(JSContext *cx, JSGCArenaList *arenaList, size_t nbytes)

Again don't need cx, and could perhaps inline all of this in js_NewGCThing?

>+{
>+    JSGCArena *a;
>+    size_t limit;
>+    JSGCThing *thing;
>+
>+    a = arenaList->last;
>+    if (!a || GC_THINGS_SIZE == arenaList->lastLimit) {
>+        /*
>+         * Allocate new arena if there are no arenas or the last arena
>+         * is fully used.
>+         */
>+        a = NewGCArena(cx, arenaList, nbytes);
>+        if (!a)
>+            return NULL;
>+        JS_ASSERT(arenaList->last == a);
>+    }
>+    limit = arenaList->lastLimit;
>+    if ((limit & GC_PAGE_MASK) == 0) {
>+        /* Skip JSGCPageInfo record located at GC_PAGE_SIZE boundary. */
>+        limit += PAGE_THING_GAP(nbytes);
>+    }

How about a blank line here, per the coding style guide on the wiki (http://wiki.mozilla.org/index.php?title=JavaScript:SpiderMonkey:Coding_Style#Tabs_and_Spaces)?

>+    /*
>+     * Assert that at this point space for things should exist.
>+     */
>+    JS_ASSERT(limit + nbytes <= GC_THINGS_SIZE);
>+    arenaList->lastLimit = limit + nbytes;
>+    thing = (JSGCThing *)(FIRST_THING_PAGE(a) + limit);
>+
>+    METER(++arenaList->stats.nthings);
>+    METER(arenaList->stats.maxthings
>+          = JS_MAX(arenaList->stats.nthings, arenaList->stats.maxthings));
>+
>     return thing;
> }
> 
>@@ -351,21 +427,37 @@ js_InitGC(JSRuntime *rt, uint32 maxbytes
> JS_FRIEND_API(void)
> js_DumpGCStats(JSRuntime *rt, FILE *fp)
> {
>-    uintN i;
>+    unsigned i;
>+    unsigned long totalThings, totalMaxThings, totalBytes;

uintN is ok by local style, albeit perhaps unsightly in pure C ;-).

uint32 is better than unsigned long, I think -- unless you mean uint64.

>+    for (i = 0; i < GC_NUM_FREELISTS; i++) {
>+        JSGCArenaStats *st = &rt->gcArenaList[i].stats;

Nit: s/\<st\>/as/g to match typical concise naming convention (pi for JSGCPageInfo, etc.)?

[snip...]

>     } else {
>         if (rt->gcBytes < rt->gcMaxBytes &&
>             (tried_gc || rt->gcMallocBytes < rt->gcMaxBytes))
>         {

Norris got me bracing this way when the if condition was multi-line and (especially) the then-block had declarations, but I think we can do without it here -- unless you inline AllocGCThing here.

[snip...]

>@@ -1530,10 +1599,12 @@ js_GC(JSContext *cx, uintN gcflags)
>     JSStackFrame *fp, *chain;
>     uintN i, depth, nslots, type;
>     JSStackHeader *sh;
>-    size_t nbytes, nflags;
>-    JSArena *a, **ap;
>-    uint8 flags, *flagp, *split;
>-    JSGCThing *thing, *limit, **flp, **oflp;
>+    size_t nbytes, nflags, limit;
>+    JSGCArena *a, **ap;
>+    uint8 flags, *flagp, *firstpage;

Nit: firstPage, to match other multi-word-phrase names, in contrast to flagp, a, ap, etc., which support the more concise convention?  Both have their place, this just seems to introduce a "third style" without need.

>+    JSGCThing *thing, *flp;

Suggest renaming flp to freelist, since it's not a JSGCThing ** any longer.  My hax0r dictionary says freelist is one word, but it might be freeList spelling is better here -- hmm.

>+    size_t thingOffset;

Move up to share type with other size_t decls, and perhaps call it offset (to match limit -- on need for thing, and offset and limit go together like coffee and cake).

I'll r+ the final patch just so the bug records it before checkin.  Great work, thanks again.

/be
(Assignee)

Comment 22

12 years ago
Created attachment 208628 [details] [diff] [review]
Reflection on nits and inlining

This version addresses the above comment and inlines AllocGCThing slightly reorganizing source of in js_NewGCThing to get easier to follow allocation code.

The used-to-be-true-but-no-longer-applicable comments about GCThing size are kept.
Attachment #208556 - Attachment is obsolete: true
Attachment #208628 - Flags: review?(brendan)
Attachment #208556 - Flags: review?(brendan)
(Assignee)

Comment 23

12 years ago
Created attachment 208630 [details] [diff] [review]
Even less arena traces

This version of the patch moved call to JS_ArenaFinish from js_FinishGC to 
JS_DestroyRuntime since now GC does not depend on the arena list. Plus the patch fixes the comments about sizes of things GC can allocate.
Attachment #208628 - Attachment is obsolete: true
Attachment #208630 - Flags: review?(brendan)
Attachment #208628 - Flags: review?(brendan)
Comment on attachment 208630 [details] [diff] [review]
Even less arena traces

Thanks again, r=me -- final nits below.

/be

>@@ -40,18 +40,17 @@
> /*
>  * JS Mark-and-Sweep Garbage Collector.
>  *
>- * This GC allocates only fixed-sized things big enough to contain two words
>- * (pointers) on any host architecture.  It allocates from an arena pool (see
>- * jsarena.h).  It uses an ideally parallel array of flag bytes to hold the
>+ * This GC allocates fixed-sized things with sizes up to GC_NBYTES_MAX (see
>+ * jsgc.h). It allocates from a special GC arena pool with each arena allocated
>+ * using malloc.  It uses an ideally parallel array of flag bytes to hold the
>  * mark bit, finalizer type index, etc.

Two spaces after period in " * jsgc.h). It allocates..." line.

>+NewGCArena(JSGCArenaList *arenaList, size_t nbytes)
> {
>     uint8 *flagp, *split, *pagep, *limit;
>-    JSArena *a;
>-    jsuword p;
>-    JSGCThing *thing;
>+    JSGCArena *a;
>     JSGCPageInfo *pi;

Reorder to group by first use, as before.

>+    } else if (rt->gcBytes < rt->gcMaxBytes &&
>+               (tried_gc || rt->gcMallocBytes < rt->gcMaxBytes)) {
>+        /*
>+         * We can use more memory from arenaList. Get thing from there
>+         * allocating a new arena if last one is full.

Need a comma at end of first line in comment.

>+         */
>+        if (!arenaList->last || GC_THINGS_SIZE == arenaList->lastLimit) {

With GCC warning about it, I never felt the need to put the constant on the left of == in case of a = typo, and it's not prevailing style.  Reads a little less clearly to my eyes.

>@@ -1456,28 +1494,31 @@ gc_root_marker(JSDHashTable *table, JSDH
>         JSContext *cx = (JSContext *)arg;
> #ifdef DEBUG
>         uintN i;
>-        JSArena *a;
>+        JSGCArena *a;
>         jsuword firstpage;
>-        JSBool root_points_to_gcArenaPool = JS_FALSE;
>+        JSBool root_points_to_gcArenaList = JS_FALSE;
>         void *thing = JSVAL_TO_GCTHING(v);
> 
>         for (i = 0; i < GC_NUM_FREELISTS; i++) {
>-            for (a = cx->runtime->gcArenaPool[i].first.next; a; a = a->next) {
>+            JSGCArenaList *arenaList = &cx->runtime->gcArenaList[i];
>+            size_t limit = arenaList->lastLimit;

With new block locals in the closest enclosing block, I'd say move firstpage (firstPage?) down to its closest enclosing block.  This style of localizing to nearest containing block is not prevailing, but in this case, it's old (IIRC fur added this DEBUG code in 1998).  Could go either way -- toward localizing or toward hoisting to outermost DEBUG block -- but would prefer one or the other style only.

Ok, I'm done fussing ;-).  Thanks once again.
Attachment #208630 - Flags: review?(brendan) → review+
(Assignee)

Comment 25

12 years ago
Created attachment 208681 [details] [diff] [review]
Fix to restore GC call when arena allocation fails and nitlessing

The changes from the previous patch to js_NewGCThing caused failed malloc during arena allocation to jump to fail instead of trying GC as it was in the original code. For embedded systems it can be visible so the patch restores the previous behavior besides addressing nits.
Attachment #208630 - Attachment is obsolete: true
Attachment #208681 - Flags: review?(brendan)
(Assignee)

Comment 26

12 years ago
Created attachment 208683 [details] [diff] [review]
Fix to restore GC call when arena allocation fails and nitlessing
(Assignee)

Comment 27

12 years ago
Created attachment 208685 [details] [diff] [review]
Fix to restore GC call when arena allocation fails and nitlessing v2

Fixing bad grammar in comments.
Attachment #208681 - Attachment is obsolete: true
Attachment #208685 - Flags: review?(brendan)
Attachment #208681 - Flags: review?(brendan)
(Assignee)

Updated

12 years ago
Attachment #208683 - Attachment is obsolete: true
(Assignee)

Comment 28

12 years ago
Created attachment 208686 [details] [diff] [review]
Fix to restore GC call when arena allocation fails and nitlessing v3

This time I forgot to run "quilt refresh". Late night hacking...
Attachment #208685 - Attachment is obsolete: true
Attachment #208686 - Flags: review?(brendan)
Attachment #208685 - Flags: review?(brendan)
Comment on attachment 208686 [details] [diff] [review]
Fix to restore GC call when arena allocation fails and nitlessing v3

>+    /* Loop to repeat allocation after forced GC. */
>+    for (;;) {
> 

Nit: no need for blank line here.

[snip...]

>+                if (!NewGCArena(arenaList, nbytes))
>+                    goto try_gc;

Thanks for self-review -- my Spider-sense went off, but then I got distracted.

r=me again.

/be
Attachment #208686 - Flags: review?(brendan) → review+
(Assignee)

Comment 30

12 years ago
Created attachment 208689 [details] [diff] [review]
Patch to commit
(In reply to comment #28)
> Created an attachment (id=208686) [edit]
> Fix to restore GC call when arena allocation fails and nitlessing v3
> 
> This time I forgot to run "quilt refresh". Late night hacking...

What is quilt?

/be
(Assignee)

Comment 32

12 years ago
I committed the last patch
Status: ASSIGNED → RESOLVED
Last Resolved: 12 years ago
Resolution: --- → FIXED
(Assignee)

Comment 33

12 years ago
(In reply to comment #31)
> (In reply to comment #28)
> > This time I forgot to run "quilt refresh". Late night hacking...
> 
> What is quilt?

Quilt, http://savannah.nongnu.org/projects/quilt, is a useful tool to track locally patches against some external evolving tree. Once I used darcs for that, http://abridgegame.org/darcs/, but even that is too heavy when one just want a throw away local branches for CVS/SVN. quilt has its deficiencies, but they are small compared with triviality of setup and usage.
(In reply to comment #33)
> Quilt, http://savannah.nongnu.org/projects/quilt, is a useful tool to track
> locally patches against some external evolving tree.

Ah, Quilt -- shaver knows it, I've never used it.  Have you tried Monotone?  We have hired Graydon Hoare, its author (but not to hack on it, to hack on JS!).

> Once I used darcs for
> that, http://abridgegame.org/darcs/, but even that is too heavy when one just
> want a throw away local branches for CVS/SVN. quilt has its deficiencies, but
> they are small compared with triviality of setup and usage.

I can't believe I still use separate CVS trees and juggle patches, including using patch -R and edited patches ;-).

The Darcs hackers met at the ICFP last fall in Tallinn, I hung out a bit and listened in.  It sounds great on paper ;-).

/be

Comment 35

12 years ago
Will this result in smaller memory usage?
And is it limited to Linux as the OS field indicates?
(Assignee)

Comment 36

12 years ago
(In reply to comment #35)
> Will this result in smaller memory usage?

The patch saves 3 words for each GC arena (12 bytes per 9K) and removes the need to lock arena_freelist_lock during arena allocation. But I doubt that one can see it. In this regard a one-line fix from bug 323529 should be much more exposed.

> And is it limited to Linux as the OS field indicates?

Fixed!
OS: Linux → All
Hardware: PC → All
(Assignee)

Comment 37

12 years ago
(In reply to comment #34)
>  Have you tried Monotone?

Yes, but IMO monotone require even more efforts to setup and track external CVS tree. In that regard git is much better as by design it does not assume that other git repositories are primary sources of changes. I would probably endup using it if quilt would not exist.

> We
> have hired Graydon Hoare, its author (but not to hack on it, to hack on JS!).

Nice, but does that mean that "you can use any versions system with Mozilla as long as it is Concurrent" continue to hold firmly ;) ?
> 
> > Once I used darcs for
> > that, http://abridgegame.org/darcs/, but even that is too heavy when one just
> > want a throw away local branches for CVS/SVN. quilt has its deficiencies, but
> > they are small compared with triviality of setup and usage.
> 
> I can't believe I still use separate CVS trees and juggle patches, including
> using patch -R and edited patches ;-).
> 
> The Darcs hackers met at the ICFP last fall in Tallinn, I hung out a bit and
> listened in.  It sounds great on paper ;-).
> 
> /be
> 

(In reply to comment #37)
> > We
> > have hired Graydon Hoare, its author (but not to hack on it, to hack on JS!).
> 
> Nice, but does that mean that "you can use any versions system with Mozilla as
> long as it is Concurrent" continue to hold firmly ;) ?

For now.  Chase is evaluating svn.  It seems the next step, albeit not a huge step up....

/be
You need to log in before you can comment on or make changes to this bug.