Closed Bug 508707 Opened 15 years ago Closed 13 years ago

Use one single GC heap chunk, avoiding frequent mmap and malloc calls

Categories

(Core :: JavaScript Engine, defect)

x86
macOS
Not set
normal

Tracking


RESOLVED WONTFIX

People

(Reporter: gal, Assigned: gwagner)

References

Details

Attachments

(1 file, 9 obsolete files)

      No description provided.
Attached patch patch (obsolete) — Splinter Review
Assignee: general → gal
The attached patch allocates the GC heap in a single chunk. This allows very fast identification of GC things, which is handy for conservative stack scanning.

On anything that isn't macosx, pages are only touched if we actually need that much GC heap. Pages are never deallocated.

This is a 20ms win on SS on mac, mostly due to the memset, which avoids macosx page-fault badness during SS execution. It would be great if someone could measure this on win32 or linux.

Even if there is no speedup on win32 or linux, I think we should seriously consider taking this patch. It removes a lot of complexity (and code) from jsgc and relying on virtual memory in the year 2009 seems appropriate.
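The scheme described above can be sketched as a bump allocator over one big reservation plus an O(1) range check. This is an illustrative sketch, not the actual patch; names (`GCHeap`, `allocArena`, `isGCThing`) are hypothetical stand-ins, and the real patch relies on virtual memory rather than committing all pages up front.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Hypothetical sketch of the single-chunk scheme: one big reservation,
// a bump pointer for arena allocation, and an O(1) range check that a
// conservative stack scanner can use to ask "could this word be a GC thing?"
struct GCHeap {
    uintptr_t base, cursor, limit;

    bool init(size_t maxBytes) {
        void *p = malloc(maxBytes);      // the real patch touches pages lazily
        if (!p)
            return false;
        base = cursor = uintptr_t(p);
        limit = base + maxBytes;
        return true;
    }

    void *allocArena(size_t arenaSize) {
        if (cursor + arenaSize > limit)  // heap exhausted
            return nullptr;
        void *a = (void *) cursor;
        cursor += arenaSize;
        return a;
    }

    // The payoff for conservative scanning: a single pair of compares
    // instead of walking a chunk list.
    bool isGCThing(void *ptr) const {
        return uintptr_t(ptr) >= base && uintptr_t(ptr) < limit;
    }
};
```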
I tried to compare on Linux but the patch has bit-rotted. Can you rebase and repost?
I got out of memory errors with this patch.

@@ -3736,12 +3271,6 @@ js_GC(JSContext *cx, JSGCInvocationKind 
      */
     js_SweepScriptFilenames(rt);
 
-    /*
-     * Destroy arenas after we finished the sweeping so finalizers can safely
-     * use js_IsAboutToBeFinalized().
-     */
-    DestroyGCArenas(rt, emptyArenas);
-

The "allClear" state in js_GC has to be changed since empty arenas are not destroyed any more. A quick fix is to remove the allClear state.
Attached patch clean version (obsolete) — Splinter Review
Another version of the previous patch. No changes, just a new version that shouldn't result in conflicts.
Attachment #394165 - Attachment is patch: true
Attachment #394165 - Attachment mime type: application/octet-stream → text/plain
Attached patch fixed out of memory error (obsolete) — Splinter Review
Fix for out of memory error.
Added an emptyArenas list. An arena is added to the list if it becomes empty during js_GC. NewGCArena takes an arena from the list, if available, instead of creating a new arena.
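The recycling just described amounts to a singly linked free list threaded through the arenas' `prev` field. A minimal sketch, with names mirroring the patch but otherwise illustrative:

```cpp
#include <cassert>
#include <cstddef>

// Sketch of the emptyArenas recycling: arenas freed during GC are pushed
// onto a singly linked list threaded through their `prev` field;
// NewGCArena pops from it before bump-allocating a fresh arena.
struct ArenaInfo {
    ArenaInfo *prev;
};

struct Runtime {
    ArenaInfo *emptyArenas = nullptr;
};

// Called from the sweep phase when an arena ends up with no live things.
void RecycleArena(Runtime *rt, ArenaInfo *a) {
    a->prev = rt->emptyArenas;
    rt->emptyArenas = a;
}

// First half of NewGCArena: reuse a recycled arena if one is available.
ArenaInfo *PopEmptyArena(Runtime *rt) {
    ArenaInfo *a = rt->emptyArenas;
    if (a)
        rt->emptyArenas = a->prev;
    return a;   // NULL means the caller must bump-allocate a fresh arena
}
```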
Attachment #394165 - Attachment is obsolete: true
TEST                   COMPARISON            FROM                 TO             DETAILS

=============================================================================

** TOTAL **:           1.028x as fast    872.9ms +/- 0.1%   848.9ms +/- 0.1%     significant

=============================================================================

  3d:                  1.053x as fast    135.7ms +/- 0.2%   128.9ms +/- 0.2%     significant
    cube:              1.106x as fast     38.1ms +/- 0.6%    34.5ms +/- 0.5%     significant
    morph:             1.092x as fast     27.9ms +/- 0.5%    25.5ms +/- 0.6%     significant
    raytrace:          1.012x as fast     69.7ms +/- 0.2%    68.8ms +/- 0.2%     significant

  access:              1.039x as fast    130.8ms +/- 0.1%   126.0ms +/- 0.4%     significant
    binary-trees:      1.099x as fast     38.9ms +/- 0.2%    35.4ms +/- 0.4%     significant
    fannkuch:          *1.037x as slow*   54.0ms +/- 0.2%    56.0ms +/- 0.6%     significant
    nbody:             1.171x as fast     24.6ms +/- 0.6%    21.0ms +/- 0.8%     significant
    nsieve:            *1.017x as slow*   13.3ms +/- 1.0%    13.5ms +/- 1.1%     significant

  bitops:              1.044x as fast     34.5ms +/- 0.6%    33.0ms +/- 0.7%     significant
    3bit-bits-in-byte: -                   1.6ms +/- 9.1%     1.5ms +/- 9.8% 
    bits-in-byte:      ??                  8.1ms +/- 1.1%     8.1ms +/- 1.1%     not conclusive: might be *1.002x as slow*
    bitwise-and:       -                   2.3ms +/- 5.8%     2.2ms +/- 5.2% 
    nsieve-bits:       1.059x as fast     22.5ms +/- 0.6%    21.3ms +/- 0.6%     significant

  controlflow:         *1.026x as slow*   32.2ms +/- 0.6%    33.1ms +/- 0.4%     significant
    recursive:         *1.026x as slow*   32.2ms +/- 0.6%    33.1ms +/- 0.4%     significant

  crypto:              -                  51.6ms +/- 0.8%    51.3ms +/- 1.2% 
    aes:               1.020x as fast     28.9ms +/- 1.2%    28.4ms +/- 1.6%     significant
    md5:               *1.017x as slow*   14.5ms +/- 1.0%    14.7ms +/- 1.6%     significant
    sha1:              ??                  8.2ms +/- 1.4%     8.2ms +/- 1.6%     not conclusive: might be *1.002x as slow*

  date:                1.049x as fast    130.5ms +/- 0.1%   124.3ms +/- 0.3%     significant
    format-tofte:      1.049x as fast     62.4ms +/- 0.2%    59.5ms +/- 0.4%     significant
    format-xparb:      1.050x as fast     68.1ms +/- 0.2%    64.8ms +/- 0.2%     significant

  math:                *1.010x as slow*   29.6ms +/- 0.5%    29.9ms +/- 0.6%     significant
    cordic:            ??                 10.5ms +/- 1.4%    10.6ms +/- 1.3%     not conclusive: might be *1.010x as slow*
    partial-sums:      *1.011x as slow*   13.1ms +/- 0.7%    13.2ms +/- 0.9%     significant
    spectral-norm:     ??                  6.0ms +/- 0.7%     6.1ms +/- 1.3%     not conclusive: might be *1.010x as slow*

  regexp:              ??                 39.4ms +/- 0.4%    39.5ms +/- 0.4%     not conclusive: might be *1.003x as slow*
    dna:               ??                 39.4ms +/- 0.4%    39.5ms +/- 0.4%     not conclusive: might be *1.003x as slow*

  string:              1.020x as fast    288.5ms +/- 0.2%   282.9ms +/- 0.2%     significant
    base64:            1.054x as fast     17.1ms +/- 0.5%    16.2ms +/- 1.1%     significant
    fasta:             1.044x as fast     60.2ms +/- 0.2%    57.6ms +/- 0.2%     significant
    tagcloud:          ??                 91.2ms +/- 0.2%    91.3ms +/- 0.3%     not conclusive: might be *1.001x as slow*
    unpack-code:       1.021x as fast     92.1ms +/- 0.4%    90.2ms +/- 0.3%     significant
    validate-input:    1.015x as fast     28.0ms +/- 0.2%    27.6ms +/- 0.7%     significant
Attached patch patch (obsolete) — Splinter Review
Attachment #392835 - Attachment is obsolete: true
Attachment #394185 - Attachment is obsolete: true
Attachment #395198 - Flags: review?(igor)
I will review the patch tomorrow, 2009-08-20.
Attached patch patch (obsolete) — Splinter Review
Attachment #395198 - Attachment is obsolete: true
Attachment #395441 - Flags: review?(igor)
Attachment #395198 - Flags: review?(igor)
I guess the instrumentation variable name for arenas that are returned to the global free list should be changed: METER(nkilledarenas++); is misleading.
trace-test.js fails with an out of memory error in:
#0  js_ReportOutOfMemory (cx=0x30a990) at ../jscntxt.cpp:1317
#1  0x00022211 in JS_ReportOutOfMemory (cx=0x30a990) at ../jsapi.cpp:5630
#2  0x0006ca70 in js_AddAsGCBytes (cx=0x30a990, sz=5760016) at ../jsgc.cpp:1820

values in AddAsGCBytes are:
if (rt->gcBytes >= rt->gcMaxBytes ||
            sz > (size_t) (rt->gcMaxBytes - rt->gcBytes)) {
fprintf(stderr, "sz: %d, gcBytes: %d, max: %d, diff: %d\n", sz, rt->gcBytes, rt->gcMaxBytes, (size_t)(rt->gcMaxBytes - rt->gcBytes));

sz: 5760016, gcBytes: 67113264, max: 67108864, diff: -4400
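The negative diff in that log is an unsigned wrap: once gcBytes exceeds gcMaxBytes, `gcMaxBytes - gcBytes` wraps to a huge size_t, and printing it through `%d` shows -4400. A small stand-in for the quoted condition (values taken from the log above, helper name hypothetical):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Simplified stand-in for the quoted budget check. Checking
// gcBytes >= gcMaxBytes first is what keeps the unsigned subtraction
// from being used after it has wrapped.
bool WouldExceedBudget(size_t gcBytes, size_t gcMaxBytes, size_t sz) {
    return gcBytes >= gcMaxBytes || sz > gcMaxBytes - gcBytes;
}
```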
The AddAsGCBytes API is completely borked. We have an approved patch to remove it. I will try to land the RSS patch first, that will take care of this problem.
Comment on attachment 395441 [details] [diff] [review]
patch

> static void
> FinishGCArenaLists(JSRuntime *rt)
> {
...
> 
>+    rt->gcPtr = rt->gcBase;
>     rt->gcBytes = 0;
>-    JS_ASSERT(rt->gcChunkList == 0);

I do not see where the patch releases rt->gcBase. It can be done here with rt->gcBase set to NULL after calling free. r+ with this fixed.
(In reply to comment #12)
Since this patch would make my life easier with conservative stack-walking I was looking at the problem again.

We fill up the heap with arenas and only increase rt->gcBytes since we don't delete arenas any more.
The problem occurs with a very big (5760016 bytes) enumerator created in js_Enumerate. The size is added to the gcBytes with AddAsGCBytes.
Attached patch patch (obsolete) — Splinter Review
Attachment #395441 - Attachment is obsolete: true
Attachment #395441 - Flags: review?(igor)
Attached patch removed AddAsGCBytes (obsolete) — Splinter Review
Attachment #399609 - Attachment is obsolete: true
Summary: TM: Allocate GC heap in one chunk → Use one single GC heap chunk, avoiding frequent mmap and malloc calls
Blocks: 505308
Comment on attachment 399618 [details] [diff] [review]
removed AddAsGCBytes

>diff -r 465737c189fc js/src/jscntxt.h
>--- a/js/src/jscntxt.h	Wed Sep 09 16:51:46 2009 -0500
>+++ b/js/src/jscntxt.h	Wed Sep 09 16:01:57 2009 -0700
>@@ -395,18 +395,21 @@ struct JSRuntime {
>      * See bug 492355 for more details.
>      *
>      * This comes early in JSRuntime to minimize the immediate format used by
>      * trace-JITted code that reads it.
>      */
>     uint32              protoHazardShape;
> 
>     /* Garbage collector state, used by jsgc.c. */
>-    JSGCChunkInfo       *gcChunkList;
>+    uintptr_t           gcBase;
>+    uintptr_t           gcPtr;
>+    uintptr_t           gcEnd;

Nit but common naming conventions help a little: the common convention applied to gcPtr would yield gcCursor, and gcEnd would be gcLimit. A bit longer but that fits the member name style in general.

Similar style-mongering:

+    inline bool IsGCThing(void *ptr) {
+        return uintptr_t(ptr) >= gcBase && uintptr_t(ptr) < gcEnd;
+    }

The canonical name for ptr here is "thing".

jsgc.{cpp,h} use jsuword not uintptr_t. We can switch but let's do it all at once, or not at all since we have better things to do and jsuword is the right sized type on all platforms already.

This looks better with number-line order:

+    inline bool IsGCThing(void *ptr) {
+        return gcBase <= uintptr_t(ptr) && uintptr_t(ptr) < gcLimit;
+    }

As Igor did for JSString::isStatic, you should also JS_ASSERT((uintptr_t(thing) & JSVAL_TAGMASK) == 0).

/be
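Folding the suggestions above together, the predicate might look like the sketch below. This is illustrative only: `assert` stands in for JS_ASSERT, and JSVAL_TAGMASK is assumed to be 7 (the value in jsval builds of this era).

```cpp
#include <cassert>
#include <cstdint>

#define JSVAL_TAGMASK 0x7   // assumed value, for illustration

// Sketch with the review suggestions applied: gcCursor/gcLimit naming,
// number-line ordering in the range test, and the tag-mask assertion.
struct Runtime {
    uintptr_t gcBase, gcCursor, gcLimit;

    inline bool IsGCThing(void *thing) {
        uintptr_t w = uintptr_t(thing);
        assert((w & JSVAL_TAGMASK) == 0);    // GC things are tag-aligned
        return gcBase <= w && w < gcLimit;   // number-line order
    }
};
```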
Comment on attachment 399618 [details] [diff] [review]
removed AddAsGCBytes

>diff -r 465737c189fc js/src/jscntxt.h
>--- a/js/src/jscntxt.h	Wed Sep 09 16:51:46 2009 -0500
>+++ b/js/src/jscntxt.h	Wed Sep 09 16:01:57 2009 -0700
>@@ -395,18 +395,21 @@ struct JSRuntime {
>      * See bug 492355 for more details.
>      *
>      * This comes early in JSRuntime to minimize the immediate format used by
>      * trace-JITted code that reads it.
>      */
>     uint32              protoHazardShape;
> 
>     /* Garbage collector state, used by jsgc.c. */
>-    JSGCChunkInfo       *gcChunkList;
>+    uintptr_t           gcBase;
>+    uintptr_t           gcPtr;
>+    uintptr_t           gcEnd;
>     JSGCArenaList       gcArenaList[GC_NUM_FREELISTS];
>+    JSGCArenaInfo       *emptyArenas;
>     JSGCDoubleArenaList gcDoubleArenaList;
>     JSDHashTable        gcRootsHash;
>     JSDHashTable        *gcLocksHash;
>     jsrefcount          gcKeepAtoms;
>     size_t              gcBytes;
>     size_t              gcLastBytes;
>     size_t              gcMaxBytes;
>     size_t              gcMaxMallocBytes;
>@@ -414,16 +417,20 @@ struct JSRuntime {
>     uint32              gcLevel;
>     uint32              gcNumber;
>     JSTracer            *gcMarkingTracer;
>     uint32              gcTriggerFactor;
>     size_t              gcTriggerBytes;
>     volatile JSBool     gcIsNeeded;
>     volatile JSBool     gcFlushCodeCaches;
> 
>+    inline bool IsGCThing(void *ptr) {
>+        return uintptr_t(ptr) >= gcBase && uintptr_t(ptr) < gcEnd;
>+    }
>+
>     /*
>      * NB: do not pack another flag here by claiming gcPadding unless the new
>      * flag is written only by the GC thread.  Atomic updates to packed bytes
>      * are not guaranteed, so stores issued by one thread may be lost due to
>      * unsynchronized read-modify-write cycles on other threads.
>      */
>     JSPackedBool        gcPoke;
>     JSPackedBool        gcRunning;
>diff -r 465737c189fc js/src/jsgc.cpp
>--- a/js/src/jsgc.cpp	Wed Sep 09 16:51:46 2009 -0500
>+++ b/js/src/jsgc.cpp	Wed Sep 09 16:01:57 2009 -0700
>@@ -101,56 +101,16 @@
>  */
> #ifdef MOZ_MEMORY
> extern "C" {
> #include "../../memory/jemalloc/jemalloc.h"
> }
> #endif
> 
> /*
>- * Include the headers for mmap unless we have posix_memalign and do not
>- * insist on mmap.
>- */
>-#if JS_GC_USE_MMAP || (!defined JS_GC_USE_MMAP && !HAS_POSIX_MEMALIGN)
>-# if defined(XP_WIN)
>-#  ifndef JS_GC_USE_MMAP
>-#   define JS_GC_USE_MMAP 1
>-#  endif
>-#  include <windows.h>
>-# elif defined(__SYMBIAN32__)
>-// Symbian's OpenC has mmap (and #defines _POSIX_MAPPED_FILES), but
>-// doesn't implement MAP_ANON.  If we have MOZ_MEMORY, then we can use
>-// posix_memalign; we've defined HAS_POSIX_MEMALIGN above.  Otherwise,
>-// we overallocate.
>-# else
>-#  if defined(XP_UNIX) || defined(XP_BEOS)
>-#   include <unistd.h>
>-#  endif
>-#  if _POSIX_MAPPED_FILES > 0
>-#   ifndef JS_GC_USE_MMAP
>-#    define JS_GC_USE_MMAP 1
>-#   endif
>-#   include <sys/mman.h>
>-
>-/* On Mac OS X MAP_ANONYMOUS is not defined. */
>-#   if !defined(MAP_ANONYMOUS) && defined(MAP_ANON)
>-#    define MAP_ANONYMOUS MAP_ANON
>-#   endif
>-#   if !defined(MAP_ANONYMOUS)
>-#    define MAP_ANONYMOUS 0
>-#   endif
>-#  else
>-#   if JS_GC_USE_MMAP
>-#    error "JS_GC_USE_MMAP is set when mmap is not available"
>-#   endif
>-#  endif
>-# endif
>-#endif
>-
>-/*
>  * Check JSTempValueUnion has the size of jsval and void * so we can
>  * reinterpret jsval as void* GC-thing pointer and use JSTVU_SINGLE for
>  * different GC-things.
>  */
> JS_STATIC_ASSERT(sizeof(JSTempValueUnion) == sizeof(jsval));
> JS_STATIC_ASSERT(sizeof(JSTempValueUnion) == sizeof(void *));
> 
> /*
>@@ -228,124 +188,43 @@ JS_STATIC_ASSERT(JSVAL_NULL == 0);
>  *   (GC_ARENA_SIZE - sizeof(JSGCArenaInfo)) / (thingSize + 1).
>  *
>  * To allocate doubles we use a specialized arena. It can contain only numbers
>  * so we do not need the type bits. Moreover, since the doubles do not require
>  * a finalizer and very few of them are locked via js_LockGCThing API, we use
>  * just one bit of flags per double to denote if it was marked during the
>  * marking phase of the GC. The locking is implemented via a hash table. Thus
>  * for doubles the flag area becomes a bitmap.
>- *
>- * JS_GC_USE_MMAP macro governs the choice of the aligned arena allocator.
>- * When it is true, a platform-dependent function like mmap is used to get
>- * memory aligned on CPU page boundaries. If the macro is false or undefined,
>- * posix_memalign is used when available. Otherwise the code uses malloc to
>- * over-allocate a chunk with js_gcArenasPerChunk aligned arenas. The
>- * approximate space overhead of this is 1/js_gcArenasPerChunk. For details,
>- * see NewGCChunk/DestroyGCChunk below.
>- *
>- * The code also allocates arenas in chunks when JS_GC_USE_MMAP is 1 to
>- * minimize the overhead of mmap/munmap. In this case js_gcArenasPerChunk can
>- * not be a compile-time constant as the system page size is not known until
>- * runtime.
>  */
>-#if JS_GC_USE_MMAP
>-static uint32 js_gcArenasPerChunk = 0;
>-static JSBool js_gcUseMmap = JS_FALSE;
>-#elif HAS_POSIX_MEMALIGN
>-# define js_gcArenasPerChunk 1
>-#else
>-# define js_gcArenasPerChunk 7
>-#endif
>-
>-#if defined(js_gcArenasPerChunk) && js_gcArenasPerChunk == 1
>-# define CHUNKED_ARENA_ALLOCATION 0
>-#else
>-# define CHUNKED_ARENA_ALLOCATION 1
>-#endif
> 
> #define GC_ARENA_SHIFT              12
> #define GC_ARENA_MASK               ((jsuword) JS_BITMASK(GC_ARENA_SHIFT))
> #define GC_ARENA_SIZE               JS_BIT(GC_ARENA_SHIFT)
> 
>-/*
>- * JS_GC_ARENA_PAD defines the number of bytes to pad JSGCArenaInfo structure.
>- * It is used to improve allocation efficiency when using posix_memalign. If
>- * malloc's implementation uses internal headers, then calling
>- *
>- *   posix_memalign(&p, GC_ARENA_SIZE, GC_ARENA_SIZE * js_gcArenasPerChunk)
>- *
>- * in a sequence leaves holes between allocations of the size GC_ARENA_SIZE
>- * due to the need to fit headers. JS_GC_ARENA_PAD mitigates that so the code
>- * calls
>- *
>- *     posix_memalign(&p, GC_ARENA_SIZE,
>- *                    GC_ARENA_SIZE * js_gcArenasPerChunk - JS_GC_ARENA_PAD)
>- *
>- * When JS_GC_ARENA_PAD is equal or greater than the number of words in the
>- * system header, the system can pack all allocations together without holes.
>- *
>- * With JS_GC_USE_MEMALIGN we want at least 2 word pad unless posix_memalign
>- * comes from jemalloc that does not use any headers/trailers.
>- */
>-#ifndef JS_GC_ARENA_PAD
>-# if HAS_POSIX_MEMALIGN && !MOZ_MEMORY
>-#  define JS_GC_ARENA_PAD (2 * JS_BYTES_PER_WORD)
>-# else
>-#  define JS_GC_ARENA_PAD 0
>-# endif
>-#endif
>-
> struct JSGCArenaInfo {
>     /*
>      * Allocation list for the arena or NULL if the arena holds double values.
>      */
>     JSGCArenaList   *list;
> 
>     /*
>      * Pointer to the previous arena in a linked list. The arena can either
>      * belong to one of JSContext.gcArenaList lists or, when it does not have
>      * any allocated GC things, to the list of free arenas in the chunk with
>      * head stored in JSGCChunkInfo.lastFreeArena.
>      */
>     JSGCArenaInfo   *prev;
> 
>-#if !CHUNKED_ARENA_ALLOCATION
>     jsuword         prevUntracedPage;
>-#else
>-    /*
>-     * A link field for the list of arenas with marked but not yet traced
>-     * things. The field is encoded as arena's page to share the space with
>-     * firstArena and arenaIndex fields.
>-     */
>-    jsuword         prevUntracedPage :  JS_BITS_PER_WORD - GC_ARENA_SHIFT;
>-
>-    /*
>-     * When firstArena is false, the index of arena in the chunk. When
>-     * firstArena is true, the index of a free arena holding JSGCChunkInfo or
>-     * NO_FREE_ARENAS if there are no free arenas in the chunk.
>-     *
>-     * GET_ARENA_INDEX and GET_CHUNK_INFO_INDEX are convenience macros to
>-     * access either of indexes.
>-     */
>-    jsuword         arenaIndex :        GC_ARENA_SHIFT - 1;
>-
>-    /* Flag indicating if the arena is the first in the chunk. */
>-    jsuword         firstArena :        1;
>-#endif
> 
>     union {
>         jsuword     untracedThings;     /* bitset for fast search of marked
>                                            but not yet traced things */
>         JSBool      hasMarkedDoubles;   /* the arena has marked doubles */
>     } u;
>-
>-#if JS_GC_ARENA_PAD != 0
>-    uint8           pad[JS_GC_ARENA_PAD];
>-#endif
> };
> 
> /*
>  * Verify that the bit fields are indeed shared and JSGCArenaInfo is as small
>  * as possible. The code does not rely on this check so if on a particular
>  * platform this does not compile, then, as a workaround, comment the assert
>  * out and submit a bug report.
>  */
>@@ -376,69 +255,16 @@ JS_STATIC_ASSERT(offsetof(JSGCArenaInfo,
> #define ARENA_INFO_TO_PAGE(arena)                                             \
>     (JS_ASSERT(IS_ARENA_INFO_ADDRESS(arena)),                                 \
>      ((jsuword) (arena) >> GC_ARENA_SHIFT))
> 
> #define GET_ARENA_INFO(chunk, index)                                          \
>     (JS_ASSERT((index) < js_gcArenasPerChunk),                                \
>      ARENA_START_TO_INFO(chunk + ((index) << GC_ARENA_SHIFT)))
> 
>-#if CHUNKED_ARENA_ALLOCATION
>-/*
>- * Definitions for allocating arenas in chunks.
>- *
>- * All chunks that have at least one free arena are put on the doubly-linked
>- * list with the head stored in JSRuntime.gcChunkList. JSGCChunkInfo contains
>- * the head of the chunk's free arena list together with the link fields for
>- * gcChunkList.
>- *
>- * Structure stored in one of chunk's free arenas. GET_CHUNK_INFO_INDEX gives
>- * the index of this arena. When all arenas in the chunk are used, it is
>- * removed from the list and the index is set to NO_FREE_ARENAS indicating
>- * that the chunk is not on gcChunkList and has no JSGCChunkInfo available.
>- */
>-
>-struct JSGCChunkInfo {
>-    JSGCChunkInfo   **prevp;
>-    JSGCChunkInfo   *next;
>-    JSGCArenaInfo   *lastFreeArena;
>-    uint32          numFreeArenas;
>-};
>-
>-#define NO_FREE_ARENAS              JS_BITMASK(GC_ARENA_SHIFT - 1)
>-
>-#ifdef js_gcArenasPerChunk
>-JS_STATIC_ASSERT(1 <= js_gcArenasPerChunk &&
>-                 js_gcArenasPerChunk <= NO_FREE_ARENAS);
>-#endif
>-
>-#define GET_ARENA_CHUNK(arena, index)                                         \
>-    (JS_ASSERT(GET_ARENA_INDEX(arena) == index),                              \
>-     ARENA_INFO_TO_START(arena) - ((index) << GC_ARENA_SHIFT))
>-
>-#define GET_ARENA_INDEX(arena)                                                \
>-    ((arena)->firstArena ? 0 : (uint32) (arena)->arenaIndex)
>-
>-#define GET_CHUNK_INFO_INDEX(chunk)                                           \
>-    ((uint32) ARENA_START_TO_INFO(chunk)->arenaIndex)
>-
>-#define SET_CHUNK_INFO_INDEX(chunk, index)                                    \
>-    (JS_ASSERT((index) < js_gcArenasPerChunk || (index) == NO_FREE_ARENAS),   \
>-     (void) (ARENA_START_TO_INFO(chunk)->arenaIndex = (jsuword) (index)))
>-
>-#define GET_CHUNK_INFO(chunk, infoIndex)                                      \
>-    (JS_ASSERT(GET_CHUNK_INFO_INDEX(chunk) == (infoIndex)),                   \
>-     JS_ASSERT((uint32) (infoIndex) < js_gcArenasPerChunk),                   \
>-     (JSGCChunkInfo *) ((chunk) + ((infoIndex) << GC_ARENA_SHIFT)))
>-
>-#define CHUNK_INFO_TO_INDEX(ci)                                               \
>-    GET_ARENA_INDEX(ARENA_START_TO_INFO((jsuword)ci))
>-
>-#endif
>-
> /*
>  * Macros for GC-thing operations.
>  */
> #define THINGS_PER_ARENA(thingSize)                                           \
>     ((GC_ARENA_SIZE - (uint32) sizeof(JSGCArenaInfo)) / ((thingSize) + 1U))
> 
> #define THING_TO_ARENA(thing)                                                 \
>     (JS_ASSERT(!JSString::isStatic((JSString *) (thing))),                    \
>@@ -817,279 +643,38 @@ ShrinkPtrTable(JSPtrTable *table, const 
> #else
> # define METER(x)               ((void) 0)
> # define METER_IF(condition, x) ((void) 0)
> #endif
> 
> #define METER_UPDATE_MAX(maxLval, rval)                                       \
>     METER_IF((maxLval) < (rval), (maxLval) = (rval))
> 
>-#if JS_GC_USE_MMAP || !HAS_POSIX_MEMALIGN
>-
>-/*
>- * For chunks allocated via over-sized malloc, get a pointer to store the gap
>- * between the malloc's result and the first arena in the chunk.
>- */
>-static uint32 *
>-GetMallocedChunkGapPtr(jsuword chunk)
>-{
>-    JS_ASSERT((chunk & GC_ARENA_MASK) == 0);
>-
>-    /* Use the memory after the chunk, see NewGCChunk for details. */
>-    return (uint32 *) (chunk + (js_gcArenasPerChunk << GC_ARENA_SHIFT));
>-}
>-
>-#endif
>-
>-static jsuword
>-NewGCChunk(void)
>-{
>-    void *p;
>-
>-#if JS_GC_USE_MMAP
>-    if (js_gcUseMmap) {
>-# if defined(XP_WIN)
>-        p = VirtualAlloc(NULL, js_gcArenasPerChunk << GC_ARENA_SHIFT,
>-                         MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
>-        return (jsuword) p;
>-# else
>-        p = mmap(NULL, js_gcArenasPerChunk << GC_ARENA_SHIFT,
>-                 PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
>-        return (p == MAP_FAILED) ? 0 : (jsuword) p;
>-# endif
>-    }
>-#endif
>-
>-#if HAS_POSIX_MEMALIGN
>-    if (0 != posix_memalign(&p, GC_ARENA_SIZE,
>-                            GC_ARENA_SIZE * js_gcArenasPerChunk -
>-                            JS_GC_ARENA_PAD)) {
>-        return 0;
>-    }
>-    return (jsuword) p;
>-#else
>-    /*
>-     * Implement chunk allocation using oversized malloc if mmap and
>-     * posix_memalign are not available.
>-     *
>-     * Since malloc allocates pointers aligned on the word boundary, to get
>-     * js_gcArenasPerChunk aligned arenas, we need to malloc only
>-     *
>-     *   ((js_gcArenasPerChunk + 1) << GC_ARENA_SHIFT) - sizeof(size_t)
>-     *
>-     * bytes. But since we stores the gap between the malloced pointer and the
>-     * first arena in the chunk after the chunk, we need to ask for
>-     *
>-     *   ((js_gcArenasPerChunk + 1) << GC_ARENA_SHIFT)
>-     *
>-     * bytes to ensure that we always have room to store the gap.
>-     */
>-    p = js_malloc((js_gcArenasPerChunk + 1) << GC_ARENA_SHIFT);
>-    if (!p)
>-        return 0;
>-
>-    {
>-        jsuword chunk;
>-
>-        chunk = ((jsuword) p + GC_ARENA_MASK) & ~GC_ARENA_MASK;
>-        *GetMallocedChunkGapPtr(chunk) = (uint32) (chunk - (jsuword) p);
>-        return chunk;
>-    }
>-#endif
>-}
>-
>-static void
>-DestroyGCChunk(jsuword chunk)
>-{
>-    JS_ASSERT((chunk & GC_ARENA_MASK) == 0);
>-#if JS_GC_USE_MMAP
>-    if (js_gcUseMmap) {
>-# if defined(XP_WIN)
>-        VirtualFree((void *) chunk, 0, MEM_RELEASE);
>-# elif defined(SOLARIS)
>-        munmap((char *) chunk, js_gcArenasPerChunk << GC_ARENA_SHIFT);
>-# else
>-        munmap((void *) chunk, js_gcArenasPerChunk << GC_ARENA_SHIFT);
>-# endif
>-        return;
>-    }
>-#endif
>-
>-#if HAS_POSIX_MEMALIGN
>-    js_free((void *) chunk);
>-#else
>-    /* See comments in NewGCChunk. */
>-    JS_ASSERT(*GetMallocedChunkGapPtr(chunk) < GC_ARENA_SIZE);
>-    js_free((void *) (chunk - *GetMallocedChunkGapPtr(chunk)));
>-#endif
>-}
>-
>-#if CHUNKED_ARENA_ALLOCATION
>-
>-static void
>-AddChunkToList(JSRuntime *rt, JSGCChunkInfo *ci)
>-{
>-    ci->prevp = &rt->gcChunkList;
>-    ci->next = rt->gcChunkList;
>-    if (rt->gcChunkList) {
>-        JS_ASSERT(rt->gcChunkList->prevp == &rt->gcChunkList);
>-        rt->gcChunkList->prevp = &ci->next;
>-    }
>-    rt->gcChunkList = ci;
>-}
>-
>-static void
>-RemoveChunkFromList(JSRuntime *rt, JSGCChunkInfo *ci)
>-{
>-    *ci->prevp = ci->next;
>-    if (ci->next) {
>-        JS_ASSERT(ci->next->prevp == &ci->next);
>-        ci->next->prevp = ci->prevp;
>-    }
>-}
>-
>-#endif
>-
> static JSGCArenaInfo *
> NewGCArena(JSRuntime *rt)
> {
>-    jsuword chunk;
>     JSGCArenaInfo *a;
> 
>-    if (rt->gcBytes >= rt->gcMaxBytes)
>-        return NULL;
>-
>-#if CHUNKED_ARENA_ALLOCATION
>-    if (js_gcArenasPerChunk == 1) {
>-#endif
>-        chunk = NewGCChunk();
>-        if (chunk == 0)
>+    if (rt->emptyArenas) {
>+        a = rt->emptyArenas;
>+        rt->emptyArenas = a->prev;
>+    } else {
>+        uintptr_t arena = rt->gcPtr;
>+        if (arena + GC_ARENA_SIZE > rt->gcEnd)
>             return NULL;
>-        a = ARENA_START_TO_INFO(chunk);
>-#if CHUNKED_ARENA_ALLOCATION
>-    } else {
>-        JSGCChunkInfo *ci;
>-        uint32 i;
>-        JSGCArenaInfo *aprev;
>-
>-        ci = rt->gcChunkList;
>-        if (!ci) {
>-            chunk = NewGCChunk();
>-            if (chunk == 0)
>-                return NULL;
>-            JS_ASSERT((chunk & GC_ARENA_MASK) == 0);
>-            a = GET_ARENA_INFO(chunk, 0);
>-            a->firstArena = JS_TRUE;
>-            a->arenaIndex = 0;
>-            aprev = NULL;
>-            i = 0;
>-            do {
>-                a->prev = aprev;
>-                aprev = a;
>-                ++i;
>-                a = GET_ARENA_INFO(chunk, i);
>-                a->firstArena = JS_FALSE;
>-                a->arenaIndex = i;
>-            } while (i != js_gcArenasPerChunk - 1);
>-            ci = GET_CHUNK_INFO(chunk, 0);
>-            ci->lastFreeArena = aprev;
>-            ci->numFreeArenas = js_gcArenasPerChunk - 1;
>-            AddChunkToList(rt, ci);
>-        } else {
>-            JS_ASSERT(ci->prevp == &rt->gcChunkList);
>-            a = ci->lastFreeArena;
>-            aprev = a->prev;
>-            if (!aprev) {
>-                JS_ASSERT(ci->numFreeArenas == 1);
>-                JS_ASSERT(ARENA_INFO_TO_START(a) == (jsuword) ci);
>-                RemoveChunkFromList(rt, ci);
>-                chunk = GET_ARENA_CHUNK(a, GET_ARENA_INDEX(a));
>-                SET_CHUNK_INFO_INDEX(chunk, NO_FREE_ARENAS);
>-            } else {
>-                JS_ASSERT(ci->numFreeArenas >= 2);
>-                JS_ASSERT(ARENA_INFO_TO_START(a) != (jsuword) ci);
>-                ci->lastFreeArena = aprev;
>-                ci->numFreeArenas--;
>-            }
>-        }
>+        rt->gcPtr = arena + GC_ARENA_SIZE;
>+        rt->gcBytes += GC_ARENA_SIZE;
>+        a = ARENA_START_TO_INFO(arena);
>     }
>-#endif
>-
>-    rt->gcBytes += GC_ARENA_SIZE;
>     a->prevUntracedPage = 0;
>     memset(&a->u, 0, sizeof(a->u));
>-
>     return a;
> }
> 
> static void
>-DestroyGCArenas(JSRuntime *rt, JSGCArenaInfo *last)
>-{
>-    JSGCArenaInfo *a;
>-
>-    while (last) {
>-        a = last;
>-        last = last->prev;
>-
>-        METER(rt->gcStats.afree++);
>-        JS_ASSERT(rt->gcBytes >= GC_ARENA_SIZE);
>-        rt->gcBytes -= GC_ARENA_SIZE;
>-
>-#if CHUNKED_ARENA_ALLOCATION
>-        if (js_gcArenasPerChunk == 1) {
>-#endif
>-            DestroyGCChunk(ARENA_INFO_TO_START(a));
>-#if CHUNKED_ARENA_ALLOCATION
>-        } else {
>-            uint32 arenaIndex;
>-            jsuword chunk;
>-            uint32 chunkInfoIndex;
>-            JSGCChunkInfo *ci;
>-# ifdef DEBUG
>-            jsuword firstArena;
>-
>-            firstArena = a->firstArena;
>-            arenaIndex = a->arenaIndex;
>-            memset((void *) ARENA_INFO_TO_START(a), JS_FREE_PATTERN,
>-                   GC_ARENA_SIZE - JS_GC_ARENA_PAD);
>-            a->firstArena = firstArena;
>-            a->arenaIndex = arenaIndex;
>-# endif
>-            arenaIndex = GET_ARENA_INDEX(a);
>-            chunk = GET_ARENA_CHUNK(a, arenaIndex);
>-            chunkInfoIndex = GET_CHUNK_INFO_INDEX(chunk);
>-            if (chunkInfoIndex == NO_FREE_ARENAS) {
>-                chunkInfoIndex = arenaIndex;
>-                SET_CHUNK_INFO_INDEX(chunk, arenaIndex);
>-                ci = GET_CHUNK_INFO(chunk, chunkInfoIndex);
>-                a->prev = NULL;
>-                ci->lastFreeArena = a;
>-                ci->numFreeArenas = 1;
>-                AddChunkToList(rt, ci);
>-            } else {
>-                JS_ASSERT(chunkInfoIndex != arenaIndex);
>-                ci = GET_CHUNK_INFO(chunk, chunkInfoIndex);
>-                JS_ASSERT(ci->numFreeArenas != 0);
>-                JS_ASSERT(ci->lastFreeArena);
>-                JS_ASSERT(a != ci->lastFreeArena);
>-                if (ci->numFreeArenas == js_gcArenasPerChunk - 1) {
>-                    RemoveChunkFromList(rt, ci);
>-                    DestroyGCChunk(chunk);
>-                } else {
>-                    ++ci->numFreeArenas;
>-                    a->prev = ci->lastFreeArena;
>-                    ci->lastFreeArena = a;
>-                }
>-            }
>-        }
>-# endif
>-    }
>-}
>-
>-static void
> InitGCArenaLists(JSRuntime *rt)
> {
>     uintN i, thingSize;
>     JSGCArenaList *arenaList;
> 
>     for (i = 0; i < GC_NUM_FREELISTS; i++) {
>         arenaList = &rt->gcArenaList[i];
>         thingSize = GC_FREELIST_NBYTES(i);
>@@ -1105,27 +690,27 @@ InitGCArenaLists(JSRuntime *rt)
> static void
> FinishGCArenaLists(JSRuntime *rt)
> {
>     uintN i;
>     JSGCArenaList *arenaList;
> 
>     for (i = 0; i < GC_NUM_FREELISTS; i++) {
>         arenaList = &rt->gcArenaList[i];
>-        DestroyGCArenas(rt, arenaList->last);
>         arenaList->last = NULL;
>         arenaList->lastCount = THINGS_PER_ARENA(arenaList->thingSize);
>         arenaList->freeList = NULL;
>     }
>-    DestroyGCArenas(rt, rt->gcDoubleArenaList.first);
>     rt->gcDoubleArenaList.first = NULL;
>     rt->gcDoubleArenaList.nextDoubleFlags = DOUBLE_BITMAP_SENTINEL;
> 
>+    if (rt->gcBase)
>+        free((void *)rt->gcBase);
>+    rt->gcPtr = rt->gcBase = rt->gcEnd = 0;
>     rt->gcBytes = 0;
>-    JS_ASSERT(rt->gcChunkList == 0);
> }
> 
> /*
>  * This function must not be called when thing is jsdouble.
>  */
> static uint8 *
> GetGCThingFlags(void *thing)
> {
>@@ -1229,70 +814,28 @@ typedef struct JSGCRootHashEntry {
>     JSDHashEntryHdr hdr;
>     void            *root;
>     const char      *name;
> } JSGCRootHashEntry;
> 
> /* Initial size of the gcRootsHash table (SWAG, small enough to amortize). */
> #define GC_ROOTS_SIZE   256
> 
>-#if CHUNKED_ARENA_ALLOCATION
>-
>-/*
>- * For a CPU with extremely large pages using them for GC things wastes
>- * too much memory.
>- */
>-# define GC_ARENAS_PER_CPU_PAGE_LIMIT JS_BIT(18 - GC_ARENA_SHIFT)
>-
>-JS_STATIC_ASSERT(GC_ARENAS_PER_CPU_PAGE_LIMIT <= NO_FREE_ARENAS);
>-
>-#endif
>-
> JSBool
>-js_InitGC(JSRuntime *rt, uint32 maxbytes)
>+js_InitGC(JSRuntime *rt, size_t maxbytes)
> {
>+    /* Overallocate so we can make sure our heap is aligned. */
>+    size_t bytes = maxbytes + GC_ARENA_SIZE - 1;
>+    rt->gcBase = uintptr_t(malloc(bytes));

Long-standing silly embedding practice was to pass 0xffffffff or (uint32)-1 for maxbytes. You must not let the addition overflow and wrap to a small size, which you would then buffer-overrun.

Best course is to clamp against a sane upper bound (2^30 on 32 bits, 2^40 on 64?).

Don't test 0 assuming null -> 0, use a void *base = malloc(bytes) temporary.

I thought we were mmapping/virtual-allocating? For a big allocation malloc may well do that (should) but don't we want to decommit physical memory if possible?
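The overflow hazard above can be sketched like so. This is an illustrative fragment, not the patch's code; HEAP_LIMIT, the arena size, and ClampHeapRequest are made-up names standing in for whatever the patch finally uses:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical constants for illustration only. */
#define GC_ARENA_SIZE  ((size_t)4096)
#define HEAP_LIMIT     ((size_t)1 << 30)   /* sane cap on 32-bit */

/* Clamp before padding, so maxbytes + GC_ARENA_SIZE - 1 cannot wrap
 * around to a tiny value when an embedding passes (uint32)-1. */
static size_t
ClampHeapRequest(size_t maxbytes)
{
    if (maxbytes > HEAP_LIMIT)
        maxbytes = HEAP_LIMIT;
    return maxbytes + GC_ARENA_SIZE - 1;
}
```

Without the clamp, (size_t)-1 + 4095 wraps to a small allocation that the heap would then overrun.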

>+    rt->gcPtr = uintptr_t((rt->gcBase + GC_ARENA_SIZE - 1) & (~GC_ARENA_MASK));

uintptr_t -> jsuword

Use GC_ARENA_MASK not GC_ARENA_SIZE - 1 (see other such expressions, e.g.

jsgc.cpp:        chunk = ((jsuword) p + GC_ARENA_MASK) & ~GC_ARENA_MASK;

and follow suit).

Don't overparenthesize ~GC_ARENA_MASK :-P.

/be
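For reference, the rounding idiom the review points at reduces to the following standalone sketch; the arena size here is invented, and in the tree GC_ARENA_SHIFT/GC_ARENA_MASK come from jsgc.cpp:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values; the real constants live in jsgc.cpp. */
#define GC_ARENA_SHIFT  12
#define GC_ARENA_SIZE   ((uintptr_t)1 << GC_ARENA_SHIFT)
#define GC_ARENA_MASK   (GC_ARENA_SIZE - 1)

/* Round an address up to the next arena boundary, matching the
 * existing "(p + GC_ARENA_MASK) & ~GC_ARENA_MASK" idiom (and with
 * no extra parentheses around ~GC_ARENA_MASK). */
static uintptr_t
AlignUpToArena(uintptr_t p)
{
    return (p + GC_ARENA_MASK) & ~GC_ARENA_MASK;
}
```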
Attachment #399618 - Flags: review-
To quote Andreas: "bugzilla should magically delete overcited code for me."

/be
Attached patch new version (obsolete) — Splinter Review
Attachment #399618 - Attachment is obsolete: true
Attached patch changed maxbyte handling (obsolete) — Splinter Review
Attachment #399852 - Attachment is obsolete: true
Attachment #400169 - Flags: review?(brendan)
Attachment #400169 - Flags: review?(brendan) → review+
Comment on attachment 400169 [details] [diff] [review]
changed maxbyte handling

>+    void *base = malloc(bytes);
>+    if (base == NULL)

House style wants if (!base).

s/MAXBYTES_LIMIT/GC_&/g to match other #defines from jsgc.{h,cpp}. r=me with these nits picked. Thanks,

/be
Attached patch new version — Splinter Review
Attachment #400169 - Attachment is obsolete: true
New performance numbers without/with the patch:

TEST                   COMPARISON            FROM                 TO             DETAILS

=============================================================================

** TOTAL **:           1.024x as fast    756.4ms +/- 0.1%   738.5ms +/- 0.2%     significant

=============================================================================

  3d:                  1.035x as fast    115.7ms +/- 0.4%   111.8ms +/- 0.5%     significant
    cube:              1.041x as fast     33.1ms +/- 0.7%    31.8ms +/- 0.9%     significant
    morph:             1.080x as fast     23.0ms +/- 0.0%    21.3ms +/- 1.6%     significant
    raytrace:          1.015x as fast     59.6ms +/- 0.6%    58.7ms +/- 0.6%     significant

  access:              1.028x as fast    115.7ms +/- 0.8%   112.6ms +/- 0.5%     significant
    binary-trees:      1.034x as fast     33.5ms +/- 1.1%    32.4ms +/- 1.1%     significant
    fannkuch:          *1.010x as slow*   49.3ms +/- 0.7%    49.8ms +/- 0.6%     significant
    nbody:             1.148x as fast     21.7ms +/- 1.6%    18.9ms +/- 1.2%     significant
    nsieve:            ??                 11.2ms +/- 2.7%    11.5ms +/- 3.3%     not conclusive: might be *1.027x as slow*

  bitops:              1.059x as fast     32.1ms +/- 1.6%    30.3ms +/- 2.2%     significant
    3bit-bits-in-byte: ??                  1.3ms +/- 26.6%     1.4ms +/- 26.4%     not conclusive: might be *1.077x as slow*
    bits-in-byte:      -                   8.3ms +/- 4.2%     8.3ms +/- 4.2% 
    bitwise-and:       -                   2.1ms +/- 10.8%     2.0ms +/- 0.0% 
    nsieve-bits:       1.097x as fast     20.4ms +/- 1.8%    18.6ms +/- 2.0%     significant

  controlflow:         -                  27.9ms +/- 0.8%    27.9ms +/- 0.8% 
    recursive:         -                  27.9ms +/- 0.8%    27.9ms +/- 0.8% 

  crypto:              1.033x as fast     46.9ms +/- 1.5%    45.4ms +/- 1.3%     significant
    aes:               1.047x as fast     26.7ms +/- 1.8%    25.5ms +/- 2.0%     significant
    md5:               -                  12.9ms +/- 1.8%    12.9ms +/- 1.8% 
    sha1:              -                   7.3ms +/- 4.7%     7.0ms +/- 0.0% 

  date:                1.016x as fast    108.7ms +/- 0.4%   107.0ms +/- 0.5%     significant
    format-tofte:      1.007x as fast     57.1ms +/- 0.4%    56.7ms +/- 0.6%     significant
    format-xparb:      1.026x as fast     51.6ms +/- 0.7%    50.3ms +/- 0.7%     significant

  math:                ??                 25.9ms +/- 3.3%    26.4ms +/- 1.9%     not conclusive: might be *1.019x as slow*
    cordic:            -                   8.6ms +/- 4.3%     8.3ms +/- 4.2% 
    partial-sums:      *1.068x as slow*   11.8ms +/- 2.6%    12.6ms +/- 2.9%     significant
    spectral-norm:     -                   5.5ms +/- 6.8%     5.5ms +/- 6.8% 

  regexp:              -                  34.7ms +/- 1.0%    34.4ms +/- 1.1% 
    dna:               -                  34.7ms +/- 1.0%    34.4ms +/- 1.1% 

  string:              1.025x as fast    248.8ms +/- 0.2%   242.7ms +/- 0.2%     significant
    base64:            1.056x as fast     11.3ms +/- 3.1%    10.7ms +/- 3.2%     significant
    fasta:             1.055x as fast     53.6ms +/- 0.7%    50.8ms +/- 0.6%     significant
    tagcloud:          1.006x as fast     80.6ms +/- 0.5%    80.1ms +/- 0.3%     significant
    unpack-code:       1.022x as fast     80.5ms +/- 0.5%    78.8ms +/- 0.4%     significant
    validate-input:    1.022x as fast     22.8ms +/- 1.3%    22.3ms +/- 1.5%     significant
Blocks: 504640
Gregor Wagner contributed significantly to this patch.

http://hg.mozilla.org/tracemonkey/rev/5f449dffdff5
Whiteboard: fixed-in-tracemonkey
No longer blocks: 504640
http://hg.mozilla.org/mozilla-central/rev/5f449dffdff5
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
http://hg.mozilla.org/mozilla-central/rev/99c3d3e1722d backs this out on mozilla-central to see if it's the cause of the Tp4 heap size regression.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Backed out from TM too: http://hg.mozilla.org/tracemonkey/rev/10c17aa517b9
Whiteboard: fixed-in-tracemonkey
I was looking at the regressions for MacOSX Darwin 9.0.0 tracemonkey talos tp4 pbytes.

(in millions) 
120 initial before patch 
153 applying patch
122 after removing memset((void *)rt->gcBase, 0, bytes);
123 right before big merge
140 merge
138 right before back out
136 after back out
138 last build

I learned from the newsgroup today:
It looks like it means virtual memory utilization on all the platforms except for OS X, so the regression there probably isn't good...
and:
- On OS X it is the task's resident size. (bug 516497)
This shows that this patch was not the reason for the regression on OS X.
Please do try calloc. See bug 518730 comment 0.

/be
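For the curious, the difference between the two allocation strategies measured below is roughly this (an illustrative sketch, not the patch's code):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* calloc may return pages the OS has already zeroed, so the heap is
 * faulted in lazily as it is actually used. */
static void *
AllocZeroedLazy(size_t bytes)
{
    return calloc(1, bytes);
}

/* malloc + memset touches every page up front, which is the page-fault
 * cost the memset was blamed for on OS X. */
static void *
AllocZeroedEager(size_t bytes)
{
    void *base = malloc(bytes);
    if (base)
        memset(base, 0, bytes);   /* faults in every page immediately */
    return base;
}
```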
I see a minimal performance win on my system. For the regression tests I would need access to the try-server.
FROM = malloc
TO   = calloc

BTW: malloc + memset((void *)rt->gcBase, 0, bytes) also runs at about 1208ms.

TEST                   COMPARISON            FROM                 TO             DETAILS

=============================================================================

** TOTAL **:           -                 1211.1ms +/- 0.2%   1208.8ms +/- 0.1% 

=============================================================================

  3d:                  1.004x as fast     177.4ms +/- 0.3%    176.7ms +/- 0.2%     significant
    cube:              1.010x as fast      50.8ms +/- 0.6%     50.3ms +/- 0.7%     significant
    morph:             -                   36.7ms +/- 0.9%     36.6ms +/- 1.0% 
    raytrace:          -                   89.9ms +/- 0.3%     89.8ms +/- 0.3% 

  access:              -                  197.4ms +/- 0.4%    197.0ms +/- 0.4% 
    binary-trees:      ??                  52.5ms +/- 0.7%     52.7ms +/- 0.7%     not conclusive: might be *1.004x as slow*
    fannkuch:          -                   90.6ms +/- 0.4%     90.6ms +/- 0.4% 
    nbody:             -                   35.6ms +/- 1.0%     35.3ms +/- 1.0% 
    nsieve:            -                   18.7ms +/- 1.8%     18.4ms +/- 2.0% 

  bitops:              -                   53.2ms +/- 1.1%     53.2ms +/- 1.1% 
    3bit-bits-in-byte: -                    2.3ms +/- 15.0%      2.3ms +/- 15.0% 
    bits-in-byte:      -                   14.4ms +/- 2.6%     14.4ms +/- 2.6% 
    bitwise-and:       -                    2.9ms +/- 7.8%      2.8ms +/- 10.8% 
    nsieve-bits:       ??                  33.6ms +/- 1.1%     33.7ms +/- 1.0%     not conclusive: might be *1.003x as slow*

  controlflow:         -                   51.0ms +/- 0.0%     51.0ms +/- 0.0% 
    recursive:         -                   51.0ms +/- 0.0%     51.0ms +/- 0.0% 

  crypto:              -                   72.0ms +/- 2.3%     71.6ms +/- 1.3% 
    aes:               -                   41.9ms +/- 3.6%     41.3ms +/- 2.5% 
    md5:               -                   19.1ms +/- 1.2%     19.1ms +/- 1.2% 
    sha1:              *1.018x as slow*    11.0ms +/- 0.0%     11.2ms +/- 2.7%     significant

  date:                -                  163.3ms +/- 0.3%    163.1ms +/- 0.2% 
    format-tofte:      -                   86.5ms +/- 0.4%     86.3ms +/- 0.4% 
    format-xparb:      -                   76.8ms +/- 0.4%     76.8ms +/- 0.4% 

  math:                ??                  44.6ms +/- 1.4%     44.7ms +/- 0.8%     not conclusive: might be *1.002x as slow*
    cordic:            ??                  14.9ms +/- 1.5%     15.0ms +/- 0.0%     not conclusive: might be *1.007x as slow*
    partial-sums:      ??                  20.5ms +/- 1.8%     20.6ms +/- 1.8%     not conclusive: might be *1.005x as slow*
    spectral-norm:     -                    9.2ms +/- 3.3%      9.1ms +/- 2.5% 

  regexp:              -                   59.4ms +/- 0.6%     59.2ms +/- 0.5% 
    dna:               -                   59.4ms +/- 0.6%     59.2ms +/- 0.5% 

  string:              -                  392.8ms +/- 0.2%    392.3ms +/- 0.2% 
    base64:            -                   16.8ms +/- 1.8%     16.4ms +/- 2.3% 
    fasta:             -                   83.6ms +/- 0.4%     83.5ms +/- 0.5% 
    tagcloud:          ??                 124.3ms +/- 0.3%    124.5ms +/- 0.3%     not conclusive: might be *1.002x as slow*
    unpack-code:       -                  132.8ms +/- 0.3%    132.7ms +/- 0.3% 
    validate-input:    -                   35.3ms +/- 1.0%     35.2ms +/- 0.9%
This made it onto 1.9.2 somehow. It shouldn't be there. It's backed out from TM.
(In reply to comment #34)
> This made it onto 1.9.2 somehow. It shouldn't be there. Its backed out from TM.

I accidentally pushed this and its backout:

http://hg.mozilla.org/releases/mozilla-1.9.2/rev/f6bcaffb09ef

so it's not there. sorry for the confusion.
Assignee: gal → anygregor
Gregor, do we still need this?
The intention was:
The attached patch allocates the GC heap in a single chunk. This allows very
fast identification of GC things, which is handy for conservative stack
scanning.

This patch is less important now because we increased the chunk size. We might want something similar in our new compartments. 
We should measure the impact once conservative stack scanning is enabled and decide if this is still necessary.
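The "very fast identification of GC things" reduces to a single range check once the heap is one contiguous chunk; a sketch with invented field names (the patch keeps gcBase/gcEnd on JSRuntime):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in for the runtime's heap bounds. */
typedef struct {
    uintptr_t gcBase;   /* first byte of the single GC heap chunk */
    uintptr_t gcEnd;    /* one past the last byte */
} HeapBounds;

/* A word found by a conservative stack scan can only be a GC thing
 * if it points into [gcBase, gcEnd) -- one compare pair, no chunk
 * list or hash lookup. */
static bool
MightBeGCThing(const HeapBounds *h, uintptr_t w)
{
    return w >= h->gcBase && w < h->gcEnd;
}
```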
Status: REOPENED → RESOLVED
Closed: 15 years ago13 years ago
Resolution: --- → WONTFIX