Closed Bug 562553 Opened 14 years ago Closed 14 years ago

TM: stupidly cache double-to-string conversions to speed up stupid v8-splay.js

Categories

(Core :: JavaScript Engine, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: n.nethercote, Assigned: n.nethercote)

Details

(Whiteboard: fixed-in-tracemonkey)

Attachments

(1 file, 2 obsolete files)

I took a quick look at v8-splay.js with Cachegrind:

--------------------------------------------------------------------------------
          Ir
--------------------------------------------------------------------------------
3,926,401,052  PROGRAM TOTALS

--------------------------------------------------------------------------------
          Ir  file:function
--------------------------------------------------------------------------------
1,549,212,505  /home/njn/moz/ws0/js/src/optg32/../dtoa.c:js_dtostr
 291,775,264  /home/njn/moz/ws0/js/src/optg32/../dtoa.c:multadd
 200,155,572  /build/buildd/eglibc-2.10.1/malloc/malloc.c:_int_malloc
 173,398,709  /home/njn/moz/ws0/js/src/optg32/../jsgc.cpp:js_CallGCMarker
 172,717,716  ???:???
 118,097,974  /home/njn/moz/ws0/js/src/optg32/../jsops.cpp:js_Interpret
  81,618,330  /build/buildd/eglibc-2.10.1/malloc/malloc.c:_int_free
  77,324,952  /home/njn/moz/ws0/js/src/optg32/../dtoa.c:lshift

Wow, what a lot of double-to-string conversions.  They come from this function:


Digging some more, we have lots of calls to js_NumberToString().  They are due to this function:

function GeneratePayloadTree(depth, key) {
  if (depth == 0) {
    return {
      array  : [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ],
      string : 'String for key ' + key + ' in leaf node'
    };
  } else {
    return {
      left:  GeneratePayloadTree(depth - 1, key),
      right: GeneratePayloadTree(depth - 1, key)
    };
  }
}

This is called with 'depth'==5 and 'key'==a random generated double.  Because it's recursive, we end up converting 'key' 32 times.  This function is called a zillion times.

I tried changing the benchmark to do the conversion first, it ran about 2.3x faster:

Index: tests/v8-v4/v8-splay.js
===================================================================
--- tests/v8-v4/v8-splay.js     (revision 52427)
+++ tests/v8-v4/v8-splay.js     (working copy)
@@ -69,7 +69,7 @@
   do {
     key = GenerateKey();
   } while (splayTree.find(key) != null);
-  splayTree.insert(key, GeneratePayloadTree(kSplayTreePayloadDepth, key));
+  splayTree.insert(key, GeneratePayloadTree(kSplayTreePayloadDepth, "" + key));
   return key;
 }


If I instead generate the entire string first, the 2.3x improves to about 2.8x:

Index: tests/v8-v4/v8-splay.js
===================================================================
--- tests/v8-v4/v8-splay.js     (revision 52427)
+++ tests/v8-v4/v8-splay.js     (working copy)
@@ -45,7 +45,7 @@
   if (depth == 0) {
     return {
       array  : [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ],
-      string : 'String for key ' + key + ' in leaf node'
+      string : key
     };
   } else {
     return {
@@ -69,7 +69,7 @@
   do {
     key = GenerateKey();
   } while (splayTree.find(key) != null);
-  splayTree.insert(key, GeneratePayloadTree(kSplayTreePayloadDepth, key));
+  splayTree.insert(key, GeneratePayloadTree(kSplayTreePayloadDepth, 'String for key ' + key + ' in leaf node'));
   return key;
 }

The program has this rather amusing/inaccurate comment at the top:

// This benchmark is based on a JavaScript log processing module used
// by the V8 profiler to generate execution time profiles for runs of
// JavaScript applications, and it effectively measures how fast the
// JavaScript engine is at allocating nodes and reclaiming the memory
// used for old nodes. Because of the way splay trees work, the engine
// also has to deal with a lot of changes to the large tree object
// graph.

I see two options here.  If we assume we can't change the benchmark, we should just cache the last double-to-string conversion -- a stupid hack for a stupid benchmark.

Alternatively, we could file a bug with Google and try to get them to improve it.

Or we could do both.
Hmm.  Any idea how other engines handle this case?
++njn for the profiling work. nice find.
Attached patch patch v2 (obsolete) — Splinter Review
This is just Andreas's patch with a couple of extra comments.  It looks good to me but I'm not expert on thread-local stuff so someone who knows about that should check it.

I see a 2.3x speedup on i386/mac and 1.7x on X64/mac;  not sure why there's such a discrepancy, it seems that js_dtostr is cheaper on 64-bit.  These improvements result in respective 10% and 6% improvements for V8 overall.
Attachment #442320 - Attachment is obsolete: true
Comment on attachment 442329 [details] [diff] [review]
patch v2

>diff --git a/js/src/jscntxt.cpp b/js/src/jscntxt.cpp
>--- a/js/src/jscntxt.cpp
>+++ b/js/src/jscntxt.cpp
>@@ -175,16 +175,18 @@ JSThreadData::purge(JSContext *cx)
>     if (cx->runtime->gcRegenShapes)
>         traceMonitor.needFlush = JS_TRUE;
> #endif
> 
>     /* Destroy eval'ed scripts. */
>     js_DestroyScriptsToGC(cx, this);
> 
>     js_PurgeCachedNativeEnumerators(cx, this);
>+
>+    dtoaCache.s = NULL;
> }
> 
> void
> JSThreadData::purgeGCFreeLists()
> {
>     if (!localRootStack) {
>         gcFreeLists.purge();
>     } else {
>diff --git a/js/src/jscntxt.h b/js/src/jscntxt.h
>--- a/js/src/jscntxt.h
>+++ b/js/src/jscntxt.h
>@@ -551,16 +551,24 @@ struct JSThreadData {
> 
> #ifdef JS_EVAL_CACHE_METERING
>     JSEvalCacheMeter    evalCacheMeter;
> #endif
> 
>     /* State used by dtoa.c. */
>     DtoaState           *dtoaState;
> 
>+    /* State used to cache some double-to-string conversions.  Sometimes we
>+     * get lucky and are asked to convert the same double twice in a row. */

Comment style in the JS engine:

/*
 *
 */

>+    struct {
>+        jsdouble d;
>+        jsint    base;
>+        JSString *s;        // if s==NULL, d and base are not valid

Maybe stick to /* */ ? But // seems ok too

>+    } dtoaCache;
>+
>     /*
>      * Cache of reusable JSNativeEnumerators mapped by shape identifiers (as
>      * stored in scope->shape). This cache is nulled by the GC and protected
>      * by gcLock.
>      */
> #define NATIVE_ENUM_CACHE_LOG2  8
> #define NATIVE_ENUM_CACHE_MASK  JS_BITMASK(NATIVE_ENUM_CACHE_LOG2)
> #define NATIVE_ENUM_CACHE_SIZE  JS_BIT(NATIVE_ENUM_CACHE_LOG2)
>diff --git a/js/src/jsnum.cpp b/js/src/jsnum.cpp
>--- a/js/src/jsnum.cpp
>+++ b/js/src/jsnum.cpp
>@@ -880,22 +880,28 @@ js_NumberToStringWithBase(JSContext *cx,
>         if (base == 10 && jsuint(i) < INT_STRING_LIMIT)
>             return JSString::intString(i);
>         if (jsuint(i) < jsuint(base)) {
>             if (i < 10)
>                 return JSString::intString(i);
>             return JSString::unitString(jschar('a' + i - 10));
>         }
>     }
>+    JSThreadData *data = JS_THREAD_DATA(cx);
>+    if (data->dtoaCache.s && data->dtoaCache.base == base && data->dtoaCache.d == d)
>+        return data->dtoaCache.s;
>     numStr = NumberToCString(cx, d, base, buf, sizeof buf);
>     if (!numStr)
>         return NULL;
>     s = JS_NewStringCopyZ(cx, numStr);
>     if (!(numStr >= buf && numStr < buf + sizeof buf))
>         js_free(numStr);
>+    data->dtoaCache.base = base;
>+    data->dtoaCache.d = d;
>+    data->dtoaCache.s = s;
>     return s;
> }
> 
> JSString * JS_FASTCALL
> js_NumberToString(JSContext *cx, jsdouble d)
> {
>     return js_NumberToStringWithBase(cx, d, 10);
> }
Summary: TM: analyze v8-splay.js → TM: cache double-to-string conversions to speed up v8-splay.js
> >+    /* State used to cache some double-to-string conversions.  Sometimes we
> >+     * get lucky and are asked to convert the same double twice in a row. */
> 
> Comment style in the JS engine:
> 
> /*
>  *
>  */

I'll fix.

> 
> >+    struct {
> >+        jsdouble d;
> >+        jsint    base;
> >+        JSString *s;        // if s==NULL, d and base are not valid
> 
> Maybe stick to /* */ ? But // seems ok too

// is used elsewhere for one-liners.

Andreas on nit patrol, who would have thought? :P
Attached patch patch v3Splinter Review
Attachment #442329 - Attachment is obsolete: true
Summary: TM: cache double-to-string conversions to speed up v8-splay.js → TM: stupidly cache double-to-string conversions to speed up stupid v8-splay.js
http://hg.mozilla.org/tracemonkey/rev/74ef6155d096
Whiteboard: fixed-in-tracemonkey
Google has a bug tracking the problem with v8-splay.js:

http://code.google.com/p/v8/issues/detail?id=690
http://hg.mozilla.org/mozilla-central/rev/74ef6155d096
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
What about caching or optimizing: string : 'String for key ' + key + ' in leaf node'?
That is also called 32 times with the same key.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: