Closed Bug 562553 Opened 15 years ago Closed 15 years ago

TM: stupidly cache double-to-string conversions to speed up stupid v8-splay.js

Categories

(Core :: JavaScript Engine, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: n.nethercote, Assigned: n.nethercote)

Details

(Whiteboard: fixed-in-tracemonkey)

Attachments

(1 file, 2 obsolete files)

I took a quick look at v8-splay.js with Cachegrind: -------------------------------------------------------------------------------- Ir -------------------------------------------------------------------------------- 3,926,401,052 PROGRAM TOTALS -------------------------------------------------------------------------------- Ir file:function -------------------------------------------------------------------------------- 1,549,212,505 /home/njn/moz/ws0/js/src/optg32/../dtoa.c:js_dtostr 291,775,264 /home/njn/moz/ws0/js/src/optg32/../dtoa.c:multadd 200,155,572 /build/buildd/eglibc-2.10.1/malloc/malloc.c:_int_malloc 173,398,709 /home/njn/moz/ws0/js/src/optg32/../jsgc.cpp:js_CallGCMarker 172,717,716 ???:??? 118,097,974 /home/njn/moz/ws0/js/src/optg32/../jsops.cpp:js_Interpret 81,618,330 /build/buildd/eglibc-2.10.1/malloc/malloc.c:_int_free 77,324,952 /home/njn/moz/ws0/js/src/optg32/../dtoa.c:lshift Wow, what a lot of double-to-string conversions. They come from this function: Digging some more, we have lots of calls to js_NumberToString(). They are due to this function: function GeneratePayloadTree(depth, key) { if (depth == 0) { return { array : [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ], string : 'String for key ' + key + ' in leaf node' }; } else { return { left: GeneratePayloadTree(depth - 1, key), right: GeneratePayloadTree(depth - 1, key) }; } } This is called with 'depth'==5 and 'key'==a random generated double. Because it's recursive, we end up converting 'key' 32 times. This function is called a zillion times. I tried changing the benchmark to do the conversion first, it ran about 2.3x faster: Index: tests/v8-v4/v8-splay.js =================================================================== --- tests/v8-v4/v8-splay.js (revision 52427) +++ tests/v8-v4/v8-splay.js (working copy) @@ -69,7 +69,7 @@ do { key = GenerateKey(); } while (splayTree.find(key) != null); - splayTree.insert(key, GeneratePayloadTree(kSplayTreePayloadDepth, key)); + splayTree.insert(key, GeneratePayloadTree(kSplayTreePayloadDepth, "" + key)); return key; } If I instead generate the entire string first, the 2.3x improves to about 2.8x: Index: tests/v8-v4/v8-splay.js =================================================================== --- tests/v8-v4/v8-splay.js (revision 52427) +++ tests/v8-v4/v8-splay.js (working copy) @@ -45,7 +45,7 @@ if (depth == 0) { return { array : [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ], - string : 'String for key ' + key + ' in leaf node' + string : key }; } else { return { @@ -69,7 +69,7 @@ do { key = GenerateKey(); } while (splayTree.find(key) != null); - splayTree.insert(key, GeneratePayloadTree(kSplayTreePayloadDepth, key)); + splayTree.insert(key, GeneratePayloadTree(kSplayTreePayloadDepth, 'String for key ' + key + ' in leaf node')); return key; } The program has this rather amusing/inaccurate comment at the top: // This benchmark is based on a JavaScript log processing module used // by the V8 profiler to generate execution time profiles for runs of // JavaScript applications, and it effectively measures how fast the // JavaScript engine is at allocating nodes and reclaiming the memory // used for old nodes. Because of the way splay trees work, the engine // also has to deal with a lot of changes to the large tree object // graph. I see two options here. If we assume we can't change the benchmark, we should just cache the last double-to-string conversion -- a stupid hack for a stupid benchmark. Alternatively, we could file a bug with Google and try to get them to improve it. Or we could do both.
Hmm. Any idea how other engines handle this case?
++njn for the profiling work. nice find.
Attached patch patch v2 (obsolete) — Splinter Review
This is just Andreas's patch with a couple of extra comments. It looks good to me but I'm not expert on thread-local stuff so someone who knows about that should check it. I see a 2.3x speedup on i386/mac and 1.7x on X64/mac; not sure why there's such a discrepancy, it seems that js_dtostr is cheaper on 64-bit. These improvements result in respective 10% and 6% improvements for V8 overall.
Attachment #442320 - Attachment is obsolete: true
Comment on attachment 442329 [details] [diff] [review] patch v2 >diff --git a/js/src/jscntxt.cpp b/js/src/jscntxt.cpp >--- a/js/src/jscntxt.cpp >+++ b/js/src/jscntxt.cpp >@@ -175,16 +175,18 @@ JSThreadData::purge(JSContext *cx) > if (cx->runtime->gcRegenShapes) > traceMonitor.needFlush = JS_TRUE; > #endif > > /* Destroy eval'ed scripts. */ > js_DestroyScriptsToGC(cx, this); > > js_PurgeCachedNativeEnumerators(cx, this); >+ >+ dtoaCache.s = NULL; > } > > void > JSThreadData::purgeGCFreeLists() > { > if (!localRootStack) { > gcFreeLists.purge(); > } else { >diff --git a/js/src/jscntxt.h b/js/src/jscntxt.h >--- a/js/src/jscntxt.h >+++ b/js/src/jscntxt.h >@@ -551,16 +551,24 @@ struct JSThreadData { > > #ifdef JS_EVAL_CACHE_METERING > JSEvalCacheMeter evalCacheMeter; > #endif > > /* State used by dtoa.c. */ > DtoaState *dtoaState; > >+ /* State used to cache some double-to-string conversions. Sometimes we >+ * get lucky and are asked to convert the same double twice in a row. */ Comment style in the JS engine: /* * */ >+ struct { >+ jsdouble d; >+ jsint base; >+ JSString *s; // if s==NULL, d and base are not valid Maybe stick to /* */ ? But // seems ok too >+ } dtoaCache; >+ > /* > * Cache of reusable JSNativeEnumerators mapped by shape identifiers (as > * stored in scope->shape). This cache is nulled by the GC and protected > * by gcLock. > */ > #define NATIVE_ENUM_CACHE_LOG2 8 > #define NATIVE_ENUM_CACHE_MASK JS_BITMASK(NATIVE_ENUM_CACHE_LOG2) > #define NATIVE_ENUM_CACHE_SIZE JS_BIT(NATIVE_ENUM_CACHE_LOG2) >diff --git a/js/src/jsnum.cpp b/js/src/jsnum.cpp >--- a/js/src/jsnum.cpp >+++ b/js/src/jsnum.cpp >@@ -880,22 +880,28 @@ js_NumberToStringWithBase(JSContext *cx, > if (base == 10 && jsuint(i) < INT_STRING_LIMIT) > return JSString::intString(i); > if (jsuint(i) < jsuint(base)) { > if (i < 10) > return JSString::intString(i); > return JSString::unitString(jschar('a' + i - 10)); > } > } >+ JSThreadData *data = JS_THREAD_DATA(cx); >+ if (data->dtoaCache.s && data->dtoaCache.base == base && data->dtoaCache.d == d) >+ return data->dtoaCache.s; > numStr = NumberToCString(cx, d, base, buf, sizeof buf); > if (!numStr) > return NULL; > s = JS_NewStringCopyZ(cx, numStr); > if (!(numStr >= buf && numStr < buf + sizeof buf)) > js_free(numStr); >+ data->dtoaCache.base = base; >+ data->dtoaCache.d = d; >+ data->dtoaCache.s = s; > return s; > } > > JSString * JS_FASTCALL > js_NumberToString(JSContext *cx, jsdouble d) > { > return js_NumberToStringWithBase(cx, d, 10); > }
Summary: TM: analyze v8-splay.js → TM: cache double-to-string conversions to speed up v8-splay.js
> >+ /* State used to cache some double-to-string conversions. Sometimes we > >+ * get lucky and are asked to convert the same double twice in a row. */ > > Comment style in the JS engine: > > /* > * > */ I'll fix. > > >+ struct { > >+ jsdouble d; > >+ jsint base; > >+ JSString *s; // if s==NULL, d and base are not valid > > Maybe stick to /* */ ? But // seems ok too // is used elsewhere for one-liners. Andreas on nit patrol, who would have thought? :P
Attached patch patch v3Splinter Review
Attachment #442329 - Attachment is obsolete: true
Summary: TM: cache double-to-string conversions to speed up v8-splay.js → TM: stupidly cache double-to-string conversions to speed up stupid v8-splay.js
Whiteboard: fixed-in-tracemonkey
Google has a bug tracking the problem with v8-splay.js: http://code.google.com/p/v8/issues/detail?id=690
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
What about caching or optimizing: string : 'String for key ' + key + ' in leaf node'? That is also called 32 times with the same key.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: