Closed
Bug 562553
Opened 14 years ago
Closed 14 years ago
TM: stupidly cache double-to-string conversions to speed up stupid v8-splay.js
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
RESOLVED
FIXED
People
(Reporter: n.nethercote, Assigned: n.nethercote)
Details
(Whiteboard: fixed-in-tracemonkey)
Attachments
(1 file, 2 obsolete files)
2.51 KB,
patch
|
cdleary
:
review+
|
Details | Diff | Splinter Review |
I took a quick look at v8-splay.js with Cachegrind: -------------------------------------------------------------------------------- Ir -------------------------------------------------------------------------------- 3,926,401,052 PROGRAM TOTALS -------------------------------------------------------------------------------- Ir file:function -------------------------------------------------------------------------------- 1,549,212,505 /home/njn/moz/ws0/js/src/optg32/../dtoa.c:js_dtostr 291,775,264 /home/njn/moz/ws0/js/src/optg32/../dtoa.c:multadd 200,155,572 /build/buildd/eglibc-2.10.1/malloc/malloc.c:_int_malloc 173,398,709 /home/njn/moz/ws0/js/src/optg32/../jsgc.cpp:js_CallGCMarker 172,717,716 ???:??? 118,097,974 /home/njn/moz/ws0/js/src/optg32/../jsops.cpp:js_Interpret 81,618,330 /build/buildd/eglibc-2.10.1/malloc/malloc.c:_int_free 77,324,952 /home/njn/moz/ws0/js/src/optg32/../dtoa.c:lshift Wow, what a lot of double-to-string conversions. They come from this function: Digging some more, we have lots of calls to js_NumberToString(). They are due to this function: function GeneratePayloadTree(depth, key) { if (depth == 0) { return { array : [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ], string : 'String for key ' + key + ' in leaf node' }; } else { return { left: GeneratePayloadTree(depth - 1, key), right: GeneratePayloadTree(depth - 1, key) }; } } This is called with 'depth'==5 and 'key'==a random generated double. Because it's recursive, we end up converting 'key' 32 times. This function is called a zillion times. I tried changing the benchmark to do the conversion first, it ran about 2.3x faster: Index: tests/v8-v4/v8-splay.js =================================================================== --- tests/v8-v4/v8-splay.js (revision 52427) +++ tests/v8-v4/v8-splay.js (working copy) @@ -69,7 +69,7 @@ do { key = GenerateKey(); } while (splayTree.find(key) != null); - splayTree.insert(key, GeneratePayloadTree(kSplayTreePayloadDepth, key)); + splayTree.insert(key, GeneratePayloadTree(kSplayTreePayloadDepth, "" + key)); return key; } If I instead generate the entire string first, the 2.3x improves to about 2.8x: Index: tests/v8-v4/v8-splay.js =================================================================== --- tests/v8-v4/v8-splay.js (revision 52427) +++ tests/v8-v4/v8-splay.js (working copy) @@ -45,7 +45,7 @@ if (depth == 0) { return { array : [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ], - string : 'String for key ' + key + ' in leaf node' + string : key }; } else { return { @@ -69,7 +69,7 @@ do { key = GenerateKey(); } while (splayTree.find(key) != null); - splayTree.insert(key, GeneratePayloadTree(kSplayTreePayloadDepth, key)); + splayTree.insert(key, GeneratePayloadTree(kSplayTreePayloadDepth, 'String for key ' + key + ' in leaf node')); return key; } The program has this rather amusing/inaccurate comment at the top: // This benchmark is based on a JavaScript log processing module used // by the V8 profiler to generate execution time profiles for runs of // JavaScript applications, and it effectively measures how fast the // JavaScript engine is at allocating nodes and reclaiming the memory // used for old nodes. Because of the way splay trees work, the engine // also has to deal with a lot of changes to the large tree object // graph. I see two options here. If we assume we can't change the benchmark, we should just cache the last double-to-string conversion -- a stupid hack for a stupid benchmark. Alternatively, we could file a bug with Google and try to get them to improve it. Or we could do both.
Comment 1•14 years ago
|
||
Hmm. Any idea how other engines handle this case?
Comment 2•14 years ago
|
||
Comment 3•14 years ago
|
||
++njn for the profiling work. nice find.
Assignee | ||
Comment 4•14 years ago
|
||
This is just Andreas's patch with a couple of extra comments. It looks good to me but I'm not expert on thread-local stuff so someone who knows about that should check it. I see a 2.3x speedup on i386/mac and 1.7x on X64/mac; not sure why there's such a discrepancy, it seems that js_dtostr is cheaper on 64-bit. These improvements result in respective 10% and 6% improvements for V8 overall.
Attachment #442320 -
Attachment is obsolete: true
Comment 5•14 years ago
|
||
Comment on attachment 442329 [details] [diff] [review] patch v2 >diff --git a/js/src/jscntxt.cpp b/js/src/jscntxt.cpp >--- a/js/src/jscntxt.cpp >+++ b/js/src/jscntxt.cpp >@@ -175,16 +175,18 @@ JSThreadData::purge(JSContext *cx) > if (cx->runtime->gcRegenShapes) > traceMonitor.needFlush = JS_TRUE; > #endif > > /* Destroy eval'ed scripts. */ > js_DestroyScriptsToGC(cx, this); > > js_PurgeCachedNativeEnumerators(cx, this); >+ >+ dtoaCache.s = NULL; > } > > void > JSThreadData::purgeGCFreeLists() > { > if (!localRootStack) { > gcFreeLists.purge(); > } else { >diff --git a/js/src/jscntxt.h b/js/src/jscntxt.h >--- a/js/src/jscntxt.h >+++ b/js/src/jscntxt.h >@@ -551,16 +551,24 @@ struct JSThreadData { > > #ifdef JS_EVAL_CACHE_METERING > JSEvalCacheMeter evalCacheMeter; > #endif > > /* State used by dtoa.c. */ > DtoaState *dtoaState; > >+ /* State used to cache some double-to-string conversions. Sometimes we >+ * get lucky and are asked to convert the same double twice in a row. */ Comment style in the JS engine: /* * */ >+ struct { >+ jsdouble d; >+ jsint base; >+ JSString *s; // if s==NULL, d and base are not valid Maybe stick to /* */ ? But // seems ok too >+ } dtoaCache; >+ > /* > * Cache of reusable JSNativeEnumerators mapped by shape identifiers (as > * stored in scope->shape). This cache is nulled by the GC and protected > * by gcLock. > */ > #define NATIVE_ENUM_CACHE_LOG2 8 > #define NATIVE_ENUM_CACHE_MASK JS_BITMASK(NATIVE_ENUM_CACHE_LOG2) > #define NATIVE_ENUM_CACHE_SIZE JS_BIT(NATIVE_ENUM_CACHE_LOG2) >diff --git a/js/src/jsnum.cpp b/js/src/jsnum.cpp >--- a/js/src/jsnum.cpp >+++ b/js/src/jsnum.cpp >@@ -880,22 +880,28 @@ js_NumberToStringWithBase(JSContext *cx, > if (base == 10 && jsuint(i) < INT_STRING_LIMIT) > return JSString::intString(i); > if (jsuint(i) < jsuint(base)) { > if (i < 10) > return JSString::intString(i); > return JSString::unitString(jschar('a' + i - 10)); > } > } >+ JSThreadData *data = JS_THREAD_DATA(cx); >+ if (data->dtoaCache.s && data->dtoaCache.base == base && data->dtoaCache.d == d) >+ return data->dtoaCache.s; > numStr = NumberToCString(cx, d, base, buf, sizeof buf); > if (!numStr) > return NULL; > s = JS_NewStringCopyZ(cx, numStr); > if (!(numStr >= buf && numStr < buf + sizeof buf)) > js_free(numStr); >+ data->dtoaCache.base = base; >+ data->dtoaCache.d = d; >+ data->dtoaCache.s = s; > return s; > } > > JSString * JS_FASTCALL > js_NumberToString(JSContext *cx, jsdouble d) > { > return js_NumberToStringWithBase(cx, d, 10); > }
Assignee | ||
Updated•14 years ago
|
Summary: TM: analyze v8-splay.js → TM: cache double-to-string conversions to speed up v8-splay.js
Assignee | ||
Comment 6•14 years ago
|
||
> >+ /* State used to cache some double-to-string conversions. Sometimes we > >+ * get lucky and are asked to convert the same double twice in a row. */ > > Comment style in the JS engine: > > /* > * > */ I'll fix. > > >+ struct { > >+ jsdouble d; > >+ jsint base; > >+ JSString *s; // if s==NULL, d and base are not valid > > Maybe stick to /* */ ? But // seems ok too // is used elsewhere for one-liners. Andreas on nit patrol, who would have thought? :P
Assignee | ||
Comment 7•14 years ago
|
||
Assignee | ||
Updated•14 years ago
|
Attachment #442329 -
Attachment is obsolete: true
Updated•14 years ago
|
Attachment #442333 -
Flags: review+
Assignee | ||
Updated•14 years ago
|
Summary: TM: cache double-to-string conversions to speed up v8-splay.js → TM: stupidly cache double-to-string conversions to speed up stupid v8-splay.js
Assignee | ||
Comment 8•14 years ago
|
||
http://hg.mozilla.org/tracemonkey/rev/74ef6155d096
Whiteboard: fixed-in-tracemonkey
Assignee | ||
Comment 9•14 years ago
|
||
Google has a bug tracking the problem with v8-splay.js: http://code.google.com/p/v8/issues/detail?id=690
Comment 10•14 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/74ef6155d096
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Comment 11•14 years ago
|
||
What about caching or optimizing: string : 'String for key ' + key + ' in leaf node'? That is also called 32 times with the same key.
You need to log in
before you can comment on or make changes to this bug.
Description
•