Closed
Bug 562553
Opened 15 years ago
Closed 15 years ago
TM: stupidly cache double-to-string conversions to speed up stupid v8-splay.js
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
RESOLVED
FIXED
People
(Reporter: n.nethercote, Assigned: n.nethercote)
Details
(Whiteboard: fixed-in-tracemonkey)
Attachments
(1 file, 2 obsolete files)
2.51 KB,
patch
|
cdleary
:
review+
|
Details | Diff | Splinter Review |
I took a quick look at v8-splay.js with Cachegrind:
--------------------------------------------------------------------------------
Ir
--------------------------------------------------------------------------------
3,926,401,052 PROGRAM TOTALS
--------------------------------------------------------------------------------
Ir file:function
--------------------------------------------------------------------------------
1,549,212,505 /home/njn/moz/ws0/js/src/optg32/../dtoa.c:js_dtostr
291,775,264 /home/njn/moz/ws0/js/src/optg32/../dtoa.c:multadd
200,155,572 /build/buildd/eglibc-2.10.1/malloc/malloc.c:_int_malloc
173,398,709 /home/njn/moz/ws0/js/src/optg32/../jsgc.cpp:js_CallGCMarker
172,717,716 ???:???
118,097,974 /home/njn/moz/ws0/js/src/optg32/../jsops.cpp:js_Interpret
81,618,330 /build/buildd/eglibc-2.10.1/malloc/malloc.c:_int_free
77,324,952 /home/njn/moz/ws0/js/src/optg32/../dtoa.c:lshift
Wow, what a lot of double-to-string conversions. They come from this function:
Digging some more, we have lots of calls to js_NumberToString(). They are due to this function:
function GeneratePayloadTree(depth, key) {
if (depth == 0) {
return {
array : [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ],
string : 'String for key ' + key + ' in leaf node'
};
} else {
return {
left: GeneratePayloadTree(depth - 1, key),
right: GeneratePayloadTree(depth - 1, key)
};
}
}
This is called with 'depth'==5 and 'key'==a random generated double. Because it's recursive, we end up converting 'key' 32 times. This function is called a zillion times.
I tried changing the benchmark to do the conversion first, it ran about 2.3x faster:
Index: tests/v8-v4/v8-splay.js
===================================================================
--- tests/v8-v4/v8-splay.js (revision 52427)
+++ tests/v8-v4/v8-splay.js (working copy)
@@ -69,7 +69,7 @@
do {
key = GenerateKey();
} while (splayTree.find(key) != null);
- splayTree.insert(key, GeneratePayloadTree(kSplayTreePayloadDepth, key));
+ splayTree.insert(key, GeneratePayloadTree(kSplayTreePayloadDepth, "" + key));
return key;
}
If I instead generate the entire string first, the 2.3x improves to about 2.8x:
Index: tests/v8-v4/v8-splay.js
===================================================================
--- tests/v8-v4/v8-splay.js (revision 52427)
+++ tests/v8-v4/v8-splay.js (working copy)
@@ -45,7 +45,7 @@
if (depth == 0) {
return {
array : [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ],
- string : 'String for key ' + key + ' in leaf node'
+ string : key
};
} else {
return {
@@ -69,7 +69,7 @@
do {
key = GenerateKey();
} while (splayTree.find(key) != null);
- splayTree.insert(key, GeneratePayloadTree(kSplayTreePayloadDepth, key));
+ splayTree.insert(key, GeneratePayloadTree(kSplayTreePayloadDepth, 'String for key ' + key + ' in leaf node'));
return key;
}
The program has this rather amusing/inaccurate comment at the top:
// This benchmark is based on a JavaScript log processing module used
// by the V8 profiler to generate execution time profiles for runs of
// JavaScript applications, and it effectively measures how fast the
// JavaScript engine is at allocating nodes and reclaiming the memory
// used for old nodes. Because of the way splay trees work, the engine
// also has to deal with a lot of changes to the large tree object
// graph.
I see two options here. If we assume we can't change the benchmark, we should just cache the last double-to-string conversion -- a stupid hack for a stupid benchmark.
Alternatively, we could file a bug with Google and try to get them to improve it.
Or we could do both.
Comment 1•15 years ago
|
||
Hmm. Any idea how other engines handle this case?
Comment 2•15 years ago
|
||
Comment 3•15 years ago
|
||
++njn for the profiling work. nice find.
Assignee | ||
Comment 4•15 years ago
|
||
This is just Andreas's patch with a couple of extra comments. It looks good to me but I'm not expert on thread-local stuff so someone who knows about that should check it.
I see a 2.3x speedup on i386/mac and 1.7x on X64/mac; not sure why there's such a discrepancy, it seems that js_dtostr is cheaper on 64-bit. These improvements result in respective 10% and 6% improvements for V8 overall.
Attachment #442320 -
Attachment is obsolete: true
Comment 5•15 years ago
|
||
Comment on attachment 442329 [details] [diff] [review]
patch v2
>diff --git a/js/src/jscntxt.cpp b/js/src/jscntxt.cpp
>--- a/js/src/jscntxt.cpp
>+++ b/js/src/jscntxt.cpp
>@@ -175,16 +175,18 @@ JSThreadData::purge(JSContext *cx)
> if (cx->runtime->gcRegenShapes)
> traceMonitor.needFlush = JS_TRUE;
> #endif
>
> /* Destroy eval'ed scripts. */
> js_DestroyScriptsToGC(cx, this);
>
> js_PurgeCachedNativeEnumerators(cx, this);
>+
>+ dtoaCache.s = NULL;
> }
>
> void
> JSThreadData::purgeGCFreeLists()
> {
> if (!localRootStack) {
> gcFreeLists.purge();
> } else {
>diff --git a/js/src/jscntxt.h b/js/src/jscntxt.h
>--- a/js/src/jscntxt.h
>+++ b/js/src/jscntxt.h
>@@ -551,16 +551,24 @@ struct JSThreadData {
>
> #ifdef JS_EVAL_CACHE_METERING
> JSEvalCacheMeter evalCacheMeter;
> #endif
>
> /* State used by dtoa.c. */
> DtoaState *dtoaState;
>
>+ /* State used to cache some double-to-string conversions. Sometimes we
>+ * get lucky and are asked to convert the same double twice in a row. */
Comment style in the JS engine:
/*
*
*/
>+ struct {
>+ jsdouble d;
>+ jsint base;
>+ JSString *s; // if s==NULL, d and base are not valid
Maybe stick to /* */ ? But // seems ok too
>+ } dtoaCache;
>+
> /*
> * Cache of reusable JSNativeEnumerators mapped by shape identifiers (as
> * stored in scope->shape). This cache is nulled by the GC and protected
> * by gcLock.
> */
> #define NATIVE_ENUM_CACHE_LOG2 8
> #define NATIVE_ENUM_CACHE_MASK JS_BITMASK(NATIVE_ENUM_CACHE_LOG2)
> #define NATIVE_ENUM_CACHE_SIZE JS_BIT(NATIVE_ENUM_CACHE_LOG2)
>diff --git a/js/src/jsnum.cpp b/js/src/jsnum.cpp
>--- a/js/src/jsnum.cpp
>+++ b/js/src/jsnum.cpp
>@@ -880,22 +880,28 @@ js_NumberToStringWithBase(JSContext *cx,
> if (base == 10 && jsuint(i) < INT_STRING_LIMIT)
> return JSString::intString(i);
> if (jsuint(i) < jsuint(base)) {
> if (i < 10)
> return JSString::intString(i);
> return JSString::unitString(jschar('a' + i - 10));
> }
> }
>+ JSThreadData *data = JS_THREAD_DATA(cx);
>+ if (data->dtoaCache.s && data->dtoaCache.base == base && data->dtoaCache.d == d)
>+ return data->dtoaCache.s;
> numStr = NumberToCString(cx, d, base, buf, sizeof buf);
> if (!numStr)
> return NULL;
> s = JS_NewStringCopyZ(cx, numStr);
> if (!(numStr >= buf && numStr < buf + sizeof buf))
> js_free(numStr);
>+ data->dtoaCache.base = base;
>+ data->dtoaCache.d = d;
>+ data->dtoaCache.s = s;
> return s;
> }
>
> JSString * JS_FASTCALL
> js_NumberToString(JSContext *cx, jsdouble d)
> {
> return js_NumberToStringWithBase(cx, d, 10);
> }
Assignee | ||
Updated•15 years ago
|
Summary: TM: analyze v8-splay.js → TM: cache double-to-string conversions to speed up v8-splay.js
Assignee | ||
Comment 6•15 years ago
|
||
> >+ /* State used to cache some double-to-string conversions. Sometimes we
> >+ * get lucky and are asked to convert the same double twice in a row. */
>
> Comment style in the JS engine:
>
> /*
> *
> */
I'll fix.
>
> >+ struct {
> >+ jsdouble d;
> >+ jsint base;
> >+ JSString *s; // if s==NULL, d and base are not valid
>
> Maybe stick to /* */ ? But // seems ok too
// is used elsewhere for one-liners.
Andreas on nit patrol, who would have thought? :P
Assignee | ||
Comment 7•15 years ago
|
||
Assignee | ||
Updated•15 years ago
|
Attachment #442329 -
Attachment is obsolete: true
Updated•15 years ago
|
Attachment #442333 -
Flags: review+
Assignee | ||
Updated•15 years ago
|
Summary: TM: cache double-to-string conversions to speed up v8-splay.js → TM: stupidly cache double-to-string conversions to speed up stupid v8-splay.js
Assignee | ||
Comment 8•15 years ago
|
||
Whiteboard: fixed-in-tracemonkey
Assignee | ||
Comment 9•15 years ago
|
||
Google has a bug tracking the problem with v8-splay.js:
http://code.google.com/p/v8/issues/detail?id=690
Comment 10•15 years ago
|
||
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Comment 11•15 years ago
|
||
What about caching or optimizing: string : 'String for key ' + key + ' in leaf node'?
That is also called 32 times with the same key.
You need to log in
before you can comment on or make changes to this bug.
Description
•