Closed Bug 562553 Opened 15 years ago Closed 15 years ago

TM: stupidly cache double-to-string conversions to speed up stupid v8-splay.js

Tracking

()

Status:

RESOLVED FIXED

People

(Reporter: n.nethercote, Assigned: n.nethercote)

Details

(Whiteboard: fixed-in-tracemonkey)

Attachments

(1 file, 2 obsolete files)

quick and dirty implementation of Nick's idea 15 years ago Andreas Gal :gal 2.30 KB, patch		Details \| Diff \| Splinter Review
patch v2 15 years ago Nicholas Nethercote [inactive] 2.50 KB, patch		Details \| Diff \| Splinter Review
patch v3 15 years ago Nicholas Nethercote [inactive] 2.51 KB, patch	cdleary : review+	Details \| Diff \| Splinter Review

Nicholas Nethercote [inactive]

Assignee

Description

•

15 years ago

I took a quick look at v8-splay.js with Cachegrind: -------------------------------------------------------------------------------- Ir -------------------------------------------------------------------------------- 3,926,401,052 PROGRAM TOTALS -------------------------------------------------------------------------------- Ir file:function -------------------------------------------------------------------------------- 1,549,212,505 /home/njn/moz/ws0/js/src/optg32/../dtoa.c:js_dtostr 291,775,264 /home/njn/moz/ws0/js/src/optg32/../dtoa.c:multadd 200,155,572 /build/buildd/eglibc-2.10.1/malloc/malloc.c:_int_malloc 173,398,709 /home/njn/moz/ws0/js/src/optg32/../jsgc.cpp:js_CallGCMarker 172,717,716 ???:??? 118,097,974 /home/njn/moz/ws0/js/src/optg32/../jsops.cpp:js_Interpret 81,618,330 /build/buildd/eglibc-2.10.1/malloc/malloc.c:_int_free 77,324,952 /home/njn/moz/ws0/js/src/optg32/../dtoa.c:lshift Wow, what a lot of double-to-string conversions. They come from this function: Digging some more, we have lots of calls to js_NumberToString(). They are due to this function: function GeneratePayloadTree(depth, key) { if (depth == 0) { return { array : [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ], string : 'String for key ' + key + ' in leaf node' }; } else { return { left: GeneratePayloadTree(depth - 1, key), right: GeneratePayloadTree(depth - 1, key) }; } } This is called with 'depth'==5 and 'key'==a random generated double. Because it's recursive, we end up converting 'key' 32 times. This function is called a zillion times. I tried changing the benchmark to do the conversion first, it ran about 2.3x faster: Index: tests/v8-v4/v8-splay.js =================================================================== --- tests/v8-v4/v8-splay.js (revision 52427) +++ tests/v8-v4/v8-splay.js (working copy) @@ -69,7 +69,7 @@ do { key = GenerateKey(); } while (splayTree.find(key) != null); - splayTree.insert(key, GeneratePayloadTree(kSplayTreePayloadDepth, key)); + splayTree.insert(key, GeneratePayloadTree(kSplayTreePayloadDepth, "" + key)); return key; } If I instead generate the entire string first, the 2.3x improves to about 2.8x: Index: tests/v8-v4/v8-splay.js =================================================================== --- tests/v8-v4/v8-splay.js (revision 52427) +++ tests/v8-v4/v8-splay.js (working copy) @@ -45,7 +45,7 @@ if (depth == 0) { return { array : [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ], - string : 'String for key ' + key + ' in leaf node' + string : key }; } else { return { @@ -69,7 +69,7 @@ do { key = GenerateKey(); } while (splayTree.find(key) != null); - splayTree.insert(key, GeneratePayloadTree(kSplayTreePayloadDepth, key)); + splayTree.insert(key, GeneratePayloadTree(kSplayTreePayloadDepth, 'String for key ' + key + ' in leaf node')); return key; } The program has this rather amusing/inaccurate comment at the top: // This benchmark is based on a JavaScript log processing module used // by the V8 profiler to generate execution time profiles for runs of // JavaScript applications, and it effectively measures how fast the // JavaScript engine is at allocating nodes and reclaiming the memory // used for old nodes. Because of the way splay trees work, the engine // also has to deal with a lot of changes to the large tree object // graph. I see two options here. If we assume we can't change the benchmark, we should just cache the last double-to-string conversion -- a stupid hack for a stupid benchmark. Alternatively, we could file a bug with Google and try to get them to improve it. Or we could do both.

Boris Zbarsky [:bzbarsky]

Comment 1

•

15 years ago

Hmm. Any idea how other engines handle this case?

Andreas Gal :gal

Comment 2

•

15 years ago

Attached patch quick and dirty implementation of Nick's idea (obsolete) — Details — Splinter Review

Andreas Gal :gal

Comment 3

•

15 years ago

++njn for the profiling work. nice find.

Nicholas Nethercote [inactive]

Assignee

Comment 4

•

15 years ago

Attached patch patch v2 (obsolete) — Details — Splinter Review

This is just Andreas's patch with a couple of extra comments. It looks good to me but I'm not expert on thread-local stuff so someone who knows about that should check it. I see a 2.3x speedup on i386/mac and 1.7x on X64/mac; not sure why there's such a discrepancy, it seems that js_dtostr is cheaper on 64-bit. These improvements result in respective 10% and 6% improvements for V8 overall.

Attachment #442320 - Attachment is obsolete: true

Andreas Gal :gal

Comment 5

•

15 years ago

Comment on attachment 442329 [details] [diff] [review] patch v2 >diff --git a/js/src/jscntxt.cpp b/js/src/jscntxt.cpp >--- a/js/src/jscntxt.cpp >+++ b/js/src/jscntxt.cpp >@@ -175,16 +175,18 @@ JSThreadData::purge(JSContext *cx) > if (cx->runtime->gcRegenShapes) > traceMonitor.needFlush = JS_TRUE; > #endif > > /* Destroy eval'ed scripts. */ > js_DestroyScriptsToGC(cx, this); > > js_PurgeCachedNativeEnumerators(cx, this); >+ >+ dtoaCache.s = NULL; > } > > void > JSThreadData::purgeGCFreeLists() > { > if (!localRootStack) { > gcFreeLists.purge(); > } else { >diff --git a/js/src/jscntxt.h b/js/src/jscntxt.h >--- a/js/src/jscntxt.h >+++ b/js/src/jscntxt.h >@@ -551,16 +551,24 @@ struct JSThreadData { > > #ifdef JS_EVAL_CACHE_METERING > JSEvalCacheMeter evalCacheMeter; > #endif > > /* State used by dtoa.c. */ > DtoaState *dtoaState; > >+ /* State used to cache some double-to-string conversions. Sometimes we >+ * get lucky and are asked to convert the same double twice in a row. */ Comment style in the JS engine: /* * */ >+ struct { >+ jsdouble d; >+ jsint base; >+ JSString *s; // if s==NULL, d and base are not valid Maybe stick to /* */ ? But // seems ok too >+ } dtoaCache; >+ > /* > * Cache of reusable JSNativeEnumerators mapped by shape identifiers (as > * stored in scope->shape). This cache is nulled by the GC and protected > * by gcLock. > */ > #define NATIVE_ENUM_CACHE_LOG2 8 > #define NATIVE_ENUM_CACHE_MASK JS_BITMASK(NATIVE_ENUM_CACHE_LOG2) > #define NATIVE_ENUM_CACHE_SIZE JS_BIT(NATIVE_ENUM_CACHE_LOG2) >diff --git a/js/src/jsnum.cpp b/js/src/jsnum.cpp >--- a/js/src/jsnum.cpp >+++ b/js/src/jsnum.cpp >@@ -880,22 +880,28 @@ js_NumberToStringWithBase(JSContext *cx, > if (base == 10 && jsuint(i) < INT_STRING_LIMIT) > return JSString::intString(i); > if (jsuint(i) < jsuint(base)) { > if (i < 10) > return JSString::intString(i); > return JSString::unitString(jschar('a' + i - 10)); > } > } >+ JSThreadData *data = JS_THREAD_DATA(cx); >+ if (data->dtoaCache.s && data->dtoaCache.base == base && data->dtoaCache.d == d) >+ return data->dtoaCache.s; > numStr = NumberToCString(cx, d, base, buf, sizeof buf); > if (!numStr) > return NULL; > s = JS_NewStringCopyZ(cx, numStr); > if (!(numStr >= buf && numStr < buf + sizeof buf)) > js_free(numStr); >+ data->dtoaCache.base = base; >+ data->dtoaCache.d = d; >+ data->dtoaCache.s = s; > return s; > } > > JSString * JS_FASTCALL > js_NumberToString(JSContext *cx, jsdouble d) > { > return js_NumberToStringWithBase(cx, d, 10); > }

Nicholas Nethercote [inactive]

Assignee

Updated

•

15 years ago

Summary: TM: analyze v8-splay.js → TM: cache double-to-string conversions to speed up v8-splay.js

Nicholas Nethercote [inactive]

Assignee

Comment 6

•

15 years ago

> >+ /* State used to cache some double-to-string conversions. Sometimes we > >+ * get lucky and are asked to convert the same double twice in a row. */ > > Comment style in the JS engine: > > /* > * > */ I'll fix. > > >+ struct { > >+ jsdouble d; > >+ jsint base; > >+ JSString *s; // if s==NULL, d and base are not valid > > Maybe stick to /* */ ? But // seems ok too // is used elsewhere for one-liners. Andreas on nit patrol, who would have thought? :P

Nicholas Nethercote [inactive]

Assignee

Comment 7

•

15 years ago

Attached patch patch v3 — Details — Splinter Review

Nicholas Nethercote [inactive]

Assignee

Updated

•

15 years ago

Attachment #442329 - Attachment is obsolete: true

Chris Leary [:cdleary] (not checking bugmail)

Updated

•

15 years ago

Attachment #442333 - Flags: review+

Nicholas Nethercote [inactive]

Assignee

Updated

•

15 years ago

Summary: TM: cache double-to-string conversions to speed up v8-splay.js → TM: stupidly cache double-to-string conversions to speed up stupid v8-splay.js

Nicholas Nethercote [inactive]

Assignee

Comment 8

•

15 years ago

http://hg.mozilla.org/tracemonkey/rev/74ef6155d096

Whiteboard: fixed-in-tracemonkey

Nicholas Nethercote [inactive]

Assignee

Comment 9

•

15 years ago

Google has a bug tracking the problem with v8-splay.js: http://code.google.com/p/v8/issues/detail?id=690

Robert Sayre

Comment 10

•

15 years ago

http://hg.mozilla.org/mozilla-central/rev/74ef6155d096

Status: ASSIGNED → RESOLVED

Closed: 15 years ago

Resolution: --- → FIXED

Alfred Kayser

Comment 11

•

15 years ago

What about caching or optimizing: string : 'String for key ' + key + ' in leaf node'? That is also called 32 times with the same key.

You need to log in before you can comment on or make changes to this bug.

Bugzilla

TM: stupidly cache double-to-string conversions to speed up stupid v8-splay.js

Categories

(Core :: JavaScript Engine, defect)

Tracking

()

People

(Reporter: n.nethercote, Assigned: n.nethercote)

References

Details

(Whiteboard: fixed-in-tracemonkey)

Crash Data

Security

(public)

User Story

Attachments

(1 file, 2 obsolete files)

Description

Comment 1

Comment 2

Comment 3

Comment 4

Comment 5

Updated

Comment 6

Comment 7

Updated

Updated

Updated

Comment 8

Comment 9

Comment 10

Comment 11

Attachment

General

Description

File Name

Content Type