Closed
Bug 587257
Opened 14 years ago
Closed 13 years ago
Make Array.prototype.join faster
Categories
(Core :: JavaScript Engine, defect)
Core
JavaScript Engine
Tracking
()
RESOLVED
FIXED
People
(Reporter: dmandelin, Assigned: luke)
References
Details
(Whiteboard: fixed-in-tracemonkey)
Attachments
(4 files, 10 obsolete files)
1.59 KB,
patch
|
Details | Diff | Splinter Review | |
810 bytes,
patch
|
cdleary
:
review+
|
Details | Diff | Splinter Review |
6.18 KB,
patch
|
Waldo
:
review+
|
Details | Diff | Splinter Review |
14.34 KB,
patch
|
Waldo
:
review+
|
Details | Diff | Splinter Review |
We are 1.3x slower than JSC on this microbenchmark: function f(a) { var s; for (var i = 0; i < 100000; ++i) { s = a.join(''); } } var a = new Array(60); for (var i = 0; i < 60; ++i) a[i] = 'c'; This is costing us 2 ms on fasta, and probably more on other benchmarks. var t0 = new Date; f(a); print(new Date - t0);
Summary: Make String.join faster → Make Array.prototype.join faster
Comment 1•14 years ago
|
||
Sunspider calls Array.prototype.join 2837 times (V8 benchmark does not use it). Most of these calls have an empty separator (''), 935 calls in fasta, 1001 in format-tofte, 3 in aes and probably some others. V8 and JSC special-case this so they don't append an empty separator for each item, and we can remove the "index + 1 != length" check. Not sure how much this helps, but it could be a start.
Assignee | ||
Comment 2•14 years ago
|
||
Speeding up array_join_sub (bug 200505 comment 35) was my first bug, so I'd be happy to take this, but not for a month or so. Comment 1 looks like a great start.
Comment 3•14 years ago
|
||
No difference on SunSpider (haven't sharked yet), but I cooked up a little microbenchmark for string-join, and this patch does seem to win there: TEST COMPARISON FROM TO ============================================================================= ** TOTAL **: - 3157.1ms +/- 5.1% 3013.4ms +/- 3.3% ============================================================================= join: - 3157.1ms +/- 5.1% 3013.4ms +/- 3.3% empty-sep: 1.086x as fast 630.8ms +/- 0.2% 581.0ms +/- 0.6% single-sep: 1.027x as fast 695.4ms +/- 1.0% 677.3ms +/- 1.4% long-sep: - 1830.9ms +/- 8.8% 1755.1ms +/- 5.5%
Comment 4•14 years ago
|
||
Comment 5•14 years ago
|
||
Looking at the Shark profiles, I'd say the harness underestimates the win here, because here's a lot of time spent shutting down.
Comment 6•14 years ago
|
||
Would it make sense to extend this to empty-sep + dense array? iirc this is true for the string-fasta benchmark and probably others. This could remove the call to GetArrayElement and the repeated checks done there. With my limited SM-knowledge I don't know if this is possible/desirable/worthwhile though...
Comment 8•14 years ago
|
||
(In reply to comment #6) > Would it make sense to extend this to empty-sep + dense array? That's a good idea. Besides, our behavior here looks pretty suboptimal for extremely sparse arrays.
Comment 9•14 years ago
|
||
Comment 10•14 years ago
|
||
dmandelin's benchmark from comment #0: TM tip: 143ms JSC tip: 113ms patched: 90ms
Comment 11•14 years ago
|
||
better results on my microbenchmark, too: TEST COMPARISON FROM TO ============================================================================= ** TOTAL **: - 3245.5ms +/- 13.8% 3196.1ms +/- 20.6% ============================================================================= join: - 3245.5ms +/- 13.8% 3196.1ms +/- 20.6% empty-sep: 1.22x as fast 630.2ms +/- 0.5% 518.4ms +/- 0.3% single-sep: 1.067x as fast 689.6ms +/- 2.2% 646.3ms +/- 2.5% long-sep: ?? 1925.7ms +/- 23.6% 2031.4ms +/- 32.8%
Comment 12•14 years ago
|
||
We can do more here, but I think this win is good enough to take for now.
Assignee: general → sayrer
Attachment #467786 -
Attachment is obsolete: true
Attachment #467863 -
Attachment is obsolete: true
Attachment #467873 -
Flags: review?(lw)
Assignee | ||
Comment 13•14 years ago
|
||
Mmmm, AutoArrayCycleDetector is nice. One comment: HashSet's default allocator is the ContextAllocPolicy, which reports OOM, so AutoArrayCycleDetector doesn't need to.
Comment 14•14 years ago
|
||
Comment on attachment 467873 [details] [diff] [review] cleanup, some renames Drive-by nits: >+typedef js::HashSet<JSObject *> ObjSet; No js:: required. >+class AutoArrayCycleDetector >+{ >+ public: >+ AutoArrayCycleDetector(JSContext *cx, JSObject *obj) >+ : mContext(cx), mObject(obj), mGenBefore(0), mHashPointer(mContext->busyArrays.lookupForAdd(mObject)) { Here is where the style guide says (or should say) put the { on its own line. Our four-space indentation otherwise makes it hard to see quickly where the body starts. >+ if ((!(rval->isMagic(JS_ARRAY_HOLE) || rval->isNullOrUndefined())) The ! expression doesn't need parens around it. > && !GetElementCharacters(cx, obj, locale, &cb, rval)) ..1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890 Way over the 100 column limit, and here is a case where an outer if testing just the !(... || ...) and an inner testing the fallible call would win, IMHO. >+ if ((!(hole || rval->isNullOrUndefined())) && !GetElementCharacters(cx, obj, locale, &cb, rval)) Same nits here. /be
Comment 15•14 years ago
|
||
Attachment #467873 -
Attachment is obsolete: true
Attachment #467911 -
Flags: review?(lw)
Attachment #467873 -
Flags: review?(lw)
Assignee | ||
Comment 16•14 years ago
|
||
Comment on attachment 467911 [details] [diff] [review] address comments from lw and brendan Looks good. >+class AutoArrayCycleDetector >+{ >+ public: >+ AutoArrayCycleDetector(JSContext *cx, JSObject *obj) >+ : mContext(cx), mObject(obj), mGenBefore(0), >+ mHashPointer(mContext->busyArrays.lookupForAdd(mObject)) >+ { >+ JS_GUARD_OBJECT_NOTIFIER_INIT; >+ } I think you need a JS_GUARD_OBJECT_NOTIFIER_PARAM in the ctor parameter list for it to work. Also, the naming convention for member variables has changed from mFoo to just foo.
Attachment #467911 -
Flags: review?(lw) → review+
Assignee | ||
Comment 17•14 years ago
|
||
Three ideas for further speedup: 1. give js_ValueToCharBuffer an inline fast path for v.isString() 2. hoist "if (local)" out of the dense array loop, so we don't check on every iteration 3. I have been repeatedly disappointed by the code produced by gcc on loops over an index that go through the obj->getXYZ(i) inline helpers. Rewrite the array loop to use: for (Value *vp = obj->getDenseArrayElements(), *end = vp + len; vp != end; ++vp)
Comment 18•14 years ago
|
||
Comment on attachment 467911 [details] [diff] [review] address comments from lw and brendan >+ if (!(rval->isMagic(JS_ARRAY_HOLE) || rval->isNullOrUndefined())) { Applying De Morgan's theorem seems good here: >+ if (!rval->isMagic(JS_ARRAY_HOLE) && !rval->isNullOrUndefined()) { "If noot a hole and not null or undefined, then get the element." >+ if (!(hole || rval->isNullOrUndefined())) { Ditto. /be
Comment 19•14 years ago
|
||
patch changed enough to need re-review.
Attachment #467911 -
Attachment is obsolete: true
Attachment #468046 -
Flags: review?(lw)
Comment 20•14 years ago
|
||
(In reply to comment #17) > Three ideas for further speedup: > 1. give js_ValueToCharBuffer an inline fast path for v.isString() I copied the relevant part into GetElementCharacters, and that took us from 90ms to 73ms on dmandelin's test. > 2. hoist "if (local)" out of the dense array loop, so we don't check on every > iteration I deleted the locale check altogether, and didn't get a speedup. > 3. I have been repeatedly disappointed by the code produced by gcc on loops > over an index that go through the obj->getXYZ(i) inline helpers. Rewrite the > array loop to use: > > for (Value *vp = obj->getDenseArrayElements(), *end = vp + len; vp != end; > ++vp) This worked as well, taking us from 73ms to 65ms on dmandelin's test. new results for dmandelin's benchmark from comment #0: TM tip: 143ms JSC tip: 113ms patched: 65ms
Comment 21•14 years ago
|
||
This patch seems to get me a reliable, small speedup on SunSpider itself.
Comment 22•14 years ago
|
||
Attachment #468046 -
Attachment is obsolete: true
Attachment #468047 -
Flags: review?(lw)
Attachment #468046 -
Flags: review?(lw)
Assignee | ||
Comment 23•14 years ago
|
||
Comment on attachment 468047 [details] [diff] [review] clean up the inline Far out! >+ AutoArrayCycleDetector(JSContext *cx, JSObject *obj) >+ : context(cx), object(obj), genBefore(0), >+ hashPointer(context->busyArrays.lookupForAdd(object) >+ JS_GUARD_OBJECT_NOTIFIER_PARAM) AutoArrayCycleDetector(JSContext *cx, JSObject *obj JS_GUARD_OBJECT_NOTIFIER_PARAM) >+ for (Value *vp = obj->getDenseArrayElements(), *end = vp + len; vp != end; ++vp) { >+ if (!JS_CHECK_OPERATION_LIMIT(cx)) >+ return false; >+ *rval = *vp; >+ if (!rval->isMagic(JS_ARRAY_HOLE) && !rval->isNullOrUndefined()) { >+ if (!GetElementCharacters(cx, obj, locale, &cb, rval)) >+ return false; >+ } >+ } One last potential optimization: I think we can remove the "*rval = *vp" assignment and use rval directly in the subsequent test/call. However, this also requires changing GetElementCharacters to not mutate its argument, which we can do using our new stack-scanning superpowers: just pass rval by const Value& and take out the rval.setObjectOrNull. (Also, since rval would no longer be an outparam, cb should probably be the last parameter.)
Comment 24•14 years ago
|
||
(In reply to comment #23) > > AutoArrayCycleDetector(JSContext *cx, JSObject *obj > JS_GUARD_OBJECT_NOTIFIER_PARAM) > uh, duh. > just pass rval by const > Value& and take out the rval.setObjectOrNull. I don't think this idea will work without a lot more refactoring, because of js_TryMethod. I commented out the locale parameter just to get an idea of the speedup without the assignment. dmandelin's benchmark went from 65ms to 63ms. I don't think it's worth chasing right now.
Comment 25•14 years ago
|
||
Attachment #468047 -
Attachment is obsolete: true
Attachment #468157 -
Flags: review?(lw)
Attachment #468047 -
Flags: review?(lw)
Assignee | ||
Updated•14 years ago
|
Attachment #468157 -
Flags: review?(lw) → review+
Comment 26•14 years ago
|
||
this passes all the tests.
Attachment #468157 -
Attachment is obsolete: true
Attachment #468168 -
Flags: review?(lw)
Comment 27•14 years ago
|
||
bugzilla interdiff works: https://bugzilla.mozilla.org/attachment.cgi?oldid=468157&action=interdiff&newid=468168&headers=1
Comment 28•14 years ago
|
||
I did my usual Cachegrind run on Sayre's latest patch (on TM, not JM): --------------------------------------------------------------- | millions of instructions executed | | total | on-trace (may overestimate) | --------------------------------------------------------------- | 110.883 110.883 (------) | 20.808 20.808 (------) | 3d-cube | 56.935 56.936 (------) | 22.313 22.313 (------) | 3d-morph | 145.607 145.641 (------) | 21.968 21.966 (------) | 3d-raytrace | 76.966 76.966 (------) | 17.407 17.407 (------) | access-binary- | 183.411 183.412 (------) | 103.705 103.705 (------) | access-fannkuc | 41.358 41.358 (------) | 17.552 17.552 (------) | access-nbody | 59.988 59.989 (------) | 27.013 27.013 (------) | access-nsieve | 15.671 15.672 (------) | 2.862 2.862 (------) | bitops-3bit-bi | 43.281 43.282 (------) | 30.281 30.281 (------) | bitops-bits-in | 22.749 22.751 (------) | 10.219 10.219 (------) | bitops-bitwise | 71.607 71.609 (------) | 39.453 39.453 (------) | bitops-nsieve- | 42.348 42.350 (------) | 26.655 26.655 (------) | controlflow-re | 46.905 46.907 (------) | 5.661 5.661 (------) | crypto-md5 | 31.207 31.209 (------) | 7.109 7.109 (------) | crypto-sha1 | 114.340 113.398 (1.008x) | 11.936 11.912 (1.002x) | date-format-to | 86.148 86.149 (------) | 10.620 10.620 (------) | date-format-xp | 52.587 52.588 (------) | 27.614 27.614 (------) | math-cordic | 28.962 28.962 (------) | 6.166 6.166 (------) | math-partial-s | 28.456 28.456 (------) | 10.286 10.286 (------) | math-spectral- | 57.966 57.966 (------) | 34.591 34.591 (------) | regexp-dna | 40.441 40.441 (------) | 10.397 10.397 (------) | string-base64 | 122.858 117.574 (1.045x) | 24.544 24.430 (1.005x) | string-fasta | 185.180 185.181 (------) | 11.439 11.439 (------) | string-tagclou | 180.737 180.737 (------) | 10.931 10.931 (------) | string-unpack- | 51.295 51.295 (------) | 8.419 8.419 (------) | string-validat ------- | 1897.897 1891.722 (1.003x) | 519.958 519.819 (------) | all So I can believe it would save 1 or 2 ms overall.
Assignee | ||
Comment 29•14 years ago
|
||
Comment on attachment 468168 [details] [diff] [review] fix thinko in js_ValueToCharBuffer, fix control flow in AutoArrayCycleDetector >+ AutoArrayCycleDetector(JSContext *cx, JSObject *obj JS_GUARD_OBJECT_NOTIFIER_PARAM) >+ : context(cx), object(obj), genBefore(0), >+ hashPointer(context->busyArrays.lookupForAdd(object)), cycle(!!hashPointer) Style seems to be to unindent the ": context(cx)" line 2 spaces so context lines up with hashPointer. >+ ~AutoArrayCycleDetector() { >+ JS_ASSERT(context); >+ if (!cycle) { >+ if (genBefore == context->busyArrays.generation()) >+ context->busyArrays.remove(hashPointer); >+ else >+ context->busyArrays.remove(object); >+ } >+ } I think there is a problem if ~AutoArrayCycleDetector is called with !cycle, context->busyArrays.generation() == 0 and !hashPointer. I believe this could happen if AutoArrayCycleDetector::update fails. The fix seems to be changing "if (!cycle)" to "if (!cycle && hashPointer)".
Comment 30•14 years ago
|
||
Attachment #468168 -
Attachment is obsolete: true
Attachment #468370 -
Flags: review?(lw)
Attachment #468168 -
Flags: review?(lw)
Assignee | ||
Updated•14 years ago
|
Attachment #468370 -
Flags: review?(lw) → review+
Comment 31•14 years ago
|
||
http://hg.mozilla.org/tracemonkey/rev/b404ad209cb9
Status: NEW → ASSIGNED
Whiteboard: fixed-in-tracemonkey
Comment 32•14 years ago
|
||
I had to back this out due to in-browser jstest failures. (it passed in the shell here). http://hg.mozilla.org/tracemonkey/rev/c3c185d6ee88
Whiteboard: fixed-in-tracemonkey
Comment 33•14 years ago
|
||
Did this get tracked down / fixed?
Comment 34•13 years ago
|
||
Ping?
Comment 35•13 years ago
|
||
Tom, everyone's really busy fixing blockers for Firefox 4.0. Nobody will have a chance to look at this until it's released. Thanks for your patience!
Assignee | ||
Comment 36•13 years ago
|
||
I'll take this over if you don't mind. I rebased the patches and fixed two bugs (lack of check for js_ProrotypeHasIndexedProperties and an unlikely case with the guards and OOM). Using a few-months-old jsc and d8 (no crankshaft), I get: trunk: 250ms jsc: 244ms d8: 155ms patch: 101ms SS also shows a reliable 1-2ms speedup, mostly in fasta as predicted.
Assignee: sayrer → luke
Assignee | ||
Comment 37•13 years ago
|
||
Attachment #522852 -
Flags: review?(cdleary)
Assignee | ||
Comment 38•13 years ago
|
||
Might as well apply the same cleanup to array_toSource.
Attachment #522853 -
Flags: review?(jwalden+bmo)
Assignee | ||
Comment 39•13 years ago
|
||
Attachment #468370 -
Attachment is obsolete: true
Attachment #522855 -
Flags: review?(jwalden+bmo)
Assignee | ||
Comment 40•13 years ago
|
||
(Green on try)
Updated•13 years ago
|
Attachment #522852 -
Flags: review?(cdleary) → review+
Comment 41•13 years ago
|
||
Comment on attachment 522853 [details] [diff] [review] clean up array_toSource in the same way as array_toString_sub This detector class is such a great idea.[0] >diff --git a/js/src/jsarray.cpp b/js/src/jsarray.cpp >+ ~ArraySharpDetector() { >+ if (chars) >+ cx->free(chars); >+ if (he && !sharp) >+ js_LeaveSharpObject(cx, NULL); How can |he| ever be NULL here? Seems like the last test should just be |!sharp|, as it was previously. >- /* Cycles/joins are indicated by sharp objects. */ > #if JS_HAS_SHARP_VARS >- if (IS_SHARP(he)) { >- JS_ASSERT(sharpchars != 0); >- sb.replaceRawBuffer(sharpchars, js_strlen(sharpchars)); >+ if (detector.initiallySharp()) { >+ jschar *chars = detector.takeSharpChars(); >+ sb.replaceRawBuffer(chars, js_strlen(chars)); I think you lost a |chars != NULL| assertion here. Looks like that can go in takeSharpChars, I think. > /* Get element's character string. */ > JSString *str; > if (hole) { > str = cx->runtime->emptyString; > } else { > str = js_ValueToSource(cx, *vp); > if (!str) >- goto out; >+ return false; >+ vp->setString(str); Is this purely anchoring? >+ JS_SET_RVAL(cx, vp, StringValue(str)); >+ return true; Why not assign to |*vp|? More traditional, and I suspect the set-rval system isn't going to be any more future-proof if we ever do the pass-arguments-and-length-as-combined-class thing. > /* Finalize the buffer. */ >+ JSString *str; >+ str = sb.finishString(); >+ if (!str) > goto out; Please to be unifying the declaration and initialization. 0. Were you maybe expecting a different adjective?
Attachment #522853 -
Flags: review?(jwalden+bmo) → review+
Assignee | ||
Comment 42•13 years ago
|
||
(In reply to comment #41) > How can |he| ever be NULL here? Seems like the last test should just be > |!sharp|, as it was previously. If js_EnterSharpObject returns NULL. > I think you lost a |chars != NULL| assertion here. Looks like that can go in > takeSharpChars, I think. From previous nits, SM style is to generally not assert != NULL since you will get a clean unexploitable crash if it is NULL. > Is this purely anchoring? Yes, I think I was just preserving code here, but it looks like I can rely on stack scanning here and nix the anchoring. > >+ JS_SET_RVAL(cx, vp, StringValue(str)); > >+ return true; > > Why not assign to |*vp|? More traditional, and I suspect the set-rval system > isn't going to be any more future-proof if we ever do the > pass-arguments-and-length-as-combined-class thing. I rather prefer JS_SET_RVAL/JS_ARGV/JS_CALLEE to manual vp manipulation. Plus, this greases the path for bug 580337. > >+ JSString *str; > >+ str = sb.finishString(); > > Please to be unifying the declaration and initialization. Verily, the next patch (still r?you) does just that. (It is currently this way because of the "goto crosses initialization" error.)
Comment 43•13 years ago
|
||
Comment on attachment 522855 [details] [diff] [review] Sayre's patch, rebased and twiddled >diff --git a/js/src/jshashtable.h b/js/src/jshashtable.h >+ /* Leaves Ptr uninitialized. */ >+ Ptr() {} >+ >@@ -137,16 +140,19 @@ class HashTable : private AllocPolicy >+ /* Leaves AddPtr uninitialized. */ >+ AddPtr() {} Put something deadbeefy here maybe? >diff --git a/js/src/jsstr.cpp b/js/src/jsstr.cpp > /* This function implements E-262-3 section 9.8, toString. */ > bool >-js::ValueToStringBuffer(JSContext *cx, const Value &arg, StringBuffer &sb) >+js::ValueToStringBufferSlow(JSContext *cx, const Value &arg, StringBuffer &sb) > { > Value v = arg; > if (v.isObject() && !DefaultValue(cx, &v.toObject(), JSTYPE_STRING, &v)) > return false; > >- if (v.isString()) { >- JSString *str = v.toString(); >- size_t length = str->length(); >- const jschar *chars = str->getChars(cx); >- if (!chars) >- return false; >- return sb.append(chars, length); >- } >+ if (v.isString()) >+ return sb.append(v.toString()); > if (v.isNumber()) > return NumberValueToStringBuffer(cx, v, sb); > if (v.isBoolean()) > return BooleanToStringBuffer(cx, v.toBoolean(), sb); > if (v.isNull()) > return sb.append(cx->runtime->atomState.nullAtom); > JS_ASSERT(v.isUndefined()); > return sb.append(cx->runtime->atomState.typeAtoms[JSTYPE_VOID]); Should this perhaps require callers who might have a string to use the inline version that pre-handles the string case? Then make this assert non-stringness. >diff --git a/js/src/tests/ecma/Array/15.4.4.3-2.js b/js/src/tests/ecma/Array/15.4.4.3-2.js Add some lines that test when the property is set both directly on the object and up the prototype chain, as either null or undefined.
Attachment #522855 -
Flags: review?(jwalden+bmo) → review+
Assignee | ||
Comment 44•13 years ago
|
||
(In reply to comment #43) > Should this perhaps require callers who might have a string to use the inline > version that pre-handles the string case? I thought about it, but, because of the DefaultValue call above, we already have to check for string, so I didn't see any reason to restrict usage.
Assignee | ||
Comment 45•13 years ago
|
||
http://hg.mozilla.org/tracemonkey/rev/4b49ec2083b1 http://hg.mozilla.org/tracemonkey/rev/aa74d34a1096 http://hg.mozilla.org/tracemonkey/rev/a60ea67d1471
Whiteboard: fixed-in-tracemonkey
Comment 46•13 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/4b49ec2083b1 http://hg.mozilla.org/mozilla-central/rev/aa74d34a1096 http://hg.mozilla.org/mozilla-central/rev/a60ea67d1471
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•