Closed
Bug 596026
Opened 14 years ago
Closed 14 years ago
js_GetPropertyHelper() called too many times for string-fasta.js
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
RESOLVED
FIXED
People
(Reporter: n.nethercote, Assigned: n.nethercote)
References
Details
(Whiteboard: fixed-in-tracemonkey)
Attachments
(1 file, 2 obsolete files)
9.06 KB,
patch
|
jorendorff
:
review+
|
Details | Diff | Splinter Review |
Under JM+TM, js_GetPropertyHelper() is called 175,155 times. cdleary says that there's a PIC that should be catching these but it seems it is not. The properties being accessed are just normal ones: IUB.a, HomoSap.t, etc.
Updated•14 years ago
|
Blocks: JaegerSpeed
Comment 1•14 years ago
|
||
Okay, so without the getelem PIC, we hit the stub like so: getelem stub (in cycles): Mean: 1109.57 Min: 811 Max: 222309 StdDev: 2342.99 Count: 175152 Total: 1.94343e+08 ms @ 3GHz: 64.7811 With the PIC enabled, we hit it thusly: getelem stub (in cycles): Mean: 1047.41 Min: 849 Max: 214416 StdDev: 1808.62 Count: 27769 Total: 2.90856e+07 ms @ 3GHz: 9.6952 So we're getting 84% of all of them, but that's another 16% (25k misses) we might be able to pick up for 10ms! Looking into why we're hitting the stub now.
Comment 2•14 years ago
|
||
Ugh, nevermind those numbers, accidentally was measuring a debug build.
Comment 3•14 years ago
|
||
I've measured this before in bug 578761 comment 48 -- either something changed getelem stub (in cycles): Mean: 175.835 Min: 130 Max: 223732 StdDev: 2129.62 Count: 27769 Total: 4.88275e+06 ms @ 3GHz: 1.62758 Increasing PIC entries to 32: getelem stub (in cycles): Mean: 2638.5 Min: 1323 Max: 3954 StdDev: 1860.4 Count: 2 Total: 5277 ms @ 3GHz: 0.001759 We can do this for a 1-3ms win if we think it's worth it -- if so, I'll run the rest of the suite to make sure it's not negatively affected. Still, this only accounts for 30k of the 175k that Nick is seeing -- will set some breakpoints.
Comment 4•14 years ago
|
||
(In reply to comment #3) > I've measured this before in bug 578761 comment 48 -- either something changed ... or I was somehow using the slow call count profiler improperly.
Comment 5•14 years ago
|
||
A PIC is never big enough for the biggest object you might for-in over. We can tune for SS but really, PIC is not exactly the right tool here. We want something based on the enumeration protocol and a bit of compile-time analysis. What about bug 580138 (someone should dup bug 560989)? See also bug 593931 (which is not really about fasta if we do the right kind of prediction). /be
Comment 6•14 years ago
|
||
The issue with TM+JM seems to be that TM is taking over and we never get to really exercise the JM GETELEM PIC. Nick and I are working on adding a PIC-like table-lookup mechanism to jstracer.cpp:GetPropertyByName to avoid this overhead and maybe make things a bit easier on trace learning.
Comment 7•14 years ago
|
||
Nick and I whipped this up. SunSpider comparison said ~17% speedup on fasta with 32 entries, ~10% on fasta with 16 entries.
Attachment #475231 -
Flags: review?(jorendorff)
Comment 8•14 years ago
|
||
Remove debugging crud.
Attachment #475231 -
Attachment is obsolete: true
Attachment #475231 -
Flags: review?(jorendorff)
Comment 9•14 years ago
|
||
Last quick fixup before review, I promise. :-) Got rid of _JS_CTYPE_VOID_STAR in favor of _JS_CTYPE_PICTABLE.
Attachment #475235 -
Attachment is obsolete: true
Updated•14 years ago
|
Attachment #475243 -
Flags: review?(jorendorff)
Assignee | ||
Comment 10•14 years ago
|
||
Cachegrind results on my Mac, which has a lot of extra system overhead than my Linux box -- if I ran these on my linux box the improvements/degradations would be greater. --------------------------------------------------------------- | millions of instructions executed | | total | on-trace (may overestimate) | --------------------------------------------------------------- | 188.022 188.060 (------) | 48.598 48.598 (------) | 3d-cube | 136.549 136.548 (------) | 25.517 25.517 (------) | 3d-morph | 196.263 196.280 (------) | 44.925 44.924 (------) | 3d-raytrace | 122.554 122.979 (0.997x) | 14.796 14.796 (------) | access-binary- | 190.659 190.633 (------) | 93.423 93.423 (------) | access-fannkuc | 116.424 116.413 (------) | 17.692 17.692 (------) | access-nbody | 126.109 126.109 (------) | 28.379 28.379 (------) | access-nsieve | 95.092 95.096 (------) | 3.618 3.618 (------) | bitops-3bit-bi | 124.478 124.479 (------) | 32.909 32.909 (------) | bitops-bits-in | 103.514 103.519 (------) | 12.380 12.380 (------) | bitops-bitwise | 130.887 130.887 (------) | 38.377 38.377 (------) | bitops-nsieve- | 109.375 109.381 (------) | 17.894 17.894 (------) | controlflow-re | 125.267 125.278 (------) | 6.666 6.666 (------) | crypto-md5 | 108.621 108.618 (------) | 7.532 7.532 (------) | crypto-sha1 | 182.161 182.160 (------) | 22.689 22.689 (------) | date-format-to | 165.617 162.990 (1.016x) | 9.728 9.753 (0.998x) | date-format-xp | 130.774 130.775 (------) | 31.463 31.463 (------) | math-cordic | 110.582 110.588 (------) | 6.460 6.460 (------) | math-partial-s | 109.770 109.753 (------) | 14.025 14.025 (------) | math-spectral- | 138.422 138.433 (------) | 36.117 36.117 (------) | regexp-dna | 118.406 118.412 (------) | 10.927 10.927 (------) | string-base64 | 197.259 182.823 (1.079x) | 23.935 24.111 (0.993x) | string-fasta | 227.677 227.660 (------) | 19.436 19.436 (------) | string-tagclou | 264.846 265.821 (0.996x) | 29.674 29.674 (------) | string-unpack- | 133.538 133.537 (------) | 8.839 8.839 (------) | string-validat ------- | 3652.880 3637.243 (1.004x) | 606.012 606.209 (------) | all
Comment 11•14 years ago
|
||
>+ *shapeOut = NULL; > /* Convert string indices to integers if appropriate. */ Style nit: Blank line before a comment. > protoIndex = js_LookupPropertyWithFlags(cx, aobj, id, cx->resolveFlags, > &obj2, &prop); >+ *holderOut = obj2; > if (protoIndex < 0) > return JS_FALSE; Nit: Move the olderOut line after the error check, please. > shape = (Shape *) prop; >+ *shapeOut = shape; > > if (getHow & JSGET_CACHE_RESULT) { > JS_ASSERT_NOT_ON_TRACE(cx); > JS_PROPERTY_CACHE(cx).fill(cx, aobj, 0, protoIndex, obj2, shape); > } Hmm. It'd be nice, since we've got out params now, to move that property cache fill to the caller and get rid of JSGET_CACHE_RESULT. ...but maybe that would be bad. >+} >+ >+ >+JSBool Style nit: delete the extra blank line. >+struct PICTableEntry >+{ >+ uint32 shape; >+ jsid id; >+ uint32 slot; >+}; Does this waste a few bytes on 64-bit platforms? >- if (!RootedStringToId(cx, namep, &id) || !obj->getProperty(cx, id, vp)) { >+ if (!RootedStringToId(cx, namep, &id)) >+ return false; >+ >+ /* Duplicated from JSObject::getProperty. */ >+ PropertyIdOp op = obj->getOps()->getProperty; >+ if (op) >+ return op(cx, obj, id, vp); For both these ifs, you need to SetBuiltinError iff an error occurred. Add a test for that using a proxy or a getter to make the op call fail. >+ /* Try to hit in the cache. */ >+ { >+ uint32 slot; >+ if (picTable->scan(obj->shape(), id, &slot)) { >+ *vp = obj->getSlot(slot); >+ return cx->tracerState->builtinStatus == 0; >+ } >+ } Style nit: Lose the extra braces? >+ /* Only update the table when the object is the holder of the property. */ >+ if (obj == holder) { Also avoid updating the PIC when shape->slot == SHAPE_INVALID_SLOT, and add a test for that case (if there isn't one already). r=me with these changes. This is a very cute patch.
Updated•14 years ago
|
Attachment #475243 -
Flags: review?(jorendorff) → review+
Assignee | ||
Comment 12•14 years ago
|
||
The slight degradations in instruction counts are because js_GetPropertyHelperWithShape() is returning (by reference) additional values.
You could avoid this (see the JM PICs) by using FindProperty to fill.
Assignee | ||
Comment 14•14 years ago
|
||
(In reply to comment #13) > You could avoid this (see the JM PICs) by using FindProperty to fill. That's a good idea, but it won't help with the degradations -- those ones don't hit GetPropertyByName, it's just that js_GetPropertyName() has become slightly more expensive.
Assignee | ||
Comment 15•14 years ago
|
||
Bumping the PIC size to 32 changes the string-fasta and total results as follows: ... | 197.259 173.529 (1.137x) | 23.935 24.110 (0.993x) | string-fasta ... | 3652.863 3627.924 (1.007x) | 606.012 606.209 (------) | all All the others were unchanged.
Comment 16•14 years ago
|
||
Comment on attachment 475243 [details] [diff] [review] Tracer PIC for GetPropByName. >+bool >+js_GetPropertyHelperWithShape(JSContext *cx, JSObject *obj, jsid id, >+ uintN getHow, Value *vp, >+ const Shape **shapeOut, JSObject **holderOut) Too many args and a mostly (all but one bit ;-) wasted return type -- suggest returning Shape * directly for some of those instruction count losses one back. >+ /* Only update the table when the object is the holder of the property. */ >+ if (obj == holder) { >+ /* >+ * Note: we insert the non-normalized id into the table so you don't need to >+ * normalize it before hitting in the table (faster lookup). >+ */ >+ picTable->update(obj->shape(), id, shape->slot); But update just fills the PIC, there's no self-organization on a scan hit, and once full you're stuck with possibly irrelevant cache "lines". Better to wrap around and use MRU. You'd fill at the cursor and start scanning just after it, wrapping around and skipping (initially) free entries. Don't tune for SS too much and declare victory. We have a chance with this kind of approach to do better than non-self-organizing branch exits (trace trees), we should exploit it. /be
Comment 17•14 years ago
|
||
Argh, s/one back/won back/ in last comment! (Are textareas updating badly when you type? Is this bug 130078 fallout? Argh!) /be
Comment 18•14 years ago
|
||
(In reply to comment #16) > Too many args and a mostly (all but one bit ;-) wasted return type -- suggest > returning Shape * directly for some of those instruction count losses one back. We should be able to force inline that function to avoid the overhead (DCE the unused arg stores). > But update just fills the PIC, there's no self-organization on a scan hit, and > once full you're stuck with possibly irrelevant cache "lines". This is the way JM PICs work. :-) In the JM PICs dmandelin measured that balancing didn't yield a speedup, but it may be different for our tables, since we don't invalidate the I$ as a result of balancing. > Don't tune for SS too much and declare victory. Will file a follow up bug to try balancing (so long as this patch doesn't regress V8) in an attempt to iterate early and often.
Comment 19•14 years ago
|
||
(In reply to comment #18) > (In reply to comment #16) > > Too many args and a mostly (all but one bit ;-) wasted return type -- suggest > > returning Shape * directly for some of those instruction count losses one back. > > We should be able to force inline that function to avoid the overhead (DCE the > unused arg stores). I'd rather a simpler signature with one fewer out params (Out params :-/) to fiddle with. > > But update just fills the PIC, there's no self-organization on a scan hit, and > > once full you're stuck with possibly irrelevant cache "lines". > > This is the way JM PICs work. :-) I know. This is a different animal, not a PIC based on literal property name (Self/Smalltalk message selector), or a likely repeated array index? See first para in comment 5. > Will file a follow up bug to try balancing (so long as this patch doesn't > regress V8) in an attempt to iterate early and often. Thanks, /be
Comment 20•14 years ago
|
||
(Curse this bug 130078 fallout, my typing results in even more typos than usual!)
Comment 21•14 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/62545b0da5bc
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Assignee | ||
Updated•14 years ago
|
Whiteboard: fixed-in-tracemonkey
Comment 23•14 years ago
|
||
http://hg.mozilla.org/tracemonkey/rev/62545b0da5bc
Comment 24•14 years ago
|
||
Fileup fodder filed as bug 597099.
Status: RESOLVED → UNCONFIRMED
Ever confirmed: false
Resolution: FIXED → ---
Comment 25•14 years ago
|
||
Not sure why the status changed, but changing back to resolved/fixed.
Status: UNCONFIRMED → RESOLVED
Closed: 14 years ago → 14 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•