Closed
Bug 592965
Opened 14 years ago
Closed 11 years ago
JM: Inline PutCallObject
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
RESOLVED
WORKSFORME
People
(Reporter: dvander, Assigned: dvander)
References
Details
Attachments
(1 file, 3 obsolete files)
8.98 KB,
patch
|
Details | Diff | Splinter Review |
PutCallObject is fairly simple, but expensive. It copies all args and locals and does some dynamic tests. We statically known exactly which slots must be written, so we can specialize it in the return path of methods. This doesn't do anything for SS, maybe 2% for v8-v4, but a lot for v8-v5. Before: Richards: 7298 DeltaBlue: 2806 Crypto: 3177 RayTrace: 1811 EarleyBoyer: 2915 ---- Score: 3215 After: Richards: 7417 DeltaBlue: 2694 Crypto: 3760 RayTrace: 1769 EarleyBoyer: 4061 ---- Score: 3519
Assignee | ||
Comment 1•14 years ago
|
||
Attachment #471436 -
Flags: review?(dmandelin)
Assignee | ||
Comment 2•14 years ago
|
||
This patch is also needed for further argument optimizations, on top of Luke's work.
Attachment #471437 -
Flags: review?(dmandelin)
Assignee | ||
Comment 3•14 years ago
|
||
not ready for review yet because deltablue and SS get slightly slower (SS by ~5ms).
Comment 4•14 years ago
|
||
Comment on attachment 471437 [details] [diff] [review] part 2: mark which arguments escape >+ if (!(cg->flags & TCF_FUN_CALLS_EVAL) && pn->pn_defn && Nit: break after every && (||) so as to avoid obscuring the clauses. This is common style when writing multiline conjunctions and disjunctions. Nit: "ClosedOverName"? Perhaps not, thought I would moot it. > case TOK_ARGSBODY: >- ok = js_EmitTree(cx, cg, pn->last()); >+ { >+ JSParseNode *pnlast = pn->last(); >+ for (JSParseNode *pn2 = pn->pn_head; pn2 != pnlast; pn2 = pn2->pn_next) { >+ if (!BindNameToSlot(cx, cg, pn2)) >+ return JS_FALSE; >+ if (JOF_OPTYPE(pn2->pn_op) == JOF_QARG && >+ !(cg->flags & TCF_FUN_USES_ARGUMENTS) && The middle condition is loop-invariant, so hoist it and make the outer block around this case's statements the |then| block of an if, ending before the ok = js_EmitTree(...) call. >+++ b/js/src/jsinterp.cpp >@@ -2625,6 +2625,7 @@ END_CASE(JSOP_TRACE) > /* ADD_EMPTY_CASE is not used here as JSOP_LINENO_LENGTH == 3. */ > BEGIN_CASE(JSOP_LINENO) > BEGIN_CASE(JSOP_DEFUPVAR) >+BEGIN_CASE(JSOP_DEFUPARG) > END_CASE(JSOP_DEFUPVAR) Saying it with bytecode helps something (I don't see what, in this patch, though) but it occurs to me that script meta-data processed by native code might be a lot faster than prolog ops dispatched by some interpreter or compiled by some JIT. But maybe I'm missing the motivation for emitting JSOP_DEFUP{VAR,ARG} prolog ops. I remember JSOP_DEFUPVAR but didn't think about the script meta-data alternative then. /be
Assignee | ||
Comment 5•14 years ago
|
||
Comment on attachment 471437 [details] [diff] [review] part 2: mark which arguments escape (In reply to comment #4) > The middle condition is loop-invariant Whoops - just realized this condition shouldn't even be there. > Saying it with bytecode helps something (I don't see what It's a prerequisite to optimizations that rely on knowing which arguments can't be mutated by implicit calls. That it's expressed in bytecode as a nop (nothing emitted in JIT) is just a hack. We could hang this information off the script.
Attachment #471437 -
Flags: review?(dmandelin)
Assignee | ||
Comment 6•14 years ago
|
||
Comment on attachment 471436 [details] [diff] [review] part 1: mark scripts as using arguments Okay, patch queue fail - up too late. Was getting the wrong baseline. This is a slight perf win (eliminates every PutCallObject call) but nowhere near as impossibly dramatic. I'm going to file a separate bug for the DEFUPARG analysis, and move both that and DEFUPVAR into the script rather than a prolog op. That will block this and other arguments-related optimizations.
Attachment #471436 -
Attachment is obsolete: true
Attachment #471436 -
Flags: review?(dmandelin)
Assignee | ||
Comment 7•14 years ago
|
||
Improves v8-v5 earley-boyer by about 100 points on my machine, but it's probably not worth taking this. It only seems to gain about 5 nanoseconds per PutCallObject removal on a microbenchmark. Tabling for a rainy day.
Attachment #471437 -
Attachment is obsolete: true
Attachment #471438 -
Attachment is obsolete: true
Comment 8•14 years ago
|
||
is this still not worth taking? 100 points is currently approx 14% of the difference between us and the complete v8 score on awfy (729 ms diff) it's also currently raining in Plattsburgh, NY
Comment 9•14 years ago
|
||
Seems to be an easy performance win, and getting a higher score will probably impress some kids. We already had snow in Germany.
Comment 10•14 years ago
|
||
Just to clarify, in response to comment 8, 100 points on the V8 score is completely different than 100 ms on AWFY. With that being said, we are currently far behind V8 on the Earley-Boyer test, so any improvement there (particularly if it is low-hanging fruit) may be worth the effort.
Updated•11 years ago
|
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → WORKSFORME
You need to log in
before you can comment on or make changes to this bug.
Description
•