Closed
Bug 592976
Opened 15 years ago
Closed 15 years ago
JM: Optimize formal arguments access
Categories
(Core :: JavaScript Engine, defect)
Core
JavaScript Engine
Tracking
()
RESOLVED
FIXED
People
(Reporter: dvander, Assigned: dvander)
References
Details
(Whiteboard: fixed-in-tracemonkey)
Attachments
(3 files, 1 obsolete file)
|
659 bytes,
application/javascript
|
Details | |
|
8.67 KB,
patch
|
luke
:
review+
|
Details | Diff | Splinter Review |
|
26.23 KB,
patch
|
luke
:
review+
|
Details | Diff | Splinter Review |
We can treat accesses to local arguments (almost) like local variables. "Almost" depends on the outcome of |f.arguments| breakage. If we can't nop sets to these ninja'd arguments objects, we have to eagerly write the results of SETARG/FORARG to memory. However, we can still copy propagate.
| Assignee | ||
Comment 1•15 years ago
|
||
Right now, args are roughly 10% slower than local variables.
Assignee: general → dvander
Status: NEW → ASSIGNED
| Assignee | ||
Updated•15 years ago
|
OS: Linux → All
Hardware: x86_64 → All
| Assignee | ||
Comment 2•15 years ago
|
||
This makes args as fast as locals in the attached test case. For the most part it's a very small win, maybe 2ms on SunSpider, but we lose 1ms reproducibly on date-format-xparb. No change in v8.
Posting the patch for posterity, I'll look at it again once all of the necessary bugs have landed.
| Assignee | ||
Comment 3•15 years ago
|
||
Comment on attachment 475314 [details] [diff] [review]
treat args as locals
Getting review now in case this wants to land in a week ;)
Attachment #475314 -
Flags: review?(dmandelin)
Comment 4•15 years ago
|
||
Comment on attachment 475314 [details] [diff] [review]
treat args as locals
Looks good to me. A few questions/requests:
1.
+FrameState::getArg(uint32 slot)
+{
+ JS_ASSERT(slot < nargs);
+ FrameEntry *fe = &entries[slot];
Is this future-proof? I.e., are we requiring that entries[0] always be the first arg? Might be good to put a thin abstraction in place here.
2. isClosedVar and isClosedArg are a little funny as names. I'm not sure what a better name would be--it seems the sense is "can this var be accessed otherwise than through its local name". isNonLocal sounds about right but I'm not sure the terminology is familiar there. isAliased kind of works, but is not accurate about the closed-over part. Maybe maybeImplicitlyAccessed or the like?
In addition, or alternatively, it would be good to comment those functions on exactly what they do mean.
3. Does trace-tests exercise the main paths through incarg et al? If not, please add some test cases for those.
On that note, it might be a good idea to add some tests for the major kinds of implicit accesses we are trying to detect and deoptimize for.
Other than the test coverage, I think all these could be addressed as followup bugs if that is more convenient than updating this patch.
Attachment #475314 -
Flags: review?(dmandelin) → review+
| Assignee | ||
Comment 5•15 years ago
|
||
This patch bitrotted enough to warrant a rewrite. The new queue addresses most of Dave's comments, and the tracer incarg tests caught bugs along the way so I think we've got good coverage.
Since the frame depth will now include arguments and (callee, this), this patch removes frameDepth() to audit its callers and use the appropriate semantics for each.
Attachment #475314 -
Attachment is obsolete: true
Attachment #488541 -
Flags: review?(lw)
| Assignee | ||
Comment 6•15 years ago
|
||
Changes in this patch:
* Track callee in FrameState, always as an object.
* Track |this| in FrameState, sometimes as an object.
* Track arguments in FrameState.
* Rewrite arginc & friends to be much simpler and saner.
On the microbenchmark, before -
Arg perf: 145
Var perf: 114
After -
Arg perf: 112
Var perf: 115
SunSpider shows maybe a 2ms win, v8 shows a 2.5% win, both -m only. v8 mostly seeps up in earley-boyer, deltablue, and richards. These do lots of constructing, and register allocation is much better with lots of writes to "this" in a row.
Attachment #488543 -
Flags: review?(lw)
| Assignee | ||
Updated•15 years ago
|
Attachment #488543 -
Attachment description: part 2: change some invariants → part 2: track args, callee, this
Comment 7•15 years ago
|
||
Out-of-order comment on part 2: could you remove RematInfo::isConstant() since its not used? (Otherwise, there is the latent problem that isConstant() implies inMemory())
Comment 8•15 years ago
|
||
Comment on attachment 488541 [details] [diff] [review]
part 1: remove FrameState::frameDepth()
> OOL_STUBCALL_SLOTS(JS_FUNC_TO_DATA_PTR(void *, stubs::UncachedCall),
>- frame.frameDepth() + frameDepthAdjust);
>+ frame.localSlots() + frameDepthAdjust);
Since 'slots' seems to usually mean frame.frameSlots(), perhaps the macro name could be changed to OOL_STUBCALL_LOCAL_SLOTS ?
>- callIC.frameSize.initStatic(frame.frameDepth(), speculatedArgc - 1);
>+ callIC.frameSize.initStatic(frame.localSlots(), speculatedArgc - 1);
For regularity, could you also change the corresponding member of FrameSize from staticFrameDepth to staticLocalSlots?
Attachment #488541 -
Flags: review?(lw) → review+
Comment 9•15 years ago
|
||
Comment on attachment 488543 [details] [diff] [review]
part 2: track args, callee, this
Looks good, only nits:
>+ frame.learnThisIsObject();
Cool
>+ // Before:
>+ // After: V
>+ frame.pushArg(slot);
>+
>+ // Before: V
>+ // After: 1
>+ frame.push(Int32Value(amt));
>+
>+ // Note, SUB will perform integer conversion for us.
>+ // Before: V
>+ // After: N+1
>+ jsop_binary(JSOP_SUB, stubs::Sub);
I think the second after and first before should be "V 1" ?
Also, it might be a bit shorter to just have a single between each push/pop that just says what the current stack state is...
>- args = entries;
>+ callee_ = entries;
>+ this_ = entries + 1;
>+ args = entries + 2;
> locals = args + nargs;
> spBase = locals + script->nfixed;
> sp = spBase;
Not related to the patch, but its a bit disturbing when trying to see what's local vs. memvar to see mixed used of trailing _. Perhaps they could be all _?
>+ if (!eval) {
>+ if (script->nslots) {
>+ closedVars = (JSPackedBool *)cursor;
>+ cursor += sizeof(JSPackedBool) * script->nslots;
>+ }
>+ if (!usesArguments && nargs) {
>+ closedArgs = (JSPackedBool *)cursor;
>+ cursor += sizeof(JSPackedBool) * nargs;
>+ }
> }
Perhaps set to null in else path to make potential bugs unexploitable?
>+ uint32 indexOf(int32 depth) { return uint32((sp + depth) - entries); }
>+ uint32 indexOfFe(FrameEntry *fe) const { return uint32(fe - entries); }
>+ uint32 feLimit() { return script->nslots + nargs + 2; }
Perhaps add asserts that fe/depth are within feLimit() ?
Attachment #488543 -
Flags: review?(lw) → review+
| Assignee | ||
Comment 10•15 years ago
|
||
Whiteboard: fixed-in-tracemonkey
Comment 11•15 years ago
|
||
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•