Closed
Bug 589398
Opened 13 years ago
Closed 13 years ago
JM: Use a PIC for JSOP_NEW's prototype lookup
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
RESOLVED
FIXED
People
(Reporter: dvander, Assigned: dvander)
References
Details
(Whiteboard: fixed-in-tracemonkey)
Attachments
(1 file, 11 obsolete files)
126.56 KB,
patch
|
dvander
:
review+
|
Details | Diff | Splinter Review |
For JSOP_NEW we call a NewObject stub that does a property lookup and Object allocation. The property lookup can be PIC'd. Disclaimer: I have not yet investigated how much this is costing us. Our call ICs have a fast path that guards on the callee object identity. The prototype lookup can be baked in as part of the fast path. This would be very easy to add after bug 587698 lands. Even easier (no patching needed) would be to just pass the IC struct to NewObject, and store the shape & sprop in the IC struct. There are many other approaches. Re-using the existing GETPROP machinery is probably overkill and would be hard to integrate without hoisting it above the entire call path. That would be slightly slower in the fastest case (minimally, you need to guard on the callee like the call path does), but conceptually much cleaner.
Comment 1•13 years ago
|
||
On SS this is all in access-binary-trees and costing 4-5ms iirc. Much more in V8.
![]() |
Assignee | |
Comment 2•13 years ago
|
||
Thanks - I don't get that much but this approach could probably be better. This eliminates the getProperty() for SS and v8. 1% on SS and v8 each.
Comment 3•13 years ago
|
||
dvander said that I could drive this into the tree. Will check how it does on binary trees, where we see this matter in profiling.
Assignee: general → cdleary
Status: NEW → ASSIGNED
![]() |
||
Updated•13 years ago
|
blocking2.0: --- → ?
Updated•13 years ago
|
![]() |
Assignee | |
Comment 5•13 years ago
|
||
un-stealing with permission. JSC places the constructor in the callee - I think this is a good approach, since then we don't have to sync |thisv| before the call. the hard part will be rejiggering generatePrologue() to be callable without a Compiler instance, since we don't want to re-JIT methods but functions must support both constructing and non-constructing calls.
Assignee: cdleary → dvander
![]() |
Assignee | |
Comment 6•13 years ago
|
||
Moves |this| construction into the callee. Unfortunately there's a bunch of special cases - the empty script, InvokeConstructor on non-natives, etc, so there is a fair amount of duplication. tracer changes coming next.
Attachment #468077 -
Attachment is obsolete: true
![]() |
Assignee | |
Comment 7•13 years ago
|
||
![]() |
Assignee | |
Comment 8•13 years ago
|
||
Thinking aloud, it would be nice if the call object creation could go in JSOP_BEGIN. script->nfixed initialization, too.
![]() |
Assignee | |
Comment 9•13 years ago
|
||
I guess that's hard because of: /* * Compute |this|. Currently, this must happen after the frame is pushed * and fp->scopeChain is correct because the thisObject hook may call * JS_GetScopeChain. */ So nevermind; both callobj and nfixed initialization will have to be included in the method prologue.
![]() |
||
Comment 10•13 years ago
|
||
As soon as brain transplants land, bug 592992 wants to hoist the thisObject hook to ExternalInvoke (since it can only happen from extrnal api calls), which should remove the constraint.
![]() |
Assignee | |
Comment 11•13 years ago
|
||
Awesome! I'll leave it as a follow-up, then. I think it will simplify things.
![]() |
Assignee | |
Comment 12•13 years ago
|
||
More factoring to split out prolog generation, fixes a bunch of OOM/refcount bugs in PICs in the process.
Attachment #479653 -
Attachment is obsolete: true
Attachment #479677 -
Attachment is obsolete: true
![]() |
Assignee | |
Comment 13•13 years ago
|
||
Comment on attachment 479718 [details] [diff] [review] WIP v2 It turns out the prologue-bifurcation idea is quite difficult to make fast. Right now we have the caller checking for primitive |this|, and reading |thisv| out of the old frame. It can't do that anymore with an arity mismatch; |this| is written after the frame-fixup. We can't guarantee that |thisv| is written to both the canonical and actual argv without mysterious and undesirable changes to the tracer. Really, we want the callee to perform |thisv| fixups, but if we did that with only header-splitting, every return would need an extra bit check in fp->flags and then some gross stuff. So, new approach. We'll give JSScript the ability to have two JITScripts, one for normal calls and another for constructing. This would slow down how IC stubs access their members however, by adding a few loads and a branch. Part of this patch queue is going to pass IC pointers directly to stubs.
Attachment #479718 -
Attachment is obsolete: true
![]() |
Assignee | |
Comment 14•13 years ago
|
||
Attachment #480003 -
Flags: review?(lw)
![]() |
Assignee | |
Comment 15•13 years ago
|
||
Attachment #480004 -
Flags: review?(dmandelin)
![]() |
||
Comment 16•13 years ago
|
||
Comment on attachment 480003 [details] [diff] [review] part 1: formalize compiler's cx->fp() requirement I know its pre-existing, but should Compiler::Compile be Compiler::compile? r+ regardless.
Attachment #480003 -
Flags: review?(lw) → review+
![]() |
Assignee | |
Comment 17•13 years ago
|
||
method JIT changes aren't done yet, but this part is nicely separable
Attachment #480011 -
Flags: review?(lw)
Comment 18•13 years ago
|
||
Comment on attachment 480004 [details] [diff] [review] part 2: pass IC pointers directly to stubs I noticed that there is callIC.addrLabel1 = stubcc.masm.moveWithPatch(ImmPtr(NULL), Registers::ArgReg1); callIC.addrLabel2 = stubcc.masm.moveWithPatch(ImmPtr(NULL), Registers::ArgReg1); Looks like maybe those should use passPICAddress (unless there is some other constraint).
Attachment #480004 -
Flags: review?(dmandelin) → review+
![]() |
Assignee | |
Comment 19•13 years ago
|
||
fixes IRL comments about InvokeConstructor and CreateThis
Attachment #480011 -
Attachment is obsolete: true
Attachment #480200 -
Flags: review?(lw)
Attachment #480011 -
Flags: review?(lw)
![]() |
Assignee | |
Comment 20•13 years ago
|
||
with some extra gunk removed
Attachment #480200 -
Attachment is obsolete: true
Attachment #480202 -
Flags: review?(lw)
Attachment #480200 -
Flags: review?(lw)
![]() |
Assignee | |
Comment 21•13 years ago
|
||
This patch passes tests except for debug mode; part 5 will fix that. Maybe a 1% change SunSpider, biggest % win is in access-binary-trees, but that's about 0.4ms on my machine. v8 is better. With "-m -j" it's about a 5.5% win (132ms in over-optimistic AWFY units), and 9.6% win with "-m" alone (284ms).
Attachment #480280 -
Flags: review?(dmandelin)
![]() |
Assignee | |
Comment 22•13 years ago
|
||
Lets debug mode work with new JITScript bifurcation.
Attachment #480338 -
Flags: review?(sstangl)
Comment 23•13 years ago
|
||
Comment on attachment 480280 [details] [diff] [review] part 4: bifurcation, PIC r+ with these small changes: 1. Use script->maybeNativeMap() at the point where nmap is needed in jsinterp, just so we have less mutable state to go wrong. 2. Consider some extra abstraction around fp->isConstructing(), especially in the compiler, such as creating a member for constructor mode. It always kind of bothered me that the compiler reaches into the current frame directly to figure out what to do. I'm not sure that change is a good idea, though. 3. Add comments to loadReturnValue, fixPrimitiveReturn, and constructThis. In particular, for this kind of function I really like to have an explanation of what the inputs and outputs are of the generated code. 4. There is a comment "Any value value" in jsscript.h.
Attachment #480280 -
Flags: review?(dmandelin) → review+
Comment 24•13 years ago
|
||
(In reply to comment #21) > Created attachment 480280 [details] [diff] [review] > part 4: bifurcation, PIC This patch changes the "register version" of FrameState::storeTo() by adding a 64-bit specific patch. However, the #define used is "JS_PUNBOX", when the 64-bit value format is "JS_PUNBOX64". Additionally, it is only beneficial to use loadValueAsComponents() when both values have to be loaded -- otherwise, loading a single part is faster. So the condition must be > fe->type.inMemory() && fe->data.inMemory()
Comment 25•13 years ago
|
||
(In reply to comment #14) > Created attachment 480003 [details] [diff] [review] > part 1: formalize compiler's cx->fp() requirement This introduces Compiler::Compile() and Compiler::performCompilation(), which mean the same thing. Since the purpose of ::Compile() is to be called by mjit::TryCompile(), would it make sense to rename Compiler::Compile() to Compiler::TryCompile(), and Compiler::performCompilation() to Compiler::Compile()?
Updated•13 years ago
|
Attachment #480338 -
Flags: review?(sstangl) → review+
![]() |
||
Comment 26•13 years ago
|
||
Comment on attachment 480202 [details] [diff] [review] part 3: interp+tracer changes Nice work! >- Value &thisv = fp->functionThis(); >- JS_ASSERT_IF(flags & JSINVOKE_CONSTRUCT, !thisv.isPrimitive()); >- if (thisv.isObject() && !(flags & JSINVOKE_CONSTRUCT)) { >+ if (!(flags & JSINVOKE_CONSTRUCT)) { >+ Value &thisv = fp->functionThis(); Cool >@@ -1156,8 +1161,9 @@ InvokeConstructor(JSContext *cx, const C >+ // Scripts create their own |this|. >+ if (!fun || fun->isNative()) { >+ JSObject *obj = js_CreateThis(cx, callee); Perhaps: /* Scripts create their own |this| in JSOP_BEGIN */ ? Also, so that the reader doesn't have to spend time wondering if there are non-isNative() non-isInterpreted() functions, could you s/fun->isNative()/!fun->isInterpreted()/ ? >+ // The interpreter fixes rval for us. >+ JS_ASSERT(!fun->isInterpreted()); /* */ >+ regs.fp->formalArgs()[-1].setObject(*obj2); Perhaps regs.fp->functionThis().setObject(*obj2); >+// Specialized call for constructing |this| with a known function callee, >+// and a known prototype. /* */
Attachment #480202 -
Flags: review?(lw) → review+
![]() |
Assignee | |
Comment 27•13 years ago
|
||
Review follow-ups: (In reply to comment #16) > Should Compiler::Compile be Compiler::compile? Yeah, I'll clean this up. Sean had good complaints here too. (In reply to comment #18) > Looks like maybe those should use passPICAddress passPICAddress only works on PICs; since there's only two callIC sites it just got expanded. (In reply to comment #23) > 1. Use script->maybeNativeMap() at the point where nmap is needed in jsinterp, just so we have less mutable state to go wrong. Actually caching this state could break after recompilation, so thanks, will fix. > 2. Consider some extra abstraction around fp->isConstructing ... compiler reaches into the current frame directly Okay, this makes sense given that we cache other things too. Though, it doesn't peek into the current frame, but the given frame. Will do on the rest. (In reply to comment #24) (In reply to comment #25) (In reply to comment #26) Thanks, will fix these.
![]() |
Assignee | |
Comment 28•13 years ago
|
||
http://hg.mozilla.org/tracemonkey/rev/de5d1b528b9a
Whiteboard: fixed-in-tracemonkey
![]() |
Assignee | |
Comment 29•13 years ago
|
||
backed out as: http://hg.mozilla.org/tracemonkey/rev/d6e4ffae0a68 http://hg.mozilla.org/tracemonkey/rev/568660610e19
Whiteboard: fixed-in-tracemonkey
![]() |
Assignee | |
Comment 30•13 years ago
|
||
Some follow-up fixes: Bump XDR, fix no-IC builds, remove bogus asserts, don't explicitly mark fp->thisValue (stack space scanner does so automatically & conservatively).
Attachment #480003 -
Attachment is obsolete: true
Attachment #480004 -
Attachment is obsolete: true
Attachment #480202 -
Attachment is obsolete: true
Attachment #480280 -
Attachment is obsolete: true
Attachment #480338 -
Attachment is obsolete: true
Attachment #480800 -
Flags: review+
![]() |
Assignee | |
Comment 31•13 years ago
|
||
take two: http://hg.mozilla.org/tracemonkey/rev/32b049250e03
Whiteboard: fixed-in-tracemonkey
Comment 32•13 years ago
|
||
JSOP_BEGIN is always emitted at the start of an interpreted function's script, so why not just hardcode its guts (interp, mjit, tracer) on entry to function code? That seems strictly faster in terms of dispatch -- native C++ optimized and PGO'ed instead of a bytecode to interpret or JIT-compile. /be
![]() |
Assignee | |
Comment 33•13 years ago
|
||
On v8, regress-x86 shows this as a 258ms (9%) win for -m, and 88ms (3.7%) win for -m -j. SunSpider appears to have gotten a 5ms (1%) regression. It's in: * access-fannkuch (2.7ms) * math-cordic (2ms) This ate up a 1.7ms win on access-binary-trees. This is curious as neither calls |new| in any performance meaningful way. I will investigate and file a follow-up tomorrow.
Comment 34•13 years ago
|
||
You might want to compare browser before and after. In the shell, fannkuch especially is subject to random but repeatable gain or loss in speed; I think it is very sensitive to the layout of the four arrays it accesses vis a vis the cache, and since the shell is deterministic (unlike the browser) values will be laid out the same way every time it runs. See bugs 584917 (compares browser vs. shell) and 580492 (uninvestigated, probably the same thing).
![]() |
Assignee | |
Comment 35•13 years ago
|
||
Urgh, what a pill crab. This is all too familiar. The "bug" is that two members were removed from JSScript. If I add padding in, the perf hit goes away (and it's sometimes quite large, 10% on my MBP). Luke wants a follow-up bug to investigate these weird effects. For now, I'll add two members to JSScript in the fix for bug 601982.
![]() |
||
Comment 36•13 years ago
|
||
(In reply to comment #35) > Urgh, what a pill crab. This is all too familiar. The "bug" is that two members > were removed from JSScript. If I add padding in, the perf hit goes away (and > it's sometimes quite large, 10% on my MBP). > > Luke wants a follow-up bug to investigate these weird effects. For now, I'll > add two members to JSScript in the fix for bug 601982. Three cheers for measuring instruction counts as well as time!
![]() |
Assignee | |
Comment 37•13 years ago
|
||
(In reply to comment #32) Moving the guts of JSOP_BEGIN out means putting it in both JSOP_NEW and the prologue of Interpret(). The new opcode cleanly focuses the semantics. The dispatch overhead shouldn't matter since we're interpreting so rarely.
Comment 38•13 years ago
|
||
(In reply to comment #37) > (In reply to comment #32) > > Moving the guts of JSOP_BEGIN out means putting it in both JSOP_NEW and the > prologue of Interpret(). C'mon, that's what inline helpers are for. > The new opcode cleanly focuses the semantics. The > dispatch overhead shouldn't matter since we're interpreting so rarely. I figured, but this is probably a mistake, or just more work to refactor later when eliminating the interpreter. Why not use a (static or class-based) inline helper now and be done? /be
Comment 39•13 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/32b049250e03
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•