Closed Bug 705756 Opened 13 years ago Closed 12 years ago

remove finddef for pure class load + static invocations

Categories

(Tamarin Graveyard :: Virtual Machine, defect, P2)

defect

Tracking

(Not tracked)

RESOLVED FIXED
Q2 12 - Cyril

People

(Reporter: pnkfelix, Unassigned)

References

Details

(Whiteboard: PACMAN)

Attachments

(2 files)

Math.abs() and friends are slower than they should be. We believe this is because the VM is still emitting finddef. (But Ed recently notes that it might be more subtle, in that the emitted finddef should be eliminatable but for some reason is not going away.)
Attached file microbenchmark A v1
On my MacBook Pro (2.8 Ghz Core 2 Duo, 4 GB memory): 15μs Math.abs 104μs C=Math; C.abs 76μs Mymath.absUntyped 11μs Mymath.absTyped 53μs staticAbsUntyped 71μs this.dispatchAbsUntyped 16μs staticAbsTyped 11μs this.dispatchAbsTyped 6μs (x<0)?-x:x Its not suprising that the inline code version at the end is fast. The problem is why is Mymath.absTyped and this.dispatchAbsTyped so much faster than Math.abs.
A last minute note from #tamarin: 7:01:08 PM edwsmith: i just noticed that finddef is already marked "PURE" which could be a bug. 7:01:26 PM edwsmith: but it also could mean we already remove finddef<Math> in the case we're thinking of. 7:01:32 PM edwsmith: if not, one wonders why not. 7:01:35 PM fklockii: I don't think we do 7:01:54 PM fklockii: At least, not to the degree that Lars wants 7:02:07 PM fklockii: I have a benchmark that may be of use 7:03:25 PM fklockii: I'll file a bug for this work and post my benchmark to it. 7:04:16 PM edwsmith: okay, i see, we're doing this: ptr = finddef<Math>; Math = load ptr[Math_offset]; nullcheck(Math). 7:04:30 PM edwsmith: the nullcheck keeps the load alive, which then keeps finddef alive. 7:05:19 PM edwsmith: if we can arrange for the load to return type (Math$, not-null), null check goes away. 7:05:27 PM fklockii: hmm. 7:05:37 PM fklockii: Okay, so this might be simpler than I had thought. 7:07:03 PM edwsmith: worth trying, at least.
amplification of what I wrote in IRC: > function my_abs(x:Number):Number { return Math.abs(x) } generates > var function(Number):Number /* disp_id=0 method_id=0 */ > { > // local_count=2 max_scope=0 max_stack=2 framesize=4 code_len=9 code_offset=137 > 0 findpropstrict {, private, http://adobe.com/AS3/2006/builtin, }::Math > 2 getproperty {, private, http://adobe.com/AS3/2006/builtin, }::Math > 4 getlocal1 > 5 callproperty {, private, http://adobe.com/AS3/2006/builtin, }::abs (1) > 8 returnvalue > } If we can arrange it so the result of 2:getproperty<Math> is type (Math$,not-null) then the null check generated after 5:callproperty<abs> should go away, leaving the getproperty result dead, leaving the finddef result dead. Dead code elimination by nanojit::Assembler should take care of the rest. class VarTracker in CodegenLIR has the ability to track certain expressions that are known to be not-null, even though the verifier thinks they are nullable. If we can mark the getproperty result nonnull this way, then it is a pure optimization that cannot effect Verifier behavior. else, If the verifier types the result of 2:getproperty as (Math,not-null) then that's a change that will cascade and have nonlocal effects (some desireable, of course). A change like that ultimately is observable, so we would want to version check it. recent changes to getproperty and applytype, for Vector<T>, are like this.
Blocks: 708204
Raising the heat a little bit here: we really want this for good float/float4 performance.
Priority: -- → P2
Target Milestone: --- → Q2 12 - Cyril
So this isn't right - there needs to be a guard on the object being the global object, something to do with checking the traits - but with this fix in place, the ray tracer speeds up to 42 fps, handily beating the scalar code: diff --git a/core/Verifier.cpp b/core/Verifier.cpp --- a/core/Verifier.cpp +++ b/core/Verifier.cpp @@ -2906,18 +2906,21 @@ Binding b = toplevel->getBinding(obj.traits, &multiname); Traits* propType = readBinding(obj.traits, b); emitCheckNull(state->sp()-(n-1)); // early bind slot if (AvmCore::isSlotBinding(b)) { + bool notNull = false; + if (multiname.getName() == core->kfloat4) // MORE HERE + notNull = true; coder->writeOp1(state, pc, OP_getslot, AvmCore::bindingToSlotId(b), propType); - state->pop_push(n, propType); + state->pop_push(n, propType, notNull); return; } // early bind accessor if (AvmCore::hasGetterBinding(b)) { // Invoke the getter int disp_id = AvmCore::bindingToGetterId(b);
Attached patch Temporary fixSplinter Review
I'm wondering if I can get some eyes on this fairly quickly (by Friday morning my time is ideal). I don't propose this for TR but if you think it is OK for fr-float for the time being (ie narrow enough not to cause any problems in practice) then I will land it there; that will unblock the performance bug.
Attachment #580012 - Flags: feedback?(fklockii)
Attachment #580012 - Flags: feedback?(edwsmith)
Comment on attachment 580012 [details] [diff] [review] Temporary fix If it works in tr-float, ship it. For TR, though, testing for a specific class should be done by an id. With some careful review, the logic could be this if (/* slot is a class */ && /* pool is builtin */) notnull = true the invariant should works as long as we don't have circular references in static init code for classes, where class A loads a reference to B, while B's static initializer is still running. Math.as is lazy-initialized, but Number.as and Float.as are included by builtin.as which is the first abc script to be initialized. Math.as probably belongs in builtin.as too.
Attachment #580012 - Flags: feedback?(edwsmith) → feedback+
Comment on attachment 580012 [details] [diff] [review] Temporary fix I'll want to play with it on my own copy of TR, but the logic here looks sound enough to me.
Attachment #580012 - Flags: feedback?(fklockii) → feedback+
OK, thanks both. I'll land in tr-float and fr-float but with a comment that says not to merge it back to TR in its current shape.
Pushed: changeset: 7052:25acba717e3c tag: tip user: Lars T Hansen <lhansen@adobe.com> date: Thu Dec 08 16:18:50 2011 +0100 summary: Fix 708204 - RayTracer-float4-V slower than RayTracer-float-V Also fr-float: Change 1011234 created with 3 open file(s). Submitting change 1011234. Locking 3 files ... edit //depot/users/tamarin/float/third_party/avmplus/core/AvmCore.cpp#9 edit //depot/users/tamarin/float/third_party/avmplus/core/AvmCore.h#8 edit //depot/users/tamarin/float/third_party/avmplus/core/Verifier.cpp#9
(In reply to Lars T Hansen from comment #9) > OK, thanks both. I'll land in tr-float and fr-float but with a comment that > says not to merge it back to TR in its current shape. This has actually landed in tamarin-redux in changeset: 25acba717e3c http://hg.mozilla.org/tamarin-redux/rev/25acba717e3c Lars/Felix what needs to be done here in order to clean this up? (Felix: I will also make a note to myself to ping Lars about this offline when he gets back)
the string-matching should at least be replaced by id matching, per comment #7. (match builtin class ids, instead of string names). That said, I can't think of anything glaringly unsound with the code as-is. Its on a code path that has already early bound to a class slot, and is well guarded for a known set of builtin classes.
Assignee: nobody → fklockii
See Also: → 515722
Whiteboard: PACMAN
Assignee: fklockii → nobody
(i guess this is fixed. or at least Ed signed off on the hack that Lars put in. I don't remember how much more I was hoping for, maybe the right thing would be to check how it affected the microbenchmark I posted to this ticket.)
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: