On trace, replace js_SetCallArg/js_SetCallVar with specialized code

RESOLVED FIXED

Status

()

RESOLVED FIXED
9 years ago
9 years ago

People

(Reporter: bzbarsky, Assigned: bzbarsky)

Tracking

Trunk
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: fixed-in-tracemonkey)

Attachments

(1 attachment, 2 obsolete attachments)

Created attachment 415692 [details] [diff] [review]
Proposed patch

Similar to bug 530225, but for the set case.  This situation is simpler, since we were already basically calling the sprop setter on trace directly; this is just unwinding the onion-layers of that setter and leaving just the guts.  This applies on top of bug 530225.

Again, this only works if I can assume all the relevant slots already exist, etc; if not then we need to go through JS_SetReservedSlot.  I'd really like to avoid that if I can (esp. the locking).

And again, we could poke dslots directly here, I think.  Would that be ok?

This speeds up the slice test on dromaeo from 170 runs/s (after bug 530225) to
200 runs/s or so.  The overall js string tests go from 280 runs/s to 460 runs/s with this patch and the one in bug 530225.  That's almost within spitting distance of Safari (500 runs/s) and faster than Chrome (405 runs/s).
Attachment #415692 - Flags: review?(dmandelin)
Attachment #415692 - Flags: review?(brendan)
This looks good, aside from the adj_slot issue I mentioned in the other bug.

This is really good work. I'm pleased (and impressed) that it yields such a big speedup.
(Assignee)

Comment 2

9 years ago
Er, I meant bug 530255 throughout.
Depends on: 530255
No longer depends on: 530225

Comment 3

9 years ago
I would like to randomly express my dismay with the fact that "string" tests in Dromaeo are gated by closure access performance. We need a decent benchmark suite.
Blocks: 532568
Boris, hope you don't mind taking ownership since you are patching. If you want to hand off patches, just call for a volunteer new assignee. Great to see your work here, you are inducted into the TraceMonkey JIT-grooming troop!

/be
Assignee: general → bzbarsky
OS: Mac OS X → All
Hardware: x86 → All
Attachment #415692 - Flags: review?(brendan) → review+
Comment on attachment 415692 [details] [diff] [review]
Proposed patch

>+template<typename T>
>+inline JSBool
>+SetInClosure(JSContext* cx, JSObject* call, uint32 slot, jsval v)
>+{
>+    JS_ASSERT(OBJ_GET_CLASS(cx, call) == &js_CallClass);
>+    JS_ASSERT(slot < T::maxObjSlots(call));
>+
>+    /*
>+     * Do we also want to assume that this will always live in dslots
>+     * (which is the case for Call objects, in fact), or not be quite
>+     * so brazen?

This code is a brazen hussy, smooching jsfun.cpp and its manly static assertions that uphold the dslots guarantee! Please change this comment to reference the static asserts over there, and don't be coy about it! ;-)

>+    JS_ASSERT(sprop->flags & SPROP_HAS_SHORTID);
>     LIns* args[] = {
>         box_jsval(v, v_ins),
>-        INS_CONST(SPROP_USERID(sprop)),
>+        INS_CONST(uint32(sprop->shortid)),

This is tricky -- you need a uint16 cast, if I'm not mistaken, not uint32.

r=me with these changes and dmandelin's blessing,

/be
(Assignee)

Comment 6

9 years ago
> This is tricky -- you need a uint16 cast, if I'm not mistaken, not uint32.

Could be, but that code change is gone anyway with LIR-ification.
(Assignee)

Comment 7

9 years ago
Created attachment 415807 [details] [diff] [review]
Better, stronger, faster

This passes dmandelin's upvar tests (at least as much so as m-c does, at least), and gets us up to 650 runs/s or so on the dromaeo-object-string tests.
Attachment #415692 - Attachment is obsolete: true
Attachment #415807 - Flags: review?(dmandelin)
Attachment #415807 - Flags: review?(brendan)
Attachment #415692 - Flags: review?(dmandelin)
(Assignee)

Updated

9 years ago
Blocks: 501472
(Assignee)

Comment 8

9 years ago
OK, I ran full dromaeo (including the DOM tests, etc, not what runs on our tbox).  A lot of that stuff doesn't hit the upvar code, so the speedup across the board was only about 6%.  Still, better than nothing.
Comment on attachment 415807 [details] [diff] [review]
Better, stronger, faster

>+    if (!callobj->getPrivate()) {
>+        // Due to the parent guard in guardCallee, we know that on trace we
>+        // will also not have a non-null private in the Call object.  So all we

Nit: French spacing frowned upon in local style.

>+        // need to do is write the value to the Call object's slots.

Nit: s/slots/slot/

This comment could be more explicit: "Because of the parent guard in guardCallee ensures this Call object is the same one on trace, and because once a Call object loses its frame it never regains one, ...."

>+        JS_ASSERT(sprop->flags & SPROP_HAS_SHORTID);
>+        int32 offset = sprop->shortid;
>+        if (sprop->setter == SetCallArg)
>+            offset += ArgClosureTraits::slot_offset(callobj);
>+        else if (sprop->setter == SetCallVar)
>+            offset += VarClosureTraits::slot_offset(callobj);
>+        else
>+            RETURN_STOP("can't trace special CallClass setter");
>+        LIns* base =
>+            lir->insLoad(LIR_ldp, callobj_ins, offsetof(JSObject, dslots));

Nit: no need to break this across two lines. Might be easier to read with a blank line before, though.

>+        lir->insStorei(box_jsval(v, v_ins), base, offset*sizeof(jsval));

Nit: space around binary ops.

/be
Attachment #415807 - Flags: review?(brendan) → review+
Er, my wording is slightly off ("Because of the blah ensures blargh"), but you get the idea ;-).

/be
Fixed the nits locally.
The intarwebs misled me on "French spacing" meaning two spaces after full stop, can you believe it? The wikipedia page adverts to the crazy inversion of the original meaning. I'll try to change. Vive la France!

/be
Comment on attachment 415807 [details] [diff] [review]
Better, stronger, faster

6% speedup across the board from a 15-line patch is a big win in my book.
Attachment #415807 - Flags: review?(dmandelin) → review+
Well, it needs bug 530255 as well for the 6% thing.
Created attachment 416019 [details] [diff] [review]
Updated to comments
Attachment #415807 - Attachment is obsolete: true
Pushed http://hg.mozilla.org/tracemonkey/rev/d292473f8ebe
Whiteboard: fixed-in-tracemonkey

Comment 18

9 years ago
http://hg.mozilla.org/mozilla-central/rev/d292473f8ebe
Status: NEW → RESOLVED
Last Resolved: 9 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.