Closed Bug 594247 Opened 14 years ago Closed 14 years ago

JM: Add specialized codepaths for typed arrays

Categories

(Core :: JavaScript Engine, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: bzbarsky, Assigned: dvander)

References

(Blocks 2 open bugs, )

Details

(Whiteboard: fixed-in-tracemonkey)

Attachments

(2 files, 7 obsolete files)

On the testcase in the url field, we spend 11% of the time doing SetElem on typed arrays (canvas imagedata) and another 2% of the time doing GetElem.
Blocks: 590379
Blocks: 601988
Blocks: 604905
Blocks: 497458
Assignee: general → dvander
Status: NEW → ASSIGNED
The second test case in bug 601988 goes from 3.1s to 1.7s. v8 gets 2.6s.

bug 604905's test case:

pre -m -j:
(10x10) = 1ms
(50x50) = 4ms
(100x100) = 13ms
(500x500) = 364ms
(1000x1000) = 1500ms
 
post -m -j:
(10x10) = 1ms
(50x50) = 3ms
(100x100) = 10ms
(500x500) = 283ms
(1000x1000) = 1160ms

v8:
(10x10) = 1ms
(50x50) = 2ms
(100x100) = 2ms
(500x500) = 52ms
(1000x1000) = 191ms
Comment on attachment 487055 [details] [diff] [review]
part 1: ICs for reading from typed arrays

Do we already have tests for 32-bit typed arrays, especially the unsigned ones? We need to have those for coverage on the conversion optimizations.

>+static inline int
>+TypedArraySlotWidth(js::TypedArray *tarray)

Maybe this should be a method of TypedArray?

>+LookupStatus
>+GetElementIC::attachTypedArray(JSContext *cx, JSObject *obj, const Value &v, Value *vp)

I really liked the comments within this function--they made it much easier to review.

>+    // Return uncacheable so GetElement/CallElement will fetch the value.
>+    return Lookup_Uncacheable;

Ick. This would probably cause confusion someday. How about a new return value that indicates a cache was generated but we still need to do the lookup? Or pull the lookup into a helper function that can be called from here and return Lookup_Cacheable?
Attachment #487055 - Flags: review?(dmandelin)
forgot to hg add the int32 tests

New return value for LookupStatus would require auditing all its callers so this patch just fetches the value.
Attachment #487066 - Flags: review?(dmandelin)
Attachment #487066 - Flags: review?(dmandelin) → review+
dvander, what do things look like on the bug 604905 testcase if you do what the beginning of bug 604905 comment 17 says?
+1
Should definitely be a priority.
Attached patch fix x64 SingleByteRegs (obsolete) — Splinter Review
The "set" part of this patch - forthcoming - changes slightly how the assembler handles single-byte registers, since int8/uint8 arrays stress that path more than usual.

It looks like x64 restricts its single-byte register range, but the spec says that with a REX prefix (which the macro assembler emits), all sixteen registers have single-byte aliases.
Attachment #489397 - Flags: review?(sstangl)
Attached patch WIP, setting integer arrays (obsolete) — Splinter Review
The set part of this patch turned out to be significantly harder, but everything is in place to finish this up very soon. The problem is that when assigning to a typed array some kind of conversion might need to happen, and rather than give up and roll over - we do the conversion inline, like the tracer.

But this requires potentially making a call and/or mutating the RHS, which we haven't done in an IC so far. So a good part of this patch is expanding the call abstractions introduced in bug 609222 to accommodate more advanced use.

Remaining: float32/float64 arrays.
Attachment #489397 - Flags: review?(sstangl) → review+
Attached patch SetElementIC changes (obsolete) — Splinter Review
Attachment #489434 - Flags: review?(cdleary)
Comment on attachment 489434 [details] [diff] [review]
SetElementIC changes

Vlad, mind double-checking the typed-array coercion logic?
Attachment #489434 - Flags: review?(vladimir)
Comment on attachment 489434 [details] [diff] [review]
SetElementIC changes

Looks quite reasonable to me -- I don't know the jaeger jit that well, but nothing jumped out at me.
Attachment #489434 - Flags: review?(vladimir) → review+
Attached patch rolled up patch (obsolete) — Splinter Review
applies against tip + bug 609222
Attachment #487055 - Attachment is obsolete: true
Attachment #489402 - Attachment is obsolete: true
testSetTypedIntArray started failing after bug 592976 landed. It looks like the increased register pressure found an edge case where no single byte registers were available. This patch (rolled up, sorry, qfolded everything yesterday) changes StoreToTypedArray to account for that.
Attachment #489434 - Attachment is obsolete: true
Attachment #489590 - Attachment is obsolete: true
Attachment #489879 - Flags: review?(cdleary)
Attachment #489434 - Flags: review?(cdleary)
Blocks: 612019
Comment on attachment 489879 [details] [diff] [review]
rolled up patch v2

This patch kind of made my head explode, but I do have an understanding of all the cases that are laid out and their correctness WRT the tests. Only a few minor comments:

- The call ABI stuff is cool -- hand't looked at that before.
- We can probably use a store64 for the ImmDoubles in the masm interface.
- Would be nice to have a commend on addressOfExtra that it uses a constant displacement based on stackOffset and so all space must be already-allocated before calling it.
- Would be cool to assert TempRegs fits in that 16b volatile bitmask
- I think it's |temp regs - arg regs| instead of union (symmetric difference).

Nice patch -- leading the way for even *more* insane ICs!
Attachment #489879 - Flags: review?(cdleary) → review+
Blocks: 619761
Attached patch rebased (obsolete) — Splinter Review
Attachment #487066 - Attachment is obsolete: true
Attachment #489397 - Attachment is obsolete: true
Not that I can really follow what's going on, but was this intended:

    5.48 +        OP2_CVTSS2SD_VsdEd  = 0x5A,
    5.49 +        OP2_CVTSD2SS_VsdEd  = 0x5A,

(ie both enumerations the same value)?
It's okay. CVTSS2SD is F3 0F 5A  and CVTSD2SS is F2 0F 5A. In cvtss2sd_rr the prefix F3 is used and in cvtsd2ss_rr the prefix F2.
http://hg.mozilla.org/mozilla-central/rev/5bb0f4c62370
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
I'm getting a 5% perf boost in my code now. I'd declare this a success. :D
Depends on: 624483
Depends on: 625141
Blocks: 628170
No longer blocks: 628170
Depends on: 628170
Blocks: 652346
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: