When doing an ADD under heavy register pressure, JM reuses an operand register for the result. To ensure that the operand can be rematerialized in the slow path if the ADD overflows, JM ensures that the operand is synced, issuing a store before the operation if necessary. In the JM branch, this happens three times in the am3 loop on v8-crypto. These stores are unnecessary, since even if the ADD overflows, the original operands can still be recovered:
add x y                  ; x = x + y (may overflow)
> neg x                  ; slow path: x = -(x + y)
> add x y                ; x = -x
> neg x                  ; x = x, original operand recovered
> sync x, y, call stub
This works even if the negs overflow (x == 0x80000000), since the identities hold exactly under two's-complement wrapping.
This also applies to INC/DEC operations, which currently use an extra register and issue an extra move to hold the old value in case of overflow. Not such a big deal for am3; maybe wait for INC/DEC ops to be coalesced into ADDs (bug 610618?). This should also be doable for SUB, but that case is more complicated (SUB is not commutative) and doesn't bite on am3.
Created attachment 493743 [details] [diff] [review]
JM patch for ADD
Applied this patch to JM, but it could apply to TM too. This only handles ADD operations.
(In reply to comment #1)
> Created attachment 493743 [details] [diff] [review]
> JM patch for ADD
> Applied this patch to JM, but could apply to TM too. This only handles ADD
Does the patch speed up v8-crypto or some other benchmark of interest?
With type inference this gets me a couple ms on v8-crypto (will apply to TM to measure without inference). This plus a fix for types in arginc (http://hg.mozilla.org/projects/jaegermonkey/rev/30ffdc01adf2) removed 9 memory accesses from the loop in am3, and cut 6ms from my time on v8-crypto (62ms -> 56ms).
This loop has very high register pressure, so it is a good regalloc test case. It also has a pretty long linear body, so the current regalloc does a good job with it. After the above two patches, there are still 33 memory accesses in this loop, 13 of which are necessary for getelem/setelems. Of the remaining 20, 7 can be removed with a linear scan regalloc (WIP), and several more by using ebp as a GPR.