Closed Bug 714369 Opened 14 years ago Closed 6 years ago

Removing bitshifts makes code slower

Categories

(Core :: JavaScript Engine, defect)
Type: defect
Priority: Not set
Severity: normal

Tracking

Status: RESOLVED WORKSFORME

People

(Reporter: azakai, Unassigned)

Attachments

(4 files)

Attached is a benchmark in two versions. Version A is the normal version; Version B has an optimization applied: expressions like H[x>>2] + H[(x+4)>>2] + ... are rewritten as var x$ = x>>2; H[x$] + H[x$+1] + ... That is, instead of shifting repeatedly, we create another variable, shift it once, and index off it. A diff between the two files is also attached.

Numbers:

js -m -n on A: 1.604
js -m -n on B: 1.766
node on A: 2.345
node on B: 2.197

In node, the result is as expected: far fewer shifts are done, and B is indeed a little faster. In SpiderMonkey, however, the result is the opposite of what I would expect.
Attached file Version A
Attached file Version B
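For illustration, a minimal sketch of the two patterns being compared (hypothetical function and array names; the real code is in the attached files):

  // H stands in for the heap view used by the generated code.
  var H = new Int32Array(1024);

  // Version A pattern: repeat the shift for every access.
  function sumA(x) {
    return H[x >> 2] + H[(x + 4) >> 2] + H[(x + 8) >> 2];
  }

  // Version B pattern: shift once into a temporary, then index with small offsets.
  function sumB(x) {
    var x$ = x >> 2;
    return H[x$] + H[x$ + 1] + H[x$ + 2];
  }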
x$+1 might overflow to double, in general, while (x+4)>>2 is guaranteed to be an integer, right? So the former array access code can probably be type-specialized better.
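Roughly the distinction being drawn, with illustrative values not taken from the benchmark: the >> operator converts its left operand with ToInt32 and always produces an int32, while a bare addition has no such guarantee.

  var big = 2147483647;           // 2^31 - 1
  var shifted = (big + 4) >> 2;   // always an int32, even though the intermediate sum is not
  // The sum below no longer fits in int32, so an engine must be prepared
  // for a double result unless it can prove otherwise.
  var sum = big + 1;              // 2147483648, representable only as a double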
True, but this is still puzzling to me. It seems like (x + 4) >> 2 needs to add (fast, since I assume TI knows these are int32s), check for overflow into a double, convert back to an int32 if it overflowed, then shift. Whereas in the shorter code, x$ + 1 just needs to add and then check for overflow into a double. So far the shorter form looks like it should be faster.

Then, regarding the typed array access: if you know the value is already an int32, you can just use it, whereas in the shorter case you in theory need to handle possible double values. However, I am not sure they need to be handled much: typed arrays can't be big enough for indexes larger than 32 bits, and reads or writes at indexes beyond the length have no effect (reads return undefined). (I don't see this in the spec, but I see this behavior in the implementation.) So it seems like all that is needed is an overflow check, and if it overflows, do nothing/return undefined, which still seems to imply that the shorter code should be faster, I think.
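The out-of-range behavior referred to above, sketched with illustrative values:

  var H = new Int32Array(4);   // valid indexes are 0..3
  H[10];                       // undefined: out-of-bounds reads return undefined
  H[10] = 5;                   // out-of-bounds writes are silently ignored
  H[10];                       // still undefined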
Do you have a version of the benchmark that works in the browser? I get this error when trying to run it:

Timestamp: 12/31/11 5:01:11 PM
Error: uncaught exception: [Exception... "Illegal operation on WrappedNative prototype object" nsresult: "0x8057000c (NS_ERROR_XPC_BAD_OP_ON_WN_PROTO)" location: "JS frame :: file:///Users/brianhackett/bench/test3.js :: <TOP_LEVEL> :: line 899" data: no]
Sure, here's an HTML file that will work for both versions, just change "a.js" inside it.
> needs to add (fast, since I assume TI knows these are int32s), check for overflowing
> into a double, convert into an int32 if it overflowed, then shift it.

In theory, just adding as int32s (with int32 overflow rules) would work, as long as you know that the result is going to have ToInt32 called on it, no? I could be wrong, of course; comment 3 was just my guess as to what we end up doing here.
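A sketch of why int32 addition with wraparound is enough when the sum feeds into >>, using illustrative values: >> converts its left operand with ToInt32, so wrapping the sum to int32 first gives the same result as adding as doubles.

  var x = 2147483647;              // 2^31 - 1
  var a = (x + 4) >> 2;            // sum computed as a double, wrapped by ToInt32, then shifted
  var b = ((x + 4) | 0) >> 2;      // sum wrapped to int32 explicitly, then shifted
  // a === b (both -536870912)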
Assignee: general → nobody

Version B (~700ms) is now faster than Version A (~755ms) for me, therefore closing as WFM.

Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WORKSFORME