Closed
Bug 714369
Opened 14 years ago
Closed 6 years ago
Removing bitshifts makes code slower
Categories
(Core :: JavaScript Engine, defect)
Core
JavaScript Engine
Tracking
()
RESOLVED
WORKSFORME
People
(Reporter: azakai, Unassigned)
References
Details
Attachments
(4 files)
Attached is a benchmark, two versions. Version A is the normal version, version B has an optimization applied, that things like
H[x>>2] + H[(x+4)>>2] + ..
are turned into
var x$ = x>>2;
H[x$] + H[x$+1] + ..
That is, instead of multiple shifts, we create another variable, shift it, and use it. A diff between the two files is also attached.
Numbers:
js -m -n on A 1.604
js -m -n on B 1.766
node on A 2.345
node on B 2.197
In node, the result is as expected: There are far fewer shifts done, and B is indeed a little faster. In SpiderMonkey however, the result is the opposite of what I would expect.
Reporter | ||
Comment 1•14 years ago
|
||
Reporter | ||
Comment 2•14 years ago
|
||
![]() |
||
Comment 3•14 years ago
|
||
x$+1 might overflow to double, in general, while (x+4)>>2 is guaranteed to be an integer, right? So the former array access code can probably be type-specialized better....
Reporter | ||
Comment 4•14 years ago
|
||
True, but this is still puzzling to me. It seems like
(x + 4) >> 2
needs to add (fast, since I assume TI knows these are int32s), check for overflowing into a double, convert into an int32 if it overflowed, then shift it. Whereas in the shorter code
x$ + 1
we just needs to add then check for overflowing into a double. So far it looks like the shorter form is faster. Then, regarding the typed array access, if you know the value is already an int32, you can just use it, whereas in the shorter case you in theory need to handle the case of possible double values.
However, I am not sure they need to be handled much: typed arrays can't be big enough that indexes are larger than 32 bits, and reading or writing of larger indexes than the size have no effect (reads return undefined). (I don't see these in the spec, but I see this behavior in the implementation.) So it seems like all that is needed is an overflow check is needed, and if so then do nothing/return undefined, which still seems to imply that the shorter code should be faster I think.
Comment 5•14 years ago
|
||
Do you have a version of the benchmark that works in the browser? I get this error when trying to run it:
Timestamp: 12/31/11 5:01:11 PM
Error: uncaught exception: [Exception... "Illegal operation on WrappedNative prototype object" nsresult: "0x8057000c (NS_ERROR_XPC_BAD_OP_ON_WN_PROTO)" location: "JS frame :: file:///Users/brianhackett/bench/test3.js :: <TOP_LEVEL> :: line 899" data: no]
Reporter | ||
Comment 6•14 years ago
|
||
Sure, here's an HTML file that will work for both versions, just change "a.js" inside it.
![]() |
||
Comment 7•14 years ago
|
||
> needs to add (fast, since I assume TI knows these are int32s), check for overflowing
> into a double, convert into an int32 if it overflowed, then shift it.
In theory, just adding as int32s (with int32 overflow rules) would work, as long as you know that the result it going to have ToInt32 called on it, no?
I could be wrong, of course; comment 3 was just my guess as to what we end up doing here.
Assignee | ||
Updated•11 years ago
|
Assignee: general → nobody
Comment 8•6 years ago
|
||
Version B (~700ms) is now faster than Version A (~755ms) for me, therefore closing as WFM.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WORKSFORME
You need to log in
before you can comment on or make changes to this bug.
Description
•