Closed
Bug 673314
Opened 13 years ago
Closed 13 years ago
Multiply instructions are not generated on Arm
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
RESOLVED
FIXED
mozilla8
People
(Reporter: mjrosenb, Assigned: mjrosenb)
References
(Blocks 1 open bug)
Details
(Keywords: mobile, perf, Whiteboard: [inbound])
Attachments
(2 files, 1 obsolete file)
544 bytes,
application/javascript
|
Details | |
2.79 KB,
patch
|
adrake
:
review+
|
Details | Diff | Splinter Review |
While looking into why this test case is 60 times slower on arm than on x86, I discovered that the jitted code does not contain any floating point multiply instructions (I suspect it would not contain integer multiply either).
If you want to run this test case, you probably want to cut down on the number of iterations that it performs, since currently, depending on the flags that are passed to the shell, it takes between 6 and 30 seconds on x86.
There is a comment in the relevant function (mjit::Compiler::jsop_binary), which states:
/* ARM cannot detect integer overflow with multiplication. */
We should a) implement this for fp only multiplies, and b) attempt to handle this for integers as well.
my proposal is (approximately) for
c <- a * b
we do
SMULL tmp,c, a, b
EORS tmp, tmp, c ASR 31
For those people that don't know ARM assembly, SMULL is a 32*32 -> 64 bit multiply. In this case, it stores the lower 32 bits into the register "c", and the upper 32 bits into the register "tmp". For multiplies that don't overflow, tmp should just be a sign extension of c. The second instruction shifts c right by 31, so it will be the sign bit of c repeated 32 times, exactly what will be in tmp if the multiply did not overflow. The exclusive or will then leave tmp 0 if there was no overflow, and non-zero in all other cases.
EORS can also be replaced with TEQ, which sets the condition codes as if an exclusive or had been performed, but not actually storing the result anywhere.
Comment 1•13 years ago
|
||
Could be a duplicate of bug 586267, though that one doesn't cover floating-point.
Assignee | ||
Comment 2•13 years ago
|
||
It looks like all of the code to generate the SMULL was already there, but was just not turned on. We pass the basic tests with this patch applied, and the original test drops from 60x slower to 45x slower. Not ideal, but it is a start.
Assignee | ||
Comment 3•13 years ago
|
||
Since I don't have commit access yet, if you could push this to the try-server, and then to a repo, that would be incredibly useful
Attachment #548313 -
Attachment is obsolete: true
Attachment #549003 -
Flags: review?(adrake)
Comment 4•13 years ago
|
||
Comment on attachment 549003 [details] [diff] [review]
Slightly prettier patch (does not have #if foo || 1)
Review of attachment 549003 [details] [diff] [review]:
-----------------------------------------------------------------
Looks good! One tiny style nit:
::: js/src/methodjit/FastArithmetic.cpp
@@ +246,5 @@
> * This is temporary while ops are still being implemented.
> */
> if ((op == JSOP_MOD) ||
> (lhs->isTypeKnown() && (lhs->getKnownType() > JSVAL_UPPER_INCL_TYPE_OF_NUMBER_SET)) ||
> + (rhs->isTypeKnown() && (rhs->getKnownType() > JSVAL_UPPER_INCL_TYPE_OF_NUMBER_SET))) {
Nit: Curly braces on multi-line ifs should get their own line.
Attachment #549003 -
Flags: review?(adrake) → review+
Comment 5•13 years ago
|
||
To save time, I corrected that nit and pushed to try. I'll push to inbound as soon as the tryserver looks good.
Comment 6•13 years ago
|
||
Try run for 3b2166d7015e is complete.
Detailed breakdown of the results available here:
http://tbpl.mozilla.org/?tree=Try&rev=3b2166d7015e
Results:
success: 12
Total buildrequests: 12
Builds available at http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/drakedevel@gmail.com-3b2166d7015e
Comment 7•13 years ago
|
||
Whiteboard: [inbound]
Updated•13 years ago
|
Comment 8•13 years ago
|
||
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla8
Comment 9•13 years ago
|
||
Did this have any effect on benchmarks?
Comment 10•13 years ago
|
||
(In reply to comment #9)
> Did this have any effect on benchmarks?
Yes! We saw a 3.36% improvement in Talos Sunspider NoChrome on N900 when this landed.
Updated•13 years ago
|
Assignee: general → mrosenberg
You need to log in
before you can comment on or make changes to this bug.
Description
•