Bug 584223
Opened 14 years ago
Closed 14 years ago
Performance optimizations for sqrts via Math.pow
Categories (Core :: JavaScript Engine, enhancement)
RESOLVED DUPLICATE of bug 564548
People (Reporter: billm, Assigned: billm)
Attachments (1 file, 1 obsolete file)
1.53 KB, patch | n.nethercote: review+
This makes a small change to the Math.pow runtime code to check if the exponent is 0.5 or -0.5. In such cases, it calls sqrt (or 1.0/sqrt) instead of pow. This speeds up Sunspider's partial-sums on my laptop by about 10%.
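For illustration, the special case described above might look like the following C++ sketch. This is not the attached patch: the function name `pow_with_sqrt_fast_path` is hypothetical, and the guard on finite, nonzero bases is an assumption added here because `sqrt` and `pow` disagree when the base is -0.0 or -Infinity.

```cpp
#include <cmath>

// Hypothetical helper illustrating the idea; not the code from the patch.
double pow_with_sqrt_fast_path(double x, double y) {
    // Assumed guard: pow(-0.0, 0.5) is +0.0 but sqrt(-0.0) is -0.0, and
    // pow(-inf, 0.5) is +inf but sqrt(-inf) is NaN, so only finite,
    // nonzero bases take the fast path.
    if (std::isfinite(x) && x != 0.0) {
        if (y == 0.5)
            return std::sqrt(x);        // x^0.5 == sqrt(x)
        if (y == -0.5)
            return 1.0 / std::sqrt(x);  // x^-0.5 == 1/sqrt(x)
    }
    // Every other exponent (and the guarded bases) uses the general routine.
    return std::pow(x, y);
}
```

Because `1.0 / sqrt(x)` rounds twice, its result may differ from `pow(x, -0.5)` in the last bit for some inputs.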
Updated•14 years ago
Blocks: JaegerSpeed
Comment 1•14 years ago (Assignee)
Here's the actual patch.
Comment 2•14 years ago
Could you please also add this to the traceable native? (same file, search for pow)
Comment 3•14 years ago (Assignee)
Did as Andreas suggested. Speedup with the tracer running is ~20% on partial-sums.
Attachment #462552 - Attachment is obsolete: true
Updated•14 years ago
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → DUPLICATE
Comment 6•14 years ago
SQRTPD xmm1, xmm2/m128
We should use that for sqrt and also for this pow special case.
Comment 7•14 years ago
Rockin!
/be
Comment 8•14 years ago
SQRTPD is 2x faster than FSQRT, which libc seems to use. Also, we can special case on trace for the -0.5 there.
Comment 9•14 years ago (Assignee)
(In reply to comment #8)
> SQRTPD is 2x faster than FSQRT, which libc seems to use. Also, we can special
> case on trace for the -0.5 there.
I did a few benchmarks and I don't think it's worth the trouble.
1. I compared the performance of SQRTSD (the scalar version of SQRTPD) to FSQRT in the following loop: for (i=1..100000) { x += 1.0/sqrt(i); }. The SQRTSD version was 20% faster.
2. I translated the partial-sums benchmark to C and compiled it with gcc, comparing an SSE2 version to an x87 version. The x87 version was actually a little faster, although I don't know why.
Take this all with a grain of salt since my laptop has a pretty lame FPU. But then, lots of people have laptops.
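The loop from the first benchmark can be written out as a small C++ sketch. The function name `sum_inverse_sqrt` and its signature are hypothetical; the original harness and timing code are not shown in this bug.

```cpp
#include <cmath>

// Hypothetical reconstruction of the benchmark loop:
// for (i = 1..n) { x += 1.0 / sqrt(i); }
double sum_inverse_sqrt(int n) {
    double x = 0.0;
    for (int i = 1; i <= n; ++i)
        x += 1.0 / std::sqrt(static_cast<double>(i));
    return x;
}
```

With GCC on x86, building this once with `-mfpmath=sse -msse2` and once with `-mfpmath=387` is one way to compare the SSE2 and x87 code paths. Whether the measurements above used these exact flags is not stated.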
Comment 10•14 years ago
Comment on attachment 462559: Patch for traceable native as well
This is a small but clear win. Let's not get bogged down in the asm; I suggest filing a follow-up bug for that.
Any objections to my r+?
Attachment #462559 - Flags: review+
Comment 11•14 years ago
r=me
Updated•14 years ago
Assignee: general → wmccloskey
Comment 12•14 years ago
Does this just need to get landed at this point? Did it fall through the cracks?
Comment 13•14 years ago (Assignee)
Sorry, this change got folded in with bug 564548, which optimized pow in a different way.
Status: REOPENED → RESOLVED
Closed: 14 years ago → 14 years ago
Resolution: --- → FIXED
Updated•14 years ago
Resolution: FIXED → DUPLICATE