984018 - sincos optimization

Reporter

Description

•

11 years ago

IonMonkey should recognize when both sin(x) and cos(x) are computed for the same x and form a merged sincos(x) operation which returns both results, which can be significantly faster.

sincos may be implemented with the support in bug 967709 when/if it lands, or alternatively it may be implemented in terms of the sincos function provided by the standard libraries on some platforms.

Opportunities for this optimization include numerous Box2d workloads, and it looks like it would also trigger in pdf.js, as well as math-partial-sums.js and 3d-cube.js in SunSpider.

An alternative approach would be to use the math function cache; computing a sin(x) could insert a value into the cache for both sin(x) and cos(x) so that a subsequent call to cos(x) would be fast. However, it is hoped that the cache will be eliminated some day, possibly through bug 967709, so it would be nice to perform this optimization without using it.
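To make the requested transformation concrete, here is a minimal sketch of the merge on a toy, straight-line instruction list. Everything in it (Op, Instr, mergeSinCos, the SinOf/CosOf projections) is hypothetical and invented for illustration; it is not IonMonkey's MIR, and it ignores control flow and dominance, which a real pass would have to handle.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical toy IR, not IonMonkey's MIR classes.
enum class Op { Param, Sin, Cos, SinCos, SinOf, CosOf };

struct Instr {
    Op op;
    int operand;  // index of the input instruction, or -1 for Param
};

// Rewrites Sin(x)/Cos(x) pairs that share an operand into a single SinCos(x)
// followed by cheap projections. `merged` receives the number of SinCos
// instructions created. Assumes a single straight-line block.
std::vector<Instr> mergeSinCos(const std::vector<Instr>& in, int& merged) {
    // Pass 1: per operand, note whether a Sin (bit 1) and a Cos (bit 2) exist.
    std::map<int, int> seen;
    for (const Instr& i : in) {
        if (i.op == Op::Sin) seen[i.operand] |= 1;
        if (i.op == Op::Cos) seen[i.operand] |= 2;
    }

    // Pass 2: emit the rewritten list, remapping operand indices as we go.
    std::vector<Instr> out;
    std::vector<int> remap(in.size(), -1);  // old index -> new index
    std::map<int, int> sincosFor;           // old operand -> new SinCos index
    merged = 0;
    for (std::size_t n = 0; n < in.size(); ++n) {
        const Instr& i = in[n];
        assert(i.operand < static_cast<int>(n));  // inputs are defined earlier
        int newOperand = i.operand < 0 ? -1 : remap[i.operand];
        bool trig = i.op == Op::Sin || i.op == Op::Cos;
        if (trig && seen[i.operand] == 3) {
            auto it = sincosFor.find(i.operand);
            if (it == sincosFor.end()) {
                // First use of this operand: compute both results once.
                out.push_back({Op::SinCos, newOperand});
                it = sincosFor.emplace(i.operand,
                                       static_cast<int>(out.size()) - 1).first;
                ++merged;
            }
            out.push_back({i.op == Op::Sin ? Op::SinOf : Op::CosOf, it->second});
        } else {
            out.push_back({i.op, newOperand});
        }
        remap[n] = static_cast<int>(out.size()) - 1;
    }
    return out;
}
```

For the input Param, Sin(0), Cos(0), this yields Param, SinCos(0), SinOf(1), CosOf(1), so only one expensive trigonometric call remains. An operand that feeds only a Sin or only a Cos is left untouched.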

Dan Gohman [:sunfish]

Reporter

Updated

•

11 years ago

Depends on: 967709

Dan Gohman [:sunfish]

Reporter

Comment 1

•

11 years ago

Note that asm.js does not currently use the math function cache at all, so the cache approach wouldn't help it unless we also enable the cache for it.

Nicolas B. Pierron [:nbp]

Comment 2

•

11 years ago

ZongShen, would you be interested in working on using sincos(tau, &s, &c) to compute both sin(tau) and cos(tau), and caching the results in the MathCache? That would be the first part; afterwards we can look at a second part that makes sure IonMonkey makes only one call to get both the sin and cos results instead of two. The reasoning is that computing sin and cos together is faster than computing them separately. Dan, would you be interested in mentoring?

Flags: needinfo?(sunfish)

Flags: needinfo?(andy.zsshen)

Dan Gohman [:sunfish]

Reporter

Comment 3

•

11 years ago

Yes, I can be a mentor here. I may be absent some days over the next few weeks, so I may not be able to respond quickly, but I will respond when I can. Box2d would be a good benchmark to guide this work. For the cache-based strategy, the other interesting thing to watch for is what the speed of sincos is for programs which only want one result, compared to calling just sin or just cos.
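One way to watch for the single-result cost mentioned above is a small timing harness like the sketch below. The names sincosPortable, sumSinOnly, sumSinViaSincos and timeNs are hypothetical, the input range is arbitrary, and a serious measurement would need warm-up, repetition, and realistic inputs such as those from Box2d.

```cpp
#include <cassert>
#include <chrono>
#include <cmath>

// Hypothetical wrapper: uses the platform sincos where one is known to exist,
// otherwise falls back to two separate calls.
inline void sincosPortable(double x, double* s, double* c) {
#if defined(__GLIBC__)
    sincos(x, s, c);
#elif defined(__APPLE__)
    __sincos(x, s, c);
#else
    *s = std::sin(x);
    *c = std::cos(x);
#endif
}

// Sums sin(x) over n inputs using plain sin.
double sumSinOnly(int n) {
    double acc = 0;
    for (int i = 0; i < n; ++i)
        acc += std::sin(i * 0.001);
    return acc;
}

// Same sum, but routed through sincos with the cosine thrown away. This is
// the case where a program only wants one of the two results.
double sumSinViaSincos(int n) {
    double acc = 0;
    for (int i = 0; i < n; ++i) {
        double s, c;
        sincosPortable(i * 0.001, &s, &c);
        acc += s;
    }
    return acc;
}

// Wall-clock time of one call to f(n), in nanoseconds. The computed sum is
// written to *result so the compiler cannot discard the work.
template <typename F>
long long timeNs(F f, int n, double* result) {
    assert(n > 0);
    auto t0 = std::chrono::steady_clock::now();
    *result = f(n);
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0).count();
}
```

Calling timeNs(sumSinOnly, n, &a) and timeNs(sumSinViaSincos, n, &b) with the same n gives two timings to compare; the ratio between them is the overhead a cache-based strategy would impose on code that never asks for the other result.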

Mentor: sunfish

No longer depends on: 967709

Flags: needinfo?(sunfish)

andy.zsshen

Updated

•

11 years ago

Flags: needinfo?(andy.zsshen)

andy.zsshen

Comment 4

•

11 years ago

Yes, I can try to implement this hybrid function.

Dan Gohman [:sunfish]

Reporter

Comment 5

•

11 years ago

I would prefer the initial version of this feature to use the sincos functionality provided by the standard libraries on some platforms, rather than implementing it manually. That way, we can focus on how this feature fits into the JIT without worrying about floating-point math implementation at the same time.

sincos is not a standard C/C++ feature, but it is available on some platforms, for example:

https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man3/__sincos.3.html

http://www.gnu.org/software/libc/manual/html_node/Trig-Functions.html

I had previously thought it was available on Windows, but further investigation suggests that it's only available to C++ AMP and other specialized environments. Of course, I'd be happy to be corrected here.

andy.zsshen

Comment 6

•

11 years ago

OK, so instead of mathematics implementation, the initial version would be a cross-platform wrapper which leverages the exported libraries from the supported platforms. Just a pseudo code, it might be:

#ifdef _WIN32
#include <amp.math.h>
#endif

void
math_sincos(double in, double *p_sin, double *p_cos)
{
#if defined(__linux__)
    sincos(in, p_sin, p_cos);
#elif defined(__APPLE__)
    __sincos(in, p_sin, p_cos);
#elif defined(_WIN32)
    sincos(in, p_sin, p_cos);
#endif
}

Is that correct?

Emanuel Hoogeveen [:ehoogeveen]

Comment 7

•

11 years ago

With a fallback that does |*p_sin = sin(in); *p_cos = cos(in);|, I'm guessing, so we can play around with using it whenever we know we need both. I'm mostly replying to point out the bugs where we're investigating our own implementation: bug 967709 and bug 996375. I don't know if those will ever land, though I've been working on generating optimal sets of coefficients so we can make an informed decision. The work in this bug could help land our own implementation, since it would let us call the function only once when we know we need both.

Dan Gohman [:sunfish]

Reporter

Comment 8

•

11 years ago

(In reply to andy.zsshen from comment #6)
> OK, so instead of mathematics implementation, the initial version would be a
> cross-platform wrapper which
> leverages the exported libraries from the supported platforms. Just a pseudo
> code, it might be:
>
> #ifdef _WIN32
> #include <amp.math.h>
> #endif

I'm not familiar with AMP; are there any negative consequences for using it in regular C++ code?

> void
> math_sincos(double in, double *p_sin, double *p_cos)
> {
> #if defined(__linux__)

This should be __GLIBC__ rather than __linux__.

>     sincos(in, p_sin, p_cos);
> #elif defined(__APPLE__)
>     __sincos(in, p_sin, p_cos);
> #elif defined(_WIN32)
>     sincos(in, p_sin, p_cos);

And add a fallback as mentioned in comment 7.

> #endif
> }
>
> Is that correct?

Yes, with comments addressed.

We'll likely revisit how we actually want to compute sin and cos later. For now this is good, and will let us proceed implementing the rest of the JIT side. Once we finish the JIT work, it'll be easier to evaluate options for sin and cos computation under real-world conditions.

Jeff Walden [:Waldo]

Comment 9

•

11 years ago

I mentioned this IRL a couple of weeks ago, but it would be really really sweet, once SIMD is working and all, to use SIMD ops to implement a self-hosted MathSinCos method (maybe as asm.js to start, if necessary), then have math_sin and math_cos both be self-hosted methods that use that. Fast, no need to worry about platform differences, as quick as computing sin or cos alone, etc. That's probably not something to do now, though, given the current state of our SIMD implementation. But sometime eventually, maybe. Just noting here to have in the record *somewhere*.

andy.zsshen

Comment 10

•

11 years ago

I just ignored the AMP library and drafted an uncached prototype.

void
js::math_sincos_uncached(double x, double *p_sin, double *p_cos)
{
#if defined(__GLIBC__)
    sincos(x, p_sin, p_cos);
#elif defined(__APPLE__)
    __sincos(x, p_sin, p_cos);
#else
    *p_sin = js::math_sin_uncached(x);
    *p_cos = js::math_cos_uncached(x);
#endif
}

And I am thinking how to cache the result of such hybrid functions in MathCache.
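As one possible shape for the caching step, the sketch below fills both the sin and the cos entry on any miss, so the second lookup for the same input is a hit. TrigCache and its members are hypothetical and are not SpiderMonkey's MathCache API; the table size and hash are arbitrary, and std::sin/std::cos stand in for a call to the uncached sincos wrapper above.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstring>

enum class Fn : std::uint8_t { Empty, Sin, Cos };

// Hypothetical direct-mapped cache keyed by (input bits, function).
class TrigCache {
    struct Entry {
        std::uint64_t inBits = 0;
        Fn fn = Fn::Empty;
        double out = 0;
    };
    static constexpr std::size_t Size = 256;  // power of two
    Entry table_[Size];
    unsigned misses_ = 0;

    // Compare inputs by bit pattern so that -0.0 and +0.0 stay distinct.
    static std::uint64_t toBits(double x) {
        std::uint64_t bits;
        std::memcpy(&bits, &x, sizeof bits);
        return bits;
    }

    // Sin and Cos of the same input land in different slots.
    static std::size_t slot(std::uint64_t bits, Fn fn) {
        std::size_t h = static_cast<std::size_t>(bits ^ (bits >> 32) ^ (bits >> 16));
        return (h + static_cast<std::size_t>(fn)) & (Size - 1);
    }

    void store(std::uint64_t bits, Fn fn, double out) {
        Entry& e = table_[slot(bits, fn)];
        e.inBits = bits;
        e.fn = fn;
        e.out = out;
    }

    double lookup(double x, Fn fn) {
        assert(fn != Fn::Empty);
        std::uint64_t bits = toBits(x);
        const Entry& e = table_[slot(bits, fn)];
        if (e.fn == fn && e.inBits == bits)
            return e.out;  // hit
        // Miss: compute both results and cache both. A real version would
        // call the uncached sincos wrapper here instead of sin and cos.
        ++misses_;
        double s = std::sin(x);
        double c = std::cos(x);
        store(bits, Fn::Sin, s);
        store(bits, Fn::Cos, c);
        return fn == Fn::Sin ? s : c;
    }

public:
    double sin(double x) { return lookup(x, Fn::Sin); }
    double cos(double x) { return lookup(x, Fn::Cos); }
    unsigned misses() const { return misses_; }
};
```

With this layout, calling sin(0.5) and then cos(0.5) on a fresh cache records a single miss. The trade-off raised earlier in the thread still applies: every miss pays for both results, and each miss may evict two unrelated entries instead of one.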

WIP - Sincos; 10 years ago; Victor Carlquist; 9.95 KB, patch; sunfish: feedback+
WIP - Sincos analysis.; 10 years ago; Victor Carlquist; 13.63 KB, patch
WIP - Sincos analysis.; 10 years ago; Victor Carlquist; 13.63 KB, patch
WIP - Sincos; 10 years ago; Victor Carlquist; 13.89 KB, patch
WIP - Sincos; 10 years ago; Victor Carlquist; 14.63 KB, patch
WIP - Sincos; 10 years ago; Victor Carlquist; 12.92 KB, patch
Sunspider; 10 years ago; Victor Carlquist; 7.53 KB, text/plain
Patch; 10 years ago; Victor Carlquist; 20.71 KB, patch
Patch; 10 years ago; Victor Carlquist; 21.63 KB, patch
Patch (WIP); 10 years ago; Victor Carlquist; 22.81 KB, patch
Patch - MIRTYPE_SinCosDouble; 10 years ago; Victor Carlquist; 33.62 KB, patch; nbp: feedback+
patch - WIP; 10 years ago; Victor Carlquist; 36.95 KB, patch; nbp: feedback+
func06-pass00-Allocate Registers [Backtracking]-lir.gv.png; 10 years ago; Victor Carlquist; 180.05 KB, image/png
WIP - Patch; 10 years ago; Victor Carlquist; 38.43 KB, patch
func06-pass00-Generate LIR-lir.gv.png; 10 years ago; Victor Carlquist; 177.58 KB, image/png
regalloc.txt; 10 years ago; Victor Carlquist; 9.89 KB, text/plain
Patch; 10 years ago; Victor Carlquist; 42.96 KB, patch; nbp: feedback+
Bench - Mac; 10 years ago; Victor Carlquist; 8.69 KB, text/plain
Patch; 10 years ago; Victor Carlquist; 40.81 KB, patch; nbp: feedback+
Path Sincos; 10 years ago; Victor Carlquist; 45.85 KB, patch; nbp: review+
Path sincos; 10 years ago; Victor Carlquist; 44.91 KB, patch; victorcarlquist: review+
Part 0 - ABI Signatures.; 10 years ago; Victor Carlquist; 10.54 KB, patch; nbp: feedback+
Part 0 - ABI Signatures.; 10 years ago; Victor Carlquist; 10.81 KB, patch; nbp: feedback+
Part 0 - ABI Signatures.; 10 years ago; Victor Carlquist; 10.18 KB, patch; nbp: review+
Part 0 - ABI Signatures.; 10 years ago; Victor Carlquist; 10.92 KB, patch; victorcarlquist: review+
Part 1 - Sincos.; 10 years ago; Victor Carlquist; 45.65 KB, patch; nbp: review+
Part 2 - Fixed build on Arm64; 10 years ago; Victor Carlquist; 1.57 KB, patch
Part 2 - Fixed build on Arm64; 10 years ago; Victor Carlquist; 2.20 KB, patch; nbp: review+
Part 2 - Fixed build on Arm64.; 10 years ago; Victor Carlquist; 2.20 KB, patch; victorcarlquist: review+
Part 2 - Fixed build on Arm64.; 10 years ago; Victor Carlquist; 2.26 KB, patch; nbp: review+
Part 2 - Fixed build on Arm64; 10 years ago; Victor Carlquist; 2.34 KB, patch
Part 2 - Fixed build on Arm64; 10 years ago; Victor Carlquist; 2.37 KB, patch
Part 2 - Fixed build on Arm64; 10 years ago; Victor Carlquist; 2.36 KB, patch; nbp: review+
Part 2 - Fixed build on Arm64; 10 years ago; Victor Carlquist; 2.26 KB, patch
Part 2 - Fixed build on Arm64.; 10 years ago; Victor Carlquist; 2.52 KB, patch; nbp: review+, lizzard: approval-mozilla-aurora+