Closed
Bug 997459
Opened 6 years ago
Closed 6 years ago
Try out different sets of coefficients for new sincos implementation to optimize accuracy and performance
Categories
(Core :: JavaScript Engine: JIT, defect)
Core
JavaScript Engine: JIT
Not set
Tracking
()
RESOLVED
FIXED
mozilla31
People
(Reporter: ehoogeveen, Assigned: ehoogeveen)
References
(Depends on 1 open bug)
Details
Attachments
(2 files, 1 obsolete file)
1.59 KB,
patch

sunfish
:
review+

Details  Diff  Splinter Review 
1.53 KB,
patch

sunfish
:
review+

Details  Diff  Splinter Review 
As requested, here's a fairly rough set of coefficients for approximating cosine (I included the 1.0 at the end for completeness): 2.06467337476762997948e9 2.75555495413759160741e7 2.48015808595638122085e5 1.38888888779622760722e3 4.16666666665987187046e2 4.99999999999999888978e1 1.0 These give a worst error of 7.87e17 in arbitrary precision, less than 1.11e16, the epsilon of the output range for the approximation, so with any luck the maximum error will still be 1 bit. It's still big enough that the errors may add up to 2 bits though (I haven't got the program from bug 967709 comment #6 set up yet, so I haven't been able to check). The optimization goal for these coefficients was to minimize the maximum error (in arbitrary precision), so if these give a maximum error of 1 bit in practice, optimizing for the mean error might give a somewhat better result  but if these give a maximum error of 2 bits, this is probably the best we're going to get with this number of coefficients. I also have a slightly better set of coefficients for sine, but I'll post those later.
Assignee  
Comment 1•6 years ago


Actually, here's the new set of coefficients for sine that I was working on: 1.59046813973877163292e10 // 6152825598094877 / exp2(85) 2.50509001624159785668e08 // 7571170002733246 / exp2(78) 2.75573146431678644161e06 // 6506786951439440 / exp2(71) 1.98412698327005105692e04 // 7320136534024805 / exp2(65) 8.33333333332626768897e03 // 4803839602524456 / exp2(59) 1.66666666666666490881e01 // 6004799503160655 / exp2(55) In arbitrary precision, these have a 10.57% smaller mean error than the current set (the difference will be smaller in double precision), at the cost of a somewhat larger maximum error (the maximum error is still so low that the maximum difference in double precision should be 1 bit).
Assignee  
Comment 2•6 years ago


(In reply to Emanuel Hoogeveen [:ehoogeveen] from comment #1) > (the difference will be smaller in double precision) These coefficients appear to have about an 8.21% smaller mean error in double precision  better than I expected! That's using Mathematica to do my own testing, so probably comparing against a different version of sin() as a baseline (is standard library sin() guaranteed to be accurate to the last bit?).
Assignee  
Comment 3•6 years ago


These coefficients should simply be better, but it would be good to confirm that the maximum error is still 1 epsilon with these. It might be worth noting that the errors peak closer to Pi/4 with this set, but the peaks are narrower (hence the reduction in mean error). It would also be possible to generate sets that 1) optimize solely to minimize the maximum error or 2) optimize for the geometric mean of the maximum error and mean error. These would give lower, but broader peaks in arbitrary precision. But given that simply squaring the input in double precision introduces errors of the same magnitude, which are then shifted around in applying the approximation, I don't think it would help make the errors significantly more predictably spaced.
Assignee: nobody → emanuel.hoogeveen
Status: NEW → ASSIGNED
Attachment #8408199 
Flags: review?(sunfish)
Assignee  
Comment 4•6 years ago


This applies the same coefficients as in comment #0 and applies on top of part 1. I don't know if these coefficients are good enough to use, so asking for feedback instead of review. If we decide that these are good enough to land, I'll work on tweaking them a bit more later.
Attachment #8408201 
Flags: feedback?(sunfish)
Comment 5•6 years ago


Comment on attachment 8408199 [details] [diff] [review] Part 1: Coefficients for polevl_sin with lower mean error. Review of attachment 8408199 [details] [diff] [review]:  My tests also show a maximum error still at one bit in the [0,pi/4] range, and the average error is reduced.
Attachment #8408199 
Flags: review?(sunfish) → review+
Comment 6•6 years ago


Comment on attachment 8408201 [details] [diff] [review] Part 2: Coefficients for polevl_cos with fewer terms; potentially less precise, but more performant. Review of attachment 8408201 [details] [diff] [review]:  This looks good. I see a maximum error of 2 bits on most ranges. When I translate this into the SIMD version, it shortens the cos lane to be the same length as the sin lane. ::: js/src/jsmath.cpp @@ +384,5 @@ > > static double polevl_cos(double zz) > { > + // Constants generated using Mathematica's GeneralMiniMaxApproximation > + double ans = 2.06467337476762997948e9; It'd be good to add a comment here explaining that this cos polynomial is one term shorter than is common, with a tradeoff in precision for performance.
Attachment #8408201 
Flags: feedback?(sunfish) → feedback+
Assignee  
Comment 7•6 years ago


(In reply to Dan Gohman [:sunfish] from comment #6) > This looks good. I see a maximum error of 2 bits on most ranges. When I > translate this into the SIMD version, it shortens the cos lane to be the > same length as the sin lane. Yes, I see the same maximum error in my own testing. By the way, I could easily generate a series for sine using one *more* coefficient if we want that final bit of precision for both of them (of course you could just set the final coefficient to 0.0 for testing). How much worse is performance using another coefficient?
Comment 8•6 years ago


(In reply to Emanuel Hoogeveen [:ehoogeveen] from comment #7) > (In reply to Dan Gohman [:sunfish] from comment #6) > > This looks good. I see a maximum error of 2 bits on most ranges. When I > > translate this into the SIMD version, it shortens the cos lane to be the > > same length as the sin lane. > > Yes, I see the same maximum error in my own testing. By the way, I could > easily generate a series for sine using one *more* coefficient if we want > that final bit of precision for both of them (of course you could just set > the final coefficient to 0.0 for testing). How much worse is performance > using another coefficient? Doing one fewer multiply and add is about a 10% difference in performance. With my current projections, this speedup is important if we're going to be performancecompetitive using the current algorithm.
Assignee  
Comment 9•6 years ago


Okay, that's a fair bit. I guess if there comes a point when browser vendors agree that high precision implementations of sin and cos are important, it's easy to add back another coefficient. Further refining the current coefficients to reduce the maximum error (unlikely) or mean error (likely) will take time, so I'll post an updated patch for review.
Assignee  
Comment 10•6 years ago


Comment on attachment 8408199 [details] [diff] [review] Part 1: Coefficients for polevl_sin with lower mean error. ># HG changeset patch ># Parent c55dfb01a02757b15d5c5d1f2bfec4310d0232fc ># User Emanuel Hoogeveen <emanuel.hoogeveen@gmail.com> >Bug 997459  Part 1: Coefficients for polevl_sin with lower mean error. > >diff git a/js/src/jsmath.cpp b/js/src/jsmath.cpp > a/js/src/jsmath.cpp >+++ b/js/src/jsmath.cpp >@@ 360,28 +360,28 @@ js::math_clz32(JSContext *cx, unsigned a > * [3] http://netlib.org/cephes/ > * [4] https://svnweb.cern.ch/trac/vdt > * [5] http://www.ecmainternational.org/ecma262/5.1/#sec15.8.2 > * [6] http://netlib.org/fdlibm > */ > > static double polevl_sin(double z, double zz) > { > // Constants taken from fdlibm k_sin.c > double ans = 1.58969099521155010221e10; >+ // Constants generated using Mathematica's GeneralMiniMaxApproximation. >+ double ans = 1.59046813973877163292e10; // 6152825598094877 / exp2(85) > ans *= zz; > ans += 2.50507602534068634195e08; >+ ans += 2.50509001624159785668e08; // 7571170002733246 / exp2(78) > ans *= zz; > ans += 2.75573137070700676789e06; >+ ans += 2.75573146431678644161e06; // 6506786951439440 / exp2(71) > ans *= zz; > ans += 1.98412698298579493134e04; >+ ans += 1.98412698327005105692e04; // 7320136534024805 / exp2(65) > ans *= zz; > ans += 8.33333333332248946124e03; >+ ans += 8.33333333332626768897e03; // 4803839602524456 / exp2(59) > ans *= zz; > ans += 1.66666666666666324348e01; >+ ans += 1.66666666666666490881e01; // 6004799503160655 / exp2(55) > ans *= zz * z; > ans += z; > return ans; > } > > static double polevl_cos(double zz) > { > // Constants taken from fdlibm k_cos.c
Assignee  
Comment 11•6 years ago


Attachment #8408201 
Attachment is obsolete: true
Attachment #8409085 
Flags: review?(sunfish)
Assignee  
Comment 12•6 years ago


Comment on attachment 8409085 [details] [diff] [review] Part 2 v2: Coefficients for polevl_cos with fewer terms; less precise, but more performant. ># HG changeset patch ># Parent 8f23ed95cca6805045b49ff949438a3ff34eddad ># User Emanuel Hoogeveen <emanuel.hoogeveen@gmail.com> >Bug 997459  Part 2: Coefficients for polevl_cos with fewer terms; less precise, but more performant. > >diff git a/js/src/jsmath.cpp b/js/src/jsmath.cpp > a/js/src/jsmath.cpp >+++ b/js/src/jsmath.cpp >@@ 379,30 +379,30 @@ static double polevl_sin(double z, doubl > ans += 1.66666666666666490881e01; // 6004799503160655 / exp2(55) > ans *= zz * z; > ans += z; > return ans; > } > > static double polevl_cos(double zz) > { > // Constants taken from fdlibm k_cos.c > double ans = 1.13596475577881948265e11; >+ // Constants generated using Mathematica's GeneralMiniMaxApproximation. >+ // This set uses one less coefficient than usual implementations to >+ // increase performance, raising the maximum approximation error to 2 bits. >+ double ans = 2.06467337476762997948e9; > ans *= zz; > ans += 2.08757232129817482790e09; >+ ans += 2.75555495413759160741e7; > ans *= zz; > ans += 2.75573143513906633035e07; >+ ans += 2.48015808595638122085e5; > ans *= zz; > ans += 2.48015872894767294178e05; >+ ans += 1.38888888779622760722e3; > ans *= zz; > ans += 1.38888888888741095749e03; >+ ans += 4.16666666665987187046e2; > ans *= zz; > ans += 4.16666666666666019037e02; > ans *= zz; > ans += 0.5; >+ ans += 4.99999999999999888978e1; > ans *= zz; > ans += 1.0; > return ans; > } > > namespace { > struct sincos_result { double s, c; }; > }
Attachment #8409085 
Attachment description: Part 2 v2: Coefficients for polevl_cos with fewer terms; potentially less precise, but more performant. → Part 2 v2: Coefficients for polevl_cos with fewer terms; less precise, but more performant.
Comment 13•6 years ago


Comment on attachment 8409085 [details] [diff] [review] Part 2 v2: Coefficients for polevl_cos with fewer terms; less precise, but more performant. Review of attachment 8409085 [details] [diff] [review]:  Looks good.
Attachment #8409085 
Flags: review?(sunfish) → review+
Assignee  
Comment 14•6 years ago


Dan verified on IRC that the reduced precision isn't a problem for Sunspider, so setting checkinneeded :)
Keywords: checkinneeded
Comment 15•6 years ago


https://hg.mozilla.org/integration/mozillainbound/rev/e6bb48383502 https://hg.mozilla.org/integration/mozillainbound/rev/25925f2cb271
Keywords: checkinneeded
Comment 16•6 years ago


https://hg.mozilla.org/mozillacentral/rev/e6bb48383502 https://hg.mozilla.org/mozillacentral/rev/25925f2cb271
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution:  → FIXED
Target Milestone:  → mozilla31
Assignee  
Comment 17•6 years ago


Eyeballing the AWFY results, it looks like this change improved ssmorph by around 3% (although something seems to have regressed it back to previous levels at the moment). So it seems like this did help performance slightly in the scalar code, albeit not much :)
You need to log in
before you can comment on or make changes to this bug.
Description
•