Closed Bug 997459 Opened 11 years ago Closed 11 years ago

Try out different sets of coefficients for new sincos implementation to optimize accuracy and performance

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla31

People

(Reporter: ehoogeveen, Assigned: ehoogeveen)

References

(Depends on 1 open bug)

Details

Attachments

(2 files, 1 obsolete file)

Part 1: Coefficients for polevl_sin with lower mean error. 11 years ago Emanuel Hoogeveen [:ehoogeveen] 1.59 KB, patch	sunfish : review+	Details \| Diff \| Splinter Review
Part 2: Coefficients for polevl_cos with fewer terms; potentially less precise, but more performant. 11 years ago Emanuel Hoogeveen [:ehoogeveen] 1.38 KB, patch	sunfish : feedback+	Details \| Diff \| Splinter Review
Part 2 v2: Coefficients for polevl_cos with fewer terms; less precise, but more performant. 11 years ago Emanuel Hoogeveen [:ehoogeveen] 1.53 KB, patch	sunfish : review+	Details \| Diff \| Splinter Review

Emanuel Hoogeveen [:ehoogeveen]

Assignee

Description

•

11 years ago

As requested, here's a fairly rough set of coefficients for approximating cosine (I included the 1.0 at the end for completeness): 2.06467337476762997948e-9 -2.75555495413759160741e-7 2.48015808595638122085e-5 -1.38888888779622760722e-3 4.16666666665987187046e-2 -4.99999999999999888978e-1 1.0 These give a worst error of 7.87e-17 in arbitrary precision, less than 1.11e-16, the epsilon of the output range for the approximation, so with any luck the maximum error will still be 1 bit. It's still big enough that the errors may add up to 2 bits though (I haven't got the program from bug 967709 comment #6 set up yet, so I haven't been able to check). The optimization goal for these coefficients was to minimize the maximum error (in arbitrary precision), so if these give a maximum error of 1 bit in practice, optimizing for the mean error might give a somewhat better result - but if these give a maximum error of 2 bits, this is probably the best we're going to get with this number of coefficients. I also have a slightly better set of coefficients for sine, but I'll post those later.

Emanuel Hoogeveen [:ehoogeveen]

Assignee

Comment 1

•

11 years ago

Actually, here's the new set of coefficients for sine that I was working on: 1.59046813973877163292e-10 // 6152825598094877 / exp2(85) -2.50509001624159785668e-08 // -7571170002733246 / exp2(78) 2.75573146431678644161e-06 // 6506786951439440 / exp2(71) -1.98412698327005105692e-04 // -7320136534024805 / exp2(65) 8.33333333332626768897e-03 // 4803839602524456 / exp2(59) -1.66666666666666490881e-01 // -6004799503160655 / exp2(55) In arbitrary precision, these have a 10.57% smaller mean error than the current set (the difference will be smaller in double precision), at the cost of a somewhat larger maximum error (the maximum error is still so low that the maximum difference in double precision should be 1 bit).

Emanuel Hoogeveen [:ehoogeveen]

Assignee

Comment 2

•

11 years ago

(In reply to Emanuel Hoogeveen [:ehoogeveen] from comment #1) > (the difference will be smaller in double precision) These coefficients appear to have about an 8.21% smaller mean error in double precision - better than I expected! That's using Mathematica to do my own testing, so probably comparing against a different version of sin() as a baseline (is standard library sin() guaranteed to be accurate to the last bit?).

Emanuel Hoogeveen [:ehoogeveen]

Assignee

Comment 3

•

11 years ago

Attached patch Part 1: Coefficients for polevl_sin with lower mean error. — Details — Splinter Review

These coefficients should simply be better, but it would be good to confirm that the maximum error is still 1 epsilon with these. It might be worth noting that the errors peak closer to Pi/4 with this set, but the peaks are narrower (hence the reduction in mean error). It would also be possible to generate sets that 1) optimize solely to minimize the maximum error or 2) optimize for the geometric mean of the maximum error and mean error. These would give lower, but broader peaks in arbitrary precision. But given that simply squaring the input in double precision introduces errors of the same magnitude, which are then shifted around in applying the approximation, I don't think it would help make the errors significantly more predictably spaced.

Assignee: nobody → emanuel.hoogeveen

Status: NEW → ASSIGNED

Attachment #8408199 - Flags: review?(sunfish)

Emanuel Hoogeveen [:ehoogeveen]

Assignee

Comment 4

•

11 years ago

Attached patch Part 2: Coefficients for polevl_cos with fewer terms; potentially less precise, but more performant. (obsolete) — Details — Splinter Review

This applies the same coefficients as in comment #0 and applies on top of part 1. I don't know if these coefficients are good enough to use, so asking for feedback instead of review. If we decide that these are good enough to land, I'll work on tweaking them a bit more later.

Attachment #8408201 - Flags: feedback?(sunfish)

Dan Gohman [:sunfish]

Comment 5

•

11 years ago

Comment on attachment 8408199 [details] [diff] [review] Part 1: Coefficients for polevl_sin with lower mean error. Review of attachment 8408199 [details] [diff] [review]: ----------------------------------------------------------------- My tests also show a maximum error still at one bit in the [0,pi/4] range, and the average error is reduced.

Attachment #8408199 - Flags: review?(sunfish) → review+

Dan Gohman [:sunfish]

Comment 6

•

11 years ago

Comment on attachment 8408201 [details] [diff] [review] Part 2: Coefficients for polevl_cos with fewer terms; potentially less precise, but more performant. Review of attachment 8408201 [details] [diff] [review]: ----------------------------------------------------------------- This looks good. I see a maximum error of 2 bits on most ranges. When I translate this into the SIMD version, it shortens the cos lane to be the same length as the sin lane. ::: js/src/jsmath.cpp @@ +384,5 @@ > > static double polevl_cos(double zz) > { > + // Constants generated using Mathematica's GeneralMiniMaxApproximation > + double ans = 2.06467337476762997948e-9; It'd be good to add a comment here explaining that this cos polynomial is one term shorter than is common, with a tradeoff in precision for performance.

Attachment #8408201 - Flags: feedback?(sunfish) → feedback+

Emanuel Hoogeveen [:ehoogeveen]

Assignee

Comment 7

•

11 years ago

(In reply to Dan Gohman [:sunfish] from comment #6) > This looks good. I see a maximum error of 2 bits on most ranges. When I > translate this into the SIMD version, it shortens the cos lane to be the > same length as the sin lane. Yes, I see the same maximum error in my own testing. By the way, I could easily generate a series for sine using one *more* coefficient if we want that final bit of precision for both of them (of course you could just set the final coefficient to 0.0 for testing). How much worse is performance using another coefficient?

Dan Gohman [:sunfish]

Comment 8

•

11 years ago

(In reply to Emanuel Hoogeveen [:ehoogeveen] from comment #7) > (In reply to Dan Gohman [:sunfish] from comment #6) > > This looks good. I see a maximum error of 2 bits on most ranges. When I > > translate this into the SIMD version, it shortens the cos lane to be the > > same length as the sin lane. > > Yes, I see the same maximum error in my own testing. By the way, I could > easily generate a series for sine using one *more* coefficient if we want > that final bit of precision for both of them (of course you could just set > the final coefficient to 0.0 for testing). How much worse is performance > using another coefficient? Doing one fewer multiply and add is about a 10% difference in performance. With my current projections, this speedup is important if we're going to be performance-competitive using the current algorithm.

Emanuel Hoogeveen [:ehoogeveen]

Assignee

Comment 9

•

11 years ago

Okay, that's a fair bit. I guess if there comes a point when browser vendors agree that high precision implementations of sin and cos are important, it's easy to add back another coefficient. Further refining the current coefficients to reduce the maximum error (unlikely) or mean error (likely) will take time, so I'll post an updated patch for review.

Emanuel Hoogeveen [:ehoogeveen]

Assignee

Comment 10

•

11 years ago

Comment on attachment 8408199 [details] [diff] [review] Part 1: Coefficients for polevl_sin with lower mean error. ># HG changeset patch ># Parent c55dfb01a02757b15d5c5d1f2bfec4310d0232fc ># User Emanuel Hoogeveen <emanuel.hoogeveen@gmail.com> >Bug 997459 - Part 1: Coefficients for polevl_sin with lower mean error. > >diff --git a/js/src/jsmath.cpp b/js/src/jsmath.cpp >--- a/js/src/jsmath.cpp >+++ b/js/src/jsmath.cpp >@@ -360,28 +360,28 @@ js::math_clz32(JSContext *cx, unsigned a > * [3] http://netlib.org/cephes/ > * [4] https://svnweb.cern.ch/trac/vdt > * [5] http://www.ecma-international.org/ecma-262/5.1/#sec-15.8.2 > * [6] http://netlib.org/fdlibm > */ > > static double polevl_sin(double z, double zz) > { >- // Constants taken from fdlibm k_sin.c >- double ans = 1.58969099521155010221e-10; >+ // Constants generated using Mathematica's GeneralMiniMaxApproximation. >+ double ans = 1.59046813973877163292e-10; // 6152825598094877 / exp2(85) > ans *= zz; >- ans += -2.50507602534068634195e-08; >+ ans += -2.50509001624159785668e-08; // -7571170002733246 / exp2(78) > ans *= zz; >- ans += 2.75573137070700676789e-06; >+ ans += 2.75573146431678644161e-06; // 6506786951439440 / exp2(71) > ans *= zz; >- ans += -1.98412698298579493134e-04; >+ ans += -1.98412698327005105692e-04; // -7320136534024805 / exp2(65) > ans *= zz; >- ans += 8.33333333332248946124e-03; >+ ans += 8.33333333332626768897e-03; // 4803839602524456 / exp2(59) > ans *= zz; >- ans += -1.66666666666666324348e-01; >+ ans += -1.66666666666666490881e-01; // -6004799503160655 / exp2(55) > ans *= zz * z; > ans += z; > return ans; > } > > static double polevl_cos(double zz) > { > // Constants taken from fdlibm k_cos.c

Emanuel Hoogeveen [:ehoogeveen]

Assignee

Comment 11

•

11 years ago

Attached patch Part 2 v2: Coefficients for polevl_cos with fewer terms; less precise, but more performant. — Details — Splinter Review

Attachment #8408201 - Attachment is obsolete: true

Attachment #8409085 - Flags: review?(sunfish)

Emanuel Hoogeveen [:ehoogeveen]

Assignee

Comment 12

•

11 years ago

Comment on attachment 8409085 [details] [diff] [review] Part 2 v2: Coefficients for polevl_cos with fewer terms; less precise, but more performant. ># HG changeset patch ># Parent 8f23ed95cca6805045b49ff949438a3ff34eddad ># User Emanuel Hoogeveen <emanuel.hoogeveen@gmail.com> >Bug 997459 - Part 2: Coefficients for polevl_cos with fewer terms; less precise, but more performant. > >diff --git a/js/src/jsmath.cpp b/js/src/jsmath.cpp >--- a/js/src/jsmath.cpp >+++ b/js/src/jsmath.cpp >@@ -379,30 +379,30 @@ static double polevl_sin(double z, doubl > ans += -1.66666666666666490881e-01; // -6004799503160655 / exp2(55) > ans *= zz * z; > ans += z; > return ans; > } > > static double polevl_cos(double zz) > { >- // Constants taken from fdlibm k_cos.c >- double ans = -1.13596475577881948265e-11; >+ // Constants generated using Mathematica's GeneralMiniMaxApproximation. >+ // This set uses one less coefficient than usual implementations to >+ // increase performance, raising the maximum approximation error to 2 bits. >+ double ans = 2.06467337476762997948e-9; > ans *= zz; >- ans += 2.08757232129817482790e-09; >+ ans += -2.75555495413759160741e-7; > ans *= zz; >- ans += -2.75573143513906633035e-07; >+ ans += 2.48015808595638122085e-5; > ans *= zz; >- ans += 2.48015872894767294178e-05; >+ ans += -1.38888888779622760722e-3; > ans *= zz; >- ans += -1.38888888888741095749e-03; >+ ans += 4.16666666665987187046e-2; > ans *= zz; >- ans += 4.16666666666666019037e-02; >- ans *= zz; >- ans += -0.5; >+ ans += -4.99999999999999888978e-1; > ans *= zz; > ans += 1.0; > return ans; > } > > namespace { > struct sincos_result { double s, c; }; > }

Attachment #8409085 - Attachment description: Part 2 v2: Coefficients for polevl_cos with fewer terms; potentially less precise, but more performant. → Part 2 v2: Coefficients for polevl_cos with fewer terms; less precise, but more performant.

Dan Gohman [:sunfish]

Comment 13

•

11 years ago

Comment on attachment 8409085 [details] [diff] [review] Part 2 v2: Coefficients for polevl_cos with fewer terms; less precise, but more performant. Review of attachment 8409085 [details] [diff] [review]: ----------------------------------------------------------------- Looks good.

Attachment #8409085 - Flags: review?(sunfish) → review+

Emanuel Hoogeveen [:ehoogeveen]

Assignee

Comment 14

•

11 years ago

Dan verified on IRC that the reduced precision isn't a problem for Sunspider, so setting checkin-needed :)

Keywords: checkin-needed

(PTO June 19-July 7) Ryan VanderMeulen

Comment 15

•

11 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/e6bb48383502 https://hg.mozilla.org/integration/mozilla-inbound/rev/25925f2cb271

Keywords: checkin-needed

(PTO June 19-July 7) Ryan VanderMeulen

Comment 16

•

11 years ago

https://hg.mozilla.org/mozilla-central/rev/e6bb48383502 https://hg.mozilla.org/mozilla-central/rev/25925f2cb271

Status: ASSIGNED → RESOLVED

Closed: 11 years ago

Resolution: --- → FIXED

Target Milestone: --- → mozilla31

Emanuel Hoogeveen [:ehoogeveen]

Assignee

Comment 17

•

11 years ago

Eyeballing the AWFY results, it looks like this change improved ss-morph by around 3% (although something seems to have regressed it back to previous levels at the moment). So it seems like this did help performance slightly in the scalar code, albeit not much :)

You need to log in before you can comment on or make changes to this bug.

Bugzilla

Try out different sets of coefficients for new sincos implementation to optimize accuracy and performance

Categories

(Core :: JavaScript Engine: JIT, defect)

Tracking

()

People

(Reporter: ehoogeveen, Assigned: ehoogeveen)

References

(Depends on 1 open bug)

Details

Crash Data

Security

(public)

User Story

Attachments

(2 files, 1 obsolete file)

Description

Comment 1

Comment 2

Comment 3

Comment 4

Comment 5

Comment 6

Comment 7

Comment 8

Comment 9

Comment 10

Comment 11

Comment 12

Comment 13

Comment 14

Comment 15

Comment 16

Comment 17

Attachment

General

Description

File Name

Content Type