Closed Bug 933257 Opened 11 years ago Closed 9 years ago

Address precision in new Math functions

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla48
Tracking Status
firefox48 --- fixed

People

(Reporter: jorendorff, Assigned: arai)

References

(Blocks 1 open bug)

Details

Attachments

(34 files, 36 obsolete files)

183.28 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
14.43 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
1.01 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
971 bytes, patch
Waldo
: review+
Details | Diff | Splinter Review
22.43 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
793 bytes, patch
Waldo
: review+
Details | Diff | Splinter Review
1.82 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
1.79 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
1.62 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
11.71 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
1.54 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
26.26 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
1.15 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
1.01 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
1.74 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
3.69 KB, patch
gerv
: review+
Details | Diff | Splinter Review
2.58 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
2.23 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
807 bytes, patch
Waldo
: review+
Details | Diff | Splinter Review
711 bytes, patch
Waldo
: review+
Details | Diff | Splinter Review
1.03 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
1.11 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
2.50 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
3.47 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
647 bytes, patch
Waldo
: review+
Details | Diff | Splinter Review
17.61 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
15.69 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
8.12 KB, patch
jorendorff
: review+
Details | Diff | Splinter Review
3.60 KB, patch
luke
: review+
Details | Diff | Splinter Review
3.36 KB, application/x-javascript
Details
5.95 KB, text/plain
Details
2.08 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
5.04 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
1.70 KB, patch
luke
: review+
Details | Diff | Splinter Review
They are not very precise. See bug 892671. Needs numerical analysis. Cc-ing Graydon and Boris because I don't know where else to turn.
Hmm. I mostly have theoretical knowledge of this sort of stuff, not actual experience. :( A standardization question: does the spec not specify the precise sequence of floating-point operations for these functions?
(In reply to Boris Zbarsky [:bz] from comment #1) > A standardization question: does the spec not specify the precise sequence > of floating-point operations for these functions? The specification only describes the edge cases (Infinity, NaN, etc), for the actual implementation it simply refers to fdlibm. From https://people.mozilla.org/~jorendorff/es6-draft.html#sec-function-properties-of-the-math-object > Although the choice of algorithms is left to the implementation, it is > recommended (but not specified by this standard) that implementations use > the approximation algorithms for IEEE 754 arithmetic contained in fdlibm, > the freely distributable mathematical library from Sun Microsystems (http://www.netlib.org/fdlibm).
> A standardization question: does the spec not specify the precise sequence of floating-point > operations for these functions? No. I think that would be excessive. If Math.sin() were specified as a sequence of floating-point arithmetic operations, then we couldn't use the C standard library sin() to implement it. fdlibm is abandonware, but the license is right and math doesn't change much. Let's just use a C stdlib function, where it exists, and otherwise use fdlibm.
(In reply to Jason Orendorff [:jorendorff] from comment #0) > They are not very precise. See bug 892671. > > Needs numerical analysis. What sort of numerical analysis are you envisioning here? Are you trying to figure out when the libm functions are not precise enough and their fdlibm counterparts need to be used instead?
(In reply to Nathan Froyd (:froydnj) from comment #4) > (In reply to Jason Orendorff [:jorendorff] from comment #0) > > Needs numerical analysis. > > What sort of numerical analysis are you envisioning here? Are you trying to > figure out when the libm functions are not precise enough and their fdlibm > counterparts need to be used instead? No, if we go with fdlibm we don't need analysis—and I don't think any is forthcoming, so it's a good thing...
cc-ing Vladislav in case he's interested. I had given up hope of getting any help with this.
I just remembered: David posted a patch for this which we never landed. Bug 717379, attachment 779974 [details] [diff] [review].
I don't quite understand this bug, but I've looked at tests of the functions mentioned in bug 892671 under Windows: 1. Math.acosh seems to be ok. The only relatively large errors are checks x = Math.acosh(Math.cosh(x)) near 0. cosh(0)=1, so it's the expected behavior that there will be loss of precision for Math.acosh(Math.cosh(x)). 2. Math.asinh seems to have genuine not quite good precision around 0. It seems that the problem lies in Math.log1p -- asinh near zero is calculated as Math.log1p(Math.abs(x) + x * x / (1.0 + Math.sqrt(1.0 + x * x))) (see fdlibm). It works well under Linux, but not on Windows due to poor precision of Math.log1p. 3. Math.cbrt seems to have genuinely poor precision near zero under Windows.
Assignee: general → nobody
I found interesting `log1p` at https://bugs.ecmascript.org/show_bug.cgi?id=309#c16: Math.log1p = function (x) { x = Number(x); if (x === 1 / 0) { return x; } return (1 + x) - 1 === 0 ? x : x * (Math.log(1 + x) / ((1 + x) - 1)); };
I'm going to import fdlibm with prefix (fdlibm_cbrt, etc), so we can switch to fdlibm functions only where we want to use it, without having troubles (mainly conflict with math.h and cmath) Try is running (and patches there): https://treeherder.mozilla.org/#/jobs?repo=try&revision=b6b922e02dd5
Assignee: nobody → arai.unmht
See Also: → 897634, 1225024
for now, attaching patches without setting r? any comments are welcome. overview: * import fdlibm in modules/fdlibm/src * add |fd_| prefix for each function * in addition to prefix, apply some more patches to fix bug, make it compatible with C++, etc patches for fdlibm are stored in tree, to make it easy to follow upstream change * replace sloppy math function in jsmath.cpp with fd_* variant * build used fdlibm functions [Part 1] a script fetch.py that downloads fdlibm source codes and applies patches. Most patches are in Part 4. fetch.py also replaces `#include "fdlibm.h"` with `#include "fdlibm-internal.h"`, because fdlibm.h has several macros that conflicts with math.h and cmath. Most macros in fdlibm.h are moved to fdlibm-internal.h, and fdlibm-internal.h includes encode.h (Part 2), fdlibm.h, and prefix.h (Part 2). Green on try runs: https://treeherder.mozilla.org/#/jobs?repo=try&revision=c9bd75ab953b (linux, osx, windows, need some more tolerance on linux32) https://treeherder.mozilla.org/#/jobs?repo=try&revision=3db8ee9241e4 (fixed tolerance) https://treeherder.mozilla.org/#/jobs?repo=try&revision=28715acc1e49 (android)
[Part 2] 2 header files used by fdlibm code, endian.h and prefix.h, they're not exported. endian.h is just a partial copy of mfbt/Endian.h, to include from C file. MOZ_LITTLE_ENDIAN is needed by __HI and __LO macros used in fdlibm source. actually I want to share the endian.h code with mfbt/Endia.h, where should I place it? prefix.h contains macro that adds |fd_| prefix to all math functions in fdlibm, it's included by fdlibm-internal.h, so we don't have to modify each function definition and reference.
[Part 3] build scripts for each directory.
Attached patch Part 4: Add patches for fdlibm. (obsolete) — Splinter Review
[Part 4] patch series for fdlibm source. * 01-split-internal.patch as mentioned in Part 1, split fdlibm.h into internal macro things and public things all fdlibm source include fdlibm-internal.h, and SpiderMonkey includes fdlibm.h * 02-rename-p.patch replaced __P macro in fdlibm.h with FDLIBM_P, because it conflicts with system header * 03-add-prefix.patch add |fd_| prefix to all functions in fdlibm.h definitions and references in fdlibm sources are replaced by prefix.h added in Part 2 * 04-add-extern-c.patch adds |extern "C" { ... }| to include fdlibm.h from C++ code * 05-fix-else-branch.patch strange indentation in e_asin.c seems to be a bug according to the comment, the `/* |x|<0.5 */` block should always return, as after the |else if| block is `/* 1> |x|>= 0.5 */` case. so |else| there should be removed (previously I thought that indended lines needs braces, but it doesn't match to the comment) * 06-add-missing-braces-and-parens.patch added braces and parens just to avoid warning (&& and ||, and dangling else) * 07-add-missing-initialize.patch added missing variable initialization in e_exp.c, e_j0.c, and e_j1.c not sure whether they're correct or not... but at least they're not used in this patch series might be better to put assertion there? * 08-remove-pragma.patch commented out |#pragma ident ...| in k_tan.c, that causes error * 09-twoint-union.patch ported from js/src/fdlibm 8 years ago added fd_twoints union, and use it in __HI and __LO macros, that was the way used before, to avoid aliasing warning * 10-fix-alias.patch use __HI and __LO where aliasing warning happens
Attached patch Part 5: Import fdlibm. (obsolete) — Splinter Review
[Part 5] fdlibm source codes, with patches applied the result of fetch.py
[Part 6] adds fdlibm to USE_LIBS in js/src/moz.build.
[Part 7] replace log10 with fd_log10 in jsmath.cpp. fd_log10 passes test without sloppy_tolerance.
[Part 8] replace cosh/sinh/tanh with fd_cosh/fd_sinh/fd_tanh in jsmath.cpp. fd_cosh/fd_sinh reudces sloppy_tolerance from 100 to 8. fd_tanh reduces sloppy_tolerance from 4 to 2.
[Part 9] replace acosh/asinh/atanh with fd_acosh/fd_asinh/fd_atanh in jsmath.cpp. removed acosh/asinh/atanh implementation from jsmath.cpp. fd_acosh reduces sloppy_tolerance from 1000 to 8. fd_asinh passes test wihtout sloppy_tolerance. fd_atanh reduces sloppy_tolerance from 10 to 2.
[Part 10] replace cbrt with fd_cbrt in jsmath.cpp. removed cbrt implementation from jsmath.cpp. fd_cbrt passes test without sloppy_tolerance.
Is upstream maintained? It would be nice to submit the patches we're applying back to them.
according to readme and comment, last update (version 5.3) was made at 04/04/22 (means 2004-04-22 ?) http://www.netlib.org/fdlibm/readme
hm, now I feel it would be better to create a changeset for each fdlibm patch for clarity So, Part 5.0: import fdlibm without patch Part 5.1: split fdlibm.h into fdlibm-interanal.h Part 5.2: replaced __P macro in fdlibm.h with FDLIBM_P ... and optionally store those patches in tree for future update, as some patches are not appropriate for putting in upstream (fd_ prefix, etc)
Attachment #8689347 - Attachment is obsolete: true
Attachment #8689348 - Attachment is obsolete: true
Attachment #8689349 - Attachment is obsolete: true
Attached patch Part 4.0: Import fdlibm. (obsolete) — Splinter Review
imported original fdlibm, followed by changesets for each change
Attachment #8689350 - Attachment is obsolete: true
Attachment #8689351 - Attachment is obsolete: true
Waldo, can I get some feedback for whole plan?
Flags: needinfo?(jwalden+bmo)
(In reply to Jason Orendorff [:jorendorff] from comment #3) > fdlibm is abandonware, but the license is right and math doesn't change > much. Let's just use a C stdlib function, where it exists, and otherwise use > fdlibm. Are the C stdlib functions still preferred over fdlibm? MSVC reports the following warnings about jsmath.cpp's polyfill function definitions conflicting with the function prototypes declared in math.h. AFAICT, none of these polyfill functions are necessary for VS2013 and log2 is only needed for Android's bionic libc. I was about to submit a patch to remove jsmath.cpp's polyfill functions when I saw Tooru's fdlibm patches here. js/src/jsmath.cpp(1121) : warning C4273: 'log2' : inconsistent dll linkage js/src/jsmath.cpp(1146) : warning C4273: 'log1p' : inconsistent dll linkage js/src/jsmath.cpp(1192) : warning C4273: 'expm1' : inconsistent dll linkage js/src/jsmath.cpp(1296) : warning C4273: 'acosh' : inconsistent dll linkage js/src/jsmath.cpp(1406) : warning C4273: 'atanh' : inconsistent dll linkage js/src/jsmath.cpp(1606) : warning C4273: 'cbrt' : inconsistent dll linkage
so, the precision issue is not so strong reason for now, as the spec doesn't require it, right? other cases (like bug 1070517 or bug 1229666 maybe?) would be more important. If fdlibm can solve them, it would worth re-importing it, and maybe solving the precision issue at the same time. will look into the log2 case.
(In reply to Chris Peterson [:cpeterson] from comment #40) > (In reply to Jason Orendorff [:jorendorff] from comment #3) > > fdlibm is abandonware, but the license is right and math doesn't change > > much. Let's just use a C stdlib function, where it exists, and otherwise use > > fdlibm. > > Are the C stdlib functions still preferred over fdlibm? No. Now I would prefer switching to fdlibm, for all functions, on all platforms.
Flags: needinfo?(jwalden+bmo)
log2 (bug 1070517) is problematic :/ there is no direct implementation of log2 in fdlibm, and following polyfill returns 124.00000000000001 for Math.log2(Math.pow(2, 124)). > double _log2(double x) > { > return fd_log(x) / M_LN2; > } so, if we replace log2 with fdlibm's log + above code, bug 1070517 becomes platform independent issue.
there seems to be e_log2.c file in some repositories. will search the latest version and check the license.
NetBSD contains fdlibm, with some more functions (including log2) and float variants: http://ftp.netbsd.org/pub/NetBSD/NetBSD-current/src/lib/libm/src/ maybe we can import from there?
@Tooru Fujisawa, What is wrong with log2 implementation at https://bugzilla.mozilla.org/show_bug.cgi?id=1070517#c1 ? instead of "Math.getExponent" in C++/C you could use "ilogb" or "frexp" and instead of "Math.pow(2, x)" - "scalbn" / "ldexp"
(In reply to 4esn0k from comment #46) > @Tooru Fujisawa, > What is wrong with log2 implementation at > https://bugzilla.mozilla.org/show_bug.cgi?id=1070517#c1 ? > instead of "Math.getExponent" in C++/C you could use "ilogb" or "frexp" and > instead of "Math.pow(2, x)" - > "scalbn" / "ldexp" I was just about to switch to fdlibm (or its variant) for all Math functions. If you have a solution, feel free to work on bug 1070517 ;)
(In reply to Tooru Fujisawa [:arai] from comment #47) Do not know if I can do this - I have never written a patch for Mozilla And you will still need to fix expm1 and log1p
(In reply to 4esn0k from comment #48) > (In reply to Tooru Fujisawa [:arai] from comment #47) > Do not know if I can do this - I have never written a patch for Mozilla This page will help: https://developer.mozilla.org/en-US/docs/Introduction but I'm not sure which log2 impl is better. > And you will still need to fix expm1 and log1p fdlibm has both of them. or perhaps do you know another issue related to them?
(In reply to Tooru Fujisawa [:arai] from comment #49) > fdlibm has both of them. > or perhaps do you know another issue related to them? No, I wanted to say that you will need fdlibm for them. P.S. Your implementations of "acosh", "asinh", "atanh" are inaccurate because of log1p and expm1. And so if you fix log1p and expm1 there is a chance you fix "acosh", "asinh", "atanh".
(In reply to Tooru Fujisawa [:arai] from comment #45) > NetBSD contains fdlibm, with some more functions (including log2) and float > variants: > http://ftp.netbsd.org/pub/NetBSD/NetBSD-current/src/lib/libm/src/ Is that based on an older version? For instance their e_pow.c has "e_pow.c 5.1 93/09/24" and upstream has "e_pow.c 1.5 04/04/22". According to the README, 5.3 fixed a bug in e_pow.c. Maybe they did backport this to the NetBSD version? We could diff them to see what the differences are. If you have questions about the license you can ask :gerv
(In reply to Jason Orendorff [:jorendorff] from comment #42) > Now I would prefer switching to fdlibm, for all functions, on all platforms. This is also great to have more predictable performance across platforms. Obviously compilers will still emit different code, but at least not wildly different algorithms everywhere.
Seems, Firefox uses stdlib version "cbrt" on Ubuntu, but this version is not monotonic: Math.cbrt(8.000000000000012) > Math.cbrt(8.000000000000014)
5.3 fix for e_pow.c is ported as Revision 1.13. http://cvsweb.netbsd.org/bsdweb.cgi/src/lib/libm/src/e_pow.c k_tan.c is also fixed as Revision 1.12. http://cvsweb.netbsd.org/bsdweb.cgi/src/lib/libm/src/k_tan.c And I don't see any other notable difference between upstream and NetBSD port.
Also, the strange "else" branch in e_asin.c is fixed in different way than my WIP patch http://cvsweb.netbsd.org/bsdweb.cgi/~checkout~/src/lib/libm/src/e_asin.c?rev=1.12&content-type=text/plain http://cvsweb.netbsd.org/bsdweb.cgi/src/lib/libm/src/e_asin.c.diff?r1=1.9&r2=1.10 > t = 0; >... > if(ix>= 0x3ff00000) { /* |x|>= 1 */ >... > } else if (ix<0x3fe00000) { /* |x|<0.5 */ > if(ix<0x3e400000) { /* if |x| < 2**-27 */ > if(huge+x>one) return x;/* return x with inexact if x!=0*/ > } else > t = x*x; > p = t*(pS0+t*(pS1+t*(pS2+t*(pS3+t*(pS4+t*pS5))))); > q = one+t*(qS1+t*(qS2+t*(qS3+t*qS4))); > w = p/q; > return x+x*w; > }
just noticed that e_log2.c was ported from FreeBSD http://cvsweb.netbsd.org/bsdweb.cgi/src/lib/libm/src/e_log2.c that is stored here: https://github.com/freebsd/freebsd/tree/master/lib/msun/src and they solve the e_asin.c with the same way as my previous WIP patch: https://github.com/freebsd/freebsd/commit/d95c118b83b9227e11b22b271e9a8fb9f29a0ed7
It passes the try, by replacing most of Math.* functions with fdlibm functions imported from FreeBSD. https://treeherder.mozilla.org/#/jobs?repo=try&revision=8ea8a797093f So, if there is no problem in the license, it should be better to use FreeBSD's fdlibm, as their code needs less change than upstream, and seems to be maintained more (, and of course, we can use log2 there). gerv, is it okay to import source code from FreeBSD? The subset of files in following directory: https://github.com/freebsd/freebsd/tree/master/lib/msun/src It's a variant of fdlibm. fdlibm itself was stored in mozilla-central/js/src/fdlibm until 8 years ago, that was removed by bug 424399. So, the question is the license of the additional changes made in FreeBSD. Copyright and permission comment in each file is still same as before: > * ==================================================== > * Copyright (C) 1993 by Sun Microsystems, Inc. All rights reserved. > * > * Developed at SunPro, a Sun Microsystems, Inc. business. > * Permission to use, copy, modify, and distribute this > * software is freely granted, provided that this notice > * is preserved. > * ====================================================
Flags: needinfo?(gerv)
arai: that header gives all the four freedoms, so it's fine. Gerv
Flags: needinfo?(gerv)
If the FreeBSD team have put a difference license on their version, e.g. the one here: https://github.com/freebsd/freebsd/blob/master/lib/msun/src/catrig.c , then we need to add that license to about:license if we ship this code in Firefox. Gerv
Thank you :) I overlooked one file in my WIP patch has different license, that seems to be BSD license: https://github.com/freebsd/freebsd/blob/master/lib/msun/src/k_exp.c it's a newly added file in FreeBSD's port, and the function there is called from cosh/sinh: https://github.com/freebsd/freebsd/commits/master/lib/msun/src/e_cosh.c https://github.com/freebsd/freebsd/commits/master/lib/msun/src/e_sinh.c the history says "Use __ldexp_exp() to simplify things and improve accuracy for x near the overflow threshold". So, it means, the algorithm used in FreeBSD's port is somehow different from upstream. of course it would be better to improve accuracy tho. I'm going to add the license there to about:license.
btw, where should I store fdlibm source in mozilla-central? in my WIP patch it's stored in modules/fdlibm/src, as some other libraries are already there. is there any better place? (maybe js/src/fdlibm again?)
at first, I'd like to ask review about the change to mfbt/Endian.h. I need MOZ_LITTLE_ENDIAN/MOZ_BIG_ENDIAN macro in fdlibm source, to get low/high 32bit part of double. looks like other headers already have "__cplusplus" conditions, so I guess it's okay to apply same thing here.
Attachment #8689352 - Attachment is obsolete: true
Attachment #8689353 - Attachment is obsolete: true
Attachment #8689354 - Attachment is obsolete: true
Attachment #8689355 - Attachment is obsolete: true
Attachment #8689356 - Attachment is obsolete: true
Attachment #8690476 - Attachment is obsolete: true
Attachment #8690477 - Attachment is obsolete: true
Attachment #8690478 - Attachment is obsolete: true
Attachment #8690479 - Attachment is obsolete: true
Attachment #8690480 - Attachment is obsolete: true
Attachment #8690481 - Attachment is obsolete: true
Attachment #8690482 - Attachment is obsolete: true
Attachment #8690483 - Attachment is obsolete: true
Attachment #8690484 - Attachment is obsolete: true
Attachment #8690485 - Attachment is obsolete: true
Attachment #8690486 - Attachment is obsolete: true
Attachment #8690487 - Attachment is obsolete: true
Attachment #8690488 - Attachment is obsolete: true
Attachment #8690489 - Attachment is obsolete: true
Attachment #8690490 - Attachment is obsolete: true
Attachment #8706737 - Flags: review?(jwalden+bmo)
Is it possible to import the functions as C++? Making Endian.h in particular, and mfbt/ headers to the utmost extent possible, C++, was a very deliberate choice. C is a poor choice for writing code in, these days, when C++ is available and all but fully compatible, so we've made a *very* strong push to make it very hard to use any new C code.
Flags: needinfo?(arai.unmht)
I'll check what happens when I change extensions of imported files.
Comment on attachment 8706737 [details] [diff] [review] Part 0: Make mfbt/Endian.h work in C code. looks like it works in C++ :) we can now use namespace instead of prefix to the fdlibm functions.
Flags: needinfo?(arai.unmht)
Attachment #8706737 - Flags: review?(jwalden+bmo)
Attached patch Part 5: Use fdlibm in jsmath.cpp (obsolete) — Splinter Review
Attachment #8706737 - Attachment is obsolete: true
Comment on attachment 8706772 [details] [diff] [review] Part 1: Add a script to import fdlibm from FreeBSD can I ask review for all of these patches?
Attachment #8706772 - Flags: review?(jwalden+bmo)
Attachment #8706794 - Flags: review+
Comment on attachment 8706772 [details] [diff] [review] Part 1: Add a script to import fdlibm from FreeBSD Review of attachment 8706772 [details] [diff] [review]: ----------------------------------------------------------------- ::: modules/fdlibm/import.sh @@ +2,5 @@ > + > +BASE_URL=https://raw.githubusercontent.com/freebsd/freebsd/master/lib/msun/src > + > +function download_source() { > + curl -o src/$2 ${BASE_URL}/$1 || exit 1 "src/$2" and "${BASE_URL}/$1", I think, is slightly cleaner than depending on $1/$2 to not contain spaces. That said -- assign $1 and $2 to script-global variables with descriptive names and explanations of what they should be, please? Some sort of handling of --help with explanations of what the script commandline arguments should be, too, is desirable here. @@ +8,5 @@ > + > +mkdir -p src > + > +# headers > +download_source math.h fdlibm.h || exit 1 Rather than all these "|| exit 1", or at least in addition to it, put |set -e| at the top of the file so errors get complained about fast?
Attachment #8706772 - Flags: review?(jwalden+bmo)
Thank you for reviewing :) (In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #89) > That said -- assign $1 and $2 to script-global variables with descriptive > names and explanations of what they should be, please? Some sort of > handling of --help with explanations of what the script commandline > arguments should be, too, is desirable here. the script does not accept any kind of arguments. I used $1 and $2 there to get function parameter, maybe I'm using it wrongly?
Er. No, I think you're right. (My shell-fu is weak.) Make those local variables, then.
added "set -e", removed "|| exit 1", as "set -e" does what I want. and assigned function parameter to REMOTE_FILENAME and LOCAL_FILENAME.
Attachment #8706772 - Attachment is obsolete: true
Attachment #8709806 - Flags: review?(jwalden+bmo)
Attachment #8709806 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 8706773 [details] [diff] [review] Part 2: Import fdlibm from FreeBSD totally forgot to set r? flag here there is no change in upstream for now.
Attachment #8706773 - Flags: review?(jwalden+bmo)
Attachment #8706774 - Flags: review?(jwalden+bmo)
Attachment #8706775 - Flags: review?(jwalden+bmo)
Attachment #8706776 - Flags: review?(jwalden+bmo)
Attachment #8706774 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 8706775 [details] [diff] [review] Part 2.2: Change include guard in fdlibm.h Review of attachment 8706775 [details] [diff] [review]: ----------------------------------------------------------------- ::: modules/fdlibm/src/fdlibm.h @@ +14,5 @@ > * $FreeBSD$ > */ > > +#ifndef fdlibm_h > +#define fdlibm_h Maybe mozilla_imported_fdlibm_h or something more elaborate than this? This seems plausibly, at the margins, subject to collision.
Attachment #8706775 - Flags: review?(jwalden+bmo) → review+
Attachment #8706776 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 8706773 [details] [diff] [review] Part 2: Import fdlibm from FreeBSD Review of attachment 8706773 [details] [diff] [review]: ----------------------------------------------------------------- rs=me, as I don't see much other way to approach this. We did have an import script with apply-local-changes diffs applied post-import, correct? I vaguely recall so, but it occurs to me the patches I just reviewed just made changes to the code, without updating any import script. Let's make sure we have such a script, please, if we don't.
Attachment #8706773 - Flags: review?(jwalden+bmo) → review+
Thank you for reviewing :) sorry I was doing patch things wrongly. I just found that similar thing is done in modules/woff2/update.sh, so I'll modify patches to follow that way (including update feature and README etc).
Added README and some more script to update with single command, including applying patches. I'll fold Part 1 and Part 1.1. using similar format as woff2 scripts https://dxr.mozilla.org/mozilla-central/source/modules/woff2/README.mozilla https://dxr.mozilla.org/mozilla-central/source/modules/woff2/update.sh differences are following: 1. fdlibm files are downloaded separately, instead of cloning repository, as cloning the repository takes so much time, network resource and disk space, even if I use shallow clone 2. commit hash is retrieved with Github API (as we don't have local repository), before and after downloading files, and checked that they're same, otherwise the script fails patches will be located at modules/fdlibm/patches/*.patch I'd like to ask reviewing for each patch in current style, as it's more convenient to fix and rebase, and I hope posting each one as a patch file will be convenient for review process :) I will store them into modules/fdlibm/patches/ instead of creating changeset for each, after review, like patches/0001_remove_unused_declarations_from_fdlibm_h.patch, etc, removing patch header. So, Part 2 and 2.1-2.14 will be folded into single changeset, and will add one more patch that adds patch files.
Attachment #8722576 - Flags: review?(jwalden+bmo)
Comment on attachment 8722576 [details] [diff] [review] Part 1.1: Add README and script to update fdlibm. Review of attachment 8722576 [details] [diff] [review]: ----------------------------------------------------------------- ::: modules/fdlibm/update.sh @@ +19,5 @@ > +if [ ${BEFORE_COMMIT} != ${COMMIT} ]; then > + echo "Latest commit is changed during import. Please run again." > + exit 1 > +fi > +sh ./patch.sh Why the separate file for this? Seems better just integrated directly into this one script. (If this is just some convention everyone else follows, okay then. A little odd all the same.)
Attachment #8722576 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 8706777 [details] [diff] [review] Part 2.4: Include fdlibm.h from math_private.h Thank you for reviewing :D I'll merge patch.sh into update.sh. Part 2.4 removes `#include "math.h"` from each file, and adds `#include "fdlibm.h"` to math_private.h.
Attachment #8706777 - Flags: review?(jwalden+bmo)
Attachment #8706778 - Flags: review?(jwalden+bmo)
Attachment #8706780 - Flags: review?(jwalden+bmo)
Attachment #8706781 - Flags: review?(jwalden+bmo)
Comment on attachment 8706777 [details] [diff] [review] Part 2.4: Include fdlibm.h from math_private.h Review of attachment 8706777 [details] [diff] [review]: ----------------------------------------------------------------- ::: modules/fdlibm/src/k_sin.cpp @@ +1,1 @@ > + Undo?
Attachment #8706777 - Flags: review?(jwalden+bmo) → review+
Attachment #8706778 - Flags: review?(jwalden+bmo) → review+
Attachment #8706780 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 8706781 [details] [diff] [review] Part 2.7: Add fdlibm namespace to functions defined and used in fdlibm Review of attachment 8706781 [details] [diff] [review]: ----------------------------------------------------------------- <Waldo> arai: so #define acos fdlibm::acos and such <Waldo> arai: could that alteration be done equally well by manually changing every one of those function definitions to the actual fdlibm::* stuff? 'cause I gotta say, redefining such central names as acos and such is a bit frowny If the answer is that there would be other places to change in addition to those function definitions (say, like cross-function uses within fdlibm itself), then okay, let's run with this. But this is a fairly bad patch to take in terms of macro-unhygienicity, and in light of unified builds particularly, if we have any other not-horribly-invasive alternative. So r+ conditional on this textual substitution ourselves being unpalatable to just do ourselves.
Attachment #8706781 - Flags: review?(jwalden+bmo) → review+
Thank you for reviewing :) There are 3 reasons that I thought using #define is simpler and robust for further update from upstream. 1. following functions are called internally (31 places) * cos * exp * expm1 * fabs * floor * log1p * scalbn * sin * sqrt 2. we have to resolve pre-existing |#define __ieee754_exp exp| etc at the same time for following function definitions (14 places) * __ieee754_acos * __ieee754_acosh * __ieee754_asin * __ieee754_atan2 * __ieee754_atanh * __ieee754_cosh * __ieee754_hypot * __ieee754_log * __ieee754_log10 * __ieee754_log2 * __ieee754_pow * __ieee754_sinh * __ieee754_sqrt 3. following functions are called internally with __ieee754_ prefix, (7 places) * __ieee754_exp * __ieee754_log * __ieee754_sqrt anyway, here's a patch that adds fdlibm namespae to each definition and use. How do you think about this? btw, fdlibm source is not compatible with unified build. there are so much duplicated global static variables.
Attachment #8724374 - Flags: feedback?(jwalden+bmo)
Okay, that's about what I figured. Carry on with the #define-based patch. (Scumbag fdlibm. :-) :-( )
Comment on attachment 8724374 [details] [diff] [review] Part 2.15: Add fdlibm namespace to functions defined and used in fdlibm. I see :)
Attachment #8724374 - Attachment is obsolete: true
Attachment #8724374 - Flags: feedback?(jwalden+bmo)
Attachment #8706782 - Flags: review?(jwalden+bmo)
Comment on attachment 8706783 [details] [diff] [review] Part 2.9: Remove __weak_reference macro as we don't need long double variants, removed __weak_reference macro.
Attachment #8706783 - Flags: review?(jwalden+bmo)
Attachment #8706784 - Flags: review?(jwalden+bmo)
Comment on attachment 8706785 [details] [diff] [review] Part 2.11: Comment out rcsid variable __FBSDID is extracted to unused static global variable, so commented them out (not sure if I can just remove them or not)
Attachment #8706785 - Flags: review?(jwalden+bmo)
Attachment #8706782 - Flags: review?(jwalden+bmo) → review+
Attachment #8706783 - Flags: review?(jwalden+bmo) → review+
Attachment #8706784 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 8706785 [details] [diff] [review] Part 2.11: Comment out rcsid variable Review of attachment 8706785 [details] [diff] [review]: ----------------------------------------------------------------- Semi-belatedly, it occurs to me to ask: are we doing these random cleanup things, *as* random cleanup? Or are we doing them to functionally improve the quality of this code? I assume upstream fdlibm is mostly abandoned, so in theory we can make whatever changes to it we want. But is there any real value to doing these things, versus improving something else, more noticeably? If there's no point to these changes, other than to make the code negligibly more readable, I would probably say just leave it alone. The less we touch fdlibm, the less likely we are to incidentally break something, the less delta there is against upstream (should upstream ever start receiving updates again, or should someone else fork upstream and we decide their fork is worth following), etc. We generally keep changes to imported code to a minimum. Is there a good reason not to adhere to that here?
Attachment #8706785 - Flags: review?(jwalden+bmo) → review+
I forgot to mention the reason. The reason of these patches are mostly either following 1. to use fdlibm functions while using system libm in other places so we should use different namespace for functions and macros, or just remove unused parts 2. to make it buildable in our build system, that seems to be using more strict error/warning handling about Part 2.11, __FREEBSD macro may be expanded as `static const char rcsid` variable, that is warned as unused variable, and it's treated as an error in our build system.
Comment on attachment 8706786 [details] [diff] [review] Part 2.12: Fix dangling else in s_scalbn.cpp dangling else is warned and it's treated as an error.
Attachment #8706786 - Flags: review?(jwalden+bmo)
Comment on attachment 8706787 [details] [diff] [review] Part 2.13: Add missing parentheses in e_atan2.cpp this is also warned and treated as an error.
Attachment #8706787 - Flags: review?(jwalden+bmo)
Comment on attachment 8706788 [details] [diff] [review] Part 2.14: Remove unused function from k_exp.cpp we don't need complex variant, so just removed.
Attachment #8706788 - Flags: review?(jwalden+bmo)
Attachment #8706786 - Flags: review?(jwalden+bmo) → review+
Attachment #8706787 - Flags: review?(jwalden+bmo) → review+
Attachment #8706788 - Flags: review?(jwalden+bmo) → review+
While testing patches on other archs/OSes, noticed that hexadecimal float constant is used in some places. e_exp.cpp > if (k==1024) return y*2.0*0x1p1023; e_rem_pio2.cpp > STRICT_ASSIGN(double,fn,x*invpio2+0x1.8p52); > ... > fn = fn-0x1.8p52; s_expm1.cpp > if (k == 1024) y = y*2.0*0x1p1023; it's not available on C++ and VisualStudio compiler throws an error. https://treeherder.mozilla.org/#/jobs?repo=try&revision=faa3fd1371d5&exclusion_profile=false do you know any other way that is compatible with C++? or perhaps we should change it back to C code?
Flags: needinfo?(jwalden+bmo)
looks like, setting up local double variable with INSERT_WORDS works well, as the generated assembly with -O2 is same as 0x1p1023. if there is no better syntax, I'm going to use it. > double const_0x1p1023; > INSERT_WORDS(const_0x1p1023,0x7fe00000, 0); > return y*2.0*const_0x1p1023;
asm.js directly calls libm functions https://dxr.mozilla.org/mozilla-central/rev/5a2e0878d6c258b36b0ee8712a2afcde6ad94c78/js/src/asmjs/WasmTypes.cpp#228 and there is ceilf that was not imported in previous patch. Part 2.x patches are updated to apply same simple fixes to ceilf.cpp as well. Will fold this to Part 1 later.
Attachment #8727057 - Flags: review?(jwalden+bmo)
FLT_EVAL_METHOD used in math_private.h is defined in cfloat (or float.h), but it was not included. linux build hits this issue. https://treeherder.mozilla.org/#/jobs?repo=try&revision=3996eca79b20&selectedJob=17595216
Attachment #8727059 - Flags: review?(jwalden+bmo)
FLT_EVAL_METHOD seems not to be defined on Android build. https://treeherder.mozilla.org/#/jobs?repo=try&revision=cb462959bf80&selectedJob=17599576 and looks like there could be some more environment that doesn't have FLT_EVAL_METHOD. so defined safer implementation for that case.
Attachment #8727061 - Flags: review?(jwalden+bmo)
VisualStudio compiler throws error for non-C++ hexadecimal floating point constant. https://treeherder.mozilla.org/#/jobs?repo=try&revision=faa3fd1371d5&selectedJob=17603546 anyway that's non-C++, and we shouldn't use if we're going to compile the code as C++. changed those 4 constants to double variable initialized with raw data.
Attachment #8727062 - Flags: review?(jwalden+bmo)
VisualStudio compiler throws some more warning. and, on second thought it would be better handling those warning in compiler option, not fixing the source. (might be better reverting Part 2.10, 2.12, and 2.13 too?)
Attachment #8706789 - Attachment is obsolete: true
Attached patch Part 9: Use fdlibm in asm.js. (obsolete) — Splinter Review
Changed libm functions reference in asm.js to fdlibm functions.
Attachment #8727057 - Flags: review?(jwalden+bmo) → review+
Attachment #8727059 - Flags: review?(jwalden+bmo) → review+
Attachment #8727060 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 8727062 [details] [diff] [review] Part 2.18: Do not use hexadecimal floating point number. Review of attachment 8727062 [details] [diff] [review]: ----------------------------------------------------------------- ::: modules/fdlibm/src/e_exp.cpp @@ +153,5 @@ > if(k >= -1021) { > + if (k==1024) { > + double const_0x1p1023; > + INSERT_WORDS(const_0x1p1023,0x7fe00000, 0); > + return y*2.0*const_0x1p1023; Seems to me, rather than this mess of manually mucking with the bits, we should mathematically compute the relevant value. Clearer, more understandable, doesn't rely on sketch low-level details. double const_0x1p1023 = pow(2, 1023); Sure, it's an extra function call, but I can't imagine there's a compiler out there, that we care about, that won't do a computation of constants at compile time. ::: modules/fdlibm/src/e_rem_pio2.cpp @@ +128,5 @@ > if(ix<0x413921fb) { /* |x| ~< 2^20*(pi/2), medium size */ > medium: > /* Use a specialized rint() to get fn. Assume round-to-nearest. */ > + double const_0x1_8p52; > + INSERT_WORDS(const_0x1_8p52,0x43380000, 0); double const_0x1_8p52 = pow(2, 52) + pow(2, 51); ::: modules/fdlibm/src/s_expm1.cpp @@ +197,5 @@ > } > if (k <= -2 || k>56) { /* suffice to return exp(x)-1 */ > y = one-(e-x); > + if (k == 1024) { > + double const_0x1p1023; Again, double const_0x1p1023 = pow(2, 1023);
Attachment #8727062 - Flags: review?(jwalden+bmo) → review-
(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #121) > Comment on attachment 8727062 [details] [diff] [review] > Part 2.18: Do not use hexadecimal floating point number. > > Review of attachment 8727062 [details] [diff] [review]: > ----------------------------------------------------------------- > > ::: modules/fdlibm/src/e_exp.cpp > @@ +153,5 @@ > > if(k >= -1021) { > > + if (k==1024) { > > + double const_0x1p1023; > > + INSERT_WORDS(const_0x1p1023,0x7fe00000, 0); > > + return y*2.0*const_0x1p1023; > > Seems to me, rather than this mess of manually mucking with the bits, we > should mathematically compute the relevant value. Clearer, more > understandable, doesn't rely on sketch low-level details. > > double const_0x1p1023 = pow(2, 1023); > > Sure, it's an extra function call, but I can't imagine there's a compiler > out there, that we care about, that won't do a computation of constants at > compile time. MSVC 2013 does not appear to do this, fyi.
Comment on attachment 8727061 [details] [diff] [review] Part 2.17: Define STRICT_ASSIGN even if FLT_EVAL_METHOD is not defined. Review of attachment 8727061 [details] [diff] [review]: ----------------------------------------------------------------- ::: modules/fdlibm/src/math_private.h @@ +290,5 @@ > } \ > } while (0) > #endif > +#else > +#define STRICT_ASSIGN(type, lval, rval) do { \ I don't have much idea whether this is right or wrong, except that it seems to be what some other sources online do for this. I guess r+ for this. If someone complains ever, we can figure out what to do then, and actually attempt to grok any of the calling code.
Attachment #8727061 - Flags: review?(jwalden+bmo) → review+
(In reply to Tooru Fujisawa [:arai] from comment #112) > While testing patches on other archs/OSes, noticed that hexadecimal float > constant is used in some places. > > do you know any other way that is compatible with C++? > or perhaps we should change it back to C code? I still think pow(), even despite the MSVC nonsense. We don't have any evidence of this as perf bottleneck, and pow() is much more readable than a hexfloat or the thing your patch did. Until we have some proof that pow() affects the real world, we should prioritize readability over absolute speed.
Flags: needinfo?(jwalden+bmo)
Thanks :) fixed it to call pow.
Attachment #8727062 - Attachment is obsolete: true
Attachment #8727941 - Flags: review?(jwalden+bmo)
Added some more options to disable warning, and now Part 2.12 and Part 2.13 are not required.
Attachment #8727063 - Attachment is obsolete: true
Attachment #8728096 - Flags: review?(jwalden+bmo)
Attachment #8727941 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 8728096 [details] [diff] [review] Part 3: Add build scripts for fdlibm. Review of attachment 8728096 [details] [diff] [review]: ----------------------------------------------------------------- ::: config/external/fdlibm/moz.build @@ +4,5 @@ > +# License, v. 2.0. If a copy of the MPL was not distributed with this > +# file, You can obtain one at http://mozilla.org/MPL/2.0/. > + > +Library('fdlibm') > + Might as well add with Files('**'): BUG_COMPONENT = ('Core', 'JavaScript Engine') or somesuch while we're here, right? ::: modules/fdlibm/src/moz.build @@ +25,5 @@ > + CXXFLAGS += [ > + '-wd4018', # signed/unsigned mismatch > + '-wd4146', # unary minus operator applied to unsigned type > + '-wd4305', # truncation from 'double' to 'const float' > + '-wd4723', # potential divide by 0 Did someone write out division by a literal zero somewhere? 'cause MSVC will literally crash on that, even if the operand types are floating point. ::: moz.build @@ +51,5 @@ > ] > > DIRS += [ > 'config/external/zlib', > + 'config/external/fdlibm', Alphabetize. (Wouldn't surprise me if mach required you to alphabetize these, honestly.)
Attachment #8728096 - Flags: review?(jwalden+bmo) → review+
Thank you for reviewing :) (In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #127) > ::: config/external/fdlibm/moz.build > @@ +4,5 @@ > > +# License, v. 2.0. If a copy of the MPL was not distributed with this > > +# file, You can obtain one at http://mozilla.org/MPL/2.0/. > > + > > +Library('fdlibm') > > + > > Might as well add > > with Files('**'): > BUG_COMPONENT = ('Core', 'JavaScript Engine') > > or somesuch while we're here, right? Okay, added there. > ::: modules/fdlibm/src/moz.build > @@ +25,5 @@ > > + CXXFLAGS += [ > > + '-wd4018', # signed/unsigned mismatch > > + '-wd4146', # unary minus operator applied to unsigned type > > + '-wd4305', # truncation from 'double' to 'const float' > > + '-wd4723', # potential divide by 0 > > Did someone write out division by a literal zero somewhere? 'cause MSVC > will literally crash on that, even if the operand types are floating point. It's not a literal zero, but const double initialized with 0.0. It's in e_atanh.cpp, originally e_atanh.c ( https://github.com/freebsd/freebsd/blob/master/lib/msun/src/e_atanh.c#L55 ). > static const double zero = 0.0; > > double > __ieee754_atanh(double x) > { > double t; > int32_t hx,ix; > u_int32_t lx; > EXTRACT_WORDS(hx,lx,x); > ix = hx&0x7fffffff; > if ((ix|((lx|(-lx))>>31))>0x3ff00000) /* |x|>1 */ > return (x-x)/(x-x); > if(ix==0x3ff00000) > return x/zero; I tried with following code, but it doesn't crash. #include <stdio.h> static const double zero = 0.0; int main(void) { double d = 1.0 / zero; fprintf(stderr, "%lf\n", d); return 0; } and the output is following 1.#INF00
Attachment #8706790 - Attachment is obsolete: true
Attachment #8728523 - Flags: review?(jwalden+bmo)
mostly just added fdlibm:: namespace for libm functions. also, removed MSVC specific bug workaround for sin(-0). some libm function usages are not changed because it's used in polyfills, they'll be removed in Part 6.
Attachment #8706791 - Attachment is obsolete: true
Attachment #8728525 - Flags: review?(jwalden+bmo)
also removed workaround for following: * solaris * acos(x < -1 || 1 < x) * asin(x < -1 || 1 < x) * atan(0, 0) * log(x < 0) * Windows * atan2(Infinity, Infinity) * hypot(Infinity) etc * exp(Infinity), exp(-Infinity) * OSX * ceil(-1.0 < x < 0.0) * log1p(-0) and removed polyfills: * log2 * log1p * expm1 * sqrt1pm1 * acosh * asinh * atanh * cbrt maybe I should ask review from build peer too, as I'm touching old-configure.in
Attachment #8706792 - Attachment is obsolete: true
Attachment #8728530 - Flags: review?(jwalden+bmo)
sloppy_tolerance is removed from following functions: * acosh * asinh * cbrt * cosh * log10 * sinh and reduced to 2 in following functions: * atanh * tanh
Attachment #8706793 - Attachment is obsolete: true
Attachment #8728532 - Flags: review?(jorendorff)
Replaced libm functions reference in WasmTypes.cpp with fdlibm functions
Attachment #8727064 - Attachment is obsolete: true
Attachment #8728535 - Flags: review?(luke)
Have there been any measurements of performance over stdlib on our tier 1 platforms?
not yet. will check with on local machines. btw, what could be the best way to measure the performance on many platforms?
Box2D (both Octane/Box2D and asmjs-apps-box2d-throughput) really pounds on sin/cos, so that's a good start. Also I think some of the sunspider tests do as well.
for now, here's result with 64bit js shell on OSX. SunSpider: http://bnjbvr.github.io/simplegraph/#title=SunSpider%20result%20comparison&categories=math-spectral-norm,bitops-3bit-bits-in-byte,date-format-xparb,3d-morph,controlflow-recursive,string-base64,3d-raytrace,access-nsieve,crypto-md5,bitops-nsieve-bits,access-binary-trees,bitops-bitwise-and,string-validate-input,access-fannkuch,string-fasta,regexp-dna,math-partial-sums,access-nbody,string-tagcloud,date-format-tofte,string-unpack-code,crypto-aes,crypto-sha1,bitops-bits-in-byte,math-cordic,3d-cube&values=before%201.8%201%2013.6%204.9%202.2%205.8%2012.3%202.7%204.4%203.5%202.7%202.1%208.9%205.6%206.9%2013.3%206.7%202.9%2019.2%2012.4%2023.2%2010.1%203.5%201.8%202.4%2012.9%0Aafter%201.8%201.1%2012.8%205%202.2%205.7%2014%202.8%204.2%203.5%202.6%202%208.9%205.5%206.6%2012.9%208.3%202.8%2018.4%2011.2%2022.9%209.8%203.5%201.7%202.3%2012.8 Kraken: http://bnjbvr.github.io/simplegraph/#title=Kraken%20result%20comparison&categories=audio-fft,stanford-crypto-pbkdf2,audio-beat-detection,stanford-crypto-ccm,imaging-darkroom,json-parse-financial,audio-oscillator,ai-astar,audio-dft,stanford-crypto-sha256-iterative,json-stringify-tinderbox,imaging-gaussian-blur,stanford-crypto-aes,imaging-desaturate&values=before%2051.5%2080.7%2095.1%2081.6%2087.8%2040.9%2068.3%2070.5%20123.7%2042.5%2044.6%2069.4%2055.7%2059.6%0Aafter%2050%2080.1%2095.4%2080%20144.2%2039.5%2070.1%2069.6%20121.9%2042.2%2044.6%2068.9%2055%2059.6 Octane: http://bnjbvr.github.io/simplegraph/#title=Octane%20result%20comparison&categories=Richards,DeltaBlue,Crypto,RayTrace,EarleyBoyer,RegExp,Splay,SplayLatency,NavierStokes,PdfJS,Mandreel,MandreelLatency,Gameboy,CodeLoad,Box2D,zlib,Typescript,Score%20(version%209)&values=before%2029869%2059701%2028911%20108260%2029759%203629%2018253%2019923%2035490%2013741%2029780%2028121%2049466%2016069%2050842%2076732%2028904%2029144%0Aafter%2030895%2061343%2028326%20107243%2030635%203543%2018188%2016561%2035525%2012784%2029032%2035486%2045909%2016324%2051549%2076276%2029154%2029024 there's apparently huge regression in Kraken imaging-darkroom, it calls Math.log and Math.pow, and it seems that Math.pow is slow (takes about 3x longer). microbench for Math.pow: code: let t = elapsed(); let n = 0; for (let i = 1; i < 10000000; i++) { n += Math.pow(i / 100000, 1.1); } print(elapsed() - t); result: before: 330611 [us] after: 925693 [us] There seems to be no regression on Box2D.
Comment on attachment 8728532 [details] [diff] [review] Part 7: Remove or reduce sloppy_tolerance in Math function tests. Review of attachment 8728532 [details] [diff] [review]: ----------------------------------------------------------------- r=me. Thank you for tackling this!
Attachment #8728532 - Flags: review?(jorendorff) → review+
Attached file micro benchmark
With microbenchmark (1000000 iterations with different parameters), there are several regressions, compared to OSX 64bit libm: * acos x5.4 (means, takes 5.4 times longer) * acosh x8.6 * asin x5.6 * asinh x6.2 * cbrt x1.2 * ceil x1.2 * cos depends on range * round x1.2 * hypot x19.5 * log10 x1.2 * log2 x1.2 * pow x1.9 * sin depends on range * tan depends on range * trunc x1.3 Then, cos/sin/tan show almost same performance when I calculate [-2*PI, 2*PI] range, but get slower with larger range. http://bnjbvr.github.io/simplegraph/#title=Math%20functions&categories=acos,acosh,asin,asinh,atan,atanh,atan2,cbrt,ceil,cos%20[-PI..PI],cos%20[-2*PI..2*PI],cos%20[-3*PI..3*PI],cos%20[-4*PI..4*PI],cos%20[-8*PI..8*PI],cosh,exp,expm1,floor,round,fround,hypot,log,log1p,log10,log2,pow,sin%20[-PI..PI],sin%20[-2*PI..2*PI],sin%20[-3*PI..3*PI],sin%20[-4*PI..4*PI],sin%20[-8*PI..8*PI],sinh,sqrt,tan%20[-PI..PI],tan%20[-2*PI..2*PI],tan%20[-3*PI..3*PI],tan%20[-4*PI..4*PI],tan%20[-8*PI..8*PI],tanh,trunc&values=before%2021504%2025729%2020965%2036404%2029291%2033195%2026438%2023106%209619%2024004%2024622%2024885%2025041%2025118%2014533%2018727%2021024%208981%2012031%207474%2011904%2023691%2026567%2023731%2023397%2016455%2023690%2024631%2025199%2025783%2025191%2014484%208060%2027648%2028633%2028683%2028942%2029179%2016314%2021983%0Aafter%20117048%20222331%20117166%20227412%2025817%2035425%2026106%2027823%2011587%2021646%2022000%2060248%2088397%20133203%2015055%2019129%2019152%208861%2014549%207432%20231816%2025663%2028120%2028947%2028302%2031614%2022396%2021881%2060353%2092987%20134090%2015693%208357%2030275%2030362%2072350%20101535%20147367%2014717%2028161
with 64bit js shell on debian, it shows different characteristics. SunSpider: http://bnjbvr.github.io/simplegraph/#title=SunSpider%20result%20comparison%20(debian%2064bit)&categories=math-spectral-norm,bitops-3bit-bits-in-byte,date-format-xparb,3d-morph,controlflow-recursive,string-base64,3d-raytrace,access-nsieve,crypto-md5,bitops-nsieve-bits,access-binary-trees,bitops-bitwise-and,string-validate-input,access-fannkuch,string-fasta,regexp-dna,math-partial-sums,access-nbody,string-tagcloud,date-format-tofte,string-unpack-code,crypto-aes,crypto-sha1,bitops-bits-in-byte,math-cordic,3d-cube&values=before%202.5%201.3%2015.8%205.8%202.8%206.9%2019.1%203%205.6%204.4%203.4%201.9%2011.4%206.9%208.5%2014.3%2012.7%204%2025.8%2018.7%2029%2015.2%204.5%203.7%203.3%2017.5%0Aafter%202.3%201.3%2016.5%206.3%202.6%206.9%2019.9%202.9%205.6%204.3%203.4%202%2011.4%207.1%208.4%2014.2%2024.2%203.8%2025.2%2018.9%2028.7%2016.1%204.4%203.4%203.3%2017.7 Kraken: http://bnjbvr.github.io/simplegraph/#title=Kraken%20result%20comparison%20(debian%2064bit)&categories=audio-fft,stanford-crypto-pbkdf2,audio-beat-detection,stanford-crypto-ccm,imaging-darkroom,json-parse-financial,audio-oscillator,ai-astar,audio-dft,stanford-crypto-sha256-iterative,json-stringify-tinderbox,imaging-gaussian-blur,stanford-crypto-aes,imaging-desaturate&values=before%2069.6%20115.4%20129.7%20115%20146.5%2058.6%2099.2%2093%20194.8%2058.5%2053.9%20107.9%2077.1%2095.3%0Aafter%2069.4%20116%20130.5%20115.6%20183.9%2056.7%2098.7%2092.8%20386%2058.7%2053.5%20108.2%2077.3%2095.6 Octane: http://bnjbvr.github.io/simplegraph/#title=Octane%20result%20comparison%20(debian%2064bit)&categories=Richards,DeltaBlue,Crypto,RayTrace,EarleyBoyer,RegExp,Splay,SplayLatency,NavierStokes,PdfJS,Mandreel,MandreelLatency,Gameboy,CodeLoad,Box2D,zlib,Typescript,Score%20(version%209)&values=before%2023060%2039445%2024191%2080604%2021620%202647%2013497%2018385%2033352%209747%2017280%2019936%2038457%2013533%2029687%2064068%2019693%2021737%0Aafter%2023255%2039339%2024403%2081418%2021423%202672%2013748%2018372%2033386%209811%2017603%2021988%2039630%2013340%2032324%2064007%2019324%2022076 There are big regression in SunSpider math-partial-sums and Kraken audio-dft. math-partial-sums calls Math.sin, Math.cos, and Math.pow. audio-dft calls Math.abs, Math.cos, Math.exp, Math.floor, Math.log, Math.max, Math.min, Math.pow, Math.round, Math.sin, and Math.sqrt.
Comment on attachment 8728535 [details] [diff] [review] Part 9: Use fdlibm in asm.js. Review of attachment 8728535 [details] [diff] [review]: ----------------------------------------------------------------- Thanks for measuring! I don't know how to evaluate how important the 'pow' regression is but I don't ever recall seeing it high on any asm.js profiles :) ::: js/src/asmjs/WasmTypes.cpp @@ +23,1 @@ > #include "jslibmath.h" I think the #include style guidelines don't require a \n here.
Attachment #8728535 - Flags: review?(luke) → review+
Attachment #8728523 - Flags: review?(jwalden+bmo) → review+
Attachment #8728525 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 8728530 [details] [diff] [review] Part 6: Remove unused math polyfill. Review of attachment 8728530 [details] [diff] [review]: ----------------------------------------------------------------- Okay, this almost makes it all worthwhile. :-)
Attachment #8728530 - Flags: review?(jwalden+bmo) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/ec80e2d4cccea6eaaabbd081ca7ba1116632039b Bug 933257 - Part 1: Add a script to import and update fdlibm from FreeBSD. r=jwalden https://hg.mozilla.org/integration/mozilla-inbound/rev/3025ae7a344558e29c6348d8ea2325e7858c032a Bug 933257 - Part 2: Add patches for fdlibm. r=jwalden https://hg.mozilla.org/integration/mozilla-inbound/rev/2f7c7d7ddacc51bcd6ad6a7d78fc99b80ef75724 Bug 933257 - Part 2.1: Import fdlibm from FreeBSD (revision bcea9d50b15e4f0027a5dd526e0e2a612238471e). r=jwalden https://hg.mozilla.org/integration/mozilla-inbound/rev/e4a133efcb4d0eb9d8140c88c5c9e9967f6364bf Bug 933257 - Part 3: Add build scripts for fdlibm. r=jwalden https://hg.mozilla.org/integration/mozilla-inbound/rev/c2df769e60acdd47b600f7d58e3348f63c89ea5d Bug 933257 - Part 4: Link fdlibm in SpiderMonkey. r=jwalden https://hg.mozilla.org/integration/mozilla-inbound/rev/2e20366dcaf75522d2bfa32dcb16f9cdc0255d36 Bug 933257 - Part 5: Use fdlibm in jsmath.cpp. r=jwalden https://hg.mozilla.org/integration/mozilla-inbound/rev/0bc00f6c24fd621b573e7c7da044be8edf5013e4 Bug 933257 - Part 6: Remove unused math polyfill. r=jwalden https://hg.mozilla.org/integration/mozilla-inbound/rev/96b3d993fc537220aedfe827b61cb71aef0f1571 Bug 933257 - Part 7: Remove or reduce sloppy_tolerance in Math function tests. r=jorendorff https://hg.mozilla.org/integration/mozilla-inbound/rev/69061c0ec32b220c879d9c2612c12ce6cd726d85 Bug 933257 - Part 8: Add license for k_exp.cpp to about:license. r=gerv https://hg.mozilla.org/integration/mozilla-inbound/rev/b6ca6b5832d8d353ce8e97562b94c2bf1c346dcb Bug 933257 - Part 9: Use fdlibm in asm.js. r=luke
https://hg.mozilla.org/integration/mozilla-inbound/rev/e70f926a61d43b87af3eaa5a8514550d3f6904d1 Bug 933257 - Part 1: Add a script to import and update fdlibm from FreeBSD. r=jwalden https://hg.mozilla.org/integration/mozilla-inbound/rev/dd5d65d5a57d23c658bd8dfc2f1ef6a092cbee6d Bug 933257 - Part 2: Add patches for fdlibm. r=jwalden https://hg.mozilla.org/integration/mozilla-inbound/rev/98670690ac0ce1805341efeb93117ef60bc7fcf0 Bug 933257 - Part 2.1: Import fdlibm from FreeBSD (revision bcea9d50b15e4f0027a5dd526e0e2a612238471e). r=jwalden https://hg.mozilla.org/integration/mozilla-inbound/rev/c91821911d5566d7fd01cde2fef6bf74d4325bd4 Bug 933257 - Part 3: Add build scripts for fdlibm. r=jwalden https://hg.mozilla.org/integration/mozilla-inbound/rev/083a4c3175a2b8d183159673faf39e03dc54f9f8 Bug 933257 - Part 4: Link fdlibm in SpiderMonkey. r=jwalden https://hg.mozilla.org/integration/mozilla-inbound/rev/f3b11c2b7dbff4219c77adbba9b881481d2f7044 Bug 933257 - Part 5: Use fdlibm in jsmath.cpp. r=jwalden https://hg.mozilla.org/integration/mozilla-inbound/rev/41f59bd4b8019b908899664c3dd7ee4c88bb15c6 Bug 933257 - Part 6: Remove unused math polyfill. r=jwalden https://hg.mozilla.org/integration/mozilla-inbound/rev/3e5a91d06168b7a670393754c4263904ae75e08b Bug 933257 - Part 7: Remove or reduce sloppy_tolerance in Math function tests. r=jorendorff https://hg.mozilla.org/integration/mozilla-inbound/rev/6c1130d58993ea31d2f59e98769e2c55f5937c4d Bug 933257 - Part 8: Add license for k_exp.cpp to about:license. r=gerv https://hg.mozilla.org/integration/mozilla-inbound/rev/4c1a64f8996abfcc7d32d32202c6def4d13ec5bd Bug 933257 - Part 9: Use fdlibm in asm.js. r=luke
On AWFY these patches regressed Kraken-darkroom by 80% and Sunspider-partial-sums by 26%. On massive-box2d-throughput and massive-box2d-throughput-f32 there were strange results: a big win on Linux, a small regression on Mac and a big regression on Windows 10.
Just like Guilherme Lima said. There are a few regressions. Not sure yet if it is this patch series or bug 891107, but running intermediate datapoints. https://arewefastyet.com/regressions/#/regression/1802491: mac osx shell 32bit Regression(s)/Improvement(s): - kraken: 8.6% (regression) - kraken: darkroom: 82.42% (regression) - ss: 2.37% (regression) - ss: partial-sums: 26.67% (regression) - ss: tagcloud: 3.38% (regression) https://arewefastyet.com/regressions/#/regression/1802493: mac osx browser 64bit Regression(s)/Improvement(s): - kraken: 18.79% (regression) - kraken: imaging-darkroom: 63.88% (regression) - kraken: audio-dft: 92.5% (regression) - ss: math-partial-sums: 134.57% (regression) https://arewefastyet.com/regressions/#/regression/1802494: ubuntu browser 64bit Regression(s)/Improvement(s): - kraken: 18.93% (regression) - kraken: imaging-darkroom: 38.58% (regression) - kraken: audio-dft: 104% (regression) - ss: math-partial-sums: 144.44% (regression) - massive: box2d-throughput: 7.85% (improvement) - massive: box2d-throughput-f32: 7.85% (improvement) https://arewefastyet.com/regressions/#/regression/1802495: windows browser 64bit Regression(s)/Improvement(s): - kraken: 30.29% (regression) - kraken: imaging-darkroom: 90.79% (regression) - kraken: audio-dft: 213.48% (regression) - ss: math-partial-sums: 222.99% (regression) - massive: box2d-throughput: -6.68% (regression) - massive: box2d-throughput-f32: -6.39% (regression) - unity-webgl: Particles: -34.06% (regression) - unity-webgl: 2D Physics Spheres: -25.12% (regression) - unity-webgl: 2D Physics Boxes: -18.21% (regression)
Most of those regressions were expected, as noted in comment #137-142. I didn't checked massive and uniti-webgl tho, I guess those are also same reason.
Since improving the performance of fdlibm is presumably not trivial, we probably need a policy decision on whether to take the regressions in exchange for cross-platform consistency and precision guarantees. Let's ask the module owners for their input.
Flags: needinfo?(jorendorff)
Flags: needinfo?(jdemooij)
(In reply to Emanuel Hoogeveen [:ehoogeveen] from comment #150) > Since improving the performance of fdlibm is presumably not trivial, we > probably need a policy decision on whether to take the regressions in > exchange for cross-platform consistency and precision guarantees. Let's ask > the module owners for their input. I really like having a single implementation on all platforms, but I don't think that justifies taking those regressions. According to comment 137, the Kraken darkroom slowdown is caused by Math.pow being slower. Does pow also have precision issues? Is there another, more performant version of that function that we can use? Maybe we can use fdlibm for the functions that aren't (much) slower or not used by the benchmarks?
Flags: needinfo?(jdemooij)
Given the explanation above I assume it is already known. But the auto bisection for kraken points to: http://hg.mozilla.org/integration/mozilla-inbound/log?rev=083a4c3175a2%3A%3A3e5a91d06168%20and%20!083a4c3175a2
I'll check the precision later today. would it be better backing out patches now? Then, if the system-built-in pow doesn't have precision issue, can we use it instead?
Here's the list of the difference between Math.pow(x, y) and the answer from Wolfram Alpha, in binary representation. Tested on Windows xp 32bit, OSX 10.11.3 64bit, and debian 64bit, without fdlibm. All of them show same result, so they're shows as "before" | before | fdlibm ---------------------------------------------+--------+-------- pow(0.4809212066001739, 0.558322023258141) | 0 | 0 pow(0.8821743608824347, 6.331257720452679) | 2 | 2 pow(6.411191442515885, 3.4981125660493495) | 0 | 0 pow(11.288337579613142, 3.358362478481501) | 2 | 2 pow(90.54096829785713, 0.688248246464136) | -1 | -1 pow(124.15974973472316, 8.540974861464846) | 13 | 13 pow(382.76897058501373, 5.065185749311704) | -5 | -5 pow(535.5013386889178, 2.9977459464885703) | 1 | 1 pow(763.8183712379288, 3.015882719917038) | -4 | -4 pow(19636.374466201112, 6.990627405584469) | 4 | 4 pow(25887.14044480693, 5.785090858595169) | -33 | -33 pow(136095.794384161, 0.9842642589325423) | 3 | 3 pow(498335.51693771064, 7.930294165776955) | -13 | -13 pow(0.8626494660054627, -1.1049068962096698) | 1 | 1 pow(2.6192846685052986, -2.9827761205325833) | -1 | -1 pow(8.227806546275112, -3.774726764739824) | 0 | 0 pow(21.141977698000364, -8.995204674600982) | -7 | -7 pow(70.90217710125286, -3.576919300732743) | 2 | 2 pow(170.42685477676963, -4.271779740948823) | 8 | 8 pow(589.5858703011936, -4.618620386354256) | -4 | -4 pow(1240.082761353482, -3.646799330080066) | -4 | -5 (the only difference) pow(6460.281167826614, -4.361302077072015) | 13 | 13 pow(11057.39668234477, -9.137715861624557) | 19 | 19 pow(64665.2509740512, -1.6563097994252662) | -4 | -4 pow(105388.27522927878, -6.6094646271375765) | -2 | -2 pow(360558.0399328597, -9.767023871018427) | -18 | -18 There is no improvement on precision, at least with these example inputs (chosen randomly). Unless I'm using Wolfram Alpha wrongly, both they have a little precision issue. so, maybe we should use builtin libm pow?
(In reply to Tooru Fujisawa [:arai] from comment #154) > I'll check the precision later today. > would it be better backing out patches now? > > Then, if the system-built-in pow doesn't have precision issue, can we use it > instead? That depends on how close we are to a solution. If it might take some time to discuss/fix the issue at hand, backing out would indeed be the sensible approach. Another approach would be to land a way to flip between the old and new implementation (not sure how feasible that is) and default on the old behavior for now. But in any case, having a (not approved) regression in tree makes it quite hard to follow up on other regressions which could be masked by it. And due to improvements to the engine might make it hard to know if the regression was fully fixed...
I'd prefer to revert pow() and keep the rest, if that makes the benchmarks happy.
Flags: needinfo?(jorendorff)
Here's benchmark result on osx 64bit, with: 1. before fdlibm patch (2b880c522f95) 2. after fdlibm patch (4c1a64f8996a) 3. revert pow (on 4c1a64f8996a) 4. revert pow/sqrt (on 4c1a64f8996a) 5. revert pow/sqrt/cos/sin/tan (on 4c1a64f8996a) Math.pow uses sqrt internally, and that path is used in math-partial-sums.js (Math.pow(x, -0.5)) Kraken: http://bnjbvr.github.io/simplegraph/#title=Kraken%20result%20comparison&categories=audio-fft,stanford-crypto-pbkdf2,audio-beat-detection,stanford-crypto-ccm,imaging-darkroom,json-parse-financial,audio-oscillator,ai-astar,audio-dft,stanford-crypto-sha256-iterative,json-stringify-tinderbox,imaging-gaussian-blur,stanford-crypto-aes,imaging-desaturate&values=no_fdlibm%2047.5%2075.3%2088.8%2075.3%2081.7%2038.3%2064.2%2066.1%20109.6%2040%2041.5%2064.6%2051.4%2053.6%0Aall_fdlibm%2046.4%2076.6%2088.5%2076%20134.6%2038.5%2064.2%2065.9%20110%2039.4%2039.4%2064.4%2052.4%2053.2%0Arevert_pow%2046.3%2076.2%2087.6%2075.5%2079.9%2038.3%2063.9%2065.6%20109.3%2039.2%2039.2%2064.2%2051.8%2052.8%0Arevert_pow/sqrt%2047.3%2075.5%2088.1%2075.9%2080.9%2037%2064.3%2065.5%20109.7%2039.4%2041.1%2064.2%2052%2053.3%0Arevert_pow/sqrt/cos/sin/tan%2047.4%2074.3%2087.9%2075%2080.7%2036.8%2064.4%2065.2%20109.1%2039.4%2040.9%2064%2051.1%2052.8 SunSpider: http://bnjbvr.github.io/simplegraph/#title=SunSpider%20result%20comparison&categories=math-spectral-norm,bitops-3bit-bits-in-byte,date-format-xparb,3d-morph,controlflow-recursive,string-base64,3d-raytrace,access-nsieve,crypto-md5,bitops-nsieve-bits,access-binary-trees,bitops-bitwise-and,string-validate-input,access-fannkuch,string-fasta,regexp-dna,math-partial-sums,access-nbody,string-tagcloud,date-format-tofte,string-unpack-code,crypto-aes,crypto-sha1,bitops-bits-in-byte,math-cordic,3d-cube&values=no_fdlibm%201.6%201%2012.2%204.7%202.1%205.3%2012.1%202.5%204.2%203.4%202.5%201.9%208.6%205.3%206.3%2012.4%206.2%202.7%2018.7%2010.3%2022%209.1%203.3%201.7%202.2%2012.5%0Aall_fdlibm%201.8%200.9%2011.9%204.8%202%205.5%2012.6%202.6%204.2%203.3%202.3%201.9%208.6%205.2%206.3%2012.2%208%202.7%2017.6%209.7%2021.5%209.3%203.4%201.7%202.3%2012.4%0Arevert_pow%201.8%201%2011.8%204.8%202.1%205.5%2012.3%202.4%204.3%203.3%202.4%201.8%208.6%205.2%206.2%2012.2%208%202.8%2017.3%209.7%2021.6%209.1%203.4%201.6%202.1%2012.3%0Arevert_pow/sqrt%201.7%201%2011.9%204.6%202.1%205.4%2012%202.6%204.3%203.4%202.5%201.9%208.4%205.2%206.2%2012.1%207.4%202.7%2017.7%209.6%2021.5%209.1%203.4%201.6%202.2%2012.6%0Arevert_pow/sqrt/cos/sin/tan%201.6%201%2012%204.6%202.1%205.5%2011.7%202.5%204.2%203.2%202.3%201.8%208.6%205.1%206.3%2012.2%206.2%202.5%2017.6%209.8%2022.2%209.1%203.2%201.6%202.2%2012.3 So, the regression in imaging-darkroom could be fixed by just reverting pow. also, the regression in math-partial-sums could be fixed by reverting sqrt and cos/sin/tan (cos and sin are used there) will check on linux64 shortly
Here's benchmark result on linux 64bit, with: 1. before fdlibm patch (2b880c522f95) 2. after fdlibm patch (4c1a64f8996a) 3. revert pow (on 4c1a64f8996a) 4. revert pow/sqrt (on 4c1a64f8996a) 5. revert pow/sqrt/cos/sin/tan (on 4c1a64f8996a) (same as comment #158) Kraken: http://bnjbvr.github.io/simplegraph/#title=Kraken%20result%20comparison&categories=audio-fft,stanford-crypto-pbkdf2,audio-beat-detection,stanford-crypto-ccm,imaging-darkroom,json-parse-financial,audio-oscillator,ai-astar,audio-dft,stanford-crypto-sha256-iterative,json-stringify-tinderbox,imaging-gaussian-blur,stanford-crypto-aes,imaging-desaturate&values=no_fdlibm%2068.6%20117.6%20130.1%20113.9%20146.4%2056.8%2098.3%2094.8%20193.6%2059.7%2053.1%20108%2079.6%2092.7%0Aall_fdlibm%2068.9%20117.5%20129.5%20118.1%20183.3%2056.2%2098%2093.2%20384.2%2060.9%2053.3%20107.9%2079.2%2092.4%0Arevert_pow%2068.4%20117.8%20130.9%20118.9%20146.4%2056.2%2097.7%2094%20384.6%2060.8%2053.3%20107.8%2079.2%2092.9%0Arevert_pow/sqrt%2068.6%20118.6%20127.8%20119.3%20146.6%2056.1%2098%2093.6%20384.6%2060.9%2053.3%20108.2%2079.1%2093.3%0Arevert_pow/sqrt/cos/sin/tan%2068.9%20118.5%20128.6%20119.5%20146.6%2056%2098.2%2093.7%20191.8%2061.4%2053.2%20107.9%2079.6%2094.5 SunSpider: http://bnjbvr.github.io/simplegraph/#title=SunSpider%20result%20comparison&categories=math-spectral-norm,bitops-3bit-bits-in-byte,date-format-xparb,3d-morph,controlflow-recursive,string-base64,3d-raytrace,access-nsieve,crypto-md5,bitops-nsieve-bits,access-binary-trees,bitops-bitwise-and,string-validate-input,access-fannkuch,string-fasta,regexp-dna,math-partial-sums,access-nbody,string-tagcloud,date-format-tofte,string-unpack-code,crypto-aes,crypto-sha1,bitops-bits-in-byte,math-cordic,3d-cube&values=no_fdlibm%202.4%201.4%2015.8%205.8%202.7%207%2018.4%203%205.6%204.5%203.4%202%2011.3%207.1%208.4%2014.4%2012.9%203.9%2025.8%2018.8%2029.9%2016.2%204.4%203.6%203.2%2017.8%0Aall_fdlibm%202.4%201.2%2016%206.3%202.6%206.9%2020.6%202.9%205.6%204.6%203.5%202%2011.5%206.9%208.5%2014.3%2024.1%203.8%2025.5%2018.8%2029.2%2015.6%204.5%203.7%203.1%2017.2%0Arevert_pow%202.5%201.3%2015.4%206.2%202.8%207%2019.7%203%205.7%204.4%203.5%202.1%2011.4%206.9%208.5%2014.4%2024.2%203.9%2025.4%2018.8%2029%2015.2%204.3%203.6%203.2%2017.8%0Arevert_pow/sqrt%202.3%201.3%2015.8%206.2%202.7%207%2019.1%202.9%205.8%204.5%203.7%201.9%2011.6%207%208.6%2014.3%2023.8%203.7%2025.4%2018.9%2029.3%2016%204.4%203.6%203.2%2017.5%0Arevert_pow/sqrt/cos/sin/tan%202.4%201.4%2015.5%205.8%202.7%206.9%2019.9%202.9%205.7%204.4%203.5%201.9%2011.3%207.1%208.4%2014.3%2012.4%203.7%2025.3%2018.6%2029.1%2015.4%204.6%203.6%203.3%2018.1 So, the regression in imaging-darkroom surely comes from pow, and the regression in audio-dft and math-partial-sums comes from cos/sin/tan. anyway, I'll backout all fdlibm patches for now, and re-land later, to see the actual regression numbers.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Confirmed all benchmarks scores are back to before landing.
so far, I don't see any precision improvement for cos/sin/tan/sqrt, and pow. I think we can revert all of those 5 functions.
We could revisit bug 967709 (and bug 996375) for sin/cos, especially if we could find a way to make that implementation precise (we lose precision by using the square of the input). tan might be harder, since using sin divided by cos leads to precision loss, and I don't know about sqrt.
the problem with pow and sqrt is that, they're used from other fdlibm functions internally. pow is used only for an alternative for hexadecimal floating point. sqrt is used in several functions. Now we have 2 options: (a) use fdlibm pow/sqrt for fdlibm internal use (b) use stdlib pow/sqrt for fdlibm internal use (b) might make other functions faster, and hopefully reduce the code size, but I'm not sure it's right way...
(In reply to Emanuel Hoogeveen [:ehoogeveen] from comment #163) > (we lose precision by using the square of the input) I retract this statement: see bug 967709 comment #95 if you're interested in the details.
The purpose of switchin to fdlibm are following, right? (a) to improve the precision (b) to have single implementation across platforms and the reason we should revert pow/sqrt/sin/cos/tan is their performance, especially on benchmarks. So we cannot satisfy (b) for them, and they don't show any precision difference. so (a) is not a problem for them. pow/sqrt are used in following functions internally: pow * exp(x) -- if k == 1024, where x = k*ln2 + r, |r| <= 0.5*ln2 * expm1(x) -- if k == 1024, where x = k*ln2 + r, |r| <= 0.5*ln2 sqrt * acos(x) -- if 0.5 < |x| < 1.0 * acosh(x) -- if 1 < x < 2**28 * asin(x) -- if 0.5 <= |x| < 1.0 * asinh(x) -- if 2**-28 < |x| 2**28 * hypot(x, y) -- in most case * pow(x, y) -- if y == 0.5 exp and expm1 case could be ignored, as pow is called with constant there, like: pow(2, 1023) acos and asin are used in following files in benchmarks, they didn't show any much regression: * acos * octane/mandreel.js * webglsamples/book/chrome_book_sim.js * webglsamples/book/extension/chrome_book_sim.js * webglsamples/newgame/2011/gl-matrix.js * asin * octane/box2d.js acosh, asinh, hypot are not used, and pow will be reverted anyway. So, to keep (b), I'd like to use fdlibm implementation of pow/sqrt in above functions. Here's performance comparison on OSX 64bit, with updated patches (reverting pow/sqrt/cos/sin/tan, keep using fdlibm pow/sqrt internally) Kraken: http://bnjbvr.github.io/simplegraph/#title=Kraken%20result%20comparison&categories=audio-fft,stanford-crypto-pbkdf2,audio-beat-detection,stanford-crypto-ccm,imaging-darkroom,json-parse-financial,audio-oscillator,ai-astar,audio-dft,stanford-crypto-sha256-iterative,json-stringify-tinderbox,imaging-gaussian-blur,stanford-crypto-aes,imaging-desaturate&values=before%2054%2085.9%2098.7%2084.3%2091.1%2042.5%2070.6%2073.6%20126%2044.2%2044.5%2071.4%2058.3%2061.3%0Aafter%2054.4%2087.1%20101.6%2085.9%2091.7%2042.4%2071.5%2073.8%20128.4%2044.8%2047.1%2072.6%2062.6%2063.5 SunSpider: http://bnjbvr.github.io/simplegraph/#title=SunSpider%20result%20comparison&categories=math-spectral-norm,bitops-3bit-bits-in-byte,date-format-xparb,3d-morph,controlflow-recursive,string-base64,3d-raytrace,access-nsieve,crypto-md5,bitops-nsieve-bits,access-binary-trees,bitops-bitwise-and,string-validate-input,access-fannkuch,string-fasta,regexp-dna,math-partial-sums,access-nbody,string-tagcloud,date-format-tofte,string-unpack-code,crypto-aes,crypto-sha1,bitops-bits-in-byte,math-cordic,3d-cube&values=before%201.8%201%2012.3%204.8%202.2%205.5%2012.6%202.6%204.2%203.4%202.6%201.9%208.8%205.5%206.6%2012.7%206.5%202.8%2018.7%2010.6%2022.6%209.5%203.4%201.7%202.4%2012.9%0Aafter%201.7%201%2012.3%204.6%202.2%205.6%2012%202.6%204.2%203.5%202.6%202%208.6%205.3%206.3%2012.4%206.5%202.6%2017.9%2010.6%2022.1%209.6%203.4%201.7%202.3%2012.7 Octane: http://bnjbvr.github.io/simplegraph/#title=Octane%20result%20comparison&categories=Richards,DeltaBlue,Crypto,RayTrace,EarleyBoyer,RegExp,Splay,SplayLatency,NavierStokes,PdfJS,Mandreel,MandreelLatency,Gameboy,CodeLoad,Box2D,zlib,Typescript,Score%20(version%209)&values=before%2032111%2061979%2029607%20115142%2031703%203803%2019126%2013161%2037508%2014716%2030810%2038055%2052263%2017435%2056108%2081026%2030301%2030460%0Aafter%2032104%2063989%2030140%20115956%2031660%203731%2018718%2021210%2037656%2014846%2030227%2038055%2048682%2017176%2056276%2081069%2031145%2031241
pow and sqrt are kept for internal use, so removed cos, sin, tan and rem_pio2. this implies Part 2.8 and 2.10 will be removed, as s_cos.cpp, s_sin.cpp, s_tan.cpp, and e_rem_pio2.cpp are no more in tree.
Attachment #8732379 - Flags: review?(jwalden+bmo)
Attachment #8732381 - Flags: review?(jwalden+bmo)
Attachment #8732382 - Flags: review?(luke)
Attachment #8732382 - Flags: review?(luke) → review+
Attachment #8732381 - Flags: review?(jwalden+bmo) → review+
Attachment #8732379 - Flags: review?(jwalden+bmo) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/566412e70f27f9e77030c028cdb3d47c99bfe931 Bug 933257 - Part 1: Add a script to import and update fdlibm from FreeBSD. r=jwalden https://hg.mozilla.org/integration/mozilla-inbound/rev/c60dcd46c9566722a4b11f7a9115efb9b9fb6a99 Bug 933257 - Part 2: Add patches for fdlibm. r=jwalden https://hg.mozilla.org/integration/mozilla-inbound/rev/ece72de3141e39c87ba90fc390e1438e37e91412 Bug 933257 - Part 2.1: Import fdlibm from FreeBSD (revision bcea9d50b15e4f0027a5dd526e0e2a612238471e). r=jwalden https://hg.mozilla.org/integration/mozilla-inbound/rev/e0c63d8a8940d472db1c44e1fb05c306a421bbe7 Bug 933257 - Part 3: Add build scripts for fdlibm. r=jwalden https://hg.mozilla.org/integration/mozilla-inbound/rev/a4163f930b2193f2a329a95ce9eaa7d39c8b90ea Bug 933257 - Part 4: Link fdlibm in SpiderMonkey. r=jwalden https://hg.mozilla.org/integration/mozilla-inbound/rev/f1f3e847cdb3e0a565dde92e448ce27e6f12f89e Bug 933257 - Part 5: Use fdlibm in jsmath.cpp. r=jwalden https://hg.mozilla.org/integration/mozilla-inbound/rev/b053ad6763d638384a7ae0f4d5c1238277afcea5 Bug 933257 - Part 6: Remove unused math polyfill. r=jwalden https://hg.mozilla.org/integration/mozilla-inbound/rev/18ec8a268573f55f9883c4e01398d65eb8c136d1 Bug 933257 - Part 7: Remove or reduce sloppy_tolerance in Math function tests. r=jorendorff https://hg.mozilla.org/integration/mozilla-inbound/rev/cf14555864566e4ee197ac7074ec61a5e03e4e90 Bug 933257 - Part 8: Add license for k_exp.cpp to about:license. r=gerv https://hg.mozilla.org/integration/mozilla-inbound/rev/336759fb7df024ec7fe456df8337316e8361c4d5 Bug 933257 - Part 9: Use fdlibm in asm.js. r=luke
See Also: → 1070517
Depends on: 1380031
Depends on: 1520647
No longer depends on: 1380031
Regressions: 1380031
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: