Closed
Bug 24892
Opened 25 years ago
Closed 25 years ago
gcc opt lossage confuses fdlibm
Categories
(Core :: JavaScript Engine, defect, P3)
Tracking
()
RESOLVED
FIXED
People
(Reporter: rginda, Assigned: rginda)
References
Details
(Keywords: js1.5, Whiteboard: [nsbeta3+])
Attachments
(8 files)
1.73 KB,
text/plain
|
Details | |
256 bytes,
text/plain
|
Details | |
2.74 KB,
patch
|
Details | Diff | Splinter Review | |
1.88 KB,
text/plain
|
Details | |
25.71 KB,
patch
|
Details | Diff | Splinter Review | |
37.92 KB,
patch
|
Details | Diff | Splinter Review | |
54.24 KB,
patch
|
Details | Diff | Splinter Review | |
54.00 KB,
patch
|
Details | Diff | Splinter Review |
JSRef Linux opt build:
js> Math.pow (2,2)
4
js> Math.pow (2,3)
0
xpcshell:
js> Math.pow (2,2)
0
js> Math.pow (2,3)
Infinity
js>
Using egcs-2.91.66
debug builds, and opt on other unix flavors work as expected.
Assignee | ||
Comment 1•25 years ago
|
||
Line 295 in the following snip from e_pow.c seems to be the root of the trouble:
290 s = one; /* s (sign of result -ve**odd) = -1 else = 1 */
291 if((((hx>>31)+1)|(yisint-1))==0) s = -one;/* (-ve)**(odd int) */
292
293 /* split up y into y1+y2 and compute (y1+y2)*(t1+t2) */
294 y1 = y;
295 __LO(y1) = 0;
296 p_l = (y-y1)*t1+y*t2;
Stepping up to this point in debug builds yields:
(gdb) p y
$2 = 3
(gdb) p y1
$3 = 3
While opt builds:
(gdb) p y
$7 = 3
(gdb) p y1
$8 = -1
The code list above looks like this in debug builds:
0x80b9365 <__ieee754_pow+2709>: fldl 0x80ca7c8
0x80b936b <__ieee754_pow+2715>: fstpl 0xffffffa8(%ebp)
0x80b936e <__ieee754_pow+2718>: movl 0xffffff70(%ebp),%ecx
0x80b9374 <__ieee754_pow+2724>: sarl $0x1f,%ecx
0x80b9377 <__ieee754_pow+2727>: movl %ecx,0xffffff30(%ebp)
0x80b937d <__ieee754_pow+2733>: movl 0xffffff30(%ebp),%edx
0x80b9383 <__ieee754_pow+2739>: incl %edx
0x80b9384 <__ieee754_pow+2740>: movl 0xffffff78(%ebp),%eax
0x80b938a <__ieee754_pow+2746>: decl %eax
0x80b938b <__ieee754_pow+2747>: movl %eax,0xffffff30(%ebp)
0x80b9391 <__ieee754_pow+2753>: orl 0xffffff30(%ebp),%edx
0x80b9397 <__ieee754_pow+2759>: testl %edx,%edx
0x80b9399 <__ieee754_pow+2761>: jne 0x80b93a6 <__ieee754_pow+2774>
0x80b939b <__ieee754_pow+2763>: fldl 0x80ca7c8
0x80b93a1 <__ieee754_pow+2769>: fchs
0x80b93a3 <__ieee754_pow+2771>: fstpl 0xffffffa8(%ebp)
0x80b93a6 <__ieee754_pow+2774>: fldl 0x10(%ebp)
0x80b93a9 <__ieee754_pow+2777>: fstpl 0xffffffc8(%ebp)
0x80b93ac <__ieee754_pow+2780>: movl $0x0,0xffffffc8(%ebp)
and this in opt:
0x80848b0 <__ieee754_pow+1232>: movl $0x0,0xffffffd0(%ebp)
0x80848b7 <__ieee754_pow+1239>: movl $0x3ff00000,0xffffffd4(%ebp)
0x80848be <__ieee754_pow+1246>: movl 0xffffffc8(%ebp),%eax
0x80848c1 <__ieee754_pow+1249>: sarl $0x1f,%eax
0x80848c4 <__ieee754_pow+1252>: incl %eax
0x80848c5 <__ieee754_pow+1253>: movl 0xffffffcc(%ebp),%edx
0x80848c8 <__ieee754_pow+1256>: decl %edx
0x80848c9 <__ieee754_pow+1257>: orl %edx,%eax
0x80848cb <__ieee754_pow+1259>: jne 0x80848db <__ieee754_pow+1275>
0x80848cd <__ieee754_pow+1261>: movl $0x0,0xffffffd0(%ebp)
0x80848d4 <__ieee754_pow+1268>: movl $0xbff00000,0xffffffd4(%ebp)
0x80848db <__ieee754_pow+1275>: fldl 0x10(%ebp)
0x80848de <__ieee754_pow+1278>: fstp %st(0)
0x80848e0 <__ieee754_pow+1280>: xorl %eax,%eax
0x80848e2 <__ieee754_pow+1282>: movl %eax,0xffffffac(%ebp)
0x80848e5 <__ieee754_pow+1285>: movl %edx,0xffffffb0(%ebp)
0x80848e8 <__ieee754_pow+1288>: fldl 0xffffffac(%ebp)
0x80848eb <__ieee754_pow+1291>: fldl 0x10(%ebp)
0x80848ee <__ieee754_pow+1294>: fsub %st(1),%st
0x80848f0 <__ieee754_pow+1296>: fmul %st(3),%st
0x80848f2 <__ieee754_pow+1298>: fxch %st(2)
0x80848f4 <__ieee754_pow+1300>: fmull 0x10(%ebp)
0x80848f7 <__ieee754_pow+1303>: faddp %st,%st(2)
0x80848f9 <__ieee754_pow+1305>: fxch %st(1)
0x80848fb <__ieee754_pow+1307>: fstl 0xffffffd8(%ebp)
0x80848fe <__ieee754_pow+1310>: fxch %st(1)
Assignee | ||
Comment 2•25 years ago
|
||
Upgrading to gcc 2.95.2 does not fix the problem.
Assignee | ||
Comment 3•25 years ago
|
||
Assignee | ||
Comment 4•25 years ago
|
||
The attached testcase reduces the failure to just a few lines of c. Running it,
I get:
[rginda@swamp gcc_opt] gcc --version
2.95.2
[rginda@swamp gcc_opt] gcc -g -Wall -Wno-format main.c
[rginda@swamp gcc_opt] ./a.out
y1 is 3.000000e+00
[rginda@swamp gcc_opt] gcc -g -O -Wall -Wno-format main.c
[rginda@swamp gcc_opt] ./a.out
y1 is 4.002151e+00
bam@dimensional.com built this on a rh5.2 system, running gcc 2.7.2.3, and
reports getting 3 for both builds.
rogerl can workaround this issue by rearranging some code, but other uses of
__LO and __HI in the fdlibm code may show up as problems in the future.
Assignee | ||
Comment 5•25 years ago
|
||
Checked in a workaround which includes a test (see attached) for affected gcc
version in the linux config makefile, and a conditional 'volatile' specifier on
affected variables.
Assignee | ||
Comment 6•25 years ago
|
||
cls, can you add a check for this to the mozilla build setup?
Defining GCC_OPT_BUG triggers the workaround.
Summary: Math.pow returns incorrect result in linux/opt builds → gcc opt lossage confuses fdlibm
Assignee | ||
Comment 7•25 years ago
|
||
I'm working on the configure.in check now. But I'm curious as to exactly what
the problem is and why it appears to be fixed if we volatile is used. Is the gcc
devel team aware of the problem? We should change the ifdef from GCC_OPT_BUG to
something a bit more descriptive. I'm scouring the gcc info pages and the best
I can see is that the automatic variables can possibly be clobbered if a longjmp
is performed. Also, the optimized build appears to do the right thing if we use
-fvolatile instead of declaring each variable volatile.
`-fvolatile'
Consider all memory references through pointers to be volatile.
Assignee | ||
Comment 9•25 years ago
|
||
The potential for the error only shows in a finite number of places in fdlibm,
and of those, only 4 affect JavaScript (
http://lxr.mozilla.org/seamonkey/search?string=GCC_OPT_BUG .) I'd argue against
-fvolatile because it would bite more than just the affected areas. I've posted
a note to n.p.m.unix and n.p.m.jseng, requesting that someone close to the gcc
team speak up. If no one answers, I'll go directly to the gcc bug mailing list
myself.
The problem is in the line
(*(int *)&foo) = 0;
Where foo is a double. The assembly this code generates shows that 0 is being
assigned to the low half of an uninitialized register. Maybe this code fools
gcc into thinking foo is an int, and that foo=0 will (completely) initialize it,
so gcc doesn't have to (but that's just a guess.) Marking the variable as
volatile tells the compiler that 'this variable may be changed by outside
forces', which in our case isn't the truth, but the extra caution used around a
volatile cures the symptom.
Also, if you're using the testcase attached to the bug as a basis for the config
check, you may want to worry about the rare chance that a compiler which
exhibits this behavior will give the correct result. Because the register isn't
initialized before the assignment, anything could happen.
Comment 10•25 years ago
|
||
Assignee | ||
Comment 11•25 years ago
|
||
According to the responses to the news posting, this is a documented change in
the gcc optimizer, and has affected more code than fdlibm (surprise, surprise :)
So the Right Thing seems to be to revert the volatile hack, and go with the
-fno-strict-aliasing option. I'll fix up the jsref build and attach a patch
asap.
> From: olaf@bigred.inka.de (Olaf Titz)
>
> For gcc since 2.9something this needs -fno-strict-aliasing
> (and yes, it is not valid ANSI C, but since these constructs are so
> common the gcc people were forced to add a bug-compatiblity switch :-)
> See also <URL:http://gcc.gnu.org/fom_serv/cache/24.html>
> The correct way for this kind of access would be a union.
>
> Olaf
Comment 12•25 years ago
|
||
Umm.. -fno-strict-aliasing doesn't fix this particular problem though.
cls@driftwood:src> gcc -g -W -fno-strict-aliasing main.c -o main
cls@driftwood:src> ./main
y1 is 3.000000e+00
cls@driftwood:src> gcc -O3 -W -fno-strict-aliasing main.c -o main
cls@driftwood:src> ./main
y1 is 3.993263e+00
Assignee | ||
Comment 13•25 years ago
|
||
argh.
Assignee | ||
Comment 14•25 years ago
|
||
The next response on .unix was more creative, but equally innefective:
#define noalias(type, ptr) \
(((union { type __x__; __typeof__(*(ptr)) __y__;} *)(ptr))->__x__)
#define __LO(x) noalias(int, &x)
As neat as the noalias macro looks, it fails just the same with or without it.
So, the right fix is to define any variable used in __LO or __HI as:
typedef union
{
struct { int32 lo, hi; } ints;
double d;
} twoints;
and adjust the macros to something like:
#define __LO(x) x.ints.lo
#define __HI(x) x.ints.hi
The prevoius patch only covers the routines we use in js on *linux*, because my
first assumption was a compiler bug. Because any compiler might flake out on
this code in the future, I'd say that any fdlibm function used by js should be
fixed. Potentially the entire library.
Assignee | ||
Comment 15•25 years ago
|
||
mozilla/js/src/jslibmath.h hooks up fdlibm functions to js. Any "extern double
fd_*" line indicates that the function is used by js on that os.
Assignee | ||
Comment 16•25 years ago
|
||
Comment 17•25 years ago
|
||
The testcase looks good except that I don't think removing the existing API (the
__LO & __HI macros) is going to go over well. Here's what I'm currently working
on.
typedef union {
#ifdef __LITTLE_ENDIAN
struct { int32 lo, hi; } ints;
#else
struct { int32 hi, lo; } ints;
#endif
double d;
} fd_twoints;
#define __HI(x) x.ints.hi
#define __LO(x) x.ints.lo
#define __HIp(x) &(x.ints.hi)
#define __LOp(x) &(x.ints.lo)
Using this method, there are only a couple of places in each fdlibm src file
that need to be changed and they are mainly assignments to & from the union
variable.
Comment 18•25 years ago
|
||
Ok, here's a preliminary patch that fixes xpcshell for me.
Things to note: X86_LINUX isn't defined in the Mozilla build. I think this was
what was really causing xpcshell to give the wrong answers in a debug build.
The macros, __LOp & __HIp, aren't defined anymore as I couldn't come up with a
quick fix for them. The Mozilla build doesn't use those macros but I suspect
the code that does will have to be rewritten.
I only changed the files needed to compile Mozilla not jsref (I couldn't get it
to build).
Comment 19•25 years ago
|
||
Comment 20•25 years ago
|
||
Just for the record, changing the macros to
#define __LO(x) ((volatile int*)&x)[0]
#define __HI(x) ((volatile int*)&x)[1]
also works.
Comment 21•25 years ago
|
||
The volatile workaround works but it's just reinforcing the use of invalid ANSI
C code. Has anyone had a chance to verify that the union fix works on non-gcc
compilers?
Assignee | ||
Comment 22•25 years ago
|
||
I'm running the test suite on the patch now, will have results soon.
Comment 23•25 years ago
|
||
If it is invalid. So far there's just GCC's opinion on the matter.
Assignee | ||
Comment 24•25 years ago
|
||
cls: your patches are working, but there are more files to be patched. I've
updated for solaris and AIX, working on windows. updated comming diff soon.
Assignee | ||
Comment 25•25 years ago
|
||
We build the entire fdlibm on windows, from what I can tell. leaving no choice
but to fix the whole thing. I've fixed a few more files in this updated patch,
but theres more to be done. cls, if you feel like fixing more files, be my
guest :)
Assignee | ||
Comment 26•25 years ago
|
||
Comment 27•25 years ago
|
||
What's left beyond the entire fdlibm? Now that I have the build instructions
for JSRef, I've been building it, patching files as I go. I hit the only uses
of __LOp & __HIp in s_modf.c and I'm wondering how you got around them.
Assignee | ||
Comment 28•25 years ago
|
||
On unixish systems, it looks like we only build the files we need, which
apparently don't include any uses of __(LO|HI)p macros. I havn't gotten a
windows build to finish yet.
grepping for __LO and __HI might be a faster way than trial and error,for
finding trouble spots.
Comment 29•25 years ago
|
||
Comment 30•25 years ago
|
||
Actually, after a grep check, it turns out that the last patch handles all
occurances of __LO, __HI, __LOp & __HIp. Rginda, any success on windows & Mac?
Comment 31•25 years ago
|
||
Assignee | ||
Comment 32•25 years ago
|
||
#ifdef XP_MAC
# include <types.h>
#else
# ifdef _WIN32
# define in32_t long
# else
# include <sys/types.h>
# endif
#endif
After adding the #define for _WIN32, the patch passed all tests on win32. The
mac results aren't in yet. Because of it's size, this patch will have to wait
until after the JS1.5 release (Tuesday) to be checked in.
Comment 33•25 years ago
|
||
What's the status on this? Rginda mentioned that there were possible problems
under win32 but then said they were not due to the patch.
Comment 34•25 years ago
|
||
Yet another request for a status check.
Assignee | ||
Comment 35•25 years ago
|
||
The problems are not due to the patch, and have been resolved. You can check
this in if you'd like.
Comment 36•25 years ago
|
||
This is a huge patch, has it been thoroughly reviewed by all interested parties?
If so, I say check the sucker in for beta3.
Whiteboard: [nsbeta3+]
Comment 37•25 years ago
|
||
Over to Rob for checkin shepherding
Assignee: rogerl → rginda
Status: ASSIGNED → NEW
Assignee | ||
Comment 38•25 years ago
|
||
So, the "volatile" fix worked for the standalone js shell in both opt and debug
builds, but apparently is f[l]ailing in xpcshell and mozilla. I've just rerun
the js test suite (with cls' union patch) for standalone js and xpcshell and all
seems to be fine. I still want to try xpcshell opt, but should have this
checked in tomorrow or so.
Status: NEW → ASSIGNED
Comment 39•25 years ago
|
||
Has someone at Sun possibly come across these same fdlibm bugs and fixed them?
Cc'ing clayton.
/be
Assignee | ||
Comment 40•25 years ago
|
||
Just checked int the union based patch.
Status: ASSIGNED → RESOLVED
Closed: 25 years ago
Resolution: --- → FIXED
Comment 41•25 years ago
|
||
*** Bug 53424 has been marked as a duplicate of this bug. ***
Comment 42•25 years ago
|
||
*** Bug 59963 has been marked as a duplicate of this bug. ***
You need to log in
before you can comment on or make changes to this bug.
Description
•