Closed
Bug 24892
Opened 25 years ago
Closed 24 years ago
gcc opt lossage confuses fdlibm
Categories
(Core :: JavaScript Engine, defect, P3)
Tracking
()
RESOLVED
FIXED
People
(Reporter: rginda, Assigned: rginda)
References
Details
(Keywords: js1.5, Whiteboard: [nsbeta3+])
Attachments
(8 files)
1.73 KB,
text/plain
|
Details | |
256 bytes,
text/plain
|
Details | |
2.74 KB,
patch
|
Details | Diff | Splinter Review | |
1.88 KB,
text/plain
|
Details | |
25.71 KB,
patch
|
Details | Diff | Splinter Review | |
37.92 KB,
patch
|
Details | Diff | Splinter Review | |
54.24 KB,
patch
|
Details | Diff | Splinter Review | |
54.00 KB,
patch
|
Details | Diff | Splinter Review |
JSRef Linux opt build: js> Math.pow (2,2) 4 js> Math.pow (2,3) 0 xpcshell: js> Math.pow (2,2) 0 js> Math.pow (2,3) Infinity js> Using egcs-2.91.66 debug builds, and opt on other unix flavors work as expected.
Assignee | ||
Comment 1•25 years ago
|
||
Line 295 in the following snip from e_pow.c seems to be the root of the trouble: 290 s = one; /* s (sign of result -ve**odd) = -1 else = 1 */ 291 if((((hx>>31)+1)|(yisint-1))==0) s = -one;/* (-ve)**(odd int) */ 292 293 /* split up y into y1+y2 and compute (y1+y2)*(t1+t2) */ 294 y1 = y; 295 __LO(y1) = 0; 296 p_l = (y-y1)*t1+y*t2; Stepping up to this point in debug builds yields: (gdb) p y $2 = 3 (gdb) p y1 $3 = 3 While opt builds: (gdb) p y $7 = 3 (gdb) p y1 $8 = -1 The code list above looks like this in debug builds: 0x80b9365 <__ieee754_pow+2709>: fldl 0x80ca7c8 0x80b936b <__ieee754_pow+2715>: fstpl 0xffffffa8(%ebp) 0x80b936e <__ieee754_pow+2718>: movl 0xffffff70(%ebp),%ecx 0x80b9374 <__ieee754_pow+2724>: sarl $0x1f,%ecx 0x80b9377 <__ieee754_pow+2727>: movl %ecx,0xffffff30(%ebp) 0x80b937d <__ieee754_pow+2733>: movl 0xffffff30(%ebp),%edx 0x80b9383 <__ieee754_pow+2739>: incl %edx 0x80b9384 <__ieee754_pow+2740>: movl 0xffffff78(%ebp),%eax 0x80b938a <__ieee754_pow+2746>: decl %eax 0x80b938b <__ieee754_pow+2747>: movl %eax,0xffffff30(%ebp) 0x80b9391 <__ieee754_pow+2753>: orl 0xffffff30(%ebp),%edx 0x80b9397 <__ieee754_pow+2759>: testl %edx,%edx 0x80b9399 <__ieee754_pow+2761>: jne 0x80b93a6 <__ieee754_pow+2774> 0x80b939b <__ieee754_pow+2763>: fldl 0x80ca7c8 0x80b93a1 <__ieee754_pow+2769>: fchs 0x80b93a3 <__ieee754_pow+2771>: fstpl 0xffffffa8(%ebp) 0x80b93a6 <__ieee754_pow+2774>: fldl 0x10(%ebp) 0x80b93a9 <__ieee754_pow+2777>: fstpl 0xffffffc8(%ebp) 0x80b93ac <__ieee754_pow+2780>: movl $0x0,0xffffffc8(%ebp) and this in opt: 0x80848b0 <__ieee754_pow+1232>: movl $0x0,0xffffffd0(%ebp) 0x80848b7 <__ieee754_pow+1239>: movl $0x3ff00000,0xffffffd4(%ebp) 0x80848be <__ieee754_pow+1246>: movl 0xffffffc8(%ebp),%eax 0x80848c1 <__ieee754_pow+1249>: sarl $0x1f,%eax 0x80848c4 <__ieee754_pow+1252>: incl %eax 0x80848c5 <__ieee754_pow+1253>: movl 0xffffffcc(%ebp),%edx 0x80848c8 <__ieee754_pow+1256>: decl %edx 0x80848c9 <__ieee754_pow+1257>: orl %edx,%eax 0x80848cb <__ieee754_pow+1259>: jne 0x80848db <__ieee754_pow+1275> 0x80848cd <__ieee754_pow+1261>: movl $0x0,0xffffffd0(%ebp) 0x80848d4 <__ieee754_pow+1268>: movl $0xbff00000,0xffffffd4(%ebp) 0x80848db <__ieee754_pow+1275>: fldl 0x10(%ebp) 0x80848de <__ieee754_pow+1278>: fstp %st(0) 0x80848e0 <__ieee754_pow+1280>: xorl %eax,%eax 0x80848e2 <__ieee754_pow+1282>: movl %eax,0xffffffac(%ebp) 0x80848e5 <__ieee754_pow+1285>: movl %edx,0xffffffb0(%ebp) 0x80848e8 <__ieee754_pow+1288>: fldl 0xffffffac(%ebp) 0x80848eb <__ieee754_pow+1291>: fldl 0x10(%ebp) 0x80848ee <__ieee754_pow+1294>: fsub %st(1),%st 0x80848f0 <__ieee754_pow+1296>: fmul %st(3),%st 0x80848f2 <__ieee754_pow+1298>: fxch %st(2) 0x80848f4 <__ieee754_pow+1300>: fmull 0x10(%ebp) 0x80848f7 <__ieee754_pow+1303>: faddp %st,%st(2) 0x80848f9 <__ieee754_pow+1305>: fxch %st(1) 0x80848fb <__ieee754_pow+1307>: fstl 0xffffffd8(%ebp) 0x80848fe <__ieee754_pow+1310>: fxch %st(1)
Assignee | ||
Comment 2•25 years ago
|
||
Upgrading to gcc 2.95.2 does not fix the problem.
Assignee | ||
Comment 3•25 years ago
|
||
Assignee | ||
Comment 4•25 years ago
|
||
The attached testcase reduces the failure to just a few lines of c. Running it, I get: [rginda@swamp gcc_opt] gcc --version 2.95.2 [rginda@swamp gcc_opt] gcc -g -Wall -Wno-format main.c [rginda@swamp gcc_opt] ./a.out y1 is 3.000000e+00 [rginda@swamp gcc_opt] gcc -g -O -Wall -Wno-format main.c [rginda@swamp gcc_opt] ./a.out y1 is 4.002151e+00 bam@dimensional.com built this on a rh5.2 system, running gcc 2.7.2.3, and reports getting 3 for both builds. rogerl can workaround this issue by rearranging some code, but other uses of __LO and __HI in the fdlibm code may show up as problems in the future.
Assignee | ||
Comment 5•25 years ago
|
||
Checked in a workaround which includes a test (see attached) for affected gcc version in the linux config makefile, and a conditional 'volatile' specifier on affected variables.
Assignee | ||
Comment 6•25 years ago
|
||
cls, can you add a check for this to the mozilla build setup? Defining GCC_OPT_BUG triggers the workaround.
Summary: Math.pow returns incorrect result in linux/opt builds → gcc opt lossage confuses fdlibm
Assignee | ||
Comment 7•25 years ago
|
||
I'm working on the configure.in check now. But I'm curious as to exactly what the problem is and why it appears to be fixed if we volatile is used. Is the gcc devel team aware of the problem? We should change the ifdef from GCC_OPT_BUG to something a bit more descriptive. I'm scouring the gcc info pages and the best I can see is that the automatic variables can possibly be clobbered if a longjmp is performed. Also, the optimized build appears to do the right thing if we use -fvolatile instead of declaring each variable volatile. `-fvolatile' Consider all memory references through pointers to be volatile.
Assignee | ||
Comment 9•25 years ago
|
||
The potential for the error only shows in a finite number of places in fdlibm, and of those, only 4 affect JavaScript ( http://lxr.mozilla.org/seamonkey/search?string=GCC_OPT_BUG .) I'd argue against -fvolatile because it would bite more than just the affected areas. I've posted a note to n.p.m.unix and n.p.m.jseng, requesting that someone close to the gcc team speak up. If no one answers, I'll go directly to the gcc bug mailing list myself. The problem is in the line (*(int *)&foo) = 0; Where foo is a double. The assembly this code generates shows that 0 is being assigned to the low half of an uninitialized register. Maybe this code fools gcc into thinking foo is an int, and that foo=0 will (completely) initialize it, so gcc doesn't have to (but that's just a guess.) Marking the variable as volatile tells the compiler that 'this variable may be changed by outside forces', which in our case isn't the truth, but the extra caution used around a volatile cures the symptom. Also, if you're using the testcase attached to the bug as a basis for the config check, you may want to worry about the rare chance that a compiler which exhibits this behavior will give the correct result. Because the register isn't initialized before the assignment, anything could happen.
Comment 10•25 years ago
|
||
Assignee | ||
Comment 11•25 years ago
|
||
According to the responses to the news posting, this is a documented change in the gcc optimizer, and has affected more code than fdlibm (surprise, surprise :) So the Right Thing seems to be to revert the volatile hack, and go with the -fno-strict-aliasing option. I'll fix up the jsref build and attach a patch asap. > From: olaf@bigred.inka.de (Olaf Titz) > > For gcc since 2.9something this needs -fno-strict-aliasing > (and yes, it is not valid ANSI C, but since these constructs are so > common the gcc people were forced to add a bug-compatiblity switch :-) > See also <URL:http://gcc.gnu.org/fom_serv/cache/24.html> > The correct way for this kind of access would be a union. > > Olaf
Comment 12•25 years ago
|
||
Umm.. -fno-strict-aliasing doesn't fix this particular problem though. cls@driftwood:src> gcc -g -W -fno-strict-aliasing main.c -o main cls@driftwood:src> ./main y1 is 3.000000e+00 cls@driftwood:src> gcc -O3 -W -fno-strict-aliasing main.c -o main cls@driftwood:src> ./main y1 is 3.993263e+00
Assignee | ||
Comment 13•25 years ago
|
||
argh.
Assignee | ||
Comment 14•25 years ago
|
||
The next response on .unix was more creative, but equally innefective: #define noalias(type, ptr) \ (((union { type __x__; __typeof__(*(ptr)) __y__;} *)(ptr))->__x__) #define __LO(x) noalias(int, &x) As neat as the noalias macro looks, it fails just the same with or without it. So, the right fix is to define any variable used in __LO or __HI as: typedef union { struct { int32 lo, hi; } ints; double d; } twoints; and adjust the macros to something like: #define __LO(x) x.ints.lo #define __HI(x) x.ints.hi The prevoius patch only covers the routines we use in js on *linux*, because my first assumption was a compiler bug. Because any compiler might flake out on this code in the future, I'd say that any fdlibm function used by js should be fixed. Potentially the entire library.
Assignee | ||
Comment 15•25 years ago
|
||
mozilla/js/src/jslibmath.h hooks up fdlibm functions to js. Any "extern double fd_*" line indicates that the function is used by js on that os.
Assignee | ||
Comment 16•25 years ago
|
||
Comment 17•25 years ago
|
||
The testcase looks good except that I don't think removing the existing API (the __LO & __HI macros) is going to go over well. Here's what I'm currently working on. typedef union { #ifdef __LITTLE_ENDIAN struct { int32 lo, hi; } ints; #else struct { int32 hi, lo; } ints; #endif double d; } fd_twoints; #define __HI(x) x.ints.hi #define __LO(x) x.ints.lo #define __HIp(x) &(x.ints.hi) #define __LOp(x) &(x.ints.lo) Using this method, there are only a couple of places in each fdlibm src file that need to be changed and they are mainly assignments to & from the union variable.
Comment 18•25 years ago
|
||
Ok, here's a preliminary patch that fixes xpcshell for me. Things to note: X86_LINUX isn't defined in the Mozilla build. I think this was what was really causing xpcshell to give the wrong answers in a debug build. The macros, __LOp & __HIp, aren't defined anymore as I couldn't come up with a quick fix for them. The Mozilla build doesn't use those macros but I suspect the code that does will have to be rewritten. I only changed the files needed to compile Mozilla not jsref (I couldn't get it to build).
Comment 19•25 years ago
|
||
Comment 20•25 years ago
|
||
Just for the record, changing the macros to #define __LO(x) ((volatile int*)&x)[0] #define __HI(x) ((volatile int*)&x)[1] also works.
Comment 21•25 years ago
|
||
The volatile workaround works but it's just reinforcing the use of invalid ANSI C code. Has anyone had a chance to verify that the union fix works on non-gcc compilers?
Assignee | ||
Comment 22•25 years ago
|
||
I'm running the test suite on the patch now, will have results soon.
Comment 23•25 years ago
|
||
If it is invalid. So far there's just GCC's opinion on the matter.
Assignee | ||
Comment 24•25 years ago
|
||
cls: your patches are working, but there are more files to be patched. I've updated for solaris and AIX, working on windows. updated comming diff soon.
Assignee | ||
Comment 25•25 years ago
|
||
We build the entire fdlibm on windows, from what I can tell. leaving no choice but to fix the whole thing. I've fixed a few more files in this updated patch, but theres more to be done. cls, if you feel like fixing more files, be my guest :)
Assignee | ||
Comment 26•25 years ago
|
||
Comment 27•25 years ago
|
||
What's left beyond the entire fdlibm? Now that I have the build instructions for JSRef, I've been building it, patching files as I go. I hit the only uses of __LOp & __HIp in s_modf.c and I'm wondering how you got around them.
Assignee | ||
Comment 28•25 years ago
|
||
On unixish systems, it looks like we only build the files we need, which apparently don't include any uses of __(LO|HI)p macros. I havn't gotten a windows build to finish yet. grepping for __LO and __HI might be a faster way than trial and error,for finding trouble spots.
Comment 29•25 years ago
|
||
Comment 30•25 years ago
|
||
Actually, after a grep check, it turns out that the last patch handles all occurances of __LO, __HI, __LOp & __HIp. Rginda, any success on windows & Mac?
Comment 31•25 years ago
|
||
Assignee | ||
Comment 32•25 years ago
|
||
#ifdef XP_MAC # include <types.h> #else # ifdef _WIN32 # define in32_t long # else # include <sys/types.h> # endif #endif After adding the #define for _WIN32, the patch passed all tests on win32. The mac results aren't in yet. Because of it's size, this patch will have to wait until after the JS1.5 release (Tuesday) to be checked in.
Comment 33•24 years ago
|
||
What's the status on this? Rginda mentioned that there were possible problems under win32 but then said they were not due to the patch.
Comment 34•24 years ago
|
||
Yet another request for a status check.
Assignee | ||
Comment 35•24 years ago
|
||
The problems are not due to the patch, and have been resolved. You can check this in if you'd like.
Comment 36•24 years ago
|
||
This is a huge patch, has it been thoroughly reviewed by all interested parties? If so, I say check the sucker in for beta3.
Whiteboard: [nsbeta3+]
Comment 37•24 years ago
|
||
Over to Rob for checkin shepherding
Assignee: rogerl → rginda
Status: ASSIGNED → NEW
Assignee | ||
Comment 38•24 years ago
|
||
So, the "volatile" fix worked for the standalone js shell in both opt and debug builds, but apparently is f[l]ailing in xpcshell and mozilla. I've just rerun the js test suite (with cls' union patch) for standalone js and xpcshell and all seems to be fine. I still want to try xpcshell opt, but should have this checked in tomorrow or so.
Status: NEW → ASSIGNED
Comment 39•24 years ago
|
||
Has someone at Sun possibly come across these same fdlibm bugs and fixed them? Cc'ing clayton. /be
Assignee | ||
Comment 40•24 years ago
|
||
Just checked int the union based patch.
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
Comment 41•24 years ago
|
||
*** Bug 53424 has been marked as a duplicate of this bug. ***
Comment 42•24 years ago
|
||
*** Bug 59963 has been marked as a duplicate of this bug. ***
You need to log in
before you can comment on or make changes to this bug.
Description
•