Closed Bug 24892 Opened 25 years ago Closed 24 years ago

gcc opt lossage confuses fdlibm

Categories

(Core :: JavaScript Engine, defect, P3)

x86
Linux
defect

Tracking

()

RESOLVED FIXED

People

(Reporter: rginda, Assigned: rginda)

References

Details

(Keywords: js1.5, Whiteboard: [nsbeta3+])

Attachments

(8 files)

JSRef Linux opt build:
js> Math.pow (2,2)
4
js> Math.pow (2,3)
0

xpcshell:
js> Math.pow (2,2)
0
js> Math.pow (2,3)
Infinity
js> 

Using egcs-2.91.66
debug builds, and opt on other unix flavors work as expected.
Line 295 in the following snip from e_pow.c seems to be the root of the trouble:
290             s = one; /* s (sign of result -ve**odd) = -1 else = 1 */
291             if((((hx>>31)+1)|(yisint-1))==0) s = -one;/* (-ve)**(odd int) */
292
293         /* split up y into y1+y2 and compute (y1+y2)*(t1+t2) */
294             y1  = y;
295             __LO(y1) = 0;
296             p_l = (y-y1)*t1+y*t2;

Stepping up to this point in debug builds yields:
(gdb) p y
$2 = 3
(gdb) p y1
$3 = 3
While opt builds:
(gdb) p y
$7 = 3
(gdb) p y1
$8 = -1

The code list above looks like this in debug builds:
0x80b9365 <__ieee754_pow+2709>: fldl   0x80ca7c8
0x80b936b <__ieee754_pow+2715>: fstpl  0xffffffa8(%ebp)
0x80b936e <__ieee754_pow+2718>: movl   0xffffff70(%ebp),%ecx
0x80b9374 <__ieee754_pow+2724>: sarl   $0x1f,%ecx
0x80b9377 <__ieee754_pow+2727>: movl   %ecx,0xffffff30(%ebp)
0x80b937d <__ieee754_pow+2733>: movl   0xffffff30(%ebp),%edx
0x80b9383 <__ieee754_pow+2739>: incl   %edx
0x80b9384 <__ieee754_pow+2740>: movl   0xffffff78(%ebp),%eax
0x80b938a <__ieee754_pow+2746>: decl   %eax
0x80b938b <__ieee754_pow+2747>: movl   %eax,0xffffff30(%ebp)
0x80b9391 <__ieee754_pow+2753>: orl    0xffffff30(%ebp),%edx
0x80b9397 <__ieee754_pow+2759>: testl  %edx,%edx
0x80b9399 <__ieee754_pow+2761>: jne    0x80b93a6 <__ieee754_pow+2774>
0x80b939b <__ieee754_pow+2763>: fldl   0x80ca7c8
0x80b93a1 <__ieee754_pow+2769>: fchs   
0x80b93a3 <__ieee754_pow+2771>: fstpl  0xffffffa8(%ebp)
0x80b93a6 <__ieee754_pow+2774>: fldl   0x10(%ebp)
0x80b93a9 <__ieee754_pow+2777>: fstpl  0xffffffc8(%ebp)
0x80b93ac <__ieee754_pow+2780>: movl   $0x0,0xffffffc8(%ebp)

and this in opt:
0x80848b0 <__ieee754_pow+1232>: movl   $0x0,0xffffffd0(%ebp)
0x80848b7 <__ieee754_pow+1239>: movl   $0x3ff00000,0xffffffd4(%ebp)
0x80848be <__ieee754_pow+1246>: movl   0xffffffc8(%ebp),%eax
0x80848c1 <__ieee754_pow+1249>: sarl   $0x1f,%eax
0x80848c4 <__ieee754_pow+1252>: incl   %eax
0x80848c5 <__ieee754_pow+1253>: movl   0xffffffcc(%ebp),%edx
0x80848c8 <__ieee754_pow+1256>: decl   %edx
0x80848c9 <__ieee754_pow+1257>: orl    %edx,%eax
0x80848cb <__ieee754_pow+1259>: jne    0x80848db <__ieee754_pow+1275>
0x80848cd <__ieee754_pow+1261>: movl   $0x0,0xffffffd0(%ebp)
0x80848d4 <__ieee754_pow+1268>: movl   $0xbff00000,0xffffffd4(%ebp)
0x80848db <__ieee754_pow+1275>: fldl   0x10(%ebp)
0x80848de <__ieee754_pow+1278>: fstp   %st(0)
0x80848e0 <__ieee754_pow+1280>: xorl   %eax,%eax
0x80848e2 <__ieee754_pow+1282>: movl   %eax,0xffffffac(%ebp)
0x80848e5 <__ieee754_pow+1285>: movl   %edx,0xffffffb0(%ebp)
0x80848e8 <__ieee754_pow+1288>: fldl   0xffffffac(%ebp)
0x80848eb <__ieee754_pow+1291>: fldl   0x10(%ebp)
0x80848ee <__ieee754_pow+1294>: fsub   %st(1),%st
0x80848f0 <__ieee754_pow+1296>: fmul   %st(3),%st
0x80848f2 <__ieee754_pow+1298>: fxch   %st(2)
0x80848f4 <__ieee754_pow+1300>: fmull  0x10(%ebp)
0x80848f7 <__ieee754_pow+1303>: faddp  %st,%st(2)
0x80848f9 <__ieee754_pow+1305>: fxch   %st(1)
0x80848fb <__ieee754_pow+1307>: fstl   0xffffffd8(%ebp)
0x80848fe <__ieee754_pow+1310>: fxch   %st(1)
Keywords: js1.5
Upgrading to gcc 2.95.2 does not fix the problem.
The attached testcase reduces the failure to just a few lines of c.  Running it,
I get:
[rginda@swamp gcc_opt] gcc --version
2.95.2
[rginda@swamp gcc_opt] gcc -g -Wall -Wno-format main.c
[rginda@swamp gcc_opt] ./a.out 
y1 is 3.000000e+00
[rginda@swamp gcc_opt] gcc -g -O -Wall -Wno-format main.c
[rginda@swamp gcc_opt] ./a.out 
y1 is 4.002151e+00

bam@dimensional.com built this on a rh5.2 system, running gcc 2.7.2.3, and
reports getting 3 for both builds.

rogerl can workaround this issue by rearranging some code, but other uses of
__LO and __HI in the fdlibm code may show up as problems in the future.
Checked in a workaround which includes a test (see attached) for affected gcc
version in the linux config makefile, and a conditional 'volatile' specifier on
affected variables.
cls, can you add a check for this to the mozilla build setup?
Defining GCC_OPT_BUG triggers the workaround.
Summary: Math.pow returns incorrect result in linux/opt builds → gcc opt lossage confuses fdlibm
I'm working on the configure.in check now.  But I'm curious as to exactly what
the problem is and why it appears to be fixed if we volatile is used. Is the gcc
devel team aware of the problem?  We should change the ifdef from GCC_OPT_BUG to
something a bit more descriptive.  I'm scouring the gcc info pages and the best
I can see is that the automatic variables can possibly be clobbered if a longjmp
is performed.  Also, the optimized build appears to do the right thing if we use
-fvolatile instead of declaring each variable volatile.

`-fvolatile'
     Consider all memory references through pointers to be volatile.
 
The potential for the error only shows in a finite number of places in fdlibm,
and of those, only 4 affect JavaScript (
http://lxr.mozilla.org/seamonkey/search?string=GCC_OPT_BUG .)  I'd argue against
-fvolatile because it would bite more than just the affected areas.  I've posted
a note to n.p.m.unix and n.p.m.jseng, requesting that someone close to the gcc
team speak up.  If no one answers, I'll go directly to the gcc bug mailing list
myself.

The problem is in the line
  (*(int *)&foo) = 0;

Where foo is a double.  The assembly this code generates shows that 0 is being
assigned to the low half of an uninitialized register.  Maybe this code fools
gcc into thinking foo is an int, and that foo=0 will (completely) initialize it,
so gcc doesn't have to (but that's just a guess.)  Marking the variable as
volatile tells the compiler that 'this variable may be changed by outside
forces', which in our case isn't the truth, but the extra caution used around a
volatile cures the symptom.

Also, if you're using the testcase attached to the bug as a basis for the config
check, you may want to worry about the rare chance that a compiler which
exhibits this behavior will give the correct result.  Because the register isn't
initialized before the assignment, anything could happen.
According to the responses to the news posting, this is a documented change in
the gcc optimizer, and has affected more code than fdlibm (surprise, surprise :)
So the Right Thing seems to be to revert the volatile hack, and go with the
-fno-strict-aliasing option.  I'll fix up the jsref build and attach a patch
asap.

> From: olaf@bigred.inka.de (Olaf Titz)
>
> For gcc since 2.9something this needs -fno-strict-aliasing
> (and yes, it is not valid ANSI C, but since these constructs are so
> common the gcc people were forced to add a bug-compatiblity switch :-)
> See also <URL:http://gcc.gnu.org/fom_serv/cache/24.html>
> The correct way for this kind of access would be a union.
>
> Olaf
Umm.. -fno-strict-aliasing doesn't fix this particular problem though.

cls@driftwood:src> gcc -g -W -fno-strict-aliasing main.c -o main
cls@driftwood:src> ./main
y1 is 3.000000e+00
cls@driftwood:src> gcc -O3 -W -fno-strict-aliasing main.c -o main
cls@driftwood:src> ./main
y1 is 3.993263e+00
argh.
The next response on .unix was more creative, but equally innefective:
#define noalias(type, ptr) \
 (((union { type __x__; __typeof__(*(ptr)) __y__;} *)(ptr))->__x__)
#define __LO(x) noalias(int, &x)

As neat as the noalias macro looks, it fails just the same with or without it.

So, the right fix is to define any variable used in __LO or __HI as:
typedef union 
{
    struct { int32 lo, hi; } ints;
    double d;
} twoints;

and adjust the macros to something like:
#define __LO(x) x.ints.lo
#define __HI(x) x.ints.hi

The prevoius patch only covers the routines we use in js on *linux*, because my
first assumption was a compiler bug.  Because any compiler might flake out on
this code in the future, I'd say that any fdlibm function used by js should be
fixed.  Potentially the entire library.
mozilla/js/src/jslibmath.h hooks up fdlibm functions to js.  Any "extern double
fd_*" line indicates that the function is used by js on that os.
The testcase looks good except that I don't think removing the existing API (the
__LO & __HI macros) is going to go over well.  Here's what I'm currently working
on.
typedef union {
#ifdef __LITTLE_ENDIAN
    struct { int32 lo, hi; } ints;
#else
    struct { int32 hi, lo; } ints;
#endif
    double d;
} fd_twoints;

#define __HI(x) x.ints.hi
#define __LO(x) x.ints.lo
#define __HIp(x) &(x.ints.hi)
#define __LOp(x) &(x.ints.lo)

Using this method, there are only a couple of places in each fdlibm src file
that need to be changed and they are mainly assignments to & from the union
variable.


Ok, here's a preliminary patch that fixes xpcshell for me.  
Things to note: X86_LINUX isn't defined in the Mozilla build.  I think this was
what was really causing xpcshell to give the wrong answers in a debug build. 
The macros, __LOp & __HIp, aren't defined anymore as I couldn't come up with a
quick fix for them.  The Mozilla build doesn't use those macros but I suspect
the code that does will have to be rewritten.  
I only changed the files needed to compile Mozilla not jsref (I couldn't get it
to build). 
Just for the record, changing the macros to

#define __LO(x) ((volatile int*)&x)[0]
#define __HI(x) ((volatile int*)&x)[1]

also works.
The volatile workaround works but it's just reinforcing the use of invalid ANSI
C code.  Has anyone had a chance to verify that the union fix works on non-gcc
compilers?
I'm running the test suite on the patch now, will have results soon.
If it is invalid. So far there's just GCC's opinion on the matter.
cls: your patches are working, but there are more files to be patched.  I've
updated for solaris and AIX, working on windows.  updated comming diff soon.
We build the entire fdlibm on windows, from what I can tell.  leaving no choice
but to fix the whole thing.  I've fixed a few more files in this updated patch,
but theres more to be done.  cls, if you feel like fixing more files, be my
guest :)
Attached patch more fixesSplinter Review
What's left beyond the entire fdlibm?  Now that I have the build instructions
for JSRef, I've been building it, patching files as I go.  I hit the only uses
of __LOp & __HIp in s_modf.c and I'm wondering how you got around them.
On unixish systems, it looks like we only build the files we need, which
apparently don't include any uses of __(LO|HI)p macros.  I havn't gotten a
windows build to finish yet.
grepping for __LO and __HI might be a faster way than trial and error,for
finding trouble spots.
Actually, after a grep check, it turns out that the last patch handles all
occurances of __LO, __HI, __LOp & __HIp.  Rginda, any success on windows & Mac?
Attached patch s/int32_t/int/gSplinter Review
#ifdef XP_MAC
# include <types.h>
#else
# ifdef _WIN32
#  define in32_t long
# else
#  include <sys/types.h>
# endif
#endif

After adding the #define for _WIN32, the patch passed all tests on win32.  The
mac results aren't in yet.  Because of it's size, this patch will have to wait
until after the JS1.5 release (Tuesday) to be checked in.
What's the status on this?  Rginda mentioned that there were possible problems
under win32 but then said they were not due to the patch.
Yet another request for a status check.
The problems are not due to the patch, and have been resolved.  You can check
this in if you'd like.
Status: NEW → ASSIGNED
Keywords: nsbeta3
This is a huge patch, has it been thoroughly reviewed by all interested parties? 
If so, I say check the sucker in for beta3.
Whiteboard: [nsbeta3+]
Over to Rob for checkin shepherding
Assignee: rogerl → rginda
Status: ASSIGNED → NEW
So, the "volatile" fix worked for the standalone js shell in both opt and debug
builds, but apparently is f[l]ailing in xpcshell and mozilla.  I've just rerun
the js test suite (with cls' union patch) for standalone js and xpcshell and all
seems to be fine.  I still want to try xpcshell opt, but should have this
checked in tomorrow or so.
Status: NEW → ASSIGNED
Has someone at Sun possibly come across these same fdlibm bugs and fixed them? 
Cc'ing clayton.

/be
Just checked int the union based patch.
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
*** Bug 53424 has been marked as a duplicate of this bug. ***
*** Bug 59963 has been marked as a duplicate of this bug. ***
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: