Closed Bug 24892 Opened 25 years ago Closed 25 years ago

gcc opt lossage confuses fdlibm

Tracking

()

Status:

RESOLVED FIXED

People

(Reporter: rginda, Assigned: rginda)

References

Details

(Keywords: js1.5, Whiteboard: [nsbeta3+])

Attachments

(8 files)

C testcase showing the failure 25 years ago Robert Ginda 1.73 KB, text/plain		Details
make test for affected gcc version (thanks cls) 25 years ago Robert Ginda 256 bytes, text/plain		Details
Patch adds check to configure using C runtime testcase and s/GCC_OPT_BUG/USE_VOLATILE_KEYWORD/g 25 years ago cls 2.74 KB, patch		Details \| Diff \| Splinter Review
Testcase showing what I believe is the Right Way to do this 25 years ago Robert Ginda 1.88 KB, text/plain		Details
Preliminary fix using union 25 years ago cls 25.71 KB, patch		Details \| Diff \| Splinter Review
more fixes 25 years ago Robert Ginda 37.92 KB, patch		Details \| Diff \| Splinter Review
Applies the union fix to enough of fdlibm to build JSRef & Mozilla 25 years ago cls 54.24 KB, patch		Details \| Diff \| Splinter Review
s/int32_t/int/g 25 years ago cls 54.00 KB, patch		Details \| Diff \| Splinter Review

Robert Ginda

Assignee

Description

•

25 years ago

JSRef Linux opt build: js> Math.pow (2,2) 4 js> Math.pow (2,3) 0 xpcshell: js> Math.pow (2,2) 0 js> Math.pow (2,3) Infinity js> Using egcs-2.91.66 debug builds, and opt on other unix flavors work as expected.

Robert Ginda

Assignee

Comment 1

•

25 years ago

Line 295 in the following snip from e_pow.c seems to be the root of the trouble: 290 s = one; /* s (sign of result -ve**odd) = -1 else = 1 */ 291 if((((hx>>31)+1)|(yisint-1))==0) s = -one;/* (-ve)**(odd int) */ 292 293 /* split up y into y1+y2 and compute (y1+y2)*(t1+t2) */ 294 y1 = y; 295 __LO(y1) = 0; 296 p_l = (y-y1)*t1+y*t2; Stepping up to this point in debug builds yields: (gdb) p y $2 = 3 (gdb) p y1 $3 = 3 While opt builds: (gdb) p y $7 = 3 (gdb) p y1 $8 = -1 The code list above looks like this in debug builds: 0x80b9365 <__ieee754_pow+2709>: fldl 0x80ca7c8 0x80b936b <__ieee754_pow+2715>: fstpl 0xffffffa8(%ebp) 0x80b936e <__ieee754_pow+2718>: movl 0xffffff70(%ebp),%ecx 0x80b9374 <__ieee754_pow+2724>: sarl $0x1f,%ecx 0x80b9377 <__ieee754_pow+2727>: movl %ecx,0xffffff30(%ebp) 0x80b937d <__ieee754_pow+2733>: movl 0xffffff30(%ebp),%edx 0x80b9383 <__ieee754_pow+2739>: incl %edx 0x80b9384 <__ieee754_pow+2740>: movl 0xffffff78(%ebp),%eax 0x80b938a <__ieee754_pow+2746>: decl %eax 0x80b938b <__ieee754_pow+2747>: movl %eax,0xffffff30(%ebp) 0x80b9391 <__ieee754_pow+2753>: orl 0xffffff30(%ebp),%edx 0x80b9397 <__ieee754_pow+2759>: testl %edx,%edx 0x80b9399 <__ieee754_pow+2761>: jne 0x80b93a6 <__ieee754_pow+2774> 0x80b939b <__ieee754_pow+2763>: fldl 0x80ca7c8 0x80b93a1 <__ieee754_pow+2769>: fchs 0x80b93a3 <__ieee754_pow+2771>: fstpl 0xffffffa8(%ebp) 0x80b93a6 <__ieee754_pow+2774>: fldl 0x10(%ebp) 0x80b93a9 <__ieee754_pow+2777>: fstpl 0xffffffc8(%ebp) 0x80b93ac <__ieee754_pow+2780>: movl $0x0,0xffffffc8(%ebp) and this in opt: 0x80848b0 <__ieee754_pow+1232>: movl $0x0,0xffffffd0(%ebp) 0x80848b7 <__ieee754_pow+1239>: movl $0x3ff00000,0xffffffd4(%ebp) 0x80848be <__ieee754_pow+1246>: movl 0xffffffc8(%ebp),%eax 0x80848c1 <__ieee754_pow+1249>: sarl $0x1f,%eax 0x80848c4 <__ieee754_pow+1252>: incl %eax 0x80848c5 <__ieee754_pow+1253>: movl 0xffffffcc(%ebp),%edx 0x80848c8 <__ieee754_pow+1256>: decl %edx 0x80848c9 <__ieee754_pow+1257>: orl %edx,%eax 0x80848cb <__ieee754_pow+1259>: jne 0x80848db <__ieee754_pow+1275> 0x80848cd <__ieee754_pow+1261>: movl $0x0,0xffffffd0(%ebp) 0x80848d4 <__ieee754_pow+1268>: movl $0xbff00000,0xffffffd4(%ebp) 0x80848db <__ieee754_pow+1275>: fldl 0x10(%ebp) 0x80848de <__ieee754_pow+1278>: fstp %st(0) 0x80848e0 <__ieee754_pow+1280>: xorl %eax,%eax 0x80848e2 <__ieee754_pow+1282>: movl %eax,0xffffffac(%ebp) 0x80848e5 <__ieee754_pow+1285>: movl %edx,0xffffffb0(%ebp) 0x80848e8 <__ieee754_pow+1288>: fldl 0xffffffac(%ebp) 0x80848eb <__ieee754_pow+1291>: fldl 0x10(%ebp) 0x80848ee <__ieee754_pow+1294>: fsub %st(1),%st 0x80848f0 <__ieee754_pow+1296>: fmul %st(3),%st 0x80848f2 <__ieee754_pow+1298>: fxch %st(2) 0x80848f4 <__ieee754_pow+1300>: fmull 0x10(%ebp) 0x80848f7 <__ieee754_pow+1303>: faddp %st,%st(2) 0x80848f9 <__ieee754_pow+1305>: fxch %st(1) 0x80848fb <__ieee754_pow+1307>: fstl 0xffffffd8(%ebp) 0x80848fe <__ieee754_pow+1310>: fxch %st(1)

Robert Ginda

Assignee

Updated

•

25 years ago

Keywords: js1.5

Robert Ginda

Assignee

Comment 2

•

25 years ago

Upgrading to gcc 2.95.2 does not fix the problem.

Robert Ginda

Assignee

Comment 3

•

25 years ago

Attached file C testcase showing the failure — Details

Robert Ginda

Assignee

Comment 4

•

25 years ago

The attached testcase reduces the failure to just a few lines of c. Running it, I get: [rginda@swamp gcc_opt] gcc --version 2.95.2 [rginda@swamp gcc_opt] gcc -g -Wall -Wno-format main.c [rginda@swamp gcc_opt] ./a.out y1 is 3.000000e+00 [rginda@swamp gcc_opt] gcc -g -O -Wall -Wno-format main.c [rginda@swamp gcc_opt] ./a.out y1 is 4.002151e+00 bam@dimensional.com built this on a rh5.2 system, running gcc 2.7.2.3, and reports getting 3 for both builds. rogerl can workaround this issue by rearranging some code, but other uses of __LO and __HI in the fdlibm code may show up as problems in the future.

Robert Ginda

Assignee

Comment 5

•

25 years ago

Checked in a workaround which includes a test (see attached) for affected gcc version in the linux config makefile, and a conditional 'volatile' specifier on affected variables.

Robert Ginda

Assignee

Comment 6

•

25 years ago

cls, can you add a check for this to the mozilla build setup? Defining GCC_OPT_BUG triggers the workaround.

Summary: Math.pow returns incorrect result in linux/opt builds → gcc opt lossage confuses fdlibm

Robert Ginda

Assignee

Comment 7

•

25 years ago

Attached file make test for affected gcc version (thanks cls) — Details

cls

Comment 8

•

25 years ago

I'm working on the configure.in check now. But I'm curious as to exactly what the problem is and why it appears to be fixed if we volatile is used. Is the gcc devel team aware of the problem? We should change the ifdef from GCC_OPT_BUG to something a bit more descriptive. I'm scouring the gcc info pages and the best I can see is that the automatic variables can possibly be clobbered if a longjmp is performed. Also, the optimized build appears to do the right thing if we use -fvolatile instead of declaring each variable volatile. `-fvolatile' Consider all memory references through pointers to be volatile.

Robert Ginda

Assignee

Comment 9

•

25 years ago

The potential for the error only shows in a finite number of places in fdlibm, and of those, only 4 affect JavaScript ( http://lxr.mozilla.org/seamonkey/search?string=GCC_OPT_BUG .) I'd argue against -fvolatile because it would bite more than just the affected areas. I've posted a note to n.p.m.unix and n.p.m.jseng, requesting that someone close to the gcc team speak up. If no one answers, I'll go directly to the gcc bug mailing list myself. The problem is in the line (*(int *)&foo) = 0; Where foo is a double. The assembly this code generates shows that 0 is being assigned to the low half of an uninitialized register. Maybe this code fools gcc into thinking foo is an int, and that foo=0 will (completely) initialize it, so gcc doesn't have to (but that's just a guess.) Marking the variable as volatile tells the compiler that 'this variable may be changed by outside forces', which in our case isn't the truth, but the extra caution used around a volatile cures the symptom. Also, if you're using the testcase attached to the bug as a basis for the config check, you may want to worry about the rare chance that a compiler which exhibits this behavior will give the correct result. Because the register isn't initialized before the assignment, anything could happen.

cls

Comment 10

•

25 years ago

Attached patch Patch adds check to configure using C runtime testcase and s/GCC_OPT_BUG/USE_VOLATILE_KEYWORD/g — Details — Splinter Review

Robert Ginda

Assignee

Comment 11

•

25 years ago

According to the responses to the news posting, this is a documented change in the gcc optimizer, and has affected more code than fdlibm (surprise, surprise :) So the Right Thing seems to be to revert the volatile hack, and go with the -fno-strict-aliasing option. I'll fix up the jsref build and attach a patch asap. > From: olaf@bigred.inka.de (Olaf Titz) > > For gcc since 2.9something this needs -fno-strict-aliasing > (and yes, it is not valid ANSI C, but since these constructs are so > common the gcc people were forced to add a bug-compatiblity switch :-) > See also <URL:http://gcc.gnu.org/fom_serv/cache/24.html> > The correct way for this kind of access would be a union. > > Olaf

cls

Comment 12

•

25 years ago

Umm.. -fno-strict-aliasing doesn't fix this particular problem though. cls@driftwood:src> gcc -g -W -fno-strict-aliasing main.c -o main cls@driftwood:src> ./main y1 is 3.000000e+00 cls@driftwood:src> gcc -O3 -W -fno-strict-aliasing main.c -o main cls@driftwood:src> ./main y1 is 3.993263e+00

Robert Ginda

Assignee

Comment 13

•

25 years ago

argh.

Robert Ginda

Assignee

Comment 14

•

25 years ago

The next response on .unix was more creative, but equally innefective: #define noalias(type, ptr) \ (((union { type __x__; __typeof__(*(ptr)) __y__;} *)(ptr))->__x__) #define __LO(x) noalias(int, &x) As neat as the noalias macro looks, it fails just the same with or without it. So, the right fix is to define any variable used in __LO or __HI as: typedef union { struct { int32 lo, hi; } ints; double d; } twoints; and adjust the macros to something like: #define __LO(x) x.ints.lo #define __HI(x) x.ints.hi The prevoius patch only covers the routines we use in js on *linux*, because my first assumption was a compiler bug. Because any compiler might flake out on this code in the future, I'd say that any fdlibm function used by js should be fixed. Potentially the entire library.

Robert Ginda

Assignee

Comment 15

•

25 years ago

mozilla/js/src/jslibmath.h hooks up fdlibm functions to js. Any "extern double fd_*" line indicates that the function is used by js on that os.

Robert Ginda

Assignee

Comment 16

•

25 years ago

Attached file Testcase showing what I believe is the Right Way to do this — Details

cls

Comment 17

•

25 years ago

The testcase looks good except that I don't think removing the existing API (the __LO & __HI macros) is going to go over well. Here's what I'm currently working on. typedef union { #ifdef __LITTLE_ENDIAN struct { int32 lo, hi; } ints; #else struct { int32 hi, lo; } ints; #endif double d; } fd_twoints; #define __HI(x) x.ints.hi #define __LO(x) x.ints.lo #define __HIp(x) &(x.ints.hi) #define __LOp(x) &(x.ints.lo) Using this method, there are only a couple of places in each fdlibm src file that need to be changed and they are mainly assignments to & from the union variable.

cls

Comment 18

•

25 years ago

Ok, here's a preliminary patch that fixes xpcshell for me. Things to note: X86_LINUX isn't defined in the Mozilla build. I think this was what was really causing xpcshell to give the wrong answers in a debug build. The macros, __LOp & __HIp, aren't defined anymore as I couldn't come up with a quick fix for them. The Mozilla build doesn't use those macros but I suspect the code that does will have to be rewritten. I only changed the files needed to compile Mozilla not jsref (I couldn't get it to build).

cls

Comment 19

•

25 years ago

Attached patch Preliminary fix using union — Details — Splinter Review

tenthumbs

Comment 20

•

25 years ago

Just for the record, changing the macros to #define __LO(x) ((volatile int*)&x)[0] #define __HI(x) ((volatile int*)&x)[1] also works.

cls

Comment 21

•

25 years ago

The volatile workaround works but it's just reinforcing the use of invalid ANSI C code. Has anyone had a chance to verify that the union fix works on non-gcc compilers?

Robert Ginda

Assignee

Comment 22

•

25 years ago

I'm running the test suite on the patch now, will have results soon.

tenthumbs

Comment 23

•

25 years ago

If it is invalid. So far there's just GCC's opinion on the matter.

Robert Ginda

Assignee

Comment 24

•

25 years ago

cls: your patches are working, but there are more files to be patched. I've updated for solaris and AIX, working on windows. updated comming diff soon.

Robert Ginda

Assignee

Comment 25

•

25 years ago

We build the entire fdlibm on windows, from what I can tell. leaving no choice but to fix the whole thing. I've fixed a few more files in this updated patch, but theres more to be done. cls, if you feel like fixing more files, be my guest :)

Robert Ginda

Assignee

Comment 26

•

25 years ago

Attached patch more fixes — Details — Splinter Review

cls

Comment 27

•

25 years ago

What's left beyond the entire fdlibm? Now that I have the build instructions for JSRef, I've been building it, patching files as I go. I hit the only uses of __LOp & __HIp in s_modf.c and I'm wondering how you got around them.

Robert Ginda

Assignee

Comment 28

•

25 years ago

On unixish systems, it looks like we only build the files we need, which apparently don't include any uses of __(LO|HI)p macros. I havn't gotten a windows build to finish yet. grepping for __LO and __HI might be a faster way than trial and error,for finding trouble spots.

cls

Comment 29

•

25 years ago

Attached patch Applies the union fix to enough of fdlibm to build JSRef & Mozilla — Details — Splinter Review

cls

Comment 30

•

25 years ago

Actually, after a grep check, it turns out that the last patch handles all occurances of __LO, __HI, __LOp & __HIp. Rginda, any success on windows & Mac?

cls

Comment 31

•

25 years ago

Attached patch s/int32_t/int/g — Details — Splinter Review

Robert Ginda

Assignee

Comment 32

•

25 years ago

#ifdef XP_MAC # include <types.h> #else # ifdef _WIN32 # define in32_t long # else # include <sys/types.h> # endif #endif After adding the #define for _WIN32, the patch passed all tests on win32. The mac results aren't in yet. Because of it's size, this patch will have to wait until after the JS1.5 release (Tuesday) to be checked in.

cls

Comment 33

•

25 years ago

What's the status on this? Rginda mentioned that there were possible problems under win32 but then said they were not due to the patch.

cls

Comment 34

•

25 years ago

Yet another request for a status check.

Robert Ginda

Assignee

Comment 35

•

25 years ago

The problems are not due to the patch, and have been resolved. You can check this in if you'd like.

rogerl (gone)

Updated

•

25 years ago

Status: NEW → ASSIGNED

Keywords: nsbeta3

Patrick C. Beard

Comment 36

•

25 years ago

This is a huge patch, has it been thoroughly reviewed by all interested parties? If so, I say check the sucker in for beta3.

Whiteboard: [nsbeta3+]

rogerl (gone)

Comment 37

•

25 years ago

Over to Rob for checkin shepherding

Assignee: rogerl → rginda

Status: ASSIGNED → NEW

Robert Ginda

Assignee

Comment 38

•

25 years ago

So, the "volatile" fix worked for the standalone js shell in both opt and debug builds, but apparently is f[l]ailing in xpcshell and mozilla. I've just rerun the js test suite (with cls' union patch) for standalone js and xpcshell and all seems to be fine. I still want to try xpcshell opt, but should have this checked in tomorrow or so.

Status: NEW → ASSIGNED

Brendan Eich [:brendan]

Comment 39

•

25 years ago

Has someone at Sun possibly come across these same fdlibm bugs and fixed them? Cc'ing clayton. /be

Robert Ginda

Assignee

Comment 40

•

25 years ago

Just checked int the union based patch.

Status: ASSIGNED → RESOLVED

Closed: 25 years ago

Resolution: --- → FIXED

rogerl (gone)

Comment 41

•

25 years ago

*** Bug 53424 has been marked as a duplicate of this bug. ***

Braden

Comment 42

•

25 years ago

*** Bug 59963 has been marked as a duplicate of this bug. ***

You need to log in before you can comment on or make changes to this bug.