Closed Bug 48976 Opened 24 years ago Closed 24 years ago

AIX Optimized build coredumps in libmozjs.so on startup

Categories

(Core :: JavaScript Engine, defect, P3)

Other
AIX
defect

VERIFIED FIXED

People

(Reporter: jdunn, Assigned: jdunn)

Attachments

(1 file)

The optimized build coredumps on startup and I have traced it to
something in libmozjs.so (I did this by compiling libmozjs.so
without -O and I no longer coredump).  There is already one
file, jsdtoa.c, in js/src that causes us problems when compiled
optimized.  I just need to track down the other file(s).

This occurs on the xlC 3.6.4 compiler.
Blocks: 18688
Status: NEW → ASSIGNED
QA Contact: pschwartau → jdunn
Target Milestone: --- → M18
Adding rginda, since he did some JS porting work on AIX and
I am wondering if he ran into an issue with an opt build?
I just built js standalone/optimized on hotaix and it starts up fine, but the
crash you're seeing is probably running some script.  Could you post a stack
trace?

Adding pschwartau, the new js QA contact.
There is no stack trace; all data is lost.
I am slowly removing -O's from the compile lines for
each file to determine which file or sets of files affect this.
Will let you know.
This patch will surely slow you down, but may help see what js was doing before
the crash...

Index: jsinterp.c
===================================================================
RCS file: /cvsroot/mozilla/js/src/jsinterp.c,v
retrieving revision 3.49
diff -u -r3.49 jsinterp.c
--- jsinterp.c  2000/07/14 05:37:40     3.49
+++ jsinterp.c  2000/08/16 00:46:17
@@ -1121,6 +1121,7 @@
     JSTryNote *tn;
     ptrdiff_t offset;
 #endif
+    int lineno, oldlineno = 0;
 
     if (cx->interpLevel == MAX_INTERP_LEVEL) {
         JS_ReportErrorNumber(cx, js_GetErrorMessage, NULL, JSMSG_OVER_RECURSED);
@@ -1175,6 +1176,15 @@
     SAVE_SP(fp);
 
     while (pc < endpc) {
+        lineno = js_PCToLineNumber(script, fp->pc);
+        if (lineno != oldlineno) {
+            oldlineno = lineno;
+            if (script->filename) {
+                fprintf(stderr, "script name %s, line number %d\n",
+                        script->filename, lineno);
+            }
+        }
+
         fp->pc = pc;
         op = (JSOp) *pc;
       do_op:
I have found that the following 'diff' works (removing the -O flag
for jsscan, jsinterp, jsgc, jsparse, and jsatom):

Index: Makefile.in
===================================================================
RCS file: /cvsroot/mozilla/js/src/Makefile.in,v
retrieving revision 3.52
diff -r3.52 Makefile.in
216c216
< ifeq (,$(filter BeOS HP-UX,$(OS_ARCH)))
---
> ifeq (,$(filter BeOS HP-UX AIX,$(OS_ARCH)))
294a295,309
>       $(CC) -o $@ -c $(filter-out -O, $(COMPILE_CFLAGS)) $<
> # if scan is compiled OPT, we coredump
> jsscan.o: jsscan.c
>       $(CC) -o $@ -c $(filter-out -O, $(COMPILE_CFLAGS)) $<
> # if interp is compiled OPT, we coredump
> jsinterp.o: jsinterp.c
>       $(CC) -o $@ -c $(filter-out -O, $(COMPILE_CFLAGS)) $<
> # if gc is compiled OPT, we coredump
> jsgc.o: jsgc.c
>       $(CC) -o $@ -c $(filter-out -O, $(COMPILE_CFLAGS)) $<
> # if parse is compiled OPT, we exit with no error
> jsparse.o: jsparse.c
>       $(CC) -o $@ -c $(filter-out -O, $(COMPILE_CFLAGS)) $<
> # if atom is compiled OPT, we 'spin'
> jsatom.o: jsatom.c
Turning off compiler optimization for five files
seems like too much.  This sounds like either
a compiler bug or a bug in the JS code (e.g., not
using the 'volatile' qualifier where it's needed).

Last year we found an NSPR bug caused by not using
the 'volatile' qualifier on a variable that may be
changed by a different thread without the protection
of a lock (intentionally).  The IBM compiler optimizes
aggressively and only reads that variable once and
optimizes away the subsequent reads.  Therefore the
generated code won't notice the change to the variable
by other threads.  I suspect that something similar may
be going on here.
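
To make the 'volatile' point above concrete, here is a minimal sketch. It is not the NSPR code: the Worker struct, the TickFn hook, and both functions are hypothetical names, and the "other thread" is simulated by a callback so the example stays single-threaded. The point is only that a volatile-qualified flag must be reloaded on every pass through the loop instead of being read once and cached.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical worker state; not taken from NSPR or the JS engine. */
typedef struct Worker {
    /* volatile: each test of this flag in the loop below is a real load,
     * so the optimizer may not read it once and reuse the cached value. */
    volatile int stop_requested;
    int iterations;
} Worker;

/* Hypothetical hook standing in for "something else" that can set the flag. */
typedef void (*TickFn)(Worker *w);

/* Spin until the flag is set or max_iterations passes have run.
 * Returns the number of passes completed. */
static int run_until_stopped(Worker *w, TickFn tick, int max_iterations)
{
    assert(w != NULL && max_iterations >= 0);
    w->iterations = 0;
    while (!w->stop_requested && w->iterations < max_iterations) {
        w->iterations++;
        if (tick != NULL)
            tick(w);
    }
    return w->iterations;
}

/* Simulated writer: requests a stop once the third pass has run. */
static void stop_after_three(Worker *w)
{
    if (w->iterations == 3)
        w->stop_requested = 1;
}
```

Note that volatile only forces the reads to happen; it gives no atomicity or ordering guarantees, so code written today would more likely use C11 atomics or a lock for a flag shared between real threads.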
HA! I think I found it!
I have tracked it to JS_ARENA_ALLOCATE!
If I compile these files with -O and put a printf after the JS_ARENA_ALLOCATE
calls I don't coredump.  Can someone PLEASE look at the JS_ARENA_ALLOCATE
macro in jsarena.h and suggest something for me to try?

Thanks!
I have tracked it to the 3.7 revision of jsarena.h, which tried to clean
it up to allow it to compile as C++.  One of the edits, on
line 116, changes the assignment
from:    p = (void *) _q;
to:    *(void **)&p = (void *) _q;

And it is this line, when compiled optimized that causes
the problem.  I have been talking to WanTeh who has the
exact same code in NSPR (in NSPR it is still p = (void *) _q;)
and we are talking to Mike McCabe and Bill Gibbons who
made the change.
cc'ing Brendan because I think he would be interested in this -
I just noticed that change (which is ugly, it requires p to be homed to stack
memory) and wondering who made it, and why.  Casting to void** like that likely
runs afoul of the C standard's memory ambiguity rules: different pointer types
must not alias the same address, and aggressive optimizers can count on the lack
of aliasing and reorder instructions.  I'm just reciting gospel, I have not
figured out exactly what the AIX optimizer might have reordered here to make a
crash occur.

If we go back to p = (void *) _q;, what do we regress?  mccabe, do you remember?
Bill's checkin log message doesn't specify.

/be
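
As an illustration of the aliasing concern above, here is a sketch of an allocation macro that assigns through the destination pointer's own type, so no store ever goes through a void** alias. Everything in it is hypothetical (ToyArena, toy_arena_alloc, TOY_ARENA_ALLOCATE_CAST); it is not the real jsarena.h macro and not necessarily the fix that landed for this bug. Passing the type as a macro argument is one way to keep the code valid as both C and C++ without the *(void **)&p form.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical toy arena; not the real JSArenaPool. */
typedef struct ToyArena {
    unsigned char *base;  /* malloc'ed backing storage */
    size_t size;          /* total bytes available */
    size_t used;          /* bytes handed out so far */
} ToyArena;

/* Returns nonzero on success. */
static int toy_arena_init(ToyArena *a, size_t size)
{
    a->base = (unsigned char *)malloc(size);
    a->size = size;
    a->used = 0;
    return a->base != NULL;
}

static void toy_arena_finish(ToyArena *a)
{
    free(a->base);
    a->base = NULL;
    a->size = a->used = 0;
}

/* Bump-allocate nb bytes, rounded up to a multiple of 8.
 * Returns NULL when the arena cannot satisfy the request. */
static void *toy_arena_alloc(ToyArena *a, size_t nb)
{
    size_t rounded;
    void *q;

    assert(a != NULL && a->base != NULL);
    if (nb > a->size)
        return NULL;
    rounded = (nb + 7u) & ~(size_t)7u;
    if (rounded > a->size - a->used)
        return NULL;
    q = a->base + a->used;
    a->used += rounded;
    return q;
}

/*
 * Problematic shape, shown only as a comment:
 *     *(void **)&p = (void *)q;
 * When p is declared as, say, int *, this stores through an lvalue of type
 * void * into an int * object, which an optimizer may assume never happens.
 *
 * Safer shape: cast the value, then assign through p's declared type.
 */
#define TOY_ARENA_ALLOCATE_CAST(p, type, arena, nb) \
    ((p) = (type)toy_arena_alloc((arena), (nb)))
```

A caller would write something like TOY_ARENA_ALLOCATE_CAST(ip, int *, &arena, sizeof(int)); the compiler then sees an ordinary assignment to ip and is free to keep ip in a register.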
Be great to get a couple of r='s -- jdunn is testing on AIX and HPUX (thanks!).

/be
r= jdunn@netscape.com
verified on hpux & aix.  
this is fixed
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
verified
Status: RESOLVED → VERIFIED
closing
Status: VERIFIED → CLOSED
Bugs are not to be marked CLOSED until Mozilla 1.0. Reopening to reresolve and 
reverify.
Status: CLOSED → REOPENED
Resolution: FIXED → ---
Resolving FIXED.
Status: REOPENED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
And verified.
Status: RESOLVED → VERIFIED