Closed Bug 1694 Opened 26 years ago Closed 26 years ago

SunOS 5.6, Gcc 2.8.1, Latest pitches in Threads

Categories

(NSPR :: NSPR, defect, P2)

Sun
Solaris
defect

Tracking

(Not tracked)

CLOSED INVALID

People

(Reporter: igb, Assigned: wtc)

Details

Building from a cvs snapshot, I consistently get the following problem:

mungo:/u/igb/mozilla/mozilla/obj-sparc-sun-solaris2.6/dist/bin 08:10:49 (543)
$ ./xpviewer
Segmentation Fault (core dumped)

Using GDB to look at the entrails, I see:

#0  0xef118bac in _hashKeyCompare (key1=
Cannot access memory at address 0xef7fffec.
) at ../../../xpcom/src/nsHashtable.cpp:31
31      static PR_CALLBACK PRIntn _hashKeyCompare(const void *key1, const void
*key2) {
(gdb) bt
#0  0xef118bac in _hashKeyCompare (key1=
Cannot access memory at address 0xef7fffec.
) at ../../../xpcom/src/nsHashtable.cpp:31
Cannot access memory at address 0xef7fff74.

which to me, not currently doing development for a living, says the stack
has got trampled.  So I run it under GDB control, and for as long as I
leave the break-point at prulock:207 I can do `cont' and it keeps going.

Delete the break point and it pitches.

ian

Program received signal SIGSEGV, Segmentation fault.
0xee612754 in PR_Lock (lock=0xa6940) at prulock.c:208
208         PRThread *me = _PR_MD_CURRENT_THREAD();
Current language:  auto; currently c

which isn't the same thing, but
(gdb) bt
#0  0xee612754 in PR_Lock (lock=0xa6940) at prulock.c:208
Cannot access memory at address 0xef7fffa4.

The stack's still broken.  So, guessing that the problem's happening
around that area, I set a breakpoint and re-run:
(gdb) break 207
Breakpoint 1 at 0xee612728: file prulock.c, line 207.
(gdb) run
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program:
/u/igb/mozilla/mozilla/obj-sparc-sun-solaris2.6/dist/bin/./xpviewer
warning: Unable to find dynamic linker breakpoint function.
warning: GDB will be unable to debug shared library initializers
warning: and track explicitly loaded dynamic code.
Cannot insert breakpoint 1:
Temporarily disabling shared library breakpoints:
1

Breakpoint 1, PR_Lock (lock=0xc8220) at prulock.c:207
207     {
(gdb) bt
#0  PR_Lock (lock=0xc8220) at prulock.c:207
#1  0xef11c030 in nsRepository::FindFactory (aClass=@0x83f90,
    aFactory=0xeffff304) at ../../../xpcom/src/nsRepository.cpp:281
#2  0xef11cb80 in nsRepository::RegisterFactory (aClass=@0x83f90,
    aLibrary=0x83cc0 "libwidgetgtk.so", aReplace=0, aPersist=0)
    at ../../../xpcom/src/nsRepository.cpp:474
#3  0x2663c in NS_SetupRegistry ()
    at ../../../../xpfe/xpviewer/src/nsSetupRegistry.cpp:140
#4  0x3b01c in nsViewerApp::SetupRegistry (this=0xaa880)
    at ../../../../xpfe/xpviewer/src/nsViewerApp.cpp:151
#5  0x3b0dc in nsViewerApp::Initialize (this=0xaa880, argc=1, argv=0xeffff554)
    at ../../../../xpfe/xpviewer/src/nsViewerApp.cpp:169
#6  0x3aa64 in main (argc=1, argv=0xeffff554)
    at ../../../../xpfe/xpviewer/src/nsBrowserMain.cpp:104
(gdb) step
PR_Lock (lock=0xc8240) at prulock.c:208
208         PRThread *me = _PR_MD_CURRENT_THREAD();
(gdb) step
213         PR_ASSERT(me != suspendAllThread);
(gdb) step
215         PR_ASSERT(!(me->flags & _PR_IDLE_THREAD));
(gdb) step
225             _PR_INTSOFF(is);
(gdb) step
227         PR_ASSERT(_PR_IS_NATIVE_THREAD(me) || _PR_MD_GET_INTSOFF() != 0);
(gdb) step
230         if (lock->owner == 0) {
(gdb) step
232             lock->owner = me;
(gdb) step
233             lock->priority = me->priority;
(gdb) step
235             PR_APPEND_LINK(&lock->links, &me->lockList);
(gdb) step
238                     _PR_FAST_INTSON(is);
(gdb) step
239             return;
(gdb) step
307     }
(gdb) step
PR_EnterMonitor (mon=0xc8220) at prmon.c:80
80                      mon->entryCount = 1;
(gdb) step
82      }
(gdb) step
nsRepository::FindFactory (aClass=@0x83f90, aFactory=0xeffff304)
    at ../../../xpcom/src/nsRepository.cpp:283
283       IDKey key(aClass);
Current language:  auto; currently c++
(gdb) step
284       FactoryEntry *entry = (FactoryEntry*) factories->Get(&key);
(gdb) step
286       nsresult res = NS_ERROR_FACTORY_NOT_REGISTERED;
(gdb) step
298       PR_ExitMonitor(monitor);
(gdb) cont
Status: NEW → ASSIGNED
People have reported infinite recursion problems on Linux/x86.
I myself got stack overflow on Digital Unix V4.0D, which could
be caused by infinite recursion.  Now that you also saw a
corrupted stack, I think it's likely to be caused by the same
infinite recursion.

In short, problems elsewhere caused an infinite recursion,
which resulted in a crash in NSPR functions.  This is my
theory.

You can try the following.  If it works, then we know it
is due to the same infinite recursion problem (from Mike Shaver):
  if you want to test your port, update widget/ with the datestamp of
  "1998-11-24 02:00", and all should work again.  The nsBaseWidget
  changes since then have tripped some resize bugs in the GTK code,
  which case [sic] this stack death.

By "widget/", he's referring to the directory mozilla/widget.
I think there's more to it than the fix you suggest.  Simply checking out
mozilla/widget at the date quoted breaks stuff in layout/events/src.  An example
follows.  I think NS_KEY_PRESS and the associated
mKeyListener->KeyPress(*aDOMEvent) call have been added to, for example,
layout/events/src/nsEventListenerManager.cpp since the widget changes were
made.

ian

make[3]: Entering directory
`/u/igb/mozilla/mozilla/obj-sparc-sun-solaris2.6/layout/events/src'
/usr/local/gcc-2.8.1/bin/g++ -o nsEventListenerManager.o -c -DXP_UNIX  -g
-fPIC  -DUSE_AUTOCONF=1 -DMOZILLA_CLIENT=1 -DBROKEN_QSORT=1 -DSTDC_HEADERS=1
-DHAVE_ST_BLKSIZE=1 -DHAVE_ST_RDEV=1 -DHAVE_TZNAME=1 -DHAVE_DIRENT_H=1
-DSTDC_HEADERS=1 -DHAVE_SYS_WAIT_H=1 -DTIME_WITH_SYS_TIME=1 -DHAVE_FCNTL_H=1
-DHAVE_LIMITS_H=1 -DHAVE_MALLOC_H=1 -DHAVE_STRINGS_H=1 -DHAVE_UNISTD_H=1
-DHAVE_SYS_FILE_H=1 -DHAVE_SYS_IOCTL_H=1 -DHAVE_SYS_TIME_H=1
-DHAVE_SYS_CDEFS_H=1 -DHAVE_LIBC=1 -DHAVE_LIBM=1 -DHAVE_LIBDL=1
-DHAVE_LIBRESOLV=1 -DHAVE_LIBSOCKET=1 -DHAVE_LIBNSL=1 -DHAVE_LIBELF=1
-DHAVE_LIBINTL=1 -DHAVE_LIBPOSIX4=1 -DHAVE_LIBW=1 -DHAVE_LIBL=1
-DHAVE_ALLOCA_H=1 -DHAVE_ALLOCA=1 -DHAVE_UNISTD_H=1 -DHAVE_GETPAGESIZE=1
-DHAVE_MMAP=1 -DRETSIGTYPE=void -DHAVE_STRCOLL=1 -DHAVE_STRFTIME=1
-DHAVE_UTIME_NULL=1 -DHAVE_VPRINTF=1 -DHAVE_FTIME=1 -DHAVE_GETCWD=1
-DHAVE_GETHOSTNAME=1 -DHAVE_GETWD=1 -DHAVE_MKDIR=1 -DHAVE_MKTIME=1
-DHAVE_PUTENV=1 -DHAVE_RMDIR=1 -DHAVE_SELECT=1 -DHAVE_SOCKET=1 -DHAVE_STRCSPN=1
-DHAVE_STRDUP=1 -DHAVE_STRERROR=1 -DHAVE_STRSPN=1 -DHAVE_STRSTR=1
-DHAVE_STRTOL=1 -DHAVE_STRTOUL=1 -DHAVE_UNAME=1 -DHAVE_QSORT=1 -DHAVE_SNPRINTF=1
-DHAVE_WAITID=1 -DHAVE_FORK1=1 -DHAVE_REMAINDER=1 -DHAVE_LCHOWN=1
-DHAVE_GETTIMEOFDAY=1 -DGETTIMEOFDAY_TWO_ARGS=1 -DHAVE_IOS_BINARY=1
-DHAVE_IOS_BIN=1  -D_IMPL_NS_HTML -UDEBUG -DNDEBUG -DTRIMMED -DNETSCAPE
-DOSTYPE=\"SunOS5\" -DMOZILLA_CLIENT -DLAYERS -DUNIX_EMBED -DX_PLUGINS
-DJS_THREADSAFE -DUNIX_ASYNC_DNS -DSTANDALONE_IMAGE_LIB -DMODULAR_NETLIB
-DMOZ_USER_DIR=\".mozilla\"  -I../../../dist/./include -I../../../dist/include
-I../../../../include -I/u/igb/mozilla/build/include
-I../../../dist/./public/jpeg -I../../../dist/./public/png
-I../../../dist/./public/zlib  -I../../../dist/public/dom
-I../../../../layout/events/src/../../html/base/src  -I/usr/openwin/include
../../../../layout/events/src/nsEventListenerManager.cpp
../../../../layout/events/src/nsEventListenerManager.cpp: In method `unsigned
int nsEventListenerManager::HandleEvent(class nsIPresContext &, struct nsEvent
*, class nsIDOMEvent **, enum nsEventStatus &)':
../../../../layout/events/src/nsEventListenerManager.cpp:361: `NS_KEY_PRESS'
undeclared (first use this function)
../../../../layout/events/src/nsEventListenerManager.cpp:361: (Each undeclared
identifier is reported only once
../../../../layout/events/src/nsEventListenerManager.cpp:361: for each function
it appears in.)
make[3]: *** [nsEventListenerManager.o] Error 1
I haven't tried the suggested workaround myself.
Sorry about that.

I think I will wait until people fix the infinite
recursion problem.  You can monitor the newsgroup
netscape.public.mozilla.unix for the current status,
especially the thread titled "crashing on startup"
started by Mike Shaver.
Hi, are you still getting this bug?

My Netscape colleague Chris McAfee
reported that he ran into three compiler
bugs when building SeaMonkey with gcc 2.8.1
on Solaris 2.6.  So this is another
possible cause.  Can you revert to
gcc 2.7.* or switch to egcs?
As wtc says, 2.8.1 isn't ready for primetime and people
think that 2.7.* or egcs is the way to go right now.
Status: ASSIGNED → RESOLVED
Closed: 26 years ago
Resolution: --- → INVALID
Status: RESOLVED → CLOSED
I am going to mark this bug as INVALID because
gcc 2.8.1 is not ready for prime time.
Closed the bug.
NSPR now has its own Bugzilla product.  Moving this bug to the NSPR product.