Open Bug 172470 Opened 23 years ago Updated 3 years ago

The nblayer test hangs on 4-CPU Red Hat Linux 6.2 machine

Categories

(NSPR :: NSPR, defect)

4.2.1
x86
Linux
defect

Tracking

(Not tracked)

People

(Reporter: wtc, Unassigned)

Details

Attachments

(1 file)

NSPR version: 4.2.2

The nblayer test sometimes hangs on Red Hat Linux 6.2. The test machine is washer, which has four CPUs. The thread stacks at the time of the hang are:

Beginning layered test
Ending layered test
Thu Oct 3 15:13:51 PDT 2002
Beginning non-layered test
Ending non-layered test
Beginning layered test
Ending layered test
Thu Oct 3 15:13:51 PDT 2002
Beginning non-layered test
Ending non-layered test
Beginning layered test
Ending layered test
Thu Oct 3 15:13:51 PDT 2002
Beginning non-layered test
Ending non-layered test
Beginning layered test
Ending layered test
Thu Oct 3 15:13:51 PDT 2002
Beginning non-layered test
Ending non-layered test
Beginning layered test
... hang

washer[svbld]:/u/svbld> ps -fu svbld
UID        PID  PPID  C STIME TTY          TIME CMD
svbld     3967  3966  0 15:08 pts/1    00:00:00 -tcsh
svbld     4064  3967  0 15:12 pts/1    00:00:02 bash
svbld    19964  4064  0 15:13 pts/1    00:00:00 nblayer
svbld    19966 19964  0 15:13 pts/1    00:00:00 nblayer
svbld    19969 19966  0 15:13 pts/1    00:00:00 nblayer
svbld    19983 19982  1 15:27 pts/0    00:00:00 -tcsh
svbld    20034 19983  0 15:27 pts/0    00:00:00 ps -fu svbld
washer[svbld]:/u/svbld> gdb
washer[svbld]:/u/svbld> cd /share/builds/mccrel3/nspr/nspr42/builds/20021002.1/hornet_Linux2.2/linux22_opt/pr/tests/
washer[svbld]:/share/builds/mccrel3/nspr/nspr42/builds/20021002.1/hornet_Linux2.2/linux22_opt/pr/tests> setenv LD_LIBRARY_PATH /share/builds/mccrel3/nspr/nspr42/builds/20021002.1/hornet_Linux2.2/linux22_opt//dist/lib
washer[svbld]:/share/builds/mccrel3/nspr/nspr42/builds/20021002.1/hornet_Linux2.2/linux22_opt/pr/tests> gdb nblayer 19964
GNU gdb 19991004
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux"...

/share/builds/mccrel3/nspr/nspr42/builds/20021002.1/hornet_Linux2.2/linux22_opt/pr/tests/19964: No such file or directory.
Attaching to program: /share/builds/mccrel3/nspr/nspr42/builds/20021002.1/hornet_Linux2.2/linux22_opt/pr/tests/nblayer, Pid 19964
Reading symbols from /share/builds/mccrel3/nspr/nspr42/builds/20021002.1/hornet_Linux2.2/linux22_opt/pr/tests/../../dist/lib/libplc4.so...done.
Reading symbols from /share/builds/mccrel3/nspr/nspr42/builds/20021002.1/hornet_Linux2.2/linux22_opt/pr/tests/../../dist/lib/libnspr4.so...done.
Reading symbols from /lib/libpthread.so.0...done.
Reading symbols from /lib/libc.so.6...done.
Reading symbols from /lib/libdl.so.2...done.
Reading symbols from /lib/ld-linux.so.2...done.
0x40080deb in __sigsuspend (set=0xbffff328) at ../sysdeps/unix/sysv/linux/sigsuspend.c:48
48      ../sysdeps/unix/sysv/linux/sigsuspend.c: No such file or directory.
(gdb)
(gdb) bt
#0  0x40080deb in __sigsuspend (set=0xbffff328) at ../sysdeps/unix/sysv/linux/sigsuspend.c:48
#1  0x40055c82 in __pthread_wait_for_restart_signal (self=0x4005d940) at pthread.c:785
#2  0x400535ce in pthread_join (thread_id=3074, thread_return=0xbffff484) at restart.h:26
#3  0x4003ba1c in PR_JoinThread () from /share/builds/mccrel3/nspr/nspr42/builds/20021002.1/hornet_Linux2.2/linux22_opt/pr/tests/../../dist/lib/libnspr4.so
#4  0x8049e5b in main ()
#5  0x4007a9cb in __libc_start_main (main=0x80499c4 <main>, argc=1, argv=0xbffff734, init=0x8048a44 <_init>, fini=0x8049eec <_fini>, rtld_fini=0x4000ae60 <_dl_fini>, stack_end=0xbffff72c) at ../sysdeps/generic/libc-start.c:92
(gdb)
(gdb) thread 3
[Switching to thread 3 (Thread 19969)]
#0  0x4010df50 in __poll (fds=0xbf7ffa48, nfds=1, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:45
45      ../sysdeps/unix/sysv/linux/poll.c: No such file or directory.
(gdb) info threads
  3 Thread 19969  0x4010df50 in __poll (fds=0xbf7ffa48, nfds=1, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:45
* 2 Thread 19964 (initial thread)  0x40080deb in __sigsuspend (
    set=0xbffff328) at ../sysdeps/unix/sysv/linux/sigsuspend.c:48
  1 Thread 19966 (manager thread)  0x4010df50 in __poll (fds=0x804e770, nfds=1, timeout=2000) at ../sysdeps/unix/sysv/linux/poll.c:45
(gdb) bt
#0  0x4010df50 in __poll (fds=0xbf7ffa48, nfds=1, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:45
#1  0x4003a31a in _pr_poll_with_poll () from /share/builds/mccrel3/nspr/nspr42/builds/20021002.1/hornet_Linux2.2/linux22_opt/pr/tests/../../dist/lib/libnspr4.so
#2  0x4003a4de in PR_Poll () from /share/builds/mccrel3/nspr/nspr42/builds/20021002.1/hornet_Linux2.2/linux22_opt/pr/tests/../../dist/lib/libnspr4.so
#3  0x8049260 in Server ()
#4  0x4003b49e in _pt_root () from /share/builds/mccrel3/nspr/nspr42/builds/20021002.1/hornet_Linux2.2/linux22_opt/pr/tests/../../dist/lib/libnspr4.so
#5  0x40053b85 in pthread_start_thread (arg=0xbf7ffe40) at manager.c:241
(gdb)
(gdb) thread 1
[Switching to thread 1 (Thread 19966 (manager thread))]
#0  0x4010df50 in __poll (fds=0x804e770, nfds=1, timeout=2000) at ../sysdeps/unix/sysv/linux/poll.c:45
45      in ../sysdeps/unix/sysv/linux/poll.c
(gdb) bt
#0  0x4010df50 in __poll (fds=0x804e770, nfds=1, timeout=2000) at ../sysdeps/unix/sysv/linux/poll.c:45
#1  0x40053915 in __pthread_manager (arg=0x5) at manager.c:128
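For readers unfamiliar with the poll() arguments in the thread 3 backtrace above: timeout=-1 means "wait indefinitely", so that thread returns only when its single descriptor becomes ready. The following is a minimal sketch in plain POSIX C, not taken from NSPR or from the nblayer test; the helper name poll_empty_pipe is hypothetical and the pipe merely stands in for a socket on which no event ever arrives. It illustrates the timeout semantics only and does not diagnose the cause of this hang.

```c
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper for illustration (not part of NSPR or nblayer).
 * Polls the read end of a freshly created, empty pipe for readability.
 * Returns poll()'s result: 0 on timeout, >0 if ready, -1 on error.
 */
int poll_empty_pipe(int timeout_ms)
{
    int fds[2];
    struct pollfd pfd;
    int rv;

    if (pipe(fds) != 0)
        return -1;

    pfd.fd = fds[0];        /* read end: nothing is ever written to it */
    pfd.events = POLLIN;
    pfd.revents = 0;

    /*
     * With a finite timeout this returns 0 once the timeout expires.
     * With timeout_ms == -1 it would block forever, because the
     * descriptor never becomes readable.
     */
    rv = poll(&pfd, 1, timeout_ms);

    close(fds[0]);
    close(fds[1]);
    return rv;
}
```

Calling poll_empty_pipe(50) returns 0 after roughly 50 ms, whereas poll_empty_pipe(-1) would not return at all. The manager thread's timeout=2000, by contrast, wakes up every two seconds regardless of descriptor activity.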
The closest configurations I can find at Netscape are: 1. A uniprocessor Red Hat Linux 6.2 box 2. A 4-CPU Red Hat Linux 7.1 box 3. A dual-processor Red Hat Linux Advanced Server 2.1AS box. I will run the nblayer test repeatedly on these machines and see if I can reproduce the hang.
The stack trace is different. This should be filed as a new bug.
I haven't been able to reproduce the same hang of nblayer here. If you can reproduce the hang with a debug build and get the stack traces of the threads, that'll give me more debug information (file name and line numbers). There is no rush; do this when you have time. Thanks.
QA Contact: wtchang → nspr

The bug assignee didn't login in Bugzilla in the last 7 months, so the assignee is being reset.

Assignee: wtc → nobody
Severity: normal → S3