Open Bug 301987 Opened 19 years ago Updated 2 years ago

some tests fail

Categories

(NSPR :: NSPR, defect)

Other
FreeBSD
defect

Tracking

(Not tracked)

People

(Reporter: mi+mozilla, Unassigned)

Details

Attachments

(2 files)

User-Agent:       Mozilla/5.0 (compatible; Konqueror/3.4; FreeBSD; X11; amd64) KHTML/3.4.1 (like Gecko)
Build Identifier: 

The test suite appears to be largely abandoned and bit-rotten, but
pr/tests/addrstr seems to show a genuine bug:
 
./addrstr 
converted bad addr 1:2:3:4:5:6:7::8 
converted bad addr 1:2:3:4:5:6::7:8 
FAIL 
 
Don't know how best to fix it :-(

Reproducible: Always
Assuming you built NSPR on FreeBSD, which has inet_pton,
this means FreeBSD's inet_pton converts two bad IPv6
address strings "1:2:3:4:5:6:7::8" and "1:2:3:4:5:6::7:8"
successfully.  inet_pton should reject them because they
are too long; the "::" expands to at least one 16-bit
group of zeros.
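
The group-count argument above can be sketched in C. This is only an illustration, not NSPR's or FreeBSD's parser: ipv6_group_count_ok is a hypothetical helper that counts the explicit groups and checks nothing else (no hex-digit validation, no embedded IPv4 form).

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper (not part of NSPR or libc): checks ONLY the number
 * of 16-bit groups in an IPv6 text address.
 */
static bool ipv6_group_count_ok(const char *s)
{
    assert(s != NULL);

    const char *gap = strstr(s, "::");
    int groups = 0;
    bool in_group = false;

    /* At most one "::" may appear (this also rejects ":::"). */
    if (gap != NULL && strstr(gap + 1, "::") != NULL)
        return false;

    /* Count runs of non-colon characters, i.e. the explicit groups. */
    for (const char *p = s; *p != '\0'; p++) {
        if (*p == ':') {
            in_group = false;
        } else if (!in_group) {
            in_group = true;
            groups++;
        }
    }

    /* Without "::" all eight groups must be written out. */
    if (gap == NULL)
        return groups == 8;

    /* "::" stands for at least one zero group, so at most 7 are explicit. */
    return groups <= 7;
}
```

Both strings from the addrstr output have eight explicit groups plus a "::", so this check rejects them. A conforming inet_pton(AF_INET6, ...) is expected to return 0 for them as well.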

You can try the following:
1. Report this bug to the FreeBSD maintainers; and
2. Remove the line:
       #define _PR_HAVE_INET_NTOP
   from mozilla/nsprpub/pr/include/md/_freebsd.h,
   and rebuild NSPR (cd into nsprpub and say
   "make clean; make").

We should add "addrstr" to runtests.sh so that this
test gets run regularly.
Please apply this patch to your source tree, rebuild NSPR,
run addrstr, and paste the output in this bug report.
Yes, indeed, this seems to be a FreeBSD problem :-( However, FreeBSD's inet_pton
implementation is taken straight from KAME:
    
http://www.freebsd.org/cgi/cvsweb.cgi/src/lib/libc/net/inet_pton.c    
    
Some (most?) other OSes must have the same problem...
    
I reported the bug to FreeBSD:    
   http://www.freebsd.org/cgi/query-pr.cgi?pr=bin/84106 
    
In addition to this, the following tests also fail on FreeBSD:
        dlltest Segmentation fault (core dumped)
        dlltest:        PR_LoadLibrary failed (-5977, 43, Cannot open "dll/libmy.so")

        instrumt Segmentation fault (core dumped)

        libfilename                     FAILED
        libfilename:    PR_LoadLibrary failed
   
Everything else (including forktest, nbconn, and poll_er) works on my
FreeBSD/amd64 machine (single Opteron 246).
As for dlltest: build/pr/tests/dll/libmy.so.1 gets created, but the usual
libmy.so symlink does not. If I create the link manually, both dlltest and
libfilename succeed.
   
The instrumt test crashes with the following:   
(gdb) run   
Starting   
program: /var/ports/devel/nspr/work/nspr-4.6/mozilla/nsprpub/build/pr/tests/instrumt    
   
Program received signal SIGSEGV, Segmentation fault.   
0x0000000800671cc4 in pthread_testcancel () from /usr/lib/libpthread.so.1   
(gdb) where   
#0  0x0000000800671cc4 in pthread_testcancel () from /usr/lib/libpthread.so.1   
#1  0x0000000800661760 in sigaction () from /usr/lib/libpthread.so.1   
#2  0x000000080066a265 in pthread_mutexattr_init ()   
from /usr/lib/libpthread.so.1   
#3  0x0000000000000000 in ?? ()   
Error accessing memory address 0x7fffffbff000: Bad address.   
   
This seems to happen on line 390 of instrumt.c:
        t1 = PR_CreateThread(PR_USER_THREAD,
                        RecordTrace, NULL,
                        PR_PRIORITY_NORMAL,
                        PR_GLOBAL_THREAD,
                        PR_UNJOINABLE_THREAD,
                        0);
   
The pthread_mutexattr_init frame in the gdb stack above is actually the
_PT_PTHREAD_CREATE() call at pr/src/pthreads/ptthread.c:455. This crash
happens on both FreeBSD/i386 and FreeBSD/amd64.
 
On my dual PentiumII machine (FreeBSD/i386) the following other tests fail: 
        forktest                        FAILED 
        forktest:       Accepting connection at port 52656 
        forktest:       Wait one second before connect 
        forktest:       Connecting to port 52656 
        forktest:       Writing message "Hello world!" 
        forktest:       Received "Hello world!" from the client 
        forktest:       The message is received correctly 
        forktest:       Fork succeeded.  Parent process continues. 
        forktest:       Accepting connection at port 60779 
        forktest:       Wait one second before connect 
        forktest:       Fork succeeded.  Child process continues. 
        forktest:       Accepting connection at port 62560 
        forktest:       Connecting to port 60779 
        forktest:       Writing message "Hello world!" 
        forktest:       Received "Hello world!" from the client 
        forktest:       The message is received correctly 
        forktest:       PR_Accept failed: error code -5990 
        forktest:       Child process exits. 
        forktest:       Parent process exits. 
        forktest:       FAILED 
 
      io_timeout                        FAILED 
        io_timeout:     test with global bound thread 
        io_timeout:     thread id 0, scope GLOBAL_BOUND scope 
        io_timeout:     test with local thread 
        io_timeout:     thread id 0, scope GLOBAL scope 
        io_timeout:     test with global thread 
        io_timeout:     thread id 0, scope GLOBAL scope 
 
Do these perhaps have race conditions, or make other assumptions that don't
hold on multi-CPU machines?
Summary: addrstr test fails → some tests fail
Yes, there is a race in forktest. Adding a one second delay before PR_Accept 
solves the problem: 
 
@@ -185,2 +185,5 @@ 
     } 
+    printf("Wait one second before accept\n"); 
+    fflush(stdout); 
+    PR_Sleep(PR_SecondsToInterval(1)); 
     printf("Accepting connection at port %hu\n", PR_ntohs(addr.inet.port)); 
 
io_timeout was failing because I had a web-server running on the machine
listening on port 8000...

The attached patch changes the BASE_PORT from the collision-prone 8000 to 38011
(for want of a better algorithm for free-port searching), improves diagnostics
of bind() failure, and makes EADDRINUSE a non-failure, so that the test
sequence does not fail for reasons outside of NSPR.
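
The idea of treating EADDRINUSE as a non-failure can be sketched as follows. This is not the attached patch: it uses plain POSIX sockets rather than NSPR calls, and bind_or_skip, bound_port and the bind_result enum are hypothetical names chosen for illustration.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical result codes: a busy port is reported separately from a real error. */
enum bind_result { BIND_OK, BIND_PORT_BUSY, BIND_ERROR };

/*
 * Try to bind a TCP socket to 127.0.0.1:port.
 * On BIND_OK, *fd_out holds the bound socket; otherwise the socket is closed.
 */
static enum bind_result bind_or_skip(unsigned short port, int *fd_out)
{
    struct sockaddr_in addr;
    int fd, err;

    assert(fd_out != NULL);

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        perror("socket");
        return BIND_ERROR;
    }

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) == 0) {
        *fd_out = fd;
        return BIND_OK;
    }

    /* Say which port failed and why, instead of a bare "FAILED". */
    err = errno;
    fprintf(stderr, "bind to port %hu failed: %s\n", port, strerror(err));
    close(fd);

    /* A port held by an unrelated process is not a defect in the code under test. */
    return err == EADDRINUSE ? BIND_PORT_BUSY : BIND_ERROR;
}

/* Return the local port a socket is bound to, or 0 on error. */
static unsigned short bound_port(int fd)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;

    if (getsockname(fd, (struct sockaddr *)&addr, &len) != 0)
        return 0;
    return ntohs(addr.sin_port);
}
```

A test driver could then report BIND_PORT_BUSY as "skipped" and fail only on BIND_ERROR. Passing port 0 lets the kernel pick a free port, which is one possible alternative to a fixed BASE_PORT.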
QA Contact: wtchang → nspr
First part of this bug fixed long ago on FreeBSD side, does second still matters?
Mikhail T, can you offer guidance on ....

(In reply to Phoenix from comment #6)
> First part of this bug fixed long ago on FreeBSD side, does second still
> matters?
Flags: needinfo?(mi+mozilla)
Unfortunately, I never understood, what Phoenix meant by "first part" and "second part".

FWIW, I just ran the tests here and they all succeeded except:

          fdcach                        FAILED
        fdcach: fd stack malfunctioned
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: needinfo?(mi+mozilla) → needinfo?(pppx)
(In reply to Mikhail T. from comment #8)
> Unfortunately, I never understood, what Phoenix meant by "first part" and
> "second part".
You provided two patches, at the time of writing first patch was present in FreeBSD ports "files" folder
Flags: needinfo?(pppx)
(In reply to Phoenix from comment #9)
> (In reply to Mikhail T. from comment #8)
> > Unfortunately, I never understood, what Phoenix meant by "first part" and
> > "second part".
> You provided two patches, at the time of writing first patch was present in
> FreeBSD ports "files" folder

It was present in the files/ folder, because I put it there :-) Well, marcus@ did on my prodding.

My two hunks (single patch) -- along with a whole bunch of others -- remain in our devel/nspr port:

      https://svnweb.freebsd.org/ports/head/devel/nspr/files/patch-tests?view=markup

But we are not discussing the FreeBSD port here, but an NSPR problem. That FreeBSD (partially) solves it, is irrelevant -- Mozilla/NSPR developers ought to improve their code so OS-specific packagers do not have to maintain patches for decades... NSPR's self-tests remain unreliable, which keeps people from routinely using them, thus making them less useful than they deserve.
Severity: normal → S3

The bug assignee is inactive on Bugzilla, so the assignee is being reset.

Assignee: wtc → nobody