Provider test failed on HPUX with 544MB log full of error messages.

RESOLVED FIXED in 4.6.9

Status

NSPR
NSPR
P2
critical
RESOLVED FIXED
11 years ago
10 years ago

People

(Reporter: Slavomir Katuscak, Assigned: Slavomir Katuscak)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: test)

Attachments

(1 attachment, 1 obsolete attachment)

(Assignee)

Description

11 years ago
From nightly tests:
securityjes5/20070627.1/charm.4-HP-UXB.11.11_64_OPT.OBJ

provider./runtests.sh[261]: 7531 Terminated
			FAILED

In log there is 12376114x line:
PR_CONNECT_RESET_ERROR(-5961), oserror = 32

Log has 544MB, it's too much.
If NSPR had a "test" component, this bug would get that component.

This bug, and the other NSPR bugs like it that were filed in the last few days,
seems to be a defect in the test program and/or test script itself,
or in the system configuration, rather than (or in addition to) a defect 
in the NSPR libraries.  

If there is a defect in the NSPR libraries, these test results do not 
clearly indicate that.  Many of these failures are not occurring in NSPR
test programs at all.  

Have the systems run out of some resource?  
Are they not able to start more processes because the Process table is full?  
Are the failures to start new threads due to the process table being full? 
(Does each thread have its own PID?) 
Assignee: wtc → slavomir.katuscak
Whiteboard: test
Slavo, any test that produces 12376114 lines of the same error message
is a seriously broken test.  It needs to be recoded, so as to stop LONG
before that, like (say) after the second error.  Maybe it needs to pause
for a little time after the error?  

Please fix the test so that this does not occur again.
(Assignee)

Updated

11 years ago
Priority: -- → P2
(Assignee)

Comment 3

10 years ago
Finally (after many experiments) reproduced:

$ nspr_build/pr/tests/provider -e 1 -v -p -d > out.log

On error output there was only many lines:
PR_CONNECT_RESET_ERROR(-5961), oserror = 32

In out.log there is:
Client/Server started at(0x8000000100007690): 05:30:45
main(0x8000000100007690): starting server
main(0x8000000100007690): creating server thread
main(0x8000000100007690): waiting for server init
Server started at(0x8000000100023fe8): 05:30:45
main(0x8000000100007690): server init complete (port #13000)
main(0x8000000100007690): creating 1 client threads
main(0x8000000100007690): creating client threads
        CreateWorker(0x8000000100023fe8): create new worker (0x0000000000000000)
        Server(0x8000000100023fe8): waiting for state change
Client started at(0x8000000100047fd8): 05:30:45
                Worker(0x8000000100060018): started [1]
                Worker(0x8000000100060018): calling accept
main(0x8000000100007690): waiting for execution interval (1 seconds)
        Client(0x8000000100047fd8): connecting to server at 127.0.0.1
        Client(0x8000000100047fd8): sending descriptor for 838 bytes
                Worker(0x8000000100060018): calling accept
        Client(0x8000000100047fd8): sending 838 bytes
        Client(0x8000000100047fd8): sending 839 bytes
        Client(0x8000000100047fd8): sending 840 bytes

        ... (increasing numbers)

        Client(0x8000000100047fd8): sending 1020 bytes
        Client(0x8000000100047fd8): sending 1021 bytes
        Client(0x8000000100047fd8): sending 1022 bytes
        Client(0x8000000100047fd8): sending 1023 bytes
        Client(0x8000000100047fd8): sending 1024 bytes
        Client(0x8000000100047fd8): sending 1024 bytes
        Client(0x8000000100047fd8): sending 1024 bytes
        Client(0x8000000100047fd8): sending 1024 bytes

        ... (many many lines)

        Client(0x8000000100047fd8): sending 1024 bytes
        Client(0x8000000100047fd8): sending 1024 bytes
Shutting down(0x8000000100007690): 05:30:46
main(0x8000000100007690): notifying client(0x8000000100047fd8) to stop
        Client(0x8000000100047fd8): sending 1024 bytes
        Client(0x8000000100047fd8): stopped after 0 operations and 0 bytes
main(0x8000000100007690): joining client(0x8000000100047fd8)
main(0x8000000100007690): notifying server(0x8000000100023fe8) to stop
        Server(0x8000000100023fe8): shutting down workers
        Server(0x8000000100023fe8): interrupting worker(0x8000000100025148)
        Server(0x8000000100023fe8): waiting for 1 workers to exit
                Worker(0x8000000100060018): exiting [1]
        Server(0x8000000100023fe8): stopped after 0 operations and 0 bytes
main(0x8000000100007690): joining server(0x8000000100023fe8)
main(0x8000000100007690): test complete
Test exiting at(0x8000000100007690): 05:30:50
(Assignee)

Comment 4

10 years ago
Output from successfull run:

Client/Server started at(0x8000000100007690): 05:19:26
main(0x8000000100007690): starting server
main(0x8000000100007690): creating server thread
main(0x8000000100007690): waiting for server init
Server started at(0x8000000100023fe8): 05:19:26
main(0x8000000100007690): server init complete (port #13000)
main(0x8000000100007690): creating 1 client threads
main(0x8000000100007690): creating client threads
        CreateWorker(0x8000000100023fe8): create new worker (0x0000000000000000)
        Server(0x8000000100023fe8): waiting for state change
Client started at(0x8000000100043fe8): 05:19:26
                Worker(0x8000000100063fe8): started [1]
                Worker(0x8000000100063fe8): calling accept
        Client(0x8000000100043fe8): connecting to server at 127.0.0.1
main(0x8000000100007690): waiting for execution interval (1 seconds)
        Client(0x8000000100043fe8): sending descriptor for 838 bytes
        ProcessRequest(0x8000000100063fe8): receiving desciptor
        Client(0x8000000100043fe8): sending 838 bytes
        Client(0x8000000100043fe8): receiving 838 bytes
                ProcessRequest(0x8000000100063fe8): read descriptor {838, CS63daea408000000100043fe8-0.dat}
        ProcessRequest(0x8000000100063fe8): receive 838 bytes
        ProcessRequest(0x8000000100063fe8): write 838 bytes to file
                ProcessRequest(0x8000000100063fe8): opening CS63daea408000000100043fe8-0.dat
        ProcessRequest(0x8000000100063fe8): read 838 bytes from file
                ProcessRequest(0x8000000100063fe8): sending 838 bytes
        Client(0x8000000100043fe8): disconnected from server
                ProcessRequest(0x8000000100063fe8): Finished
                Worker(0x8000000100063fe8): calling accept
        Client(0x8000000100043fe8): connecting to server at 127.0.0.1
        Client(0x8000000100043fe8): sending descriptor for 113 bytes
        Client(0x8000000100043fe8): sending 113 bytes
        Client(0x8000000100043fe8): receiving 113 bytes
        ProcessRequest(0x8000000100063fe8): receiving desciptor
                ProcessRequest(0x8000000100063fe8): read descriptor {113, CS63daea408000000100043fe8-1.dat}
        ProcessRequest(0x8000000100063fe8): receive 113 bytes
        ProcessRequest(0x8000000100063fe8): write 113 bytes to file
                ProcessRequest(0x8000000100063fe8): opening CS63daea408000000100043fe8-1.dat
        ProcessRequest(0x8000000100063fe8): read 113 bytes from file
                ProcessRequest(0x8000000100063fe8): sending 113 bytes
        Client(0x8000000100043fe8): disconnected from server
                ProcessRequest(0x8000000100063fe8): Finished
                Worker(0x8000000100063fe8): calling accept
Shutting down(0x8000000100007690): 05:19:27
main(0x8000000100007690): notifying client(0x8000000100043fe8) to stop
main(0x8000000100007690): joining client(0x8000000100043fe8)
        Client(0x8000000100043fe8): stopped after 2 operations and 1902 bytes
main(0x8000000100007690): notifying server(0x8000000100023fe8) to stop
        Server(0x8000000100023fe8): shutting down workers
        Server(0x8000000100023fe8): interrupting worker(0x8000000100025148)
        Server(0x8000000100023fe8): waiting for 1 workers to exit
                Worker(0x8000000100063fe8): exiting [1]
        Server(0x8000000100023fe8): stopped after 2 operations and 1902 bytes
main(0x8000000100007690): joining server(0x8000000100023fe8)
main(0x8000000100007690): test complete
Test exiting at(0x8000000100007690): 05:19:32
(Assignee)

Comment 5

10 years ago
static void _MY_Assert(const char *s, const char *file, PRIntn ln)
{
    PL_PrintError(NULL);
#if DEBUG
    PR_Assert(s, file, ln);
#endif
}  /* _MW_Assert */

That's the problem. Assert doesn't terminate in optimized code.
(Assignee)

Comment 6

10 years ago
Created attachment 296358 [details] [diff] [review]
Patch.

Enabling PR_Assert also in optimized code, without this code doesn't terminate when expected and that can cause infinite loops.
Attachment #296358 - Flags: review?(wtc)
(Assignee)

Updated

10 years ago
Target Milestone: --- → 4.6.9

Comment 7

10 years ago
Comment on attachment 296358 [details] [diff] [review]
Patch.

r=wtc.  Thanks for the patch.
Attachment #296358 - Flags: review?(wtc) → review+

Updated

10 years ago
Duplicate of this bug: 396212

Comment 9

10 years ago
Created attachment 296775 [details] [diff] [review]
Patch v2 (checked in)

This patch also fixes the same bug in cltsrv.c.  Note
that I kept the comment at the end of _MY_Assert but I
fixed the typo.

I've checked in this patch on the NSPR trunk for NSPR 4.7.
If you still need to fix this bug in NSPR 4.6.9, please
check in this patch on the NSPR_4_6_BRANCH.

Checking in cltsrv.c;
/cvsroot/mozilla/nsprpub/pr/tests/cltsrv.c,v  <--  cltsrv.c
new revision: 3.15; previous revision: 3.14
done
Checking in provider.c;
/cvsroot/mozilla/nsprpub/pr/tests/provider.c,v  <--  provider.c
new revision: 3.15; previous revision: 3.14
done
Attachment #296358 - Attachment is obsolete: true
(Assignee)

Comment 10

10 years ago
Checked into branch:

Checking in cltsrv.c;
/cvsroot/mozilla/nsprpub/pr/tests/cltsrv.c,v  <--  cltsrv.c
new revision: 3.12.2.1; previous revision: 3.12
done
Checking in provider.c;
/cvsroot/mozilla/nsprpub/pr/tests/provider.c,v  <--  provider.c
new revision: 3.12.2.1; previous revision: 3.12
done
Status: NEW → RESOLVED
Last Resolved: 10 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.