Closed Bug 99493 Opened 23 years ago Closed 23 years ago

stress test failure - file descriptor not connected

Categories

(NSS :: Tools, defect, P1)

HP
HP-UX
defect

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 83593

People

(Reporter: sonja.mirtitsch, Assigned: wtc)

Details

Attachments

(2 files)

The stress test passes on all other platforms but fails on sjsu. We have not yet
evaluated whether kernel patches or reconfiguration are needed (info at the end
of this report), but I doubt it very much, since the backward-compatibility
stress tests (old tools, new DLLs) pass on the same machine.

I put more debugging information into the script. Please note that
selfserv is still present even at the end of the stress test.

We have noticed a lot of stress test failures on HP lately. They do not look
closely related, but we started seeing more of them after we began connecting to
selfserv with a tstclnt -q before the stress client. I will look into this, and
for now replace the tstclnt -q with a sleep 5 on sjsu only.
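
A minimal sketch of that sjsu-only workaround, not the actual ssl.sh change. The
host name, port and tstclnt flags are taken from the log in this report; the
function name wait_cmd_for_host and the REQ_FILE variable are made up for
illustration. The helper only prints the command it would pick, so it can be
checked without a running selfserv.

```shell
#!/bin/sh
# Hypothetical helper: print the command used to wait for selfserv on a host.
# REQ_FILE is an assumed variable standing in for the sslreq.txt path.
REQ_FILE=${REQ_FILE:-sslreq.txt}

wait_cmd_for_host() {
    case "$1" in
        sjsu|sjsu.*)
            # Workaround: skip the tstclnt probe on sjsu and just wait.
            echo "sleep 5"
            ;;
        *)
            # All other hosts keep the tstclnt -q probe seen in the log.
            echo "tstclnt -p 8443 -h $1 -q -d . < $REQ_FILE"
            ;;
    esac
}
```

A caller could run the result with eval "$(wait_cmd_for_host "$HOST")", where
HOST is again an assumed variable.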

The relevant part of output.log follows.
ssl.sh: SSL Stress Test ===============================
ssl.sh: Stress SSL2 RC4 128 with MD5 ----
selfserv -D -p 8443 -d ../server -n sjsu.red.iplanet.com \
         -w nss   -i ../tests_pid.9655  &
selfserv started at Thu Sep 13 06:04:45 PDT 2001
tstclnt -p 8443 -h sjsu -q -d . <
/share/builds/mccrel/nss/nsstip/builds/20010913.1/booboo_Solaris8/mozilla/security/nss/tests/ssl/sslreq.txt

debugging disapering selfserv... ps -ef | grep selfserv
   svbld 10173  9778  4 06:04:45 ?         0:00 selfserv -D -p 8443 -d ../server
-n sjsu.red.iplanet.com -w 
   svbld 10181  9778  1 06:04:46 ?         0:00 grep selfserv
strsclnt -q -p 8443 -d . -w nss -c 1000 -C A  \
         sjsu.red.iplanet.com
strsclnt started at Thu Sep 13 06:04:46 PDT 2001
strsclnt: -- SSL: Server Certificate Validated.
strsclnt: PR_Write returned error -5978:
Network file descriptor is not connected.
strsclnt: PR_Write returned error -5978:
Network file descriptor is not connected.
strsclnt: PR_Write returned error -5978:
Network file descriptor is not connected.
strsclnt: PR_Write returned error -5978:
Network file descriptor is not connected.
strsclnt: PR_Write returned error -5978:
Network file descriptor is not connected.
strsclnt: 1 server certificates tested.
strsclnt completed at Thu Sep 13 06:05:08 PDT 2001
debugging disapering selfserv... ps -ef | grep selfserv
   svbld 10173  9778 69 06:04:45 ?         0:05 selfserv -D -p 8443 -d ../server
-n sjsu.red.iplanet.com -w 
   svbld 10189  9778  1 06:05:08 ?         0:00 grep selfserv
/share/builds/mccrel/nss/nsstip/builds/20010913.1/booboo_Solaris8/mozilla/security/nss/tests/all.sh[97]:
10173 Terminated
=============
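
The ps -ef | grep selfserv debugging above also matches the grep process itself.
A possible alternative, sketched here under the assumption that the file passed
to selfserv with -i holds its process id, is to test that pid directly. The
function name server_alive is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the process recorded in a pid file is alive.
# kill -0 delivers no signal; it only checks that the pid can be signalled.
server_alive() {
    pidfile=$1
    [ -r "$pidfile" ] || return 1      # no readable pid file: treat as not running
    pid=$(cat "$pidfile")
    [ -n "$pid" ] || return 1          # empty pid file: treat as not running
    kill -0 "$pid" 2>/dev/null
}
```

For example, server_alive ../tests_pid.9655 || echo "selfserv is gone" could be
placed before and after the strsclnt run.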

We had to change the following kernel parameters to make our tests pass on hp64:

1. maxfiles.  old value = 60.  new value = 100.
2. nkthread.  old value = 499.  new value = 1328.
3. max_thread_proc.  old value = 64.  new value = 512.
4. maxusers.  old value = 32.  new value = 64.
5. maxuprc.  old value = 75.  new value = 512.
6. nproc.  old formula = 20+8*MAXUSERS, which evaluated to 276.
   new value (note: not a formula) = 750.

A few other kernel parameters were also changed automatically
as a result of the above changes.
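
As a quick check of item 6, the old nproc formula can be evaluated for both the
old and the new maxusers value. This is only the arithmetic from the list above;
the function name is made up for illustration.

```shell
#!/bin/sh
# Evaluate the old nproc formula, 20 + 8*MAXUSERS, for a given maxusers value.
nproc_from_maxusers() {
    echo $((20 + 8 * $1))
}

nproc_from_maxusers 32   # old maxusers: prints 276
nproc_from_maxusers 64   # new maxusers: prints 532
```

With maxusers raised to 64 the formula would have given 532, so the fixed value
of 750 is higher than the formula would have produced.
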
Attached file result.html
Attached file output.log
This bug causes QA failures for 3.3.1; someone should have a look.
hpgamma, which has the latest patches, shows the same failures as seen on sjsu.

orville has the latest patches and shows somewhat different stress test failures;
see bug #83593.

If this means that the bugs show up on machines that have the right patches
installed, and NSS only works on machines that are not up to date, then this is
an issue that we need to fix, or at least investigate, before we release NSS
3.3.1 for HP-UX.

The stress test log for hpgamma follows. The email correspondence with Jim is
pasted at the end of this report.
 
==================================

ssl.sh: SSL Stress Test ===============================
ssl.sh: Stress SSL2 RC4 128 with MD5 ----
selfserv -D -p 8443 -d ../server -n hpgamma.red.iplanet.com \
         -w nss   -i ../tests_pid.10204  &
selfserv started at Thu Sep 20 14:02:26 PDT 2001
tstclnt -p 8443 -h hpgamma -q -d . <
/share/builds/mccrel/nss/nss331/builds/20010920.1/booboo_Solaris8/mozilla/security/nss/tests/ssl/sslreq.txt

strsclnt -q -p 8443 -d . -w nss -c 1000 -C A  \
         hpgamma.red.iplanet.com
strsclnt started at Thu Sep 20 14:02:27 PDT 2001
strsclnt: -- SSL: Server Certificate Validated.
strsclnt: PR_Write returned error -5978:
Network file descriptor is not connected.
strsclnt: PR_Write returned error -5978:
Network file descriptor is not connected.
strsclnt: PR_Write returned error -5978:
Network file descriptor is not connected.
strsclnt: PR_Write returned error -5978:
Network file descriptor is not connected.
strsclnt: PR_Write returned error -5978:
Network file descriptor is not connected.
strsclnt: 1 server certificates tested.
strsclnt completed at Thu Sep 20 14:02:30 PDT 2001
ssl.sh: Stress SSL3 RC4 128 with MD5 ----
selfserv -D -p 8443 -d ../server -n hpgamma.red.iplanet.com \
         -w nss   -i ../tests_pid.10204  &
selfserv started at Thu Sep 20 14:02:30 PDT 2001
tstclnt -p 8443 -h hpgamma -q -d . <
/share/builds/mccrel/nss/nss331/builds/20010920.1/booboo_Solaris8/mozilla/security/nss/tests/ssl/sslreq.txt

strsclnt -q -p 8443 -d . -w nss -c 1000 -C c  \
         hpgamma.red.iplanet.com
strsclnt started at Thu Sep 20 14:02:30 PDT 2001
strsclnt: -- SSL: Server Certificate Validated.
strsclnt: PR_Write returned error -5978:
Network file descriptor is not connected.
strsclnt: PR_Write returned error -5978:
Network file descriptor is not connected.
strsclnt: PR_Write returned error -5978:
Network file descriptor is not connected.
strsclnt: PR_Write returned error -5978:
Network file descriptor is not connected.
strsclnt: PR_Write returned error -5978:
Network file descriptor is not connected.
strsclnt: PR_Write returned error -5978:
Network file descriptor is not connected.
strsclnt: PR_Write returned error -5978:
Network file descriptor is not connected.
strsclnt: 992 cache hits; 1 cache misses, 0 cache not reusable
strsclnt completed at Thu Sep 20 14:02:37 PDT 2001

================================
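
To compare machines, the -5978 errors (PR_NOT_CONNECTED_ERROR in NSPR) can be
counted per stress run. The following is a sketch that assumes the log layout
shown above, with each run introduced by a line starting "ssl.sh: Stress "; the
function name errors_per_run is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print "<count> <run header>" for each stress run in a log.
errors_per_run() {
    awk '
        /^ssl.sh: Stress / {                 # a new run starts here
            if (name) print n, name          # flush the previous run, if any
            name = $0; n = 0
        }
        /PR_Write returned error -5978/ { n++ }
        END { if (name) print n, name }      # flush the last run
    ' "$1"
}
```

Run against the hpgamma excerpt above, this would report 5 errors for the SSL2
run and 7 for the SSL3 run.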

You can use hpgamma for the test.

Sonja Mirtitsch wrote:

Thanks, but I'd rather wait until the NSS team responds with their thoughts on
whether the patches could actually trigger the failure. Possibly NSS is using an
HP-UX bug as a feature, and as soon as the HP bug is gone, NSS fails.
It is possible that NSS depends on a certain error code being returned, and
HP-UX with and without the patches returns different error codes.
May I use bar to run some of my tests?
The tests will take less than half an hour, I will not need root access, and
bar seems to have all necessary mounts except for /tools/ns.

Thanks

Sonja

Jim Fei wrote:

Sonja,
The June 2001 patch is on machine bar (/work/depot/XSWGR1100_11.00.depot). You
can come over to my office (1517) and we can install it together, or you can do
it yourself.
Jim

Sonja Mirtitsch wrote:

Hi Jim,

I have root access to sjsu, could we have a look at this together?

The orville patch timing is very interesting, because the problem was first
seen there on 6/15/2001.
Maybe Nelson or Wan-Teh have a thought on that?
http://bugzilla.mozilla.org/show_bug.cgi?id=83593
http://bugzilla.mozilla.org/show_bug.cgi?id=99493

Sonja

Jim Fei wrote:

Sonja,
I don't have root access on sjsu. orville already has the latest patch bundle
release (June 2001).

Regards,
Jim

Sonja Mirtitsch wrote:

Hi Jim,

I am having trouble with two iPlanet HP-UX machines; probably not all necessary
patches are installed. The machines are named orville and sjsu, and I was
hoping you could have a look at them. Both are in the red.iplanet.com domain.
Thanks

Sonja


Severity: normal → critical
Priority: -- → P1
Target Milestone: --- → 3.3.1
This bug is a duplicate of the strsclnt failures on orville.
The underlying problem is an HP-UX bug.  The difference is
that we are more likely to run into that HP-UX bug on orville
than on sjsu and hpgamma.

*** This bug has been marked as a duplicate of 83593 ***
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → DUPLICATE