stress tests fail intermittently on slow machines

Status: RESOLVED WORKSFORME
Product: NSS
Component: Tools
Priority: P1
Severity: normal
Reported: 17 years ago
Modified: 17 years ago

People

(Reporter: Sonja Mirtitsch, Assigned: larryh (gone))

Tracking

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(5 attachments)

(Reporter)

Description

17 years ago
Platform: Linux2.2_x86_glibc_PTH_DBG.OBJ
Test Run: washer.1 running RH_Linux_6.2_(Zoot)
last clean: 06/06
6/7 killed
first time 06/08
Stress SSL2 RC4 128 with MD5



Platform: Linux2.2_x86_glibc_PTH_DBG.OBJ
Test Run: phaedrus.1 running RH_Linux_6.1_(Cartman)
last clean: 06/05
first time 06/06
Stress SSL2 RC4 128 with MD5

Yugo, dbldog (6.2), dryer (6.1), and huey (7.1) did not show the problem

ssl.sh: SSL Stress Test ===============================
ssl.sh: Stress SSL2 RC4 128 with MD5 ----
selfserv -D -p 8443 -d ../server -n washer.red.iplanet.com \
         -w nss   -i ../tests_pid.31504  &
selfserv started at Fri Jun  8 06:26:41 PDT 2001
tstclnt -p 8443 -h washer -q -d . <
/h/blds-sca15a/export/builds/mccrel/nss/nsstip/builds/20010608.1/y2sun2_Solaris8/mozilla/security/nss/tests/ssl/sslreq.txt

strsclnt -q -p 8443 -d . -w nss -c 1000 -C A  \
         washer.red.iplanet.com
strsclnt started at Fri Jun  8 06:27:09 PDT 2001
strsclnt: -- SSL: Server Certificate Validated.
strsclnt: -- SSL: Server Certificate Validated.
strsclnt: -- SSL: Server Certificate Validated.
strsclnt: -- SSL: Server Certificate Validated.
strsclnt: 4 server certificates tested.
strsclnt completed at Fri Jun  8 06:29:02 PDT 2001
ssl.sh: Stress SSL3 RC4 128 with MD5 ----
selfserv -D -p 8443 -d ../server -n washer.red.iplanet.com \
         -w nss   -i ../tests_pid.31504  &
selfserv started at Fri Jun  8 06:29:34 PDT 2001
tstclnt -p 8443 -h washer -q -d . <
/h/blds-sca15a/export/builds/mccrel/nss/nsstip/builds/20010608.1/y2sun2_Solaris8/mozilla/security/nss/tests/ssl/sslreq.txt

strsclnt -q -p 8443 -d . -w nss -c 1000 -C c  \
         washer.red.iplanet.com
strsclnt started at Fri Jun  8 06:29:59 PDT 2001
strsclnt: -- SSL: Server Certificate Validated.
strsclnt: 999 cache hits; 1 cache misses, 0 cache not reusable
strsclnt completed at Fri Jun  8 06:32:02 PDT 2001



ssl.sh: SSL Stress Test ===============================
ssl.sh: Stress SSL2 RC4 128 with MD5 ----
selfserv -D -p 8443 -d ../server -n phaedrus.red.iplanet.com \
         -w nss   -i ../tests_pid.28408  &
selfserv started at Fri Jun  8 05:16:16 PDT 2001
tstclnt -p 8443 -h phaedrus -q -d . <
/h/blds-sca15a/export/builds/mccrel/nss/nsstip/builds/20010608.1/y2sun2_Solaris8/mozilla/security/nss/tests/ssl/sslreq.txt

strsclnt -q -p 8443 -d . -w nss -c 1000 -C A  \
         phaedrus.red.iplanet.com
strsclnt started at Fri Jun  8 05:16:23 PDT 2001
strsclnt: -- SSL: Server Certificate Validated.
strsclnt: -- SSL: Server Certificate Validated.
strsclnt: -- SSL: Server Certificate Validated.
strsclnt: -- SSL: Server Certificate Validated.
strsclnt: -- SSL: Server Certificate Validated.
strsclnt: -- SSL: Server Certificate Validated.
strsclnt: 6 server certificates tested.
strsclnt completed at Fri Jun  8 05:18:35 PDT 2001
ssl.sh: Stress SSL3 RC4 128 with MD5 ----
selfserv -D -p 8443 -d ../server -n phaedrus.red.iplanet.com \
         -w nss   -i ../tests_pid.28408  &
selfserv started at Fri Jun  8 05:19:06 PDT 2001
tstclnt -p 8443 -h phaedrus -q -d . <
/h/blds-sca15a/export/builds/mccrel/nss/nsstip/builds/20010608.1/y2sun2_Solaris8/mozilla/security/nss/tests/ssl/sslreq.txt

strsclnt -q -p 8443 -d . -w nss -c 1000 -C c  \
         phaedrus.red.iplanet.com
strsclnt started at Fri Jun  8 05:19:13 PDT 2001
strsclnt: -- SSL: Server Certificate Validated.
strsclnt: 999 cache hits; 1 cache misses, 0 cache not reusable
strsclnt completed at Fri Jun  8 05:23:15 PDT 2001
(Reporter)

Comment 1

17 years ago
Pasting in Nelson's email:

...There was also a failure on phaedrus.  This failure had been reported
previously as intermittent.  This problem is not really a failure.  The SSL2
session lifetime is exactly 100 seconds, so when an SSL session reuse test lasts
longer than 100 seconds, it reports that the session was not reused in all the
connections.  The test script should be changed to attempt fewer connections on
slow machines that cannot do 1000 connections in 100 seconds.
... 
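The scaling Nelson describes could be sketched as a small shell helper. This is a hypothetical sketch, not part of ssl.sh: it times a short calibration run and scales the connection count to fit within the 100-second SSL2 session lifetime.

```shell
#!/bin/sh
# Hypothetical helper, not part of ssl.sh: scale the stress-test connection
# count so a machine can plausibly finish inside the 100-second SSL2
# session lifetime.
SSL2_LIFETIME=100   # SSL2 session lifetime in seconds
CALIB_CONNS=50      # connections used for a short calibration run

# Given the seconds a CALIB_CONNS-connection calibration run took, print a
# connection count that targets ~80% of the SSL2 session lifetime.
scaled_conns() {
    secs=$1
    echo $(( CALIB_CONNS * SSL2_LIFETIME * 8 / (secs * 10) ))
}

scaled_conns 5    # fast machine: 50 connections in 5 s  -> 800
scaled_conns 20   # slow machine: 50 connections in 20 s -> 200
```

The computed count would then be passed to strsclnt as the -c argument in place of the fixed 1000.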
(Reporter)

Comment 2

17 years ago
I can change the test scripts, but where can I find a benchmark to measure
"slow"? Axilla, which also showed this non-failure today, used to be one of
our faster machines: a 4-CPU UltraSPARC with 1664 MB of RAM.
Also, these failures have only shown up for less than 2 weeks or so, never
before, so I'd like to double-check: have we gotten that much slower?
OS: Linux → All
Hardware: PC → All
Summary: stress tests fail intermittantly on Linux → stress tests fail intermittantly on "slow" machines
(Reporter)

Comment 3

17 years ago
Created attachment 37950 [details]
results of axila
(Reporter)

Comment 4

17 years ago
Created attachment 37952 [details]
output.log
(Reporter)

Comment 5

17 years ago
Created attachment 37953 [details]
system information axilla
Both of the log outputs that were given above as part of this bug report
show an SSL 2 test that ran longer than 100 seconds.  As I have already
explained, 100 seconds is the limit of the lifetime of an SSL 2 connection.
Any SSL2 session reuse test on any platform that takes longer than 100 
seconds will fail.  

If a platform takes longer than 100 seconds to complete the test, change
the test for that platform to do fewer connections.

If a platform fails the SSL 2 stress test and does NOT take longer than 
100 seconds, that is another problem, and is NOT the problem reported in 
this bug report. 
Assignee: wtc → sonmi
(Reporter)

Updated

17 years ago
Summary: stress tests fail intermittantly on "slow" machines → ssl tests now slower on 4-CPU Ultrasparc than on P I 133 before

Comment 7

17 years ago
Sonja has a good point.  Axilla can hardly be called slow.  There
are two possible explanations:
1. NSS performance has degraded on the trunk recently.  We should run
   the same test on NSS 3.2.1 and see if it also exceeds the timeout
   for SSL2 (100 seconds).  I am also wondering why we are not getting
   the same slowdown on all machines.
2. The owner of axilla is running heavy jobs on axilla.

Note that this phenomenon started to occur before Nelson checked in his
new SSL server session ID cache code last Friday night (6/8).

Kirk, can you look into this?  We are suspecting performance degradation
on the trunk on Solaris and Linux.  The machines to test on are phaedrus,
washer, and axilla.

Sonja, can you examine the performance numbers in the daily QA reports
and see if there is any big change between now and NSS 3.2.1?
Assignee: sonmi → kirke
Priority: -- → P2
Target Milestone: --- → 3.3
(Reporter)

Comment 8

17 years ago
Rerunning 3.2.1 QA on axilla, washer, phaedrus, louie and huey. This will
generate a QA performance table that is easier to read; this way we can also see
if there are already stress test failures on 3.2.1.
(Reporter)

Comment 9

17 years ago
Reran 3.2.1 QA; all tested platforms ran at regular speed, and the failure did
not show up (washer, axilla, phaedrus, huey and louie).

Since the failure was intermittent I guess this does not tell us much - I am
rerunning once more.

Unfortunately I cannot compare the times, since our original 3.2.1rtm build was
lost a long time ago.
(Reporter)

Comment 10

17 years ago
Check out bug #85456... maybe they are related.

I reran 3.2.1 QA multiple times on all machines where I had seen this failure
so far, but there were no failures on 3.2.1.
I noticed that at times the network really slowed down (washer). I am attaching
the result file; it should have all the hyperlinks in it, so you can see.
I will run selfserv -v in tomorrow's QA.

(Reporter)

Comment 11

17 years ago
Created attachment 38165 [details]
3.2.1 result.html
(Reporter)

Comment 12

17 years ago
Created attachment 38166 [details]
3.2.1 result.html

Comment 13

17 years ago
I ran all.sh on the SunOS5.6_DBG.OBJ build of the tip
on axilla 23 times today.  All passed.

There is no conclusive evidence that the performance
on the tip has degraded.  Moreover, if we assume that
the SSL2 session timeout that occurred on axilla was
due to performance degradation, then SSL2 session
timeout should have been observed on all machines slower
than axilla.  But we have only observed that on two
other machines (washer and phaedrus).  Therefore the
assumption is false.

This bug should be marked invalid.
This bug was originally filed about two slow machines that chronically 
take longer than 100 seconds to complete the SSL2 tests.  There is much
repeated evidence of this problem. 

Axilla is a fast machine that on ONE DAY took very long to complete.
I strongly surmise that the machine was also in use by other programs
run by other people that day.  A bug complaining about SSL performance
on a platform should not be submitted unless it is repeatable.

The subject of this bug should be set back to the original subject,
namely about slow machines that fail the tests.   The solution is a 
change to the test script for those boxes.
(Reporter)

Comment 15

17 years ago
I'll take out the tests on axilla, washer, phaedrus, and louie. huey needs to
stay until the box is up.
---------------------------------

> 

> > 4. 84764: SSL tests took longer than the SSL2 timeout (100
> > seconds) on axilla, a powerful 4-CPU Solaris box.  We
> > suspect performance degradation on the trunk but there
> > is not enough evidence to convince me.

> 
> I read the comment on the bug. I assume I am supposed now to fix the
> stress tests to make them pass?


That would be nice but is not necessary.  There are many
other things more worthy of your time.

My comment on the bug is that I am unable to conclude
from our testing and analysis that we have introduced
performance degradation on the trunk since NSS 3.2.1.

I would just avoid running QA tests on slow machines and
mark the bug WONTFIX.  That's how low I think the priority
of fixing this is.
----------------


Status: NEW → RESOLVED
Last Resolved: 17 years ago
Resolution: --- → WONTFIX

Comment 16

17 years ago
[kirke@dbldog client]$ !!
go box tip-nodelayserver
+strsclnt -N  -t 8  -p 12344 -c 1000 -d ../certs box

Fri Jun 15 11:42:52 PDT 2001
strsclnt: 0 cache hits; 1000 cache misses, 0 cache not reusable
7.79user 1.02system 0:14.14elapsed 62%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (624major+1729minor)pagefaults 0swaps
Fri Jun 15 11:43:06 PDT 2001
+strsclnt -N  -t 8  -p 12344 -c 1000 -d ../certs box

Fri Jun 15 11:43:06 PDT 2001
strsclnt: 0 cache hits; 1000 cache misses, 0 cache not reusable
7.78user 1.04system 0:13.76elapsed 64%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (624major+1539minor)pagefaults 0swaps
Fri Jun 15 11:43:20 PDT 2001
+strsclnt -N  -t 8  -p 12344 -c 1000 -d ../certs box

Fri Jun 15 11:43:20 PDT 2001
strsclnt: 0 cache hits; 1000 cache misses, 0 cache not reusable
8.18user 0.91system 0:13.29elapsed 68%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (624major+1447minor)pagefaults 0swaps
Fri Jun 15 11:43:33 PDT 2001
+strsclnt -N  -t 8  -p 12344 -c 1000 -d ../certs box

Fri Jun 15 11:43:33 PDT 2001
strsclnt: 0 cache hits; 1000 cache misses, 0 cache not reusable
7.83user 1.10system 0:13.83elapsed 64%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (624major+1766minor)pagefaults 0swaps
Fri Jun 15 11:43:47 PDT 2001
+strsclnt -N  -t 8  -p 12344 -c 1000 -d ../certs box

Fri Jun 15 11:43:47 PDT 2001
strsclnt: 0 cache hits; 1000 cache misses, 0 cache not reusable
8.01user 0.93system 0:14.46elapsed 61%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (624major+1496minor)pagefaults 0swaps
Fri Jun 15 11:44:01 PDT 2001
+strsclnt -N  -t 8  -p 12344 -c 1000 -d ../certs box

Fri Jun 15 11:44:01 PDT 2001
strsclnt: 0 cache hits; 1000 cache misses, 0 cache not reusable
7.66user 1.15system 0:13.94elapsed 63%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (624major+1726minor)pagefaults 0swaps
Fri Jun 15 11:44:15 PDT 2001
+strsclnt -N  -t 8  -p 12344 -c 1000 -d ../certs box

Fri Jun 15 11:44:15 PDT 2001
strsclnt: 0 cache hits; 1000 cache misses, 0 cache not reusable
7.62user 1.01system 0:13.58elapsed 63%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (624major+1434minor)pagefaults 0swaps
Fri Jun 15 11:44:29 PDT 2001
+strsclnt -N  -t 8  -p 12344 -c 1000 -d ../certs box

Fri Jun 15 11:44:29 PDT 2001
strsclnt: 0 cache hits; 1000 cache misses, 0 cache not reusable
8.10user 1.06system 0:14.85elapsed 61%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (624major+1671minor)pagefaults 0swaps
Fri Jun 15 11:44:44 PDT 2001
+strsclnt  -t 8  -p 12344 -c 1000 -d ../certs box

Fri Jun 15 11:44:44 PDT 2001
strsclnt: 999 cache hits; 1 cache misses, 0 cache not reusable
2.63user 0.86system 0:07.65elapsed 45%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (624major+1416minor)pagefaults 0swaps
Fri Jun 15 11:44:51 PDT 2001
+strsclnt  -t 8  -p 12344 -c 1000 -d ../certs box

Fri Jun 15 11:44:52 PDT 2001
strsclnt: 999 cache hits; 1 cache misses, 0 cache not reusable
2.66user 0.68system 0:07.68elapsed 43%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (624major+1471minor)pagefaults 0swaps
Fri Jun 15 11:44:59 PDT 2001
+strsclnt  -t 8  -p 12344 -c 1000 -d ../certs box

Fri Jun 15 11:44:59 PDT 2001
strsclnt: 999 cache hits; 1 cache misses, 0 cache not reusable
2.78user 0.88system 0:07.33elapsed 49%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (624major+1410minor)pagefaults 0swaps
Fri Jun 15 11:45:07 PDT 2001
+strsclnt  -t 8  -p 12344 -c 1000 -d ../certs box

Fri Jun 15 11:45:07 PDT 2001
strsclnt: 999 cache hits; 1 cache misses, 0 cache not reusable
2.78user 0.88system 0:06.99elapsed 52%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (624major+1404minor)pagefaults 0swaps
Fri Jun 15 11:45:14 PDT 2001
+strsclnt  -t 8  -p 12344 -c 1000 -d ../certs box

Fri Jun 15 11:45:14 PDT 2001
strsclnt: 999 cache hits; 1 cache misses, 0 cache not reusable
2.81user 0.74system 0:07.92elapsed 44%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (624major+1474minor)pagefaults 0swaps
Fri Jun 15 11:45:22 PDT 2001
+strsclnt  -t 8  -p 12344 -c 1000 -d ../certs box

Fri Jun 15 11:45:22 PDT 2001
strsclnt: 999 cache hits; 1 cache misses, 0 cache not reusable
2.71user 0.56system 0:07.18elapsed 45%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (624major+1404minor)pagefaults 0swaps
Fri Jun 15 11:45:29 PDT 2001
+strsclnt  -t 8  -p 12344 -c 1000 -d ../certs box

<HANG>
Nearly done: we did the full handshakes, and saw nearly the last of 8 restart
runs hang.  Hitting box from dbldog.

Here's the selfserv command line on box:
selfserv --optimize --nodelay
+ LD_LIBRARY_PATH=/d3/workarea-nss/mozilla/dist/Linux2.4_x86_glibc_PTH_OPT.OBJ/l
ib
+ /d3/workarea-nss/mozilla/dist/Linux2.4_x86_glibc_PTH_OPT.OBJ/bin/selfserv -D -
t 8 -d /d3/certs -n 'box.red.iplanet.com'\''s Cert-O-Matic II ID' -p 12344 -w ip
lanet
ps -ael | grep self
000 S  9469 22128 21898  0  69   0    -   534 wait4  pts/1    00:00:00 selfserv
000 S  9469 22147 22128  0  69   0    -  9181 do_pol pts/1    00:00:05 selfserv
040 S  9469 22150 22147  0  69   0    -  9181 do_pol pts/1    00:00:00 selfserv
040 S  9469 22151 22150  1  69   0    -  9181 rt_sig pts/1    00:00:18 selfserv
040 S  9469 22152 22150  1  69   0    -  9181 rt_sig pts/1    00:00:17 selfserv
040 S  9469 22153 22150  2  69   0    -  9181 rt_sig pts/1    00:00:19 selfserv
040 S  9469 22154 22150  1  69   0    -  9181 rt_sig pts/1    00:00:18 selfserv
040 S  9469 22155 22150  1  69   0    -  9181 rt_sig pts/1    00:00:16 selfserv
040 S  9469 22156 22150  2  69   0    -  9181 rt_sig pts/1    00:00:19 selfserv
040 S  9469 22157 22150  2  69   0    -  9181 rt_sig pts/1    00:00:19 selfserv
040 S  9469 22158 22150  1  69   0    -  9181 rt_sig pts/1    00:00:18 selfserv
[kirke@box scripts]$ 
[kirke@dbldog kirke]$ ps -ael | grep strs
000 S  9469 15286 15285  0  60   0    -  1360 rt_sig pts/0    00:00:00 strsclnt
040 S  9469 15289 15286  0  60   0    -  1360 do_pol pts/0    00:00:00 strsclnt
040 S  9469 15450 15289  0  60   0    -  1360 do_pol pts/0    00:00:00 strsclnt
[kirke@dbldog kirke]$ 



Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
Summary: ssl tests now slower on 4-CPU Ultrasparc than on P I 133 before → selfserv -D, strsclnt stress test hangs

Comment 17

17 years ago
Kirk,

Please open a new bug report.  Changing the summary of a bug
report is confusing.  In this case, the selfserv -D, strsclnt
stress test hanging is a different bug so a new bug report
should be opened.
Status: REOPENED → RESOLVED
Last Resolved: 17 years ago
Resolution: --- → WONTFIX
Summary: selfserv -D, strsclnt stress test hangs → ssl tests now slower on 4-CPU Ultrasparc than on P I 133 before
(Reporter)

Comment 18

17 years ago
This is what the bug has been about all the time. This is the same bug in the
same tests. There has been a lot of confusion about it, mainly because of
Nelson's statement that it only occurred on slow machines, and my changing of
the title to get it looked at once more.
I think it should stay in this report, since it happens in exactly the same
test, whether in stress test QA or Kirk's performance tests. Look at the output
log I attached on 6/11.
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
Summary: ssl tests now slower on 4-CPU Ultrasparc than on P I 133 before → stresstests fail, selfserv -D, stressclien restart
(Reporter)

Comment 19

17 years ago
One more observation - since the QA stress tests run with -q on the client
side, they quit after a while and return a failure; this might point to the
problem being on the server side.
Kirk has been able to reproduce the problem with or without -v on the server,
with or without -D on the server

Updated

17 years ago
Priority: P2 → P1

Comment 20

17 years ago
> Kirk has been able to reproduce the problem with or without -v on the server,
> with or without -D on the server

The failure only occurs WITH -D (running selfserv with nodelay).
(Reporter)

Comment 21

17 years ago
Sorry, I misunderstood that at first. I have not seen the bug occur without -D
in my tests either. What were the combinations that you tested that failed?
I remember it failed on 2 different tests - was it -N and no -N on the client?

Comment 22

17 years ago
> I remember it failed on 2 different tests - was it -N and no -N on the client?

Exactly.  Strsclnt with and without -N (restart and full handshakes).
(Reporter)

Comment 23

17 years ago
On the NT tinderbox we see hangs on the stress tests too. Not sure if they are
related. Anthony had to kill a stress client today; tinderbox QA had been
hanging for more than 12 hours.

Updated

17 years ago
Whiteboard: NSS 3.3 Early Release

Updated

17 years ago
Whiteboard: NSS 3.3 Early Release

Comment 24

17 years ago
After examining the test output logs closely, I agree with Nelson's
analysis that the intermittent failures on phaedrus and washer and
the one-time failure on axilla were due to the SSL2 session reuse
test lasting longer than the SSL2 session lifetime (100 seconds).

In these failures, strsclnt did complete (with a failure status).
It did not hang.  So these failures are different from the strsclnt
restart, selfserv -D hanging problem Kirk reported (which was filed
as bug #70643 on 2001-03-01) or the stress tests hangs on the NT
tinderbox on 2001-06-18.  Therefore, I have to change the summary
of this bug back to the original summary, "stress tests fail
intermittently on slow machines."

We can modify the test script to rerun the test if it fails and
lasts longer than 100 seconds, halving the number of connections
each time, until the test passes or lasts shorter than 100 seconds.

Or we can just avoid running QA on slow machines (i.e., resolve
this bug WONTFIX).  This is my preference.  I'll let QA decide
what we should do.
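The rerun-and-halve policy described above could look like the following sketch. This is a hypothetical wrapper, not the actual ssl.sh code; the decision logic is factored into a helper so it can be checked on its own.

```shell
#!/bin/sh
# Sketch of the rerun-and-halve policy: if the stress test fails AND the run
# exceeded the SSL2 session lifetime, halve the connection count and retry;
# a failure inside the lifetime is a genuine failure and is not retried.
SSL2_LIFETIME=100

# next_conns CONNS ELAPSED STATUS -> connection count for the next attempt,
# or 0 to stop (either the test passed or it failed for real).
next_conns() {
    conns=$1; elapsed=$2; status=$3
    if [ "$status" -eq 0 ]; then
        echo 0                      # test passed: done
    elif [ "$elapsed" -gt "$SSL2_LIFETIME" ]; then
        echo $(( conns / 2 ))       # too slow: halve and retry
    else
        echo 0                      # failed within the lifetime: real failure
    fi
}

next_conns 1000 130 1   # slow machine, failed after 130 s -> 500
next_conns 1000 90 1    # failed in 90 s -> 0 (genuine failure, stop)
next_conns 1000 80 0    # passed -> 0 (done)
```

A driver loop would call the real strsclnt invocation with the returned count until next_conns prints 0.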
Summary: stresstests fail, selfserv -D, stressclien restart → stress tests fail intermittently on slow machines
(Reporter)

Comment 25

17 years ago
The main difference between the test that hangs and the one that times out is
that if the tests are run with the -q option, the client times out after
hanging for a certain time and the program exits.
I reran 3.2.1 QA on the failing machines 20 times or so and did not get a
single failure of this kind; 3.3 QA fails in almost 100% of the runs

Comment 26

17 years ago
If strsclnt exits because of the -q option, it will print the
following message to stderr:
  strsclnt: Client timed out waiting for connection to server.

I did not see that message in the output logs.  Instead, I see
messages like:
  strsclnt: 9 server certificates tested.
which are printed by the main() function right before it returns.

These failures suggest that NSS may have become slower (at
least on Linux) since NSS 3.2.1.  I was not denying that.  I
just wanted to point out that these failures are different from
the hang that Kirk reported.
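The two messages quoted above distinguish the cases mechanically; a hypothetical triage helper (classify_log is not an existing tool, just an illustration) over the output logs might look like:

```shell
#!/bin/sh
# Hypothetical log triage based on the two strsclnt messages quoted above:
# a -q client timeout prints one message, a completed run prints the other.
classify_log() {
    if grep -q 'Client timed out waiting for connection to server' "$1"; then
        echo client-timeout      # strsclnt gave up via -q (the hang case)
    elif grep -q 'server certificates tested' "$1"; then
        echo completed           # main() returned; run completed (pass or fail)
    else
        echo unknown
    fi
}

log1=$(mktemp); log2=$(mktemp)
printf 'strsclnt: Client timed out waiting for connection to server.\n' > "$log1"
printf 'strsclnt: 9 server certificates tested.\n' > "$log2"
classify_log "$log1"   # -> client-timeout
classify_log "$log2"   # -> completed
```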

Comment 27

17 years ago
Last night I ran the ssl stress tests on washer as follows.

I pulled and built NSS 3.2.1 (with the cvs tag NSS_3_2_1_RTM)
on shabadoo.  I did a debug build.  Then I replaced the
libnspr4.so (which is NSPR 4.0) with the libnspr4.so from
/s/b/c/nspr20/v4.1.2.

I edited mozilla/security/nss/tests/ssl/ssl.sh and commented
out ssl_cov and ssl_auth so that I only ran the ssl stress
tests.

I ran ssl.sh alternating between the NSS 3.2.1 and NSS 3.3
Beta (copied from /s/b/c/nss/NSS_3_3_BETA) shared libraries.
(Note: all the NSPR and NSS.so's I used are debug builds.)
I verified the version identification strings in the .so's
before I ran ssl.sh every time.

I did not see any difference between NSS 3.2.1 and NSS 3.3.
SSL2 stress test took 4 or 5 seconds and SSL3 stress test
took 8 or 9 seconds consistently with either 3.2.1 or 3.3.

I just asked Larry to repeat this test on phaedrus.

This is as far as we can go in investigating these failures
as an indication of possible performance degradation in
NSS 3.3.  I will ask Kirk to run his performance tests against
NSS 3.2.1 and 3.3 Beta to ensure that we don't have any
regression in our performance.

Assignee: kirke → larryh
Status: REOPENED → NEW
(Reporter)

Comment 28

17 years ago
I have not seen this behavior lately - we are running QA every day on all the
machines that used to fail almost 100% of the tests before I went on sick
leave, and yesterday and today there seem to be no QA failures on them. As soon
as I have set up my environment here I will verify when they stopped (or if
they stopped).
(Assignee)

Comment 29

17 years ago
I ran the tests, as Wan-Teh suggested, on phaedrus. The particulars:
I built NSS 3.2.1 on shabadoo.
I substituted NSPR 4.1.2 for the default (4.0, I think).
I ran 10 iterations using the 3.2.1 tests.
I ran 10 iterations using the 3.3-beta parts from /s/b/c.
There was no indication of test failure. ... No trouble found.
At this writing the residue from my tests can be found at
~larryh/nss321/mozilla/security/nss/tests/ssl/xxx321 and .../xxx33

There are several possibilities for what is causing the tests to fail, as
observed by others:

Network bandwidth. I have noticed that network bandwidth into the SCA17 3rd
floor lab is often terrible. Depending on where the NFS-mounted disks are, the
net bandwidth could slow things down enough to see these failures.

Multiple users on the machines. CPU utilization and memory pressure causing
paging could cause the observed symptoms.

Wanna know for sure? I leave it as an exercise for the curious. WORKSFORME
(Reporter)

Comment 30

17 years ago
This problem has not shown up in QA testing since before 6/19; I checked all
log files from 6/21 and newer. I agree with Larry's conclusion.
Status: NEW → RESOLVED
Last Resolved: 17 years ago
Resolution: --- → WORKSFORME