Closed
Bug 58624
Opened 24 years ago
Closed 24 years ago
SSL Stress Test fails on FreeBSD 3.5
Categories
(NSS :: Tools, defect, P3)
Tracking
(Not tracked)
RESOLVED
FIXED
3.2
People
(Reporter: lennox, Assigned: sonja.mirtitsch)
Details
The NSS 3.1 SSL Stress Tests fail for me on FreeBSD 3.5. The end of the output
of './ssl.sh stress' looks like this:
********************* Stress Test ****************************
********************* Stress SSL2 RC4 128 with MD5 ****************************
selfserv -p 8443 -d
/local/llennox/NSS-PSM/mozilla/tests_results/security/conrail.20/server -n
conrail.cs.columbia.edu -w nss -i /tmp/tests_pid.5505 &
strsclnt -p 8443 -d . -w nss -c 1000 -C A conrail.cs.columbia.edu
strsclnt: -- SSL: Server Certificate Validated.
strsclnt: PR_NewTCPSocket returned error -5974:
Insufficient system resources.
Terminated
********************* Stress SSL3 RC4 128 with MD5 ****************************
selfserv -p 8443 -d
/local/llennox/NSS-PSM/mozilla/tests_results/security/conrail.20/server -n
conrail.cs.columbia.edu -w nss -i /tmp/tests_pid.5505 &
strsclnt -p 8443 -d . -w nss -c 1000 -C c conrail.cs.columbia.edu
strsclnt: -- SSL: Server Certificate Validated.
strsclnt: PR_NewTCPSocket returned error -5974:
Insufficient system resources.
Terminated
Running ktrace on the process (ktrace is a system-call tracer, the equivalent of
Linux's strace) reveals that socket() failed with ENOBUFS after it was called
for the 953rd time for the first test, and it failed after the 27th time it was
called for the second test.
The failure is consistent, both for debug and optimized builds; I haven't tested
to see whether the count of socket() failures is consistent.
All the other NSS tests pass successfully.
Comment 1•24 years ago
Nelson, please take a look at this bug and reassign to
the appropriate person. Thanks.
Assignee: wtc → nelsonb
Comment 2•24 years ago
I see no indication of any error on NSS's part from this description.
It sounds like an OS kernel configuration problem on the
submitter's system. The stress test is just that. It stresses
the server by pounding it with SSL connections. Apparently this
test exhausts some kernel resource on the submitter's system.
The only change to NSS that might be beneficial to this test
would be to respond to this error by waiting and trying again
for some limited number of times, rather than immediately
treating it as a fatal error.
However, while such a change might make the test appear to pass,
it would merely be hiding a very serious problem, namely,
chronic system resource exhaustion.
So, I suggest that, in this case, the failure serves the useful
purpose of revealing the system problem, which needs to be
cured apart from any changes to NSS.
I'll leave this bug open for a few more days, to give others
a chance to persuade me that some NSS change would and should
solve this problem.
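The bounded retry described above can be sketched as follows. This is only an illustration in Python, not NSS or NSPR code; the function name, the default attempt count, the delay, and the set of errno values treated as transient are all assumptions made for the example.

```python
import errno
import socket
import time

# Errno values treated as "resource exhaustion" here (an assumption for
# this sketch; a real client would choose its own set).
TRANSIENT_ERRNOS = {errno.ENOBUFS, errno.EMFILE, errno.ENFILE, errno.ENOMEM}


def open_socket_with_retry(max_attempts=5, delay_seconds=0.5, factory=None):
    """Create a TCP socket, retrying a limited number of times.

    `factory` is a hypothetical hook that returns a new socket; it exists
    so the retry logic can be exercised without exhausting real resources.
    """
    if factory is None:
        factory = lambda: socket.socket(socket.AF_INET, socket.SOCK_STREAM)

    for attempt in range(1, max_attempts + 1):
        try:
            return factory()
        except OSError as exc:
            # Give up on unrelated errors, or once the retry budget is spent.
            if exc.errno not in TRANSIENT_ERRNOS or attempt == max_attempts:
                raise
            time.sleep(delay_seconds)
```

As noted above, a loop like this only lets the test survive the shortage; it does nothing about the underlying kernel limit.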
Reporter
Comment 3•24 years ago
Okay, some more investigation leads me to agree with you. What's happening is
that the TCP connections from the stress test stick around in TIME_WAIT for two
minutes; my kernel is only configured to support 1064 simultaneous open sockets,
which isn't enough for the 2K sockets opened by the stress test plus the 100 or
so normally in use on my system.
So I'd just suggest adding a note to the NSS test webpage to the effect of "The
SSL stress test opens 2,048 TCP connections in quick succession. Kernel data
structures may remain allocated for these connections for up to two minutes.
Some systems may not be configured to allow this many simultaneous connections
by default; if the stress tests fail, try increasing the number of simultaneous
sockets supported."
On FreeBSD, you can display the maximum number of simultaneous sockets with the command
sysctl kern.ipc.maxsockets
which on my system returns 1064.
It looks like this can be fixed with the kernel config option
options NMBCLUSTERS=[something-large]
or by increasing the 'maxusers' parameter.
It looks like more recent FreeBSD implementations still have this limitation,
and the same solutions apply, plus you can alternatively specify the maxsockets
parameter in the boot loader.
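The arithmetic behind this diagnosis can be checked with a short sketch. The figures (1064, 2048, and roughly 100) come from this comment; the function names are hypothetical, and the calculation assumes the worst case in which every socket from the test is still held in TIME_WAIT at the same moment.

```python
import subprocess


def sockets_needed(test_connections, baseline_in_use):
    """Worst-case simultaneous sockets: the test's plus those already open."""
    return test_connections + baseline_in_use


def shortfall(max_sockets, test_connections, baseline_in_use):
    """How many sockets the kernel limit falls short by (0 if it suffices)."""
    return max(0, sockets_needed(test_connections, baseline_in_use) - max_sockets)


def read_maxsockets():
    """Read kern.ipc.maxsockets via sysctl (FreeBSD only; fails elsewhere)."""
    result = subprocess.run(
        ["sysctl", "-n", "kern.ipc.maxsockets"],
        capture_output=True, text=True, check=True,
    )
    return int(result.stdout.strip())


# Figures reported in this comment: limit 1064, 2048 test sockets, ~100 in use.
print(shortfall(1064, 2048, 100))  # prints 1084
```

On a FreeBSD machine, `shortfall(read_maxsockets(), 2048, 100)` would apply the same check to the live limit. With about 100 sockets already in use, a limit of 1064 leaves room for roughly 964 more, which is close to the 953rd socket() call at which the first test failed.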
Comment 4•24 years ago
Thanks for your very useful explanation of how to fix this on FreeBSD.
I'm reassigning this to Sonja. She can update the test page.
Assignee: nelsonb → sonmi
Target Milestone: --- → 3.2
Assignee
Comment 5•24 years ago
Sent email to Scott Carver to update the web page.
I personally am not so sure that this is a great idea, because he might end up
having to describe a lot of kernel configurations and parameters.
This would be useful information in a README file or as a comment in the
source. We could also add the kernel parameters we had to change on HP and
(still have to change) on AIX.
Reporter
Comment 6•24 years ago
I think the webpage should just mention what the problem is, the reason for it,
and the general solution -- increase the size of your kernel's data structures.
(Roughly, the quoted text in the second paragraph of my 11/02 comment.) I could
slap together some appropriate text if this would be useful.
Specific instructions for each OS as to how to accomplish this should then be in
a README file in nss/tests/ssl or somewhere. (Probably just having it be a
comment in the source code would be too obscure.) The webpage can have a link
or a reference to this file.
Assignee
Updated•24 years ago
Status: NEW → ASSIGNED
Assignee
Comment 7•24 years ago
Checked the file platform_specific_problems into mozilla/security/nss/tests/doc,
containing this bug report and the changed HP kernel parameters. Anyone willing
to bring it into a more readable format is welcome to do so.
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED