Open
Bug 334961
Opened 18 years ago
Updated 1 year ago
strsclnt: PR_Connect: Local Network address is in use
Categories
(NSS :: Test, defect, P3)
Tracking
(Not tracked)
NEW
People
(Reporter: Sandeep.Konchady, Unassigned)
Details
Attachments
(2 files, 1 obsolete file)
4.62 KB,
patch
|
KaiE
:
review-
|
Details | Diff | Splinter Review |
4.34 KB,
patch
|
KaiE
:
review-
|
Details | Diff | Splinter Review |
When running all.sh on WinXP with OS_TARGET=WIN95 the following SSL test fails. This does not seem to be caught by the script and is not reported as a failure. ssl.sh: Stress SSL3 ECDHE-RSA AES 128 CBC with SHA ---- selfserv -D -p 8443 -d ../server -n WINXP.red.iplanet.com \ -e WINXP.red.iplanet.com-ec -w nss -c :C013 -i ../tests_pid.168 & selfserv started at Thu Apr 20 14:48:06 PST 2006 tstclnt -p 8443 -h WINXP.red.iplanet.com -B -s -q \ -d ../client < C:/export/tinderbox/WINNT_5.1_Depend/mozilla/security/nss/tests/ssl/sslreq.dat strsclnt -q -p 8443 -d ../client -B -s -w nss -c 1000 -C :C013 -T \ WINXP.red.iplanet.com strsclnt started at Thu Apr 20 14:48:06 PST 2006 strsclnt: 958 cache hits; 1 cache misses, 0 cache not reusable strsclnt: -- SSL: Server Certificate Validated. strsclnt: 0 cache hits; 1 cache misses, 0 cache not reusable NEXT ERROR strsclnt: PR_Connect returned error -5982: Local Network address is in use. NEXT ERROR strsclnt: PR_Connect returned error -5982: Local Network address is in use. NEXT ERROR strsclnt: PR_Connect returned error -5982: Local Network address is in use. NEXT ERROR strsclnt: PR_Connect returned error -5982: Local Network address is in use. NEXT ERROR strsclnt: PR_Connect returned error -5982: Local Network address is in use. NEXT ERROR strsclnt: PR_Connect returned error -5982: Local Network address is in use. NEXT ERROR strsclnt: PR_Connect returned error -5982: Local Network address is in use. NEXT ERROR strsclnt: PR_Connect returned error -5982: Local Network address is in use. NEXT ERROR strsclnt: PR_Connect returned error -5982: Local Network address is in use. NEXT ERROR strsclnt: PR_Connect returned error -5982: Local Network address is in use. NEXT ERROR strsclnt: PR_Connect returned error -5982: Local Network address is in use. NEXT ERROR strsclnt: PR_Connect returned error -5982: Local Network address is in use. NEXT ERROR strsclnt: PR_Connect returned error -5982: Local Network address is in use. NEXT ERROR strsclnt: PR_Connect returned error -5982: Local Network address is in use. NEXT ERROR strsclnt: PR_Connect returned error -5982: Local Network address is in use. NEXT ERROR strsclnt: PR_Connect returned error -5982: Local Network address is in use. NEXT ERROR strsclnt: PR_Connect returned error -5982: Local Network address is in use. NEXT ERROR strsclnt: PR_Connect returned error -5982: Local Network address is in use. NEXT ERROR strsclnt: PR_Connect returned error -5982: Local Network address is in use. NEXT ERROR strsclnt: PR_Connect returned error -5982: Local Network address is in use. NEXT ERROR strsclnt: PR_Connect returned error -5982: Local Network address is in use. NEXT ERROR strsclnt: PR_Connect returned error -5982: Local Network address is in use. NEXT ERROR strsclnt: PR_Connect returned error -5982: Local Network address is in use. NEXT ERROR strsclnt: PR_Connect returned error -5982: Local Network address is in use. NEXT ERROR strsclnt: PR_Connect returned error -5982: Local Network address is in use. NEXT ERROR strsclnt: PR_Connect returned error -5982: Local Network address is in use. NEXT ERROR strsclnt: PR_Connect returned error -5982: Local Network address is in use. NEXT ERROR strsclnt: PR_Connect returned error -5982: Local Network address is in use. NEXT ERROR strsclnt: PR_Connect returned error -5982: Local Network address is in use. NEXT ERROR strsclnt: PR_Connect returned error -5982: Local Network address is in use. NEXT ERROR strsclnt: PR_Connect returned error -5982: Local Network address is in use. NEXT ERROR strsclnt: PR_Connect returned error -5982: Local Network address is in use. NEXT ERROR strsclnt: PR_Connect returned error -5982: Local Network address is in use. NEXT ERROR strsclnt: PR_Connect returned error -5982: Local Network address is in use. NEXT ERROR strsclnt: PR_Connect returned error -5982: Local Network address is in use. NEXT ERROR strsclnt: PR_Connect returned error -5982: Local Network address is in use. NEXT ERROR strsclnt: PR_Connect returned error -5982: Local Network address is in use. NEXT ERROR strsclnt: PR_Connect returned error -5982: Local Network address is in use. NEXT ERROR strsclnt: PR_Connect returned error -5982: Local Network address is in use. NEXT ERROR strsclnt: PR_Connect returned error -5982: Local Network address is in use. NEXT ERROR strsclnt: PR_Connect returned error -5982: Local Network address is in use. strsclnt completed at Thu Apr 20 14:48:12 PST 2006 ssl.sh: Stress SSL3 ECDHE-RSA AES 128 CBC with SHA produced a returncode of 0, expected is 0. PASSED
Reporter | ||
Comment 1•18 years ago
|
||
This not only happens for OS_TARGET=WIN95, but for normal builds too. System details are: OS : Microsoft Windows XP Professional 2002 SP2 CPU : Intel Pentium III 728 MHz RAM : 512 MB Compiler: Microsoft Visual Studio .NET 2003 ssl.sh: Stress SSL3 ECDHE-RSA AES 128 CBC with SHA ---- selfserv -D -p 8443 -d ../server -n WINXP.red.iplanet.com \ -e WINXP.red.iplanet.com-ec -w nss -c :C013 -i ../tests_pid.2580 & selfserv started at Fri Apr 21 10:43:06 PST 2006 tstclnt -p 8443 -h WINXP.red.iplanet.com -B -s -q \ -d ../client < C:/export/tinderbox/WINNT_5.1_Depend/mozilla/security/nss/tests/ssl/sslreq.dat strsclnt -q -p 8443 -d ../client -B -s -w nss -c 1000 -C :C013 -T \ WINXP.red.iplanet.com strsclnt started at Fri Apr 21 10:43:07 PST 2006 strsclnt: 958 cache hits; 1 cache misses, 0 cache not reusable strsclnt: -- SSL: Server Certificate Validated. strsclnt: 0 cache hits; 1 cache misses, 0 cache not reusable strsclnt: PR_Connect returned error -5982: Local Network address is in use. strsclnt: PR_Connect returned error -5982: Local Network address is in use. strsclnt: PR_Connect returned error -5982: Local Network address is in use. strsclnt: PR_Connect returned error -5982: Local Network address is in use. strsclnt: PR_Connect returned error -5982: Local Network address is in use. strsclnt: PR_Connect returned error -5982: Local Network address is in use. strsclnt: PR_Connect returned error -5982: Local Network address is in use. strsclnt: PR_Connect returned error -5982: Local Network address is in use. strsclnt: PR_Connect returned error -5982: Local Network address is in use. strsclnt: PR_Connect returned error -5982: Local Network address is in use. strsclnt: PR_Connect returned error -5982: Local Network address is in use. strsclnt: PR_Connect returned error -5982: Local Network address is in use. strsclnt: PR_Connect returned error -5982: Local Network address is in use. strsclnt: PR_Connect returned error -5982: Local Network address is in use. strsclnt: PR_Connect returned error -5982: Local Network address is in use. strsclnt: PR_Connect returned error -5982: Local Network address is in use. strsclnt: PR_Connect returned error -5982: Local Network address is in use. strsclnt: PR_Connect returned error -5982: Local Network address is in use. strsclnt: PR_Connect returned error -5982: Local Network address is in use. strsclnt: PR_Connect returned error -5982: Local Network address is in use. strsclnt: PR_Connect returned error -5982: Local Network address is in use. strsclnt: PR_Connect returned error -5982: Local Network address is in use. strsclnt: PR_Connect returned error -5982: Local Network address is in use. strsclnt: PR_Connect returned error -5982: Local Network address is in use. strsclnt: PR_Connect returned error -5982: Local Network address is in use. strsclnt: PR_Connect returned error -5982: Local Network address is in use. strsclnt: PR_Connect returned error -5982: Local Network address is in use. strsclnt: PR_Connect returned error -5982: Local Network address is in use. strsclnt: PR_Connect returned error -5982: Local Network address is in use. strsclnt: PR_Connect returned error -5982: Local Network address is in use. strsclnt: PR_Connect returned error -5982: Local Network address is in use. strsclnt: PR_Connect returned error -5982: Local Network address is in use. strsclnt: PR_Connect returned error -5982: Local Network address is in use. strsclnt: PR_Connect returned error -5982: Local Network address is in use. strsclnt: PR_Connect returned error -5982: Local Network address is in use. strsclnt: PR_Connect returned error -5982: Local Network address is in use. strsclnt: PR_Connect returned error -5982: Local Network address is in use. strsclnt: PR_Connect returned error -5982: Local Network address is in use. strsclnt: PR_Connect returned error -5982: Local Network address is in use. strsclnt: PR_Connect returned error -5982: Local Network address is in use. strsclnt: PR_Connect returned error -5982: Local Network address is in use. strsclnt completed at Fri Apr 21 10:43:12 PST 2006 ssl.sh: Stress SSL3 ECDHE-RSA AES 128 CBC with SHA produced a returncode of 0, expected is 0. PASSED
Summary: strsclnt on WinXP with OS_TARGET=WIN95 throw error → strsclnt on WinXP Professional throw error running ssl tests
Comment 2•18 years ago
|
||
We've seen this error in selfserv, and understand why that occurs. But this error is coming from strsclnt. The single most important issue here is that strsclnt reports this as a succesful run, when it clearly is not.
Summary: strsclnt on WinXP Professional throw error running ssl tests → strsclnt: PR_Connect: Local Network address is in use
Target Milestone: --- → 3.11.1
Version: 3.12 → 3.11
Updated•18 years ago
|
Priority: -- → P3
Updated•17 years ago
|
Target Milestone: 3.11.1 → ---
Comment 3•16 years ago
|
||
I noticed this error when I tested TLS session tickets on Windows. At first I thought this error only occurs when session tickets are enabled. Then I discovered that it also occurs when session tickets are disabled, it's just not reported as a failure (see Nelson's comment 2). It is the Winsock connect() function that fails with the WSAEADDRINUSE error, so TLS/SSL is not involved. I did a search for this error and it seems that other people have seen it, too: http://www.cygwin.com/ml/cygwin-developers/2001-11/msg00080.html http://www.fastcgi.com/archives/fastcgi-developers/2001-February/000933.html http://tech.groups.yahoo.com/group/new-httpd/message/47305 http://ureader.com/msg/165626.aspx
Assignee: nobody → wtc
Target Milestone: --- → 3.12
Comment 4•16 years ago
|
||
Attachment #301414 -
Flags: review?(nelson)
Comment 5•16 years ago
|
||
Wan-Teh, Shouldn't we fix this problem in NSPR instead ?
Comment 6•16 years ago
|
||
Wan-Teh, how does ignoring this error fix it? Doesn't this merely mask the problem? Is the problem transient? Or does it persist once it occurs? Is an immediately retry likely to experience the same error? or not?? If so, isn't this patch likely to create a tight loop, retrying failed connections as fast as possible?
Comment 7•16 years ago
|
||
There is probably no good solution to this problem. http://support.microsoft.com/kb/319502 http://msdn2.microsoft.com/en-us/library/aa560610.aspx http://www.velocityreviews.com/forums/t368277-clientside-tcp-socket-receiving-quotaddress-already-in-usequot-upon-connect.html Let me call this patch a proposed approach because we can change the details but the basic idea is the same: retry PR_Connect after the PR_ADDRESS_IN_USE_ERROR. Note that I also changed strsclnt to require the specified number of SSL connections to me made. The second approach is to keep track of how many times we get the PR_ADDRESS_IN_USE_ERROR, and subtract that from the number of SSL connections required for strsclnt to return successfully. This approach still need to retry PR_Connect in tstclnt though. (Yes, tstclnt is also affected by this error.)
Attachment #301414 -
Attachment is obsolete: true
Attachment #301414 -
Flags: review?(nelson)
Comment 8•15 years ago
|
||
Unsetting target milestone in unresolved bugs whose targets have passed.
Target Milestone: 3.12 → ---
Comment 9•10 years ago
|
||
Brian: I did a search for "PR_Connect PR_ADDRESS_IN_USE_ERROR" in Bugzilla and found this old bug. So the strsclnt failures you saw are not caused by recent changes in NSS 3.15.4. Apparently I researched this issue thoroughly and even wrote a patch, but I didn't remember a thing about this yesterday. It's amazing how my brain can page out unneeded info :-) Please feel free to review my patch. Note Julien and Nelson's comments in comment 5 and comment 6 if you review the patch. Fixing this bug in NSPR seems hard if we can't simply retrying connect on the OS socket.
Flags: needinfo?
Updated•10 years ago
|
Flags: needinfo?
Updated•10 years ago
|
Attachment #301692 -
Flags: review?(brian)
Comment 10•10 years ago
|
||
Comment on attachment 301692 [details] [diff] [review] Proposed approach 1: ignore PR_ADDR_IN_USE_ERROR from PR_Connect r=kaie but it needs merging
Attachment #301692 -
Flags: review?(brian) → review+
Comment 11•10 years ago
|
||
merged to trunk.
Comment 12•10 years ago
|
||
The merged patch worked for me locally (linux). Checked in: https://hg.mozilla.org/projects/nss/rev/014577d91a1a Let's see if it works.
Comment 13•10 years ago
|
||
Status with the patch checked in: - win 2008 (7) is good - win 2003 (previously unaffected) fails a few times with: strsclnt: PR_Connect returned error -5990, OS error 10060: I/O operation timed out - win xp no longer shows the original failure (-5982), instead, it fails a few times with the same failure that was introduced on win 2000
Comment 14•10 years ago
|
||
-5990 means PR_IO_TIMEOUT_ERROR
Comment 15•10 years ago
|
||
Kai: thank you for the update. I also saw these errors. The WSAETIMEDOUT (10060) is worse than WSAEADDRINUSE because WSAETIMEDOUT takes longer to fail. Since we can't eliminate PR_Connect failures completely, I think we should undo the retry on WSAEADDRINUSE, and then implement the second approach in comment 7. Do you have time to work on that? strsclnt.c doesn't seem to have a counter for successful connections. So we'll need to add one. ('numConnected' is a counter for current connections, so it is decremented when connections are closed.)
Comment 16•10 years ago
|
||
(In reply to Kai Engert (:kaie) from comment #12) > Checked in: > https://hg.mozilla.org/projects/nss/rev/014577d91a1a Backed out: https://hg.mozilla.org/projects/nss/rev/f13d752507cc
Comment 17•10 years ago
|
||
Comment on attachment 301692 [details] [diff] [review] Proposed approach 1: ignore PR_ADDR_IN_USE_ERROR from PR_Connect marking as r- because it doesn't work
Attachment #301692 -
Flags: review+ → review-
Updated•10 years ago
|
Attachment #8403955 -
Flags: review-
Comment 18•8 years ago
|
||
Is it possible that we could just reduce the number of iterations on Windows and maybe only for CI? The tests won't be as good, but we won't use as many ports. I see that we could increase the number of ephemeral ports and the length of TIME_WAIT on the build machines: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters Value Name: MaxUserPort Data Type: REG_DWORD Value: 65534 (for example) Valid Range: 5000-65534 (decimal) Default: 0x1388 (5000 decimal) HKEY_LOCAL_MACHINE\System\CurrectControlSet\services\Tcpip\Parameters Value Name: TcpTimedWaitDelay Value Type: REG_DWORD-time in seconds Valid Range: 30-300 (decimal) Default: 0xF0 (240 decimal) That might help a little, but even then 65k over 30s seems like it might be possible. I also think that we should try to re-land Wan-Teh's retry patch. I suspect that the timeouts are only due to the increase in the number of sockets that it creates.
Updated•2 years ago
|
Severity: normal → S3
Comment 19•1 year ago
|
||
The bug assignee is inactive on Bugzilla, so the assignee is being reset.
Assignee: wtc → nobody
You need to log in
before you can comment on or make changes to this bug.
Description
•