Open
Bug 282496
Opened 21 years ago
Updated 3 years ago
Inconsistent cross platform behavior of PR_Shutdown
Categories
(NSPR :: NSPR, defect)
Tracking
(Not tracked)
NEW
People
(Reporter: nelson, Unassigned)
Details
A colleague has reported that when one thread is waiting on a read
(e.g. PR_Read or PR_Recv) on an NSPR socket, and another thread
calls PR_Shutdown on that socket, shutting it down for read,
the effect of the shutdown on the reading thread is not
consistent across NSPR platforms. It is reported that on some
platforms, this causes the PR_Recv/PR_Read to terminate, as if
EOF had been received at that moment. On others (including Solaris),
the PR_Read/PR_Recv continues to stay blocked.
This is an issue for JSS, which needs a consistent cross platform
way of causing blocked receiving threads to be unblocked and
receive EOF.
I believe that NSPR socket behavior should be consistent across
platforms with respect to this issue, even if the underlying OSes
are not. I consider this reported lack of consistency to be a bug.
Comment 1•21 years ago
It may be hard to make the behavior consistent
across platforms. Most applications don't need
it, and it may be expensive to implement.
Comment 2•21 years ago
In the current NSPR Unix implementation,
functions like PR_Read and PR_Recv block
in poll() for at most 5 seconds. We can
set a flag in the PRFileDesc structure
in PR_Shutdown, and have PR_Read and PR_Recv
check that flag when they time out from
poll. This solution is inexpensive.
But I will need one of you to implement it.
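The flag-and-timeout idea above could look roughly like the following sketch. It is written against plain POSIX sockets rather than NSPR internals, so demo_fd, demo_shutdown_rcv, demo_recv, and demo_run are hypothetical names standing in for PRFileDesc, PR_Shutdown, and PR_Recv. The 100 ms poll slice replaces the 5-second poll() timeout only to keep the demonstration short, and a local socketpair is assumed in place of a TCP connection.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <poll.h>
#include <pthread.h>
#include <stdatomic.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical stand-in for PRFileDesc: an OS fd plus a "read side shut down" flag. */
typedef struct {
    int fd;
    atomic_int rcv_shutdown;
} demo_fd;

/* Stand-in for PR_Shutdown(fd, PR_SHUTDOWN_RCV): set the flag, then shut down the OS socket. */
static int demo_shutdown_rcv(demo_fd *d)
{
    atomic_store(&d->rcv_shutdown, 1);
    return shutdown(d->fd, SHUT_RD);
}

/* Stand-in for PR_Recv: wait in poll() in bounded slices and check the flag on every pass. */
static ssize_t demo_recv(demo_fd *d, void *buf, size_t len, int slice_ms)
{
    for (;;) {
        if (atomic_load(&d->rcv_shutdown))
            return 0;                      /* report EOF even if the OS never woke us */
        struct pollfd p = { .fd = d->fd, .events = POLLIN, .revents = 0 };
        int n = poll(&p, 1, slice_ms);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            continue;                      /* slice timed out: loop back and re-check the flag */
        return recv(d->fd, buf, len, 0);
    }
}

typedef struct {
    demo_fd *d;
    ssize_t result;
} demo_reader;

static void *demo_reader_main(void *p)
{
    demo_reader *r = p;
    char buf[64];
    r->result = demo_recv(r->d, buf, sizeof buf, 100);
    return NULL;
}

static void demo_sleep_ms(long ms)
{
    struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

/* Blocks a reader, shuts the read side down from another thread, returns the reader's result. */
static ssize_t demo_run(void)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -2;
    demo_fd d;
    d.fd = sv[0];
    atomic_init(&d.rcv_shutdown, 0);
    demo_reader r = { &d, -3 };
    pthread_t t;
    if (pthread_create(&t, NULL, demo_reader_main, &r) != 0) {
        close(sv[0]);
        close(sv[1]);
        return -2;
    }
    demo_sleep_ms(200);                    /* give the reader time to block in poll() */
    demo_shutdown_rcv(&d);
    pthread_join(t, NULL);
    close(sv[0]);
    close(sv[1]);
    return r.result;
}
```

Calling demo_run() should yield an EOF-style return from the reader whether or not the operating system wakes poll() on shutdown, because the flag is re-checked after every slice. The cost is that the reader may take up to one slice to notice the shutdown.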
Comment 3•21 years ago
Thanks Wan-Teh. Assigning bug to myself, to implement.
Assignee: wtchang → glen.beasley
Comment 4•20 years ago
On Windows, when an SSLSocket is blocked on Read, execution is blocked
in ntio.c in the function _PR_MD_WAIT
on rv = WaitForSingleObject(thread->md.blocked_sema, msecs);
In this case msecs is set to INFINITE:
calling _PR_MD_SHUTDOWN with how == PR_SHUTDOWN_RCV the
wait on the semaphore is not released
but
calling _PR_MD_SHUTDOWN with how == PR_SHUTDOWN_BOTH the
wait on the semaphore is released.
The goal for bug 282732 is for Socket.close to interrupt threads blocked in
I/O, which can be accomplished by:
    if (ioRead || ioWrite) {
        shutdownNativeLow(SocketBase.PR_SHUTDOWN_BOTH);
    }
So this bug no longer blocks 282732.
JSS does have the methods shutdownInput() and shutdownOutput(), and I will
open a bug to discuss the proper behavior for these two methods. Note that
in all versions of the JDK, shutdownInput and shutdownOutput are not supported in SSLSocket.
Test output:
java.lang.UnsupportedOperationException: The method shutdownInput() is not
supported in SSLSocket
Comment 5•20 years ago
Glen,
Can you fix JSS bug 282732 without any NSPR changes?
What's this shutdownNativeLow function you referred to?
I can't find it in JSS. Did you really mean shutdownNative?
Does JSS use real NSPR sockets, or Java sockets wrapped
in PRFileDesc? It seems that JSS can use both:
http://lxr.mozilla.org/security/source/security/jss/org/mozilla/jss/ssl/common.c#142
If Java sockets are used, they should already have
the desired "close" semantics, right?
Comment 6•20 years ago
I can fix bug 282732 without an NSPR fix. If we want to implement the methods
shutdownInput and shutdownOutput, then an NSPR fix would be needed, but this is
low priority since no version of the JDK supports shutdownInput and
shutdownOutput. The JDK only supports shutting down both read and write I/O on a call
to close, which is the behaviour 282732 will implement when I complete the fix.
shutdownNativeLow is a new function that I will be adding in the fix for 282732.
I am just cleaning up 282732, and will attach the fix soon.
>Does JSS use real NSPR sockets, or Java sockets wrapped
>in PRFileDesc? It seems that JSS can use both:
http://lxr.mozilla.org/security/source/security/jss/org/mozilla/jss/ssl/common.c#142
I believe 99% of the time users of JSS use NSPR sockets. I have yet
to see any code using Java sockets.
SSLServerSocket.accept() creates a NSPR JSS SSLSocket. So all
SSLServerSocket usage is NSPR sockets.
Only clients using SSLSocket can create a Java socket and only
one constructor allows such creation.
public SSLSocket(java.net.Socket s, String host,
SSLCertificateApprovalCallback certApprovalCallback,
SSLClientCertificateSelectionCallback clientCertSelectionCallback)
throws IOException
javasock.c handles Java sockets and has its own PRIOMethods.
We have no sample programs or QA programs using this constructor.
I almost think we should add a comment to this constructor warning that
usage of SSLSocket with Java sockets is not well tested or used.
Reporter
Comment 7•20 years ago
Glen, in reply to comment 4,
> but calling _PR_MD_SHUTDOWN with how == PR_SHUTDOWN_BOTH the
> wait on the semaphore is released.
IIRC, we determined that what is actually happening is that the
shutdown both causes a TCP FIN to be sent to the peer, and then
(in the test setup) the peer closes the connection, which in turn
causes the local socket to see an EOF, and it is this EOF that
causes the threads to become unblocked. IIRC, we determined that
if the remote process blocks for (say) 30 seconds before closing
the socket when it receives the FIN, then the local socket also
does not become unblocked for 30 seconds.
Am I recalling correctly?
If so, then IMO we need a solution that does not depend on the
remote system closing the socket first.
Comment 8•20 years ago
Reply to comment 7:
It is the case that with PR_SHUTDOWN_BOTH or PR_SHUTDOWN_SEND an EOF is sent:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/winsock/winsock/socket_2.asp
"If the how parameter is SD_SEND, subsequent calls to the send function are
disallowed. For TCP sockets, a FIN will be sent"
On Unix platforms, PR_Shutdown(fd, PR_SHUTDOWN_RCV) will unblock
the reader, but on Windows the reader stays blocked.
On Windows we need to use PR_Interrupt. PR_Interrupt unblocks the reader, but
the NSPR threading model is corrupted and the program will crash 90 percent of
the time: 50 percent of the time when the program does a PR_Close of the
socket, 40 percent of the time at exit of my test program, and 10 percent it
does crash.
I created nspr bug 288232 for the PR_Interrupt issue.
I changed the OS to Windows NT for this bug.
OS: Solaris → Windows NT
Comment 9•20 years ago
(In reply to comment #8)
> On Windows we need to use PR_Interrupt. PR_Interrupt does unblock the reader but
> the NSPR threading model is corrupted and the program will crash 90 percent of
> the time. 50 percent of the time when the program does a PR_Close of the
> socket, 40 percent of the time at exit of my test program, and 10 percent it
> does crash.
I meant 10 percent does not crash.
>
> I created nspr bug 288232 for the PR_Interrupt issue.
> I changed the OS to Windows NT for this bug.
Updated•19 years ago
QA Contact: wtchang → nspr
Comment 10•19 years ago
This bug was created while working on bug 282732, which is solved.
Closing this bug as WONTFIX.
Status: NEW → RESOLVED
Closed: 19 years ago
Resolution: --- → WONTFIX
Reporter
Comment 11•19 years ago
This bug was originally reported against Solaris, then was shown to also be
inconsistent with NSPR on Windows NT.
While JSS no longer requires that this bug be fixed, this bug remains a
true platform inconsistency across supported NSPR platforms.
A fix for this bug has been outlined (see comment 2), and I see no reason
for us to refuse to fix it.
Re: comment 8, this bug is about the effect of PR_Shutdown on the receiving
side of the socket, not on the sending side of the socket. It concerns the
effect of shutdown on outstanding reads, not writes.
Re: comment 9, the "corrupted threading model" allegation was disproven in
bug 288232.
If we were to choose to refuse to fix it, then we need to document it in a
public web page of known deficiencies of NSPR's cross-platform consistency.
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
Reporter
Comment 12•19 years ago
I think the next step for this bug is to produce a simple C test program
that can reproduce the problem. The program should become part of our
cross-platform NSPR test suite.
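As a starting point, here is a hedged sketch of what such a reproduction might probe. It is an OS-level analogue using raw POSIX calls, not the NSPR test itself: recv() and shutdown(fd, SHUT_RD) stand in for PR_Recv and PR_Shutdown(fd, PR_SHUTDOWN_RCV), a local socketpair is assumed in place of a TCP connection (TCP may behave differently), and all probe_* names are hypothetical. A 2-second SO_RCVTIMEO is assumed as a safety net so the probe terminates even on platforms where the reader stays blocked.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <pthread.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Possible outcomes of the probe (hypothetical names). */
enum probe_result {
    PROBE_ERROR = -1,          /* setup failed or recv() failed for another reason */
    PROBE_EOF = 0,             /* shutdown woke the reader, which saw EOF */
    PROBE_STAYED_BLOCKED = 1   /* reader only returned when the safety timeout fired */
};

typedef struct {
    int fd;
    ssize_t n;
    int err;
} probe_reader;

/* Reader thread: block in recv() with no data ever arriving. */
static void *probe_reader_main(void *p)
{
    probe_reader *r = p;
    char buf[16];
    r->n = recv(r->fd, buf, sizeof buf, 0);
    r->err = (r->n < 0) ? errno : 0;
    return NULL;
}

static void probe_sleep_ms(long ms)
{
    struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

/* Blocks a reader, shuts down the read side from this thread, and classifies the result. */
static enum probe_result probe_shutdown_rcv(void)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return PROBE_ERROR;

    /* Safety net: the blocked recv() gives up after 2 seconds. */
    struct timeval tv = { .tv_sec = 2, .tv_usec = 0 };
    probe_reader r = { sv[0], -1, 0 };
    pthread_t t;
    if (setsockopt(sv[0], SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv) != 0 ||
        pthread_create(&t, NULL, probe_reader_main, &r) != 0) {
        close(sv[0]);
        close(sv[1]);
        return PROBE_ERROR;
    }

    probe_sleep_ms(200);               /* let the reader block first */
    shutdown(sv[0], SHUT_RD);          /* the call whose effect is under test */
    pthread_join(t, NULL);
    close(sv[0]);
    close(sv[1]);

    if (r.n == 0)
        return PROBE_EOF;
    if (r.n < 0 && (r.err == EAGAIN || r.err == EWOULDBLOCK))
        return PROBE_STAYED_BLOCKED;
    return PROBE_ERROR;
}
```

A test driver could call probe_shutdown_rcv() on each platform and record which of the two non-error outcomes occurs. An NSPR version of the test would replace the raw socket calls with their NSPR counterparts and fail on any platform that reports the blocked outcome.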
Assignee: glen.beasley → nobody
Status: REOPENED → NEW
Updated•18 years ago
Assignee: nobody → julien.pierre.boogz
Priority: -- → P2
Hardware: Sun → PC
Reporter
Comment 13•18 years ago
There is some confusion about which OSes are affected by this bug.
It includes BOTH Solaris and Windows, and possibly others.
There is now evidence that bug 282732 was not actually fixed,
and I continue to believe that this bug is part of the cause.
OS: Windows NT → All
Comment 15•3 years ago
The bug assignee didn't login in Bugzilla in the last 7 months and this bug has priority 'P2'.
:KaiE, could you have a look please?
For more information, please visit auto_nag documentation.
Assignee: nelson → nobody
Flags: needinfo?(kaie)
Updated•3 years ago
Severity: normal → S3