282496 - Inconsistent cross platform behavior of PR_Shutdown

Reporter

Description

•

21 years ago

A colleague has reported that when one thread is waiting on a read (e.g. PR_Read or PR_Recv) on an NSPR socket, and another thread calls PR_Shutdown on that socket, shutting it down for read, the effect of the shutdown on the reading thread is not consistent across NSPR platforms. It is reported that on some platforms this causes the PR_Recv/PR_Read to terminate, as if EOF had been received at that moment. On others (including Solaris), the PR_Read/PR_Recv stays blocked.

This is an issue for JSS, which needs a consistent cross-platform way of causing blocked receiving threads to be unblocked and receive EOF. I believe that NSPR socket behavior should be consistent across platforms with respect to this issue, even if the underlying OSes are not. I consider this reported lack of consistency to be a bug.

glen beasley

Updated

•

21 years ago

Blocks: 282732

Wan-Teh Chang

Comment 1

•

21 years ago

It may be hard to make the behavior consistent across platforms. Most applications don't need it, and it may be expensive to implement.

Wan-Teh Chang

Comment 2

•

21 years ago

In the current NSPR Unix implementation, functions like PR_Read and PR_Recv block in poll() for at most 5 seconds. We can set a flag in the PRFileDesc structure in PR_Shutdown, and have PR_Read and PR_Recv check that flag when they time out from poll. This solution is inexpensive. But I will need one of you to implement it.
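The flag-plus-poll-timeout idea above can be sketched roughly as follows. This is only an illustration using plain POSIX calls, not NSPR code: demo_fd, demo_shutdown_read, demo_recv and recv_after_shutdown_demo are hypothetical names, the struct is a stand-in for PRFileDesc, the socket is an AF_UNIX socketpair, and the poll slice is shortened from the 5 seconds mentioned above to 100 ms so the demo finishes quickly.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <pthread.h>
#include <stdatomic.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical stand-in for PRFileDesc: an OS socket plus a shutdown flag. */
typedef struct {
    int fd;
    atomic_int rd_shutdown;       /* set by demo_shutdown_read() */
} demo_fd;

enum { POLL_SLICE_MS = 100 };     /* assumption: shortened from 5 s for the demo */

/* Record the shutdown for blocked readers, then shut the OS socket down. */
static int demo_shutdown_read(demo_fd *d) {
    atomic_store(&d->rd_shutdown, 1);
    return shutdown(d->fd, SHUT_RD);
}

/* Blocking read that wakes at least every POLL_SLICE_MS to check the flag.
 * Returns bytes read, 0 for EOF (real or simulated), -1 on error. */
static ssize_t demo_recv(demo_fd *d, void *buf, size_t len) {
    assert(d != NULL && buf != NULL);
    for (;;) {
        if (atomic_load(&d->rd_shutdown)) return 0;      /* simulated EOF */
        struct pollfd p = { .fd = d->fd, .events = POLLIN, .revents = 0 };
        int n = poll(&p, 1, POLL_SLICE_MS);
        if (n < 0 && errno != EINTR) return -1;
        if (n <= 0) continue;                            /* timeout: re-check flag */
        if (atomic_load(&d->rd_shutdown)) return 0;
        return recv(d->fd, buf, len, 0);
    }
}

typedef struct { demo_fd *d; ssize_t result; } demo_reader_arg;

static void *demo_reader_main(void *arg) {
    demo_reader_arg *r = arg;
    char buf[16];
    r->result = demo_recv(r->d, buf, sizeof buf);
    return NULL;
}

/* Blocks a reader thread, shuts the socket down for read from another
 * thread, and returns what the reader saw (-1 on setup failure). */
static ssize_t recv_after_shutdown_demo(void) {
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return -1;
    demo_fd d = { .fd = sv[0] };
    atomic_init(&d.rd_shutdown, 0);
    demo_reader_arg r = { &d, -2 };
    pthread_t t;
    if (pthread_create(&t, NULL, demo_reader_main, &r) != 0) {
        close(sv[0]); close(sv[1]);
        return -1;
    }
    struct timespec ts = { 0, 50 * 1000000L };           /* let the reader block */
    nanosleep(&ts, NULL);
    demo_shutdown_read(&d);
    pthread_join(t, NULL);
    close(sv[0]); close(sv[1]);
    return r.result;
}
```

With this arrangement the reader should see EOF within one poll slice of the shutdown call, whether or not the OS itself wakes the poller. The cost is one extra flag check per wakeup.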

glen beasley

Comment 3

•

21 years ago

Thanks Wan-Teh. Assigning bug to myself to implement.

Assignee: wtchang → glen.beasley

glen beasley

Comment 4

•

20 years ago

On Windows, when an SSLSocket is blocked on Read, the execution is blocked in ntio.c in the function _PR_MD_WAIT on

    rv = WaitForSingleObject(thread->md.blocked_sema, msecs);

In this case msecs is set to INFINITE. Calling _PR_MD_SHUTDOWN with how == PR_SHUTDOWN_RCV does not release the wait on the semaphore, but calling _PR_MD_SHUTDOWN with how == PR_SHUTDOWN_BOTH does release it.

The goal for bug 282732 is for Socket.close to interrupt threads blocked in I/O, which can be accomplished by:

    if (ioRead || ioWrite) {
        shutdownNativeLow(SocketBase.PR_SHUTDOWN_BOTH);
    }

So this bug no longer blocks 282732.

JSS does have methods for shutdownInput() and shutdownOutput(), and I will open a bug to discuss the proper behavior for these two methods. Note that in all versions of the JDK, shutdownInput and shutdownOutput are not supported. Test output:

    java.lang.UnsupportedOperationException: The method shutdownInput() is not supported in SSLSocket

Wan-Teh Chang

Comment 5

•

20 years ago

Glen, can you fix JSS bug 282732 without any NSPR changes?

What's this shutdownNativeLow function you referred to? I can't find it in JSS. Did you really mean shutdownNative?

Does JSS use real NSPR sockets, or Java sockets wrapped in PRFileDesc? It seems that JSS can use both:
http://lxr.mozilla.org/security/source/security/jss/org/mozilla/jss/ssl/common.c#142

If Java sockets are used, they should already have the desired "close" semantics, right?

glen beasley

Comment 6

•

20 years ago

I can fix bug 282732 without an NSPR fix. If we want to implement the methods shutdownInput and shutdownOutput, then an NSPR fix would be needed, but this is low priority since no version of the JDK supports shutdownInput and shutdownOutput. The JDK only supports shutdown of both read and write I/O on a call to close, which is the behaviour 282732 will implement when I complete the fix.

shutdownNativeLow is a new function that I will be adding in the fix for 282732. I am just cleaning up 282732 and will attach the fix soon.

> Does JSS use real NSPR sockets, or Java sockets wrapped
> in PRFileDesc? It seems that JSS can use both:
> http://lxr.mozilla.org/security/source/security/jss/org/mozilla/jss/ssl/common.c#142

I believe 99% of the time users of JSS use NSPR sockets. I have yet to see any code using Java sockets.

SSLServerSocket.accept() creates an NSPR JSS SSLSocket, so all SSLServerSocket usage is NSPR sockets. Only clients using SSLSocket can create a Java socket, and only one constructor allows such creation:

    public SSLSocket(java.net.Socket s, String host,
        SSLCertificateApprovalCallback certApprovalCallback,
        SSLClientCertificateSelectionCallback clientCertSelectionCallback)
        throws IOException

javasock.c handles Java sockets and has its own PRIOMethods.

We have no sample programs or QA programs using this constructor. I almost think we should add a comment to this constructor warning that usage of SSLSocket with Java sockets is not well tested or used.

glen beasley

Updated

•

20 years ago

No longer blocks: 282732

Nelson Bolyard (seldom reads bugmail)

Reporter

Comment 7

•

20 years ago

Glen, in reply to comment 4,

> but calling _PR_MD_SHUTDOWN with how == PR_SHUTDOWN_BOTH the
> wait on the semaphore is released.

IIRC, we determined that what is actually happening is that the shutdown of both directions causes a TCP FIN to be sent to the peer, and then (in the test setup) the peer closes the connection, which in turn causes the local socket to see an EOF, and it is this EOF that causes the threads to become unblocked.

IIRC, we determined that if the remote process blocks for (say) 30 seconds before closing the socket when it receives the FIN, then the local socket also does not become unblocked for 30 seconds. Am I recalling correctly?

If so, then IMO we need a solution that does not depend on the remote system closing the socket first.

glen beasley

Comment 8

•

20 years ago

Reply to comment 7:

It is the case that with PR_SHUTDOWN_BOTH or PR_SHUTDOWN_SEND an EOF is sent:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/winsock/winsock/socket_2.asp
"If the how parameter is SD_SEND, subsequent calls to the send function are disallowed. For TCP sockets, a FIN will be sent"

On Unix platforms, PR_Shutdown(fd, PR_SHUTDOWN_RCV); will unblock the reader, but on Windows the reader stays blocked. On Windows we need to use PR_Interrupt, and PR_Interrupt unblocks the reader, but the NSPR threading model is corrupted and the program will crash 90 percent of the time: 50 percent of the time when the program does a PR_Close of the socket, 40 percent of the time at exit of my test program, and 10 percent it does crash.

I created NSPR bug 288232 for the PR_Interrupt issue. I changed the OS to Windows NT for this bug.

OS: Solaris → Windows NT

glen beasley

Comment 9

•

20 years ago

(In reply to comment #8)
> On windows we need to use PR_Interrupt. PR_Interrupt does unblock the reader but
> the NSPR threading model is corrupted and the program will crash 90 percent of
> the time. 50 percent of the time when the program does a PR_Close of the
> socket, 40 percent of the time at exit of my test program, and 10 percent it
> does crash.

I meant 10 percent does not crash.

> I created nspr bug 288232 for the PR_Interrupt issue.
> I changed the OS to Windows NT for this bug.

Reed Loden [:reed]

Updated

•

19 years ago

QA Contact: wtchang → nspr

glen beasley

Comment 10

•

19 years ago

This bug was created while working on bug 282732, which is solved. Closing this bug as WONTFIX.

Status: NEW → RESOLVED

Closed: 19 years ago

Resolution: --- → WONTFIX

Nelson Bolyard (seldom reads bugmail)

Reporter

Comment 11

•

19 years ago

This bug was originally reported against Solaris, then was shown to also be inconsistent with NSPR on Windows NT. While JSS no longer requires that this bug be fixed, this bug remains a true platform inconsistency across supported NSPR platforms. A fix for this bug has been outlined (see comment 2), and I see no reason for us to refuse to fix it.

Re: comment 8, this bug is about the effect of PR_Shutdown on the receiving side of the socket, not on the sending side of the socket. It concerns the effect of shutdown on outstanding reads, not writes.

Re: comment 9, the "corrupted threading model" allegation was disproven in bug 288232.

If we were to choose to refuse to fix it, then we would need to document it in a public web page of known deficiencies of NSPR's cross-platform consistency.

Status: RESOLVED → REOPENED

Resolution: WONTFIX → ---

Nelson Bolyard (seldom reads bugmail)

Reporter

Comment 12

•

19 years ago

I think the next step for this bug is to produce a simple C test program that can reproduce the problem. The program should become part of our cross-platform NSPR test suite.
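As a rough starting point, a reproduction might look something like the sketch below. It is not an NSPR test: it uses raw POSIX calls (shutdown and recv rather than PR_Shutdown and PR_Recv) on an AF_UNIX socketpair, which is an assumption made to keep it self-contained, and TCP sockets or NSPR's own I/O layer may behave differently. All names (probe_reader, shutdown_unblocks_reader, and so on) are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

typedef struct {
    int fd;
    atomic_int done;          /* set once recv() has returned */
    ssize_t result;           /* what recv() returned */
} probe_reader;

static void probe_sleep_ms(long ms) {
    struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

static void *probe_reader_main(void *arg) {
    assert(arg != NULL);
    probe_reader *r = arg;
    char buf[16];
    r->result = recv(r->fd, buf, sizeof buf, 0);   /* blocks: peer sends nothing */
    atomic_store(&r->done, 1);
    return NULL;
}

/* Returns 1 if shutdown(SHUT_RD) alone unblocked the reader within about
 * one second, 0 if the reader stayed blocked until the peer closed, and
 * -1 on setup failure. */
static int shutdown_unblocks_reader(void) {
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return -1;

    probe_reader r = { .fd = sv[0], .result = -2 };
    atomic_init(&r.done, 0);
    pthread_t t;
    if (pthread_create(&t, NULL, probe_reader_main, &r) != 0) {
        close(sv[0]); close(sv[1]);
        return -1;
    }

    probe_sleep_ms(100);                 /* give the reader time to block */
    shutdown(sv[0], SHUT_RD);            /* the call under test */

    int unblocked = 0;
    for (int i = 0; i < 100 && !unblocked; i++) {
        probe_sleep_ms(10);
        unblocked = atomic_load(&r.done);
    }

    close(sv[1]);                        /* peer EOF releases a still-blocked reader */
    pthread_join(t, NULL);
    close(sv[0]);
    return unblocked;
}
```

The return value is the platform-dependent observation this bug is about, so a cross-platform suite would compare it across platforms instead of asserting a fixed value on any one of them.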

Assignee: glen.beasley → nobody

Status: REOPENED → NEW

Julien Pierre

Updated

•

18 years ago

Assignee: nobody → julien.pierre.boogz

Priority: -- → P2

Hardware: Sun → PC

Nelson Bolyard (seldom reads bugmail)

Reporter

Comment 13

•

18 years ago

There is some confusion about which OSes are affected by this bug. It includes BOTH Solaris and Windows, and possibly others. There is now evidence that bug 282732 was not actually fixed, and I continue to believe that this bug is part of the cause.

OS: Windows NT → All

Nelson Bolyard (seldom reads bugmail)

Reporter

Comment 14

•

16 years ago

Taking.

Assignee: julien.pierre.boogz → nelson

BugBot [:suhaib / :marco/ :calixte]

Comment 15

•

3 years ago

The bug assignee didn't login in Bugzilla in the last 7 months and this bug has priority 'P2'.
:KaiE, could you have a look please?
For more information, please visit auto_nag documentation.

Assignee: nelson → nobody

Flags: needinfo?(kaie)

Kai Engert [:KaiE:]

Comment 16

•

3 years ago

JSS issues aren't a priority

Flags: needinfo?(kaie)

Priority: P2 → --

BMO Automation

Updated

•

3 years ago

Severity: normal → S3

Categories

(NSPR :: NSPR, defect)

Tracking

(Not tracked)

People

(Reporter: nelson, Unassigned)

Security

(public)
