Closed Bug 121952 Opened 23 years ago Closed 23 years ago

Verify that PR_ConnectContinue works on Mac

Categories

(NSPR :: NSPR, defect, P2)

4.1.2
PowerPC
Mac System 9.x
defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: wtc, Assigned: sfraser_bugs)

References

Details

Attachments

(2 files)

PR_ConnectContinue has never been tested on the Mac. With the proposed patch in bug 106188, Mozilla will be calling PR_ConnectContinue. So we need to verify the Mac implementation of PR_ConnectContinue is working correctly.
Blocks: 106188
With the patch in bug 121951, and the non-blocking connect code from bug 106188 in my tree, connecting to secure sites works fine. What more can we do to test this? Connecting to a slow server? Connecting to an unresponsive server?
Status: NEW → ASSIGNED
Sorry I did not explain how you can test PR_ConnectContinue. With the browser, you can test the connect success and connect failure cases. First, make sure that the SocketConnectContinue function in mozilla/nsprpub/pr/src/io/prsocket.c does not have my temporary workaround (#if 0 ...). 1. Connect success: go to a working HTTP site. Verify that _MD_mac_get_nonblocking_connect_error() returns 0. 2. Connect failure: go to a host that is not running a web server. Verify that _MD_mac_get_nonblocking_connect_error() return ECONNREFUSED or its Mac equivalent. I noticed that the current code returns -1 on connect failure. This is incorrect. It should return the error code on failure. There are many reasons why connect fails. Common errors are ECONNREFUSED, ETIMEDOUT, and EHOSTUNREACH. Note that it may be fine for _MD_mac_get_nonblocking_connect_error() to return -1 on connect failure. It just means that you don't know *why* it failed, but you do know it failed. That may be sufficient for the browser. To correctly implement this function, I believe you need to store the 'discon.reason' code returned by OTRcvDisconnect() in the NSPR fd.
Priority: -- → P2
Target Milestone: --- → 4.2
I noticed that we actually have a bug lurking because of this. With non-blocking connect active (temporary workaround removed), when going to an address that has no webserver (e.g. http://localhost), we spin for ever. Here's fixed version of _MD_mac_get_nonblocking_connect_error (pseudo diff): int _MD_mac_get_nonblocking_connect_error(PRInt32 osfd) { PRThread *thread = _PR_MD_CURRENT_THREAD(); OTResult resultOT; EndpointRef endpoint = (EndpointRef) osfd; + OSErr threadErr = thread->md.osErrCode; resultOT = OTGetEndpointState(endpoint); switch (resultOT) { case T_OUTCON: macsock_map_error(EINPROGRESS); return -1; case T_DATAXFER: return 0; case T_IDLE: + macsock_map_error(threadErr); return -1; case T_INREL: macsock_map_error(ENOTCONN); return -1; default: PR_ASSERT(0); return -1; } }
Simon, With your pseudo diff applied, if you go to an address with no web server, what is the value of 'threadErr'? Is it the Mac equivalent of ECONNREFUSED? You should change _MD_mac_get_nonblocking_connect_error() to return the error code as opposed to -1, to be consistent with the other platforms.
This patch doesn't work on Mac classic. The macsock_map_error() call at the end of _MD_Connect is clobbering the error code set in the thread by the OT notifier.
Simon, That's right. I didn't catch that earlier. The error code of a failed non-blocking connect should be stored in PRFileDesc, not in PRThread. The OT notification of a failed non-blocking connect (T_DISCONNECT) arrives asynchronously; no thread is waiting for it. So the OT notifier should store the error code (discon.reason) in the PRFileDesc. Later, a thread calls PR_ConnectContinue to retrieve the non-blocking connect error code saved in the PRFileDesc. This means _MD_mac_get_nonblocking_connect_error() needs to take PRFileDesc *fd, as opposed to PRInt32 osfd, as the argument.
Should _MD_mac_get_nonblocking_connect_error both return the error code, and set it on the thread with PR_SetError()? When can we clear this error value on the PRFileDesc?
I also wonder how this should interact with MD_Poll (see bug 102791).
_MD_mac_get_nonblocking_connect_error just needs to return the error code. It should not set the error code on the thread with PR_SetError(). We can clear this error value on the PRFileDesc after we retrieve it, which I think is what Berkeley sockets do, but I would not worry about it.
The patch adds a OTReason disconnectError member on _MDFileDesc, and sets that when the OT notifer gets a T_DISCONNECT. That error is then read in _MD_mac_get_nonblocking_connect_error(). _MD_mac_get_nonblocking_connect_error() calls macsock_map_error() (which does a PR_SetError()) with errors, and returns -1 on failure, since this is what the code in SocketConnectContinue() expects. Other platforms do the same thing. With these changes, entering a hostname in the browser which is not running a server correctly shows a 'connection refused' error message. Before this patch, it would just spin.
Comment on attachment 67058 [details] [diff] [review] First cut at a patch in nspr In nsprpub/pr/src/md/mac/macsockotpt.c, we have: > secret->md.exceptReady = PR_TRUE; // XXX Check this > >+ md->disconnectError = discon.reason; // save for _MD_mac_get_nonblocking_connect_error >+ > // wake up waiting threads, if any > result = -3199 - discon.reason; // obtain the negative error code What's the difference between discon.reason and -3199 - discon.reason? Which one is the error code?
Simon wrote: > With these changes, entering a hostname in the browser > which is not running a server correctly shows a > 'connection refused' error message. Could you verify in the debugger that we hit the ECONNREFUSED case in macsock_map_error()? The reason I asked is that nsSocketTransport::doConnection() always sets rv to NS_ERROR_CONNECTION_REFUSED on connect failure, without checking the NSPR or OS error code. So even though the browser shows a 'connection refused' error message, we can't tell if the NSPR error code is PR_CONNECT_REFUSED_ERROR.
> >+ md->disconnectError = discon.reason; // save for _MD_mac_get_nonblocking_connect_error > >+ > > // wake up waiting threads, if any > > result = -3199 - discon.reason; // obtain the negative error code > > What's the difference between discon.reason and > -3199 - discon.reason? Which one is the error > code? discon.reason is the Unix-style error code (e.g. 77). result is the mac-style error code (e.g. -3162). It doesn't really matter which one we store, as long as we do the correct mapping in _ MD_mac_get_nonblocking_connect_error(). I did verify that we get the correct error from MD_mac_get_nonblocking_connect_error() in the debugger on the connection failure.
This is what should happen after trying to connect to a host which is not running a server. 1. PR_ConnectContinue() should return PR_FAILURE. 2. A subsequent PR_GetError() call should return PR_CONNECT_REFUSED_ERROR.
Comment on attachment 67058 [details] [diff] [review] First cut at a patch in nspr r=wtc. Simon, you can check in this patch. Thanks.
Attachment #67058 - Flags: review+
I need to check both patches in to enable non-blocking connects on Mac.
Actually, the patch in bug 121951 also need to go in before we can enable non- blocking connects.
Comment on attachment 69364 [details] [diff] [review] Patch to re-enable non-blocking connect on Mac r=wtc.
Attachment #69364 - Flags: review+
Patches checked into the NSPRPUB_PRE_4_2_CLIENT_BRANCH, and the NSPR tip (NSPR changes), and the tip (security change).
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: