Open
Bug 301479
Opened 19 years ago
Updated 2 years ago
can't tell when connect job is finished on windows
Categories
(NSPR :: NSPR, defect)
Tracking
(Not tracked)
NEW
People
(Reporter: kamil, Unassigned)
Details
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.8) Gecko/20050511 Firefox/1.0.4
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.8) Gecko/20050511 Firefox/1.0.4

This bug was produced on Windows. It has been some time since I exercised this code, so as a caveat, I'm not sure whether it is against the 4.4.1 source. But a brief reading of the 4.4.1 source indicates that it should still be a problem.

PR_QueueJob_Connect does not return consistent error values, and it's not clear how to use it. Where it should return PR_IN_PROGRESS_ERROR to the implementation underneath (see nsprpub/pr/src/misc/prtpool.c), it seems to always return PR_IO_TIMEOUT_ERROR. If you ignore that return code and call PR_Connect with PR_INTERVAL_NO_TIMEOUT, then, if the connection can be easily established (local), you get back PR_IS_CONNECTED_ERROR (already connected). Otherwise (if your peer is not listening, for example) you get PR_IO_TIMEOUT_ERROR, even though you passed PR_INTERVAL_NO_TIMEOUT. About the only way this seems to work is to tie up an engine thread waiting for the connect to return once you are in the handleConnectJob call.

Underneath, on NT/XP, PR_QueueJob_Connect operates by doing a PR_Connect with a timeout of PR_INTERVAL_NO_WAIT, and then posts completion on the job. On Windows, PR_Connect is implemented by calling the WSA connect function. But there is a twist: the timeout is implemented by putting the socket into non-blocking mode, then doing a select for the timeout. If the timeout is 0 (PR_INTERVAL_NO_WAIT) and the connect hasn't finished yet, then select returns the number of fds changed (0), and this results in NSPR returning PR_IO_TIMEOUT_ERROR. The socket is then taken out of non-blocking mode, and the error is popped back to the thread pool. The queue-job function gets a timeout error and the job returns immediately.

At this point the completion status of the connect is indeterminate. You can call PR_Connect at this point to try to find out, and now there is a race.
If you call connect before the socket has actually connected, you must handle PR_ALREADY_INITIATED_ERROR, and if it has connected, you get back PR_IS_CONNECTED_ERROR. This is because (and I'm quoting from the Visual C++ 6.0 help):

"Until the connection attempt completes on a nonblocking socket, all subsequent calls to connect on the same socket will fail with the error code WSAEALREADY, and WSAEISCONN when the connection completes successfully. Due to ambiguities in version 1.1 of the Windows Sockets specification, error codes returned from connect while a connection is already pending may vary among implementations. As a result, it is not recommended that applications use multiple calls to connect to detect connection completion. If they do, they must be prepared to handle WSAEINVAL and WSAEWOULDBLOCK error values the same way that they handle WSAEALREADY, to assure robust execution."

WSAEINVAL maps to PR_ALREADY_INITIATED_ERROR.

So how do you figure out when your PR_Connect returned? Or, for that matter, what the return code is? What happens if it failed and you miss the return code?

Reproducible: Always

Steps to Reproduce:
Create code that uses PR_QueueJob_Connect, and try to use it with a variety of hosts that exist or don't.

Actual Results:
Sometimes the socket was connected, sometimes it was not. You can never tell what the resulting error was.

Expected Results:
The connect job should return when a definitive return status comes back from connect.
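
To make the expected behavior concrete, here is a minimal sketch of how a non-blocking connect can be driven to a definitive status without calling connect a second time. It is not NSPR code and not a proposed patch: it uses plain POSIX sockets, and the names CONNECT_PENDING, start_connect, poll_connect, connect_definitive, and connect_loopback are hypothetical helpers invented for this illustration. The idea is that a select which times out means "still pending" rather than "failed", and that the real outcome is read from SO_ERROR once the socket becomes writable.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical marker for "handshake not finished yet". Real errno values
 * are positive, so -1 cannot collide with them. */
#define CONNECT_PENDING (-1)

/* Put fd into non-blocking mode and start the connect.
 * Returns 0 if connected at once, CONNECT_PENDING if the handshake is in
 * progress, or the errno of an immediate failure. */
static int start_connect(int fd, const struct sockaddr *addr, socklen_t len)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return errno;
    if (connect(fd, addr, len) == 0)
        return 0;
    return (errno == EINPROGRESS) ? CONNECT_PENDING : errno;
}

/* Probe a pending connect. timeout_ms < 0 waits indefinitely; 0 is a pure
 * poll. Returns CONNECT_PENDING if select timed out (this is NOT a failure),
 * 0 if the connection succeeded, or the errno it failed with. */
static int poll_connect(int fd, int timeout_ms)
{
    int err = 0;
    socklen_t errlen = sizeof err;

    for (;;) {
        fd_set wfds;
        struct timeval tv;
        int rv;

        FD_ZERO(&wfds);
        FD_SET(fd, &wfds);
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        rv = select(fd + 1, NULL, &wfds, NULL, timeout_ms < 0 ? NULL : &tv);
        if (rv < 0 && errno == EINTR)
            continue;               /* interrupted: retry with a fresh timeout */
        if (rv < 0)
            return errno;
        if (rv == 0)
            return CONNECT_PENDING; /* no fds changed: still connecting */
        break;
    }
    /* Writable: the attempt has finished one way or the other. */
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &errlen) < 0)
        return errno;
    return err;
}

/* Drive the connect until its outcome is known: 0 or the failing errno. */
static int connect_definitive(int fd, const struct sockaddr *addr, socklen_t len)
{
    int st = start_connect(fd, addr, len);
    if (st == CONNECT_PENDING)
        st = poll_connect(fd, -1);
    return st;
}

/* Usage example: connect to 127.0.0.1:port and report the final status. */
static int connect_loopback(unsigned short port)
{
    struct sockaddr_in sa;
    int st, fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return errno;
    memset(&sa, 0, sizeof sa);
    sa.sin_family = AF_INET;
    sa.sin_port = htons(port);
    sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    st = connect_definitive(fd, (const struct sockaddr *)&sa, sizeof sa);
    close(fd);
    return st;
}
```

In this sketch, poll_connect(fd, 0) returning CONNECT_PENDING corresponds to the case that currently surfaces as PR_IO_TIMEOUT_ERROR. A Windows version would differ in at least one respect: Winsock's select reports a failed non-blocking connect through the exceptfds set rather than writefds, so that set would have to be watched as well.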
Updated•18 years ago
QA Contact: wtchang → nspr
Updated•5 years ago
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Updated•2 years ago
Severity: normal → S3
Comment 1•2 years ago
The bug assignee is inactive on Bugzilla, so the assignee is being reset.
Assignee: wtc → nobody
Status: ASSIGNED → NEW