Closed
Bug 303123
Opened 19 years ago
Closed 16 years ago
assert in notify_ioq
Categories
(NSPR :: NSPR, defect)
Tracking
(Not tracked)
RESOLVED
WORKSFORME
People
(Reporter: kamil, Assigned: wtc)
Details
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.8) Gecko/20050511 Firefox/1.0.4
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.8) Gecko/20050511 Firefox/1.0.4
This is on Windows XP running NSPR 4.4.1 (debug), patched for bug https://bugzilla.mozilla.org/show_bug.cgi?id=291982
My code does the following:
1. Connect a socket to a remote host.
2. Queue a read job and a write job.
3. When the read job or write job returns, call PR_Recv or PR_Send. Each of these calls is made with PR_INTERVAL_NO_WAIT.
4. If PR_Recv or PR_Send returns PR_IO_TIMEOUT_ERROR, queue another read job or write job from the job handler.
The result is that eventually an assert is hit in prtpool.c on line 989 (in notify_ioq). This appears to have something to do with io_pending already being set when SendSocket is called from the bowels of notify_ioq.
Reproducible: Always
Steps to Reproduce:
As per above. Queue a job from the handler of another job, on the same fd.
Actual Results:
Code asserts in NSPR.
Expected Results:
The job should have been queued. Though it's not clear why a job might complete and then a call like PR_Recv or PR_Send return a timeout.
Comment 1•19 years ago (Assignee)
Hi Kamil,
Whenever you get PR_IO_TIMEOUT_ERROR or PR_PENDING_INTERRUPT_ERROR
in the WINNT configuration of NSPR, you need to handle it like this:
#include "private/pprio.h"  // for PR_NT_CancelIo
if (PR_Recv(fd, ...) == -1) {
    // PR_Recv failed
    PRErrorCode err = PR_GetError();
#ifdef WINNT
    if (err == PR_IO_TIMEOUT_ERROR || err == PR_PENDING_INTERRUPT_ERROR) {
        // make the socket usable again
        PR_NT_CancelIo(fd);
    }
#endif
    if (err == PR_IO_TIMEOUT_ERROR) {
        // queue another read job
    }
}
The reason for the PR_NT_CancelIo call is explained in this
(out-of-date) tech note:
http://www.mozilla.org/projects/nspr/tech-notes/ntiotimeoutinterrupt.html
I suggest that you call PR_Recv and PR_Send with a
non-zero timeout. In particular, for PR_Recv, you can
use PR_INTERVAL_NO_TIMEOUT. This is because PR_Recv
will return as soon as some data are read. You can't
use PR_INTERVAL_NO_TIMEOUT for PR_Send because PR_Send
(in blocking mode) will try to send the entire buffer
of data.
Note that if PR_Send times out, *some data* may have been
sent, but NSPR doesn't tell you how many bytes have been
sent, so you have no choice but to close the socket. It is
okay to continue to use the socket after PR_Recv times out.
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
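The recovery rules in Comment 1 (extended by the note in Comment 3 that an interrupted blocking PR_Send has the same unknown-byte-count problem) can be summarized as a small decision function. This is only a sketch: the enum values are stand-ins rather than NSPR's real error codes, and `plan_recovery` and `recovery_plan` are hypothetical names introduced here for illustration. Real code would compare against PR_IO_TIMEOUT_ERROR and PR_PENDING_INTERRUPT_ERROR and call PR_NT_CancelIo itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in error codes for illustration only; real code would compare
 * against PR_IO_TIMEOUT_ERROR and PR_PENDING_INTERRUPT_ERROR. */
typedef enum { ERR_OTHER, ERR_IO_TIMEOUT, ERR_PENDING_INTERRUPT } io_error;
typedef enum { OP_RECV, OP_SEND } io_op;

/* What the caller should do after a failed blocking PR_Recv/PR_Send. */
typedef struct {
    bool cancel_io;     /* call PR_NT_CancelIo(fd) first (WINNT only) */
    bool requeue;       /* safe to queue another job on the same fd */
    bool close_socket;  /* bytes sent are unknown: give up on the socket */
} recovery_plan;

/* Hypothetical helper; errors other than timeout/interrupt are not
 * covered by the thread, so they yield an empty plan here. */
recovery_plan plan_recovery(io_op op, io_error err, bool winnt)
{
    recovery_plan plan = { false, false, false };
    bool timed_out = (err == ERR_IO_TIMEOUT);
    bool interrupted = (err == ERR_PENDING_INTERRUPT);

    /* Only the WINNT configuration needs the socket to be reset. */
    if (winnt && (timed_out || interrupted))
        plan.cancel_io = true;

    /* A timed-out receive loses nothing, so the fd can be reused. */
    if (op == OP_RECV && timed_out)
        plan.requeue = true;

    /* A failed blocking send may have written part of the buffer. */
    if (op == OP_SEND && (timed_out || interrupted))
        plan.close_socket = true;

    return plan;
}
```

For example, a PR_Recv timeout on WINNT yields "cancel, then requeue", while a PR_Send timeout on any platform yields "close the socket".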
Comment 2•19 years ago
Hi Wan-Teh,
Thanks for your response. It does however beget a couple of other questions:
1. Does a similar limitation exist in the other platform implementations of NSPR? If I run this code on Solaris, should I expect it to work without the specified limitations?
2. Is there any way for me to find out how much data PR_Send is guaranteed to accept without returning the timeout error? It is important to me that I not have to close the socket.
Comment 3•19 years ago (Assignee)
The need to call PR_NT_CancelIo only exists in the WINNT configuration of NSPR. It does not exist on other platforms or in the "WIN95" (generic WIN32) configuration. This is why the special code to recover from PR_IO_TIMEOUT_ERROR or PR_PENDING_INTERRUPT_ERROR is ifdef'ed with WINNT.
The problem of not knowing how many bytes have been sent when a *blocking* PR_Send fails with PR_IO_TIMEOUT_ERROR or PR_PENDING_INTERRUPT_ERROR exists in all configurations of NSPR. Unfortunately there is no way for you to find out how many bytes have been sent. You can try these options.
1. Use a very large timeout for PR_Send. When a *blocking* socket is writable, the only reason PR_Send may block is that TCP flow control kicks in because the receiver can't consume the data fast enough. This can happen, but it is rare. If you use a large timeout, when PR_Send times out, you know something is seriously wrong with the receiver, so it is fine for you to close the socket.
2. Try setting the socket in non-blocking mode. I don't know if the prtpool code works with non-blocking sockets though.
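Option 2 helps because a non-blocking send attempt normally reports how many bytes it accepted, so the caller can keep a cursor and resume instead of closing the socket. The sketch below shows that bookkeeping only. It does not call NSPR: `try_send_fn` is a hypothetical callback standing in for one non-blocking send attempt, the `TRY_*` return codes are invented for this example, and `toy_pipe` is a toy transport used to demonstrate the loop. Whether this pattern works together with the prtpool job queue is not established in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Invented return codes for the hypothetical callback below. */
#define TRY_WOULD_BLOCK (-1)
#define TRY_FAILED      (-2)

/* One non-blocking send attempt: returns bytes accepted (> 0),
 * TRY_WOULD_BLOCK, or TRY_FAILED. Stand-in for a real socket call. */
typedef int (*try_send_fn)(void *ctx, const char *buf, int len);

typedef enum { SEND_DONE, SEND_WOULD_BLOCK, SEND_FAILED } send_status;

/* Tracks progress so a partially sent buffer can be resumed. */
typedef struct {
    const char *buf;
    int len;
    int sent;   /* bytes accepted so far */
} send_cursor;

/* Push as much as the transport accepts right now. */
send_status send_some(send_cursor *c, try_send_fn try_send, void *ctx)
{
    while (c->sent < c->len) {
        int n = try_send(ctx, c->buf + c->sent, c->len - c->sent);
        if (n == TRY_WOULD_BLOCK)
            return SEND_WOULD_BLOCK;  /* resume later from c->sent */
        if (n <= 0)
            return SEND_FAILED;       /* includes a zero-byte result */
        c->sent += n;
    }
    return SEND_DONE;
}

/* Toy transport: accepts up to `room` bytes in total, then reports
 * would-block until the caller grants more room. */
typedef struct { int room; int total; } toy_pipe;

int toy_try_send(void *ctx, const char *buf, int len)
{
    toy_pipe *p = ctx;
    (void)buf;
    if (p->room == 0)
        return TRY_WOULD_BLOCK;
    int n = len < p->room ? len : p->room;
    p->room -= n;
    p->total += n;
    return n;
}
```

With a toy_pipe that has room for 4 bytes, sending an 11-byte buffer stops with SEND_WOULD_BLOCK and `sent == 4`; calling send_some again once room is available continues from byte 4 rather than starting over.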
Updated•18 years ago
QA Contact: wtchang → nspr
Comment 4•16 years ago
No NSPR bug was identified in this bug report.
Status: ASSIGNED → RESOLVED
Closed: 16 years ago
Resolution: --- → WORKSFORME