Closed
Bug 193267
Opened 22 years ago
Closed 22 years ago
Code cleanup for nsSocketTransport2.cpp
Categories
(Core :: Networking, defect, P2)
Core
Networking
Tracking
()
VERIFIED
FIXED
mozilla1.3final
People
(Reporter: wtc, Assigned: darin.moz)
References
Details
Attachments
(2 files, 4 obsolete files)
7.75 KB,
patch
|
Details | Diff | Splinter Review | |
10.54 KB,
patch
|
dougt
:
review+
bzbarsky
:
superreview+
asa
:
approval1.3+
|
Details | Diff | Splinter Review |
I'm going to propose some cleanup of the code in
nsSocketTransport2.cpp around PR_Poll. This should
target Mozilla 1.4 alpha so that it gets enough
testing on various platforms before Mozilla 1.4 final.
The benefit is more accurate error codes if PR_POLL_ERR,
PR_POLL_EXCEPT, or PR_POLL_HUP is reported by PR_Poll.
It also removes a workaround for an invalid NSPR bug
(Necko was using the out_flags fields after a PR_Poll
timeout).
Reporter | ||
Comment 1•22 years ago
|
||
Reporter | ||
Comment 2•22 years ago
|
||
Note that this if statement
if ((mPollFlags & PR_POLL_WRITE) && (pollFlags & ~PR_POLL_READ)) {
means "if we polled for write and PR_Poll reported at least one event
that indicates we can write some data or the write will fail (the flags
include PR_POLL_WRITE, PR_POLL_EXCEPT, PR_POLL_ERR, PR_POLL_HUP, and
PR_POLL_NVAL but not PR_POLL_READ)".
Assignee | ||
Comment 3•22 years ago
|
||
Comment on attachment 114389 [details] [diff] [review]
Same patch, with whitespace changes ignored for easier code review
sr=darin.. very clever :)
Attachment #114389 -
Flags: superreview+
Assignee | ||
Updated•22 years ago
|
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla1.4alpha
Assignee | ||
Updated•22 years ago
|
Attachment #114389 -
Flags: review?(bzbarsky)
Reporter | ||
Comment 4•22 years ago
|
||
You may find this patch more self-documenting.
I explicitly list the PR_POLL_XXX flags that
represent errors.
Reporter | ||
Comment 5•22 years ago
|
||
You can actually handle PR_POLL_NVAL as a special case,
but I don't know what is the NS_ error code for "bad
descriptor (EBADF)" or "invalid handle (ERROR_INVALID_HANDLE)".
Assignee | ||
Comment 6•22 years ago
|
||
yeah, that's probably better. turns out there is a bug on linux (probably
XP_UNIX) where the check for PR_POLL_ERR above PR_ConnectContinue was causing
bugs. apparently, if a server is not listening on a designated port, PR_Poll
will set PR_POLL_ERR (in fact the out_flags are set to decimal 40)! as a
result, we end up mapping the connection failure to NS_ERROR_NET_RESET and give
the wrong error message to the user. see for example bug 193092. because of
this, i'd really like to try to get this patch in for mozilla 1.3 final. i've
tested it out and it works well.
Blocks: 193092
Severity: normal → major
Flags: blocking1.3?
Priority: -- → P2
Target Milestone: mozilla1.4alpha → mozilla1.3final
Assignee | ||
Updated•22 years ago
|
Attachment #114389 -
Flags: superreview+
Attachment #114389 -
Flags: review?(bzbarsky)
Assignee | ||
Updated•22 years ago
|
Attachment #114511 -
Flags: superreview+
Attachment #114511 -
Flags: review?(dougt)
Comment 7•22 years ago
|
||
Comment on attachment 114511 [details] [diff] [review]
Alternative patch, with whitespace changes ignored for easier code review
nit i would have move the #define up to the top of the file and commented on
it.
Attachment #114511 -
Flags: review?(dougt) → review+
Assignee | ||
Updated•22 years ago
|
Attachment #114511 -
Flags: approval1.3?
Assignee | ||
Comment 8•22 years ago
|
||
doug: yeah, but i think in this case the #define is very localized to this
function. if the #define is not near the function, then the casual reader might
go looking for PR_POLL_ERRORS in prio.h or one of the other NSPR headers.
keeping it local to the function helps make the code more self-documenting
(IMO), so i'm inclined to just leave it as is :-/
Reporter | ||
Comment 9•22 years ago
|
||
Comment on attachment 114511 [details] [diff] [review]
Alternative patch, with whitespace changes ignored for easier code review
We can turn the PR_POLL_ERRORS macro into a const PRInt16 local
variable. You may want to rename it to follow your variable
naming convention.
Assignee | ||
Comment 10•22 years ago
|
||
yeah, that should do nicely ;-)
Assignee | ||
Comment 11•22 years ago
|
||
actually, i suspect this patch would cause a 100% CPU utilization bug if a
server drops a socket connection while necko is not interested in reading or
writing on the socket. we'd continuously call PR_Poll with PR_POLL_EXCEPT set,
which would return immediately with some kind of error flag set. this would
continue until necko's idle timer fires (15 second intervals), at which time
necko would check its idle sockets to see if any have become disconnected (using
a call to nsSocketTransport::IsAlive). i think we should avoid polling for only
PR_POLL_EXCEPT. going to run some tests to verify my assertion...
Assignee | ||
Updated•22 years ago
|
Attachment #114511 -
Attachment is obsolete: true
Attachment #114511 -
Flags: approval1.3?
Assignee | ||
Comment 12•22 years ago
|
||
ok, i discovered a few subtle bugs while testing out this patch. not only do
we have the potential problem i described previously, but i also stumbled on a
bug in nsSocketTransportService::RemoveFromIdleList, which probably explains
the PR_Poll crash reported as bug 193756.
Attachment #114388 -
Attachment is obsolete: true
Attachment #114389 -
Attachment is obsolete: true
Attachment #114509 -
Attachment is obsolete: true
Assignee | ||
Comment 13•22 years ago
|
||
Assignee | ||
Updated•22 years ago
|
Attachment #114759 -
Flags: superreview?(bzbarsky)
Attachment #114759 -
Flags: review?(dougt)
Updated•22 years ago
|
Attachment #114759 -
Flags: review?(dougt) → review+
![]() |
||
Comment 14•22 years ago
|
||
Comment on attachment 114759 [details] [diff] [review]
revised patch (w/ whitespace changes stripped)
>Index: nsSocketTransport2.cpp
>-nsSocketTransport::OnSocketReady(PRFileDesc *fd, PRInt16 pollFlags)
>+nsSocketTransport::OnSocketReady(PRFileDesc *fd, PRInt16 outFlags)
maybe call those "socketFlags"? In any case, update the function def in the
header so the name there matches this one.
sr=bzbarsky with that addressed.
>- memcpy(&mActiveList[index], &mActiveList[mIdleCount - 1], sizeof(SocketContext));
>+ memcpy(&mIdleList[index], &mIdleList[mIdleCount - 1], sizeof(SocketContext));
You know, I spent a lot of time sorting through that code to figure out what it
was doing.... can't believe I missed that. :(
Attachment #114759 -
Flags: superreview?(bzbarsky) → superreview+
Assignee | ||
Comment 15•22 years ago
|
||
Comment on attachment 114759 [details] [diff] [review]
revised patch (w/ whitespace changes stripped)
requesting drivers approval for 1.3 final. this patch fixes a couple critical
bugs.
Attachment #114759 -
Flags: approval1.3?
Assignee | ||
Comment 16•22 years ago
|
||
bz: i changed the argument to be named |outFlags| because it corresponds to
PRPollDesc::out_flags, which is set by PR_Poll after it returns. thanks for
noticing that the header file is inconsistent btw!
Comment 17•22 years ago
|
||
Comment on attachment 114759 [details] [diff] [review]
revised patch (w/ whitespace changes stripped)
a=asa (on behalf of drivers) for checkin to 1.3.
Attachment #114759 -
Flags: approval1.3? → approval1.3+
Assignee | ||
Comment 18•22 years ago
|
||
fixed-on-trunk
Status: ASSIGNED → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
Updated•22 years ago
|
Flags: blocking1.3?
Comment 19•22 years ago
|
||
*** Bug 194625 has been marked as a duplicate of this bug. ***
Comment 20•22 years ago
|
||
VERIFIED:
the proxy error (bug 193092) reported this is now working.
darin: re #11, is the scenario describe:
1- Server sends a FIN (abortive release) and we try to read from a connection
that is now in CLOSE_WAIT
2- Server disappeared, leaving the connection half open?
I've begun integrating these situations into the TCP unit test suite.
Status: RESOLVED → VERIFIED
You need to log in
before you can comment on or make changes to this bug.
Description
•