Closed
Bug 134099
Opened 22 years ago
Closed 22 years ago
Assertion fails on startup in prsocket.c
Categories
(NSPR :: NSPR, defect, P1)
Tracking
(Not tracked)
VERIFIED
FIXED
4.2.2
People
(Reporter: beos, Assigned: wtc)
Details
Attachments
(1 file, 4 obsolete files)
2.40 KB,
patch
|
beos
:
review+
|
Details | Diff | Splinter Review |
Well, it seems we keep running into problems in nspr. Everything builds, but when trying to run, an assertion fails in prsocket.c at line 466. The addr sent into this function from PR_NewTCPSocketPair (peerAddr) never gets initialized. I think the _MD_accept in bnet.c should be setting this, but it is not, the the result it returns is rv=2. Chris, any ideas? I don't know if we can look at nightly builds, especially if they have been failing because of ldap failing. I'll try looking through cvs to see if I can notice any changes that might have caused this.
Oops. I filed this under the wrong product. Changed to NSPR
Component: Networking → NSPR
Product: Browser → NSPR
Assignee | ||
Comment 2•22 years ago
|
||
I think that's caused by my fix for bug 106496. PR_NewTCPSocketPair was not calling PR_Accept with peerAddr. I changed that so that we can verify the peer is the other socket we create in PR_NewTCPSocketPair.
Assignee | ||
Comment 3•22 years ago
|
||
The deleted code is wrong. I hope that's the problem. Let me know if this patch fixes the assertion failure.
Assignee | ||
Updated•22 years ago
|
Priority: -- → P1
Target Milestone: --- → 4.2
_PR_HAVE_SOCKADDR_LEN is only defined for the BONE version, so the code you would be removing would never be used under net_server anyway. Sorry, I guess I should have said that this was for the net_server build. I have not tried under BONE lately.
Here is a step through of what I see happenning before the failed assertion:
- PR_NewTCPSocketPair from prsocket.c gets called.
- since we are neither unix nor nt, two PRNetAddr are declared, selfAddr and
peerAddr
- selfAddr gets initialized
- peerAddr is used in the PR_Accept call without being initialized
- PR_Accept calls SocketAccept in prsocket.c
- before the failed assertion is reached, _MD_accept is called with the still, unitialized PRNetAddr
So, where is peerAddr supposed to be initialized? From what I'v read in the
BeBook:
>After you've bound a stream listener socket to an interface (through bind()),
>you then tell the socket to start listening for clients that are trying to
>connect. You then pass the socket to accept() which blocks until a client
>connects to the listener (the client does this by calling connect(), passing it
>a description of the interface to which the listener is bound).
This says to be that the client addr being passed into accept should be a vaild
sockaddr, but in our scenario, it is not. So, this leads me to ask, what is
peerAddr supposed to be?
Assignee | ||
Comment 6•22 years ago
|
||
The client addr being passed into accept() is to receive the address of the client. It is an output argument. Therefore it does not need to be initialized. Can you print the value of *addrlen before and after each call to accept()? addrlen is an input/output argument. It needs to be initialized to the size of the buffer for the client addr, and on return accept() sets it to the actual size of the client addr.
Assignee | ||
Comment 7•22 years ago
|
||
Please also print the contents of *addr when the assertion in prsocket.c at line 466 fails. Could it be that the accept() function has a bug and does not return the client address? If so, you can work around this accept() bug by modifying the _MD_accept() function in bnet.c to call getpeername(rv, addr, addrlen) (I omitted the casts) if rv != -1 and addr is not NULL.
The addr length stays consistently at 12. The peerAddr raw struct has a family value of 31120. After the accept call in bnet.c, it remains unchanged. Wtc, when you want me to print out *addr, what exactly do you want? Chris, have you tried to build/run on BeOS with the latest code lately? Do you see similar problems? I will do an update under BONE tonight, and see if everything is ok there, or not. I will also try the suggestion above. However, _MD_accept had to have been called before, why would it not work now?
Assignee | ||
Comment 9•22 years ago
|
||
> The addr length stays consistently at 12. This value is correct for BeOS. > The peerAddr raw struct has a family value of 31120. > After the accept call in bnet.c, it remains unchanged. 31120 is the uninitialized value. If it remains unchanged after the accept call in bnet.c, this means accept has a bug: it does not store the peer's address in the sockaddr structure pointed to by 'addr'. > However, _MD_accept had to have been called before, why > would it not work now? We were not calling _MD_accept with a *non-NULL* addr argument before. After my change to PR_NewTCPSocketPair, we are now calling _MD_accept with a non-NULL addr argument. accept with a non-NULL addr argument is equivalent to accept with a NULL addr argument + getpeername on the new socket. This is why I suggested the workaround of adding getpeername(rv, addr, addrlen) (I omitted the casts).
Reporter | ||
Comment 10•22 years ago
|
||
This fixes things under net_server builds. BONE was not working last night. It's getting late, so I won't be able to check BONE until tomorrow.
Attachment #76750 -
Attachment is obsolete: true
Comment 11•22 years ago
|
||
default owner and qa.
Assignee: new-network-bugs → wtc
QA Contact: benc → wtc
Reporter | ||
Comment 12•22 years ago
|
||
Ok, under BONE, it is failing with an Assertion at line 1079 in prsocket.c, on the second time that PR_GetSockName is called. I'm assuming the first time was with a non-NULL value, which worked, and the second is with a NULL value. The starting addrlen, in both calls, is 28. When the call to getsockname in the _MD_getsockname method in bnet.c returns the first time, the addrlen is set to 12, which is valid. The second call, it remains 28, but the getsockname function does not return a failure code either. Is there something like before, that I can call, if the addr is NULL, so that the addrlen is set properly?
Assignee | ||
Comment 13•22 years ago
|
||
Arougthopher:
> Is there something like before, that I can call, if
> the addr is NULL, so that the addrlen is set properly?
I don't understand your question. The second PR_GetSockName()
call in PR_NewTCPSocketPair() is:
if (PR_GetSockName(f[0], &selfAddr) == PR_FAILURE) {
goto failed;
}
The addr is &selfAddr, which is not NULL.
It seems that the second call to getsockname() fails because
addrlen remains 28 after the call. That just can't be right.
Assignee | ||
Comment 14•22 years ago
|
||
Arougthopher: please try this patch under BONE. It explicitly binds the client socket and then calls getsockname on it.
Reporter | ||
Comment 15•22 years ago
|
||
This fixes the problem for both BONE and net_server builds. wtc: Your comments above about removing the IS_LITTLE_ENDIAN code were needed for the BONE build and have been added to this patch. btw, if you don't want to type out arougthopher, you can just call me paul :) Thank you so VERY much for your help.
Attachment #77176 -
Attachment is obsolete: true
Attachment #77343 -
Attachment is obsolete: true
Reporter | ||
Comment 16•22 years ago
|
||
Well, I just tried a nightly build, and the assertion was not thrown?!? Maybe this problem was happenning because I am running with DEBUG enabled. I will try removing the patch and building again, and also building without debug.
Assignee | ||
Comment 17•22 years ago
|
||
Paul, NSPR assertions (the PR_ASSERT macro) are compiled away in optimized builds. By default the assertions only exist in debug builds. This is why you won't see the assertion failure in a nightly build (which is optimized).
Reporter | ||
Comment 18•22 years ago
|
||
oh yeah, just call me stupid :) I think I was coding a little to long that day. Anyway, the patch seems to work for me. r=arougthopher
Reporter | ||
Comment 19•22 years ago
|
||
Chris or Wan-teh, can you give this your r=, so that it can be checked in? I don't have access to the NSPR branch, IIRC. It would be nice to be able to build a DEBUG build, without having to remember to apply this patch every time, and, for new developers, which BeOS desparately needs! thanks
Status: NEW → ASSIGNED
Assignee | ||
Comment 20•22 years ago
|
||
Arougthopher, Could you try this new patch? Its change to prsocket.c is cleaner. Let me know if it works on BeOS. Thanks.
Assignee | ||
Updated•22 years ago
|
Keywords: mozilla1.0.1
Target Milestone: 4.2 → 4.2.1
Assignee | ||
Comment 21•22 years ago
|
||
Arougthopher, Just a reminder that you don't need to recompile or relink Mozilla after applying the new NSPR patch. You only need to do a clean rebuild in nsprpub. Here are the steps. 1. Remove previously patched mozilla/nsprpub/pr/src/io/prsocket.c and mozilla/nsprpub/pr/src/md/beos/bnet.c. Check out these two files from cvs. 2. cd to the top of your build tree, then do % cd nsprpub % gmake clean % gmake
Attachment #77355 -
Attachment is obsolete: true
Reporter | ||
Comment 22•22 years ago
|
||
Comment on attachment 86682 [details] [diff] [review] New patch r=arougthopher this works for me. btw, Wan-teh, I know you don't have to recompile everthing, but, if a new developer, or, if I download a release branch, then it must be, which is what I meant. Thanks for the comments, though.
Attachment #86682 -
Flags: review+
Assignee | ||
Comment 23•22 years ago
|
||
I checked in the patch on the tip and NSPRPUB_PRE_4_2_CLIENT_BRANCH of NSPR. Paul, Chris, please let me know if you want the fix on the MOZILLA_1_0_BRANCH for inclusion in the mozilla 1.0.1 release. We will need drivers' approval for that.
Status: ASSIGNED → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
Reporter | ||
Comment 24•22 years ago
|
||
I don't see a need for this in the MOZILLA_1_0_BRANCH, as 1.1.1alpha is supposed to be released soon, which, in my mind, means the 1_0 branch won't be needed as much, soon. Though, I could be completely wrong on this :) Chris?
Comment 25•22 years ago
|
||
Sadly, I haven't been able to get a straight answer from drivers wrt accepting anything for the 1.0 branch other than fixes for commercial embedding concerns. So while I think it's a good fix for our "stable" branch, I have no faith that drivers would actually accept the patch so that time & effort would be better spent elsewhere.
Assignee | ||
Updated•22 years ago
|
Keywords: mozilla1.0.1
Assignee | ||
Updated•22 years ago
|
Target Milestone: 4.2.1 → 4.2.2
You need to log in
before you can comment on or make changes to this bug.
Description
•