Closed
Bug 71391
Opened 24 years ago
Closed 24 years ago
Complete networking failure while running the page loader on win98
Categories
(Core :: Networking, defect)
Tracking
()
VERIFIED
FIXED
People
(Reporter: jrgmorrison, Assigned: darin.moz)
References
()
Details
Attachments
(1 file)
997 bytes,
patch
Overview Description:
Beginning with today's builds (03/08) on win98, the page loading test, after visiting a fair number of the pages, throws up a dialog: 'Connection refused when trying to contact jrgm.mcom.com'. After that dialog is dismissed, mozilla cannot visit *any* URL (it just throws up the same alert). On top of that, neither Nav4.7 nor IE5.5 on the same machine can make a connection.

I mentioned this to gagan, and he said that there was an existing DNS related bug, and this might be the same. However, I searched for that bug but couldn't find it, so here is a new bug (Sorry).

I'm also wondering if this might be related to the leaks that were happening earlier today. The reason is that the error message for Nav4.7 says: "A network error occurred : unable to connect to server (TCP Error: Not enough memory) The server may be down or unreachable. Try connecting later."

Steps to Reproduce:
1) go to http://jrgm.mcom.com/page-loader/loader.pl and hit submit
2) go have a coffee; return to PC in ~20 minutes

Build Date & Platform Bug Found: win98 2001030813 build

Additional Builds and Platforms Tested On: not seen on Mac or Linux

Sidenote to twalker: the workaround is to copy the current URL in mozilla to the clipboard, quit mozilla, restart, paste in the URL and continue the test from there. [Actually, that's not a great workaround, since I don't know that there aren't OS side effects that have happened, and maybe a reboot would be the cleanest thing to do. But for now, we can just go with the simple workaround].
Reporter | ||
Comment 1•24 years ago
|
||
This may be resolved with the mem leak fix, but if not, this is a pretty serious problem (user cannot connect to anything on the network).
this is now also reported in:
bug 71375 "mail/news eats "buffer space", causes other apps to fail connecting to server"
bug 71392 "Mozilla doesn't close TCP connections"
bug 71395 "After several minutes of browsing, tcpip locks up"
I see the unclosed sockets accumulate on Linux too; all are left in CLOSE_WAIT state and never vanish. Bug 67957, bug 70417 and bug 70605 are about the same phenomenon in mailnews.
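For anyone who wants to put a number on the accumulation, here is a minimal sketch that tallies TCP states from `netstat -tn`-style text. The function name `count_tcp_states` and the sample rows are assumptions made up for illustration (the addresses come from documentation ranges); they are not output captured from the affected machines.

```python
from collections import Counter


def count_tcp_states(netstat_output: str) -> Counter:
    """Tally TCP connection states from Linux `netstat -tn`-style text."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected row layout (net-tools netstat on Linux):
        # proto recv-q send-q local-address foreign-address state
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            states[fields[5]] += 1
    return states


# Invented sample data, only to show the expected input shape.
SAMPLE = """\
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        1      0 192.0.2.10:33012        198.51.100.7:80         CLOSE_WAIT
tcp        1      0 192.0.2.10:33013        198.51.100.7:80         CLOSE_WAIT
tcp        0      0 192.0.2.10:33014        198.51.100.7:80         ESTABLISHED
"""

if __name__ == "__main__":
    print(count_tcp_states(SAMPLE)["CLOSE_WAIT"])
```

Feeding it real `netstat` output before and after loading a page should show whether the CLOSE_WAIT count keeps growing.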
Reporter | ||
Comment 5•24 years ago
|
||
Okay, so the first set of other bugs are likely dups, and this is probably the leak, for which there is a fix in hand, bug 71317. The second set of bugs noted above, related to mailnews, are not this bug. Or, let's keep it clear that this bug and the other 713** bugs started today, and those other bugs have been around for some time.
Yes, this is a new bug, being reported once an hour. A likely "mostfreq" before the day is over. Tempting to suggest blocker-status.
Assignee | ||
Comment 8•24 years ago
|
||
Seems like a dupe of bug 71317 to me. Please reopen if the problem persists. *** This bug has been marked as a duplicate of 71317 ***
Status: NEW → RESOLVED
Closed: 24 years ago
Resolution: --- → DUPLICATE
Reporter | ||
Comment 9•24 years ago
|
||
Fair enough. Tracy, if the win98 test runs tomorrow without conking out, then please slap a verify on this bug.
Assignee | ||
Comment 10•24 years ago
|
||
*** Bug 71332 has been marked as a duplicate of this bug. ***
Assignee | ||
Comment 11•24 years ago
|
||
*** Bug 71375 has been marked as a duplicate of this bug. ***
Assignee | ||
Comment 12•24 years ago
|
||
*** Bug 71395 has been marked as a duplicate of this bug. ***
Comment 13•24 years ago
|
||
i applied the patch from bug 71317 to a fresh CVS build, but after having browsed 3 sites i had 181 sockets hanging in CLOSE_WAIT. (linux)
Assignee | ||
Comment 14•24 years ago
|
||
sounds like this bug needs to be reopened.
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
Assignee | ||
Comment 15•24 years ago
|
||
reassigning to myself
Assignee: neeti → darin
Status: REOPENED → NEW
Reporter | ||
Comment 16•24 years ago
|
||
Or is R.K. pointing out a pre-existing problem (perhaps bug 67957, bug 70417 or bug 70605) that is independent of the leak and of this (assumed) consequence of the leak? Before today's build, this had never occurred on win98 running the test.
Assignee | ||
Comment 17•24 years ago
|
||
true, this bug is reported on win98. can someone test this on win98? thx!
Comment 18•24 years ago
|
||
an example: load http://www.digi.no Strangely enough, one last small picture won't load until a minute has passed. During this, there are 53 sockets hanging in CLOSE_WAIT. Then the last image seems to be "flushed" and renders, and checking open sockets to the site there is now ONE socket less open, but the first 52 remain hanging. It seems only the last socket used gets closed normally, but for each item on the page before that, one new socket is opened and never closed.
Comment 19•24 years ago
|
||
jrgm: On Linux I have never seen the *browser* leave sockets hanging in CLOSE_WAIT "forever" until now, on the 8th. This is in reality a Windows AND Linux bug, and it's about sockets not closing. That becomes "fatal" on Windows more quickly, since MS Windows allows so few simultaneously open sockets.
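For background: CLOSE_WAIT is the TCP state a socket sits in after the remote end has closed its side and before the local application calls close. The sketch below shows the application-side duty in its simplest form: an empty read means the peer is gone, and the reader must then release its own end. It uses `socket.socketpair()` purely so the example is self-contained (an assumption for illustration, not a real TCP connection to a web server), and the helper names are hypothetical.

```python
import socket


def drain_and_close(sock: socket.socket) -> bytes:
    """Read until the peer closes, then release our end of the connection."""
    chunks = []
    try:
        while True:
            data = sock.recv(4096)
            if not data:  # empty read: the peer has closed its side
                break
            chunks.append(data)
    finally:
        # Skipping this close is what leaves a TCP socket in CLOSE_WAIT.
        sock.close()
    return b"".join(chunks)


def demo():
    """Simulate a server that sends a body and then closes the connection."""
    client, server = socket.socketpair()
    server.sendall(b"response body")
    server.close()
    body = drain_and_close(client)
    # A closed Python socket reports a file descriptor of -1.
    return body, client.fileno()


if __name__ == "__main__":
    print(demo())
```

If the `close()` in the `finally` block were dropped, each finished transfer would keep one descriptor open, which matches the one-leaked-socket-per-item pattern described above.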
Assignee | ||
Comment 20•24 years ago
|
||
I am definitely seeing the same thing as R.K.Aa on linux.
Status: NEW → ASSIGNED
Assignee | ||
Comment 21•24 years ago
|
||
It looks like HTTP is leaking socket transports. It creates sockets with keep-alive status and then loses them along the way. There has been a bug open for some time about leaking HTTP channels and hence leaking their respective transports. R.K.Aa, maybe you're seeing that bug instead?? It's difficult to tell the difference between bug 31317 and 62388 from the perspective of netstat -tcpd.
Assignee | ||
Comment 22•24 years ago
|
||
Also, I noticed that the CLOSE_WAIT problem does not show up under gtkEmbed. Under mozilla, I see them pile up right away.
Assignee | ||
Comment 23•24 years ago
|
||
HTTP channels are being held open as well, which are probably holding onto the socket transports. Also, I disabled my fix to bug 66516, and this problem still persists!
Status: ASSIGNED → RESOLVED
Closed: 24 years ago → 24 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 24•24 years ago
|
||
Reopening... I did not mean to close this.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee | ||
Comment 25•24 years ago
|
||
I understand the problem!! The server (www.digi.no) is sending an HTTP/1.1 response with "Connection: close". HTTP does not convey this info to the socket transport b/c it assumes that the default mode of the socket transport is to automatically close when done, unless otherwise instructed. For keep-alive connections, HTTP tells the socket transport that it might be reused. So, I think the problem is in the socket transport. My recent changes on dougt's branch probably broke the old/assumed behavior of "close when done by default unless otherwise instructed."
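To make the keep-alive decision concrete, here is a minimal sketch of the standard HTTP rule: HTTP/1.1 connections are persistent unless the response says "Connection: close", while HTTP/1.0 connections are persistent only if the response says "Connection: keep-alive". The function `connection_is_reusable` is a hypothetical name for illustration and is not taken from the Mozilla source.

```python
def connection_is_reusable(http_version: str, headers: dict) -> bool:
    """Apply the standard persistence rule to a response's version and headers."""
    # Header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    tokens = {
        token.strip().lower()
        for token in lowered.get("connection", "").split(",")
        if token.strip()
    }
    if "close" in tokens:
        return False  # the server asked for the connection to be closed
    if http_version == "HTTP/1.1":
        return True  # persistent by default in HTTP/1.1
    return "keep-alive" in tokens  # HTTP/1.0 needs an explicit opt-in


if __name__ == "__main__":
    print(connection_is_reusable("HTTP/1.1", {"Connection": "close"}))
```

In the case described above the answer is "not reusable", so whichever layer owns the socket has to close it once the response is complete.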
Comment 26•24 years ago
|
||
ahh that would explain why our mail connections are being kept open too, for pop and news.
Assignee | ||
Comment 27•24 years ago
|
||
Assignee | ||
Comment 28•24 years ago
|
||
It turns out we had two problems in the socket transport:
1) We were not closing the socket transport on PR_POLL_HUP
2) We were not closing the socket transport when (mReuseCount == 0)
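The two conditions above can be sketched as a single decision function. This is only an illustration of the logic as described in this comment, not the attached patch: the flag values are made up (they are not NSPR's real constants), `reuse_count` stands in for mReuseCount, and the `transfer_done` parameter is an assumption about when the reuse check applies.

```python
# Hypothetical flag values for illustration; these are NOT NSPR's real constants.
POLL_READ = 0x1
POLL_HUP = 0x2  # stands in for PR_POLL_HUP


def should_close_transport(poll_flags: int, reuse_count: int, transfer_done: bool) -> bool:
    """Decide whether the transport should close its socket now."""
    # Problem 1: the peer hung up, so nothing more can arrive on this socket.
    if poll_flags & POLL_HUP:
        return True
    # Problem 2: the transfer finished and nobody asked to reuse the socket
    # (reuse_count mirrors the role of mReuseCount in the comment above).
    if transfer_done and reuse_count == 0:
        return True
    # Otherwise keep the socket: it is still transferring or marked reusable.
    return False


if __name__ == "__main__":
    print(should_close_transport(POLL_HUP, reuse_count=1, transfer_done=False))
```

With both checks in place, a "Connection: close" response ends with the socket closed, while a keep-alive connection with a non-zero reuse count stays open for the next request.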
Comment 29•24 years ago
|
||
oh baby, check that puppy in. sr=mscott cc'ing naving so he's in the loop on this as he was investigating many of the mail problems with the sockets being left open. Nice catch Darin.
Comment 30•24 years ago
|
||
applied the patch - now things look Good again :)
Comment 31•24 years ago
|
||
*** Bug 71423 has been marked as a duplicate of this bug. ***
Reporter | ||
Comment 32•24 years ago
|
||
The current builds on win98 are not showing the problem originally reported. This is fixed for that point.
Assignee | ||
Updated•24 years ago
|
Status: REOPENED → RESOLVED
Closed: 24 years ago → 24 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 33•24 years ago
|
||
Fix checked in
Comment 34•24 years ago
|
||
as far as i can tell, this fixed all the mailnews bugs about open sockets as well. Nice work.
Comment 35•23 years ago
|
||
verified