Closed
Bug 71391
Opened 24 years ago
Closed 24 years ago
Complete networking failure while running the page loader on win98
Categories
(Core :: Networking, defect)
VERIFIED
FIXED
People
(Reporter: jrgmorrison, Assigned: darin.moz)
Details
Attachments
(1 file)
997 bytes,
patch
Overview Description:
Beginning with today's builds (03/08) on win98, the page loading
test, after visiting a fair number of the pages, would throw up
a dialog 'Connection refused when trying to contact jrgm.mcom.com'.
After that dialog is dismissed, mozilla cannot visit *any* URL (it
just throws up the same alert). But, on top of that, neither Nav4.7
nor IE5.5 on the same machine could make the connection.
I mentioned this to gagan, and he said that there was an existing
DNS related bug, and this might be the same. However, I searched
for that bug, but couldn't find it, so here is a new bug (Sorry).
However, I'm wondering if this might be related to the leaks that
were happening earlier today. The reason why is that the error message
for Nav4.7, says:
"A network error occurred : unable to connect to server
(TCP Error: Not enough memory)
The server may be down or unreachable. Try connecting later."
Steps to Reproduce:
1) http://jrgm.mcom.com/page-loader/loader.pl and hit submit
2) go have a coffee; return to PC in ~20 minutes
Build Date & Platform Bug Found: win98 2001030813 build
Additional Builds and Platforms Tested On: not seen on Mac or Linux
Sidenote to twalker: the workaround is to copy the current URL in mozilla
to the clipboard, quit mozilla, restart, paste in the URL and continue the
test from there. [Actually, that's not a great workaround, since I don't
know that there aren't OS side effects that have happened, and maybe a reboot
would be the cleanest thing to do. But for now, we can just go with the simple
workaround].
Reporter | ||
Comment 1•24 years ago
|
||
This may be resolved with the mem leak fix, but if not, this is a pretty
serious problem (user cannot connect to anything on the network).
Reporter | ||
Comment 5•24 years ago
|
||
Okay, so the first set of other bugs are likely dups, and this is probably
the leak, for which there is a fix in hand, bug 71317.
The second set of bugs noted above, related to mailnews, are not this bug.
Or, let's keep it clear that this bug and other 713** started today, and
those other bugs have been around for some time.
Yes, this is a new bug, being reported once an hour. A likely "mostfreq" before
the day is over. Tempting to suggest blocker-status.
Assignee | ||
Comment 8•24 years ago
|
||
Seems like a dupe of bug 71317 to me. Please reopen if the problem persists.
*** This bug has been marked as a duplicate of 71317 ***
Status: NEW → RESOLVED
Closed: 24 years ago
Resolution: --- → DUPLICATE
Reporter | ||
Comment 9•24 years ago
|
||
Fair enough. Tracy, if the win98 test runs tomorrow without conking out, then
please slap a verify on this bug.
Assignee | ||
Comment 10•24 years ago
|
||
*** Bug 71332 has been marked as a duplicate of this bug. ***
Assignee | ||
Comment 11•24 years ago
|
||
*** Bug 71375 has been marked as a duplicate of this bug. ***
Assignee | ||
Comment 12•24 years ago
|
||
*** Bug 71395 has been marked as a duplicate of this bug. ***
Comment 13•24 years ago
|
||
i applied the patch from bug 71317 to a fresh CVS build, but after having
browsed 3 sites i had 181 sockets hanging in CLOSE_WAIT. (linux)
Assignee | ||
Comment 14•24 years ago
|
||
sounds like this bug needs to be reopened.
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
Assignee | ||
Comment 15•24 years ago
|
||
reassigning to myself
Assignee: neeti → darin
Status: REOPENED → NEW
Reporter | ||
Comment 16•24 years ago
|
||
Assignee | ||
Comment 17•24 years ago
|
||
true, this bug is reported on win98. can someone test this on win98? thx!
Comment 18•24 years ago
|
||
an example:
load http://www.digi.no
Strangely enough one last small picture won't load till a minute has passed.
During this, there are 53 sockets hanging in CLOSE_WAIT.
Then, the last image seems to be "flushed" - renders - and checking open sockets
to the site there is now ONE socket less open - but the first 52 remains
hanging. Seems only the last socket used gets closed normally, but for each item
on the page before that, one new socket is opened and never closes.
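A minimal sketch of how such a count can be taken, assuming Linux `netstat -tn`-style output with the state in the sixth column. The function name `count_tcp_states` and the sample rows (which use documentation-only addresses) are made up for illustration and are not output from this bug.

```python
from collections import Counter


def count_tcp_states(netstat_output: str) -> Counter:
    """Count TCP connection states in `netstat -tn`-style text."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed row layout: proto recv-q send-q local remote state
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            states[fields[5]] += 1
    return states


# Invented sample data, NOT captured from the browser in this report.
SAMPLE = """\
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        1      0 192.0.2.10:40112        198.51.100.7:80         CLOSE_WAIT
tcp        1      0 192.0.2.10:40113        198.51.100.7:80         CLOSE_WAIT
tcp        0      0 192.0.2.10:40114        198.51.100.7:80         ESTABLISHED
"""

if __name__ == "__main__":
    print(count_tcp_states(SAMPLE))
```

Running the counter before and after a page load makes it easy to see whether the CLOSE_WAIT total keeps growing.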
Comment 19•24 years ago
|
||
jrgm: On Linux I had never seen the *browser* leave sockets hanging in
CLOSE_WAIT "forever" until now, on the 8th.
This is in reality a Windows AND Linux bug - and it's about sockets not closing.
That becomes "fatal" on Windows quicker, since MS Windows allows so few
simultaneously open sockets.
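For readers unfamiliar with the state: a TCP socket sits in CLOSE_WAIT when the peer has closed its end but the local application has not yet called close. The sketch below reproduces that window on the loopback interface. The helper name `peer_closed_but_we_did_not` is hypothetical and the code only illustrates the general mechanism, not the browser's code path.

```python
import socket


def peer_closed_but_we_did_not():
    """Show the window in which a local socket would sit in CLOSE_WAIT."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)

    client = socket.create_connection(server.getsockname())
    client.settimeout(5)
    accepted, _ = server.accept()

    accepted.close()            # the "server" side closes first (sends FIN)
    data = client.recv(1024)    # empty bytes means the peer has closed

    # From here until client.close(), the client socket is in CLOSE_WAIT.
    still_open = client.fileno() != -1

    client.close()              # the step the leaking code never reached
    server.close()
    return data, still_open


if __name__ == "__main__":
    print(peer_closed_but_we_did_not())
```

If the final `client.close()` is skipped for every connection, each one stays in CLOSE_WAIT and counts against the process's socket limit.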
Assignee | ||
Comment 20•24 years ago
|
||
I am definitely seeing the same thing as R.K.Aa on linux.
Status: NEW → ASSIGNED
Assignee | ||
Comment 21•24 years ago
|
||
It looks like HTTP is leaking socket transports. It creates sockets with
keep-alive status and then loses them along the way. There has been a bug
open for some time about leaking HTTP channels and hence leaking their
respective transports. R.K.Aa, maybe you're seeing that bug instead? It's
difficult to tell the difference between bug 31317 and 62388 from the
perspective of netstat -tcpd.
Assignee | ||
Comment 22•24 years ago
|
||
Also, I noticed that the CLOSE_WAIT problem does not show up under gtkEmbed.
Under mozilla, I see them pile up right away.
Assignee | ||
Comment 23•24 years ago
|
||
HTTP channels are being held open as well, which are probably holding onto
the socket transports.
Also, I disabled my fix to bug 66516, and this problem still persists!
Status: ASSIGNED → RESOLVED
Closed: 24 years ago → 24 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 24•24 years ago
|
||
Reopening... I did not mean to close this.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee | ||
Comment 25•24 years ago
|
||
I understand the problem!! The server (www.digi.no) is sending an HTTP/1.1
response with "Connection: close". HTTP does not convey this info to the
socket transport b/c it assumes that the default mode of the socket transport
is to automatically close when done, unless otherwise instructed. For
keep-alive connections, HTTP tells the socket transport that it might be
reused. So, I think the problem is in the socket transport. My recent changes
on dougt's branch probably broke the old/assumed behavior of "close when done by
default unless otherwise instructed."
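The reuse decision described above can be sketched roughly as follows. This only models the standard HTTP rule (HTTP/1.1 connections are persistent unless the response says "Connection: close"; HTTP/1.0 connections close unless it says "Connection: keep-alive"). The function name `connection_is_reusable` is hypothetical and this is not the Mozilla implementation.

```python
def connection_is_reusable(http_version: str, headers: dict) -> bool:
    """Decide whether a connection may be kept for reuse after a response."""
    # Header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    tokens = {
        token.strip().lower()
        for token in lowered.get("connection", "").split(",")
        if token.strip()
    }
    if "close" in tokens:
        return False            # the case www.digi.no triggers
    if http_version == "HTTP/1.1":
        return True             # persistent by default
    return "keep-alive" in tokens  # HTTP/1.0 must opt in


if __name__ == "__main__":
    print(connection_is_reusable("HTTP/1.1", {"Connection": "close"}))
```

Whatever the decision, the layer that owns the socket still has to act on it; a "not reusable" answer is only useful if the socket is then actually closed.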
Comment 26•24 years ago
|
||
ahh that would explain why our mail connections are being kept open too, for pop
and news.
Assignee | ||
Comment 27•24 years ago
|
||
Assignee | ||
Comment 28•24 years ago
|
||
It turns out we had two problems in the socket transport:
1) We were not closing the socket transport on PR_POLL_HUP
2) We were not closing the socket transport when (mReuseCount == 0)
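As a rough illustration of the two conditions, here is a toy model in Python. Everything in it is an assumption made for the sketch: the class `ToySocketTransport`, the flag values, and the reading of `reuse_count` as "how many more times the socket may be reused". The real code is C++ in the socket transport and its exact semantics are not shown in this bug.

```python
# Hypothetical stand-ins for poll flags such as PR_POLL_READ / PR_POLL_HUP.
POLL_READ = 0x01
POLL_HUP = 0x10


class ToySocketTransport:
    """Toy model of a transport that closes when it should."""

    def __init__(self, reuse_count: int = 0):
        self.reuse_count = reuse_count  # stands in for mReuseCount
        self.closed = False

    def close(self) -> None:
        self.closed = True

    def on_poll_event(self, flags: int) -> None:
        # Condition 1: the peer hung up, so close the transport.
        if flags & POLL_HUP:
            self.close()

    def on_transaction_done(self) -> None:
        # Condition 2: nobody asked for reuse, so close by default.
        if self.reuse_count == 0:
            self.close()


if __name__ == "__main__":
    transport = ToySocketTransport(reuse_count=0)
    transport.on_transaction_done()
    print(transport.closed)
```

The point of the model is the default: a transport that was never marked for reuse closes when its work is done, and a hang-up closes it regardless.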
Comment 29•24 years ago
|
||
oh baby, check that puppy in. sr=mscott
cc'ing naving so he's in the loop on this as he was investigating many of the
mail problems with the sockets being left open. Nice catch Darin.
Comment 30•24 years ago
|
||
applied the patch - now things look Good again :)
Comment 31•24 years ago
|
||
*** Bug 71423 has been marked as a duplicate of this bug. ***
Reporter | ||
Comment 32•24 years ago
|
||
The current builds on win98 are not showing the problem originally reported.
This is fixed for that point.
Assignee | ||
Updated•24 years ago
|
Status: REOPENED → RESOLVED
Closed: 24 years ago → 24 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 33•24 years ago
|
||
Fix checked in
Comment 34•24 years ago
|
||
as far as i can tell, this fixed all the mailnews bugs about open sockets as
well. Nice work.
Comment 35•24 years ago
|
||
verified