Closed Bug 146769 Opened 23 years ago Closed 22 years ago

DNS: gets stuck resolving host

Categories

(Core :: Networking, defect)

defect
Not set
major

VERIFIED WORKSFORME

People

(Reporter: swordedge, Assigned: gordon)

Details

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (OS/2; U; Warp 4.5; en-US; rv:1.0rc2) Gecko/20020510
BuildID: 2002051021

Basically, Mozilla is looking up the IP address of a web site and gets stuck doing that. When this happens, I can use nslookup on a command line and get a response back before I get my finger off the return key. So something in Mozilla is, on occasion, getting stuck with this operation. This problem is intermittent. I have seen deleting the history.dat file suggested as a solution, but that doesn't work. Nor is it a good solution. The address works fine again after the original operation times out in something like four minutes, which is way too long.

Reproducible: Sometimes

Steps to Reproduce:
1. Go to any web page.

Actual Results: Intermittent. Can't reliably reproduce it.

Expected Results: When this happens, Mozilla should start retrieving the page, not say "resolving host" for many minutes.

I have seen this on Windows and OS/2. I consider this severe, as it looks real, real bad when Mozilla can't find an IP address. A major Mozilla impression hit.
-> Networking
Assignee: Matti → new-network-bugs
Component: Browser-General → Networking
QA Contact: imajes-qa → benc
This could be a DNS or a more general networking bottleneck. When this happens, can you open a new window and go someplace else via ip address? Can you give some specific steps that produce this problem?
I didn't realize the email address was bogus...

If I open a command line and do an nslookup, I get the IP address immediately, before I can get my pinky off the enter key. If I cut and paste the address that nslookup found into the URL line of a new browser window, the browser goes there pronto. This is definitely some sort of Mozilla DNS lookup problem. I have seen this behavior on both OS/2 and Windows, so it should be in common code.

As for specific steps, good question. The problem is very intermittent. Sometimes I will see it several times a day. Then I might not see it for several days. I do not know of a specific sequence of events that makes the problem occur. If I leave the browser to time out (what, four or five minutes?), it puts up the "can't find URL" error window. If I then go right back to the same URL, it usually goes to it immediately.

When the problem is occurring, I can open a new browser instance and plug in any URL I want. If that URL is anything other than an IP address, it gets stuck too. If I click on a link on the site that has any URL in it (absolute links), including links to itself, it gets stuck while the problem is occurring. It looks like Mozilla does not remember any IP address that it has previously seen during this session (they aren't likely to change while you have the browser open!).

More info... New data... It happened again. So I nslooked up the site, opened a new browser window, used the IP address and went to the site. Worked great (no absolute links). About five minutes later, up pops a window asking me to save the file that I had clicked on before the DNS problem occurred. I successfully saved the file. I went to the tab that was halted, and it had the page on it perfectly. It just took something like five minutes to figure out where the page was located. No error message about not finding the page; it loaded instead.
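To make the "nslookup is instant, the browser waits minutes" comparison concrete, here is a minimal Python sketch of a lookup bounded by a timeout. It is not Mozilla's code. The function name `resolve_with_timeout`, the 5-second default and the injectable `resolver` parameter are assumptions made for illustration; only `socket.gethostbyname` and `concurrent.futures` are real standard-library APIs.

```python
import concurrent.futures
import socket


def resolve_with_timeout(host, timeout_s=5.0, resolver=socket.gethostbyname):
    """Return the address for `host`, or None if the lookup takes too long.

    `resolver` is injectable (hypothetical parameter) so that a slow or
    stuck resolver can be simulated without touching the network.
    """
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(resolver, host)
    try:
        # Wait at most timeout_s seconds instead of several minutes.
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        return None
    finally:
        # Do not block the caller on the stuck worker thread.
        pool.shutdown(wait=False)
```

Calling `resolve_with_timeout("www.example.com")` either returns quickly or gives up after the timeout. The worker thread itself is not killed, so this only bounds how long the caller waits; it does not cure a resolver that is genuinely stuck.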
This bug seems to have something to do with opening a link in a new window. When this occurs, current downloads stop and no page in any window can find a host until it clears in roughly five minutes.
I have noticed exactly the same problem on Linux (MZ build 2002060904), except that it does not take 5 minutes for Mozilla to resolve the host (rather 1 minute or so) and that it seems to have no link with "opening a link in new window".
Tcpdumps suggest that the resolutions that mozilla is trying aren't the ones that the status bar is complaining about. Is resolution in mozilla single-threaded such that one stuck resolution can block the entire queue? I'm starting to think the problem is bad inline images on a page, not the page itself...
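The question about a single-threaded queue can be illustrated with a small model. This is a sketch only and says nothing about how Mozilla's resolver is actually built: it assumes lookups are served first-in, first-out by a fixed number of resolver threads, and the function name `completion_times_ms` and the durations are made up for the example.

```python
import heapq


def completion_times_ms(durations_ms, workers=1):
    """Finish time of each lookup when served FIFO by `workers` threads.

    durations_ms: how long each lookup takes on its own, in milliseconds.
    """
    # Each heap entry is the time at which one worker becomes free.
    free_at = [0] * workers
    heapq.heapify(free_at)
    finished = []
    for duration in durations_ms:
        start = heapq.heappop(free_at)   # earliest available worker
        end = start + duration
        finished.append(end)
        heapq.heappush(free_at, end)
    return finished


# One stuck lookup (4 minutes) followed by two normal ones (50 ms each).
queue = [240_000, 50, 50]
single = completion_times_ms(queue, workers=1)
double = completion_times_ms(queue, workers=2)
```

With one worker, the two fast lookups cannot finish until the stuck one does, which matches the symptom of every new host hanging at once. With a second worker, they finish in 50 ms and 100 ms while the stuck lookup is still pending.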
I believe this problem has been seen and reported to DNS before...
Whiteboard: dupeme
I'm seeing a similar problem to the one described here. I'm not sure if there's another bug open for it; I didn't see one. There seems to be a small window of time (in all likelihood, during the DNS resolution) in which hitting "stop" for a page load will make all Mozilla DNS lookups block for some time. For example, I just had the following happen:

1. Type http://www.cnn.com/ in the location bar and hit enter.
2. Almost immediately hit "stop".
3. All subsequent attempted page loads which required new (non-cached) DNS information hung for a couple of minutes. Then everything went back to normal.

Additional page loads from sites I'd already been at (thus, DNS info was probably cached) worked fine. nslookup on the hostnames for the blocked load attempts also worked fine. I see this a couple of times a week on average. Currently I'm using build 2002082608 on FreeBSD, but I've seen this (or similar) problems on builds for a long time, and also on MacOS X. Is there a more appropriate bug for this issue?
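The observation that already-visited sites keep working while new hosts hang is what a per-session DNS cache would produce. The sketch below is a hypothetical illustration in Python, not Mozilla's cache: the class name `SessionDnsCache`, the 300-second TTL and the injectable `resolver` and `clock` parameters are all assumptions.

```python
import time


class SessionDnsCache:
    """Remember resolved addresses so repeat visits skip the resolver."""

    def __init__(self, resolver, ttl_s=300.0, clock=time.monotonic):
        self._resolver = resolver   # function: hostname -> address
        self._ttl_s = ttl_s         # how long an entry stays valid
        self._clock = clock         # injectable so expiry can be tested
        self._entries = {}          # hostname -> (address, expires_at)

    def lookup(self, host):
        now = self._clock()
        entry = self._entries.get(host)
        if entry is not None and entry[1] > now:
            # Cache hit: the (possibly blocked) resolver is never called.
            return entry[0]
        address = self._resolver(host)
        self._entries[host] = (address, now + self._ttl_s)
        return address
```

In this model, only a cache miss reaches the resolver, so a blocked resolver would stall new hostnames while cached ones continue to load.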
Sean: please file a new bug on your issue. I have seen this myself, but not given it enough attention to file a bug.
Done: Bug 164988
I'm interested in trying to understand this better. Sean filed a bug about problems w/ stop not unlocking the DNS service. Can we get steps for the "Open New Window" problem here?
Summary: gets stuck resolving host → DNS: gets stuck resolving host
I ran into this problem only after upgrading from Mozilla 1.1B to 1.1 on Windows 2000. It will NOT resolve any host address and after 30 seconds displays an alert that the connection was refused. All subsequent attempts are immediately met with that same alert. IP addresses produce the same result. IE 5.5 launches and resolves the URL almost instantly.
I also have this problem. I had it with mozilla 1.0rc3 on FreeBSD 4.6.2 and still have it with mozilla 1.1 on FreeBSD 4.7. My work around is to quit then restart mozilla.
gordon, can you look at this.
Assignee: new-network-bugs → gordon
I have also experienced this or a similar bug on Mozilla 1.0 running on Slackware Linux 8.1. Galeon 1.2.5 also similarly affected. But this problem only seems to have developed recently for me...so that could possibly be a clue that some configuration change could be affecting this...or maybe not. :)
I experience this as well... several times a day at the moment. All is well if I restart the browser, start a new mozilla process, or submit the query in Opera. It's d*** frustrating and eats CPU. I run Mozilla 1.2B on a Linux 2.2.18 machine.
I should have mentioned that when this bug occurs I can refresh existing pages just fine, but cannot access other pages not currently open.
I also see this. Debian unstable, kernel 2.4.19 I see it with Mozilla 1.1 and with 2002111218 (mozilla-snapshot package). It is *extremely* annoying, and seems quite unnecessary: the problematic host name is always (instantly) resolvable with "host" or "nslookup", but mozilla hangs for a minute. No pages can be opened at all, until the first one is found (then all I have tried to open come at once...) I also note that File->Exit leaves the Mozilla process around (but the window disappears). This means I cannot run mozilla again (the Debian scripts try to remote-control existing instances first). So I first must kill all mozilla processes, then re-run. Seems therefore that something very synchronous is happening... can't even be exited! Anyway, I see this as a severe bug, as it affects me almost daily! /Mikael
When I get this bug, which after all this time I still have and still can't force to happen, I kill mozilla. I can't start a new one till I run a process killer called watchcat. According to it, mozilla is still running, 99% CPU usage and running a process called BufferCreator. Kill that and I can start mozilla again. What ever BufferCreator is doing, it needs fixed.
I'm seeing this on 1.2.1 under MacOS 9.2.2. For me it seems to happen when I leave the browser running for a few days. I open a new window and try to go to google (for whatever reason that's the only site this happens to me with). I type a few letters into the URL box in the nav bar at the top and hit enter. The little info thing at the bottom of the window just sits there with "resolving host" for minutes and minutes. I've never seen it time out, even after a long time. Opening a new window won't help either. I can go to other new sites (e.g. hotbot worked for me all weekend while this problem was happening). I ran Internet Explorer to see if the problem was something other than Mozilla, and it loaded google fine. I jumped over to Mozilla thinking, oh, maybe google's working again, but no, on Mozilla it still wouldn't load. Quitting the browser fixed the problem, though it took me three days before I asked someone and they suggested it.
I seem to have managed to get round this by creating a completely new profile with the same properties as the old one. With this new profile the problem has never occurred, whereas with the previous profiles it occurs constantly.
can you preserve the old profile?
Same goes for me, never had a Mozilla browser that actually has *ever* worked on my WIN98 SE computer, and this latest version that is supposed to be stable is yet another. Heard lots of great things about it, but worthless if it doesn't work. I've got 64MB RAM, lots of HD space and nothing on the install appears to be out of place.
mozilla 1.2.1, redhat linux 9. I see this after mozilla has been running for about two or three days. For no apparent reason it gets stuck in the 'Resolving host ...' stage when a new site is accessed. Existing pages can be refreshed OK. URLs with numeric addresses work OK. Other browsers and DNS lookups are fine when this is happening. Exiting mozilla leaves a couple of processes hanging around which have to be killed manually before mozilla can be restarted. After that, it all works fine for another two days or so.

The problem seems to affect mozilla mail too. While it's happening you can't send email messages; it gets stuck contacting the SMTP server.

When this is happening, strace of the main (?) mozilla thread shows this...

    read(3, "\372", 1) = 1
    gettimeofday({1060236565, 665000}, NULL) = 0
    ioctl(5, FIONREAD, [0]) = 0
    poll([{fd=5, events=POLLIN}, {fd=12, events=POLLIN}, {fd=8, events=POLLIN}, {fd=3, events=POLLIN}], 4, 0) = 0
    gettimeofday({1060236565, 665705}, NULL) = 0
    gettimeofday({1060236565, 665852}, NULL) = 0
    write(5, "5\30\4\0\375Dl\2H\0\0\0\21\0\21\0;\3\5\0\302@l\2\0\0\0"..., 848) = 848
    ioctl(5, FIONREAD, [0]) = 0
    poll([{fd=5, events=POLLIN}, {fd=12, events=POLLIN}, {fd=8, events=POLLIN}, {fd=3, events=POLLIN, revents=POLLIN}], 4, -1) = 1
    gettimeofday({1060236565, 714604}, NULL) = 0
    gettimeofday({1060236565, 714724}, NULL) = 0
    gettimeofday({1060236565, 714768}, NULL) = 0
    gettimeofday({1060236565, 714940}, NULL) = 0
    read(3, "\372", 1) = 1
    ...

looping forever, with pretty much no delay in the poll, read or write as far as I can tell. I've let it run for 1/2 an hour and it remains stuck until you hit the stop button.

The FDs shown by lsof are...

    mozilla-b 22911 djh900 3r FIFO 0,5 847808 pipe
    mozilla-b 22911 djh900 4w FIFO 0,5 847808 pipe
    mozilla-b 22911 djh900 5u unix 0xc239aac0 847809 socket

There are a couple of other threads/processes too, one of which (IIRC) is stuck in a futex().

I'm pretty sure tcpdump shows a flurry of packets at the start of all this (after entering the URL), but shows no traffic during the rest of the 'stuck' state. My ~/.mozilla was initially created by mozilla 1.0x. FWIW, nscd is not running and resolv.conf is...

    search anu.edu.au
    nameserver 130.56.4.1
    nameserver 150.203.1.10
    nameserver 150.203.22.28
    nameserver 150.203.35.2

Thanks.
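For anyone comparing resolver setups across the affected machines, the resolv.conf above can be summarised with a few lines of Python. This is a minimal sketch: the function name `parse_resolv_conf` is made up, and it handles only the `nameserver` and `search` keywords that appear in the report. One thing it makes visible is that four nameservers are listed, while the resolv.conf(5) man page documents a limit of MAXNS (currently 3) name servers, so the fourth entry may never be consulted. Whether that has any bearing on this bug is unknown.

```python
def parse_resolv_conf(text):
    """Return (nameservers, search_domains) from resolv.conf-style text."""
    nameservers = []
    search = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines (';' or '#' in first column).
        if not line or line[0] in "#;":
            continue
        parts = line.split()
        if parts[0] == "nameserver" and len(parts) >= 2:
            nameservers.append(parts[1])
        elif parts[0] == "search":
            # A later search line replaces an earlier one.
            search = parts[1:]
    return nameservers, search


# The configuration quoted in the comment above.
REPORTED = """\
search anu.edu.au
nameserver 130.56.4.1
nameserver 150.203.1.10
nameserver 150.203.22.28
nameserver 150.203.35.2
"""

servers, domains = parse_resolv_conf(REPORTED)
```

Here `servers` holds the four addresses in order and `domains` holds the single search domain.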
Just to add one more experience to the list... Mozilla 1.5a Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5a) Gecko/20030718 on RedHat 9 I see the (what appears to be the...) same problem -- Mozilla decides (sometimes...) to take a vacation with "Resolving host www.foo.com..." at the bottom of the window. Dig and host both answer IPv6 and IPv4 queries for www.foo.com just fine while Mozilla is stuck on 'planet mozilla'... Best wishes...
I've had this problem happen to me with 1.2.1, 1.4, and now 1.5b on RH9. Exact same symptoms, and exiting normally leaves behind a mozilla-bin process that has to be SIGKILLed. Why hasn't this bug been assigned? It's a show-stopper when it happens.
FWIW this is also a problem in Firebird 0.6.1 and 0.7 on RH9. Stop, kill -9 pid, restart, sigh.
i suspect this has been fixed now that the DNS rewrite landed (bug 205726). please test against a trunk build. 0.7 is based on mozilla 1.5 which does not include the DNS rewrite. marking WORKSFORME
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → WORKSFORME
VERIFIED/WFM: DNS cleanup, since rewrite has been 1.6a->1.6f. Regressions or new problems need new bugs.
Blocks: 205726
Status: RESOLVED → VERIFIED
Whiteboard: dupeme
I still see this in 1.7.3. Bug 260832 seems to describe the same problem, so if benc wants to throw away the info accumulated in this bug, people still seeing the problem might want to head over that way.
Akkana, put a reference back to this bug if you like, but if this is a persistent problem after a rewrite, why use the same bug? The first report here was for pre-1.0 on OS/2.