Closed Bug 135724 Opened 22 years ago Closed 21 years ago

DNS: Random lookup failures / very slow operation

Component: Core :: Networking
Type: defect
Platform: x86, FreeBSD
Priority: Not set
Severity: normal
Status: VERIFIED WORKSFORME
Reporter: cce
Assignee: gordon

After using Mozilla for a period of time, DNS lookups start to hang.
I don't think this is relevant, but I'm using DJB's local dnscache
program (part of djbdns), and lookups go to 127.0.0.1 over the
loopback interface.
I see the same problem.  Network failure and DNS server
failure have been ruled out.  In my case, I am not using DNS caching.

As the reporter states, DNS lookups function perfectly well for the first
little bit of time.  Within the same faulty session, DNS lookups will sometimes
succeed after multiple clicks on links, but this does not always work.  And, of
course, when Mozilla is exited and restarted, DNS lookups function fine again.

It was not stated by the original reporter, but in my case, I am testing a
native FreeBSD build of Mozilla, not the Linux version via binary compatibility.
Please provide build IDs. I assume nslookup etc. can still look
things up properly?
Component: URL Bar → Networking
I'm experiencing this on a fairly old build of FreeBSD 4.5-STABLE with RC2.  I
think I first noticed it with 0.9.9, which I also installed around the time I
ran that build, so it could just as likely be FreeBSD at fault...?

Pertinent information for anyone chasing this:

Mozilla 1.0 Release Candidate 2
Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.0rc2) Gecko/20020514

FreeBSD obie 4.5-STABLE FreeBSD 4.5-STABLE #3: Mon Feb  4 10:49:31 EST 2002    
floid@obie:/usr/obj/usr/src/sys/OBIE  i386
(Tree was probably pulled on 2/3..)

pkg_info (port revision):
mozilla-1.0.rc2,1   The open source, standards compliant web browser

I installed RC2 with portupgrade -R, so all the port dependencies should be
fresh and shiny and new as of last week or so.

The behavior seems to bear some resemblance to
http://bugzilla.mozilla.org/show_bug.cgi?id=117613 , but it's so random that I
can't say I've really seen a pattern to it.  It *is* very, very annoying when it
hits with 10-15 tabs open.

One theory that I might look into: I've noticed Mozilla sends A6 queries for
all lookups.  Maybe Something Bad happens intermittently when a domain actually
has an A6 record?  (Like most users, I've got IPv6 active, as the BSDs basically
require now, but haven't done anything to configure it.  Definitely no 6bone link.)
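
One way to probe the theory above is to time the same lookup once restricted to
IPv4 and once with the address family left open.  This is only a minimal sketch
in present-day Python, not anything taken from Mozilla or NSPR; `time_lookup` is
a hypothetical helper name.  AF_INET limits the resolver to IPv4 lookups, while
AF_UNSPEC lets it ask for IPv6 records as well (which record types a given
resolver actually sends, A6 or AAAA, is an assumption to verify with tcpdump).

```python
import socket
import time


def time_lookup(host, family):
    """Return (elapsed_seconds, result) for one getaddrinfo() call.

    result is a sorted list of address strings, or an error string
    if the resolver reported a failure.
    """
    start = time.monotonic()
    try:
        infos = socket.getaddrinfo(host, 80, family, socket.SOCK_STREAM)
        # Each entry is (family, type, proto, canonname, sockaddr);
        # sockaddr[0] is the textual address.
        result = sorted({info[4][0] for info in infos})
    except socket.gaierror as exc:
        result = "error: %s" % exc
    return time.monotonic() - start, result
```

Calling `time_lookup(name, socket.AF_INET)` and
`time_lookup(name, socket.AF_UNSPEC)` for a hostname that hangs in the browser
would show whether the extra IPv6 queries account for the delay.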

The problem seems to have gotten mildly more common after I *stopped* using a
local djbdns cache and switched to my ISP's.

Anyone experienced a cure via a fresher build of FreeBSD?
Just built today's 4-STABLE tree of FreeBSD, leaving me with:

FreeBSD obie 4.6-RC FreeBSD 4.6-RC #0: Thu May 30 22:35:22 EDT 2002    
floid@obie:/usr/obj/usr/src/sys/OBIE  i386

Still sucks with the same installation of RC2.  Next stop, RC3 if it's available
from ports and I can free up the disk space.

Haven't yet tried deciphering a tcpdump of a 'hanging' lookup/load attempt.
Been on RC3 since the day I posted the previous comment; it's "better," but not
perfect.  As there are many things about RC3 that are better but far from
perfect (Thus far, every session I launch hits a showstopper with the 'Find in
this page' feature long before this enter-the-URI-and-page-refuses-to-load
'feature' kicks in), I'm going to assume this really hinges on whatever code
lets multiple windows and tabbed browsing multitask properly, and probably
whatever was causing the persistent 'Transferring...' bug.

I'm assuming many of those remaining (better-defined) bugs will be quashed for
the release, so perhaps this'll let up fully in 1.0.

If it proves otherwise, then I'll definitely have to peel my lazy brain up and
see if I can tcpdump any evidence of a resolver problem.

(I'd still be really happy to know if anyone reading *doesn't* experience
this... if I'm part of a special case, then I should definitely expend effort
tracking this down even if I'm a relative noob... If everyone knows it sucks on
FreeBSD, I think I'd do a greater service reducing the s/n of everyone's inbox..)
Oops- my RC3 build ID:

Mozilla 1.0 Release Candidate 3
Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.0rc3) Gecko/20020531
This has been happening to me for the last several releases.  I had to cancel my
eMusic.com subscription because of it.  It's currently trivially reproducible
on both of my FreeBSD machines.  Just go to http://www.emusic.com/, something
will happen that keeps the page from loading for several minutes (I think
they're doubleclick.net banner ad IFRAMEs).  It's not just that page - trying to
load any other pages in the same or different windows/tabs just hangs for a few
minutes.  Then something times out and it's back to normal.

I use Galeon, but I have confirmed the same behavior in Mozilla

Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.0.0) Gecko/20020610

and even Netscape 6

Mozilla/5.0 (X11; U; Linux i386; en-US; rv:0.9.4.1) Gecko/20020508 Netscape6/6.2.3

(running on FreeBSD's linux compatibility).
This may have something to do with weird DNS behavior for images.geo.mp3.com.
I'm getting SERVFAIL sometimes, an IP address other times, and occasionally:

chris@lion-around:~$ dig images.mp3.com a

; <<>> DiG 8.3 <<>> images.mp3.com a
;; res options: init recurs defnam dnsrch
;; old answer:
;; ns_initparse: Message too long
;; old answer:
;; ns_initparse: Message too long
;; old answer:
;; ns_initparse: Message too long
;; old answer:
;; ns_initparse: Message too long

I can confirm this too, tested with 1.0, 1.0rc3 and 1.1alpha on FreeBSD 4.4.

I've had this problem with several sites, though not all sites seem to suffer
from it. When I try to access the same site with Opera, it opens instantly, so
there can't be a problem with the system DNS. Looking up the hostname with
'host' also works immediately.
does this problem still exist with 1.1beta?
hi,

 i have seen this in netscape 4.x as well as in mozilla 0.9x-1.1b ---
 every time our central dns server had trouble.
 is the following discussion outdated?
 
http://groups.google.de/groups?hl=de&lr=&ie=UTF-8&oe=UTF-8&frame=right&th=a5a54f22f6f45ba2&seekm=6dch1v%24ngj5%40secnews.netscape.com#s

 In my case, this happens with the following configuration:
 - NT workstations are told via DHCP to use the internal DNS.
 - Proxy autoconfig uses isInNet(host, "192.168.5.0", "255.255.255.0");
   to decide whether the host is on that net, the name is resolved
   using the internal DNS.
 While the central DNS is lagging, every time you open a link in a new
 window/tab which requires a lookup, the browser freezes.
 Well, that way we were always quickly informed when something had gone
 wrong with our DNS server, so maybe it's not a bug, but ...?
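
 The reason the isInNet() check above depends on DNS can be sketched as follows.
 This is a hedged Python approximation of what such a check has to do, not the
 browser's actual PAC implementation; `is_in_net` is a hypothetical name.  The
 point is that the hostname must be resolved (a blocking call) before the
 netmask comparison can happen, so a lagging DNS server stalls every check.

```python
import ipaddress
import socket


def is_in_net(host, pattern, mask):
    """Approximate a PAC-style isInNet(host, pattern, mask) check."""
    try:
        # Blocking resolver call: this is where a slow DNS server hurts.
        # A dotted-quad literal is returned unchanged without a DNS query.
        addr = socket.gethostbyname(host)
    except OSError:
        return False
    network = ipaddress.ip_network("%s/%s" % (pattern, mask), strict=False)
    return ipaddress.ip_address(addr) in network
```

 For example, `is_in_net("192.168.5.17", "192.168.5.0", "255.255.255.0")` is
 true without touching DNS, whereas passing a hostname forces a lookup first.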

platform: only freebsd?
jhacker: there are some complaints that DNS is still synchronous and blocking.
This discussion, however, should be focused on problems specific to BSD.

There are several bugs about how BSD+IPv6 are causing problems w/ DNS
performance. My DNS expertise is pre-IPv6, so I do not fully grasp the situation. 

Most of these bugs are in Networking, w/ BSD as the OS. If anyone could
summarize or dupe, that would be great.
Assignee: hewitt → new-network-bugs
Keywords: qawanted
QA Contact: claudius → benc
Summary: Random DNS lookup failures / very slow operation → DNS: Random lookup failures / very slow operation
Is anyone still seeing this on FreeBSD with a moz 1.2 build?
Assignee: new-network-bugs → gordon
*** Bug 154536 has been marked as a duplicate of this bug. ***
*** Bug 157880 has been marked as a duplicate of this bug. ***
Marking ASSIGNED to get it on my radar.
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
I tried Mozilla 1.2b and 1.1 today under the latest FreeBSD-STABLE and both seem
to exhibit this problem. It's quite a bit more annoying than anyone else has
described, though, and seems to happen more (though it might be my imagination)
when I'm trying to resolve a host without the 'www.' part, i.e.
'bugzilla.mozilla.org' or 'search.yahoo.com', whereas resolving www.yahoo.com
and www.mozilla.org seemed quick.
I continue to have the same problem described in comment 7. Recent win32 builds
of Moz and Phoenix (20021213) hang on certain elements of a page while loading,
and the rest of the page won't load. Stopping the browser and then reloading the
page usually gets me in. When this happens and I try Internet Explorer, it gets
in instantly while Mozilla continues to hang. Nslookup continues to work while
I'm experiencing this problem.

Since this is a general summary problem I'll also mention bugs 146769, 185128, 174733
http://bugzilla.mozilla.org/show_bug.cgi?id=146769
http://bugzilla.mozilla.org/show_bug.cgi?id=185128
http://bugzilla.mozilla.org/show_bug.cgi?id=174733
since they may be related.
Sorry.  The FreeBSD links went bad pretty quick.
If you do a search for the thread

Multi-threaded or async Mozilla (NSPR, really)

You'll find the posts in the FreeBSD mailing list.
Removing IPv6 support from the kernel is supposed to be a workaround.
I'm having this same issue.  DNS lookups are quick and work fine for
www.whatever.com, but for URLs of the structure subdomain.whatever.com they hang
forever.  For the stories on my.yahoo.com, they link to story.news.yahoo.com,
and this name _never_ resolves.  I'm running 1.0.1 which came with the RedHat 8
distro  I just slapped on my laptop.
If this is caused by the IPv6 AAAA (or A6?) DNS queries made
by Mozilla/NSPR (see comment 3 and comment 20), this bug and
bug 181610 are the same problem.  But I can't tell if this
bug is really caused by the AAAA/A6 DNS queries made by
Mozilla/NSPR because this bug is very long and contains
some speculation and irrelevant info.  Just wanted to note
that I have checked in a fix for bug 181610 that will cause
Mozilla 1.4alpha to not make AAAA/A6 DNS queries on a machine
with no IPv6 connectivity.
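
The idea of skipping IPv6 lookups on a machine without IPv6 connectivity can be
illustrated with a small Python sketch.  This is not the fix checked in for
bug 181610, whose details are not given here; it is one common heuristic, and
`has_ipv6_route`, `lookup_family`, and the probe address (taken from the
2001:db8::/32 documentation prefix) are assumptions made for illustration.

```python
import socket


def has_ipv6_route(probe_addr="2001:db8::1", port=53):
    """Guess whether this host has a route to global IPv6 addresses.

    connect() on a UDP socket sends no packets; it only asks the kernel
    to pick a route, and fails if there is none.
    """
    if not socket.has_ipv6:
        return False
    try:
        with socket.socket(socket.AF_INET6, socket.SOCK_DGRAM) as sock:
            sock.connect((probe_addr, port))
            return True
    except OSError:
        return False


def lookup_family():
    """Pick the address family to pass to getaddrinfo()."""
    return socket.AF_UNSPEC if has_ipv6_route() else socket.AF_INET
```

A resolver wrapper would then call
`socket.getaddrinfo(host, port, lookup_family())`, so that IPv4-only machines
never wait on IPv6 answers.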
The observation made in another bug is that AAAA records and long domain names
can cause the resolver to switch its transport to TCP. We seem to behave
badly when that happens, although I also suspect some people are using local DNS
servers that reject TCP connections.
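
For anyone inspecting a tcpdump capture: a resolver normally retries over TCP
when a UDP reply arrives with the truncation (TC) bit set in the DNS header,
which happens when the answer does not fit in a classic 512-byte UDP message.
The following Python sketch shows how to read that bit from a raw reply;
`is_truncated` is a hypothetical helper, and the sketch says nothing about how
NSPR or the system resolver handles the retry.

```python
import struct

TC_FLAG = 0x0200  # truncation bit within the 16-bit DNS header flags field


def is_truncated(response):
    """Return True if a raw DNS message has the TC bit set."""
    if len(response) < 12:
        raise ValueError("a DNS header is 12 bytes long")
    # The header starts with a 16-bit ID followed by 16 bits of flags,
    # both in network byte order.
    _msg_id, flags = struct.unpack("!HH", response[:4])
    return bool(flags & TC_FLAG)
```

If the capture shows truncated replies followed by TCP connection attempts that
are refused, that would support the explanation above.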

That would also explain why many of the problem reports relate just to uk, yahoo,
and doubleclick domains.

Checking in that change sounds great. We can at least start polling
the non-IPv6 users and see if the problem went away.

I'll put a cleanup of these bugs on my list for next week.
I recently experienced the same problem on SuSE 8.2 (2.4.20 kernel).  I experienced 
slow dns lookups in lynx, konqueror and mozilla.  However, dig and nslookup routinely 
resolved hostnames in about 10 ms. 
 
The problem was that SuSE loads the ipv6 module by default, even with nothing
configured to use IPv6.  As a fix on Linux, I aliased net-pf-10 to off in
/etc/modules.conf.  I had to reboot because for some reason the 'Used by' count was
-1, and I could not rmmod the module.  Hopefully this will help Red Hat users as well.
 
It seems to me that there are 3 distinct issues here:
- There is a bug in (several) OS vendor configurations.  IPv6 should not be enabled in
the kernel if it is not used.
- This is iffy: Can an application readily determine that it must use IPv4 if the kernel
is providing support for IPv6?  I haven't done any development using IPv6, so I can't
help here.  I do seem to recall that openssh had a similar problem on Slackware 8.0,
and I had to compile with --disable-ipv6 (or something similar).
- Is Mozilla able to properly resolve hostnames using IPv6 in an IPv6 environment?
I'm not sure this has been, or needs to be, addressed in this bug.  This discussion is
primarily centered on properly *not* using IPv6. :)
 
IMHO, this particular bug is an OS vendor issue and should be closed.  The third item 
above, if a problem at all, is a separate bug. 
 
Sorry I couldn't be more helpful to the FreeBSD folks.  Does FreeBSD use a module 
mechanism similar to Linux?  Any way to convince the FreeBSD team that ipv6 kernel 
support should be dynamically enabled? 
*** Bug 198700 has been marked as a duplicate of this bug. ***
I see something like this on 1.5beta, running RH 9 with all the updates.
nslookup works fine, but mozilla hangs trying to resolve.  One way to trigger
it faster is to load lots of different pages while the machine is compiling
something large (ie, heavy disk & memory pressure, so mozilla processes are
slow at times).  I've noticed that I can reload pages I've recently hit
(I'm assuming it cached the DNS result?)

If I shut down (press the X), then even after the GUIs go away,
the process is still there...  I have to kill it with killall or
similar.

I'm on an SMP machine, and this smells of deadlock to me, but I don't
know for certain.
This should be fixed on the trunk now.  See bug 205726 for details.  Please
reopen if the problem is reproducible on the trunk.
Status: ASSIGNED → RESOLVED
Closed: 21 years ago
Depends on: 205726
Resolution: --- → WORKSFORME
V:
Please try 1.6f for BSD. New problems to new bugs please.
Status: RESOLVED → VERIFIED
Keywords: qawanted