Closed Bug 231007 Opened 21 years ago Closed 19 years ago

DNS: IPv6: Intermittent errors, tcpdump shows query to 127.0.0.1

Product/Component: Core :: Networking
Type: defect
Platform: x86, Linux
Priority: Not set
Severity: major
Status: VERIFIED EXPIRED
Reporter: felix-mozilla
Assignee: darin.moz
Attachments: 1 file
User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7a) Gecko/20040114 Firebird/0.8.0+
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7a) Gecko/20040114 Firebird/0.8.0+

My /etc/resolv.conf contains exactly one DNS server: fec0:0:0:ffff::1 (that is
an IPv6 anycast address that is served on the same machine).  My Firebird
started giving me intermittent DNS errors ("Address Not Found Error",
"wireservice.wired.com could not be found. Please check the name and try
again.")  This happens with all names, not just this one.

tcpdump on loopback reveals that my dnscache works fine and answers all queries,
but Firebird sends about every third to fifth query to 127.0.0.1 instead of the
correct address.  There is no dnscache on 127.0.0.1, so the result is an ICMP
port unreachable message.  I can see that Firebird then resends the query three
times, each time with the same result, and then gives up and shows me
the error message.

Retrying the same page helps most of the time, but sometimes I have to reload
several times.  This is particularly unnerving since Firebird apparently does
not have its own DNS cache, so if I go to a web page with many inline images,
most of the time none of them will show up, even if I get the web page itself.

Reproducible: Sometimes

Steps to Reproduce:
1. just click around on the net, for example to slashdot.org, and try to follow
some links (it does not matter whether the links are internal to slashdot or
external)
2. run tcpdump port 53 to watch the DNS packets
3. watch in amazement as Mozilla queries the wrong server every now and then

Actual Results:  
Mozilla sometimes asks 127.0.0.1 instead of fec0:0:0:ffff::1

Expected Results:  
ask fec0:0:0:ffff::1
Can you attach a networking log?  See the instructions here:
http://www.mozilla.org/projects/netlib/http/http-debugging.html (you may want to
trim the log to the relevant part if the issue doesn't occur on the first few
pages, as the log tends to grow rapidly).
Can you put the resolv.conf file and the tcpdump output in the bug?
Summary: Intermittent DNS errors, tcpdump shows that firebird asks wrong server → DNS: IPv6: Intermittent errors, tcpdump shows query to 127.0.0.1
Here is partial tcpdump output ("tcpdump -n -i lo -s 0 port 53"):

01:33:43.101374 IP 127.0.0.1.32809 > 127.0.0.1.53:  10005+ AAAA? www.heise.de. (30)
01:33:43.101734 IP 127.0.0.1.32809 > 127.0.0.1.53:  10005+ AAAA? www.heise.de. (30)
01:33:43.101779 IP 127.0.0.1.32809 > 127.0.0.1.53:  10006+ AAAA? www.heise.de. (30)
01:33:43.101811 IP 127.0.0.1.32809 > 127.0.0.1.53:  10006+ AAAA? www.heise.de. (30)
01:33:43.101852 IP 127.0.0.1.32809 > 127.0.0.1.53:  10007+ A? www.heise.de. (30)
01:33:43.101884 IP 127.0.0.1.32809 > 127.0.0.1.53:  10007+ A? www.heise.de. (30)
01:33:43.101918 IP 127.0.0.1.32809 > 127.0.0.1.53:  10008+ A? www.heise.de. (30)
01:33:43.101963 IP 127.0.0.1.32809 > 127.0.0.1.53:  10008+ A? www.heise.de. (30)

I said "port 53", so we don't get the icmp port unreachable here.
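
For anyone who wants to quantify how often queries go to the wrong server, a capture like the one above can be tallied by destination. The following is a minimal sketch, assuming `tcpdump -n` output in the format shown (numeric addresses, port appended after the last dot); the function name and the regular expression are illustrative and not part of any Mozilla or tcpdump tooling.

```python
import re
from collections import Counter

# Matches DNS *query* lines as printed by "tcpdump -n", for example:
# 01:33:43.101374 IP 127.0.0.1.32809 > 127.0.0.1.53:  10005+ AAAA? www.heise.de. (30)
# Response lines carry no "TYPE?" token and are therefore skipped.
QUERY_RE = re.compile(
    r"^\S+\s+IP6?\s+(?P<src>\S+)\s+>\s+(?P<dst>\S+):\s+"
    r"\d+\S*\s+(?P<qtype>[A-Z0-9]+)\?\s+(?P<name>\S+)"
)


def tally_query_destinations(lines):
    """Count DNS queries per destination server address (port 53 only)."""
    counts = Counter()
    for line in lines:
        match = QUERY_RE.match(line.strip())
        if not match:
            continue
        # tcpdump appends the port after the final dot, for IPv4 and IPv6 alike.
        server, _, port = match.group("dst").rpartition(".")
        if port == "53":
            counts[server] += 1
    return counts
```

Feeding it the lines of a saved capture (for example `tally_query_destinations(open("dns.txt"))`, where `dns.txt` is a hypothetical file name) gives a count per server, which makes the share of queries sent to 127.0.0.1 easy to compare against those sent to fec0:0:0:ffff::1.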

And this is what my /etc/resolv.conf looks like:

  nameserver fec0:0:0:ffff::1
  #nameserver 10.0.0.1
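
As a quick sanity check of the configuration, the active nameserver entries can be extracted from the file. This is a minimal sketch, not glibc's actual parser: the function names are made up for illustration, and leading whitespace is stripped because the listing above is indented. The fallback to 127.0.0.1 reflects the documented resolv.conf(5) default when no nameserver entry is present; whether that default plays any role in this bug is not established here.

```python
def parse_nameservers(resolv_conf_text):
    """Return the addresses of all active 'nameserver' lines, in file order."""
    servers = []
    for raw in resolv_conf_text.splitlines():
        line = raw.strip()
        # Lines starting with '#' or ';' are comments in resolv.conf.
        if not line or line[0] in "#;":
            continue
        fields = line.split()
        if fields[0] == "nameserver" and len(fields) >= 2:
            servers.append(fields[1])
    return servers


def effective_nameservers(resolv_conf_text):
    """Apply the resolv.conf(5) default: no entries means the local machine."""
    return parse_nameservers(resolv_conf_text) or ["127.0.0.1"]
```

For the file shown above, `parse_nameservers` yields only `fec0:0:0:ffff::1`, since the 10.0.0.1 line is commented out.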
firebird is just calling glibc functions.  in this case, firebird is likely
calling getaddrinfo.  maybe this is a glibc bug.  do any other applications
misbehave in a similar fashion?  i'll attach a little test program that can be
used to simulate getaddrinfo being called by an application.  try testing this
program to see how it behaves.
Attached file test.c
compile with this line:

cc test.c -o test
felix: if you can run the test program and report back on how it behaves that
would be very helpful.  thanks!!
No other glibc program behaves like this.  In fact, I'm currently using 
Konqueror to browse the web because of these problems, and it is of course 
linked to the same glibc.  Also, MozillaFirebird 0.7 does not have the problem, 
only the CVS version.  I'm using Gentoo, so the 0.7 version was also compiled 
on my box with the same glibc and the same gcc, it's not a binary from someone 
else. 
felix: ok, that's helpful info... can you still please run the test program? 
i'm not certain that konqueror is using getaddrinfo on your OS.
I just kept clicking around with the firebird 0.7 release, and it turns out 
that it also produces the 127.0.0.1 DNS queries, it just took a lot longer. 
 
Mhh.  It may be a glibc bug after all.  I will open a Gentoo bug on this, maybe 
they have an idea. 
 
BTW: your test program is broken, it prints IPv6 addresses incorrectly ;-) 
The test program does not exhibit the problem.  I think it only happens if the
same application makes a lot of getaddrinfo calls.  I just added a loop to the
test program that runs the same getaddrinfo call 100 times, but it still does
not reproduce the bug.  Maybe Firebird overwrites some glibc memory here?
 
Konqueror, by the way, also supports IPv6, so I am quite sure that it also uses 
getaddrinfo for DNS resolution. 
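
A repeated-lookup test along the lines described above could also issue the calls from several threads at once. The following is a minimal sketch and not the attached test.c: it goes through Python's `socket.getaddrinfo`, which wraps the platform's getaddrinfo, and the function names, thread count, and round count are arbitrary choices. The use of threads is an assumption about what might distinguish a browser from a simple loop; nothing in this report confirms that concurrency is involved.

```python
import socket
from concurrent.futures import ThreadPoolExecutor


def resolve(host, flags=0):
    """Return the sorted, de-duplicated addresses getaddrinfo reports for host."""
    infos = socket.getaddrinfo(host, None, type=socket.SOCK_STREAM, flags=flags)
    # info[4] is the sockaddr tuple; its first element is the printable
    # address for both IPv4 and IPv6 results.
    return sorted({info[4][0] for info in infos})


def stress(host, rounds=100, workers=4, flags=0):
    """Resolve host `rounds` times across `workers` threads; return failure count."""
    def attempt(_):
        try:
            return bool(resolve(host, flags))
        except socket.gaierror:
            return False

    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(attempt, range(rounds)))
    return results.count(False)
```

Running `stress("www.heise.de")` while `tcpdump -n -i lo port 53` is capturing would show whether failed lookups coincide with queries sent to 127.0.0.1.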
hmm.. firebird 0.7 did not use getaddrinfo, so that is a very interesting data
point.  during the mozilla 1.6 development cycle, the mozilla host resolver was
rewritten to use getaddrinfo for ipv6 support.  otherwise, it still calls
gethostbyname_r on linux.  konqueror could be calling gethostbyname2 instead to
deal with ipv6.  though, from past experience with that codebase, i think it is
probably calling getaddrinfo as well.  i really don't know for sure.

memory corruption is a slim possibility here.  it is very odd though that the
same problem can be seen before and after the mozilla host resolver rewrite :-/

do you know if anyone else has seen this problem?

have you tried creating a new mozilla user profile?

can you reproduce this using a mozilla nightly build?
This is an automated message, with ID "auto-resolve01".

This bug has had no comments for a long time. Statistically, we have found that
bug reports that have not been confirmed by a second user after three months are
highly unlikely to be the source of a fix to the code.

While your input is very important to us, our resources are limited and so we
are asking for your help in focussing our efforts. If you can still reproduce
this problem in the latest version of the product (see below for how to obtain a
copy) or, for feature requests, if it's not present in the latest version and
you still believe we should implement it, please visit the URL of this bug
(given at the top of this mail) and add a comment to that effect, giving more
reproduction information if you have it.

If it is not a problem any longer, you need take no action. If this bug is not
changed in any way in the next two weeks, it will be automatically resolved.
Thank you for your help in this matter.

The latest beta releases can be obtained from:
Firefox:     http://www.mozilla.org/projects/firefox/
Thunderbird: http://www.mozilla.org/products/thunderbird/releases/1.5beta1.html
Seamonkey:   http://www.mozilla.org/projects/seamonkey/
This bug has been automatically resolved after a period of inactivity (see above
comment). If anyone thinks this is incorrect, they should feel free to reopen it.
Status: UNCONFIRMED → RESOLVED
Closed: 19 years ago
Resolution: --- → EXPIRED
V. please reopen if you have more data.
Status: RESOLVED → VERIFIED