Closed Bug 245174 Opened 20 years ago Closed 20 years ago

very slow connection on Fedora Core 2 (2.6.5-1.358 kernel)

Categories

(Core :: Networking: HTTP, defect)

x86
Linux
defect
Not set
normal

RESOLVED DUPLICATE of bug 239358

People

(Reporter: jonathanbaron7, Assigned: darin.moz)

Details

(Keywords: helpwanted)

Attachments

(1 file)

User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8a2) Gecko/20040522
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8a2) Gecko/20040522

The New York Times takes over 30 sec to load.  The status bar
says "Resolving host www.nytimes.com" during this time.


Reproducible: Always
Steps to Reproduce:
1. install Fedora Core 2
2. go to NY Times web site

Actual Results:  
long delay


Expected Results:  
gone to the site


After reading through bug reports (particularly 234320 and 68798)
and various posts about ipv6 in www.fedoraform.org forums and
particularly this thread:
http://www.redhat.com/archives/fedora-test-list/2004-March/msg00280.html
I decided that the problem had to do with ipv6.  So I set
network.dns.disableIPv6 to true and that did no good.  (That
may be where the bug is.  Shouldn't it work?)  I also tried
adding www.nytimes.com to network.dns.ipv4OnlyDomains, and that
didn't help either.
The problem did not occur with Opera or Lynx.  Use of tcpdump
revealed that the DNS servers were simply not returning anything
useful.  All three of the ones listed in /etc/resolv.conf were tried
and they all failed.  (I don't know how it ever worked.)  With
Opera, the connection was immediate.
I fixed the problem by turning off ipv6 by adding
alias net-pf-10 off
alias ipv6 off
to the end of /etc/modprobe.conf
Now Mozilla connects even faster than Opera.  No problems with
dns.  But I shouldn't have to do this.
Perhaps this calls for something in the release notes as well as
a fix for the preference to disable ipv6 (if that is where the
trouble is).  How would anyone know to do that?
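A quick way to confirm that the modprobe.conf change really removed IPv6 from the running kernel is to try creating an IPv6 socket. The sketch below is Python, not anything Mozilla itself runs, and the function name kernel_supports_ipv6 is made up for illustration.

```python
import socket

def kernel_supports_ipv6():
    """Return True if the running kernel lets us create an IPv6 socket."""
    if not socket.has_ipv6:
        # Python itself was built without IPv6 support.
        return False
    try:
        sock = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
    except OSError:
        # Typically "address family not supported" when IPv6 is unavailable.
        return False
    sock.close()
    return True

if __name__ == "__main__":
    print("IPv6 sockets available:", kernel_supports_ipv6())
```

With the two alias lines in place (and after a reboot or module unload), this would be expected to print False. Without them it should print True on a default install.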

*** This bug has been marked as a duplicate of 68796 ***
Status: NEW → RESOLVED
Closed: 20 years ago
Resolution: --- → DUPLICATE
It's not mozilla's fault, it's the DNS server's fault. Nevertheless, there is a
pref you can use to work around it. See bug 68796.
> It's not mozilla's fault, it's the DNS server's fault. Nevertheless, there is a
> pref you can use to work around it. See bug 68796.

I'm missing something here.  Where is the pref to work around it?

I explicitly said in my report that the preferences in about:config did
nothing.  That is in fact the bug I think I reported.
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
(In reply to comment #3)
> I explicitly said in my report that the preferences in about:config did
> nothing.  That is in fact the bug I think I reported.

Oh dear, my mistake. So you set network.dns.disableIPv6 to true? If so, can you
check whether mozilla is making DNS requests for AAAA records? It shouldn't. Do 

tcpdump -nvv -s1500 udp and port 53

and look for AAAA
You could also try doing an strace and looking to see if getaddrinfo() is
getting called using the AF_INET or AF_UNSPEC address family (I don't know if
you see that using strace though).
(no, strace doesn't give you that. maybe ltrace does)
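If neither strace nor ltrace shows the address family, the effect of that argument can at least be seen from outside Mozilla. Python's socket.getaddrinfo wraps the C library call, so a few lines are enough to compare AF_INET with AF_UNSPEC. This is only an illustration of the hint, not a trace of what Mozilla does, and the helper name families_returned is hypothetical.

```python
import socket

def families_returned(host, family):
    """Resolve host with the given address family hint and
    return the set of address families found in the results."""
    results = socket.getaddrinfo(host, 80, family, socket.SOCK_STREAM)
    return {info[0] for info in results}

if __name__ == "__main__":
    host = "localhost"  # usually answered from /etc/hosts, no network needed
    for label, fam in (("AF_INET", socket.AF_INET),
                       ("AF_UNSPEC", socket.AF_UNSPEC)):
        try:
            found = sorted(f.name for f in families_returned(host, fam))
        except socket.gaierror as err:
            print(label, "-> lookup failed:", err)
            continue
        print(label, "->", found)
```

With the AF_INET hint only IPv4 results can come back, which is why no AAAA queries should appear on the wire in that case. With AF_UNSPEC the resolver is free to ask for both record types.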
No AAAA from the tcpdump command you gave, whichever way the preference is
set.  Nor are there any in the raw tcpdump output with no problems.

And I can't figure out how to use ltrace or strace.
Do you see A queries? If you see A queries but not AAAA queries then the pref is
doing its job and the problem is not IPv6 related.
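Rather than scanning the tcpdump output by eye, the query types can be counted with a small script. This is a rough sketch in Python: the regular expression is based only on the tcpdump line format quoted in this bug, so other tcpdump versions may need a different pattern, and the script and function names are made up.

```python
import re
import sys
from collections import Counter

# Matches the query part of a tcpdump DNS line, for example
# "43777+ PTR? 203.49.203.216.in-addr.arpa. (45)".
# An optional bracketed field between the id and the type is tolerated.
QUERY_RE = re.compile(r"\b\d+[+%]*\s+(?:\[\w+\]\s+)?(A|AAAA|PTR)\?\s+(\S+)")

def count_query_types(lines):
    """Count A, AAAA and PTR queries in an iterable of tcpdump text lines."""
    counts = Counter()
    for line in lines:
        match = QUERY_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

if __name__ == "__main__":
    # Usage: save the tcpdump output to a file first, then run
    #   python3 count_queries.py dns.txt
    if len(sys.argv) > 1:
        with open(sys.argv[1]) as handle:
            for qtype, n in sorted(count_query_types(handle).items()):
                print(qtype, n)
```

A count of zero for AAAA with the preference set, and a non-zero count for A, would match the "pref is doing its job" case described above.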
I'm sorry.  I made a mistake in trying the original tcpdump command.
I forgot that I was going through wlan0 rather than eth0.  So now I've
done it right, and I get lots of AAAA's without the preference and
none of them with it.  But it still takes forever, and the routers
still fail.

So the preference setting is doing something, but it isn't having
the effect that it is supposed to have.  Note that if I turn off ipv6
through the kernel setting, everything works and I get no routing
errors.  Note also that Opera and Lynx do not require this change.
They just work with or without the kernel change.  That is what makes
me think that Mozilla could do something that it isn't doing.



Did you restart mozilla between tests? Maybe you're seeing the DNS cache...
(In reply to comment #10)
> Did you restart mozilla between tests? Maybe you're seeing the DNS cache...

Yes. I restarted.
Hmm... if it's not DNS lookups timing out here, I don't know what's wrong.
Mozilla does a DNS lookup, and when that returns it connects to the address it
received. If it only receives a v4 address (which presumably it does since it
only queries for an A record), it will only use that. So I find it hard to
understand how the problem can be IPv6 related, assuming that the pref is working.
Jonathan: since you are the only one who sees this bug, you're the only one that
can help us fix it. Could you try posting the output of an strace run on mozilla
as an attachment to the bug?

Start mozilla with something like

strace -f /path/to/mozilla 2> strace.out

and then try to go to the web site. Then quit mozilla and attach strace.out to
this bug.
Sorry for posting to the wrong bug.

But I'm not the only one who sees this bug, just the only one who has reported
it here.  I've seen discussion of this in Fedora groups.

I'm attaching the strace output.  It ran for about half a minute and then
crashed.  The command was
strace -o strace.out -f mozilla/mozilla http://www.nytimes.com
Hmm... does it crash if you don't use strace?
Writing to a file with strace causes the crash.  I was able to run
strace to the standard output, that is, my terminal, and I saved the
terminal using "script".  It just goes on and on and does not crash.
Unfortunately, it does the same thing when I try it with a page that
actually works.  So I think what we're seeing is that strace is
preventing mozilla from completing its task of opening the page.
I will wait for further instructions before attaching the latest
file.
Keywords: helpwanted
Perhaps this will help.  I ran strace on one computer with ipv6 on
(but off in Mozilla prefs) and one with it off (in the kernel).  Then
I compared them until the trouble started.  Here is where it is:

[pid  4661] open("/home/baron/.Xauthority", O_RDONLY) = 4
[pid  4661] fstat64(4, {st_mode=S_IFREG|0600, st_size=933, ...}) = 0
[pid  4661] mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x556c6000
[pid  4661] read(4, "\1\0\0\10newfinzi\0\00210\0\22MIT-MAGIC-COOK"..., 4096) = 933
[pid  4661] close(4)                    = 0
[pid  4661] munmap(0x556c6000, 4096)    = 0
[pid  4661] writev(3, [{"l\0\v\0\0\0\22\0\20\0\0\0", 12}, {"MIT-MAGIC-COOKIE-1", 18}, {"\0\0", 2}, {"\345\303$\352j\227fQ\34\313\346\306\307i\277\356", 16}], 4) = 48
[pid  4661] fcntl64(3, F_GETFL)         = 0x2 (flags O_RDWR)
[pid  4661] fcntl64(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
[pid  4661] read(3, 0xfefff438, 8)      = -1 EAGAIN (Resource temporarily unavailable)
[pid  4661] select(4, [3], NULL, NULL, NULL) = 1 (in [3])
[pid  4661] read(3, "\1\0\v\0\0\0c\0", 8) = 8
[pid  4661] read(3, "`5\236\3\0\0\240\2\377\377\37\0\0\1\0\0\24\0\377\377\1"..., 396) = 396
[pid  4661] write(3, "7\0\5\0\0\0\240\2@\0\0\0\10\0\0\0\377\377\377\0b\0\5\0"..., 64) = 64
[pid  4661] read(3, 0xfefff450, 32)     = -1 EAGAIN (Resource temporarily unavailable)
[pid  4661] select(4, [3], NULL, NULL, NULL) = 1 (in [3])
[pid  4661] read(3, "\1\0\2\0\0\0\0\0\1\202\0\0\0\0\0\0\0\0\0\0\35\0\0\0\300"..., 32) = 32

It is these read, select, and write lines that repeat over and over and don't stop.
That doesn't look related to the problem, it just seems like the application
communicating with the X server.

Is your nameserver on the same machine (i.e., do you have "nameserver 127.0.0.1"
in /etc/resolv.conf)? If not, could you attach a tcpdump of all the packets sent
and received (possibly excluding your ssh connection) while mozilla is trying to
connect?
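Since most of a Mozilla strace log is X server chatter like the read, select and write calls on descriptor 3 quoted earlier, it may help to filter the log down to socket-related calls before reading it. The following is a minimal Python sketch. The list of syscall names is a guess at what is relevant and is not exhaustive, and the script name is hypothetical.

```python
import re
import sys

# Syscalls that create or address sockets. Plain read/write/select traffic
# on an already-open descriptor (such as the X connection) is filtered out.
NETWORK_CALLS = {"socket", "connect", "bind", "sendto", "recvfrom",
                 "sendmsg", "recvmsg", "getsockname", "getpeername"}

# Optional "[pid  1234] " or "1234 " prefix (both are produced by strace -f,
# depending on whether -o is used), followed by the syscall name.
CALL_RE = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?(\w+)\(")

def network_lines(lines):
    """Yield only the strace lines whose syscall is in NETWORK_CALLS."""
    for line in lines:
        match = CALL_RE.match(line)
        if match and match.group(1) in NETWORK_CALLS:
            yield line.rstrip("\n")

if __name__ == "__main__":
    # Usage: python3 filter_strace.py strace.out
    if len(sys.argv) > 1:
        with open(sys.argv[1]) as handle:
            for line in network_lines(handle):
                print(line)
```

The connection to the X server itself still shows up once (its socket and connect calls), but the repeating loop does not, which makes any DNS traffic to port 53 easier to spot.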
This is actually a duplicate of bug 239358.  Loading www.yahoo.com on a default
Fedora Core 2 system results in _10_ DNS queries.  GLIBC seems to perform a
reverse lookup on each returned IP address before returning from getaddrinfo. 
I'm not sure why it does this, but it only happens if AI_CANONNAME is passed to
getaddrinfo.  Moreover, Mozilla only uses getaddrinfo if IPv6 is supported by
the kernel.  Hence, disabling IPv6 support causes the bug to go away.  With
gethostbyname, there is only 1 DNS query.
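To see whether a particular system shows the same slowdown, the lookup can be timed with and without AI_CANONNAME. The sketch below uses Python's socket.getaddrinfo, which passes the flag through to the C library. Whether the extra reverse lookups happen depends on the GLIBC version, and resolver caching can hide the difference on a second run, so treat the numbers as a rough indication only. The helper name timed_lookup is made up.

```python
import socket
import sys
import time

def timed_lookup(host, flags=0):
    """Call getaddrinfo for host and return (seconds_taken, results)."""
    start = time.monotonic()
    results = socket.getaddrinfo(host, 80, socket.AF_UNSPEC,
                                 socket.SOCK_STREAM, 0, flags)
    return time.monotonic() - start, results

if __name__ == "__main__":
    # Usage: python3 time_lookup.py www.yahoo.com
    host = sys.argv[1] if len(sys.argv) > 1 else "localhost"
    for label, flags in (("no flags", 0),
                         ("AI_CANONNAME", socket.AI_CANONNAME)):
        try:
            elapsed, results = timed_lookup(host, flags)
        except socket.gaierror as err:
            print(f"{label}: lookup failed: {err}")
            continue
        print(f"{label}: {len(results)} results in {elapsed:.3f}s")
```

Running tcpdump on port 53 at the same time shows how many queries each variant actually sends.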

*** This bug has been marked as a duplicate of 239358 ***
Status: REOPENED → RESOLVED
Closed: 20 years ago
Resolution: --- → DUPLICATE
Does anyone still want the result of tcpdump?  Note comment #7 and #8.
(I tried it on my office computer, but that is on a VERY active network.)
I have to reset the kernel on my home computer and try that again.  I'm
willing, but it sounds like it isn't going to do any good at this point.
(In reply to comment #21)
> Does anyone still want the result of tcpdump?

I would say that you're almost certainly seeing bug 239358. If you want to make
sure, see if the tcpdump command in comment #4 (on the right interface) shows
PTR queries being made while mozilla is waiting.
(In reply to comment #22)
> (In reply to comment #21)
> > Does anyone still want the result of tcpdump?
> 
> I would say that you're almost certainly seeing bug 239358. If you want to make
> sure, see if the tcpdump command in comment #4 (on the right interface) shows
> PTR queries being made while mozilla is waiting.

Yes.  Lots of them.  Like this:
07:26:17.745170 IP (tos 0x0, ttl  64, id 15435, offset 0, flags [DF], proto 17, length: 73)
130.91.68.45.32934 > 128.91.254.1.domain: [bad udp cksum 7fc0!]  43777+ PTR? 203.49.203.216.in-addr.arpa. (45)
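For reading PTR queries like the one above: the name being looked up is the IP address with its octets reversed, followed by in-addr.arpa. A short Python sketch of the conversion in both directions, using the standard ipaddress module (the two function names are made up for illustration):

```python
import ipaddress

def ptr_name_for(ip_text):
    """Return the name that a reverse (PTR) lookup of ip_text queries."""
    return ipaddress.ip_address(ip_text).reverse_pointer

def ip_from_ptr_name(name):
    """Recover the IPv4 address from an in-addr.arpa query name."""
    suffix = ".in-addr.arpa"
    name = name.rstrip(".")
    if not name.endswith(suffix):
        raise ValueError("not an IPv4 reverse-lookup name: " + name)
    octets = name[:-len(suffix)].split(".")
    # IPv4Address validates that we really have four octets in range.
    return str(ipaddress.IPv4Address(".".join(reversed(octets))))

if __name__ == "__main__":
    print(ip_from_ptr_name("203.49.203.216.in-addr.arpa."))  # 216.203.49.203
    print(ptr_name_for("216.203.49.203"))  # 203.49.203.216.in-addr.arpa
```

So the query shown is a reverse lookup of 216.203.49.203, presumably one of the addresses returned by the earlier forward lookup.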