Closed Bug 614526 Opened 14 years ago Closed 13 years ago

Fails to open dual-stacked web pages on Mac OS X, if host system has a IPv6 default route but no global IPv6 addresses

Categories

(NSPR :: NSPR, defect, P2)

x86
macOS
defect

Tracking

(blocking2.0 .x+, status2.0 wanted, status1.9.2 .17-fixed, status1.9.1 .19-fixed)

RESOLVED FIXED

People

(Reporter: tore, Assigned: mayhemer)

Details

(Whiteboard: blocking for Firefox 4: needed for IPv6 test day [http-conn])

Attachments

(4 files, 3 obsolete files)

User-Agent:       Mozilla/5.0 (X11; U; Linux i686 (x86_64); nb-NO; rv:1.9.2.12) Gecko/20101027 Fedora/3.6.12-1.fc14 Firefox/3.6.12
Build Identifier: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:2.0b7) Gecko/20100101 Firefox/4.0b7

On Mac OS X machines connected to networks whose IPv6 router advertisements carry no prefix information (and therefore provide no auto-configuration of IPv6 addresses), Firefox effectively fails to open dual-stacked web sites: they load only after a considerable timeout.

In these situations the operating system has an IPv6 default route, but no globally usable IPv6 address assigned (only link-local unicast addresses from fe80::/10).  IPv4 must be preferred to IPv6 in these situations, because IPv6 cannot possibly work.
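
To make the distinction above concrete, here is a minimal sketch that classifies a textual IPv6 address as loopback, link-local (fe80::/10) or something else. The helper name `ipv6_scope_label` is hypothetical and not part of Firefox or NSPR; it relies only on the standard `inet_pton()` call and the `IN6_IS_ADDR_*` macros, and it assumes the address is written without a `%zone` suffix.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: label a textual IPv6 address by scope.
 * The input must not carry a "%zone" suffix such as "%en0".
 */
const char *ipv6_scope_label(const char *text)
{
    struct in6_addr addr;

    assert(text != NULL);
    if (inet_pton(AF_INET6, text, &addr) != 1)
        return "invalid";
    if (IN6_IS_ADDR_LOOPBACK(&addr))
        return "loopback";      /* ::1 */
    if (IN6_IS_ADDR_LINKLOCAL(&addr))
        return "link-local";    /* fe80::/10, not usable beyond the local link */
    return "other";
}
```

On the test host described in this report, en0 carries only an address that this helper would label "link-local", which is why IPv6 connections to global destinations cannot succeed.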

Reproducible: Always

Steps to Reproduce:
1. Connect a Mac OS X host to a network with IPv6 RAs but no prefix information
2. Attempt to open a dual-stacked site using Firefox, e.g. www.ripe.net
Actual Results:  
It sits there without loading anything for a very long time while the tab header says «Connecting...».  The site appears to be down, until things finally start happening well over a minute later.  You also get the same kind of timeout for every dual-stacked element included on the page, which makes the total page load time extremely long.  All but the most patient users would give up.

Expected Results:  
The page should load as quickly as it does when the machine has no IPv6 default route; that is, Firefox should try IPv4 right away.

This was tested on Mac OS X 10.6.5.  Safari (5.0.3), Chrome (7.0.517.44), and Opera (10.63) have no problems; they all try IPv4 directly.  The problem occurs with both Firefox 4.0b7 and 3.6.12.

I'm not familiar with the Mozilla source code, but one thing worth checking out is if it calls getaddrinfo() without the AI_ADDRCONFIG flag when resolving host names.  Setting that flag might make the resolver skip the AAAA lookups if there are only link-local IPv6 addresses.  Not sure, though.

Network configuration on my test host (with IPv4 addresses anonymised):

osx:~ tore$ ifconfig
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
        inet 127.0.0.1 netmask 0xff000000
        inet6 fdc4:446b:4566:5ba1:21f:5bff:fec2:b845 prefixlen 128
gif0: flags=8010<POINTOPOINT,MULTICAST> mtu 1280
stf0: flags=0<> mtu 1280
en1: flags=8823<UP,BROADCAST,SMART,SIMPLEX,MULTICAST> mtu 1500
        ether 00:1f:5b:c2:b8:45 
        media: autoselect (<unknown type>)
        status: inactive
en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        ether 00:1f:5b:f7:71:d0
        inet6 fe80::21f:5bff:fef7:71d0%en0 prefixlen 64 scopeid 0x5
        inet 192.0.2.59 netmask 0xffffffc0 broadcast 192.0.2.63
        media: autoselect (1000baseT <full-duplex,flow-control>)
        status: active
fw0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 4078
        lladdr 00:1f:f3:ff:fe:34:c8:a8 
        media: autoselect <full-duplex>
        status: inactive
vboxnet0: flags=8842<BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        ether 0a:00:27:00:00:00 
utun0: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> mtu 1500
        inet6 fe80::21f:5bff:fec2:b845%utun0 prefixlen 64 scopeid 0x8
        inet6 fd00:6587:52d7:87:21f:5bff:fec2:b845 prefixlen 64 
osx:~ tore$ netstat -rn
Routing tables
        
Internet:
Destination        Gateway            Flags        Refs      Use   Netif Expire
default            192.0.2.1        UGSc           62        0     en0
192.0.2/26         link#5             UCS             8        0     en0
192.0.2.1          0:11:43:e6:f5:77   UHLWI          62        0     en0   1165
192.0.2.2          0:14:22:12:99:d9   UHLWI           0      860     en0    775
192.0.2.3          0:14:22:17:64:4    UHLWI           0       22     en0   1179
192.0.2.9          0:25:11:59:cc:93   UHLWI           0      495     en0   1104
192.0.2.11         0:14:4f:1:8a:28    UHLWI           0        0     en0   1185
192.0.2.42         0:1d:60:48:f5:9e   UHLWI           2   411272     en0    233
192.0.2.44         0:18:f3:4:79:1f    UHLWI           1     3198     en0    751
192.0.2.59         127.0.0.1          UHS             0        0     lo0   
192.0.2.63         ff:ff:ff:ff:ff:ff  UHLWbI          0        2     en0   
127                127.0.0.1          UCS             0        0     lo0    
127.0.0.1          127.0.0.1          UH              0        0     lo0    
169.254            link#5             UCS             0        0     en0

Internet6:         
Destination                             Gateway                         Flags         Netif Expire
default                                 fe80::211:43ff:fee6:f577%en0    UGc             en0
::1                                     ::1                             UH              lo0
fd00:6587:52d7::/52                     fd00:6587:52d7:87:21f:5bff:fec2:b845 UGCS          utun0
fd00:6587:52d7:87::/64                  fe80::21f:5bff:fec2:b845%utun0  Uc            utun0 
fd00:6587:52d7:87:21f:5bff:fec2:b845    link#8                          UHL             lo0
fdc4:446b:4566:5ba1:21f:5bff:fec2:b845  link#1                          UHL             lo0
fe80::%lo0/64                           fe80::1%lo0                     Uc              lo0
fe80::1%lo0                             link#1                          UHL             lo0
fe80::%en0/64                           link#5                          UC              en0
fe80::211:43ff:fee6:f577%en0            0:11:43:e6:f5:77                UHLW            en0
fe80::21f:5bff:fef7:71d0%en0            0:1f:5b:f7:71:d0                UHL             lo0
fe80::%utun0/64                         fe80::21f:5bff:fec2:b845%utun0  Uc            utun0
fe80::21f:5bff:fec2:b845%utun0          link#8                          UHL             lo0
ff01::/32                               ::1                             Um              lo0
ff02::/32                               ::1                             UmC             lo0
ff02::/32                               link#5                          UmC             en0
ff02::/32                               fe80::21f:5bff:fec2:b845%utun0  UmC           utun0
osx:~ tore$ ndp -rn
fe80::211:43ff:fee6:f577%en0 if=en0, flags=, pref=medium, expire=8h28m34s
> but one thing worth checking out is if it calls getaddrinfo() without the
> AI_ADDRCONFIG flag

The relevant code is in nsHostResolver::ThreadFunc:

        PRIntn flags = PR_AI_ADDRCONFIG;
        if (!(rec->flags & RES_CANON_NAME))
            flags |= PR_AI_NOCANONNAME;

        ai = PR_GetAddrInfoByName(rec->host, rec->af, flags);

But then NSPR ignores that flag in PR_GetAddrInfoByName.  Why, exactly?
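
For illustration, honoring the flag would amount to translating it into the `hints` passed to `getaddrinfo()`. The sketch below is not NSPR code: the `DEMO_AI_*` constants are stand-ins for the `PR_AI_*` flags with assumed values, and `demo_fill_hints` is a hypothetical helper.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* expose the getaddrinfo API on glibc even in strict -std modes */
#endif
#include <sys/types.h>
#include <sys/socket.h>
#include <assert.h>
#include <netdb.h>
#include <string.h>

/* Stand-ins for NSPR's PR_AI_* flag bits; the values are assumptions. */
#define DEMO_AI_ADDRCONFIG  0x20
#define DEMO_AI_NOCANONNAME 0x8000

/* Hypothetical helper: build getaddrinfo() hints from caller-level flags. */
void demo_fill_hints(int demo_flags, int family, struct addrinfo *hints)
{
    assert(hints != NULL);
    memset(hints, 0, sizeof(*hints));
    hints->ai_family = family;
    hints->ai_socktype = SOCK_STREAM;

    /* Ask for the canonical name unless the caller opted out. */
    if (!(demo_flags & DEMO_AI_NOCANONNAME))
        hints->ai_flags |= AI_CANONNAME;

#ifdef AI_ADDRCONFIG
    /* Propagate the caller's request instead of silently dropping it. */
    if (demo_flags & DEMO_AI_ADDRCONFIG)
        hints->ai_flags |= AI_ADDRCONFIG;
#endif
}
```

The `#ifdef` guard keeps the sketch compiling on platforms whose headers do not define `AI_ADDRCONFIG`; there the flag is simply ignored.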
Assignee: nobody → wtc
Component: Networking → NSPR
Product: Core → NSPR
QA Contact: networking → nspr
Version: unspecified → 4.9
And confirming, too.
Status: UNCONFIRMED → NEW
Ever confirmed: true
I looked into that issue before, and I couldn't
find the reason in the CVS history.  That code
was developed on a branch with terse checkin
comments.

I think it's better to call getaddrinfo with
AI_ADDRCONFIG.  Based on the experience with
Chromium, the only problem is that AI_ADDRCONFIG
applies the existence of an outgoing network
interface to IP addresses of the loopback interface,
due to a strict interpretation of the specification.

For example, if a computer does not have any
outgoing IPv6 network interface, but its loopback
network interface supports IPv6, getaddrinfo on
"localhost" with AI_ADDRCONFIG won't return the
IPv6 loopback address "::1", because getaddrinfo
thinks the host cannot connect to any IPv6
destination, ignoring the remote vs. local/loopback
distinction.

So after passing AI_ADDRCONFIG, you will need to
add code to handle the loopback addresses as
special cases.
Where "you" is NSPR, right?
I've confirmed for myself that using AI_ADDRCONFIG will indeed save the day here.  The output from my test app below shows a 75 second penalty for every IPv6 address it attempts to connect to, and without AI_ADDRCONFIG all the IPv6 addresses are sorted on top.  So for www.arin.net (which has two IPv6 addresses) the user has to endure a 150 second timeout before anything starts happening.

The test app calls getaddrinfo for www.arin.net (without AI_ADDRCONFIG) and connects to the resulting list of addresses in order, then repeats the process (this time with AI_ADDRCONFIG enabled).  Let me know if you want the source code for the test app.  The machine has an IPv6 default route, but no globally scoped addresses.  Tcpdump shows no IPv6 connection attempts happening on the wire, so it seems the entire timeout is internal to the operating system.
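
A test of this kind could be sketched roughly as follows. This is not the source of the actual test app; the function names `connect_in_order` and `gai_and_connect` are made up for the example, and unlike the real app this sketch stops at the first address that connects rather than trying every address.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* expose the getaddrinfo API on glibc even in strict -std modes */
#endif
#include <sys/types.h>
#include <sys/socket.h>
#include <assert.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Try each address in list order with a blocking connect().
 * Returns the index of the first address that connects, or -1 if none does.
 */
int connect_in_order(const struct addrinfo *list)
{
    int index = 0;

    for (const struct addrinfo *ai = list; ai != NULL; ai = ai->ai_next, index++) {
        int fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0)
            continue;
        int rc = connect(fd, ai->ai_addr, ai->ai_addrlen);
        close(fd);
        if (rc == 0)
            return index;
    }
    return -1;
}

/*
 * Resolve host/service, optionally with AI_ADDRCONFIG, then connect in order.
 * Returns the index from connect_in_order(), or -2 if resolution fails.
 */
int gai_and_connect(const char *host, const char *service, int use_addrconfig)
{
    struct addrinfo hints, *res = NULL;
    int rc;

    assert(host != NULL && service != NULL);
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    if (use_addrconfig)
        hints.ai_flags |= AI_ADDRCONFIG;

    rc = getaddrinfo(host, service, &hints, &res);
    if (rc != 0) {
        fprintf(stderr, "getaddrinfo(%s): %s\n", host, gai_strerror(rc));
        return -2;
    }
    rc = connect_in_order(res);
    freeaddrinfo(res);
    return rc;
}
```

Calling `gai_and_connect("www.arin.net", "80", 0)` followed by `gai_and_connect("www.arin.net", "80", 1)` corresponds to the two passes the test app makes; timing each call would be needed to reproduce the delays it reports.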

By the way, content providers are seeing this problem in the wild, and it's one of several issues that are causing us to put off deploying IPv6.  We don't want to cut off access to our web sites for our own users, after all, but users with this problem are effectively prevented from accessing dual-stacked sites.  So in the interest of helping the IPv6 transition get underway, I hope you'll prioritise getting this fixed in both the 4.0 and the 3.6 branches in the next version you push out through the auto-update mechanism.

Tore



osx:~ tore$ ./toretest www.arin.net -ac www.arin.net
[         0us] begin gai_and_connect(www.arin.net)
[+     3790us] getaddinfo(www.arin.net) done
[+       20us] dest = 2001:500:4:13::81 (AF_INET6)
[+        8us] about to connect()
[+ 74593566us] connect() fails: Operation timed out
[+       54us] dest = 2001:500:4:13::80 (AF_INET6)
[+        9us] about to connect()
[+ 75011715us] connect() fails: Operation timed out
[+       49us] dest = 192.149.252.75 (AF_INET)
[+       10us] about to connect()
[+   128100us] connect() suceeds
[+       48us] dest = 192.149.252.76 (AF_INET)
[+       10us] about to connect()
[+   128853us] connect() suceeds

[         0us] -ac seen, using AI_ADDRCONFIG from now on

[         0us] begin gai_and_connect(www.arin.net)
[+      944us] getaddinfo(www.arin.net) done
[+       20us] dest = 192.149.252.75 (AF_INET)
[+       10us] about to connect()
[+   128374us] connect() suceeds
[+       47us] dest = 192.149.252.76 (AF_INET)
[+       11us] about to connect()
[+   128738us] connect() suceeds

Tore
Oh, and by the way, regarding the problem with the loopback interface that was mentioned: it appears this is no longer the case.  See how it behaves with IPv6 completely disabled (ip6 -x):

osx:~ tore$ sudo ip6 -x 
osx:~ tore$ ./toretest localhost -ac localhost
[         0us] begin gai_and_connect(localhost)
[+     1219us] getaddinfo(localhost) done
[+       23us] dest = ::1 (AF_INET6)
[+        7us] about to connect()
[+       74us] connect() fails: Connection refused
[+       24us] dest = fe80::1 (AF_INET6)
[+        6us] about to connect()
[+       55us] connect() fails: Connection refused
[+       19us] dest = 127.0.0.1 (AF_INET)
[+        7us] about to connect()
[+       65us] connect() fails: Connection refused

[         0us] -ac seen, using AI_ADDRCONFIG from now on

[         0us] begin gai_and_connect(localhost)
[+      141us] getaddinfo(localhost) done
[+       15us] dest = ::1 (AF_INET6)
[+        5us] about to connect()
[+       62us] connect() fails: Connection refused
[+       40us] dest = fe80::1 (AF_INET6)
[+        8us] about to connect()
[+       47us] connect() fails: Connection refused
[+       18us] dest = 127.0.0.1 (AF_INET)
[+        6us] about to connect()
[+       46us] connect() fails: Connection refused

osx:~ tore$ ./toretest www.arin.net -ac www.arin.net
[         0us] begin gai_and_connect(www.arin.net)
[+     2215us] getaddinfo(www.arin.net) done
[+       22us] dest = 2001:500:4:13::80 (AF_INET6)
[+        7us] about to connect()
[+       24us] connect() fails: No route to host
[+       21us] dest = 2001:500:4:13::81 (AF_INET6)
[+        6us] about to connect()
[+       11us] connect() fails: No route to host
[+       15us] dest = 192.149.252.75 (AF_INET)
[+        6us] about to connect()
[+   128101us] connect() suceeds
[+       45us] dest = 192.149.252.76 (AF_INET)
[+       11us] about to connect()
[+   128201us] connect() suceeds

[         0us] -ac seen, using AI_ADDRCONFIG from now on

[         0us] begin gai_and_connect(www.arin.net)
[+      777us] getaddinfo(www.arin.net) done
[+       20us] dest = 192.149.252.75 (AF_INET)
[+        8us] about to connect()
[+   128745us] connect() suceeds
[+       45us] dest = 192.149.252.76 (AF_INET)
[+       10us] about to connect()
[+   128349us] connect() suceeds

So for the global destination, AI_ADDRCONFIG masked the IPv6 addresses, but for localhost, it did not.  So I don't think any special casing of localhost is needed, at least not as of OS X 10.6.5.

Tore
status2.0: --- → ?
I believe that the "AI_ADDRCONFIG breaks connecting to localhost" behaviour is specific to Windows, which has a particular interpretation of RFC 3484. On OS X (and Linux), the loopback address does not appear to count for the purposes of AI_ADDRCONFIG.

Tore, you once pointed me at:

http://www.opensource.apple.com/source/Libinfo/Libinfo-330.7/lookup.subproj/si_getaddrinfo.c

I see that the code just calls getifaddrs(). I don't have a mac to test, does getifaddrs() just ignore the loopback address?
Lorenzo: Chrome users also reported the "AI_ADDRCONFIG breaks
connecting to localhost" behavior on Ubuntu Linux.
I'm happy to test whatever on Mac if someone tells me how to (ideally in the form of a C file I just compile and run).
(In reply to comment #7)

> On OS X
> (and Linux), the loopback address does not appear to count for the purposes of
> AI_ADDRCONFIG.

For what it's worth, my test results in comment #6 appear to confirm this (for OS X).

> Tore, you once pointed me at:
> 
> http://www.opensource.apple.com/source/Libinfo/Libinfo-330.7/lookup.subproj/si_getaddrinfo.c
> 
> I see that the code just calls getifaddrs(). I don't have a mac to test, does
> getifaddrs() just ignore the loopback address?

Doesn't look like it:

osx:~ tore$ ./gia-test 
name=lo0, addr=::1, UP
name=lo0, addr=fe80::1, UP
name=lo0, addr=127.0.0.1, UP
name=lo0, addr=fd14:aca2:970c:a18e:21f:5bff:fec2:b845, UP
name=en0, addr=192.0.2.59, UP

Will attach test programs shortly.

Tore
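
For readers without the attachment at hand, a getifaddrs() listing along these lines can be sketched as below. This is not the attached gia-test program; `format_sockaddr` and `list_interface_addresses` are hypothetical names, and the output format only approximates the lines shown above.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* expose IFF_UP and friends on glibc even in strict -std modes */
#endif
#include <sys/types.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <assert.h>
#include <ifaddrs.h>
#include <net/if.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>

/* Write the numeric form of an IPv4/IPv6 socket address into buf. Returns 0 on success, -1 otherwise. */
int format_sockaddr(const struct sockaddr *sa, char *buf, socklen_t buflen)
{
    const void *raw;

    if (sa == NULL)
        return -1;
    if (sa->sa_family == AF_INET)
        raw = &((const struct sockaddr_in *)sa)->sin_addr;
    else if (sa->sa_family == AF_INET6)
        raw = &((const struct sockaddr_in6 *)sa)->sin6_addr;
    else
        return -1;
    return inet_ntop(sa->sa_family, raw, buf, buflen) != NULL ? 0 : -1;
}

/* Print one line per IPv4/IPv6 interface address. Returns the number printed, or -1 on error. */
int list_interface_addresses(FILE *out)
{
    struct ifaddrs *head = NULL;
    int count = 0;

    assert(out != NULL);
    if (getifaddrs(&head) != 0)
        return -1;

    for (struct ifaddrs *ifa = head; ifa != NULL; ifa = ifa->ifa_next) {
        char text[INET6_ADDRSTRLEN];

        if (format_sockaddr(ifa->ifa_addr, text, sizeof(text)) != 0)
            continue;   /* no address, or a family other than IPv4/IPv6 */
        fprintf(out, "name=%s, addr=%s, %s\n", ifa->ifa_name, text,
                (ifa->ifa_flags & IFF_UP) ? "UP" : "DOWN");
        count++;
    }
    freeifaddrs(head);
    return count;
}
```

Calling `list_interface_addresses(stdout)` prints the loopback addresses along with everything else, which is consistent with the observation that getifaddrs() itself does not filter out the loopback interface.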
Blocks: 569993
blocking2.0: --- → ?
Whiteboard: blocking for Firefox 4: needed for IPv6 test day
What environment is actually needed to test a patch for this?  It is not clear to me from the description.
(In reply to comment #13)
> What environment is actually needed to test a patch for this?  It is not clear
> to me from the description.

1) Start with a Mac OS X host on an IPv4-only network, running for instance the latest version, 10.6.5

2) Add a default IPv6 route, e.g.:
$ sudo route add -inet6 default fe80::1%en0

3) Attempt to open http://www.arin.net using Firefox (and maybe try other browsers too while you wait for the page to load)

Let me know if you need more help reproducing the issue.

Tore
Attached patch v1 (obsolete) — Splinter Review
So this patch helps.  It's still untested on other platforms; once that is done I will request review.
Assignee: wtc → honzab.moz
Status: NEW → ASSIGNED
Maybe instead:

         hints.ai_flags = (flags & PR_AI_NOCANONNAME) ? 0: AI_CANONNAME;
+#if AI_ADDRCONFIG
+        hints.ai_flags |= (flags & PR_AI_ADDRCONFIG) ? AI_ADDRCONFIG : 0;
+#endif
As this is not a regression from 3.6, it's not going to block, but we will take an appropriate patch.
blocking2.0: ? → .x
The IPv6 test day should also apply pressure on the OS
vendors to fix getaddrinfo bugs with the addresses of
the loopback interface when AI_ADDRCONFIG is specified.

Chromium uses AI_ADDRCONFIG and works around these bugs.
It sucks if every program that uses AI_ADDRCONFIG has to
work around these bugs.

vandebo@chromium.org worked on those Chromium bugs.  He
has a spreadsheet for those bugs at
http://code.google.com/p/chromium/issues/detail?id=32522#c50
http://code.google.com/p/chromium/issues/detail?id=49024

It references five Chromium bugs:
41408, 39830, 49024, 42058, 49025

In those Chromium bug reports you can find vandebo's changelists.
Then you can duplicate them in either NSPR or Mozilla proper.
Hopefully the IPv6 test day will cause the OS vendors to fix
the underlying getaddrinfo bugs, too.  Note: not every OS has
these bugs.  vandebo's spreadsheet seems to suggest only Linux
(or certain Linux distributions) has these bugs.

If this is too much disorganized info to digest, you can also
add AI_ADDRCONFIG first, and then react to bug reports.  (I
have to admit I can't parse vandebo's spreadsheet.)
I had problems grokking the spreadsheet as well...

Have I understood correctly that the bug is that getaddrinfo() in Linux/glibc will mask the IPv4 loopback address (127.0.0.1) for "localhost" if there's no global (non-loopback) IPv4 addresses on the system, and vice versa, that it will mask the IPv6 loopback address (::1) if there's no global (non-loopback and non-link-local) IPv6 address on the system?

I didn't find any open bug about this in the glibc bug tracker at http://sourceware.org/bugzilla/ , has it been reported anywhere?

Anyway I think the Linux guys will love you for adding AI_ADDRCONFIG.  A very often-reported bug that leads to users recommending disabling IPv6 outright to each other is https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/417757 - essentially what happens here is that when you have the following:

1) An IPv4-only computer
2) An application that doesn't use AI_ADDRCONFIG, like Firefox
3) A DNS forwarder/recursor that doesn't handle AAAA queries properly

...«Internet doesn't work».

If you add AI_ADDRCONFIG, you'll fix that, instead exposing the localhost glibc bug.  That, I think, is the best way to pressure the Linux vendors into fixing that bug, which after all is definitely their responsibility.

Tore
(In reply to comment #17)
> As this is not a regression from 3.6, it's not going to block, but we will take
> an appropriate patch.

Simple:  Fix it in 3.6 too, that way it can block, right?  ;-)

Seriously though, fixing it in 3.6 as well would be very welcome.  Not all users upgrade in a timely fashion, especially not when going from one major version to the next.

Tore
(In reply to comment #19)
>
> Have I understood correctly that the bug is that getaddrinfo() in Linux/glibc will
> mask the IPv4 loopback address (127.0.0.1) for "localhost" if there's no global
> (non-loopback) IPv4 addresses on the system, and vice versa, that it will mask
> the IPv6 loopback address (::1) if there's no global (non-loopback and
> non-link-local) IPv6 address on the system?

Tore: yes, that's the bug.  It may exist in only certain
versions of Linux/glibc or certain Linux distributions.
The source of the bug is a strict interpretation of the
specification of AI_ADDRCONFIG in RFC 3493:

   If the AI_ADDRCONFIG flag is specified, IPv4 addresses shall be
   returned only if an IPv4 address is configured on the local system,
   and IPv6 addresses shall be returned only if an IPv6 address is
   configured on the local system.  The loopback address is not
   considered for this case as valid as a configured address.

This specification is ambiguous when the addresses of the
host name are loopback addresses.

I don't remember if I ever reported this bug to glibc.
I forgot to say that I agree with Tore we should use
AI_ADDRCONFIG, and deal with the localhost bug.  I
remember Windows also has the localhost bug, which
is one reason Chromium doesn't use AI_ADDRCONFIG on
Windows.  Another reason is that a comment in
<winsock2.h> says AI_ADDRCONFIG is the default
(although it clearly changes the behavior related to
localhost).  I quoted that comment in
http://src.chromium.org/viewvc/chrome/trunk/src/net/base/host_resolver_proc.cc?view=markup
I tested resolving "localhost" on Fedora 14 (glibc 2.12.90) using AI_ADDRCONFIG, with the following results (depending on what kind of non-loopback/linklocal addresses were configured on the system):

1) neither ipv4 nor ipv6 => ::1, 127.0.0.1
2) only ipv4             => ::1, 127.0.0.1
3) only ipv6             => ::1
4) both ipv4 and ipv6    => ::1, 127.0.0.1

So the user would have to be on an IPv6-only machine while running an IPv4-only service on the loopback interface for the bug to actually affect him.  That's a fringe case, I think...

I believe it's safe to assume that the OS X users affected by this bug and the Linux users that are affected by the Ubuntu #417757 bug are far greater in number.  Therefore, simply starting to use AI_ADDRCONFIG (at least on OS X and Linux) should be a net improvement, even if you don't at the same time add any special handling of localhost lookups.

However, avoiding using AI_ADDRCONFIG when you're looking up localhost can't be very hard either?  Suggested patch (untested but obvious):

--- nsprpub/pr/src/misc/prnetdb.c	1 May 2009 23:08:05 -0000	3.59
+++ nsprpub/pr/src/misc/prnetdb.c	16 Dec 2010 19:25:17 -0000
@@ -2028,6 +2028,10 @@
 
         memset(&hints, 0, sizeof(hints));
         hints.ai_flags = (flags & PR_AI_NOCANONNAME) ? 0: AI_CANONNAME;
+#if defined(AI_ADDRCONFIG)
+	if(strcasecmp(hostname, "localhost"))
+		hints.ai_flags |= (flags & PR_AI_ADDRCONFIG) ? AI_ADDRCONFIG : 0;
+#endif
         hints.ai_family = (af == PR_AF_INET) ? AF_INET : AF_UNSPEC;
 
         /*

Tore
I'm curious to hear if there's been any progress on this bug lately.  Are there any problems with the suggested patches that I could potentially help resolve?

BTW:  I just learned that the Portuguese ISP SAPO is now shipping DSL routers to their customers that by default emit such prefix-less router advertisements.  That means that all of those customers who are also using Mac OS X are unable to use Firefox to access dual-stacked sites.

Tore
Actually, I got stuck on the strcasecmp function, which is not available to prnetdb.c, and then this bug lost its blocking status.  Also, I'm not sure that comparing the host name with strcmp this way is the proper way of recognizing localhost.

I am thinking of some kind of fallback here in case we get only '::1' as a result: in that case, drop the flag and retry the query.  This would also cover any hosts file entries.  But I have not thought very deeply about that yet.
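
The fallback idea can be sketched as follows. This is only an illustration of the approach under stated assumptions, not code from any attached patch: `only_loopback` and `resolve_with_fallback` are hypothetical names, and the sketch treats any address in 127.0.0.0/8 or ::1 as loopback.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* expose the getaddrinfo API on glibc even in strict -std modes */
#endif
#include <sys/types.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>

/* Return 1 if the list is non-empty and every entry is a loopback address, else 0. */
int only_loopback(const struct addrinfo *list)
{
    if (list == NULL)
        return 0;
    for (const struct addrinfo *ai = list; ai != NULL; ai = ai->ai_next) {
        if (ai->ai_family == AF_INET6) {
            const struct sockaddr_in6 *s6 = (const struct sockaddr_in6 *)ai->ai_addr;
            if (!IN6_IS_ADDR_LOOPBACK(&s6->sin6_addr))
                return 0;
        } else if (ai->ai_family == AF_INET) {
            const struct sockaddr_in *s4 = (const struct sockaddr_in *)ai->ai_addr;
            /* 127.0.0.0/8 is the IPv4 loopback block. */
            if ((ntohl(s4->sin_addr.s_addr) >> 24) != 127)
                return 0;
        } else {
            return 0;
        }
    }
    return 1;
}

/*
 * Resolve with AI_ADDRCONFIG first; if only loopback addresses come back,
 * the result may have been filtered, so retry without the flag.
 * Returns the getaddrinfo() status; on success the caller frees *out.
 */
int resolve_with_fallback(const char *host, struct addrinfo **out)
{
    struct addrinfo hints;
    int rc;

    assert(host != NULL && out != NULL);
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_ADDRCONFIG;

    *out = NULL;
    rc = getaddrinfo(host, NULL, &hints, out);
    if (rc == 0 && only_loopback(*out)) {
        freeaddrinfo(*out);
        *out = NULL;
        hints.ai_flags = 0;
        rc = getaddrinfo(host, NULL, &hints, out);
    }
    return rc;
}
```

Because the retry is driven by the returned addresses rather than by the host name, it would also apply to names that a hosts file maps to a loopback address, at the cost of a second lookup in that case.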
prnetdb.c includes <string.h> and it compiles fine on my Linux host, but maybe it's not available on all platforms?

In any case, I just noticed that there's a private reimplementation of strcasecmp in lib/libc/src/strcase.c (PL_strcasecmp).  I'll attach an updated patch in a bit that uses that function instead, see if that works better?

BTW that "localhost" is the canonical name for the local host is documented in RFC 2606, so I think the strcasecmp approach should be fine.

Tore
Honza: just use strcmp for now.  We can't use PL_strcasecmp in
this file because PL_strcasecmp is defined in another shared
library.

Please rewrite the original code and the new code like this:

    hints.ai_flags = 0;
    if (flags & PR_AI_NOCANONNAME)
        hints.ai_flags |= AI_CANONNAME;
#ifdef AI_ADDRCONFIG
    /* A comment that explains the special case for "localhost", etc. */
    if (strcmp(hostname, "localhost") != 0 &&
        strcmp(hostname, "localhost.localdomain") != 0 && ...) {
        if (flags & PR_AI_ADDRCONFIG)
            hints.ai_flags |= AI_ADDRCONFIG;
    }
#endif

Your fallback logic in comment 25 may be more flexible than
testing for "localhost", etc. specifically, but it won't be
able to handle the corner case of getting nothing as a result
(for example, when the computer is not connected to any
network).
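
The name check proposed above can be factored into a small predicate. The sketch below is illustrative only: `is_loopback_name` is a hypothetical helper, and the list of names is an assumption ("localhost6" in particular is a guess at what the elided part of the condition might cover), not the list used by any patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if hostname is one of a fixed set of
 * loopback names, 0 otherwise. The comparison is exact and case-sensitive,
 * matching the suggestion to use plain strcmp().
 */
int is_loopback_name(const char *hostname)
{
    static const char *const names[] = {
        "localhost",
        "localhost.localdomain",
        "localhost6",   /* assumed entry; adjust to the names that matter */
    };

    assert(hostname != NULL);
    for (size_t i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
        if (strcmp(hostname, names[i]) == 0)
            return 1;
    }
    return 0;
}
```

A caller would then set AI_ADDRCONFIG only when `is_loopback_name(hostname)` returns 0. A case-sensitive match misses spellings such as "LOCALHOST", which is a trade-off of avoiding strcasecmp here.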
OS: Mac OS X → Windows XP
OS: Windows XP → Mac OS X
"&& ...)" what else should be included?  Also, should anything ending with ".localhost" be excluded as well?
I have seen "localhost6" mentioned in bug reports.

Another option is to exclude the getaddrinfo implementations
that are known to have this AI_ADDRCONFIG/localhost problem:

#ifdef AI_ADDRCONFIG
    /* A comment that explains why these implementations are excluded */
#if !defined(__GLIBC__) && !defined(_WIN32)
    if (flags & PR_AI_ADDRCONFIG)
        hints.ai_flags |= AI_ADDRCONFIG;
#endif
#endif

This approach is worth considering only if Mac OS X's
getaddrinfo doesn't have the AI_ADDRCONFIG/localhost problem.
(In reply to comment #30)
> #if !defined(__GLIBC__) && !defined(_WIN32)

How do we recognize a GLIBC version that doesn't suffer from this bug when it gets fixed?
Attached patch v2 (obsolete) — Splinter Review
- filtering various localhost names
- Wan-Teh, I have added citation of your comment 3 in this bug as comment for the code; it perfectly explains why we do it, do you agree?

Going to push to try to just check it builds on all our platforms.

Tested only on Mac with $route add -inet6 default fe80::1%en1 and www.ripe.net page.
Attachment #497780 - Attachment is obsolete: true
Attachment #502256 - Attachment is obsolete: true
Attachment #502660 - Flags: review?(wtc)
Comment on attachment 502660 [details] [diff] [review]
v2

r=wtc.

>-        hints.ai_flags = (flags & PR_AI_NOCANONNAME) ? 0: AI_CANONNAME;
>+        hints.ai_flags = ((flags & PR_AI_NOCANONNAME) ? 0: AI_CANONNAME);

This change is not needed.  Alternatively, you can rewrite this as:

        hints.ai_flags = 0;
        if (flags & PR_AI_NOCANONNAME)
            hints.ai_flags |= AI_CANONNAME;

>+        /* 
>+         Propagate AI_ADDRCONFIG to GETADDRINFO call if set.
>+         
>+         Needs workaround for loopback host addresses:         
...
>+         destination, ignoring the remote vs. local/loopback distinction.
>+        */

Please format a multi-line comment as follows:
    /*
     * line 1
     * line 2
     * line 3
     */
Attachment #502660 - Flags: review?(wtc) → review+
Attachment #502660 - Flags: approval2.0?
Honza, I made the changes I suggested to your patch.
Please review and test it.
Attachment #502660 - Attachment is obsolete: true
Attachment #503263 - Flags: review?(honzab.moz)
Attachment #503263 - Flags: approval2.0?
Attachment #502660 - Flags: approval2.0?
Hi guys,

I just saw the Slashdot story about you gearing up for the release of Firefox 4 next month.  As you might already have noticed, in a few months major content providers (Google, Yahoo, Facebook, Limelight, Akamai) will be simultaneously publishing AAAA records for their sites, see <http://isoc.org/wp/worldipv6day/>.  In order to make this event be as smooth as possible for all users of Firefox, I would strongly urge you to get this patch committed prior to the release of Firefox 4 (and preferably back-ported to the 3.6 series as well).  If there's something I can do to help make this happen, please let me know as soon as possible.

Apologies for nagging...

Tore
Tore: you can nag the OS vendors or the authors of
RFC 3493 about the AI_ADDRCONFIG/loopback address
problem I described in comment 18 and comment 21.
Priority: -- → P2
Target Milestone: --- → 4.8.8
Version: 4.9 → other
Wan-Teh, sure thing:  http://sourceware.org/bugzilla/show_bug.cgi?id=12398

Tore
Comment on attachment 503263 [details] [diff] [review]
Patch v3, by Honza Bambas [Check in comment 46]

Thanks for the update.

Going to land this after it gets a+.
Attachment #503263 - Flags: review?(honzab.moz) → review+
Comment on attachment 503263 [details] [diff] [review]
Patch v3, by Honza Bambas [Check in comment 46]

Patch checked in on the NSPR trunk (NSPR 4.8.8).

Checking in prnetdb.c;
/cvsroot/mozilla/nsprpub/pr/src/misc/prnetdb.c,v  <--  prnetdb.c
new revision: 3.62; previous revision: 3.61
done
Wan-Teh, should this also land in the mozilla tree, or is NSPR expected to merge soon?
Please merge this patch into mozilla-central.  I am not following
the Firefox 4 schedule closely, so I don't know what the checkin
rules are right now.
I just remember that the need for AI_ADDRCONFIG was previously
discussed in bug 467497.
To drivers: please decide on approval soon, thanks.  

This should get into Firefox 4 because of the IPv6 test day.  Ideally it should land in a beta first.
FWIW, to state it explicitly:
Usually, Mozilla is supposed to take only public releases of NSPR.

However, this is an edge scenario.
Wan-Teh has granted permission to Mozilla to take this individual patch on top of the currently used NSPR snapshot.

This is being seen as the best route to getting this bug widely tested.

Please approve this patch, so this can be tested in the next Firefox 4 beta.
Comment on attachment 503263 [details] [diff] [review]
Patch v3, by Honza Bambas [Check in comment 46]

Please land ASAP.
Attachment #503263 - Flags: approval2.0? → approval2.0+
Patch has been pushed to mozilla-central for Firefox 4:
http://hg.mozilla.org/mozilla-central/rev/00bf6b6767d3
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
(In reply to comment #46)
> Patch has been pushed to mozilla-central for Firefox 4:
> http://hg.mozilla.org/mozilla-central/rev/00bf6b6767d3

Honza and Wan-Teh, great stuff, thanks guys! :-D

The patch applies cleanly to releases/mozilla-1.9.2/nsprpub/pr/src/misc/prnetdb.c (i.e. Firefox 3.6 - also vulnerable to the bug), could you please commit it there, too?
-        hints.ai_flags = (flags & PR_AI_NOCANONNAME) ? 0: AI_CANONNAME;
+        if (flags & PR_AI_NOCANONNAME)
+            hints.ai_flags |= AI_CANONNAME;
The meaning of PR_AI_NOCANONNAME looks to be reversed. Is this an intentional change?
Before the patch, AI_CANONNAME will be set if PR_AI_NOCANONNAME is NOT set.
Now, AI_CANONNAME will be set if PR_AI_NOCANONNAME is set.
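
The difference can be shown side by side. In this sketch `DEMO_AI_NOCANONNAME` is a stand-in for `PR_AI_NOCANONNAME` with an assumed value, and the three function names are hypothetical; `canon_flags_fixed` is one way to keep the if-statement form while preserving the original meaning, not necessarily the fix that was checked in.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* expose AI_CANONNAME on glibc even in strict -std modes */
#endif
#include <sys/types.h>
#include <sys/socket.h>
#include <assert.h>
#include <netdb.h>

/* Stand-in for PR_AI_NOCANONNAME; the value is an assumption. */
#define DEMO_AI_NOCANONNAME 0x8000

/* Original expression: request the canonical name unless the caller opts out. */
int canon_flags_original(int flags)
{
    return (flags & DEMO_AI_NOCANONNAME) ? 0 : AI_CANONNAME;
}

/* Rewritten form from the diff above: the test is inverted. */
int canon_flags_regressed(int flags)
{
    int ai_flags = 0;
    if (flags & DEMO_AI_NOCANONNAME)
        ai_flags |= AI_CANONNAME;
    return ai_flags;
}

/* If-statement form with the negation restored. */
int canon_flags_fixed(int flags)
{
    int ai_flags = 0;
    if (!(flags & DEMO_AI_NOCANONNAME))
        ai_flags |= AI_CANONNAME;
    return ai_flags;
}
```

For both possible inputs, `canon_flags_fixed` agrees with `canon_flags_original`, while `canon_flags_regressed` returns the opposite result.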
(In reply to comment #48)

> The meaning of PR_AI_NOCANONNAME looks to be reversed. Is this an intentional
> change?
> Before the patch, AI_CANONNAME will be set if PR_AI_NOCANONNAME is NOT set.
> Now, AI_CANONNAME will be set if PR_AI_NOCANONNAME is set.

I very much doubt it, this bug is all about the AI_ADDRCONFIG part of the patch. The AI_CANONNAME part is likely just (intended to be) a cosmetic change.

Tore
No, Masatoshi Kimura is right.  The NOCANONNAME change is wrong.  The if condition is backwards.  Honza, Wan-Teh, can you please confirm + fix?

Tore, if you think the patch should land for 1.9.2, please request approval for 1.9.2 on the patch?
(In reply to comment #50)
> No, Masatoshi Kimura is right.  The NOCANONNAME change is wrong.  The if
> condition is backwards.  Honza, Wan-Teh, can you please confirm + fix?
> 

Fell into my blind spot.  Good catch.  Will fix it today.
Attachment #507842 - Flags: review?(wtc)
Attachment #507842 - Flags: approval2.0?
Comment on attachment 507842 [details] [diff] [review]
AI_CANONNAME bustage fix v1 [Check in comment 55]

Preapproving this.
Attachment #507842 - Flags: approval2.0? → approval2.0+
Comment on attachment 507842 [details] [diff] [review]
AI_CANONNAME bustage fix v1 [Check in comment 55]

r=wtc.

emk, thank you for catching my mistake.
Attachment #507842 - Flags: review?(wtc) → review+
Comment on attachment 507842 [details] [diff] [review]
AI_CANONNAME bustage fix v1 [Check in comment 55]

http://hg.mozilla.org/mozilla-central/rev/3bba6b2caabd
Attachment #507842 - Attachment description: AI_CANONNAME bustage fix v1 → AI_CANONNAME bustage fix v1 [Check in comment 55]
Attachment #503263 - Attachment description: Patch v3, by Honza Bambas → Patch v3, by Honza Bambas [Check in comment 46]
Comment on attachment 507842 [details] [diff] [review]
AI_CANONNAME bustage fix v1 [Check in comment 55]

Patch checked in on the NSPR trunk (NSPR 4.8.8).

Checking in prnetdb.c;
/cvsroot/mozilla/nsprpub/pr/src/misc/prnetdb.c,v  <--  prnetdb.c
new revision: 3.64; previous revision: 3.63
done
(In reply to comment #50)

> Tore, if you think the patch should land for 1.9.2, please request approval for
> 1.9.2 on the patch?

Boris,

I'd love to; after all, it is users of 3.6.x that I see are the most affected by this problem today. Unfortunately, I cannot figure out how to actually go about doing it. I understand I would have to set a flag «approval1.9.2?» on the two patches, but I have no idea where I would do that. Similarly, I understand I should set the «status1.9.2» field to «wanted», but again, I don't see how. Perhaps my Bugzilla account lacks the necessary privileges? Could you add the necessary flags and statuses for me, do you think?

Tore
We should not backport these patches to mozilla-1.9.2 for
Firefox 3.6.x until they have been tested in Firefox 4 for
a few weeks.  Our workaround for loopback addresses may be
insufficient.
Attachment #503263 - Flags: approval1.9.2.15?
Attachment #507842 - Flags: approval1.9.2.15?
> but I have no idea where I would do that

Click the "Edit" link on the attachment, and the flags should be there.  But yes, it may depend on your bugzilla permissions....  I'll set the flags.
(In reply to comment #7)

Lorenzo Colitti wrote:
>
> Tore, you once pointed me at:
> 
> http://www.opensource.apple.com/source/Libinfo/Libinfo-330.7/lookup.subproj/si_getaddrinfo.c

Thank you for that link to Libinfo-330.7, which is in Mac OS X 10.6.6.
I studied the Libinfo-330.7 code carefully.

> I see that the code just calls getifaddrs(). I don't have a mac to test, does
> getifaddrs() just ignore the loopback address?

The getifaddrs() call you mentioned is only used by the deprecated
getipnodebyname() function.  So I'll ignore that.

I found that getaddrinfo checks the AI_ADDRCONFIG flag only when DNS
(_mdns_addrinfo) is used.  The other two lookup methods, directory service
and file (ds_addrinfo and file_addrinfo), ignore the AI_ADDRCONFIG flag.

This implies Mac OS X 10.6.6's getaddrinfo doesn't have the AI_ADDRCONFIG
loopback address problem, because DNS cannot return loopback addresses.
But it has a different problem -- addresses returned by /etc/hosts lookup
(file_addrinfo) may be unusable.
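For reference, a minimal sketch of how a caller might request AI_ADDRCONFIG through the getaddrinfo hints while skipping it for loopback names. This is not the NSPR code: the helper names and the list of loopback host names are assumptions made for illustration only.

```c
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: returns 1 for host names that conventionally
 * resolve to a loopback address. The list is an assumption. */
int demo_is_loopback_name(const char *hostname)
{
    return strcmp(hostname, "localhost") == 0 ||
           strcmp(hostname, "localhost.localdomain") == 0;
}

/* Fill in hints for a dual-stack lookup. AI_ADDRCONFIG asks the resolver
 * to return only address families that are configured on the host; it is
 * left out for loopback names so that they still resolve. */
void demo_build_hints(const char *hostname, struct addrinfo *hints)
{
    memset(hints, 0, sizeof(*hints));
    hints->ai_family = AF_UNSPEC;
    hints->ai_socktype = SOCK_STREAM;
    if (!demo_is_loopback_name(hostname)) {
        hints->ai_flags |= AI_ADDRCONFIG;
    }
}

/* Hypothetical wrapper: returns the getaddrinfo() result code and, on
 * success, a list in *result that the caller frees with freeaddrinfo(). */
int demo_lookup(const char *hostname, struct addrinfo **result)
{
    struct addrinfo hints;
    demo_build_hints(hostname, &hints);
    return getaddrinfo(hostname, NULL, &hints, result);
}
```

Whether the flag has any effect still depends on the platform's resolver. As noted above, on Mac OS X 10.6.6 it appears to be honoured only on the DNS path, so a sketch like this would not filter addresses that come from /etc/hosts.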
OS: Mac OS X → Windows 7
OS: Windows 7 → Mac OS X
Wan-Teh,

No problems for me waiting until the patch has proved itself in Firefox 4 for a while. I very much hope that it'll make it into Firefox 3.6.15, though. (Provided no problems show up, that is.) Should I re-open the bug until the patch has landed on the 1.9.2 branch?

Boris,

Thank you for setting the flags for me. I'm pretty sure I don't have the necessary permissions in Bugzilla, as I still cannot see any «edit» links even now when I know where to look for them.

Tore
(In reply to comment #61)
> I very much hope that it'll make it into Firefox 3.6.15, though.
> (Provided no problems show up, that is.) Should I re-open the bug until the
> patch has landed on the 1.9.2 branch?

No, we track the status on the branches using flags.
Attachment #503263 - Flags: approval1.9.2.15? → approval1.9.2.15+
Attachment #507842 - Flags: approval1.9.2.15? → approval1.9.2.15+
Hi guys,

I just realised that Firefox 3.5 is still maintained and receives updates. The patches apply cleanly to the mozilla-1.9.1 branch as well - in fact, except for these two patches, prnetdb.c is identical on the mozilla-1.9.1, -1.9.2, and -central branches.

So, for completeness' sake, could you also add the «approval1.9.1?» (or perhaps it is «approval1.9.1.18?») flag to the patches and set status1.9.1 to «wanted»?

Tore
Attachment #503263 - Flags: approval1.9.1.18?
Attachment #507842 - Flags: approval1.9.1.18?
blocking1.9.1: --- → ?
status1.9.1: --- → ?
I can confirm that the issue is now solved for Firefox 4.0 beta 11 running on Mac OS X. Fantastic work guys, thanks a lot! :-)

Now we just need the patches to appear in Firefox 3.5.18 and 3.6.14 as well. Could they please be checked in to the release branches now, so that we can be completely certain that they won't be accidentally overlooked and forgotten about until 3.5.18 and 3.6.14 are tagged in the Mercurial repository?

Tore
Tore, 3.6.14 and 3.5.18 have been frozen for a while.

The patches attached are approved to land for 3.6.15.  Honza, can you do that?  If not, let me know.

The patches attached are not yet approved to land for 3.5.anything.
(In reply to comment #65)
> Honza, can you do that? If not, let me know.

I have it prepared in the queue, ready to qfin and push. I just don't know if I can land now, before the 3.6.14 final build.
You can.  3.6.14 is on a branch and has been since January 21.
Just make sure to land on the "default" branch in 1.9.2.  ;)
(In reply to comment #65)
> Tore, 3.6.14 and 3.5.18 have been frozen for a while.

I'm aware of that - my hopes are for 3.6.15 and 3.5.17.

> The patches attached are not yet approved to land for 3.5.anything.

Can anyone help with that? I would think that approving it for 3.5 would be quite uncontroversial, considering that it's approved for 3.6 already and the file that's being patched is identical in both branches...?

Tore
(In reply to comment #69)
> (In reply to comment #65)
> > Tore, 3.6.14 and 3.5.18 have been frozen for a while.
> 
> I'm aware of that - my hopes are for 3.6.15 and 3.5.17.

Uhm, on second thought I meant 3.5.18 here. Is 3.5.18 really frozen, already now? It's not tagged in the Mercurial repo at http://hg.mozilla.org/releases/mozilla-1.9.1/tags and https://wiki.mozilla.org/Releases shows that 3.5.17 isn't even released yet...

Anyway - I just want the patches to be committed on the tip so that they become part of the next release that is not frozen/tagged right now. :-)

Tore
Ah, I might have confused 3.5.17 and 3.5.18.  3.5.18 is still open.  You mistyped in comment 64 if you meant 3.6.15.  ;)
Hi,

Can somebody with the appropriate Bugzilla permissions please set the «status1.9.1» flag to «wanted»?

Thanks in advance,
Tore
Comment on attachment 503263 [details] [diff] [review]
Patch v3, by Honza Bambas [Check in comment 46]

Approved for 1.9.1.18, a=dveditz for release-drivers

Upgrading NSS to 3.12.9 is scheduled to go into 3.6.15 and 3.5.18 already, does it require an upgraded NSPR anyway? Seems better to pick up an NSPR release than to start adding the odd patch if these patches are in an upstream.
Attachment #503263 - Flags: approval1.9.1.18? → approval1.9.1.18+
Comment on attachment 507842 [details] [diff] [review]
AI_CANONNAME bustage fix v1 [Check in comment 55]

Approved for 1.9.1.18, a=dveditz for release-drivers
Attachment #507842 - Flags: approval1.9.1.18? → approval1.9.1.18+
blocking1.9.1: ? → ---
The NSS 3.12.9 update on the branches included an update to NSPR 4.8.7 according to kaie, but comment 56 says this was checked into NSPR 4.8.8 -- I guess we need this separate patch on the branches in the interim.
Comment on attachment 503263 [details] [diff] [review]
Patch v3, by Honza Bambas [Check in comment 46]

wtc may not agree that a month of pre-release beta testing in Firefox 4 is sufficient for this fix, especially given Mac's minority status and the extremely tiny use of IPv6. Moving branch approvals back to requests.
Attachment #503263 - Flags: approval1.9.2.15?
Attachment #503263 - Flags: approval1.9.2.15+
Attachment #503263 - Flags: approval1.9.1.18?
Attachment #503263 - Flags: approval1.9.1.18+
Attachment #507842 - Flags: approval1.9.2.15?
Attachment #507842 - Flags: approval1.9.2.15+
Attachment #507842 - Flags: approval1.9.1.18?
Attachment #507842 - Flags: approval1.9.1.18+
(In reply to comment #77)

> wtc may not agree that a month of pre-release beta testing in Firefox 4 is
> sufficient for this fix, especially given Mac's minority status and the
> extremely tiny use of IPv6. Moving branch approvals back to requests.

Not speaking for wtc, but he said in comment #58 «a few weeks» of testing in Firefox 4 should do, and it's been in Firefox 4 for more than three weeks now (since Feb 8).

Also, regarding the «extremely tiny use of IPv6», you should check out «World IPv6 Day», scheduled for June 8, where many major sites will simultaneously enable IPv6 - participants include Google/YouTube, Yahoo, Akamai, Limelight, Bing, heck, even Mozilla is in on it. See http://isoc.org/wp/worldipv6day/participants/ - and the list is constantly growing. It is crucially important that all supported releases of Firefox carry this change ahead of that day. The earlier the changed versions are released, the better; as you probably know, not all users upgrade their software in a timely manner.

Also, de-prioritizing IPv6-related problems due to its lacklustre deployment status is just helping cement non-deployment; the reason why Google and others haven't deployed IPv6 so far is *precisely* due to issues such as this one.

Tore
I'd like to echo what Tore said.

The company I work for (Cisco) is participating in World IPv6 Day on June 8 as well. We'll have www.cisco.com on IPv6, and WebEx, Linksys, and other parts of the company will likely be participating in one form or another as well. We have lots of Macs here, and Firefox is one of the two IT-supported browsers.

Further, my ISP in Paris (free.fr) currently delivers IPv6 to 500,000 "opt-in" users. They just moved to "on by default" with their new Residential Gateway which is shipping as fast as they can build them. I could see a million just from this ISP alone by year end.
I see at <http://hg.mozilla.org/releases/mozilla-1.9.2/tags> that there's now a FIREFOX_3_6_15_RELEASE tag present. Does that mean it's too late for these patches to be included in 3.6.15?

Tore
The version number 3.6.15 was used for an emergency release.

The version originally planned as 3.6.15 has been renamed to 3.6.16.
Comment on attachment 503263 [details] [diff] [review]
Patch v3, by Honza Bambas [Check in comment 46]

Approved for 1.9.2.16 and 1.9.1.18, a=dveditz for release-drivers
Attachment #503263 - Flags: approval1.9.2.16?
Attachment #503263 - Flags: approval1.9.2.16+
Attachment #503263 - Flags: approval1.9.1.18?
Attachment #503263 - Flags: approval1.9.1.18+
Comment on attachment 507842 [details] [diff] [review]
AI_CANONNAME bustage fix v1 [Check in comment 55]

Approved for 1.9.2.16 and 1.9.1.18, a=dveditz for release-drivers
Attachment #507842 - Flags: approval1.9.2.16?
Attachment #507842 - Flags: approval1.9.2.16+
Attachment #507842 - Flags: approval1.9.1.18?
Attachment #507842 - Flags: approval1.9.1.18+
https://hg.mozilla.org/releases/mozilla-1.9.2/rev/c5d74bcd7421
https://hg.mozilla.org/releases/mozilla-1.9.2/rev/5e68ad3bb017

https://hg.mozilla.org/releases/mozilla-1.9.1/rev/7564dd07fd03
https://hg.mozilla.org/releases/mozilla-1.9.1/rev/aeeba2b6d487

It appears that, although the approval flags have been renamed for the emergency release, the status flags haven't, so I'm setting "fixed1.9.2.16" and "fixed1.9.1.18" even though we already know the release this is in is *at least* 1.9.2.17/1.9.1.19.
Depends on: 650474
Regression on AIX 5; see bug 650474.
Whiteboard: blocking for Firefox 4: needed for IPv6 test day → blocking for Firefox 4: needed for IPv6 test day [http-conn]