Closed Bug 565862 Opened 14 years ago Closed 14 years ago

Use of DNS A records incorrect. Also, persistence is not right, these are related problems.

Categories

(Firefox :: General, defect)

x86
Windows XP
defect
Not set
major

Tracking

()

RESOLVED DUPLICATE of bug 392953

People

(Reporter: dgibelli, Unassigned)

Details

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3

I tested this in a lab, where IE6 handles the same scenario correctly. Here is what I did to reproduce the problem.

I set up two A records for a single hostname, www.test. I browsed to the test URL, http://www.test, and Firefox connected successfully to one of the two addresses. Good so far.

I deliberately made the IP that Firefox used unreachable, then clicked refresh or a link in the page that pointed to the same server. Firefox tried three times before failing over to the IP in the other A record, which took too long. Worse, all subsequent connection attempts still went to the IP that was down.

Firefox needs to be quicker to switch servers when there is a problem, but more importantly, once Firefox detects a dead IP it must remember the new IP for all subsequent connections to the same site. This is called persistence.
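To make the persistence request concrete, here is a minimal Python sketch of the behaviour described above. It is not Firefox code: `StickyConnector` and the `try_connect` callback are hypothetical names used only for illustration, and the callback stands in for a real connection attempt.

```python
from typing import Callable, Dict, List


class StickyConnector:
    """Remembers, per hostname, the last address that accepted a connection."""

    def __init__(self, try_connect: Callable[[str], bool]):
        # try_connect is a hypothetical callback: True means the address answered.
        self._try_connect = try_connect
        self._preferred: Dict[str, str] = {}

    def connect(self, host: str, addresses: List[str]) -> str:
        # Try the remembered address first, then the rest in their given order
        # (sorted() is stable, so the relative order of the others is kept).
        preferred = self._preferred.get(host)
        ordered = sorted(addresses, key=lambda a: a != preferred)
        for address in ordered:
            if self._try_connect(address):
                self._preferred[host] = address  # persist the working address
                return address
        self._preferred.pop(host, None)
        raise ConnectionError(f"no reachable address for {host}")
```

With 10.0.1.2 down, the first call would try both addresses and settle on 10.0.2.2; later calls for www.test would go straight to 10.0.2.2 instead of retrying the dead IP.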

Reproducible: Always

Steps to Reproduce:
1. Set up two A records for the same hostname but with different IPs.
I used 10.0.1.2 and 10.0.2.2:
www IN A 10.0.1.2
www IN A 10.0.2.2
2. Browse to the host referred to in the two A records (http://www.test in my case).
3. Break the routing to the IP that Firefox picks
4. Click refresh and monitor the test network to see retries so you can understand what is happening.
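
Before running the steps above, it can help to confirm that the client's resolver really returns both A records. The following is a small Python sketch, assuming the standard `socket.getaddrinfo` call; the helper names `unique_addresses` and `resolve_all` are made up for this example.

```python
import socket
from typing import Iterable, List, Tuple


def unique_addresses(addrinfo: Iterable[Tuple]) -> List[str]:
    """Return the distinct IP addresses from getaddrinfo-style tuples, in order."""
    seen: List[str] = []
    for _family, _type, _proto, _canonname, sockaddr in addrinfo:
        ip = sockaddr[0]
        if ip not in seen:
            seen.append(ip)
    return seen


def resolve_all(host: str, port: int = 80) -> List[str]:
    # Ask the system resolver for every IPv4 address it knows for the host.
    return unique_addresses(
        socket.getaddrinfo(host, port, family=socket.AF_INET, type=socket.SOCK_STREAM)
    )
```

In the lab setup, `resolve_all("www.test")` should list both 10.0.1.2 and 10.0.2.2; the order depends on the resolver and any caches in between.
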
Actual Results:  
Firefox connects correctly but when the web server is unreachable Firefox waits too long before trying another IP address for the same hostname.

Worse still, Firefox does not remember the new, reachable IP; it retries the first, unreachable IP whenever the user clicks a link in the page.

Expected Results:  
When Firefox detects a web server has died and uses another IP for the same site it must remember the new IP and not retry the dead IP for every attempt to refresh the page.

I tested this in a lab, but it is not an obscure scenario: I have two data centres, and I have problems with Firefox whenever one is taken down for maintenance. IE works correctly in that it fails over and does not immediately retry the dead IP.

It would be good to have the timers configurable, controlling the amount of time the browser waits before trying the other A records.
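
As a rough illustration of what configurable timers could look like, here is a Python sketch. The setting names in `FailoverSettings` are hypothetical and are not existing Firefox preferences; `socket.create_connection` is the standard-library call, and it can be swapped out through the `open_connection` parameter for testing.

```python
import socket
from dataclasses import dataclass
from typing import List


@dataclass
class FailoverSettings:
    # Hypothetical knobs; these are not existing Firefox preferences.
    connect_timeout_seconds: float = 3.0
    attempts_per_address: int = 1


def connect_with_failover(addresses: List[str], port: int,
                          settings: FailoverSettings,
                          open_connection=socket.create_connection):
    """Try each address in turn, bounded by the configured timeout and attempts."""
    last_error = None
    for address in addresses:
        for _ in range(settings.attempts_per_address):
            try:
                return open_connection(
                    (address, port), timeout=settings.connect_timeout_seconds
                )
            except OSError as error:
                last_error = error  # remember why this attempt failed
    raise ConnectionError("all addresses failed") from last_error
```

With a short timeout and a single attempt per address, the worst-case delay before reaching the second A record is bounded by one timeout rather than three retries.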

I will be happy to perform tests for you.
hai
http://en.wikipedia.org/wiki/Round_robin_DNS
Status: UNCONFIRMED → RESOLVED
Closed: 14 years ago
Resolution: --- → WONTFIX
I don't understand why you don't want to fix this problem?
Round Robin DNS is performed by the DNS server, by the client operating system, and by all the DNS caches in between; it is normally invisible to applications, which don't know (or care) that there are multiple IP addresses for a single FQDN. That is the *disadvantage* of round robin DNS. Most applications don't realize they should retry their request with one of the other answers.

Round Robin should only be used for load balancing, not for hardware availability; if one of your two servers goes down, 50% of all requests will fail.
Resolution: WONTFIX → DUPLICATE
I completely disagree that DNS should only be used for load balancing.

The internet was designed so that if parts of it were wiped out the network would still function. Multiple A records were designed for this reason. IE works correctly but Firefox does NOT.

I am now in the position of having to recommend IE to corporate users who want resilience as well as distributed load.
I'm a developer on high-speed routers for a big telecommunications company. Seriously, don't depend on Round Robin DNS to provide HA. It shifts the blame to applications, which often don't deal with it correctly. And the various DNS caches in between only make it worse.

Round Robin was not designed for the military resilience that you mention - you're confusing Layer 2 with Layer 3.

OK, bug 392953 will help to make it better for Firefox, but not necessarily for the round robin part. More important is that it can remember which IP address seems to be down, so that it can be skipped for a new request. Otherwise your requests will be very slow if they have to be repeated.
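
The idea of remembering which IP address seems to be down can be sketched in Python as a small cache with an expiry. This only illustrates the concept and is not what bug 392953 implements; the class name and the 60-second default are assumptions.

```python
import time
from typing import Callable, Dict, List


class DownAddressCache:
    """Tracks addresses that recently failed so new requests can skip them."""

    def __init__(self, retry_after_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self._retry_after = retry_after_seconds  # hypothetical default
        self._clock = clock
        self._down_since: Dict[str, float] = {}

    def mark_down(self, address: str) -> None:
        self._down_since[address] = self._clock()

    def mark_up(self, address: str) -> None:
        self._down_since.pop(address, None)

    def candidates(self, addresses: List[str]) -> List[str]:
        # Healthy addresses (or ones whose penalty has expired) come first;
        # recently failed ones stay at the end as a last resort.
        now = self._clock()

        def is_down(address: str) -> bool:
            since = self._down_since.get(address)
            return since is not None and now - since < self._retry_after

        return sorted(addresses, key=is_down)
```

Keeping failed addresses at the end of the list, rather than dropping them, means a request can still succeed if the "down" address recovers before its penalty expires.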