Closed Bug 214538 Opened 17 years ago Closed 16 years ago

DNS: changes in /etc/resolv.conf aren't picked up

Categories

(Core :: Networking, defect)

x86
Linux

VERIFIED WORKSFORME

People

(Reporter: dbt, Assigned: darin.moz)

User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030703
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030703

If I move from one network to another and update my /etc/resolv.conf nameserver
settings, mozilla does not pick up the changes.  This means I have to restart
the browser, which sucks.  Much wailing and gnashing of teeth as I reopen
various windows.

Reproducible: Always

Steps to Reproduce:
1. Point /etc/resolv.conf at a nonexistent nameserver (1.1.1.1).
2. Restart mozilla.
3. Fix /etc/resolv.conf and note that you still can't get anywhere.
Actual Results:  
Looking up: www.cnn.com...<hang>

Expected Results:  
worked.
Reporter:
Did you restart the name service caching daemon after changing /etc/resolv.conf?
Example:
- SYSV/Solaris/etc.:
  % /etc/init.d/nscd stop
  % /etc/init.d/nscd start
- Linux:
  % /etc/init.d/nscd restart

*** This bug has been marked as a duplicate of 162871 ***
Status: UNCONFIRMED → RESOLVED
Closed: 17 years ago
Resolution: --- → DUPLICATE
This is not the same bug as 162871; I'm not complaining about RECORDS being
cached, rather the nameserver IPs themselves.

I don't know if this was addressed in the DNS rewrite ( 205726 ), but it's 
certainly not DNS "pinning".
Status: RESOLVED → UNCONFIRMED
Resolution: DUPLICATE → ---
david: what version of glibc are you running?  in newer versions of glibc
(included with redhat 7 and above), there is a function to cause glibc to reread
/etc/resolv.conf.  mozilla will try to call that function if it exists in cases
where a lookup fails.  maybe your version of glibc is too old??

also, if mozilla seems to not be responding to changes in your /etc/resolv.conf
you can try toggling the online/offline mode.
if I wait a few minutes, the changes will get picked up.  toggling offline and
online doesn't make any difference.

Perhaps the resolver routines could simply stat() /etc/resolv.conf and call that
function automatically if the st_mtime has changed?
david: there's a real problem with that approach.  /etc/resolv.conf may not even
be used to resolve hostnames on some platforms.  namely, if folks have enabled
NIS, it is possible to have NIS resolve all hostnames.  users can configure this
via /etc/nsswitch.conf.  likewise, it could happen that users modify
/etc/host.conf to disable DNS altogether.  the point is that the range of
configuration options for host lookup under GLIBC-based platforms is really
vast.  for mozilla to detect the configuration using ad hoc methods like stat'ing
files is not a good idea.  maybe it would work 95% of the time since most folks
probably use the default configuration, but i still don't think it is something
we should be doing since it is a "fragile" solution at best.
I've seen this w/ a co-worker's Linux system that I was using to verify Darin's
res_init fixes from before.

It was extremely weird how it worked; I was even running a packet trace at the
time. I really would like to know if there is better Linux documentation than
the res_init() man page.

-> NEW
Status: UNCONFIRMED → NEW
Ever confirmed: true
Summary: changes in /etc/resolv.conf aren't picked up → DNS: changes in /etc/resolv.conf aren't picked up
Again (see comment #1):
On some platforms it is _required_ to restart the nscd daemon before changes to
/etc/resolv.conf are picked up. Did you try that yet?
*** Bug 215302 has been marked as a duplicate of this bug. ***
Sorry about not following up to your questions:

I'm running redhat 7.3 (fully patched etc) so a "recent enough glibc", from
comment #4.

I am not running nscd at all, so restarting it is not an issue.

If I wait for a single query to time out completely (which takes a while when I
get to work, because my home DNS setup has three resolver IPs), then all of a
sudden it works just fine -- this process can easily take two minutes.  I presume
this is when the mozilla resolver wonders if something is wrong and calls
res_init() again.

I think that a valid approach to this problem is to have a function that can try
to detect changes in the local resolver subsystem early, before making the user
wait for a failed lookup.  Such a detection routine could, for example, check
the last modified timestamp of /etc/resolv.conf, check getdomainname(2) for
changes, etc, and call res_init() when appropriate.  
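
A minimal Python sketch of the detection routine proposed above, for illustration only. It is not Mozilla code: the class name `ResolvConfWatcher`, the helper `maybe_reinit`, and the `reinit` callback (a stand-in for calling res_init()) are all hypothetical, and the sketch only covers the "check the last modified timestamp" part of the idea.

```python
import os


class ResolvConfWatcher:
    """Tracks the mtime of a resolver config file and reports changes."""

    def __init__(self, path="/etc/resolv.conf"):
        self.path = path
        self.last_mtime = self._mtime()

    def _mtime(self):
        try:
            return os.stat(self.path).st_mtime_ns
        except OSError:
            return None  # file missing or unreadable

    def changed(self):
        """Return True once for each observed change in mtime."""
        current = self._mtime()
        if current != self.last_mtime:
            self.last_mtime = current
            return True
        return False


def maybe_reinit(watcher, reinit):
    """Call `reinit` (hypothetical stand-in for res_init()) if the file changed."""
    if watcher.changed():
        reinit()
        return True
    return False
```

A caller would run `maybe_reinit(watcher, reinit)` before each lookup, so the cost is one stat() per lookup. Whether that check is reliable enough is exactly what the rest of this thread debates.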
This is a "me too". Same thing under RH8 using any release of Mozilla up to
1.6a.  Sometimes it comes right after a few minutes - but most of the time
nothing short of restarting it will fix it.

It can be done better. KDE/konqueror originally suffered from it too, but they
fixed it via res_init (according to a developer I spoke to). You say Mozilla
already uses this, so maybe Konqueror does some other checks too?

BTW: Konqueror IMMEDIATELY notices /etc/resolv.conf changes.

It is a real pain; I VPN into work from home, so can easily "change
/etc/resolv.conf" 5-10 times a night. Restarting Mozilla is such a pain that
I usually stick to Konqueror in VPN and leave Mozilla in "Internet only mode" ;-)

BTW: I use WPAD at home and a different one at work - I don't know if
Mozilla also rechecks WPAD settings when /etc/resolv.conf is changed - but it
should...

Jason: can you please test a recent nightly mozilla build (such as today's
build)?  i have made some changes to mozilla's host resolution code that may
help.  thx!
OK, I just downloaded and installed 2003-11-05-05-trunk under RH8.

I edited /etc/resolv.conf so that it pointed exclusively at one of our two DNS
servers, then started "tcpdump -n -i eth0 host me and port 53". I then started
Mozilla, and saw the DNS requests for WPAD, followed by that for the Web server
that is my homepage. I then edited /etc/resolv.conf and changed it to point to
our other DNS server, then I went back to Mozilla and went to a different Web site.

It still used the first DNS server :-( Oh yeah - and it didn't do another WPAD
lookup either - which sorta follows...

jason: well, suppose you disable the first DNS server so that a connection to it
cannot be made?  then, mozilla should recover and start using the second DNS
server.  if not, then we have a regression.  if it does work, then the only
remaining issue is that we don't discover and use the latest contents of
/etc/resolv.conf.  that is really an artifact / bug in GLIBC.  i looked at the
konqueror source code, and i don't think it is doing anything special with
/etc/resolv.conf.  in fact, it appears to use the same algorithm as mozilla.  it
calls res_init if gethostbyname (or getaddrinfo) fails.  that's exactly what
mozilla does.  the problem in your example is that gethostbyname won't fail. 
GLIBC is happy to continue talking to the same DNS server that it found in
/etc/resolv.conf when the application was started.  since gethostbyname doesn't
fail, we don't bother calling res_init, and therefore we never realize that
/etc/resolv.conf changed.
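
The retry-on-failure behaviour described in the paragraph above can be sketched roughly as follows. This is an illustrative Python sketch, not the actual Mozilla or Konqueror code: `resolve_with_reinit` is a hypothetical name, and `reinit` is a placeholder for calling res_init().

```python
import socket


def resolve_with_reinit(host, lookup=socket.gethostbyname, reinit=lambda: None):
    """Look up `host`; if the lookup fails, call `reinit` and retry once."""
    try:
        return lookup(host)
    except OSError:  # socket.gaierror and socket.herror are subclasses
        reinit()  # hypothetical stand-in for res_init(): reload resolver config
        return lookup(host)
```

The sketch also shows the limitation under discussion: as long as `lookup` keeps succeeding against the old nameserver, `reinit` is never called, so a changed /etc/resolv.conf goes unnoticed.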

it is important to note that a unix system can be configured to not even use
/etc/resolv.conf.  /etc/nsswitch.conf allows you to configure host resolution to
use a combination of /etc/hosts, DNS, NIS, NISPLUS, and even WINS (if you have
samba winbindd installed).  so, watching file modification dates is really a
hackish solution at best.  mozilla really shouldn't have to know about
/etc/resolv.conf and /etc/nsswitch.conf.
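
To make the configuration point concrete, here is a small Python sketch that lists the lookup sources named on the `hosts:` line of nsswitch.conf-style text. The function name `hosts_sources` and the sample configuration are hypothetical; the sketch simply drops bracketed action items such as `[NOTFOUND=return]` and does not try to model how glibc acts on them.

```python
import re


def hosts_sources(nsswitch_text):
    """Return the lookup sources listed on the `hosts:` line, in order."""
    for raw in nsswitch_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if line.startswith("hosts:"):
            rest = line[len("hosts:"):]
            rest = re.sub(r"\[[^\]]*\]", " ", rest)  # drop [STATUS=action] items
            return rest.split()
    return []


# Hypothetical example configuration.
sample = """
# hypothetical example configuration
passwd: files nis
hosts:  files nis [NOTFOUND=return] dns
"""
print(hosts_sources(sample))  # ['files', 'nis', 'dns']
```

If `dns` is absent from the returned list, /etc/resolv.conf would play no part in host lookups on that system, which is why watching that one file cannot be a general solution.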
You are right.

Whilst Mozilla was still running, I edited /etc/resolv.conf and put in one
nameserver which was invalid. I then went to a Web page and Mozilla just hung -
waiting for the DNS lookup (a host that was down - so no ICMP port unreachable
to go by). After it was hanging for about 40 secs, I edited /etc/resolv.conf and
put a working nameserver in there. About 15 sec later Mozilla noticed and away
it went again.

This is great - thanks! I'll give this CVS release a good test from home tonight :-)
toss up between marking this WONTFIX or WORKSFORME.  i think WORKSFORME is most
accurate since changes to /etc/resolv.conf are eventually picked up.  but
anything more, such as watching file last modified time stamps, is definitely
WONTFIX.

marking WORKSFORME.
Status: NEW → RESOLVED
Closed: 17 years ago → 16 years ago
Resolution: --- → WORKSFORME
One more thing. Even though the latest CVS release appears to make Mozilla
correctly notice /etc/resolv.conf changes, it still doesn't attempt to reload
automatic proxy settings.

If I fiddle with /etc/resolv.conf, I can sniff DNS queries going to the new DNS
server - but Mozilla insists on using the same WPAD settings.

e.g. I'm at home, using my DNS server and my WPAD file pointing at my home Squid
server. I then VPN into work. Now when I try to go to a Web page, Mozilla does a
DNS lookup of my home Squid server - which doesn't resolve. It then does a popup
saying the proxy server isn't available.

If I go into Mozilla, and change the current "http://wpad/wpad.dat" I have set
to actually match the FQDN of our internal WPAD server, then reload - then
Mozilla "relearns" and it's happy again. I can even then put
"http://wpad/wpad.dat" back in again - it seems that loading the WPAD is
stickier than it should be?

Then when I leave VPN, I have to go through the whole thing again, as obviously
work proxy servers aren't available from home either.

How about reloading WPAD whenever /etc/resolv.conf is changed?

[BTW: I don't actually mean /etc/resolv.conf - I mean res_init I suppose...]

jason: WPAD???  mozilla does not support WPAD... yet.  there is a patch in a
bug, but that is not yet part of the build.  at any rate, the problem related to
WPAD is another bug.  you'd want to either file another bug, or get in touch
with the author of the WPAD bug.
Sorry - bad wording. It isn't WPAD - it's "WPAD technology" aka the "automatic
proxy configuration URL". By using "http://wpad/wpad.dat" as the URL - you
totally emulate WPAD. 

There is nothing wrong with the "WPAD" support - it's the fact that it isn't
re-read when a resolving change occurs that I'm on about.

I'll put it in as a separate bug then.
you mean PAC, not WPAD, I think
Well, I just tried using firebird 0.7 and it still has this same **** 
problem, so I guess I'm going back to Konqueror, which works just fine  
and is smart enough to know that re-calling res_init if /etc/resolv.conf 
has a new modification date is a good idea. 
 
Yes, I do understand that the system resolver won't always use DNS.  In that 
case, /etc/resolv.conf probably won't be changing very often, so the extra  
overhead (an occasional stat()) is MINOR. 
 
I don't know how many people use their laptops to roam from network to network
under Linux, but I imagine that number is only increasing, and having to
restart the browser under those conditions is not a worthwhile solution,
especially since you can run only one at a time. I often have up to 20-30
browser windows open with various links I haven't read yet.
 
Total showstopper for me. 
 
Reopen, or tell me to lump it, I don't care, but I just wanted you to  
understand how **** this is. 
Under RH8 with Mozilla-1.6 I get the following:

On home network: startup Mozilla. I have a proxy server at home which I have set
up as a WPAD server for IE. Mozilla doesn't do "automatic network detection"
like IE/Konqueror do, but no worries - I just use "http://wpad/wpad.dat" as the
Automatic Proxy URL. Mozilla looks up "wpad.MYDOMAIN", finds an entry, downloads
the wpad.dat which tells it to use proxy.MYDOMAIN as its proxy server.

Then I start a VPN client and log into work. My /etc/resolv.conf file is updated
accordingly. When I attempt to access a site, Mozilla *does* notice a network
change has occurred - but it attempts to connect to "proxy.MYDOMAIN" again!!!
Well, MYDOMAIN is just my fake home domain, and even if it wasn't, you wouldn't
be able to connect to its proxy server from anywhere other than home! ;-) At
that stage Mozilla pops up an error saying the proxy server isn't available.

I think Mozilla needs a loop at that stage. If you have the "automatic proxy
URL" setting set, and any error occurs WRT settings learnt from that URL, then
before dropping into an error popup, Mozilla could go through the cycle again. In
my case it would have then looked up "wpad.WORKDOMAIN", found a DIFFERENT entry,
downloaded a DIFFERENT URL, got a DIFFERENT proxy server and away it would have
gone again.

As it is, I don't have to shutdown Mozilla each time I change networks, but I
have to go Edit-Preferences-Adv-Proxy, "Reload automatic proxy URL". Once I do
that it's all sweet again.

Jason
Firebird0.7 is old
Jason, let's talk about that in a separate bug.

V/WFM.
Status: RESOLVED → VERIFIED
(In reply to comment #24)
> Jason, let's talk about that in a separate bug.
> 
> V/WFM.

Hi there again

I just want to let you know this is still a problem under Mozilla 1.7x. 

i.e. I have "Automatic proxy URL" set to "http://wpad/wpad.dat" - which means
you get WPAD functionality. 

I start Mozilla at home (e.g. domain: home.net). 
Mozilla does DNS lookup of wpad.home.net and downloads wpad.dat, which is a PAC
file containing proxy server settings.
All works fine.

Then I VPN into work. /etc/resolv.conf is changed to reflect new DNS servers and
new default domain (e.g. work.com).

Attempt to go to Web page.
Mozilla reports that it cannot connect to proxy server - it's still trying to
use the home proxy. Note that Mozilla _is_using_the_correct_DNS_server_addresses.

i.e. it has "noticed" that the DNS resolver settings have changed, but it isn't
recognising that, since a network change has occurred, it would be a good idea
to re-check http://wpad/wpad.dat.

Does that make sense?

Thanks

Jason
(for the record...) The problem Jason described has nothing to do w/ the original bug (concerns about reading /etc/resolv.conf).

The problem is a current limitation, Necko has no network discovery features. If you switch networks, you need to reload the PAC file yourself (by going offline/online, pressing PAC reload, etc.) Sort of like HUP-ing your PAC file.
Hi there

Wow - I put this in 2003?

Does this comment mean you don't think it's a bug? It's just that MSIE does support this, and expecting users to reload PAC files, go offline, or restart FF when they don't have to with MSIE is a loss to FF, IMHO. As the original reporter said, the number of roaming laptop users is only growing, and detecting whether the network you are on requires a proxy is more and more of an issue. I mean - this isn't a "nicety" - most corporate LANs block outbound Web access - you have to go via their proxy.

What about catching it with the error code? I see that FF does create an error page if the proxy you were using is no longer resolvable/reachable. What about adding a new WPAD/PAC lookup in there just before returning the error?

e.g.

currently:

if (proxy_error_occurs) {
  errorPage("your proxy settings aren't working")
}

becomes:

if (proxy_error_occurs) {
  if (automatic_or_PAC) {
    redo_proxy_settings
    if (!redo_url_attempt) {
      errorPage("your proxy settings aren't working")
    }
  } else {
    errorPage("your proxy settings aren't working")
  }
}
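
For illustration, the proposed flow could look something like this as runnable Python. Every name here (`fetch_via_proxy`, `fetch`, `reload_settings`, the `settings` dictionary and its keys) is hypothetical and does not correspond to any Necko API; a `ConnectionError` stands in for "proxy error occurs", and letting it propagate stands in for showing the error page.

```python
def fetch_via_proxy(url, settings, fetch, reload_settings):
    """Try `fetch`; on a proxy error with automatic settings, reload and retry once."""
    try:
        return fetch(url, settings["proxy"])
    except ConnectionError:
        if not settings.get("automatic"):
            raise  # manual proxy settings: report the error right away
        # Automatic/PAC settings: redo the proxy lookup, then retry once.
        settings["proxy"] = reload_settings()
        return fetch(url, settings["proxy"])  # a second failure propagates
```

Retrying only once keeps the error path bounded: if the reloaded settings do not work either, the user still gets the error rather than an endless loop.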

Jason, if you move your PAC/WPAD comments to a different bug, we can talk about it there.