Closed Bug 30917 Opened 22 years ago Closed 21 years ago

implement DNS caching and request cancelation

Categories

(Core :: Networking, defect, P2)

VERIFIED FIXED
mozilla0.9.4

People

(Reporter: warrensomebody, Assigned: gordon)

Details

(Keywords: perf)

Attachments

(4 files)

We need to implement DNS caching and request cancelation. The code for this 
exists on the DNS_CANCEL_BRANCH which Gordon will land after beta.
=> M15
Target Milestone: M15
Moving what's not done for M15 to M16.
Target Milestone: M15 → M16
Status: NEW → ASSIGNED
Whiteboard: 2d
Keywords: beta2
Keywords: nsbeta2
Putting on [nsbeta2+][5/16] radar.  This is a feature; we MUST complete work by 
05/16 or we may pull this feature for PR2.
Whiteboard: 2d → [nsbeta2+][5/16]2d
The DNS_CANCEL_BRANCH has been landed (for a while now).  We just need to tune the 
cache size.
Is this fixed now?  tever, what are the latest test results?
Whiteboard: [nsbeta2+][5/16]2d → [NEED INFO]2d
The nsbeta2 work is complete.  The bug is being left open as a reminder that we 
need to do performance tuning.
Keywords: nsbeta2
Summary: implement DNS caching and request cancelation → [perf] implement DNS caching and request cancelation
Putting on [nsbeta3+] radar for PR3 work.
Keywords: nsbeta3, perf
Whiteboard: [NEED INFO]2d → [nsbeta3+]2d
M16 has been out for a while now; these bugs' target milestones need to be 
updated.
Removing [nsbeta3+], will re-eval later for beta3.
Whiteboard: [nsbeta3+]2d → 2d
2 separate issues, but neither for beta3.
Whiteboard: 2d → [nsbeta3-]
This bug was fixed months ago, wasn't it? If there's performance tuning that 
can be done here, please file a separate bug.
Target Milestone: M16 → Future
It's been a while since any activity on this bug -- has DNS caching been
implemented?
I'll take a look at this in the .9 or .91 timeframe.  It should be a relatively 
small change.
Target Milestone: Future → mozilla0.9.1
*** Bug 76610 has been marked as a duplicate of this bug. ***
Whiteboard: [nsbeta3-] → [nsbeta3-][DNS]
*** Bug 74153 has been marked as a duplicate of this bug. ***
Just to recap what we discussed in the performance meeting, we talked about
having a JS pref (not user visible) which controls the lifetime of a DNS cache
entry. Since 4.x cached DNS entries for 30 minutes, and lots of people hated
that, the pref will have a short default value, like 5 or 10 minutes. 

Customers who use proxy servers may want to jack up this value in their own
builds, since the proxy server is the only address they ever resolve.
Blocks: 71668
Does this lifetime refer to the amount of time the entry lives in the cache, or
the amount of time since it was last used?

This matter seems to hit me a little harder at home than at work since I lost my
DSL connection and spend several hours per night dialed-up via a modem.  It's a
little frustrating when I click a link on a web page which takes me to another
page on the same site, and I'm blocked for a couple of seconds waiting for the
DNS resolution to take place. 
   
I think the expiration time needs to be the life of the DNS entry in the cache, 
not an offset from the last use.  Otherwise, a DNS entry could last indefinitely, 
which is exactly what the folks who run servers have complained about.  We will 
use a preference however, so embedders can arbitrarily crank up the time as high 
as they want to optimize for their special uses.  My recollection from the 
newsgroup debates we had a couple years ago is that keeping DNS entries around 
for about 5 minutes wouldn't aggravate them too much, but might drastically 
reduce the latency for typical users.
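
To make the distinction concrete, here is a minimal sketch of a cache whose
entries expire a fixed time after they are stored, no matter how often they
are used.  This is not the nsDnsService code; the DnsCache class, its method
names, and the 300-second lifetime in the usage note are all hypothetical,
and the lifetime argument only stands in for the pref discussed above.

```cpp
#include <cassert>
#include <chrono>
#include <string>
#include <unordered_map>

// Hypothetical sketch, not the nsDnsService implementation.
using Clock = std::chrono::steady_clock;

class DnsCache {
public:
    // `lifetime` stands in for the hidden pref discussed in this bug.
    explicit DnsCache(std::chrono::seconds lifetime) : lifetime_(lifetime) {
        assert(lifetime.count() > 0);
    }

    // Store an address. The expiry is fixed once, at insertion time.
    void Put(const std::string& host, const std::string& address,
             Clock::time_point now) {
        entries_[host] = Entry{address, now + lifetime_};
    }

    // Copy the cached address into *address and return true if the entry is
    // still fresh. A hit deliberately does NOT push the expiry forward, so a
    // frequently used entry cannot live indefinitely.
    bool Get(const std::string& host, Clock::time_point now,
             std::string* address) {
        auto it = entries_.find(host);
        if (it == entries_.end()) return false;
        if (now >= it->second.expires) {
            entries_.erase(it);  // expired: the caller must resolve again
            return false;
        }
        *address = it->second.address;
        return true;
    }

private:
    struct Entry {
        std::string address;
        Clock::time_point expires;
    };

    std::chrono::seconds lifetime_;
    std::unordered_map<std::string, Entry> entries_;
};
```

With a 300-second lifetime, an entry stored at time t is still returned at
t + 299s and is gone at t + 300s, even if it was looked up in between.  An
embedder who only ever resolves a proxy server would simply construct the
cache with a much larger lifetime.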
Keywords: nsbeta3 → nsbeta1+
Summary: [perf] implement DNS caching and request cancelation → implement DNS caching and request cancelation
Whiteboard: [nsbeta3-][DNS]
qa to me.

Any reason we wouldn't just use the TTL value? Then the hostmaster can handle 
the details (proxy server records could have a long ttl so they stick longer, 
etc.)

QA Contact: tever → benc
(qa to me should have been done as benc@netscape.com...)
Many platforms provide no API at all for getting the TTL information.  We'd have
to write our own XP resolver with such an API or borrow one from somewhere. 
This is perhaps the right thing to do, but it's a bunch more work.
Yuk. Never mind what I said above.
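
As an illustration of the TTL limitation, here is a small sketch using the
POSIX getaddrinfo() call.  Picking getaddrinfo() is my assumption for the
example (nothing in this bug says which system call the DNS service uses),
and ResolveFirstIPv4 is a hypothetical helper name.  The point is only that
the result structure carries addresses and socket parameters, with no field
for the record's TTL.

```cpp
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <sys/socket.h>

#include <cassert>
#include <cstring>
#include <string>

// Hypothetical helper: resolve `host` through the system resolver and return
// the first IPv4 address as text, or an empty string on failure.
std::string ResolveFirstIPv4(const char* host) {
    assert(host != nullptr);

    addrinfo hints;
    std::memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET;        // IPv4 only, to keep the sketch short
    hints.ai_socktype = SOCK_STREAM;

    addrinfo* results = nullptr;
    if (getaddrinfo(host, nullptr, &hints, &results) != 0) return "";

    // struct addrinfo offers ai_flags, ai_family, ai_socktype, ai_protocol,
    // ai_addrlen, ai_addr, ai_canonname and ai_next. None of them says how
    // long the answer may be cached.
    std::string text;
    if (results != nullptr) {
        char buffer[INET_ADDRSTRLEN] = {0};
        const sockaddr_in* addr =
            reinterpret_cast<const sockaddr_in*>(results->ai_addr);
        if (inet_ntop(AF_INET, &addr->sin_addr, buffer, sizeof(buffer)) != nullptr) {
            text = buffer;
        }
    }
    freeaddrinfo(results);
    return text;
}
```

A caller gets back an address and nothing else, so a cache built on top of
this has to pick its own lifetime, which is what the pref is for.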
Priority: P3 → P1
*** Bug 75134 has been marked as a duplicate of this bug. ***
*** Bug 81469 has been marked as a duplicate of this bug. ***
+mostfreq, so we can catch dupes faster
Keywords: mostfreq
per PDT triage moving to 0.9.2
Target Milestone: mozilla0.9.1 → mozilla0.9.2
Blocks: 72805
I have nsDnsService.[h|cpp] on a branch named DNS_BRANCH, and I've fixed some 
build errors on Linux and Windows.

Linux and Windows still need a bit of debugging.
Blocks: 42898
Bug fixes have been checked into the DNS_BRANCH, and Linux and Windows seem to 
work.  If anybody would like to build and test this change, I'd be very 
interested in the results.
gordon, please attach the diffs (from the trunk) here... a lot of reviewers don't
always have access to the branch. Thanks.
Priority: P1 → P2
I tested with the patch on Linux. I looked at the DNS traffic with tcpdump, and
it seems that names are only resolved once.

One problem with the patch was that Mozilla sometimes hung on shutdown, but
that seems to be fixed on the branch.
I'm seeing horrible DNS resolving problems in Mozilla, and it's starting to make
the browser unusable for the first time since I switched to Mozilla as my
primary browser in November 2000.  It's not the DNS system, because Netscape 4
will load pages right away while Mozilla is stuck resolving, or worse, claims
that a hostname is unknown.  I have seen this happen: (1) try to load a URL in
Mozilla -- error occurs that the hostname is unknown, (2) load the same URL in
Netscape 4 -- comes right up, fast, no problem, (3) try to load the same URL
again in Mozilla -- same error occurs.

I am also frequently getting error messages claiming unknown hosts like
"ad.doubleclick.net" and other advertising servers.  While I don't mind missing
banner ads, I do mind having to dismiss the dialogs (sometimes many on a single
page, actually), and again Netscape 4 can always resolve those hostnames just
fine.  I don't know what's going on here, but the DNS performance and
reliability are absolutely pathetic.

This branch code, is it an XP resolver that will honor TTL values or is it just
an internal cache for results from the system resolver?  (It seems like both are
needed, ideally...)

If an XP resolver is needed, I'd be interested in working on that if I could
find the time.  (That's a VERY big "if".)  However, I have no idea how to
approach the problem, being mostly unfamiliar with Mozilla internals.  I know
I've got the skills that I could implement a resolver, but I have no clue how it
would be properly integrated with Mozilla.  I would be inclined to make it run
in a single thread, processing requests asynchronously, though perhaps with a
synchronous API call available for other threads to block on a result.

If this is an area that hasn't been addressed, could someone give me some
pointers on how to proceed, and how the current code works?  I'm really NOT
making any promises here, but if nobody's working on it, it wouldn't hurt for me
to take a stab at it...
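
For discussion, the threading model described above (one resolver thread,
asynchronous requests, plus a blocking call for other threads) could be
sketched roughly as follows.  This is only a sketch under stated
assumptions: ResolverThread, ResolveAsync and ResolveSync are hypothetical
names, the actual lookup is passed in as a function so the sketch does not
pretend to know how the wire protocol or the Mozilla integration would
work, and it uses standard C++ threads rather than anything from the
Mozilla tree.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>

// Hypothetical sketch of a single-threaded resolver front end.
class ResolverThread {
public:
    using LookupFn = std::function<std::string(const std::string&)>;
    using Callback = std::function<void(const std::string&)>;

    // `lookup` performs the real name resolution; it is injected by the caller.
    explicit ResolverThread(LookupFn lookup)
        : lookup_(std::move(lookup)), worker_([this] { Run(); }) {
        assert(lookup_);
    }

    ~ResolverThread() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_.notify_one();
        worker_.join();
    }

    // Asynchronous API: queue the request and return immediately.
    // `done` is invoked later on the resolver thread.
    void ResolveAsync(const std::string& host, Callback done) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            pending_.push(Request{host, std::move(done)});
        }
        wake_.notify_one();
    }

    // Synchronous API: block the calling thread until the answer arrives.
    // Must not be called from the resolver thread itself (it would deadlock).
    std::string ResolveSync(const std::string& host) {
        auto result = std::make_shared<std::promise<std::string>>();
        std::future<std::string> answer = result->get_future();
        ResolveAsync(host, [result](const std::string& address) {
            result->set_value(address);
        });
        return answer.get();
    }

private:
    struct Request {
        std::string host;
        Callback done;
    };

    // Resolver thread body: handle requests one at a time, in order. On
    // shutdown, finish whatever is still queued and then exit.
    void Run() {
        for (;;) {
            Request request;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                wake_.wait(lock, [this] { return stopping_ || !pending_.empty(); });
                if (pending_.empty()) return;  // stopping and nothing left
                request = std::move(pending_.front());
                pending_.pop();
            }
            request.done(lookup_(request.host));  // run outside the lock
        }
    }

    LookupFn lookup_;
    std::mutex mutex_;
    std::condition_variable wake_;
    std::queue<Request> pending_;
    bool stopping_ = false;
    std::thread worker_;  // declared last so the other members exist first
};
```

The blocking call is just the asynchronous one with a promise attached, so
there is a single code path to debug.  A slow lookup does hold up everything
queued behind it, which a real resolver would have to deal with.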
sr=darin for the current DNS_BRANCH:

===================================================================
File: nsDnsService.h   	Status: Up-to-date

   Working revision:	1.27.28.4
   Repository revision:	1.27.28.4	/cvsroot/mozilla/netwerk/dns/src/nsDnsService.h,v
   Sticky Tag:		DNS_BRANCH (branch: 1.27.28)
   Sticky Date:		(none)
   Sticky Options:	(none)

===================================================================
File: nsDnsService.cpp 	Status: Up-to-date

   Working revision:	1.79.24.12
   Repository revision:	1.79.24.12	/cvsroot/mozilla/netwerk/dns/src/nsDnsService.cpp,v
   Sticky Tag:		DNS_BRANCH (branch: 1.79.24)
   Sticky Date:		(none)
   Sticky Options:	(none)
Whiteboard: r=dougt sr=darin a=?
a= asa@mozilla.org for checkin to the trunk.
(on behalf of drivers)
Whiteboard: r=dougt sr=darin a=? → r=dougt sr=darin a=asa
stop the presses!!!

my optimized winnt build is hanging at startup with this patch.  i see the
ui thread getting stuck here (shown running TestHttp):

NTDLL! 77f6829b()
KERNEL32! 77f04f41()
NSPR4! 3001a669()
NSPR4! 3001a865()
NECKO! 01794af2()
NECKO! 01794dc8()
XPCOM! 10039fa8()
XPCOM! 1003cef4()
XPCOM! 10043ed3()
XPCOM! 1003fb88()
XPCOM! 10040143()
NECKO! 017847f7()
NECKO! 01784f00()
XPCOM! 10039fa8()
XPCOM! 1003cef4()
XPCOM! 10043ed3()
XPCOM! 1003fb88()
XPCOM! 10040143()
XPC3250! 01c20bfa()
XPCOM! 10050204()
XPC3250! 01c2d851()
XPC3250! 01c33b3d()
JS3250! 00c883d2()
JS3250! 00c9351e()
JS3250! 00c88b68()
JS3250! 00c6659c()
JSLOADER! 01473b2f()
JSLOADER! 014731f5()
JSLOADER! 01472c3d()
JSLOADER! 014729c9()
JSLOADER! 01472412()
JSLOADER! 01472293()
XPCOM! 1003ebcd()
XPCOM! 10013d04()
PLDS4! 00231d42()
XPCOM! 10013ccd()
XPCOM! 10004067()
XPCOM! 1003ea74()
XPCOM! 1003e3c4()
XPCOM! 100441ab()
TESTHTTP! 0040176e()
TESTHTTP! 004017bc()
TESTHTTP! 00402693()
KERNEL32! 77f1ba06()

i suspect that this is the DNS service getting hung up.

gordon: i also needed to add the following patch to nsDnsService.cpp to get
it to compile on win32:

Index: nsDnsService.cpp
===================================================================
RCS file: /cvsroot/mozilla/netwerk/dns/src/nsDnsService.cpp,v
retrieving revision 1.79.24.12
retrieving revision 1.79.24.13
diff -u -r1.79.24.12 -r1.79.24.13
--- nsDnsService.cpp	2001/06/06 21:49:51	1.79.24.12
+++ nsDnsService.cpp	2001/06/06 23:44:44	1.79.24.13
@@ -1279,7 +1279,7 @@
         DispatchMessage(&msg);
     }
 
-    return rv;
+    return NS_OK;
 }
 
 #endif

i've gone ahead and checked in this patch on the DNS_BRANCH.
problems solved.
You guys suck. Big time.

This code that was checked in on the trunk doesn't even compile on 
BeOS and OS/2.

You could have at least let someone know that this was coming in.
Mike, your fix for OS/2 and BeOS should allow the DNS service to work, but we 
still need to add code to evict lookups after they've been stored in the DNS 
cache.  I've created bug 84420 to track that, which I can fix tomorrow after 
getting the necessary reviews.

Marking this bug FIXED.
Status: ASSIGNED → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
Using nightly build 08 Jun 2001.  Works like a charm. 

But for Google (http://www.google.com) it resolves again and again!  Why?  Is it
because Google has some sort of round-robin DNS?  

A line like 'host -a www.google.com' results in:
Trying "www.google.com."
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 27022
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 4, ADDITIONAL: 4

;; QUESTION SECTION:
;www.google.com.                        IN      ANY

;; ANSWER SECTION:
www.google.com.         226     IN      A       216.239.37.100

;; AUTHORITY SECTION:
GOOGLE.COM.             159117  IN      NS      NS2.GOOGLE.COM.
GOOGLE.COM.             159117  IN      NS      NS1.GOOGLE.COM.
GOOGLE.COM.             159117  IN      NS      NS3.GOOGLE.COM.
GOOGLE.COM.             159117  IN      NS      NS4.GOOGLE.COM.

;; ADDITIONAL SECTION:
NS2.GOOGLE.COM.         65882   IN      A       216.239.34.10
NS1.GOOGLE.COM.         35890   IN      A       216.239.32.10
NS3.GOOGLE.COM.         45104   IN      A       216.239.36.10
NS4.GOOGLE.COM.         28997   IN      A       216.239.38.10

Received 194 bytes from 216.128.2.250#53 in 757 ms


any ideas?
LinuxLover:

If this happens on Mozilla 0.9.1 or Netscape 6.1b1, please open a new bug and
we'll go there.

I'll be marking this verified once I get done verifying through the backlog and
write a good test case.
Ben:  we didn't have DNS caching in 0.9.1 or the beta.
gordon: whatever :) The main point was to send the google problem to another
bug, unless you think it should stay here. I want to avoid a lot of bug morphing.
I looked at the tcpdump, and I see the DNS request sent every time.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
What platform are you testing?
Target Milestone: mozilla0.9.2 → mozilla0.9.4
The attachment on my previous comment shows much superfluous DNS activity,
in this case caused merely by loading http://news.bbc.co.uk/
hornet.cl.cam.ac.uk is the client machine running Mozilla beta 0.9.3
             (Build ID 2001080104)
wwwcache.cam.ac.uk is my local web cache
resolv0.cl.cam.ac.uk is my local DNS server
We badly need DNS caching.

Austin
This bug only seems to manifest itself when using a web cache.
A friend running 0.9.3 with Direct Connection to Internet selected in
the Advanced/Proxies preferences page doesn't see this behaviour.
I do, and I have "Automatic proxy configuration" selected, set to
  http://www.cl.cam.ac.uk/proxy.config
This config file reads as follows:
function FindProxyForURL(url, host)
{
   if(  shExpMatch( url, "*cgi*" ) ||
        shExpMatch( url, "snews:*" ) ||
        shExpMatch( url, "https:*" ) ||
        dnsDomainIs( host, "pgp.net") ||
        dnsDomainIs( host, "cam") ||
        dnsDomainIs( host, "ac") ||
        dnsDomainIs( host, "ac.uk" ) ||
        dnsDomainIs( host, "ja.net" ) ||
        dnsDomainIs( host, "www.avantek.co.uk" ) ||
        dnsDomainIs( host, "www.elsevier.nl" ) ||
        dnsDomainIs( host, "localhost" ) ||
        isPlainHostName( host ) ||
        isInNet( host, "131.111.0.0", "255.255.0.0") ||
        isInNet( host, "128.232.0.0", "255.255.0.0") ||
        isInNet( host, "129.169.0.0", "255.255.0.0") ||
        isInNet( host, "192.168.100.1", "255.255.255.0") ||
        isInNet( host, "127.0.0.1",   "255.255.255.255") )
        return "DIRECT";
   else return  "PROXY wwwcache.cam.ac.uk:8080; " +
                "DIRECT";
}
Okay, could you create a new bug describing the problem you're having with auto
proxy config, and close this bug again?  Please include the platform you're
testing on.  That will make it easier for us to try duplicating your results.

Thanks.
This bug can be closed.  I've opened a new bug report about the
interaction with automatic proxy config, Bug#94079.

Austin
restoring fixed per gordon@netscape.com 2001-06-06 21:41
Status: REOPENED → RESOLVED
Closed: 21 years ago
OS: other → All
Resolution: --- → FIXED
Whiteboard: r=dougt sr=darin a=asa
verifying per comments
Status: RESOLVED → VERIFIED