Closed Bug 938803 Opened 6 years ago Closed 6 years ago

DNS: nsDNSPrefetch should bypass cache when load flags request it

Categories

(Core :: Networking: DNS, enhancement)

25 Branch
enhancement
Not set

Tracking

()

RESOLVED FIXED
mozilla28

People

(Reporter: sworkman, Assigned: sworkman)

Details

(Whiteboard: [qa-])

Attachments

(5 files, 6 obsolete files)

15.62 KB, patch
mcmanus
: review+
Details | Diff | Splinter Review
5.31 KB, patch
sworkman
: review+
Details | Diff | Splinter Review
11.41 KB, patch
mcmanus
: review+
Details | Diff | Splinter Review
1.50 KB, patch
mcmanus
: review+
Details | Diff | Splinter Review
10.47 KB, patch
mcmanus
: review+
Details | Diff | Splinter Review
nsDNSPrefetch (in nsHttpChannel) and nsHTMLDNSPrefetch (HTMLAnchorElement) do not reflect load flags with respect to cache bypassing.

For nsDNSPrefetch in nsHttpChannel, this means a lost opportunity to prefetch the IP address in cases where transactions/connections are queued up. nsSocketTransport will honor the load flag and refresh the DNS cache entry anyway, so this is not an issue of the cache entry being refreshed or not, but when it is refreshed. Using nsDNSPrefetch, but the request could have been sent earlier, especially in cases when the HTTP request is queued.

For nsHTMLDNSPrefetch in HTMLAnchorElement, it means that hard refreshes for a page will not refresh the DNS cache entries for the links. It may be debatable whether a hard refresh should also refresh these link prefetches or not. But if the user is refreshing the page due to some change in network conditions, it seems like a good idea to update these cache entries too.

nsHTMLDNSPrefetch is also used with nsContentSink, but I'm not sure how that's supposed to work.
Sneaking this one in to improve the readability and information content of the debug logs in nsHostResolver::ResolveHost.
Attachment #832503 - Flags: review?(mcmanus)
-- Changes nsDNSPrefetch to support cache bypassing.
-- Changes nsHttpChannel::BeginConnect so that DNS refreshes are requested through nsDNSPrefetch in order to bypass the cache early. If the prefetch is a refresh, then mCaps is set such that the socket will NOT request a refresh later.

In addition, it's possible that the AsyncOpens relating to resources on the page may happen close enough together to coalesce the prefetched DNS requests into fewer (or one) results in nsHostResolver. My impression is that AsyncOpens happen closer to each other in time than socket connects initiated by nsHttpConnectionMgr. This seems especially with slow connections and hard page refreshes, where each resource has its own connection - if we can get the DNS refreshes to happen earlier and with some more coalescing, the sockets should connect faster.

I could be wrong here and have missed a scenario, so please come back at me.
Attachment #832522 - Flags: review?(mcmanus)
-- Changes nsHTMLDNSPrefetch to support bypassing the DNS cache.

The next patched will make use of this in HTMLAnchorElement and nsContentSink.
Attachment #832525 - Flags: review?(mcmanus)
-- Passes OwnerDoc() into nsHTMLDNSPrefetch::PrefetchLow to get load flags and consequently find out if the DNS cache should be bypassed.

Kyle, I'm not sure if OwnerDoc() is the right object to pass in here. But basically the idea is to get load flags to find out if the page was loaded with a hard refresh or if the page was meant to bypass the cache in some other way. At the moment, we're not honoring these load flags for link prefetching. (Note: this is different from the Seer - this is just resolving hostnames for links on a page, not trying to predict what a user will do next).

I'm open to opinions that say we shouldn't do this either.
Attachment #832536 - Flags: review?(khuey)
Feedback for this one only, because I'm not familiar with nsContentSink. Unsure if I'm using the right document pointer here as well.
Attachment #832540 - Flags: feedback?(khuey)
Comment on attachment 832503 [details] [diff] [review]
v1.0 Improve DNS debugs to show which path is taken in nsHostResolver::ResolveHost

Review of attachment 832503 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/dns/nsHostResolver.cpp
@@ +684,4 @@
>                      // Add callback to the list of pending callbacks.
>                      PR_APPEND_LINK(callback, &he->rec->callbacks);
>                      he->rec->flags = flags;
> +                    rv = IssueLookup(he->rec);

nice catch
Attachment #832503 - Flags: review?(mcmanus) → review+
Comment on attachment 832522 [details] [diff] [review]
v1.0 Change nsDNSPrefetch and nsHttpChannel::BeginConnect to honor cache bypassing in load flags

Review of attachment 832522 [details] [diff] [review]:
-----------------------------------------------------------------

this is a nice tweak!

::: netwerk/base/src/nsDNSPrefetch.cpp
@@ +57,5 @@
>                                       this, nullptr, getter_AddRefs(tmpOutstanding));
>  }
>  
>  nsresult
> +nsDNSPrefetch::PrefetchLow(bool refreshDNS /* = false */)

please don't put the default val comments in the c++ implementation

::: netwerk/protocol/http/nsHttpChannel.cpp
@@ +4513,5 @@
>  
>      // if this somehow fails we can go on without it
>      gHttpHandler->AddConnectionHeader(&mRequestHead.Headers(), mCaps);
>  
> +    if ((mLoadFlags & VALIDATE_ALWAYS) ||

move if to one line
Attachment #832522 - Flags: review?(mcmanus) → review+
Comment on attachment 832525 [details] [diff] [review]
v1.0 Bug 938803 - Change nsHTMLDNSPrefetch to honor cache bypassing in load flags

Review of attachment 832525 [details] [diff] [review]:
-----------------------------------------------------------------

This is an interesting patch Steve!

I don't think we want to do this in quite this way 1] because the DNS prefetecher is highly cache dependent.. and 2] it doesn't strike me that just because page foo is being reloaded the hrefs inside the new page are also invalid.. in most cases I wouldn't think they would be stale.

can you file a bug (if we don't have one) that we need to have a generic cache clearing routine that gets called on network changes? Its way past time we had that (and we can debate there what kind of stuff it triggers.. obviously not the disk object cache, but DNS almost certainly.)

::: content/html/content/src/nsHTMLDNSPrefetch.cpp
@@ +104,5 @@
> +{
> +  nsCOMPtr<nsILoadGroup> loadGroup = aDocument->GetDocumentLoadGroup();
> +  if (loadGroup) {
> +    nsLoadFlags loadFlags = 0;
> +    loadGroup->GetLoadFlags(&loadFlags);

check return value

@@ +107,5 @@
> +    nsLoadFlags loadFlags = 0;
> +    loadGroup->GetLoadFlags(&loadFlags);
> +    return loadFlags & nsIRequest::VALIDATE_ALWAYS ||
> +           loadFlags & (nsIRequest::LOAD_BYPASS_CACHE |
> +                        nsICachingChannel::LOAD_BYPASS_LOCAL_CACHE);

a little clearer to write this as

loadFlags & (a | b | c)
Attachment #832525 - Flags: review?(mcmanus) → review-
(In reply to Patrick McManus [:mcmanus] from comment #8)
> Comment on attachment 832525 [details] [diff] [review]
>
> I don't think we want to do this in quite this way 1] because the DNS
> prefetecher is highly cache dependent.. and 2] it doesn't strike me that
> just because page foo is being reloaded the hrefs inside the new page are
> also invalid.. in most cases I wouldn't think they would be stale.

Yeah - I wasn't as convinced about this one as the prefetch in nsHttpChannel, and I'm not super surprised you're not convinced. I'm ok leaving it.

> can you file a bug (if we don't have one) that we need to have a generic
> cache clearing routine that gets called on network changes? Its way past
> time we had that (and we can debate there what kind of stuff it triggers..
> obviously not the disk object cache, but DNS almost certainly.)

Agreed. This is on my todo list for lame-network stuff. I'd also like to discuss dropping HTTP connections on network changes - I have observed on my mac (running 10.8) that Safari detects Wifi on/off events instantly, where we (and Chrome) keep spinning. Firefox keeps sockets around until the OS notifies us through poll(); this takes around a minute. Seems like this is related to NetworkManager and NetworkLinkDetection - I've opened a meta bug to manage that one, as I see multiple bugs to do with NetworkManager, but not one specifically to do with Necko responding to it.

We can discuss options more there - Bug 939318.
Summary: DNS: nsDNSPrefetch and nsHTMLDNSPrefetch should bypass cache when load flags request it → DNS: nsDNSPrefetch should bypass cache when load flags request it
Landed 1st two patches r+d by Patrick.

https://hg.mozilla.org/integration/mozilla-inbound/rev/1431d2e9c5f7
https://hg.mozilla.org/integration/mozilla-inbound/rev/dba5cd88962a

Will remove r? for other patches since we're not taking that route.
Attachment #832536 - Flags: review?(khuey)
Attachment #832540 - Flags: feedback?(khuey)
Backed out in https://hg.mozilla.org/integration/mozilla-inbound/rev/1431d2e9c5f7 and https://hg.mozilla.org/integration/mozilla-inbound/rev/dba5cd88962a along with bug 923458 for being one of the possible causes of breaking non-10.7 xpcshell tests like this: https://tbpl.mozilla.org/php/getParsedLog.php?id=30720160&tree=Mozilla-Inbound

Neither push build on OSX, so I don't know which was the cause.
This updated patch should fix the xpcshell-test failure: the problem was that nsSocketTransport was getting to ResolveHost before nsDNSPrefetch had completed its lookup. Since the REFRESH_DNS flag for nsSocketTransport was unset, it resulted in nsSocketTransport getting an old DNS cache entry. In this case, addresses had been blacklisted and the connection failed, causing the test to fail.

So, the only difference between this patch and the last is that this one doesn't unset REFRESH_DNS in mCaps. If DNS refreshing is desired, it will be added to the nsDNSPrefetch request AND the DNS request carried out by nsSocketTransport.

So, no nice prefetch optimization here. Still, it makes nsDNSPrefetch obey the load flags, and it might be quicker for those cases where the nsSocketTransport tries to ResolveHost towards the end of the DNS lookup - nsHostResolver should coalesce the requests in those cases. If RequestHost is called after this time, that should be the same as current behavior, i.e. the socket will wait on a lookup via getaddrinfo.

Carrying forward the r+ on this one since it's a minor change.

I have a follow up patch that tries to avoid the second lookup, but it's not as elegant and might not be worth pursuing (or there are more important things to work on at the moment). I'll upload that next for feedback.
Attachment #832522 - Attachment is obsolete: true
Attachment #8334893 - Flags: review+
-- Add nsHttpChannel::OnLookupComplete callback function: use it to reset REFRESH_DNS if it was previously requested, both for nsHttpChannel::mCaps and the transactions Caps (depends on concrete class implementing nsAHttpTransaction).

-- Unset mDNSPrefetch in OnLookupComplete, and copy domain lookup timings into mTransactionTimings. This is identical to behavior in nsHttpChannel::OnStopRequest. It does mean, however, that DNS prefetch timings will not be used in GetDomainLookupStart if the prefetch has completed and OnStopRequest is yet to be called. In that case, timings from the transaction will be used. This seems like a relatively innocuous change in behavior to me, but those could be famous last words.

-- Add nsAHttpTransaction::SetCaps, and for those classes which don't do a noop for Caps, add implementations of SetCaps. In addition, change declaration of mCaps to be Atomic for threadsafe reading and writing, since mTransaction is used on both the main thread and the socket thread.

With these changes, DNS prefetches that complete before nsHttpConnectionMgr calls MakeNewConnection will not try to do a refresh. But if the prefetch has not completed, nsSocketTransport will still try to bypass the cache, as requested.

Let me know what you think Pat. I don't want to spend a lot of time on this one at the moment, so if you think this is a little ugly, or the Atomic/threadsafety is sketchy, then I'm fine skipping on it.

(Note: Also setting the other patches to obsolete to clean up the attachments list).
Attachment #832525 - Attachment is obsolete: true
Attachment #832536 - Attachment is obsolete: true
Attachment #832540 - Attachment is obsolete: true
Attachment #8334903 - Flags: feedback?(mcmanus)
Whiteboard: [leave open]
Comment on attachment 8334903 [details] [diff] [review]
v1.0 Improve the benefit of nsDNSPrefetch in nsHttpChannel::BeginConnect

Review of attachment 8334903 [details] [diff] [review]:
-----------------------------------------------------------------

I think the only real problem with your patch is that despite making mCaps atomic SetCaps(GetCaps() & REFRESH) boils down to read, modify, write and that sequence is open to race conditions of the other thread doing something similar but with a totally different flag resulting in data loss.

you could do something like a mtransaction::mdnswasrefreshed flag which is defined to only ever transition in one place from false to true (being written by the main thread). Make sure to make it a uint32 so it is safely updated. The socket thread, upon finding Caps & REFRESH, could check that flag to see if it could turn off REFRESH itself. That's racy, but its safe and the behavior is conservative if you loose the race. Then we get rid of all that new setcaps code as well.

what do you think of that?
Attachment #8334903 - Flags: feedback?(mcmanus)
Layering the patches: this first one doesn't touch any nsAHttpTransaction classes; rather it just sets nsHttpChannel::mCaps if needed.
Attachment #8334903 - Attachment is obsolete: true
Attachment #8337008 - Flags: review?(mcmanus)
I like your idea, Pat (and of course I should have locked the whole get and unset - but it's removed now anyway).

So, this patch goes on top of the last one (attachment 8337008 [details] [diff] [review]), and adds SetDNSWasRefreshed to classes implementing nsAHttpTransaction. I only see three cases where NS_HTTP_REFRESH_DNS can be unset, i.e. nsHttpTransaction, NullHttpTransaction and nsHttpPipeline (which calls nsHttpTransaction internally). Maybe I missed something in the SPDY classes.

Let me know if this is what you meant.

FYI: I did a quick test locally, googling 'test', then doing a hard refresh - I see 7 http channels opened for 'www.google.com'; without the patch, 8 calls to getaddrinfo; with the patch only 7. So, for that case, one call to getaddrinfo saved: hopefully other cases requiring DNS refreshes will see a little bit of redundancy removed.
Attachment #8337014 - Flags: review?(mcmanus)
I'm sorry I'm going to mess with this patch some more - I really don't like to do that.. but I think we can make it stronger.

instead of mDnsWasRefreshed can we change that to mCapsToClear variable? And then the definition of caps() is always mCaps & ~mCapsToClear.. and SetDnsRefereshed() just becomes mCapsToClear |= NS_HTTP_REFRESH_DNS.. I think that makes it more clear what's going on and it can be extended to other caps that the main thread might want to fuss with.
Attachment #8337008 - Flags: review?(mcmanus)
Attachment #8337014 - Flags: review?(mcmanus)
A little ride-along patch - typo in the patch for Bug 941884. Should have been a null check for addr_info in 'unspecHe' and not 'he'. Noticed it while verifying the other patches.

It's just a one-liner, but I can open a new bug for this one if you'd like.
Attachment #8338039 - Flags: review?(mcmanus)
Updated per Pat's request.
Attachment #8337014 - Attachment is obsolete: true
Attachment #8338041 - Flags: review?(mcmanus)
Attachment #8337008 - Flags: review?(mcmanus)
Attachment #8338039 - Flags: review?(mcmanus) → review+
Comment on attachment 8338039 [details] [diff] [review]
v1.0 Bug 938803 - Correct typo in nsHostResolver::ResolveHost for Bug 941884: 'he' to 'unspecHe'

Thanks Pat!

https://hg.mozilla.org/integration/mozilla-inbound/rev/9d17f28e30dd
Attachment #8338041 - Flags: review?(mcmanus) → review+
Attachment #8337008 - Flags: review?(mcmanus) → review+
Thanks Wes!

I removed the offending assertion - it's actually misplaced to include it, since OnStopRequest could have been called already, resulting in a null mDNSPrefetch, e.g. a page load is canceled, or loaded from the cache - in both these cases OnStopRequest could have completed, nulling mDNSPrefetch before OnLookupRequest could be called.

Trying again:

https://hg.mozilla.org/integration/mozilla-inbound/rev/fb6205ce5289
https://hg.mozilla.org/integration/mozilla-inbound/rev/dd984f4c205f
Should have removed [leave open] from whiteboard earlier. Setting RESOLVED FIXED.
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Whiteboard: [leave open]
Target Milestone: --- → mozilla28
Ooops. Thanks Ryan!
Whiteboard: [qa-]
You need to log in before you can comment on or make changes to this bug.