Closed Bug 1802947 Opened 3 years ago Closed 1 year ago

Double DNS request when visiting www.unex.es

Categories

(Core :: Networking: DNS, defect, P2)

Firefox 106
defect

Tracking

()

RESOLVED WORKSFORME
Performance Impact none

People

(Reporter: 080ariel, Unassigned)

Details

(Whiteboard: [necko-triaged])

Attachments

(3 files, 1 obsolete file)

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:106.0) Gecko/20100101 Firefox/106.0

Steps to reproduce:

Enter the URL www.unex.es and watch for DNS querys.

Actual results:

The browser performs two DNS query for the domain "www.unex.es."
which you can see in wireshark(attached screenshot).

Expected results:

The browser only does one DNS request for "www.unex.es."

The Bugbug bot thinks this bug should belong to the 'Core::Networking: DNS' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.

Component: Untriaged → Networking: DNS
Product: Firefox → Core
www.unex.es	ipv4	false	158.49.17.240 113	^partitionKey=%28https%2Cunex.es%29	0|0x2020|2|0|
www.unex.es	ipv4	false	158.49.17.240 113	^partitionKey=%28http%2Cunex.es%29	0|0x2000|0|0|
www.unex.es	ipv4	false	158.49.17.240 113	^partitionKey=%28https%2Cunex.es%29	0|0x10|0|0|
www.unex.es	ipv4	false	158.49.17.240 113	^partitionKey=%28https%2Cunex.es%29	0|0x2020|2|0|
www.unex.es	ipv4	false	158.49.17.240 113	^partitionKey=%28http%2Cunex.es%29	0|0x2000|0|0|
www.unex.es	ipv4	false	158.49.17.240 113	^partitionKey=%28https%2Cunex.es%29	0|0x10|0|0|

Looking at about:networking#dns there do seem to be multiple entries depending on the partitionKey and flags.
The OS should be able to load entries from the cache, and not send multiple queries for the same domain, but that doesn't always work apparently.
I'm not sure if there's a way to optimize things here while not leaking timing info to trackers (meaning pages should not be able to determine if a domain was previously loaded by how fast the load occured). So I think we might be able to special case the http&https case.

Blocks: necko-perf
Whiteboard: [necko-triaged][necko-priority-review]
Severity: -- → S3
Status: UNCONFIRMED → NEW
Ever confirmed: true
Priority: -- → P2

(In reply to Valentin Gosu [:valentin] (he/him) from comment #2)

www.unex.es	ipv4	false	158.49.17.240 113	^partitionKey=%28https%2Cunex.es%29	0|0x2020|2|0|
www.unex.es	ipv4	false	158.49.17.240 113	^partitionKey=%28http%2Cunex.es%29	0|0x2000|0|0|
www.unex.es	ipv4	false	158.49.17.240 113	^partitionKey=%28https%2Cunex.es%29	0|0x10|0|0|
www.unex.es	ipv4	false	158.49.17.240 113	^partitionKey=%28https%2Cunex.es%29	0|0x2020|2|0|
www.unex.es	ipv4	false	158.49.17.240 113	^partitionKey=%28http%2Cunex.es%29	0|0x2000|0|0|
www.unex.es	ipv4	false	158.49.17.240 113	^partitionKey=%28https%2Cunex.es%29	0|0x10|0|0|

Looking at about:networking#dns there do seem to be multiple entries depending on the partitionKey and flags.
The OS should be able to load entries from the cache, and not send multiple queries for the same domain, but that doesn't always work apparently.
I'm not sure if there's a way to optimize things here while not leaking timing info to trackers (meaning pages should not be able to determine if a domain was previously loaded by how fast the load occured). So I think we might be able to special case the http&https case.

Thanks for helping me with the bug :D

I don't have the same symptoms as you. I only have one entry in about:networking#dns:

www.unex.es	ipv4	false	158.49.17.240 102	^partitionKey=%28https%2Cunex.es%29	0|0x2021|2|0|

But it still does two DNS requests to the DNS server.
I'm using glibc stub resolver and as far as I'm aware it doesn't have a DNS cache.

Could you try to capture some HTTP logs for this?
Also add GetAddrInfo:5 to the log modules.

Flags: needinfo?(080ariel)

Log reproducing the bug with following flags added as requested: timestamp,sync,nsHttp:5,cache2:5,nsSocketTransport:5,nsHostResolver:5,GetAddrInfo:5

Flags: needinfo?(080ariel)

I can't reproduce the bug when visiting view-source:www.unex.es, only www.unex.es.

Sorry for comment spamming but I think this is useful info.

Thank you for the logs. They are quite useful.

2022-12-03 19:17:54.287877 UTC - [Parent 10221: DNS Resolver #3]: D/GetAddrInfo [DNS]: Resolving www.unex.es using PR_GetAddrInfoByName
2022-12-03 19:17:54.491459 UTC - [Parent 10221: DNS Resolver #2]: D/GetAddrInfo [DNS]: Resolving www.unex.es using PR_GetAddrInfoByName

We can see here 2 calls to the system resolver.
They seem to be driven by these calls - it seems the problem is that we have one channel without a partition key.

2022-12-03 19:17:54.287697 UTC - [Parent 10221: Main Thread]: D/nsHostResolver Resolving host [www.unex.es]<^partitionKey=%28https%2Cunex.es%29> type 65. [this=7fa13d663c10]
2022-12-03 19:17:54.287844 UTC - [Parent 10221: Socket Thread]: D/nsHostResolver Resolving host [www.unex.es]<^partitionKey=%28https%2Cunex.es%29> type 0. [this=7fa13d663c10]
2022-12-03 19:17:54.491422 UTC - [Parent 10221: Main Thread]: D/nsHostResolver Resolving host [www.unex.es]<> type 0. [this=7fa13d663c10]

It the DNS request without an origin suffix is coming from:

NeckoParent::RecvHTMLDNSPrefetch
HTMLDNSPrefetch::Prefetch
nsDNSService::AsyncResolveInternal

This happens when I open the new tab page. I assume it's because it runs in a priviledged context, it has no originAttributes.

Right, so, the DNS prefetch happens in the content process, when the HTMLAnchorElement gets added to the page.
We take the origin attributes from the parent page, which is about:newtab - which doesn't have any originAttributes.
So, this should be relatively benign, and should actually go away if you replace your new tab page with Blank Page in about:preferences#home.
This shouldn't cause any performance issue, as any new navigations would not be affected.

I'll keep this bug open to deal with the problem I mentioned in comment 2: That we sometimes resolve with different origin attributes when doing a HTTPS upgrade - that might actually be a performance issue.

Thanks for doing all the hard work.

So, this should be relatively benign, and should actually go away if you replace your new tab page with Blank Page in about:preferences#home.

I already have that settings.

Exact steps I take to repro the bug(with both new tab page and homepage set to Blank Page):
1. Open Firefox.
2. Type www.unex.es without clicking anything. For me it autoselects the address bar so I can
type the url directly.
3. Press Enter.

I can send a video if it helps.

Again, thanks for the help.

Could you also see if the same happens if you set browser.urlbar.speculativeConnect.enabled to false?
Also try setting browser.places.speculativeConnect.enabled to false. It's most likely caused by one of these.

Flags: needinfo?(080ariel)

(In reply to Valentin Gosu [:valentin] (he/him) from comment #11)

Could you also see if the same happens if you set browser.urlbar.speculativeConnect.enabled to false?
Also try setting browser.places.speculativeConnect.enabled to false. It's most likely caused by one of these.

Keeps happening with both of them set to false.
I already had browser.urlbar.speculativeConnect.enabled set to false, but not browser.places.speculativeConnect.enabled.

Flags: needinfo?(080ariel)

Thank you for that information. I might be able to give you a special build that prints something whenever that condition occurs.

Attached file backtraceFirstCall (obsolete) —
A full backtrace of the problem. First call to the resolver: ` `

Managed to get some backtraces of the duplicated call to the DNS resolver querying the address of www.unex.es using a fresh copy of mozilla-central. The lines in nsprpub/pr/src/misc/prnetdb.c are not the upstream ones because I added some code to aid with the backtraces.

Attachment #9307240 - Attachment is obsolete: true

Unfortunately because the DNS resolution is dispatched to a different thread than the consumer that initiated it, the backtrace doesn't tell us much. I'm interested in what piece of the code initiates a DNS resolution for www.unex.es with an empty origin attributes.

The call with empty origin attributes seems to be the first one. aOriginAttributes has the following data according to GDB:

(gdb) p aOriginAttributes $1 = (const mozilla::OriginAttributes &) @0x7fff9d16c528: {<mozilla::dom::OriginAttributesDictionary> = {<mozilla::dom::DictionaryBase> = {mIsAnyMemberPresent = true}, mFirstPartyDomain = <gNullChar> u"", mGeckoViewSessionContextId = <gNullChar> u"", mInIsolatedMozBrowser = false, mPartitionKey = <gNullChar> u"", mPrivateBrowsingId = 0, mUserContextId = 0}, <No data fields>}

The backtrace leading to that call to ResolveHost with that aOriginAttributes is the following:

#0 nsHostResolver::ResolveHost(nsTSubstring<char> const&, nsTSubstring<char> const&, int, unsigned short, mozilla::OriginAttributes const&, unsigned short, unsigned short, nsResolveHostCallback*) (this=0x7ffff78f7e00, aHost="www.unex.es", aTrrServer=<gNullChar> "", aPort=-1, type=0, aOriginAttributes=..., flags=20, af=0, aCallback=0x7fff9d16c4c0) at /mozilla-unified/netwerk/dns/nsHostResolver.cpp:425 #1 nsDNSService::AsyncResolveInternal(nsTSubstring<char> const&, unsigned short, unsigned int, nsIDNSAdditionalInfo*, nsIDNSListener*, nsIEventTarget*, mozilla::OriginAttributes const&, nsICancelable**) (this=0x7fffd3da23a0, aHostname="www.unex.es", type=0, flags=20, aInfo=0x0, aListener=0x7fffa60aebe0, target_=0x0, aOriginAttributes=..., result=0x7fffffffb588) at /mozilla-unified/netwerk/dns/nsDNSService2.cpp:1053 #2 nsDNSService::AsyncResolveNative(nsTSubstring<char> const&, nsIDNSService::ResolveType, unsigned int, nsIDNSAdditionalInfo*, nsIDNSListener*, nsIEventTarget*, mozilla::OriginAttributes const&, nsICancelable**) (this=0x7fffd3da23a0, aHostname="www.unex.es", aType=nsIDNSService::RESOLVE_TYPE_DEFAULT, flags=20, aInfo=0x0, aListener=0x7fffa60aebe0, target_=0x0, aOriginAttributes=..., result=0x7fffffffb588) at /mozilla-unified/netwerk/dns/nsDNSService2.cpp:1126 #3 non-virtual thunk to nsDNSService::AsyncResolveNative(nsTSubstring<char> const&, nsIDNSService::ResolveType, unsigned int, nsIDNSAdditionalInfo*, nsIDNSListener*, nsIEventTarget*, mozilla::OriginAttributes const&, nsICancelable**) () at /mozilla-unified/netwerk/dns/nsDNSService2.cpp:1126 #4 mozilla::net::Predictor::RunPredictions(nsIURI*, mozilla::OriginAttributes const&, nsINetworkPredictorVerifier*) (this=0x7fff9978b9d0, referrer=0x7fff9d1a1c10, originAttributes=..., verifier=0x0) at /mozilla-unified/netwerk/base/Predictor.cpp:1180 #5 mozilla::net::Predictor::PredictForPageload(nsICacheEntry*, nsIURI*, unsigned char, bool, nsINetworkPredictorVerifier*) (this=0x7fff9978b9d0, entry=0x7fff968e4490, targetURI=0x7fff9d1a1c10, stackCount=0 '\000', fullUri=false, verifier=0x0) at /mozilla-unified/netwerk/base/Predictor.cpp:750 #6 mozilla::net::Predictor::PredictInternal(unsigned int, nsICacheEntry*, bool, bool, nsIURI*, nsINetworkPredictorVerifier*, unsigned char) (this=0x7fff9978b9d0, reason=1, entry=0x7fff968e4490, isNew=false, fullUri=false, targetURI=0x7fff9d1a1c10, verifier=0x0, stackCount=0 '\000') at /mozilla-unified/netwerk/base/Predictor.cpp:647 #7 mozilla::net::Predictor::Action::OnCacheEntryAvailable(nsICacheEntry*, bool, nsresult) (this=0x7fff9d187880, entry=0x7fff968e4490, isNew=false, result=nsresult::NS_OK) at /mozilla-unified/netwerk/base/Predictor.cpp:206 #8 mozilla::net::CacheEntry::InvokeAvailableCallback(mozilla::net::CacheEntry::Callback const&) (this=0x7fff96a1ff60, aCallback=...) at /mozilla-unified/netwerk/cache2/CacheEntry.cpp:870 #9 mozilla::net::CacheEntry::AvailableCallbackRunnable::Run() (this=0x7fff9feef4c0) at /mozilla-unified/netwerk/cache2/CacheEntry.h:231 #10 mozilla::RunnableTask::Run() (this=0x7fff99632900) at /mozilla-unified/xpcom/threads/TaskController.cpp:538 #11 mozilla::TaskController::DoExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&) (this=0x7fffd2f62700, aProofOfLock=...) at /mozilla-unified/xpcom/threads/TaskController.cpp:851 #12 mozilla::TaskController::ExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&) (this=0x7fffd2f62700, aProofOfLock=...) at /mozilla-unified/xpcom/threads/TaskController.cpp:683 #13 mozilla::TaskController::ProcessPendingMTTask(bool) (this=0x7fffd2f62700, aMayWait=false) at /mozilla-unified/xpcom/threads/TaskController.cpp:461 #14 mozilla::TaskController::InitializeInternal()::$_2::operator()() const (this=0x7fffd2f2b158) at /mozilla-unified/xpcom/threads/TaskController.cpp:187 #15 mozilla::detail::RunnableFunction<mozilla::TaskController::InitializeInternal()::$_2>::Run() (this=0x7fffd2f2b130) at /mozilla-unified/obj-x86_64-pc-linux-gnu/dist/include/nsThreadUtils.h:546 #16 nsThread::ProcessNextEvent(bool, bool*) (this=0x7ffff7847820, aMayWait=false, aResult=0x7fffffffc897) at /mozilla-unified/xpcom/threads/nsThread.cpp:1204 #17 NS_ProcessNextEvent(nsIThread*, bool) (aThread=0x7ffff7847820, aMayWait=false) at /mozilla-unified/xpcom/threads/nsThreadUtils.cpp:474 #18 mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) (this=0x7fffd2f3eb00, aDelegate=0x7ffff7827b60) at /mozilla-unified/ipc/glue/MessagePump.cpp:85 #19 MessageLoop::RunInternal() (this=0x7ffff7827b60) at /mozilla-unified/ipc/chromium/src/base/message_loop.cc:381 #20 MessageLoop::RunHandler() (this=0x7ffff7827b60) at /mozilla-unified/ipc/chromium/src/base/message_loop.cc:374 #21 MessageLoop::Run() (this=0x7ffff7827b60) at /mozilla-unified/ipc/chromium/src/base/message_loop.cc:356 #22 nsBaseAppShell::Run() (this=0x7fffd3d49c00) at /mozilla-unified/widget/nsBaseAppShell.cpp:150 #23 nsAppStartup::Run() (this=0x7fffd2f443d0) at /mozilla-unified/toolkit/components/startup/nsAppStartup.cpp:295 #24 XREMain::XRE_mainRun() (this=0x7fffffffd340) at /mozilla-unified/toolkit/xre/nsAppRunner.cpp:5767 #25 XREMain::XRE_main(int, char**, mozilla::BootstrapConfig const&) (this=0x7fffffffd340, argc=4, argv=0x7fffffffe6d8, aConfig=...) at /mozilla-unified/toolkit/xre/nsAppRunner.cpp:5960 #26 XRE_main(int, char**, mozilla::BootstrapConfig const&) (argc=4, argv=0x7fffffffe6d8, aConfig=...) at /mozilla-unified/toolkit/xre/nsAppRunner.cpp:6016 #27 mozilla::BootstrapImpl::XRE_main(int, char**, mozilla::BootstrapConfig const&) (this=0x7ffff7803660, argc=4, argv=0x7fffffffe6d8, aConfig=...) at /mozilla-unified/toolkit/xre/Bootstrap.cpp:45 #28 0x00005555555e6c14 in do_main(int, char**, char**) (argc=4, argv=0x7fffffffe6d8, envp=0x7fffffffe700) at /mozilla-unified/browser/app/nsBrowserApp.cpp:226 #29 0x00005555555e662f in main(int, char**, char**) (argc=4, argv=0x7fffffffe6d8, envp=0x7fffffffe700) at /mozilla-unified/browser/app/nsBrowserApp.cpp:430
Full backtrace is in attachment full_bt.txt

Ah, that's actually super useful, thank you!
It seems this is caused by the predictor. You can turn it off by setting network.predictor.enabled to false.

Well, this is embarrasing: the cause of the bug in my system was uBlock trying to get the cname of the hostname . (https://github.com/gorhill/uBlock/blob/master/platform/firefox/vapi-background-ext.js#L266). I already had network.predictor.enabled set to false. If only I had done the obvious thing first...

I still have seen some unrelated weird things with DNS and Firefox trying to solve this bug:

  • nspr calls gethostbyname_r() in glibc in my system which is deprecated. I think it does that because it cannot initialize IPv6.
  • Ctrl + Shift + R which I believe is "Reload without cache" does lots of DNS requests to the same domain.
  • In Firefox startup there are a lot of duplicated DNS queries to mozilla domains.
  • The predictor DNS call is not working.

They might be bugs but even if they are(not very sure about that) they are very very low impact and I do not think they deserve their own Bug report. Feel free to comment if have insight into any of this or want me to open a more detailed bug report.

Also feel free to close the bug or leave it open to address the comment 2 problem if you want.

Thanks for all the help :D

Jesús - does this happen (with ublock) frequently (or at all) with other sites? I.e. is this a common result of using UBlock?

Flags: needinfo?(080ariel)

(In reply to Randell Jesup [:jesup] (needinfo me) from comment #21)

Jesús - does this happen (with ublock) frequently (or at all) with other sites? I.e. is this a common result of using UBlock?

I think this is because of the browser.dns.resolve webextension API. uBlock is using it to get the cname of the domain, to do some extra filtering.
It uses an empty originAttributes dictionary for the requests.

Flags: needinfo?(080ariel)
Whiteboard: [necko-triaged][necko-priority-review] → [necko-triaged]
Performance Impact: --- → none

@valentin- removing from necko-perf; you wanted to keep it open to deal with some stuff from comment 2. Is this still the case given comment 20?

No longer blocks: necko-perf
Flags: needinfo?(valentin.gosu)

Considering we don't see this for regular browsing, I think it's OK to close it. Thanks!

Status: NEW → RESOLVED
Closed: 1 year ago
Flags: needinfo?(valentin.gosu)
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: