Warm up DNS on Android Intent invocation
Categories
(Core :: Networking, enhancement, P1)
People
(Reporter: jesup, Assigned: kaya, NeedInfo)
References
(Blocks 1 open bug)
Details
(Whiteboard: [necko-triaged][necko-priority-queue][fxdroid][group1])
Attachments
(1 file)
This is mostly a Fenix thing: when we're started by an intent and Gecko isn't running, start a DNS lookup of the target domain in parallel with starting Gecko and sending it the URL. This will get the OS to send out the DNS request earlier instead of waiting for Gecko to be up and processing the initial URL.
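To make the idea concrete, here is a minimal sketch of the "resolve in parallel with startup" pattern. It is written in Python purely for illustration; the real change would live in the Android app code. The names warm_up_dns, handle_intent and start_engine are hypothetical, and socket.getaddrinfo is only a stand-in for whatever OS resolver call the app would actually use.

```python
import socket
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlsplit

# One background worker so the lookup never blocks the startup path.
_executor = ThreadPoolExecutor(max_workers=1, thread_name_prefix="dns-warmup")


def warm_up_dns(url, resolve=socket.getaddrinfo):
    """Start resolving the URL's host in the background.

    Returns a Future, or None when the URL has no host. Callers are
    expected to ignore the result: the only goal is to make the OS
    resolver send the query (and cache the answer) early.
    """
    host = urlsplit(url).hostname
    if not host:
        return None

    def lookup():
        try:
            resolve(host, None)
        except OSError:
            # A failed warm-up is harmless; the real page load resolves again.
            pass
        return host

    return _executor.submit(lookup)


def handle_intent(url, start_engine, resolve=socket.getaddrinfo):
    """Kick off the warm-up first, then do the slow engine startup.

    start_engine is a hypothetical callback standing in for
    "start Gecko and send it the URL".
    """
    pending = warm_up_dns(url, resolve)
    start_engine(url)
    return pending
```

The lookup is fire-and-forget on purpose: nothing waits on it, so a slow or failing resolver cannot delay startup.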
For DoH, we could move DoH to a separate process, which could either survive or be started far faster than all of Gecko (and be started in parallel with Gecko). This will be considerably more work, since DoH would need to become standalone, and we'd need to modify Gecko to use an external DoH service.
Comment 1•9 months ago
I was thinking maybe we could also postpone other startup activities that may delay the initial page load, such as update checking and a bunch of things that usually happen after browser-delayed-startup-finished. I see there are some GeckoView-specific ones that we do there.
I think the DNS cache preload might be an easy and impactful first step though.
Comment 2•3 months ago
Perf folks are about to land our first test which should cover this scenario
https://bugzilla.mozilla.org/show_bug.cgi?id=1898221
Comment 3•2 months ago
Comment 4•1 month ago
Was discussing this with Kaya and Randell and perf folks.
In terms of verifying the impact of the early dns request, it would be easier if:
• we flush the OS dns cache (ideally via ADB since that's how the test is run. But perhaps there is a better way)
• we introduce artificial delays into the network's DNS resolution, which would be via UDP in this case (not sure if RogersInABox also delays UDP traffic)
One way to do this:
• Create a Wi-Fi hotspot on your MacBook, and use Network Link Conditioner to delay all traffic
• Connect the Android device to the MacBook
• Use Wireshark on the Wi-Fi interface of the MacBook to verify that the "warm up" DNS request is made (make it to a dummy host, just for testing)
We likely won't be able to get all of these aspects into CI, but if we can verify them locally, that should be sufficient.
Comment 5•1 month ago
(In reply to Andrew Creskey [:acreskey] from comment #4)
> • we introduce artificial delays into the network's DNS resolution, which would be via UDP in this case (not sure if RogersInABox also delays UDP traffic)
For manual testing I have a DNS server implementation that we can use to delay the DNS response:
https://github.com/valenting/dev-dns-server
We can add a setTimeout to this line.
We can also use the console.log messages to see when requests are received and responses are sent.
In automation I'm not sure we can change the DNS servers, or if that would actually break the test harness.
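For anyone who wants to try the delay approach without that server, the idea can be sketched as a tiny UDP relay that sits in front of a real resolver and holds each reply for a fixed time. This is not the dev-dns-server linked above, just a hypothetical Python stand-in: serve_delayed_dns, its parameters and the upstream address are all assumptions, and it handles one query at a time, which should be enough for manual testing.

```python
import socket
import time


def serve_delayed_dns(listen_sock, upstream_addr, delay_s, max_queries=None):
    """Relay UDP DNS queries to upstream_addr, delaying each reply by delay_s.

    listen_sock is an already-bound UDP socket. max_queries limits how many
    queries are handled before returning (None means run forever).
    """
    handled = 0
    while max_queries is None or handled < max_queries:
        query, client = listen_sock.recvfrom(4096)
        print(f"{time.monotonic():.3f} query received from {client}")
        handled += 1

        # Forward the raw query to the real resolver and wait for its answer.
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as upstream:
            upstream.settimeout(5.0)
            upstream.sendto(query, upstream_addr)
            try:
                reply, _ = upstream.recvfrom(4096)
            except socket.timeout:
                continue  # drop it; the client will retry

        time.sleep(delay_s)  # the artificial resolution delay
        listen_sock.sendto(reply, client)
        print(f"{time.monotonic():.3f} response sent to {client}")
```

To use it, bind a UDP socket on the machine the device uses as its DNS server and pass in the address of your usual resolver. Binding port 53 normally needs elevated privileges. The timestamps printed for each query and response play the same role as the console.log output mentioned above.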
Comment 6•14 days ago
Let me see if I can evaluate the impact of the WIP patch.
Comment 7•14 days ago
This is looking promising in that the warmupDNS coroutine from the WIP patch ends up making the DNS request about 200 milliseconds before the app link pageload request.
I'm hotspotting from my macbook and observing the traffic via Wireshark.
I've modified your patch, Kaya, so that the warmUpDns call is always to a random host name (so it's not cached), e.g. invalid_host_name700tb
I'm also changing the applink test URL on every run so that it's also not cached. Using www.chase.com and www.etsy.com here.
But you can see the warmup request starting about 200ms before the actual request. (Time, in seconds, is the second column.)
1012 31.365202 192.168.2.5 192.168.2.1 DNS 82 Standard query 0x5762 A invalid_host_name700tb
1013 31.371331 192.168.2.1 192.168.2.5 DNS 82 Standard query response 0x5762 A invalid_host_name700tb
1014 31.555525 192.168.2.5 192.168.2.1 DNS 73 Standard query 0x76f5 HTTPS www.chase.com
1015 31.558283 192.168.2.5 192.168.2.1 DNS 73 Standard query 0x740e A www.chase.com
799 29.511546 192.168.2.5 192.168.2.1 DNS 82 Standard query 0xa475 A invalid_host_nameagjya
800 29.516334 192.168.2.1 192.168.2.5 DNS 82 Standard query response 0xa475 A invalid_host_nameagjya
801 29.726112 192.168.2.5 192.168.2.1 DNS 72 Standard query 0x35e0 HTTPS www.etsy.com
802 29.734264 192.168.2.5 192.168.2.1 DNS 72 Standard query 0x869b A www.etsy.com
803 29.770818 192.168.2.1 192.168.2.5 DNS 189 Standard query response 0x869b A www.etsy.com CNAME zone1.www.etsy.com CNAME etsy.map.fastly.net A 151.101.1.224 A 151.101.65.224 A 151.101.129.224 A 151.101.193.224
805 29.788772 192.168.2.1 192.168.2.5 DNS 183 Standard query response 0x35e0 HTTPS www.etsy.com CNAME zone1.www.etsy.com CNAME etsy.map.fastly.net SOA ns1.fastly.net
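The "about 200ms" figure can be checked directly from the rows above. The following is a small sketch, assuming the plain-text Wireshark summary layout shown here (No., Time, Source, Destination, Protocol, Length, Info); first_query_time and warmup_lead_ms are hypothetical helper names.

```python
# Rows copied from the two captures above (only those needed for the gap).
CHASE_RUN = """
1012 31.365202 192.168.2.5 192.168.2.1 DNS 82 Standard query 0x5762 A invalid_host_name700tb
1013 31.371331 192.168.2.1 192.168.2.5 DNS 82 Standard query response 0x5762 A invalid_host_name700tb
1014 31.555525 192.168.2.5 192.168.2.1 DNS 73 Standard query 0x76f5 HTTPS www.chase.com
1015 31.558283 192.168.2.5 192.168.2.1 DNS 73 Standard query 0x740e A www.chase.com
"""

ETSY_RUN = """
799 29.511546 192.168.2.5 192.168.2.1 DNS 82 Standard query 0xa475 A invalid_host_nameagjya
801 29.726112 192.168.2.5 192.168.2.1 DNS 72 Standard query 0x35e0 HTTPS www.etsy.com
802 29.734264 192.168.2.5 192.168.2.1 DNS 72 Standard query 0x869b A www.etsy.com
"""


def first_query_time(capture, host):
    """Timestamp (seconds) of the earliest DNS query for host."""
    times = []
    for line in capture.strip().splitlines():
        fields = line.split()
        info = fields[6:]  # everything after the Length column
        is_query = info[:2] == ["Standard", "query"] and info[2] != "response"
        # For a query row, info is: Standard query <id> <type> <host>
        if is_query and info[4] == host:
            times.append(float(fields[1]))
    return min(times)  # raises ValueError if the host was never queried


def warmup_lead_ms(capture, warmup_host, target_host):
    """How long before the first real query the warm-up query went out."""
    lead = first_query_time(capture, target_host) - first_query_time(capture, warmup_host)
    return lead * 1000.0


print(round(warmup_lead_ms(CHASE_RUN, "invalid_host_name700tb", "www.chase.com"), 1))
print(round(warmup_lead_ms(ETSY_RUN, "invalid_host_nameagjya", "www.etsy.com"), 1))
```

For these two runs the lead works out to roughly 190 ms and 215 ms.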
Next I'll see how it impacts resolution time when warming up DNS for the actual host.
Comment 8•14 days ago
This patch looks to be working as intended -- we make the early DNS A record lookup via the warmup coroutine and it gets used.
Note that we still make the HTTPS RR lookup later on (more on this in a bit).
With the warmup, applink to www.nfl.com:
Note the early A record lookup and the HTTPS lookup that follows about 270ms later at 52.189114:
9391 51.919297 192.168.2.5 192.168.2.1 DNS 71 Standard query 0x01ca A www.nfl.com
9501 51.949922 192.168.2.1 192.168.2.5 DNS 174 Standard query response 0x01ca A www.nfl.com CNAME global.nfl.map.fastly.net A 151.101.129.153 A 151.101.193.153 A 151.101.1.153 A 151.101.65.153
9854 52.189114 192.168.2.5 192.168.2.1 DNS 71 Standard query 0xefd3 HTTPS www.nfl.com
9861 52.219438 192.168.2.1 192.168.2.5 DNS 168 Standard query response 0xefd3 HTTPS www.nfl.com CNAME global.nfl.map.fastly.net SOA ns1.fastly.net
Without the warmup, applink to www.canada.ca:
Note how necko makes the two lookups, HTTPS and A, at 26.588626 and 26.605788 respectively:
8173 26.588626 192.168.2.5 192.168.2.1 DNS 73 Standard query 0xd783 HTTPS www.canada.ca
8193 26.605788 192.168.2.5 192.168.2.1 DNS 73 Standard query 0xb3dc A www.canada.ca
8297 26.651477 192.168.2.1 192.168.2.5 DNS 164 Standard query response 0xb3dc A www.canada.ca CNAME www.canada.ca.edgekey.net CNAME e4073.dscb.akamaiedge.net A 184.26.192.192
8298 26.652391 192.168.2.1 192.168.2.5 DNS 212 Standard query response 0xd783 HTTPS www.canada.ca CNAME www.canada.ca.edgekey.net CNAME e4073.dscb.akamaiedge.net SOA n0dscb.akamaiedge.net
Sometimes the improvements can be seen via the performance timing API in the DOM, i.e. performance.timing, but it's not consistent.
I don't think we'll be able to measure anything in the applink startup test because ~45ms is likely within the noise. (And the test needs to run with a single iteration of a unique host every time).
But from the wireshark logs this looks to be a good improvement.
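To put numbers on the captures above, query-to-response latency can be computed by pairing each query with its response on the DNS transaction id. This is only a sketch over the same Wireshark summary layout; dns_latencies_ms is a hypothetical helper, and the answer sections of the response rows are trimmed for brevity.

```python
# Rows from the www.nfl.com (warm-up) and www.canada.ca (no warm-up) captures.
NFL_RUN = """
9391 51.919297 192.168.2.5 192.168.2.1 DNS 71 Standard query 0x01ca A www.nfl.com
9501 51.949922 192.168.2.1 192.168.2.5 DNS 174 Standard query response 0x01ca A www.nfl.com
9854 52.189114 192.168.2.5 192.168.2.1 DNS 71 Standard query 0xefd3 HTTPS www.nfl.com
9861 52.219438 192.168.2.1 192.168.2.5 DNS 168 Standard query response 0xefd3 HTTPS www.nfl.com
"""

CANADA_RUN = """
8173 26.588626 192.168.2.5 192.168.2.1 DNS 73 Standard query 0xd783 HTTPS www.canada.ca
8193 26.605788 192.168.2.5 192.168.2.1 DNS 73 Standard query 0xb3dc A www.canada.ca
8297 26.651477 192.168.2.1 192.168.2.5 DNS 164 Standard query response 0xb3dc A www.canada.ca
8298 26.652391 192.168.2.1 192.168.2.5 DNS 212 Standard query response 0xd783 HTTPS www.canada.ca
"""


def dns_latencies_ms(capture):
    """Return {(record_type, host): latency_ms}, matched on transaction id."""
    sent = {}
    latencies = {}
    for line in capture.strip().splitlines():
        fields = line.split()
        when = float(fields[1])
        info = fields[6:]  # everything after the Length column
        if info[2] == "response":
            txid, rtype, host = info[3:6]
            if txid in sent:
                latencies[(rtype, host)] = (when - sent[txid]) * 1000.0
        else:
            txid, rtype, host = info[2:5]
            sent[txid] = when
    return latencies


for name, run in (("nfl", NFL_RUN), ("canada", CANADA_RUN)):
    for key, ms in dns_latencies_ms(run).items():
        print(name, key, round(ms, 1))
```

In the www.canada.ca run the A lookup takes about 46 ms, which appears to be in line with the ~45ms figure mentioned above. In the www.nfl.com run the A answer is back roughly 240 ms before the HTTPS query is even sent.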
However, before we land this, there are a couple of items to resolve:
1 - We are planning on rolling out DoH on Android, bug 1801530, sooner rather than later. We don't want to leak the applink host via cleartext dns when DoH is enabled, so this warmupDNS code shouldn't run in that case
2 - In bug 1852752 we enabled HTTPS resource records (we race them against A records in native DNS). This patch will mean that we end up using the HTTPS RR less frequently for applink scenarios (probably not at all). Not sure if that's critical.
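Item 1 amounts to a guard in front of the warm-up call. A minimal sketch of that guard, assuming a hypothetical doh_enabled flag supplied by the app's settings; host_to_warm_up is an illustrative name, and the extra checks for non-HTTP schemes and IP literals are my own additions rather than anything taken from the patch.

```python
import ipaddress
from urllib.parse import urlsplit


def host_to_warm_up(url, doh_enabled):
    """Return the host that is safe to pre-resolve, or None to skip the warm-up."""
    if doh_enabled:
        # Never send the applink host over cleartext DNS when DoH is on.
        return None

    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return None

    try:
        ipaddress.ip_address(parts.hostname)
    except ValueError:
        return parts.hostname  # a real host name: worth resolving early
    return None  # an IP literal needs no DNS lookup
```

Keeping the decision in one pure function makes it easy to unit test the "DoH on means no warm-up" rule separately from the lookup itself.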
Comment 9•10 days ago
(In reply to Andrew Creskey [:acreskey] from comment #8)
> However, before we land this, there are a couple of items to resolve:
> 1 - We are planning on rolling out DoH on Android, bug 1801530, sooner rather than later. We don't want to leak the applink host via cleartext dns when DoH is enabled, so this warmupDNS code shouldn't run in that case
I wouldn't block on that. Let's file a bug blocking bug 1801530 to make sure enabling DoH disables the warmupDNS code.
> 2 - In bug 1852752 we enabled HTTPS resource records (we race them against A records in native DNS). This patch will mean that we end up using the HTTPS RR less frequently for applink scenarios (probably not at all). Not sure if that's critical.
I think that's probably OK, especially considering we don't do DoH yet. We could also try to warm up the HTTPS record, but I'm not sure if HTTPS records get cached in the OS resolver on Android.
Comment 10•10 days ago
Thanks for looking at this, Valentin.
Kaya, I created bug 1929005 to block bug 1801530, ensuring that we don't run this code when DoH is available on Fenix.
I believe we can proceed with this patch.