early DNS lookups fail with network.trr.mode=3 using network.trr.uri with hostname in it
Categories
(Core :: Networking: DNS, defect, P2)
People
(Reporter: steven, Assigned: valentin)
Details
(Whiteboard: [necko-triaged][trr])
User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:73.0) Gecko/20100101 Firefox/73.0
Steps to reproduce:
- Run Firefox Nightly (this issue was authored against build ID 20200225094028)
- Set a custom URL as the home page in Preferences > Home (e.g. https://start.duckduckgo.com)
- Set network.trr.uri to a DoH URL with a hostname in it, instead of a bare IP address (e.g. https://example.com/dns-query), but do not set any network.trr.bootstrapAddress values.
- Set network.trr.mode to 3
- Close all Nightly windows and start a new Nightly process
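For a repeatable setup, the preference changes in the steps above could be written to a profile's user.js instead of being set by hand. The following is only a sketch: the function name is hypothetical, `browser.startup.homepage` is assumed to be the pref behind the Preferences > Home setting, and the DoH URL is the placeholder from the steps, to be replaced with a real endpoint.

```python
import json


def trr_user_prefs(doh_uri: str, homepage: str) -> str:
    """Render user.js lines for the reproduction prefs (hypothetical helper)."""
    prefs = {
        # Assumption: this pref backs the custom home page setting.
        "browser.startup.homepage": homepage,
        "network.trr.uri": doh_uri,
        "network.trr.mode": 3,
        # Deliberately no network.trr.bootstrapAddress, as in the steps.
    }
    # json.dumps produces valid JavaScript literals for strings and integers.
    return "\n".join(
        f"user_pref({json.dumps(name)}, {json.dumps(value)});"
        for name, value in prefs.items()
    )


if __name__ == "__main__":
    # Placeholder DoH URL from the report; substitute a real DoH endpoint.
    print(trr_user_prefs("https://example.com/dns-query",
                         "https://start.duckduckgo.com"))
```

The printed lines would go into user.js in a test profile before the new Nightly process is started.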
Actual results:
On startup, the home page hostname fails to resolve, causing the browser to display "Hmm. We're having trouble finding that site." If a manual refresh is requested, it loads fine.
Expected results:
The home page should load on the first try.
What I think is happening:
The TRR resolver service is not using getaddrinfo() for the hostname in trr.uri early enough in browser startup.
If I specify the IP for the hostname in network.trr.uri via network.trr.bootstrapAddress, the homepage loads fine on startup without the need for a manual refresh. But setting the bootstrap address should not be necessary on Firefox 74+ from what I've read about network.trr.mode=3's intended behavior.
Evidence for this is that if I set MOZ_LOG=nsHostResolver:5 and run the reproduction steps, TrrLookup responds to all the early lookup requests with the message "service not enabled". Later in the startup it does the lookup for the TRR URI's hostname via getaddrinfo() and succeeds, which enables all the subsequent TRR requests to work properly.
Note also that this affects a lot of other lookups in early startup beyond what the homepage would have required (e.g. incoming.telemetry.mozilla.org, profile.accounts.firefox.com, etc).
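The log check described above could be automated roughly as follows. This is a sketch that assumes the log was captured to a file (for example by also setting MOZ_LOG_FILE); it only searches for the "service not enabled" text quoted above and assumes nothing else about the log line layout. The function name is hypothetical.

```python
import sys


def count_trr_disabled(log_lines):
    """Count log lines that contain the 'service not enabled' message."""
    needle = "service not enabled"
    return sum(1 for line in log_lines if needle in line)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python count_trr_disabled.py <path-to-log-file>
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        print(count_trr_disabled(log))
```

A non-zero count for a startup run would indicate that some early lookups were rejected before the TRR service was ready.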
Comment 1•5 years ago
Bugbug thinks this bug should belong to this component, but please revert this change in case of error.
Comment 2•5 years ago
Valentin, can we make this better? Probably we can; we could also remember the last used TRR server's IP address. Or is the confirmation code the problem?
Comment 3•5 years ago (Assignee)
Yes, the problem is the confirmation code - however, for mode 3 I think we can just ignore the confirmation, and all resolutions would just wait for the TRR connection to be established. At worst it would just fail if the TRR connection returns an error or times out.
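The idea of letting mode 3 resolutions wait for the TRR connection, rather than failing immediately, can be illustrated with a toy model. This is not Necko code: the class, its flag, and the timeout value are all hypothetical, and the sketch only contrasts "fail early" with "wait, then succeed or time out".

```python
import asyncio


class ToyTrr:
    """Toy model of a TRR service; all names here are hypothetical."""

    def __init__(self, wait_for_connection: bool):
        self.wait_for_connection = wait_for_connection
        self.connected = asyncio.Event()

    async def resolve(self, host: str, timeout: float = 1.0) -> str:
        if not self.connected.is_set():
            if not self.wait_for_connection:
                # Behaviour from the report: early lookups are rejected.
                raise RuntimeError("service not enabled")
            # Proposed behaviour: wait for the connection; at worst this
            # raises asyncio.TimeoutError when the connection never comes up.
            await asyncio.wait_for(self.connected.wait(), timeout)
        return f"resolved {host} via TRR"


async def demo() -> str:
    trr = ToyTrr(wait_for_connection=True)
    # The lookup is issued before the TRR connection exists.
    early = asyncio.create_task(trr.resolve("start.duckduckgo.com"))
    await asyncio.sleep(0.01)
    trr.connected.set()  # the TRR connection is established later
    return await early


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

With `wait_for_connection=False` the same early lookup raises right away, which corresponds to the failed first page load in the report.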
Comment 4•5 years ago (Assignee)
Note that this probably also happens for a TRR uri that specifies an IP, but it's just faster if it doesn't have to wait for the TRR's DNS resolution to complete.
I'll try to write up a fix next week.
Comment 8•5 years ago
Backed out changeset ed75364b23c3 (Bug 1618042) for xpcshell failures in test_trr.js
Push with failures: https://treeherder.mozilla.org/#/jobs?repo=autoland&fromchange=22ff475bd4e4993a6e2647ee5f37f8394c565c22&tochange=b7d06bfdd0d5655fba6c53d7d86070b1f2f2ace6&searchStr=xpc
Backout link: https://hg.mozilla.org/integration/autoland/rev/b7d06bfdd0d5655fba6c53d7d86070b1f2f2ace6
Failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=292475475&repo=autoland&lineNumber=3329
[task 2020-03-10T15:24:40.681Z] 15:24:40 INFO - TEST-START | netwerk/test/unit/test_trr.js
[task 2020-03-10T15:24:49.208Z] 15:24:49 WARNING - TEST-UNEXPECTED-FAIL | netwerk/test/unit/test_trr.js | xpcshell return code: 0
[task 2020-03-10T15:24:49.209Z] 15:24:49 INFO - TEST-INFO took 8525ms
Comment 12•5 years ago
Hi guys, tried verifying this issue, but by following the exact steps from Comment 0, the "Hmm. We’re having trouble finding that site." message is still displayed. Tried on 77.0a1 (2020-04-27) on Windows 7 (x64). Valentin, any idea about this? Thanks!
Comment 13•5 years ago (Assignee)
(In reply to Catalin Sasca, QA [:csasca] from comment #12)
> Hi guys, tried verifying this issue, but by following the exact steps from Comment 0, the "Hmm. We’re having trouble finding that site." message is still displayed. Tried on 77.0a1 (2020-04-27) on Windows 7 (x64). Valentin, any idea about this? Thanks!
What value did you use for network.trr.uri?
Comment 14•5 years ago
I used the example from Comment 0, step 3 - (https://example.com/dns-query).
Comment 15•5 years ago (Assignee)
(In reply to Catalin Sasca, QA [:csasca] from comment #14)
> I used the example from Comment 0, step 3 - (https://example.com/dns-query).
> Set network.trr.uri to a DoH URL with a hostname in it, instead of a bare IP address (e.g. https://example.com/dns-query), but do not set any network.trr.bootstrapAddress values.
https://example.com/dns-query doesn't host a DNS over HTTPS server, so the fact that it doesn't work is to be expected.
I think you can use the default value for the pref - https://mozilla.cloudflare-dns.com/dns-query
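One way to tell whether a URL actually hosts a DNS over HTTPS server is to send it a DoH query by hand. The sketch below only builds the RFC 8484 GET URL for an A-record lookup; the helper names are hypothetical and the network request itself is left out. A real DoH server would be expected to answer such a GET (sent with an `Accept: application/dns-message` header) with a response of that same content type.

```python
import base64
import struct


def build_dns_query(hostname: str, qtype: int = 1) -> bytes:
    """Encode a minimal DNS query (one question, class IN) in wire format."""
    # Header: ID=0, flags=0x0100 (recursion desired), QDCOUNT=1, rest 0.
    header = struct.pack("!HHHHHH", 0, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE (1 = A by default) and QCLASS (1 = IN).
    return header + qname + struct.pack("!HH", qtype, 1)


def doh_get_url(endpoint: str, hostname: str) -> str:
    """Build an RFC 8484 GET URL: base64url-encoded query, padding removed."""
    encoded = base64.urlsafe_b64encode(build_dns_query(hostname)).rstrip(b"=")
    return f"{endpoint}?dns={encoded.decode('ascii')}"


if __name__ == "__main__":
    print(doh_get_url("https://mozilla.cloudflare-dns.com/dns-query",
                      "www.example.com"))
```

Fetching the printed URL with any HTTP client and checking the status code and content type of the response would then distinguish a DoH endpoint from an ordinary web page.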
Comment 16•5 years ago
Yeah, thanks Valentin, that was indeed only an example; I got a little confused :)).
I was able to reproduce the issue on Firefox Nightly (2020-02-25) under Windows 7 (x64) following the STR from Comment 0.
The issue is no longer reproducible on latest Nightly 77.0a1 (2020-04-29) and Firefox 76.0. Tests were performed on Windows 7 (x64), Ubuntu 18.04 (x64) and macOS 10.15.4.
Comment 17•5 years ago (Assignee)
Thanks for confirming, Catalin!