Closed Bug 1625865 Opened 5 years ago Closed 5 years ago

DoH server IP changes to fallback IP and is cached forever

Categories: Core :: Networking: DNS, defect, P1
Version: 75 Branch
Status: RESOLVED INCOMPLETE
People: Reporter: bugzilla, Assigned: valentin, NeedInfo
References: Blocks 1 open bug
Whiteboard: [necko-triaged][trr]
Attachments: 2 files (1.79 MB and 2.87 MB, both application/octet-stream)

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:75.0) Gecko/20100101 Firefox/75.0

Steps to reproduce:

I've set up an AdGuard Home DNS server on my VPS, enabled encryption (DoH and DoT), and tried to use it.
https://github.com/AdguardTeam/AdGuardHome

Actual results:

But if the server is not reachable (I tried to pass the SSL connections through an nginx server, or set unavailable IPs in DNS), TRR falls back to my router's address, i.e. legacy resolving.
After I fixed my server's config (removed nginx for now), Firefox still tries to use the router's IP address to connect to my DoH domain.
Moreover, when I try to open the dashboard on that domain, it loads the router's dashboard.
And that domain isn't even listed in about:networking#dns!
I am able to load that dashboard ONLY on a new clean profile. And when I reproduce the problem that profile is wasted too.

Expected results:

The DoH IP-address must be re-resolved from time to time, ideally following TTL.

Bugbug thinks this bug should belong to this component, but please revert this change in case of error.

Component: Untriaged → Networking: DNS
Product: Firefox → Core
Priority: -- → P1
Whiteboard: [necko-triaged][trr]

Valentin, please take a look?

Flags: needinfo?(valentin.gosu)
Priority: P1 → --

I believe the priority was cleared by accident.

Priority: -- → P1

Thanks for the report. I am very curious how this ends up happening.

(In reply to Revertron from comment #0)

But if the server is not reachable (I tried to pass the SSL connections through an nginx server, or set unavailable IPs in DNS), TRR falls back to my router's address, i.e. legacy resolving.
After I fixed my server's config (removed nginx for now), Firefox still tries to use the router's IP address to connect to my DoH domain.
Moreover, when I try to open the dashboard on that domain, it loads the router's dashboard.

If it doesn't appear in about:networking#dns that probably means that we are falling back to the system DNS.
The operating system may also cache the DNS name for a while. I recommend setting the TTL to at least 60s and seeing what happens if you try again after waiting for one minute.
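One way to check whether the operating system is still handing out the stale address is to ask the system resolver directly, wait out the TTL, and ask again. The following is only a sketch: the helper names are made up for illustration, `doh.example.net` is a placeholder for the real DoH hostname, and the 60-second wait simply mirrors the TTL suggested above.

```python
import socket
import time


def resolve_ips(host, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IPs the resolver reports for host."""
    # Port 443 because a DoH endpoint is reached over HTTPS.
    infos = resolver(host, 443, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def compare_after_wait(host, wait_seconds=60,
                       resolver=socket.getaddrinfo, sleep=time.sleep):
    """Resolve host, wait roughly one TTL, resolve again; return both results."""
    before = resolve_ips(host, resolver)
    sleep(wait_seconds)
    after = resolve_ips(host, resolver)
    return before, after


# Example use (hypothetical hostname, performs real lookups and sleeps):
#   before, after = compare_after_wait("doh.example.net")
#   print("first lookup: ", before)
#   print("second lookup:", after)
```

If both lookups still return the router's address after the TTL has passed, the stale answer is coming from the system resolver (or something upstream of it) rather than from Firefox's own DNS cache.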

And that domain isn't even listed in about:networking#dns!

So, if the connection to the DoH server is still active, it might be reused instead of re-resolved, without doing another DNS request.
In about:networking#http you should see at least one active connection for that domain.

I am able to load that dashboard ONLY on a new clean profile. And when I reproduce the problem that profile is wasted too.

Is this a persistent problem, meaning it doesn't go away after a restart?
That is unexpected.
Do you think you could help us with some logs? https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging
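For reference, collecting such a log usually comes down to starting Firefox with the `MOZ_LOG` and `MOZ_LOG_FILE` environment variables set. The sketch below assumes the module list commonly given on the linked HTTP logging page (please check the page for the current value); the function names, the Firefox binary path, and the profile directory are placeholders and not part of this report.

```python
import os
import subprocess

# Module list as commonly documented for HTTP logging; treat it as an
# assumption and verify it against the linked page.
HTTP_LOG_MODULES = (
    "timestamp,rotate:200,nsHttp:5,cache2:5,"
    "nsSocketTransport:5,nsHostResolver:5"
)


def logging_env(log_file, base_env=None):
    """Return a copy of the environment with HTTP logging enabled."""
    env = dict(os.environ if base_env is None else base_env)
    env["MOZ_LOG"] = HTTP_LOG_MODULES
    env["MOZ_LOG_FILE"] = log_file
    return env


def launch_with_logging(firefox_path, profile_dir, log_file):
    """Start Firefox on the given profile with logging enabled."""
    # firefox_path and profile_dir are supplied by the caller (hypothetical).
    return subprocess.Popen(
        [firefox_path, "-no-remote", "-profile", profile_dir],
        env=logging_env(log_file),
    )
```

Running this once against the affected profile and once against a clean profile would yield a comparable pair of logs.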

Expected results:

The DoH IP-address must be re-resolved from time to time, ideally following TTL.

This is what should normally happen.
If you are able to gather some logs we can try to make sense of this.

Assignee: nobody → valentin.gosu
Flags: needinfo?(valentin.gosu) → needinfo?(bugzilla)
Attached file log_bad.zip

HTTP log captured on a profile that shows the error.
The TRR domain always resolves to 10.0.0.1.
Good log follows.

Attached file log_good.zip

Good log from a new profile.
I couldn't break my server the same way again, so the two logs come from separate profiles.

Flags: needinfo?(bugzilla)

(In reply to Revertron from comment #6)

Created attachment 9138261 [details]
log_good.zip

Good log from a new profile.
I couldn't break my server the same way again, so the two logs come from separate profiles.

Thanks for the log.
It seems that we do 2 native resolves for dns.wyrd.link, then just use that for the rest of the session.
The server seems to be responsive throughout the session, so it seems natural to just reuse the connection.
But you didn't answer my previous questions:

What happens when you restart Firefox? Does it still use the previous bad IP?

Flags: needinfo?(bugzilla)

(In reply to Valentin Gosu [:valentin] (he/him) from comment #7)

But you didn't answer my previous questions:

What happens when you restart Firefox? Does it still use the previous bad IP?

Ah, my bad.
Yes, I'm updating the beta from time to time, closing it and switching between profiles, and that behavior stays no matter what :)

Flags: needinfo?(bugzilla)

(In reply to Revertron from comment #8)

(In reply to Valentin Gosu [:valentin] (he/him) from comment #7)

But you didn't answer my previous questions:

What happens when you restart Firefox? Does it still use the previous bad IP?

Ah, my bad.
Yes, I'm updating the beta from time to time, closing it and switching between profiles, and that behavior stays no matter what :)

So, on the bad profile you had bootstrapAddress=10.0.0.1.
When this happens after switching profiles, do you still have a bootstrapAddress pref set on those profiles?
That could indeed cause it to always resolve to the same thing. Could you check?
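A quick way to check is to look for TRR prefs in the profile's `prefs.js` (and `user.js`, if present). The sketch below assumes the pref lives under the `network.trr.` prefix (for example `network.trr.bootstrapAddress`) and that the files use the usual `user_pref("name", value);` line format; the function names and the profile path are placeholders.

```python
import re
from pathlib import Path

# Matches lines such as: user_pref("network.trr.mode", 2);
PREF_LINE = re.compile(r'^user_pref\("(network\.trr\.[^"]+)",\s*(.*)\);\s*$')


def trr_prefs(text):
    """Extract network.trr.* prefs from the text of a prefs.js-style file."""
    prefs = {}
    for line in text.splitlines():
        match = PREF_LINE.match(line.strip())
        if match:
            # Keep values as strings; drop surrounding quotes on string prefs.
            prefs[match.group(1)] = match.group(2).strip().strip('"')
    return prefs


def trr_prefs_for_profile(profile_dir):
    """Collect network.trr.* prefs from prefs.js and user.js in a profile."""
    found = {}
    for name in ("prefs.js", "user.js"):
        path = Path(profile_dir) / name
        if path.is_file():
            text = path.read_text(encoding="utf-8", errors="replace")
            found.update(trr_prefs(text))
    return found


# Example use (hypothetical profile path):
#   for name, value in trr_prefs_for_profile("/path/to/profile").items():
#       print(name, "=", value)
```

The same values can also be inspected interactively in about:config by filtering for `network.trr`.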

Flags: needinfo?(bugzilla)

I have a similar issue. When simply enabling "DNS over HTTPS" from "General > Network Settings > Settings", TRR works fine for a while and then it falls back to the system DNS servers. "about:networking" shows "TRR" as "false" for all DNS entries.

If I force TRR to mode 3, it works fine for a few hours and then some websites stop working until I restart Firefox. I have no issue using Cloudflare with DNS over HTTPS in other DNS forwarders, iOS VPNs etc. It's only when Firefox uses DoH directly that it stops working after some time.

Blocks: doh

I encountered a more serious issue where even with TRR mode 3, Firefox falls back to the system resolver after an extended duration and "about:networking" shows "TRR" as "false" for all DNS entries. Restarting the browser does not fix the issue, but restarting Windows does. Very strange.

(In reply to Kurian Thampy from comment #11)

I encountered a more serious issue where even with TRR mode 3, Firefox falls back to the system resolver after an extended duration and "about:networking" shows "TRR" as "false" for all DNS entries. Restarting the browser does not fix the issue, but restarting Windows does. Very strange.

I suspect you either have a VPN or proxy on your computer?

Flags: needinfo?(kathampy)

(In reply to Valentin Gosu [:valentin] (he/him) from comment #13)

I suspect you either have a VPN or proxy on your computer?

There are no VPN adapters or proxy settings in Windows at all, and no VPN / proxy add-ons in Firefox. The Wi-Fi disconnects several times a day, and I leave Firefox running throughout. There are no other active adapters with a gateway configured, so for the duration the Wi-Fi is disconnected, the system has no default route. Perhaps this is triggering something.

Flags: needinfo?(kathampy)

Previously when TRR stopped working after some time in mode 3, Firefox would not resolve any domains and restarting the browser would fix it. In Firefox 77, when TRR stops working in mode 3, it's falling back to the system resolver.

(In reply to Kurian Thampy from comment #15)

Previously when TRR stopped working after some time in mode 3, Firefox would not resolve any domains and restarting the browser would fix it. In Firefox 77, when TRR stops working in mode 3, it's falling back to the system resolver.

Please open a separate bug and attach some logs: https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging

Closing, as the reporter didn't follow up with an answer to the needinfo.

Status: UNCONFIRMED → RESOLVED
Closed: 5 years ago
Resolution: --- → INCOMPLETE