firefox stops DNS resolving certain hostnames
Categories
(Core :: Networking: DNS, defect, P2)
People
(Reporter: gustavo, Assigned: valentin)
Details
(Whiteboard: [necko-triaged])
User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0
Steps to reproduce:
tried to open a website that I often use
Actual results:
the tab remained blank while the progress indicator on the tab handle moved left and right, at the bottom of the window I could read "resolving xxx.domain.yyy"
Expected results:
the webpage should have opened
Comment 1 (Reporter) • 5 years ago
Here goes contextual information for this bug.
Firefox reaches a state in which it stops resolving certain DNS names: some URLs cannot be opened again until Firefox is restarted.
However:
- other websites will open normally without any name resolution problem
- the non-resolving websites resolve fine in the command line and on Chrome
- they also resolve fine on Firefox on a private window
- they also resolve if I temporarily turn on DNS over HTTPS
- if I then turn DNS over HTTPS off, they stop resolving again (once the cache expires)
Environment:
- Ubuntu Linux 16.04 LTS
- OpenVPN with very long running VPN sessions
Possibly related reports:
Comment 2 (Reporter) • 5 years ago
Additional info:
- neither shift+R nor ctrl+shift+R forces the reload
- in the Network section of the Web Developer tools we only see the GET, which never returns
- not sure if this could be related: https://bugzilla.mozilla.org/show_bug.cgi?id=1405307
Comment 4 (Assignee) • 5 years ago
(In reply to Gustavo Homem from comment #1)
> Here goes contextual information for this bug.
> - they also resolve fine on Firefox on a private window
This is quite interesting. Thanks for bringing it up.
If this is true, it would mean that the initial getaddrinfo call for xxx.domain.yyy is stuck because of VPN interactions.
Making other requests for the same domain will not trigger a different call to getaddrinfo; instead, the callback is appended to the initial call, which doesn't complete.
Doing this in private browsing causes the DNS entry to have a different key, so we do a separate getaddrinfo, which now succeeds.
It seems likely that, when the DNS cache is cleared, we also need a way to restart active DNS queries.
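The behaviour described in this comment can be modelled with a small sketch. This is not Firefox's resolver code: `CoalescingResolver`, `restart_pending`, and the `(host, private)` key are hypothetical names chosen for illustration, and the blocking `lookup` function is injected so that a hanging getaddrinfo can be simulated. It shows one lookup shared per key, a private request using a different key, and one possible way to restart lookups that are still pending.

```python
import threading
from typing import Callable, Dict, List, Tuple

Key = Tuple[str, bool]                    # (hostname, is_private), an assumed key shape
Callback = Callable[[str, object], None]


class CoalescingResolver:
    """Illustrative model only: one blocking lookup is shared per key."""

    def __init__(self, lookup: Callable[[str], object]) -> None:
        self._lookup = lookup             # blocking call, stands in for getaddrinfo
        self._lock = threading.Lock()
        self._pending: Dict[Key, List[Callback]] = {}
        self._generation: Dict[Key, int] = {}
        self.lookups_started = 0

    def resolve(self, host: str, callback: Callback, private: bool = False) -> None:
        key = (host, private)
        with self._lock:
            if key in self._pending:
                # A lookup for this key is already running: just queue the
                # callback. If that lookup never returns, neither does this one.
                self._pending[key].append(callback)
                return
            self._pending[key] = [callback]
            self._start_locked(key)

    def _start_locked(self, key: Key) -> None:
        # Caller must hold self._lock.
        gen = self._generation.get(key, 0) + 1
        self._generation[key] = gen
        self.lookups_started += 1
        threading.Thread(target=self._worker, args=(key, gen), daemon=True).start()

    def _worker(self, key: Key, gen: int) -> None:
        try:
            result: object = self._lookup(key[0])   # may block indefinitely
        except OSError as exc:
            result = exc
        with self._lock:
            if self._generation.get(key) != gen:
                return                    # superseded by a restarted lookup
            callbacks = self._pending.pop(key, [])
        for cb in callbacks:
            cb(key[0], result)

    def restart_pending(self) -> None:
        """Start a fresh lookup for every key that is still waiting."""
        with self._lock:
            for key in list(self._pending):
                self._start_locked(key)
```

With a `lookup` that hangs on its first call, a second `resolve()` for the same host starts no new lookup, while `resolve(..., private=True)` does and can complete. Calling `restart_pending()` abandons the stuck thread rather than cancelling it, which is a simplification of whatever a real fix would need.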
Comment 5 (Reporter) • 5 years ago
Thank you for looking into this.
Additional comments for clarity:
- xxx.domain.yyy is a normal public domain, not at all related to the VPNs we often connect to. This affects several domains, not a specific one.
- You mention calls to getaddrinfo which do not complete. If I understand correctly, getaddrinfo is an OS-level call. OS-level name resolution is working for the domains Firefox stops resolving; that is what I meant by "resolve fine in the command line" in the initial report. Of course, it could be that it doesn't work in the seconds during VPN connection or disconnection, but after connection / disconnection it does.
If I understand correctly your comment, Firefox is prone to get domains unresolvable forever (until restarted) if a getaddrinfo call is done on a specific moment where the system DNS is failing. Therefore, the problem we are seeing here would not be specific to VPNs, but would also affect unreliable DNS situations, is that it?
Anyway, on our side we have very reliable DNS but we are suffering from this problem on a daily basis and killing/restarting Firefox so it is likely related to VPNs. I took a long time to report this because I wanted to make sure it really wasn't something wrong on our side. Thanks again for looking into this.
Comment 6 (Assignee) • 5 years ago
(In reply to Gustavo Homem from comment #5)
> If I understand correctly your comment, Firefox is prone to get domains unresolvable forever (until restarted) if a getaddrinfo call is done on a specific moment where the system DNS is failing. Therefore, the problem we are seeing here would not be specific to VPNs, but would also affect unreliable DNS situations, is that it?
This is arguably a bug in the system's getaddrinfo library. Do you happen to know if this also happens on OSX/Windows?
To resolve names, we make a blocking call to getaddrinfo on a dedicated thread. If another DNS request needs to be done for the same domain, we just append the callback to the ongoing getaddrinfo call and notify all of the consumers once it's done. Unfortunately, it's tough to tell why the call hangs forever. The private browsing call works because we don't just append the callback, but instead perform a separate call to getaddrinfo. This is probably why you can resolve it fine in the command line.
Also, do you think you could help us confirm this is the problem by sending us some HTTP logs of the issue?
https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging
Just go to about:networking and enable logging, reproduce the issue, wait for a few minutes, then stop the logging and send us the logs.
If you can reproduce this with a new profile that would be best, as it can sometimes capture sensitive info (domains you connect to, cookies).
Then email me a link to the logs via https://send.firefox.com/
Thanks!
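As an alternative to about:networking, logging can also be switched on at startup through environment variables. A minimal sketch, assuming the `MOZ_LOG` / `MOZ_LOG_FILE` variables and the module list as I understand them from the HTTP logging page linked above. The helper name `build_logging_launch` and all paths are hypothetical, and the `--no-remote` and `--profile` flags are assumed to apply to your build.

```python
import os
from typing import Dict, List, Optional, Tuple

# Assumed module list; check it against the HTTP logging page before relying on it.
MOZ_LOG_MODULES = (
    "timestamp,rotate:200,nsHttp:5,cache2:5,nsSocketTransport:5,nsHostResolver:5"
)


def build_logging_launch(
    firefox_bin: str,
    profile_dir: str,
    log_file: str,
    base_env: Optional[Dict[str, str]] = None,
) -> Tuple[List[str], Dict[str, str]]:
    """Return (argv, env) for starting Firefox with network logging enabled."""
    env = dict(os.environ if base_env is None else base_env)
    env["MOZ_LOG"] = MOZ_LOG_MODULES
    env["MOZ_LOG_FILE"] = log_file
    # A separate profile directory keeps everyday browsing data out of the log.
    argv = [firefox_bin, "--no-remote", "--profile", profile_dir]
    return argv, env
```

The result can be passed to `subprocess.Popen(argv, env=env)`. The profile directory is assumed to exist already.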
Comment 7 (Reporter) • 5 years ago
Hi Valentin,
Thanks for sharing the details. We don't have OSX or Windows here so I have no information on that.
I will prepare the logs for you on Linux.
A quick Google search reveals some reports that getaddrinfo hangs in certain situations - not clear which ones. Not sure if Firefox should expect that situation and handle it - IMO getaddrinfo should return after a reasonable timeout, but there could be something complex that I am missing.
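The idea of bounding a lookup with a timeout can be sketched on the caller's side. This is only an illustration of the suggestion above, not something Firefox is known to do: `resolve_with_timeout` is a hypothetical helper that runs `socket.getaddrinfo` in a daemon thread and gives up waiting after `timeout` seconds.

```python
import queue
import socket
import threading


def resolve_with_timeout(host, port=None, timeout=5.0, lookup=socket.getaddrinfo):
    """Run a blocking lookup in a worker thread and stop waiting after `timeout`."""
    box = queue.Queue(maxsize=1)

    def worker():
        try:
            box.put((True, lookup(host, port)))
        except OSError as exc:            # socket.gaierror is a subclass of OSError
            box.put((False, exc))

    # Daemon thread: a lookup that never returns will not keep the process alive.
    threading.Thread(target=worker, daemon=True).start()

    try:
        ok, value = box.get(timeout=timeout)
    except queue.Empty:
        raise TimeoutError(
            f"lookup of {host!r} did not finish within {timeout}s"
        ) from None
    if not ok:
        raise value
    return value
```

Note that the underlying call is not cancelled. The worker thread stays blocked until getaddrinfo returns, so the timeout only frees the caller to retry or report an error.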
Comment 8 (Assignee) • 5 years ago
Hi Gustavo, I haven't received the logs yet. Please let me know if you were able to send them.
Thanks!
Comment 9 (Reporter) • 5 years ago
Hi Valentin,
I have not produced the logs because sadly I have not run into the situation again yet. Obtaining the logs will be the first thing that I do once I run into the situation again.
Cheers,
Gustavo
Comment 10 (Assignee) • 5 years ago
Lacking logs it is difficult to figure out how to fix this.
Please let me know if it happens again and reopen the bug.
Thanks!