Closed Bug 1405307 Opened 8 years ago Closed 6 years ago

After connecting to VPN Firefox is unable to navigate to sites inside the VPN but Chrome can

Categories

(Core :: Networking, defect, P2)

55 Branch
x86_64
Linux
defect

Tracking

()

RESOLVED DUPLICATE of bug 968273

People

(Reporter: tnmurphy, Assigned: CuveeHsu)

Details

(Whiteboard: [necko-triaged])

Attachments

(7 files)

User Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.91 Safari/537.36

Steps to reproduce:

I loaded Firefox then I connected to my VPN then I tried to navigate to an intranet site within that VPN. I am using Fedora Linux 26 on amd64. My VPN type is 'cisco'.

Actual results:

My internet provider's page (Virgin Media UK) for an unknown website appeared.

Expected results:

The same as Google Chrome - the intranet site should have appeared. As far as I can tell this is a DNS issue because when I put in IP addresses it works. The 'host' utility on my machine returns a correct address. /etc/resolv.conf contains the IP addresses of the VPN's name servers. I tried the version 58 nightly release for 03 October 2017 and it did the same thing.
The user agent of the browser with which I had the problem was: Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:55.0) Gecko/20100101 Firefox/55.0
OS: Unspecified → Linux
Hardware: Unspecified → x86_64
As far as I am aware this has been a long standing problem and isn't a regression - I've had it in many previous versions of Firefox.
Component: Untriaged → Networking
Product: Firefox → Core
Hi there, I have the same problem on OS X. Let me describe it in my own words. If already connected to any network, Firefox is able to retrieve data. If I switch the VPN or disable it while Firefox is still open, Firefox stalls and seems unable to retrieve any page data. Chrome, however, can. (It may be due to the DNS servers changing. Maybe Chrome uses a better mechanism for DNS handling, because of advertising, to increase hits under any circumstances, while Firefox seems to cache the DNS server entries?)
I use Firefox 56.0 (64-Bit on OS X)
@Daniel do you know if Gecko flushes the DNS cache when enabling/disabling a VPN?
Flags: needinfo?(daniel)
Firefox will flush the DNS cache when it detects a "network change", which is basically when network interfaces come and go or if they change addresses. Adding, changing or removing a VPN would cause one of those things and should thus have made Firefox flush the DNS cache and do the other things it does for network changes.

The description here makes it sound to me as if the network change that starting the VPN should cause wasn't detected by Firefox.

Tim, as I read this, you get this problem when you start Firefox and then the VPN. If you do it in the other order, i.e. first start the VPN and then Firefox, I presume it works properly then?

Also, if you enable HTTP logging [*] (and include the 'nsNotifyAddr' module) before the VPN is started and then start the VPN, it will be interesting to know whether Firefox properly registers that as a network change or not.

[*] = https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging
Flags: needinfo?(daniel) → needinfo?(tnmurphy)
Attached file log.txt-main.20514.gz
Gzipped log from a user session. The VPN was off; I logged in and set up logging from about:networking with the nsNotifyAddr module added (no numeric parameter, so it defaulted to nsNotifyAddr:0). Then I connected to the VPN and turned off logging. I got the bug report ready, and before sending it I decided to try the internal site just to be sure I was reproducing the problem, and the darn thing worked (i.e. I saw the internal site)! I then did the whole experiment again 3 times and got the buggy behaviour where I was stuck.
Flags: needinfo?(tnmurphy)
First I want to apologise that my first log file was from Nightly Firefox - I am flipping between that and my Linux distro's version, which has also been updated from version 55 to 56. There seems to be a difference between 55/56 and my Nightly build, 58.0a1 (2017-10-22) (64-bit), in that the older ones never "recover" whereas Nightly now seems to eventually load intranet pages after a minute or two. I think this is new in the last week or two.
I have a similar problem with Firefox 57 b12 and OSX. My VPN client is Cisco.

1. Open Firefox
2. Go to www.google.com -> It works
3. Connect to VPN using the Cisco VPN client
4. Try www.google.com again -> It hangs

It's consistent and easily reproducible. This does not happen with Chrome. It's an annoying problem and is making me consider going back to Chrome :-( (and I don't like that).
And the funny thing is that this does not occur on other websites. Do search engines/Google have special handling?
Assigning it to Daniel for now.
Assignee: nobody → daniel
Priority: -- → P2
Whiteboard: [necko-triaged]
I'm seeing a possibly related issue to this, which only affects Firefox and it seems to be related to a specific domain.

How to reproduce:

1. Open Firefox (tested on 56 and 57 beta)
2. Open these 2 URLs:
   https://api.windscribe.com/MyIp?time=1490793491114&client_auth_hash=64cad0e8dcc9cfeb0571daebada62fe3&ddyyd
   https://api.8829b1c1913ee6c1dfb78a10c8f9a908954f20d6.com/MyIp?time=1490793491114&client_auth_hash=64cad0e8dcc9cfeb0571daebada62fe3
3. Connect to an OpenVPN based VPN
4. Refresh the 2 pages

Expected behavior: should show the new IP as provided by the VPN

Actual behavior: First URL still shows the old (non-VPN) IP, while the 2nd URL shows the correct IP.

This issue "self-rectifies" after 5-10 minutes and it shows the correct IP. This does not occur in Chrome or Edge.
Forgot to mention, it also shows the correct IP if you restart the browser.
I came back to Firefox from Chrome and was very positive about the Quantum release. This is the only issue that is forcing me back to Chrome and I hope this gets fixed soon. Thanks!
I forgot to mention that I am using Tunnelblick 3.7.4a (build 4920) on Mac OS X, and on GNU/Linux (Debian) I use openvpn. The problem currently also occurs with the latest nightly build. Restarting Firefox, however, is a workaround. (However, a VPN should not be considered too stable, so you might find yourself restarting Firefox up to 4 times an hour, which can be quite disturbing and cumbersome.) Besides Chrome, the problem also does not occur in Chromium, the underlying open-source code of Chrome. Kind regards, Alex
It seems to be related to TCP states. We run pfSense as a router. If I clear the states associated with the machine running Firefox in the firewall, the issue gets "resolved".
We are working on something that could help with this problem but it will not be available until next year. We should probably contact the firewall company to fix this issue. Alex, are you using the same firewall or some other firewall?
Flags: needinfo?(info)
Hi I use iptables on OS X... additionally I use "Little Snitch"... however deactivating it, does not change the behavior as described above. I just wanted to report the bug. No hurry, to fix it.
Flags: needinfo?(info)
So any idea when this is going to be fixed? Last update was 11 months ago and honestly this is pretty annoying. I switched to Firefox for privacy reasons, yet you guys can't seem to support VPN connectivity while Chrome seems to work flawlessly. I'll be switching browsers if this doesn't get fixed soon, as it's extremely annoying and makes Firefox unusable.
We "fixed" the issue in our VPN software by forcibly terminating all TCP sockets after tunnel up. This resolved the issue above.
What software are you using for the vpn and how are you terminating all sockets?
Windscribe, but you can use the following utility to do this manually: https://www.nirsoft.net/utils/cports.html
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Assignee: daniel → nobody
Status: ASSIGNED → NEW
See Also: → 1522673

This is not the same issue as bug 1522673.

Junior, can you take a look into this?

Flags: needinfo?(juhsu)
See Also: 1522673

It's pretty weird. No nsHttpChannel, except those for detect portal and telemetry, was created after "network:link-status-changed" in comment 8.
The DNS cache is cleared, which is good.

Comment 20 catches my eye since it hints that we might wrongly reuse existing connections.
In that case, we should see some nsHttpChannels.

We might need more information based on logs in comment 7 and comment 8.
And, I can try to reproduce when I get a chance.

Assignee: nobody → juhsu
Status: NEW → ASSIGNED
Flags: needinfo?(juhsu)

I cannot reproduce this with Viscosity or Tunnelblick on macOS, the setup that comment 9 indicated (with either a local site or a public site).

Want to make some progress here.

Is it still an issue in the current version? If yes, could you make an HTTP log with "nsNotifyAddr:5" prepended to the log modules?
It would be good to gather the log like this:
(1) enable logging
(2) visit the website by refreshing or entering it in the URL bar
(3) enable VPN
(4) visit the website again (which hangs or returns failure)

https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging
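
For anyone who prefers to enable the logging from a terminal instead of about:networking, here is a minimal sketch. It assumes the MOZ_LOG and MOZ_LOG_FILE environment variables described on the HTTP logging page above. The base module list, the log path and the helper name `firefox_log_env` are illustrative assumptions, not something prescribed in this bug.

```python
import os


def firefox_log_env(log_file, extra_modules=("nsNotifyAddr:5",), base_env=None):
    """Return a copy of the environment with Firefox HTTP logging enabled.

    The base module list below is an assumption taken from the HTTP logging
    page; adjust it if that page recommends something else for your version.
    """
    base_modules = ["timestamp", "nsHttp:5", "nsSocketTransport:5", "nsHostResolver:5"]
    env = dict(os.environ if base_env is None else base_env)
    # Prepend the extra modules (nsNotifyAddr:5 by default) to the list.
    env["MOZ_LOG"] = ",".join(list(extra_modules) + base_modules)
    env["MOZ_LOG_FILE"] = log_file
    return env


if __name__ == "__main__":
    # Hypothetical log location; pick any writable path.
    env = firefox_log_env("/tmp/firefox-vpn-log.txt")
    print("MOZ_LOG=" + env["MOZ_LOG"])
    print("MOZ_LOG_FILE=" + env["MOZ_LOG_FILE"])
    # To start the browser with logging on (assumes "firefox" is on PATH):
    #   import subprocess
    #   subprocess.run(["firefox"], env=env)
```

Start the browser this way before step (1), then follow steps (2) to (4) and attach the resulting main-process log.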

Flags: needinfo?(yegors)
Flags: needinfo?(tnmurphy)
Flags: needinfo?(info)

I'm unable to reproduce the issue anymore.

Flags: needinfo?(yegors)
Attached file log2.txt-main.20064.gz
Flags: needinfo?(tnmurphy)

(In reply to Junior [:junior] from comment #26)

> Want to make some progress here.
>
> Is it still an issue in the current version? If yes, could you make an HTTP log with "nsNotifyAddr:5" prepended to the log modules?
> It would be good to gather the log like this:
> (1) enable logging
> (2) visit the website by refreshing or entering it in the URL bar
> (3) enable VPN
> (4) visit the website again (which hangs or returns failure)

This instruction is the wrong way around. The website that cannot be reached is inside the VPN - you cannot see it from outside. The problem is that when you do connect to the VPN, Firefox still cannot reach it because somehow it's still resolving addresses the old way. After a long time (tens of minutes in my case) the browser does catch up with the system, but not as quickly as Chrome does.

(In reply to Tim Murphy from comment #31)

> (In reply to Junior [:junior] from comment #26)
>
> > Want to make some progress here.
> >
> > Is it still an issue in the current version? If yes, could you make an HTTP log with "nsNotifyAddr:5" prepended to the log modules?
> > It would be good to gather the log like this:
> > (1) enable logging
> > (2) visit the website by refreshing or entering it in the URL bar
> > (3) enable VPN
> > (4) visit the website again (which hangs or returns failure)
>
> This instruction is the wrong way around. The website that cannot be reached is inside the VPN - you cannot see it from outside. The problem is that when you do connect to the VPN, Firefox still cannot reach it because somehow it's still resolving addresses the old way. After a long time (tens of minutes in my case) the browser does catch up with the system, but not as quickly as Chrome does.

Thanks for the description and the log.
I mentioned step (2) because some people also see the problem with public sites.

Hello Tim,
Could you please also provide a log for the good case of intranet access?
The main file is enough (i.e., non-child).

Thanks!

Flags: needinfo?(tnmurphy)

(In reply to Junior [:junior] from comment #33)

> Hello Tim,
> Could you please also provide a log for the good case of intranet access?
> The main file is enough (i.e., non-child).
>
> Thanks!

I mean the good case using the host name instead of the IP address.
It looks like it works after some minutes (or probably after restarting Firefox?).
Thanks!

Attached file log4.txt-main.20064.xz
Flags: needinfo?(tnmurphy)

I felt that I had to search and replace the company domain name (sorry) with mycompany.com. This is the good case where I loaded Firefox after connecting.

I am not sure this is really what you want. Maybe you want me to have a failure and then wait 10-20 minutes until there is a success? I don't have a predictable time - the last time I came back after 30 minutes or so and it worked. I will try to get this kind of thing tomorrow.

(In reply to Tim Murphy from comment #36)

> I felt that I had to search and replace the company domain name (sorry) with mycompany.com. This is the good case where I loaded Firefox after connecting.

I'm a little confused. I suppose what you mean is search and replace in the log files.
I gather you already did this in comment 28, right?

Could you specify which URI you failed to browse?
I assumed it was http://jira/, but now it looks like http://jira.mycompany.com

No apology is needed, and thanks for gathering this for us.
You can always send me the log directly if needed.

Flags: needinfo?(tnmurphy)

(In reply to Junior [:junior] from comment #37)

> (In reply to Tim Murphy from comment #36)
>
> > I felt that I had to search and replace the company domain name (sorry) with mycompany.com. This is the good case where I loaded Firefox after connecting.
>
> I'm a little confused. I suppose what you mean is search and replace in the log files.
> I gather you already did this in comment 28, right?

Yes to both.

> Could you specify which URI you failed to browse?
> I assumed it was http://jira/, but now it looks like http://jira.mycompany.com

I browsed to http://jira.mycompany.com because I didn't want to allow anything like the search domains to cause a problem - I wanted to show that it was definitely an issue with name resolution.

Flags: needinfo?(tnmurphy)

After some investigation, it could be http-cache related (not DNS cache)
Could you try two things for me:
(a) Try to use private browsing to browse http://jira.mycompany.com again. If my theory is right, a new private browsing window is without cache, so it should work.
(b) Toggle network.http.rcwn.enabled in about:config to false. Restart the browser and reproduce again.

Thanks!

Flags: needinfo?(tnmurphy)

(In reply to Junior [:junior] from comment #39)

> After some investigation, it could be http-cache related (not DNS cache)
> Could you try two things for me:
> (a) Try to use private browsing to browse http://jira.mycompany.com again. If my theory is right, a new private browsing window is without cache, so it should work.
> (b) Toggle network.http.rcwn.enabled in about:config to false. Restart the browser and reproduce again.
>
> Thanks!

For (a), maybe try a shift-reload?

More findings

From comment 28:
2019-04-02 19:16:05.077515 UTC - [Parent 20064: Main Thread]: D/nsHttp Creating nsHttpChannel [this=0x7fba7e55d000]
This channel succeeded in getting the resource from the network.

2019-04-02 19:16:23.723314 UTC - [Parent 20064: Link Monitor]: D/nsNotifyAddr SendEvent: changed
We got a network change. I suppose the VPN was connected at this point.

2019-04-02 19:16:38.039335 UTC - [Parent 20064: Main Thread]: D/nsHttp Creating nsHttpChannel [this=0x7fba7d579000]
This channel went to the cache to get the resource.
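
To make this kind of reading less manual, here is a rough sketch that scans a log for the "SendEvent: changed" line and lists the nsHttpChannel creations that follow it, with the delay since the network change. The regular expression is an assumption based only on the three log lines quoted above; other log lines may use a different layout. The function names are illustrative.

```python
import re
from datetime import datetime

# Layout assumed from the lines quoted in this comment:
# "<date> <time> UTC - [<process>: <thread>]: <level>/<module> <message>"
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{6}) UTC - "
    r"\[(?P<proc>[^\]]+)\]: (?P<level>[A-Z])/(?P<module>\S+) (?P<msg>.*)$"
)


def parse_line(line):
    """Return (timestamp, module, message), or None if the line does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
    return ts, match.group("module"), match.group("msg")


def channels_after_change(lines):
    """List (seconds_since_last_change, message) for channels created after a network change."""
    change_ts = None
    found = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        ts, module, msg = parsed
        if module == "nsNotifyAddr" and msg.startswith("SendEvent: changed"):
            change_ts = ts
        elif (module == "nsHttp" and msg.startswith("Creating nsHttpChannel")
              and change_ts is not None):
            found.append(((ts - change_ts).total_seconds(), msg))
    return found


if __name__ == "__main__":
    sample = [
        "2019-04-02 19:16:05.077515 UTC - [Parent 20064: Main Thread]: D/nsHttp Creating nsHttpChannel [this=0x7fba7e55d000]",
        "2019-04-02 19:16:23.723314 UTC - [Parent 20064: Link Monitor]: D/nsNotifyAddr SendEvent: changed",
        "2019-04-02 19:16:38.039335 UTC - [Parent 20064: Main Thread]: D/nsHttp Creating nsHttpChannel [this=0x7fba7d579000]",
    ]
    for delay, msg in channels_after_change(sample):
        print("%.3f s after change: %s" % (delay, msg))
```

On the three lines above it reports only the second channel, created about 14.3 seconds after the network change.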

However, per comment 31, we can only access the site under VPN.
It confuses me why the first connection succeeded.
Maybe we were already on the VPN before the first connection, but the event arrived later.
We prune all the dead connections (including the one we had just made, which should not be regarded as dead).

On the retry, the connection goes to the cache, and it does not work for some weird reason.
However, both connections did their job well and sent all their data to the child.

For the second connection, we got

2019-04-02 19:16:38.104299 UTC - [Child 23853: Main Thread]: D/nsHttp HttpChannelChild::Cancel [this=0x7fc617955800, status=804b0002]
2019-04-02 19:16:38.104319 UTC - [Child 23853: Main Thread]: D/nsHttp 0x7fc617955800 called from script: http://jira.mycompany.com/:1:434
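
As a reading aid for the status value in the Cancel line, here is a small sketch that splits an nsresult into its severity, module and code fields. The layout (failure bit 31, module field offset by 0x45, 16-bit code) and the two lookup tables reflect my understanding of Gecko's nsError definitions and should be treated as assumptions. To my knowledge 0x804b0002 corresponds to NS_BINDING_ABORTED.

```python
# Assumed from Gecko's nsError definitions; verify against the tree if it matters.
NS_ERROR_MODULE_BASE_OFFSET = 0x45
KNOWN_MODULES = {6: "NETWORK"}
KNOWN_RESULTS = {0x804B0002: "NS_BINDING_ABORTED"}


def decode_nsresult(value):
    """Split a 32-bit nsresult into failure flag, module number and code."""
    failed = bool(value & 0x80000000)
    module = ((value >> 16) - NS_ERROR_MODULE_BASE_OFFSET) & 0x1FFF
    code = value & 0xFFFF
    return {
        "failed": failed,
        "module": module,
        "module_name": KNOWN_MODULES.get(module, "unknown"),
        "code": code,
        "name": KNOWN_RESULTS.get(value, "unknown"),
    }


if __name__ == "__main__":
    # Status copied from the HttpChannelChild::Cancel line above.
    print(decode_nsresult(int("804b0002", 16)))
```

If that reading is right, the status says the load was aborted rather than that the network failed, which fits the "called from script" line.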

Now I believe it's not a network/cache issue.
http://jira.mycompany.com/ is text/html, possibly with some script, which cannot be examined on our side.

Compare this with the successful sample in comment 35.
The only difference in the request headers is Cookie, and we get a different result from the server.
The failure sample is a non-keepalive 200, which kind of indicates "reject to connect".
The good sample is a 302, forwarding to another page.

Therefore, the server side will have more information for debugging, instead of us randomly guessing at the cookie setting.

Flags: needinfo?(tnmurphy)

When I type a non-existing URL into my browser (when the hostname doesn't exist) my ISP sends me to its own "search page". This is Virgin Media BTW. So it looks like a successful connection but it's really a failure to get to the true destination website.

Unfortunately the private browsing window behaves the same. I am attaching a log (log5) with one request before I connected and one after. I really think this issue is something about which nameserver is being queried first.

So basically Firefox is sending out a request which doesn't cause a connection error and looks totally successful to it and it's just wrong because it asked the "pre-VPN" nameserver instead of the "post-VPN" nameserver.

Attached file log5.txt-main.20064.xz

(In reply to Tim Murphy from comment #42)

> When I type a non-existing URL into my browser (when the hostname doesn't exist) my ISP sends me to its own "search page". This is Virgin Media BTW. So it looks like a successful connection but it's really a failure to get to the true destination website.
>
> Unfortunately the private browsing window behaves the same. I am attaching a log (log5) with one request before I connected and one after. I really think this issue is something about which nameserver is being queried first.
>
> So basically Firefox is sending out a request which doesn't cause a connection error and looks totally successful to it and it's just wrong because it asked the "pre-VPN" nameserver instead of the "post-VPN" nameserver.

Shift-refresh will redo the DNS lookup and bypass the cache.

Does your URL bar change after your ISP takes you to the search page? If not, you may try a hard refresh.

Or you might set
network.dnsCacheExpiration and network.dnsCacheExpirationGracePeriod
in about:config to 0 or to small enough numbers.
Go to about:networking#dns to check whether the DNS record is cached.
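
If editing about:config by hand is inconvenient, the same two prefs can be written into a user.js file in a throwaway test profile. The sketch below only renders the file contents. The helper name is illustrative, and the value 0 is just the "0 or small enough" suggestion from above, not a recommended permanent setting.

```python
def render_user_js(prefs):
    """Render a dict of pref names to values as user.js lines."""
    lines = []
    for name, value in sorted(prefs.items()):
        if isinstance(value, bool):
            rendered = "true" if value else "false"
        elif isinstance(value, int):
            rendered = str(value)
        else:
            # Quote strings, escaping backslashes and double quotes.
            escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
            rendered = '"' + escaped + '"'
        lines.append('user_pref("%s", %s);' % (name, rendered))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    dns_test_prefs = {
        "network.dnsCacheExpiration": 0,
        "network.dnsCacheExpirationGracePeriod": 0,
    }
    # Paste the output into user.js in the test profile directory.
    print(render_user_js(dns_test_prefs), end="")
```

After restarting with that profile, about:networking#dns should show whether entries are still being kept.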

FWIW, we delegate the DNS resolution to the OS.

Note for cache:
I did a quick experiment: do we use the cached data after changing network?
Firefox and Chrome both still use the cache, if it is not stale, when I switch among Wi-Fi, hotspot and VPN.

needinfo? for comment 42.
If the shift-reload doesn't work, we have a problem to dig into.

Flags: needinfo?(tnmurphy)

It is quite difficult to do "shift-refresh" since I am effectively refreshing the "virgin media" search page which I was redirected to. Anyhow after doing shift-ctrl-R on that redirected page I tried putting in the desired page (jira) again and this time I got it.

Flags: needinfo?(tnmurphy)

Also check if restarting the browser makes it work.
It might be that our call to getaddrinfo stops working if resolv.conf is changed mid session.
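
One way to check the resolv.conf side of this theory is to record the nameserver list before and after connecting the VPN and compare the two. The sketch below does only that comparison. It does not call getaddrinfo or show what Firefox itself sees, and the sample addresses are hypothetical placeholders.

```python
import os


def parse_nameservers(resolv_conf_text):
    """Return the nameserver addresses listed in resolv.conf-style text, in order."""
    servers = []
    for raw in resolv_conf_text.splitlines():
        # Drop comments, which start with '#' or ';'.
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "nameserver":
            servers.append(parts[1])
    return servers


def nameservers_changed(before_text, after_text):
    """True if the nameserver list (including its order) differs between snapshots."""
    return parse_nameservers(before_text) != parse_nameservers(after_text)


if __name__ == "__main__":
    # Hypothetical snapshots; in practice read /etc/resolv.conf before and after.
    before = "# ISP resolver\nnameserver 192.0.2.1\n"
    after = "search mycompany.com\nnameserver 10.0.0.1\nnameserver 192.0.2.1\n"
    print("changed:", nameservers_changed(before, after))
    if os.path.exists("/etc/resolv.conf"):
        with open("/etc/resolv.conf") as handle:
            print("current:", parse_nameservers(handle.read()))
```

If the list changes when the VPN comes up but a running Firefox keeps getting the old answers until restart, that would point towards the resolver configuration not being reloaded mid-session.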

I know that if I exit the browser and restart then everything works fine - that was my usual solution which gets annoying after a while.

Cache-Control: max-age=600 (seconds) might be the pain point here.
I didn't experiment much in Chrome, but it seems to always hit the network for the .html.
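
To illustrate why max-age=600 could matter, here is a simplified freshness check. It looks only at the max-age directive and ignores everything else a real HTTP cache considers (no-cache, Expires, validators, heuristics), so it is a sketch of the idea and not how Necko decides. The function names are illustrative.

```python
import re


def max_age_seconds(cache_control):
    """Return the max-age value from a Cache-Control header value, or None."""
    match = re.search(r'(?:^|,)\s*max-age\s*=\s*"?(\d+)', cache_control, re.IGNORECASE)
    return int(match.group(1)) if match else None


def is_fresh(cache_control, age_seconds):
    """True if a response of the given age is still within its max-age lifetime."""
    max_age = max_age_seconds(cache_control)
    return max_age is not None and age_seconds < max_age


if __name__ == "__main__":
    # The two channels in the log above were created about 33 seconds apart.
    print(is_fresh("max-age=600", 33))
    print(is_fresh("max-age=600", 601))
```

With a 600 second lifetime, a page cached just before the VPN came up can still be served from the cache for up to ten minutes afterwards.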

Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → DUPLICATE

Thanks very much, Tim, for helping us track down the problem! Happy hacking.

Flags: needinfo?(info)

I have this exact same problem in Firefox 99.0 under Linux.

Everything works with no VPN.
With VPN:

  • Most webpages load, albeit slowly.
  • Google and google search won't load in Firefox (but load in Chrome).
  • Restarting Firefox DOESN'T fix the problem.
  • In a new private window it DOES work.

Please, reopen the bug as it was clearly not fixed.

(In reply to vnmabus from comment #53)

> I have this exact same problem in Firefox 99.0 under Linux.
>
> Everything works with no VPN.
> With VPN:
>
>   • Most webpages load, albeit slowly.
>   • Google and google search won't load in Firefox (but load in Chrome).
>   • Restarting Firefox DOESN'T fix the problem.
>   • In a new private window it DOES work.
>
> Please, reopen the bug as it was clearly not fixed.

Forgot to say: if I connect to my mobile phone internet instead of to the Wifi it DOES work. In fact, it is funny because I have a problem connecting to SSH in the exact same circumstances that cause this bug. Could it be related?

Could you file a new bug and also try to get a http log?
Thanks.

Flags: needinfo?(vnmabus)

(In reply to Kershaw Chang [:kershaw] from comment #55)

> Could you file a new bug and also try to get a http log?
> Thanks.

Filed bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1765513

Flags: needinfo?(vnmabus)