RITM0019776 - Network Request:Other Network Requests for Benjamin Smedberg

RESOLVED INCOMPLETE

Status

Infrastructure & Operations
Mozilla VPN: Support requests
RESOLVED INCOMPLETE
4 years ago
3 years ago

People

(Reporter: jbraddock, Unassigned)

Attachments

(1 attachment)

(Reporter)

Description

4 years ago
Comments from Service Now request: 

VPN troubles: I got MozillaVPN to work on my Linux box (Fedora 19). It works for a little while, but if I stop sending data for a while (maybe a few minutes?) it will disconnect.

From reading logs, this is supposed to be prevented by the "keepalive" instruction which is in the .ovpn file. But since we now don't use the .ovpn file directly but instead import it into NetworkManager, I think we need some equivalent configuration item in the networkmanager config file. But I can't figure out what. Please help.
(Reporter)

Comment 1

4 years ago
Would it be possible to get someone to help Benjamin? I am not sure where to direct him. Thanks!

:jabba - you are also CCd on the Service Now request.

Comment 2

4 years ago
Have you tried looking at the instructions here? https://mana.mozilla.org/wiki/pages/viewpage.action?pageId=30769829 There are some bugs in NetworkManager that have caused issues, and we found that running openvpn from the command line is the best way to avoid them.

Running openvpn from the cli just involves the following:

$ sudo openvpn --config ~/path/to/openvpn.conf.ovpn 

It will prompt for your LDAP username and password. Go ahead and try that, and see if there are still any issues with keeping a connection open to the VPN server.

Comment 3

4 years ago
I helped write some of the workarounds on that page.

If we're really telling people that they should not use networkmanager, that really sucks for the UI, and we should remove all of the workaround instructions for networkmanager on that page. But I'd really expect that if we're deploying a configuration like this, we should get it to work in networkmanager because that's the UI that ships with every modern Linux distro.

Comment 4

From various bugs, it seemed like issues with NetworkManager can roughly be summarized as:

(a) NetworkManager erratically corrupts imported configs
(b) NetworkManager sporadically drops VPN connections

Are these issues unique to our configs, and not experienced by other NM users?

Comment 5

4 years ago
The mis-importation appears to be fairly unique to our configs: we have many different openvpn connection options, and networkmanager is managing to combine them incorrectly.

I don't know about the dropping issue. Our keepalive is very short from what I've seen in the past (it appears to be no more than 2 minutes?), so it may be that keepalive in general isn't working but doesn't affect most users because their timeout is 15 or 30 minutes, and they typically touch the VPN within that period. I'm not sure.
(Reporter)

Comment 6

4 years ago
:limed - do you have any additional information for bsmedberg on comment 5?

Comment 7

Hi, I was asked to reply to this by :limed.

We have keepalive enabled server side via the directive "keepalive 10 120".
This means that after 10 seconds without activity on the data channel, openvpn will ping the client. After 120s it will restart the connection.

It also means that openvpn pushes the keepalive configuration to the client when connecting (i.e. the client has keepalive enabled even if the local settings don't specify it; in fact, even if the server does not push the setting, the clients still default to 10s, 120s).

This leads me to believe that the problem might be elsewhere.
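
For reference, the OpenVPN manual describes "keepalive" as shorthand for "ping" and "ping-restart", with the restart timeout doubled on the server side and the undoubled values pushed to clients. Below is a minimal sketch of that expansion for the "keepalive 10 120" directive mentioned above. The name expand_keepalive is a hypothetical helper used only for illustration; it is not an OpenVPN command.

```shell
# Sketch: print the directives that "keepalive <interval> <timeout>" stands for,
# following the expansion documented in the OpenVPN manual.
# expand_keepalive is a hypothetical helper name, not part of OpenVPN.
# $1 = "server" or "client", $2 = interval in seconds, $3 = timeout in seconds
expand_keepalive() {
    mode="$1"
    interval="$2"
    timeout="$3"
    if [ "$mode" = "server" ]; then
        # The server uses twice the timeout itself and pushes the
        # undoubled values to connecting clients.
        echo "ping $interval"
        echo "ping-restart $((timeout * 2))"
        echo "push \"ping $interval\""
        echo "push \"ping-restart $timeout\""
    else
        echo "ping $interval"
        echo "ping-restart $timeout"
    fi
}

expand_keepalive server 10 120
```

Under this reading, a client that receives the pushed values should restart after 120 seconds without any packet from the server, while the server waits 240 seconds before giving up on the client.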

Do you have the logs from openvpn when it disconnects? Those should be stored in /var/log/... or available via journalctl, depending on your setup/linux distribution.
Also, the process is generally called "nm-openvpn" instead of "openvpn" in the logs.
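
A rough sketch of pulling the relevant lines out of a syslog-style file, assuming the log lives at /var/log/messages (this path varies by distribution). The name vpn_log_lines is a hypothetical helper used only for illustration.

```shell
# Hypothetical helper: print OpenVPN-related lines from a syslog-style file.
# The pattern "openvpn" also matches the "nm-openvpn" process name.
# $1 = path to the log file
vpn_log_lines() {
    grep -i 'openvpn' "$1"
}

# Example (the path is an assumption, adjust for your distribution):
#   vpn_log_lines /var/log/messages
# On systemd-based systems the journal can be filtered the same way:
#   journalctl -b | grep -i 'openvpn'
```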
Flags: needinfo?(benjamin)

Comment 8

4 years ago
Created attachment 824835 [details]
/var/log/messages log


Here is the log from /var/log/messages when it disconnects. This time, without my ssh keepalive pings active, it took 15 minutes to time out, instead of the 3-5 minutes I remember from before.
Flags: needinfo?(benjamin)

Comment 9

So, according to attachment 824835, openvpn doesn't get the keepalives and restarts the session, and then the session restart fails (apparently due to the up/down script, which is client side and not normally provided).

- Why does the VPN restart (and not get the server keepalive pings)? Generally this happens because the connection fails (laptop goes to sleep, connection is lossy, etc.) or because two instances of the same VPN are competing for the same user. It could be something else, but those are the common causes. We can make the restart timeout longer, but that probably won't solve the core issue.

- Why do the up/down scripts fail? I don't know :( It's probably specific to NM or to your distro/setup.

To check the connection's reliability, you can try "mtr openvpn.scl3.mozilla.com" and check the loss.
It's likely that this works better from the command line than from NM (NetworkManager), since that wouldn't trigger any possible NM bug, especially in the up/down scripts. It might also be easier to troubleshoot.
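
The mtr check can also be run non-interactively so the result is easy to paste into a bug. This is a minimal sketch using mtr's report mode; check_vpn_path is a hypothetical wrapper name, and the default of 100 cycles is an arbitrary choice.

```shell
# Hypothetical wrapper around mtr's report mode.
# $1 = host to probe, $2 = number of probe cycles (defaults to 100)
check_vpn_path() {
    # --report prints a summary and exits instead of opening the
    # interactive view; the Loss% column shows packet loss per hop.
    mtr --report --report-cycles "${2:-100}" "$1"
}

# Example:
#   check_vpn_path openvpn.scl3.mozilla.com
```

Sustained loss on the final hop would point at the network path rather than at NetworkManager or the up/down scripts.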
Assignee: infra → vpn-support
Component: Infrastructure: OpenVPN → Mozilla VPN: Support requests

Updated

3 years ago
Whiteboard: [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/2473]

Comment 10

deleted bogus whiteboard tag
Whiteboard: [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/2473]

Comment 11

Is this still an issue?

Comment 12

3 years ago
I have worked around the issue by setting ServerAliveInterval 60 in my ssh config, but otherwise the VPN does drop pretty quickly with no activity. But I really don't have time to diagnose further myself, so if I'm the only one experiencing this my workaround is sufficient.

Comment 13

Unfortunately, I have not seen reports of this issue from any other users.

Comment 14

3 years ago
Let's call it INCOMPLETE then... not worth spending a lot of time when I have a workaround.
Status: NEW → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → INCOMPLETE