Handle reason='attempt-reconnect' on Linux, and stub for it on macOS/*BSD #89

Open: 1 commit to merge into base branch master

@dlenski (Owner) commented Sep 12, 2021


@dlenski requested a review from gmacon on September 12, 2021 00:54
@dlenski force-pushed the handle_reason_attempt_reconnect branch from 635463b to db53fd1 on September 12, 2021 01:15
@dlenski (Owner, Author) commented Sep 12, 2021

I would like to extend this PR to also handle the macOS/*BSD case. See how we do it for macOS/*BSD in the standard vpnc-script: https://gitlab.com/openconnect/vpnc-scripts/blob/master/vpnc-script#L395-412

However, I absolutely need some macOS/*BSD testers to help me with it. Anyone willing to take a shot at it?

…*BSD

The case where the "real" network device disappears or disconnects, while
the VPN/tunnel device stays up is a surprisingly complex one to handle
correctly.  The main issue is that the "explicit route" to the VPN gateway
on the underlying network device may have disappeared, and the OS routing
utilities may erroneously suggest a looped-back route (over the VPN/tunnel
device) as the optimal route to the gateway.

This issue arises particularly — but NOT exclusively — if the *default*
route for the relevant network family is now assigned to the VPN/tunnel
device.

See:

- https://gitlab.com/openconnect/openconnect/issues/17 for the initial report of this problem,
- https://gitlab.com/openconnect/openconnect/-/commit/c2755eefb4e00e915c330495b33d3f5db926615b
  for where the vpnc-script call with reason='attempt-reconnect' was added
  to OpenConnect (merged in v8.02)
- https://gitlab.com/openconnect/vpnc-scripts/-/commit/1000e0f6dd7d6bff163169a46359211c1fc3a6d2
  for where an initial placeholder was first added to vpnc-script
- https://gitlab.com/openconnect/vpnc-scripts/-/merge_requests/14 for the
  first actually-working support in vpnc-script (for Linux)
- and numerous subsequent changes to handle macOS/*BSD, IPv6, and corner cases

We need to handle attempt-reconnect in vpn-slice.  This mostly borrows from
what vpnc-script does.
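
To make the failure mode concrete, here is a minimal Python sketch (not code from this PR) of how a looped-back route could be detected from the text printed by `ip route get <gateway>`. The helper names `device_of_route` and `is_looped_back`, and the sample output strings, are hypothetical and chosen only for illustration.

```python
def device_of_route(route_line):
    """Return the device named after 'dev' in an 'ip route' output line, or None."""
    tokens = route_line.split()
    if "dev" in tokens[:-1]:
        return tokens[tokens.index("dev") + 1]
    return None


def is_looped_back(route_get_output, tundev):
    """True if the route the kernel suggests for the gateway uses the tunnel device."""
    lines = route_get_output.splitlines()
    # Only the first line describes the route; later lines (e.g. 'cache') are ignored.
    return bool(lines) and device_of_route(lines[0]) == tundev


# Illustrative strings only, not captured from a real system.
healthy = "1.2.3.4 via 10.224.0.1 dev wlan0 src 10.224.0.123 uid 1000\n    cache\n"
looped = "1.2.3.4 dev tun0 src 10.8.0.2 uid 1000\n    cache\n"

print(is_looped_back(healthy, "tun0"))
print(is_looped_back(looped, "tun0"))
```

If the suggested route points at the tunnel device itself, it cannot be used to reach the gateway, so a different lookup is needed.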

Still TODO:

- Flesh out the macOS/*BSD implementation.  Instead of 'route -n get', we should use 'netstat -r -n'
  to ensure that we don't get a looped-back route, as vpnc-script does since
  https://gitlab.com/openconnect/vpnc-scripts/-/blob/412a1faffa72fcda54e8c42d22e0057e56240ff1/vpnc-script#L395-402
- Linux: preserve the 'onlink' route flag.  This requires replacing 'ip route get' with 'ip route show'
  in all cases. See https://gitlab.com/openconnect/vpnc-scripts/-/merge_requests/27
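
The Linux TODO item could be approached roughly as follows. This is a sketch under stated assumptions, not the vpnc-script or vpn-slice implementation: it takes the text output of `ip route show` as a string, skips routes on the tunnel device, picks the most specific remaining route that covers the gateway (lowest metric on ties), and records whether the `onlink` flag was present. The function name `pick_underlying_route` and the sample routing table are hypothetical.

```python
import ipaddress


def pick_underlying_route(route_show_output, gateway, tundev):
    """Pick the best non-tunnel route to `gateway` from 'ip route show' text."""
    gw = ipaddress.ip_address(gateway)
    default_net = ipaddress.ip_network("0.0.0.0/0" if gw.version == 4 else "::/0")
    best, best_rank = None, None
    for line in route_show_output.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        try:
            net = (default_net if tokens[0] == "default"
                   else ipaddress.ip_network(tokens[0], strict=False))
        except ValueError:
            continue  # lines starting with a route type such as 'unreachable'
        if gw not in net:
            continue
        fields = {}
        for key in ("via", "dev", "src", "metric"):
            if key in tokens[1:-1]:  # the key must be followed by a value
                fields[key] = tokens[tokens.index(key, 1) + 1]
        if fields.get("dev") == tundev:
            continue  # never route to the gateway through the tunnel itself
        metric = int(fields.get("metric", 0))
        rank = (net.prefixlen, -metric)  # longest prefix first, then lowest metric
        if best_rank is None or rank > best_rank:
            best_rank = rank
            best = {"via": fields.get("via"), "dev": fields.get("dev"),
                    "metric": metric, "onlink": "onlink" in tokens}
    return best


# Hypothetical routing table after the explicit gateway route was lost.
table = """\
default dev tun0 scope link
default via 10.224.0.1 dev wlan0 proto dhcp metric 600
10.224.0.0/24 dev wlan0 proto kernel scope link src 10.224.0.123 metric 600
8.8.8.8 dev tun0 scope link
"""
print(pick_underlying_route(table, "1.2.3.4", "tun0"))
```

Keeping `onlink` in the returned fields lets the caller pass the flag on when re-adding the explicit route.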

Here is an example of OpenConnect + vpn-slice correctly re-establishing the route
to the VPN gateway after a network outage removes it and leaves the default route
looped back through the VPN/tunnel interface.

    $ openconnect --script 'vpn-slice --dump -vvv 0.0.0.0/0'
    ...
    ... <authenticate successfully>
    ...
    Called by /usr/sbin/openconnect (PID 123456) with environment variables for vpnc-script:
      reason                  => reason=<reasons.connect: 2>
      VPNGATEWAY              => gateway=IPv4Address('1.2.3.4')
      TUNDEV                  => tundev='tun0'
      ....
    Complete set of subnets to include in VPN routes:
      0.0.0.0/0
    Set explicit route to VPN gateway 1.2.3.4 (via 10.224.0.1, dev wlan0, src 10.224.0.123)   # <----
    Blocked incoming traffic from VPN interface with iptables.
    Adding route to nameserver 8.8.8.8 through tun0.
    Adding route to nameserver 1.1.1.1 through tun0.
    Adding route to subnet 0.0.0.0/0 through tun0.                                            # <----
    Added routes for 2 nameservers, 1 subnets, 0 aliases.
    ...
    ... <we have successfully connected>
    ...
    ... <disconnect and reconnect from WiFi, so that explicit route to gateway is lost>
    ...
    ... <OpenConnect dead peer detection kicks in>
    ...
    Failed to reconnect to host vpn.company.com: Interrupted system call
    sleep 10s, remaining timeout 300s
    ...
    Called by /usr/sbin/openconnect (PID 551671) with environment variables for vpnc-script:
      reason                  => reason=<reasons.attempt_reconnect: 5>
      VPNGATEWAY              => gateway=IPv4Address('1.2.3.4')
      TUNDEV                  => tundev='tun0'
      ...
    Complete set of subnets to include in VPN routes:
      0.0.0.0/0
    Reset explicit route to VPN gateway 1.2.3.4 (via 10.224.0.1, dev wlan0, metric 600)       # <---
    SSL negotiation with vpn.company.com
    Connected to HTTPS on vpn.company.com with ciphersuite (TLS1.2)-(ECDHE-SECP256R1)-(RSA-SHA512)-(AES-256-GCM)
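
The `reason`, `VPNGATEWAY`, and `TUNDEV` lines in the dump above show raw environment variables being turned into typed values. A minimal Python sketch of that step follows. The enum member list is an assumption, chosen so that `connect` and `attempt_reconnect` get the values 2 and 5 seen in the log. The function name `parse_env` is hypothetical, and the real script reads many more variables.

```python
from enum import Enum
from ipaddress import ip_address

# Assumed member order; it yields connect=2 and attempt_reconnect=5 as in the log.
reasons = Enum("reasons", "pre_init connect disconnect reconnect attempt_reconnect")


def parse_env(env):
    """Convert the vpnc-script style environment mapping into typed values."""
    # OpenConnect passes reasons with hyphens, e.g. 'attempt-reconnect'.
    reason = reasons[env["reason"].replace("-", "_")]
    gateway = ip_address(env["VPNGATEWAY"]) if "VPNGATEWAY" in env else None
    tundev = env.get("TUNDEV")
    return reason, gateway, tundev


reason, gateway, tundev = parse_env(
    {"reason": "attempt-reconnect", "VPNGATEWAY": "1.2.3.4", "TUNDEV": "tun0"}
)
if reason is reasons.attempt_reconnect:
    print(f"would reset explicit route to {gateway}, avoiding {tundev}")
```

In a real invocation the mapping would be `os.environ`; a plain dict is used here so the sketch runs anywhere.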

@dlenski force-pushed the handle_reason_attempt_reconnect branch from db53fd1 to 76c43e4 on September 12, 2021 01:19
@dlenski changed the title from "Handle reason='attempt-reconnect'" to "Handle reason='attempt-reconnect' on Linux, and stub for it on macOS/*BSD" on Sep 12, 2021
@dlenski marked this pull request as ready for review on September 12, 2021 01:20