strongSwan: Race condition on IPv6 route insertion
linux network vpn ipv6
Yesterday, I was setting up USTC LUG's IKEv2 VPN with strongSwan on my OrangePi
Zero box at home. My ipsec.conf is as simple as this:
    conn lugvpn-ikev2
        right=vpn.xxx.yyy
        rightid=%vpn.xxx.yyy
        rightsubnet=::/0
        rightauth=pubkey
        leftsourceip=%config6
        leftauth=eap
        eap_identity=cuihao
My goal was only to enable IPv6 access, so the IPv4 configuration is omitted. When strongSwan started, I found that the IPv6 default route was not inserted correctly (into table 220, which strongSwan uses for its policy-based VPN).
This is the relevant debug log from strongSwan (sensitive information is masked):
    charon: 10[KNL] installing route: ::/0 via 192.168.42.1 src 2001:da8:d800:XX52:YY::ZZ dev eth0
    charon: 10[KNL] getting iface index for eth0
    charon: 10[KNL] sending RTM_NEWROUTE 208: => 76 bytes @ 0xb5c11f34
    charon: 10[KNL]    0: 4C 00 00 00 18 00 05 06 D0 00 00 00 41 0C 00 00  L...........A...
    charon: 10[KNL]   16: 0A 00 00 00 DC 04 00 01 00 00 00 00 14 00 01 00  ................
    charon: 10[KNL]   32: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    charon: 10[KNL]   48: 14 00 07 00 20 01 0D A8 D8 00 XX 52 00 YY 00 00  .... ....._R.!..
    charon: 10[KNL]   64: 00 00 00 ZZ 08 00 04 00 03 00 00 00              ............ ......
    charon: 10[KNL] received netlink error: Invalid argument (22)
    charon: 10[KNL] unable to install source route for 2001:da8:d800:XX52:YY::ZZ
At first I was surprised by this stupid-looking route spec, which pairs an IPv6 destination with an IPv4 gateway:

    ::/0 via 192.168.42.1 src 2001:da8:d800:XX52:YY::ZZ dev eth0

But it later turned out to be only a log message. strongSwan won't really
install that route [1].
I decoded the dumped hex data (an rtnetlink message sent to the kernel to add a route). It is equivalent to:
    ip -6 route add unicast ::/0 dev eth0 \
        src 2001:da8:d800:XX52:YY::ZZ dev eth0 \
        tos 0 table 220 protocol static metric 3
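As a rough illustration of the decoding step: the 12 bytes at offsets 16 to 27 of the dump are the kernel's `struct rtmsg`, which follows the 16-byte netlink header. The sketch below only prints the eight single-byte fields of that struct. The helper name `decode_rtmsg` is made up for this example, and the route attributes that follow the struct (destination, source address, interface index) are not parsed.

```shell
#!/bin/sh
# decode_rtmsg "HEX BYTES" (hypothetical helper, not part of strongSwan or iproute2)
# Prints the single-byte fields of a struct rtmsg given as 12 space-separated
# hex bytes, in the order they appear in the dump.
decode_rtmsg() {
    # Intentionally unquoted: split the string into $1..$12.
    set -- $1
    # Bytes 9-12 (rtm_flags) are ignored in this sketch.
    printf 'family=%d dst_len=%d src_len=%d tos=%d table=%d protocol=%d scope=%d type=%d\n' \
        "0x$1" "0x$2" "0x$3" "0x$4" "0x$5" "0x$6" "0x$7" "0x$8"
}

# Offsets 16-27 of the dump in the log.
decode_rtmsg "0A 00 00 00 DC 04 00 01 00 00 00 00"
# prints: family=10 dst_len=0 src_len=0 tos=0 table=220 protocol=4 scope=0 type=1
```

Family 10 is AF_INET6, table 220 is strongSwan's routing table, protocol 4 is RTPROT_STATIC and type 1 is RTN_UNICAST, which matches the `unicast ... tos 0 table 220 protocol static` part of the command.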
This decoded route is exactly the missing default route. However, when I ran the
command manually, there was no error at all, and IPv6 became accessible. Since
manual route insertion worked, I wrote an updown script (see the leftupdown
parameter in ipsec.conf) to do it. Unfortunately, the script failed with exactly
the same error when strongSwan called it immediately after the VPN connection came up:
updown: RTNETLINK answers: Invalid argument
Then I realized that there was probably a race condition, so I made the updown
script output more diagnostic information. Compared with the normal state,
there was a tentative flag set on the VPN's virtual IPv6 address:
    3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
        link/ether ZZ:YY:XX:WW:VV:UU brd ff:ff:ff:ff:ff:ff
        inet6 2001:da8:d800:XX52:YY::ZZ/128 scope global tentative deprecated
           valid_lft forever preferred_lft 0sec
man ip-address says that a tentative address is an IPv6 address which has not
yet passed duplicate address detection (DAD).
Following this clue, I found useful articles [2] [3] that describe how IPv6 DAD
causes race conditions. In short, a program cannot bind to an IPv6 address that
is still undergoing DAD (the tentative state).
So it is reasonable that a tentative IPv6 address cannot be used as the source address in a route specification either. Here is a demonstration:
    $ ip -6 addr add fdfd::fd/128 dev eth1 \
        && ip -6 addr show dev eth1 tentative \
        && ip -6 route add fdfd::fe/128 src fdfd::fd dev eth1
    5: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP qlen 1000
        inet6 fdfd::fd/128 scope global tentative
           valid_lft forever preferred_lft forever
    RTNETLINK answers: Invalid argument
The graceful way to avoid the race condition is to wait for DAD to finish and then insert the route:
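    ADDR=fdfd::fd/128
    IFACE=eth1
    TO=fdfd::fe/128

    ip -6 addr add $ADDR dev $IFACE
    # "-tentative" lists only addresses that are no longer tentative;
    # grep "" succeeds as soon as there is any output at all.
    until ip -6 addr show to $ADDR dev $IFACE -tentative | grep "" >/dev/null; do
        sleep 1
    done
    ip -6 route add $TO src ${ADDR%%/*} dev $IFACE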
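If the address never leaves the tentative state (for example because it was never added, or DAD fails), the wait loop would spin forever. Here is a minimal sketch of a bounded variant. The function name `wait_for_dad` and the default limit of 10 seconds are my own assumptions, not something strongSwan or iproute2 provides.

```shell
#!/bin/sh
# wait_for_dad ADDR IFACE [TIMEOUT]   (hypothetical helper)
# Returns 0 once ADDR on IFACE is listed as non-tentative,
# or 1 if that has not happened after roughly TIMEOUT seconds.
wait_for_dad() {
    addr=$1
    iface=$2
    remaining=${3:-10}
    # Same check as the plain loop: any output from the "-tentative"
    # listing means the address has passed DAD.
    until ip -6 addr show to "$addr" dev "$iface" -tentative | grep -q .; do
        [ "$remaining" -gt 0 ] || return 1
        remaining=$((remaining - 1))
        sleep 1
    done
    return 0
}

# Example (needs root), reusing the addresses from the demonstration:
# ip -6 addr add fdfd::fd/128 dev eth1
# wait_for_dad fdfd::fd/128 eth1 5 && ip -6 route add fdfd::fe/128 src fdfd::fd dev eth1
```

Returning a status instead of exiting lets a caller such as an updown script decide whether to give up or log the failure.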
I chose to simply disable DAD. Since the address is assigned by the VPN server, there is no need to perform DAD at all:
sysctl -w net.ipv6.conf.eth0.accept_dad=0
Back to my issue. Ultimately, strongSwan itself should be responsible for avoiding this race condition. In fact, upstream has already resolved the issue in version 5.5.2 by setting the NODAD flag on virtual IPv6 addresses [4]. However, Debian stretch ships strongSwan 5.5.1.
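For addresses you add yourself, the same flag is available from the command line: the `nodad` keyword of `ip addr add` asks the kernel to skip DAD, so the address is never tentative. A small sketch follows. The wrapper name `add_vip_nodad` is hypothetical, and I have not checked that this matches what strongSwan 5.5.2 does internally beyond the NODAD flag mentioned in [4].

```shell
#!/bin/sh
# add_vip_nodad ADDR IFACE   (hypothetical helper)
# Adds ADDR to IFACE with the "nodad" flag so that duplicate address
# detection is skipped for this one address only.
add_vip_nodad() {
    ip -6 addr add "$1" dev "$2" nodad
}

# Example (needs root), reusing the addresses from the demonstration:
# add_vip_nodad fdfd::fd/128 eth1 && ip -6 route add fdfd::fe/128 src fdfd::fd dev eth1
```

Unlike the accept_dad sysctl, this leaves DAD enabled for every other address on the interface. It does not help with the virtual IP that strongSwan 5.5.1 installs itself, which is why I used the sysctl.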
References:
- [1] Issue #2684: kernel_netlink plugin - no traffic through VPN when IPv4 policy on IPv6 ESP tunnel uses IPv6 nexthop when installing IPv4 route - strongSwan
- [2] Beware the IPv6 DAD Race Condition
- [3] System: IPv6 Networking and DAD
- [4] 5.5.2 - strongSwan
Page created on 2019-02-17