This is not about optimising OpenVPN. It is about solving OpenVPN's poor throughput and packet loss, where the server receives traffic faster than it can actually process it.
We are currently in the process of moving data centers. This requires our Couchbase data to be kept in sync between Gütersloh (DE) and AMS-IX (NL), which means that XDCR needs to pump a few hundred gigabytes across every day, and fast. After about 20 minutes or so, everything started to slow down for an unknown reason.
Our choice of VPN solution was OpenVPN, due to some limitations imposed by the managed network on the German side. We built the tunnel and got a reasonable link with ~20 ms latency. Initial throughput tests showed:
It was not so flash, but I did not suspect anything at that point; I accepted it as the capability of the tunnel. Further investigation, however, revealed TX packet drops on the tunnel interface:
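As a rough sketch of how such drops can be checked on a Linux host, the kernel exposes a per-interface TX drop counter under sysfs. The interface name `tun0` and the `SYSFS_NET` override are assumptions made for illustration; they are not taken from the original setup.

```shell
#!/bin/sh
# Print the kernel's TX drop counter for one network interface.
# SYSFS_NET is a hypothetical override (handy for testing);
# on a real Linux host it defaults to /sys/class/net.
tx_dropped() {
    iface="$1"
    cat "${SYSFS_NET:-/sys/class/net}/${iface}/statistics/tx_dropped"
}

# Example (assumes the tunnel interface is called tun0):
#   tx_dropped tun0
```

`ip -s link show dev tun0` reports the same counter in its TX statistics, alongside the other per-interface counters.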
It seemed that the tunnel was not able to keep up with the amount of traffic it received. After some reading, it turned out that OpenVPN sets the txqueuelen parameter to 100 by default for the tunnel interfaces on both client and server. It is essentially a transmit buffer, measured in packets and managed by the network scheduler.
The solution was to set this to 1000, matching the physical interface configuration:
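To get a feel for what the queue length means in practice, here is a small sketch that reads an interface's current queue length and estimates how much data the queue can hold. The 1500-byte packet size, the interface name `tun0`, and the `SYSFS_NET` override are illustrative assumptions; real tunnel MTUs are usually somewhat smaller.

```shell
#!/bin/sh
# Read the current TX queue length (in packets) of an interface.
# SYSFS_NET is a hypothetical override; the default is the real sysfs path.
tx_queue_len() {
    cat "${SYSFS_NET:-/sys/class/net}/$1/tx_queue_len"
}

# Rough upper bound on queued bytes: packets * bytes per packet.
queue_bytes() {
    echo $(( $1 * $2 ))
}

queue_bytes 100 1500    # prints 150000  (about 150 KB)
queue_bytes 1000 1500   # prints 1500000 (about 1.5 MB)

# Example (assumes tun0 exists): tx_queue_len tun0
```

Under these assumptions a queue of 100 packets holds only about 150 KB, so a burst of traffic from a fast sender can fill it quickly, and anything beyond that is dropped.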
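One way to make the change persistent is OpenVPN's own `txqueuelen` config directive (Linux only). The helper below is a minimal sketch, not necessarily how the original configuration was edited: the function name `ensure_txqueuelen` and the config path in the example are hypothetical.

```shell
#!/bin/sh
# Ensure an OpenVPN config file contains "txqueuelen <len>".
# An existing txqueuelen line is rewritten in place (sed leaves a .bak copy);
# otherwise the directive is appended to the end of the file.
ensure_txqueuelen() {
    conf="$1"
    len="$2"
    if grep -Eq '^[[:space:]]*txqueuelen[[:space:]]' "$conf"; then
        sed -i.bak -E "s/^[[:space:]]*txqueuelen[[:space:]].*/txqueuelen ${len}/" "$conf"
    else
        printf 'txqueuelen %s\n' "$len" >> "$conf"
    fi
}

# Example (hypothetical path; run on both server and client):
#   ensure_txqueuelen /etc/openvpn/server.conf 1000
```

For a tunnel that is already up, `ip link set dev tun0 txqueuelen 1000` (as root, assuming the interface is `tun0`) changes the value immediately, but it does not survive an OpenVPN restart unless the config is updated as well.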
After restarting OpenVPN on both server and client, there were no more packet drops on the tunnel interfaces, and the throughput was better too:
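To confirm that drops have really stopped, it helps to look at how the counter changes during a transfer rather than at its absolute value. The following is a sketch under the same assumptions as before (interface `tun0`, hypothetical `SYSFS_NET` override); the function name `drops_during` is made up for this example.

```shell
#!/bin/sh
# Report how many TX drops an interface accumulated over an interval.
# Usage: drops_during <iface> <seconds>
drops_during() {
    iface="$1"
    secs="$2"
    counter="${SYSFS_NET:-/sys/class/net}/${iface}/statistics/tx_dropped"
    before=$(cat "$counter")
    sleep "$secs"
    after=$(cat "$counter")
    echo $(( after - before ))
}

# Example: run this while a throughput test is in progress.
#   drops_during tun0 30
```

A result of 0 while the link is under load suggests the queue is no longer overflowing.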
For further optimisation, visit the official Linux guide.