Debugging failed VPN tunnels can be quite annoying… Today we had an issue with a new deployment that had us on a wild goose chase for a while. So here is a quick post to give all of you some pointers:
- The first VPN gateway that receives a packet that needs the tunnel will initiate the connection. In ARM you have no way to initiate the connection manually.
As a side effect, the destination gateway is typically the one that has the most useful information about the VPN connection. So when debugging, look at that gateway.
Therefore I would suggest starting a ping from an Azure VM (within the VNET) towards the local network. This will kick-start the connection process, with the Azure gateway as the initiator and your local gateway as the destination.
- The diagnostic part on the Azure side is quite “basic” and well hidden… Actually, the commands to get diagnostics are only available in “classic” mode, though you can work your way around that. Check out the following post for more information on getting diagnostics for the VNET gateway on Resource Manager.
- With the change from “Classic” to “Resource Manager”, there was also a change in the naming of the VPN types. Previously we had “static” and “dynamic”. The “static” connection is now “policy-based” and the “dynamic” one is now “route-based”. In practice, the “route-based” deployment relies on IKEv2, whereas the “policy-based” deployment relies on IKEv1. This is VERY important to know, as it affects the number of tunnels you can build. In addition, there are a lot of VPN gateways that do not support IKEv2 (at this moment).
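To keep the naming change straight while troubleshooting, the mapping above can be written down as a small lookup. This is only a sketch of the relationships described in this post; the names `VpnType`, `lookup` and `device_can_connect` are made up for illustration and are not part of any Azure SDK or CLI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VpnType:
    classic_name: str   # name used in "Classic" mode
    arm_name: str       # name used in "Resource Manager" mode
    ike_version: int    # IKE version the deployment relies on


# The two VPN types as described in this post.
VPN_TYPES = (
    VpnType("static", "policy-based", 1),
    VpnType("dynamic", "route-based", 2),
)


def lookup(name: str) -> VpnType:
    """Find a VPN type by its Classic or Resource Manager name."""
    key = name.strip().lower()
    for vpn_type in VPN_TYPES:
        if key in (vpn_type.classic_name, vpn_type.arm_name):
            return vpn_type
    raise ValueError(f"unknown VPN type: {name!r}")


def device_can_connect(name: str, supported_ike_versions: set) -> bool:
    """Check whether a local device supports the IKE version this VPN type needs.

    supported_ike_versions is a hypothetical input: the IKE versions
    your on-premises gateway is able to negotiate.
    """
    return lookup(name).ike_version in supported_ike_versions
```

For example, `device_can_connect("route-based", {1})` is false, which is the situation to rule out first when a local gateway that only speaks IKEv1 refuses to come up against a route-based gateway.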
Good luck troubleshooting!