Introduction
Something I had on my to-do list for a while now was to post a proof-of-concept about what BGP on Azure can entail… Now some of you might go: “BGP? What the hell is that?!?” Check out the following “CBT Micro Nugget”, as it is a nice high-level description of what BGP is.
So why should you care? BGP can offer you a way to deal with advanced routing paths. This in turn can deliver resiliency to your business.
Proof-of-Concept Design
For today, we’ll be building the following setup ;
This will consist of the following components ;
- Four virtual networks ; VNET001, VNET002, VNET003 & VNET004
- Each VNET will have its own VPN Gateway. We’ll enable BGP on the VPN Gateway and give it its own (unique for, and private to, our deployment) ASN & peering address. The VPN Gateway will be set to “RouteBased”-routing and we’ll use a “Standard” SKU.
- Each VPN Gateway will have two connections towards the “previous” and “next” gateway. The keys per connection pair will be set to the same key and we’ll also enable BGP on the connection.
- We’ll deploy two systems into this PoC setup
- System001 will reside in VNET001
- System004 will reside in VNET004
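As a minimal sketch of the ring described above, the snippet below enumerates the connections each gateway would need. The names VPNGW002 and VPNGW003 are assumed by analogy with VPNGW001 and VPNGW004, and the `ring_connections` helper is purely illustrative; it does not call any Azure API.

```python
# Illustrative model of the PoC ring. Gateway names for 002 and 003 are
# assumptions; nothing here talks to Azure.
gateways = ["VPNGW001", "VPNGW002", "VPNGW003", "VPNGW004"]


def ring_connections(gateways):
    """Return (source, target) pairs: every gateway connects to its
    "next" and "previous" neighbour in the ring."""
    count = len(gateways)
    connections = []
    for index, gateway in enumerate(gateways):
        connections.append((gateway, gateways[(index + 1) % count]))  # next
        connections.append((gateway, gateways[(index - 1) % count]))  # previous
    return connections


connections = ring_connections(gateways)

# Two connections (one per direction) form a pair that shares the same key,
# so the unordered pairs tell us how many distinct keys we need.
links = {frozenset(pair) for pair in connections}

for source, target in connections:
    print(f"{source} -> {target}")
print(f"{len(connections)} connections, {len(links)} shared keys")
```

With four gateways this yields eight connection resources and four connection pairs, each pair sharing one key.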
To test our setup, we’ll execute the following scenario ;
- Connect from system001 to system004 whilst our ring is complete => the green path will be followed
- Connect from system001 to system004 whilst having deleted the connections between VPNGW001 & VPNGW004 => the yellow path will be followed
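The two scenarios can be reasoned about with a small graph model. The sketch below is a simplification: it picks the path with the fewest hops using a breadth-first search, which is only a rough stand-in for real BGP best-path selection (which also weighs other attributes). The `shortest_path` helper and the link list are hypothetical and exist only for this illustration.

```python
from collections import deque


def shortest_path(links, start, goal):
    """Breadth-first search over undirected links; returns the path with
    the fewest hops, or None when the goal is unreachable."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for candidate in sorted(neighbours.get(node, ())):
            if candidate not in visited:
                visited.add(candidate)
                queue.append(path + [candidate])
    return None


# The ring from the design: each VNET is linked to the next one.
ring = [
    ("VNET001", "VNET002"),
    ("VNET002", "VNET003"),
    ("VNET003", "VNET004"),
    ("VNET004", "VNET001"),
]

# Scenario 1: the ring is complete.
full_ring_path = shortest_path(ring, "VNET001", "VNET004")

# Scenario 2: the connections between VPNGW001 and VPNGW004 are deleted.
broken_ring = [link for link in ring if set(link) != {"VNET001", "VNET004"}]
failover_path = shortest_path(broken_ring, "VNET001", "VNET004")

print(full_ring_path)
print(failover_path)
```

In this model the complete ring gives the direct hop from VNET001 to VNET004, and removing that link leaves only the route via VNET002 and VNET003.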
RTFM
First things first… Read up on the documentation 😉
- Overview of BGP with Azure VPN Gateways
- How to configure BGP on Azure VPN Gateways using Azure Resource Manager and PowerShell
- About VPN Gateway
Some important key points to remember ;
- BGP is not supported on the “Basic”-SKU
- BGP is not supported with “PolicyBased”-routing
Setting things up!
I’ve created five resource groups ;
I’ve put my test machines into a separate resource group (easy clean-up afterwards!) and each network into its own. Each network resource group will host the VNET, VPN Gateway, public VPN Gateway IP & initiating connections.
So ensure that you have everything set up as you would in a scenario without BGP on this “ring”, meaning that all your connections should be in a “Connected” state. 😉
Having issues getting the links connected? 9 times out of 10 this will be due to secrets that do not match. Though you can always debug by using the following guide!
Configuring BGP on this setup
You can follow the guide I linked earlier, which uses PowerShell. As I’m a big fan of the “Azure Resource Explorer”, we’ll dive a bit deeper / more direct… First ensure that the Resource Explorer is set to “Read/Write” ;
Now find your VPN Gateway in the Resource Explorer and press “Edit” ;
Scroll down and find the BGP settings ;
Ensure that the ASN & bgpPeeringAddress are unique per gateway! If the tier is set to “Basic”, or the vpnType is set to “PolicyBased”, then you’ll get errors whilst trying to update the configuration. Adjust the settings accordingly on all VPN gateways.
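Before editing the gateways, a quick sanity check of the planned values can catch these mistakes. The sketch below is a hedged illustration: the flat dictionary layout, the `validate_bgp_settings` helper, the ASNs and the peering addresses are all assumptions made for this example and do not mirror the exact JSON shown in the Resource Explorer.

```python
def validate_bgp_settings(gateways):
    """Return a list of human-readable problems found in the planned
    gateway settings (an empty list means no problems were found)."""
    problems = []
    seen_asns = {}
    seen_addresses = {}
    for gateway in gateways:
        name = gateway["name"]
        if gateway["sku"] == "Basic":
            problems.append(f"{name}: BGP is not supported on the Basic SKU")
        if gateway["vpnType"] != "RouteBased":
            problems.append(f"{name}: BGP requires RouteBased routing")

        asn = gateway["asn"]
        if asn in seen_asns:
            problems.append(f"{name}: ASN {asn} already used by {seen_asns[asn]}")
        else:
            seen_asns[asn] = name

        address = gateway["bgpPeeringAddress"]
        if address in seen_addresses:
            problems.append(
                f"{name}: peering address {address} already used by "
                f"{seen_addresses[address]}"
            )
        else:
            seen_addresses[address] = name
    return problems


# Hypothetical plan: private ASNs and made-up peering addresses.
gateways = [
    {"name": "VPNGW001", "sku": "Standard", "vpnType": "RouteBased",
     "asn": 65001, "bgpPeeringAddress": "10.1.255.254"},
    {"name": "VPNGW002", "sku": "Standard", "vpnType": "RouteBased",
     "asn": 65002, "bgpPeeringAddress": "10.2.255.254"},
    {"name": "VPNGW003", "sku": "Standard", "vpnType": "RouteBased",
     "asn": 65003, "bgpPeeringAddress": "10.3.255.254"},
    {"name": "VPNGW004", "sku": "Standard", "vpnType": "RouteBased",
     "asn": 65004, "bgpPeeringAddress": "10.4.255.254"},
]

problems = validate_bgp_settings(gateways)
print(problems if problems else "No problems found")
```

The check only covers the constraints named in this post (unique ASN, unique peering address, no Basic SKU, RouteBased routing); Azure itself may enforce further rules.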
Now browse to the connections and enable BGP on all of them ;
All done!
Test Scenario
Now let’s see how this one behaves… With my ring complete I’m able to access system004 from system001 ;
Now I’ve deleted the connections between VPNGW001 & VPNGW004… and tried again. Here I experienced a connection time-out… What happened? I had forgotten to enable BGP on my connections. So I went to enable it (as I should have), and tried again.
It takes a bit longer, as the route needs to be found. But I’m able to connect via the “yellow” path, meaning that I went from 001 to 004 via an indirect path traversing 002 & 003. So it works! 😀
Closing Thoughts
- After doing this PoC, I’ve learned that BGP is a good option to add resiliency & unlock more advanced networking topologies.
- Stay clear of the basic SKU and PolicyBased routing when you want to do BGP.
- An ExpressRoute Gateway is set to PolicyBased routing and you can’t mess with the BGP settings on that one.
Would I need a 3rd party NVA to support BGP over IPSec if I were to adhere to the requirement of encrypting data in transit and had an ER connection rather than a VPN GW?
I didn’t find any details on that use-case at first glance…..
Cheers
Rik
An ExpressRoute currently does not do native encryption and relies on an NVA to do this. A VPN GW does have encryption out-of-the-box.
When you throw BGP into the mix, it becomes a tad more complicated. The VPN GW (in ARM) will support BGP. As for the combination of an NVA & BGP… I would suggest opening a support request to get that one addressed, as I’m not 100% sure and I do not want to make false statements that would then guide you down the wrong design path. 😉
Thanks for your prompt feedback.
No problem! I hope it was a bit useful.
Hi Karim, indeed the feedback is very useful. I will consider a designated NVA for IPSec termination – I tried to convince the customer to drop the requirement, but given that they also want to allow non-SSL-encrypted flows across, such as RDP – it’s a no-no for native ER. I need to think a bit now and see what’s the most cost-efficient yet scalable solution. Thx again
FYI – It is a common misconception that RDP is unencrypted, whilst in fact it is encrypted by default => https://technet.microsoft.com/en-us/library/cc770833(v=ws.11).aspx 😉
Aside from that, I can relate to the requirement to counter the possibility of non-encrypted traffic flowing over the channel.
Good… now I know what to do this weekend 😉
Thx a mill..
/R
Enjoy! 😉
One final question (and I do understand I am digressing from the original topic somewhat – slap me on the wrist if you like, because I merely write these things down here to make sure I understand it correctly).
On your particular setup, your config snippet shows BGP peering (my) address as 172.16.254.254 –
IIRC, on a normal BGP GW this would be the bgp router-id and/or source of the updates. You’d also need to specify the peer-as and peer gw address and a method to advertise your prefixes as a minimum – I assume we need to connect the dots this way in Azure as well somehow – where is this performed? In our case I want to make sure that I grasp the concept of resilience in ARM in general. I understand there’s a strict one-to-one relationship between a vNet and a VPN GW – my assumption is that ARM takes care of HA by means of availability sets – but we (consumers of Azure resources) need to make sure there’s a second vNet (say in North Europe) and that we peer with two AS’s and two distinct IP addresses if we have just a single primary (physical) DC. (simplified)
For ER GW’s I am none the wiser yet – some folks explained to me that the ER-GW’s (Azure site) don’t necessarily reside in a DC (say AMS) but rather in a POP (Edge location). Nonetheless, when the POP burns down one fine day I should be able to peer with another POP. Those nuances are to be discussed with a carrier of choice, so I’m told, because my mileage may vary… I am trying to find more details, but should you have some links of interest you could share, it’d be very much appreciated.
Kind regards
Rik
As a bit of a reference towards the howto with BGP ;
* https://docs.microsoft.com/en-us/azure/vpn-gateway/vpn-gateway-bgp-overview
* https://docs.microsoft.com/en-us/azure/vpn-gateway/vpn-gateway-bgp-resource-manager-ps
Towards the design questions you are posing ;
* A VNET has a special SUBNET designated for gateways called “GatewaySubnet”.=>https://docs.microsoft.com/en-us/azure/vpn-gateway/vpn-gateway-howto-site-to-site-resource-manager-portal#gatewaysubnet
* The relationship between VNET & Gateway is typically 1:1. The only exception is when the VPN & ER Gateway co-exist => https://docs.microsoft.com/en-us/azure/expressroute/expressroute-howto-coexist-classic
* An ER circuit is terminated in a “POP” of the region. From there you can link your VNET(s) to this circuit. So you can share a circuit between multiple VNETs (if you wish), where each VNET will have one ER gateway. => https://docs.microsoft.com/en-us/azure/expressroute/expressroute-faqs#technical-details
* A VPN/ER Gateway -service- will have the needed resiliency built-in to realize the needed SLA => https://azure.microsoft.com/en-us/support/legal/sla/vpn-gateway/v1_3/
(where we do not provide a technical architecture of how it is designed, rest assured that it does not rely on one node/instance) 😉
* If you want to design for a failover of the ER pop, then you need to foresee multiple routes / ER circuits => https://docs.microsoft.com/en-us/azure/expressroute/expressroute-optimize-routing
Does that tackle the questions that occupy your mind?