Azure : What the BGP is going on there?

Introduction

Something that had been on my to-do list for a while was to share a proof-of-concept with you guys/gals about what BGP on Azure can entail… Now some of you might go: “BGP? What the hell is that?!?” Check out the “CBT Micro Nugget” video on BGP, as it gives a nice high-level description of what BGP is.

 

So why should you care? BGP can offer you a way to deal with advanced routing paths. This in turn can deliver resiliency to your business.

 

Proof-of-Concept Design

For today, we’ll be building the following setup ;

[Figure: kvaes-azure-networking-bgp-resiliency]

This will consist of the following components ;

  • Four virtual networks ; VNET001, VNET002, VNET003 & VNET004
  • Each VNET will have its own VPN Gateway. We’ll enable BGP on the VPN Gateway and give it its own (unique for, and private to, our deployment) ASN & peering address. The VPN Gateway will be set to “RouteBased”-routing and we’ll use a “Standard” SKU.
  • Each VPN Gateway will have two connections: one towards the “previous” and one towards the “next” gateway, so the four gateways form a ring. Both connections of a pair will use the same shared key, and we’ll also enable BGP on each connection.
  • We’ll deploy two systems into this PoC setup
    • System001 will reside in VNET001
    • System004 will reside in VNET004
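
Before deploying, it can help to sanity-check the ASN plan. The sketch below is only an illustration: the ASN values are hypothetical (the post does not list them), and the private ranges are the ones reserved by RFC 6996. Azure also reserves some ASNs for itself, so check the gateway documentation before settling on values.

```python
# Private-use ASN ranges as reserved by RFC 6996 (16-bit and 32-bit).
PRIVATE_ASN_RANGES = [(64512, 65534), (4200000000, 4294967294)]


def is_private_asn(asn: int) -> bool:
    """True when the ASN falls inside one of the private-use ranges."""
    return any(low <= asn <= high for low, high in PRIVATE_ASN_RANGES)


def validate_asn_plan(plan: dict) -> list:
    """Return a list of problems; an empty list means the plan looks sane."""
    problems = []
    seen = {}
    for gateway, asn in plan.items():
        if not is_private_asn(asn):
            problems.append(f"{gateway}: ASN {asn} is not in a private range")
        if asn in seen:
            problems.append(f"{gateway}: ASN {asn} is already used by {seen[asn]}")
        else:
            seen[asn] = gateway
    return problems


# Hypothetical assignment for the four gateways of this PoC.
plan = {"VPNGW001": 65001, "VPNGW002": 65002, "VPNGW003": 65003, "VPNGW004": 65004}
print(validate_asn_plan(plan))
```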

 

To test our setup, we’ll execute the following scenario ;

  • Connect from system001 to system004 whilst our ring is complete => the green path will be followed
  • Connect from system001 to system004 whilst having deleted the connections between VPNGW001 & VPNGW004 => the yellow path will be followed
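
To get a feel for why the path changes, here is a rough sketch of the ring in Python. It is a simplification and not how BGP is implemented: real BGP path selection weighs several attributes, and here I only assume that the route with the fewest hops (the shortest AS path) wins. The gateway names come from the design above; presumably the direct route is the green path and the detour is the yellow one.

```python
from collections import deque


def shortest_path(links, source, target):
    """Breadth-first search: return the path with the fewest hops, or None."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in sorted(neighbours.get(path[-1], ())):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None


# The four gateways connected as a ring.
ring = {
    ("VPNGW001", "VPNGW002"),
    ("VPNGW002", "VPNGW003"),
    ("VPNGW003", "VPNGW004"),
    ("VPNGW004", "VPNGW001"),
}

# Complete ring: the direct connection is used.
print(shortest_path(ring, "VPNGW001", "VPNGW004"))
# → ['VPNGW001', 'VPNGW004']

# Connections between VPNGW001 & VPNGW004 deleted: traffic takes the long way round.
broken = ring - {("VPNGW004", "VPNGW001")}
print(shortest_path(broken, "VPNGW001", "VPNGW004"))
# → ['VPNGW001', 'VPNGW002', 'VPNGW003', 'VPNGW004']
```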


Azure : A poor man’s SSL termination (by leveraging Cloudflare)

Introduction

A few weeks back I wrote a few posts about the Azure Application Gateway. I must say I ran into some issues when combining it with Rancher, so I was forced to look for alternatives…

One of my requirements was a “zero-touch deployment” capability, meaning that I did not want to deploy a system where I had to change things manually to get it working.

 

High Level Blueprint

So how would a “poor man’s SSL termination on Azure” look? Basically, I’m using Cloudflare as my DNS provider, which then provides capabilities like CDN, various SSL options (like SSL termination = Flexible SSL), WAF, etc. We can start with the free plan, where we can do a redirect to HTTPS and do SSL termination.

[Figure: kvaes-azure-cloudflare-poorman-ssl-termination]

In addition, we’ll deploy an NSG (Network Security Group = basic Azure firewall rules) that is configured to only allow the IP ranges from Cloudflare. This way we speak HTTPS towards the outside world, and we have to accept that the traffic between Cloudflare and our hosts is unencrypted…
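
The effect of such an allow-list can be sketched in a few lines of Python. This is an illustration of the filtering idea only, not NSG configuration. The two ranges below are just samples that I believe are on Cloudflare’s published list; treat them as placeholders and fetch the current list from Cloudflare for a real deployment.

```python
import ipaddress

# Sample ranges only (assumption): use Cloudflare's current published list in practice.
ALLOWED_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in ("173.245.48.0/20", "103.21.244.0/22")
]


def is_allowed(source_ip: str) -> bool:
    """True when the source address falls inside one of the allowed ranges."""
    address = ipaddress.ip_address(source_ip)
    return any(address in network for network in ALLOWED_RANGES)


print(is_allowed("173.245.48.10"))  # inside the first sample range
print(is_allowed("8.8.8.8"))        # not in any sample range
```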

 


Azure : What does the Direct Server Return option do for a Load Balancer?

Introduction

When setting up a load balancing rule in Azure, you’ll be given the opportunity to enable/disable “Direct Server Return”.

[Screenshot: Add load balancing rule - Microsoft Azure]

 

So what does it do?

Apart from disabling the “backend port” input field, what does it do? Clicking on the “?” gives us a start…

[Screenshot: Add load balancing rule - Microsoft Azure]

Basically, DSR (Direct Server Return) will disable any NAT involved: the packet arrives at the VM still addressed to the load balancer IP and port. So the targeted VM should be aware of the load balancer IP, or the network flow will break.

So it’s useful when the load balancer IP acts as a cluster IP address that the nodes themselves hold, though do NOT use it for typical load balancing scenarios where the nodes aren’t aware of the cluster address.
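
A toy model makes the difference visible. This is a sketch of the idea only, not of the Azure data plane; the addresses and the function names are made up for the example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    dst_ip: str
    dst_port: int


def load_balance(packet, backend_ip, backend_port, direct_server_return):
    """Model the packet as the backend VM receives it."""
    if direct_server_return:
        # No NAT: the destination is still the load balancer IP and port.
        return packet
    # NAT: the destination is rewritten to the backend VM and backend port.
    return replace(packet, dst_ip=backend_ip, dst_port=backend_port)


def backend_accepts(packet, local_ips):
    """A VM only handles traffic addressed to an IP it actually holds."""
    return packet.dst_ip in local_ips


# Hypothetical addresses: 10.0.0.100 is the load balancer, 10.0.0.4 the VM.
incoming = Packet("10.0.0.100", 1433)

with_nat = load_balance(incoming, "10.0.0.4", 1433, direct_server_return=False)
with_dsr = load_balance(incoming, "10.0.0.4", 1433, direct_server_return=True)

print(backend_accepts(with_nat, {"10.0.0.4"}))                # VM sees its own IP
print(backend_accepts(with_dsr, {"10.0.0.4"}))                # VM does not know the LB IP
print(backend_accepts(with_dsr, {"10.0.0.4", "10.0.0.100"}))  # VM also holds the LB IP
```

This also hints at why the “backend port” field gets disabled: without NAT there is no port translation either.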

 

Azure : VNet Peering

Introduction

We’ve talked about setting up VPN connections between VNets in the past… At the end of July, VNet peering entered “preview”. This one allows you to connect two VNets within the same region without the need for a gateway.

 

How does this look?

So let’s look at an example with several VNets ; two in West Europe and one in North Europe.

[Screenshot: Choose virtual network - Microsoft Azure]

If we select a VNet (from West Europe), we’ll notice another option called “Peerings”.

[Screenshot: Choose virtual network - Microsoft Azure]

Press “Add” here, and you’ll be able to link another VNet in the same region.

[Screenshot: Choose virtual network - Microsoft Azure]
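
As a quick illustration of which VNets can be linked, here is a small check in Python. The same-region rule is the one described above (as it was during the preview). The non-overlapping address space rule is something I’m adding as an assumption about peering in general, and the VNet names and address spaces are hypothetical.

```python
import ipaddress


def can_peer(vnet_a, vnet_b):
    """Check the two conditions assumed here: same region, no address overlap."""
    if vnet_a["region"] != vnet_b["region"]:
        return False
    space_a = ipaddress.ip_network(vnet_a["address_space"])
    space_b = ipaddress.ip_network(vnet_b["address_space"])
    return not space_a.overlaps(space_b)


# Hypothetical VNets: two in West Europe and one in North Europe.
west_1 = {"region": "westeurope", "address_space": "10.1.0.0/16"}
west_2 = {"region": "westeurope", "address_space": "10.2.0.0/16"}
north_1 = {"region": "northeurope", "address_space": "10.3.0.0/16"}

print(can_peer(west_1, west_2))   # same region, separate address spaces
print(can_peer(west_1, north_1))  # different regions
```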

Issue : Exposing ports with Windows Containers on TP5

A brief post today, to assist people who are probably going to “enjoy” the same networking issue. When coming from Docker on Linux and working with Docker on Windows, the first thing you’ll probably run into is the port exposing…

I built a MSSQL 2016 container with the default port (1433) exposed.

PS C:\Users\kvaes> docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
efc7a981f6b9 kvaessql2016 "cmd /S /C 'powershel" 6 minutes ago Up 6 minutes 1433/tcp

Though I was unable to connect from the container host to this port…

PS C:\Users\kvaes> Test-NetConnection -Port 1433 -ComputerName Localhost
WARNING: TCP connect to Localhost:1433 failed

ComputerName : Localhost
RemoteAddress : ::1
RemotePort : 1433
InterfaceAlias : Loopback Pseudo-Interface 1
SourceAddress : ::1
PingSucceeded : True
PingReplyDetails (RTT) : 0 ms
TcpTestSucceeded : False
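
For those who want to script this check outside of PowerShell, a rough Python equivalent of the TCP part of Test-NetConnection could look like this. It is a sketch using only the standard library; the function name is my own.

```python
import socket


def tcp_port_open(host, port, timeout=2.0):
    """Return True when a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    print(tcp_port_open("localhost", 1433))
```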

Now let’s try that directly from the container…

PS C:\Users\kvaes> docker exec -ti efc7a981f6b9 powershell Test-NetConnection -Port 1433 -ComputerName Localhost

ComputerName : Localhost
RemoteAddress : ::1
RemotePort : 1433
InterfaceAlias : Loopback Pseudo-Interface 2
SourceAddress : ::1
PingSucceeded : True
PingReplyDetails (RTT) : 0 ms
TcpTestSucceeded : True

This had me totally flabbergasted! After searching for a solution, I ran into the following GitHub issue ; https://github.com/Microsoft/Virtualization-Documentation/issues/253

It pointed me to the following statement ;

This is a known limitation in our Windows NAT implementation (WinNAT) that you cannot access the external port in a static port mapping directly from the container (NAT) host.

The following GitHub issue showed a workaround ; https://github.com/docker/docker/issues/15740

So let’s check the IP of our container…

PS C:\Users\kvaes> docker exec -ti efc7a981f6b9 ipconfig

Windows IP Configuration

Ethernet adapter vEthernet (Temp Nic Name):

Connection-specific DNS Suffix . : 404nupum1doencwb55jgqiwlph.ax.internal.cloudapp.net
Link-local IPv6 Address . . . . . : fe80::3077:b4b4:3a8c:5d83%31
IPv4 Address. . . . . . . . . . . : 172.27.75.141
Subnet Mask . . . . . . . . . . . : 255.240.0.0
Default Gateway . . . . . . . . . : 172.16.0.1

And then set up a port proxy to reroute the traffic ;

PS C:\Users\kvaes> netsh interface portproxy add v4tov4 listenaddress=127.0.0.1 listenport=1433 connectaddress=172.27.75.141 connectport=1433
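
To show what such a proxy does conceptually, here is a minimal TCP relay in Python. It is not how netsh implements portproxy, just a sketch of the idea: listen on one address and port, and copy bytes to and from another. The function names are my own.

```python
import socket
import threading


def _pipe(src, dst):
    """Copy bytes from src to dst until src closes, then half-close dst."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            pass


def _handle(client, connect_addr, connect_port):
    """Relay one client connection to the target in both directions."""
    try:
        upstream = socket.create_connection((connect_addr, connect_port), timeout=5)
    except OSError:
        client.close()
        return
    upstream.settimeout(None)
    back = threading.Thread(target=_pipe, args=(upstream, client), daemon=True)
    back.start()
    _pipe(client, upstream)
    back.join()
    client.close()
    upstream.close()


def start_port_proxy(listen_addr, listen_port, connect_addr, connect_port):
    """Start relaying in the background; close the returned socket to stop."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((listen_addr, listen_port))
    listener.listen()

    def serve():
        while True:
            try:
                client, _ = listener.accept()
            except OSError:
                return  # listener was closed
            threading.Thread(
                target=_handle,
                args=(client, connect_addr, connect_port),
                daemon=True,
            ).start()

    threading.Thread(target=serve, daemon=True).start()
    return listener
```

With the values from this post, the call would be start_port_proxy("127.0.0.1", 1433, "172.27.75.141", 1433).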

What does the test from our container host say now?

PS C:\Users\kvaes> Test-NetConnection -Port 1433 -ComputerName Localhost

ComputerName : Localhost
RemoteAddress : ::1
RemotePort : 1433
InterfaceAlias : Loopback Pseudo-Interface 1
SourceAddress : ::1
PingSucceeded : True
PingReplyDetails (RTT) : 0 ms
TcpTestSucceeded : True

And now it works! In all honesty, I find this a serious flaw in the Windows implementation and truly annoying to anyone making the shift from containers in the Linux ecosystem to the Windows ecosystem.

Azure : Traffic Manager in Classic mode vs Resource Manager

Introduction

Today I was setting up a Traffic Manager deployment in Resource Manager. I wanted a rather “simple” failover scenario where my secondary site would only take over when my primary site was down. As you might know, there are several routing methods, of which “failover” is one ;

Failover: Select Failover when you have endpoints in the same or different Azure datacenters (known as regions in the Azure classic portal) and want to use a primary endpoint for all traffic, but provide backups in case the primary or the backup endpoints are unavailable.

Though I was surprised that the naming in “classic mode” (“the old portal”) and in “resource manager” (“the new portal”) was different!

 

“Classic Mode” / Service Management

So when taking a look at “classic mode”, we see three methods ;

[Screenshot: Traffic Manager - Microsoft Azure]

They are described fairly in-depth on the documentation page, though in short ;

  • Performance : You’ll be redirected to the closest endpoint (based on network response in ms)
  • Round Robin : The load will be distributed between all nodes. Depending on the weight of a node, it might get more or fewer requests.
  • Failover : A picking order will be in place. The highest ranking system alive will receive the requests.

 

“New Portal” / Resource Manager

When taking a look at “Resource Manager”, we’ll see (again) three methods ;

[Screenshot: Create Traffic Manager profile - Microsoft Azure]

Though the naming differs… When going into the technical details, it’s more a naming thing than a technical thing. The functionality is (give or take) the same. Where “Round Robin” had the option of weights (1-1000) before, the weight has now become the focal point. Where “Failover” worked with an ordered list (a visualization), you can now directly alter the “priority” (1-1000) of each endpoint.

The info when checking out the routing method from within the portal ;

  • Performance: Use this method when your endpoints are deployed in different geographic locations, and you want to use the one with the lowest latency.
  • Priority: Use this method when you want to select an endpoint which has highest priority and is available.
  • Weighted: Use this method when you want to distribute traffic across a set of endpoints as per the weights provided.
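
The “Priority” and “Weighted” methods can be mimicked in a few lines of Python. This is a rough sketch of the selection logic only, not of Traffic Manager itself (which answers DNS queries). The endpoint names and values are made up, and I’m assuming that a lower priority number means a higher priority.

```python
import random


def pick_priority(endpoints):
    """Priority (formerly Failover): highest-priority endpoint that is alive."""
    healthy = [e for e in endpoints if e["healthy"]]
    if not healthy:
        return None
    # Assumption: a lower number means a higher priority.
    return min(healthy, key=lambda e: e["priority"])["name"]


def pick_weighted(endpoints, rng=random):
    """Weighted (formerly Round Robin): random pick in proportion to the weights."""
    healthy = [e for e in endpoints if e["healthy"]]
    if not healthy:
        return None
    weights = [e["weight"] for e in healthy]
    return rng.choices(healthy, weights=weights, k=1)[0]["name"]


# Hypothetical endpoints.
endpoints = [
    {"name": "primary", "priority": 1, "weight": 3, "healthy": True},
    {"name": "secondary", "priority": 2, "weight": 1, "healthy": True},
]

print(pick_priority(endpoints))  # the primary site while it is alive
print(pick_weighted(endpoints))  # primary about three times out of four

endpoints[0]["healthy"] = False
print(pick_priority(endpoints))  # the secondary site takes over
```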

 

TL;DR

Where the naming differs between the two stacks, the functionality remains the same ;

  • Performance didn’t get renamed
  • Round Robin became “Weighted”
  • Failover became “Priority”

Azure : Debugging VPN Connectivity on Resource Manager

Debugging failed VPN tunnels can be quite annoying… Today we had an issue with a new deployment that had us on a wild goose chase for a while. So here’s a quick post to give all of you some pointers ;

  • The first VPN gateway that receives a packet which is in need of the tunnel will initiate the connection. In ARM you have no way to manually initiate the connection.
    As a side effect, the destination gateway is typically the one that has the most useful information regarding the VPN connection. So when debugging, look towards that gateway.
    Therefore I would suggest starting a ping from an Azure VM (within the VNET) towards the local network. This will kickstart the connection process.
  • The diagnostic part on the Azure side is quite “basic” and well hidden… Actually, the commands to get diagnostics are only available in “classic” mode, though you can work your way around it. Check out the following post for more information on getting diagnostics for the VNET gateway on Resource Manager.
  • With the change from “Classic” to “Resource Manager”, there was also a change in the naming of the VPN types. Previously we had “static” and “dynamic”. The “static” connection is now “policy-based” and the “dynamic” one is “route-based”. Looking at the effect, the “route-based” deployment relies on IKEv2, whereas the “policy-based” deployment relies on IKEv1. This is VERY important to know, as it has an effect on the number of tunnels you can build. In addition, there are a lot of VPN gateways that do not support IKEv2 (at this moment).
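
The naming change from the last point can be captured in a small lookup, which is handy when checking what an on-premises device can handle. This is just a sketch built from the mapping described above; the structure and function name are my own.

```python
# Classic name -> Resource Manager name and IKE version, as described above.
VPN_TYPES = {
    "static": {"arm_name": "PolicyBased", "ike": "IKEv1"},
    "dynamic": {"arm_name": "RouteBased", "ike": "IKEv2"},
}


def compatible_types(device_ike_versions):
    """List the VPN types a device can use, given the IKE versions it supports."""
    return [
        vpn_type["arm_name"]
        for vpn_type in VPN_TYPES.values()
        if vpn_type["ike"] in device_ike_versions
    ]


print(compatible_types({"IKEv1"}))           # a device without IKEv2 support
print(compatible_types({"IKEv1", "IKEv2"}))  # a device that supports both
```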

Good luck troubleshooting!