I’m often asked questions about availability on Azure. In today’s post, we’ll take a look at the different availability patterns, which I hope will answer a big portion of the questions you might have. The main focus of this post is the “IaaS” chunk of Azure services. Services like Azure SQL, Web Apps, etc. may take a totally different approach. But then again, you are not responsible for designing (and thus do not need to worry about) the availability aspect of these services.
Availability Basics when using Azure
If you are unfamiliar with the basic concepts of availability, there are several key elements to remember:
- Microsoft will only grant an SLA when multiple machines (that fulfil the same function) reside in the same availability set.
- An availability set provides several fault & update domains.
- An update domain ensures that systems residing in another update domain are not touched during an update cycle.
- A fault domain ensures that the systems are fault redundant at the hardware level.
Want to know more? I can recommend one of our previous cloud chats…
The Azure documentation is also a good place to get your info… 😉
Availability Pattern : Single Region
Let’s delve into availability in a single region… Imagine an availability set with 5 update domains & 3 fault domains. This means that your systems are spread across three “racks” and that updates are rolled out in a sequence of five “update groups”.
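To make the 5 update domains & 3 fault domains example a bit more tangible, here is a minimal sketch in Python. It assumes that systems are handed out to fault and update domains in a simple round-robin fashion, which is a simplification for illustration and not a statement about how Azure places machines internally. The names `spread_vms` and `vms_left_running` are made up for this example.

```python
def spread_vms(vm_count, fault_domains=3, update_domains=5):
    """Assign each VM a fault domain and an update domain.

    Assumption: plain round-robin placement, used here only to illustrate
    the concept. Real placement is decided by the platform.
    """
    return [
        {"vm": f"vm{i}", "fd": i % fault_domains, "ud": i % update_domains}
        for i in range(vm_count)
    ]


def vms_left_running(placement, updating_ud=None, failed_fd=None):
    """Return the VMs untouched by an update cycle and/or a rack failure."""
    return [
        p["vm"]
        for p in placement
        if p["ud"] != updating_ud and p["fd"] != failed_fd
    ]


placement = spread_vms(5)
for p in placement:
    print(p)

# Update domain 0 is being patched: only vm0 is affected.
print(vms_left_running(placement, updating_ud=0))

# "Rack" 0 fails: vm0 and vm3 share that fault domain.
print(vms_left_running(placement, failed_fd=0))
```

With five systems, each one lands in its own update domain, so an update cycle only ever touches one of them at a time, while a failing “rack” takes out at most two.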
Availability Pattern : Multi Region
Now let’s imagine the following scenario… We have two regions, and in each of them we have deployed an availability set with one system. Strangely enough, we’ll have the same availability as with the previous pattern! Why? Because the “region pairs” also guarantee that updates are not executed in both regions of a pair at the same time…
“When executing maintenance, Azure will only update the Virtual Machine instances in a single region of its pair. For example, when updating the Virtual Machines in North Central US, Azure will not update any Virtual Machines in South Central US at the same time. This will be scheduled at a separate time, enabling failover or load balancing between regions. However, other regions such as North Europe can be under maintenance at the same time as East US.”
So we’ve got the “update domain” aspect covered… The systems will not be updated at the same time, so one will always be kept online, as long as you respect the region pairs. Now, looking at the “fault domain” aspect… We can be quite brief about this one: when the systems are in different regions, you can be 110% sure that they will not share the same physical hardware. 😉
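“Respecting the region pairs” can be checked with a few lines of Python. The sketch below is illustrative only: the `REGION_PAIRS` table is a small, hand-written subset (the North Central US / South Central US pair comes from the quote above, the others are added as examples) and `are_paired` is a made-up helper name. Always verify the actual pairs against the Azure documentation.

```python
# Assumption: a tiny, hand-maintained subset of region pairs, for illustration.
# Check the Azure documentation for the authoritative and complete list.
REGION_PAIRS = {
    "North Central US": "South Central US",
    "North Europe": "West Europe",
    "East US": "West US",
}


def are_paired(region_a, region_b):
    """Return True when the two regions form a pair (in either direction)."""
    return (
        REGION_PAIRS.get(region_a) == region_b
        or REGION_PAIRS.get(region_b) == region_a
    )


# Paired: maintenance will not hit both regions at the same time.
print(are_paired("South Central US", "North Central US"))

# Not paired: both regions can be under maintenance simultaneously.
print(are_paired("North Europe", "East US"))
```

If the check returns `False` for the two regions you picked, the “update domain” reasoning above no longer holds for your setup.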
So from a pragmatic point of view, you will receive the same availability as you would have with the previous pattern. Though… (!) you will NOT get an SLA from Microsoft on the matter. For that, the systems need to be in one availability set, and an availability set cannot be spread across regions. 😦
“For all Internet facing Virtual Machines that have two or more instances deployed in the same Availability Set, we guarantee you will have external connectivity at least 99.95% of the time.”
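To get a feeling for what 99.95% means in practice, here is a quick back-of-the-envelope calculation in Python. It assumes a 30-day month for simplicity. The actual SLA defines its own measurement period and rules, so treat the outcome as a rough indication. The function name `allowed_downtime_minutes` is made up for this example.

```python
def allowed_downtime_minutes(sla_percent, days=30):
    """Downtime (in minutes) that still fits within the given SLA percentage.

    Assumption: a month of `days` days. The real SLA terms may measure
    the period differently.
    """
    total_minutes = days * 24 * 60
    return total_minutes * (100 - sla_percent) / 100


print(f"{allowed_downtime_minutes(99.95):.1f} minutes per month")
```

So under these assumptions, 99.95% leaves room for a bit more than 20 minutes of lost external connectivity per month.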
- An SLA of 99.95% is only possible when using availability sets
- You can create a similar setup across regions
- The multi region pattern will not benefit from the SLA