Introduction
Posts about security are always the ones that get everyone really excited… Or maybe not everyone. 😉 Anyhow, what is typically the weakest link in any security design? Indeed, the human touch… The effects can range from people having seen secrets to drift (unwanted changes versus the expected baseline). In today’s post, I’ll walk you through an example setup that aims to close some additional holes for you. How will we be doing this? By automating the entire infrastructure management with Azure DevOps & Terraform. Now you’ll probably think: what does that have to do with security? Good question! We’re going to reduce the points where human contact can interfere with our security measures, and we want to do this without putting our agility at risk!
Blueprint
For this exercise, we’re going to leverage this blueprint ;
What can you see here?
- Service Principal (upper left / orange) : The identity that will be used by our automation flow to log into Azure and do whatever needs to be done.
- Two Azure Subscriptions
- “Administration” (left, just above the middle) : This is the subscription that will host the KeyVault which contains the credentials of the service principal.
- “Operations” (the entire block on the right) : This is basically the secure / closed container where our resources will run. Solely managed by our automation flow.
- Azure DevOps (left, just below the middle) : Azure DevOps will be used as the orchestrator & repository base for the entire automation flow.
- Azure (Build) Agents (down left / orange) : These will be the “machines” (or containers) that run the automation jobs.
- Audit Logs (top right / orange) : The audit logs for everything related to both Azure Resource Manager & Azure KeyVault.
Now you’re probably going to ask ; “Why did you mark three components in orange?” Good question! From a security point of view, these are the possible attack vectors. These are the only points that make contact with our “secure / closed loop” system.
- The build agent is basically the only one that accesses the management plane of the subscription.
- The service principal is used by the build agent to do all its tasks. This identity could be compromised and used elsewhere.
- Both the build agent & service principal should be safeguarded to prevent malicious activity. Aside from prevention, there should also be control measures that let you validate whether malicious activity took place. That’s where the audit logs come in. Why are they an attack vector? If I’ve compromised the identity that manages the subscription, I would be able to cover up my trail too… So let’s ensure that we’ve got this part covered too, shall we?
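To make the audit log idea a bit more tangible, here is a minimal sketch of how KeyVault audit events could be shipped to a storage account with Terraform. Both variables are hypothetical and not part of the original setup, and the block layout assumes an azurerm provider from the era of this post (newer releases replace the `log` block with `enabled_log`).

```hcl
# Hypothetical inputs: the vault to audit and a storage account that
# lives outside the reach of the automation service principal.
variable "key_vault_id" {}
variable "audit_storage_account_id" {}

resource "azurerm_monitor_diagnostic_setting" "keyvault_audit" {
  name               = "keyvault-audit"
  target_resource_id = "${var.key_vault_id}"
  storage_account_id = "${var.audit_storage_account_id}"

  # "AuditEvent" is the KeyVault log category that records vault access.
  log {
    category = "AuditEvent"
    enabled  = true

    retention_policy {
      enabled = true
      days    = 365
    }
  }
}
```

The design choice that matters here is where the logs land: if the storage account sits somewhere the service principal cannot write to or delete from, a compromised identity has a much harder time covering its trail.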
Service Principal : Theory
The first thing to start with is the service principal… This is basically your “system user” that will do all the management of your subscription. How will it be used here?
- The service principal resides in your Azure Active Directory. Here you’ll generate a secret for it with a limited lifetime and save it into a designated Azure KeyVault.
- This KeyVault will be accessed by Azure DevOps to retrieve the credentials. As you would expect, Azure DevOps has its own service principal (used in the service connection), which you’ll need to grant access to this KeyVault. Afterwards you can link the KeyVault as a “variable group” in Azure DevOps.
- Once the “variable group” has been populated & linked, we can use it in our tasks. As an important side note: “secrets” in your Azure DevOps flow are not set as environment variables automatically. You can only reference them as variables from your tasks!
Service Principal : Example
First up, we have our key vault that hosts all the information for the connection to our subscription.
And we’ve granted the service connection used in Azure Devops the needed access for it. In this case, the “kvaes-VMsnoozer-…” users are service connections used in my Azure Devops project ;
As shown here ;
Where the service connections have been given the ability to get the list of secrets & retrieve the content of a secret.
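If you would rather not click this together in the portal, the same grant could be sketched in Terraform. Take this as an illustration under assumptions: all four variables are hypothetical, and the argument style mirrors the access policy example further down in this post (newer azurerm releases identify the vault through `key_vault_id` and expect differently cased permission names).

```hcl
# Hypothetical inputs describing the "Administration" vault and the
# object id of the service principal behind the Azure DevOps service connection.
variable "armtenantid" {}
variable "admin_vault_name" {}
variable "admin_vault_resource_group" {}
variable "devops_connection_object_id" {}

resource "azurerm_key_vault_access_policy" "devops_read_secrets" {
  vault_name          = "${var.admin_vault_name}"
  resource_group_name = "${var.admin_vault_resource_group}"

  tenant_id = "${var.armtenantid}"
  object_id = "${var.devops_connection_object_id}"

  # Only what the variable group needs: list the secrets and read their values.
  secret_permissions = [
    "get",
    "list",
  ]
}
```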
Now we can create a variable group in Azure DevOps that will leverage our KeyVault ;
And inside of our (build/release) pipeline, we’re going to reference this variable group.
Now we can use it in our pipeline step/tasks ;
    steps:
    # Initialise the working directory without prompting for input
    - script: terraform init -input=false
      workingDirectory: $(Build.Repository.LocalPath)/terraform/environment
      displayName: Terraform Init
    # The $(...) values are resolved from the KeyVault-backed variable group
    - script: >-
        terraform validate
        -var-file=${{ parameters.varfile }}
        -input=false
        -var 'armsubscriptionid=$(armsubscriptionid)'
        -var 'armtenantid=$(armtenantid)'
        -var 'armclientid=$(ARMCLIENTID)'
        -var 'armclientsecret=$(ARMCLIENTSECRET)'
        -var 'armobjectid=$(ARMOBJECTID)'
      workingDirectory: $(Build.Repository.LocalPath)/terraform/environment
      displayName: Terraform Validate
Now the cool stuff is that the secrets will be downloaded… and the variable “placeholders” will be replaced with the values from the KeyVault.
And I can imagine you’re not excited by that… But look at the logging…
Secrets are always masked in the logs! By default… Boom!
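On the Terraform side, the values passed with `-var` in the pipeline step have to land somewhere. A minimal sketch of that receiving end could look like this; the variable names are the ones used in the pipeline step, while the provider block itself is my assumption of a typical setup (azurerm 2.0 and later additionally require an empty `features {}` block).

```hcl
# Values arrive from the pipeline via -var; nothing is hard-coded here.
variable "armsubscriptionid" {}
variable "armtenantid" {}
variable "armclientid" {}
variable "armclientsecret" {}
variable "armobjectid" {}

# Log into Azure as the service principal whose credentials
# were pulled from the KeyVault-backed variable group.
provider "azurerm" {
  subscription_id = "${var.armsubscriptionid}"
  tenant_id       = "${var.armtenantid}"
  client_id       = "${var.armclientid}"
  client_secret   = "${var.armclientsecret}"
}
```

This way the secret never lives in the repository or in a `.tfvars` file; it only exists in the KeyVault and in the memory of the build agent during the run.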
Secure / Closed Loops : Theory
Now let’s take a look at the inside of our closed system… Here we have a typical scenario of a processing layer (Function App, in this case) working with a data layer (Azure Storage, Azure SQL, Azure CosmosDB, …). By providing the processing layer with an identity, we can secure the connection with our data layer. How can we do that? By leveraging a Managed Service Identity! We can then grant it rights to access resources like KeyVault, Azure SQL, … and so on.
Now what’s the beauty of the above? All of this is provisioned and kept as a whole by leveraging Terraform for the landscaping, and the native service integrations of Azure. For instance, the native KeyVault integration with AppService! Terraform has the nice capability that you can use the attributes/values of resources in the definition of other resources. So when creating my Function App, I can immediately set the instrumentation key of my created Application Insights (AppInsights) resource. Likewise for my Azure SQL database, I can immediately set the Service Principal of my subscription as the AAD Admin for my database, and also generate a username & password for my SQL administrator. Afterwards I can lock those up in another KeyVault for a kind of “break glass” emergency situation. Anyhow, aside from our automation, nobody should be able to touch the “infrastructure” as such.
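The “break glass” part could be sketched roughly as follows. It reuses the `random_string.mssqluser` and `random_string.mssqlpassword` resources from the SQL example in the next section; the vault id variable and the secret names are hypothetical, and `key_vault_id` is the argument current azurerm releases expect (older releases may use a different one).

```hcl
# Hypothetical input: the id of the separate "break glass" KeyVault.
variable "breakglass_key_vault_id" {}

# Store the generated SQL administrator credentials so that no human
# ever has to see or type them during a normal deployment.
resource "azurerm_key_vault_secret" "mssql_admin_user" {
  name         = "mssql-admin-user"
  value        = "${random_string.mssqluser.result}"
  key_vault_id = "${var.breakglass_key_vault_id}"
}

resource "azurerm_key_vault_secret" "mssql_admin_password" {
  name         = "mssql-admin-password"
  value        = "${random_string.mssqlpassword.result}"
  key_vault_id = "${var.breakglass_key_vault_id}"
}
```

Keep in mind that the generated values still end up in the Terraform state, so the state storage deserves the same protection as the vault itself.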
Secure / Closed Loops : Example
Now if we would take a look at an example in Terraform for our SQL creation ;
    # Randomly generated administrator name (letters only)
    resource "random_string" "mssqluser" {
      length  = 16
      special = false
      number  = false
    }

    # Randomly generated administrator password
    resource "random_string" "mssqlpassword" {
      length           = 32
      special          = true
      override_special = "/@\" "
    }

    resource "azurerm_sql_server" "mssql" {
      name                         = "${var.prefix_workload}${var.prefix_environment}sqlsrv"
      resource_group_name          = "${azurerm_resource_group.mssql.name}"
      location                     = "${azurerm_resource_group.mssql.location}"
      version                      = "12.0"
      administrator_login          = "${random_string.mssqluser.result}"
      administrator_login_password = "${random_string.mssqlpassword.result}"
    }
Then we see that magically appearing in the Azure Portal ;
Same goes for our AAD Admin… Where we’ll be using our service principal as the AAD Admin ;
    resource "azurerm_sql_active_directory_administrator" "mssql" {
      server_name         = "${azurerm_sql_server.mssql.name}"
      resource_group_name = "${azurerm_resource_group.mssql.name}"
      login               = "sqladmin"
      tenant_id           = "${var.armtenantid}"
      object_id           = "${var.armobjectid}"
    }


Note ; my own user (personal user, so not the service principal) doesn’t even have rights on…
Now how does this look in Terraform? Let’s take a look at an example… where the Azure Function App is created with a system-assigned Managed Service Identity (MSI) ;
    resource "azurerm_function_app" "functionswrite" {
      name                      = "${var.prefix_workload}${var.prefix_environment}${var.functions_identifier}write"
      location                  = "${azurerm_resource_group.functions.location}"
      resource_group_name       = "${azurerm_resource_group.functions.name}"
      app_service_plan_id       = "${azurerm_app_service_plan.functions.id}"
      storage_connection_string = "${azurerm_storage_account.storage.primary_connection_string}"
      version                   = "~2"
      https_only                = "true"

      app_settings {
        "APPINSIGHTS_INSTRUMENTATIONKEY" = "${azurerm_application_insights.appinsights.instrumentation_key}"
      }

      # Without this block the identity attribute is an empty list
      identity {
        type = "SystemAssigned"
      }
    }
Where the KeyVault resource will have an access policy that will refer back to that MSI ;
    resource "azurerm_key_vault_access_policy" "keyvaultpolicymsifunctionswrite" {
      vault_name          = "${azurerm_key_vault.keyvault.name}"
      resource_group_name = "${azurerm_key_vault.keyvault.resource_group_name}"

      # Both ids come straight from the identity of the Function App
      tenant_id = "${azurerm_function_app.functionswrite.identity.0.tenant_id}"
      object_id = "${azurerm_function_app.functionswrite.identity.0.principal_id}"

      secret_permissions = [
        "set",
        "list",
      ]

      depends_on = ["azurerm_function_app.functionswrite"]
    }
Which was all that was needed to create the earlier shown links!
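As for the native KeyVault integration with AppService mentioned earlier: an app setting can point at a secret instead of containing it. The following is only a sketch of how such a reference string could be composed; the variable and the local name are hypothetical, and the identity would need the “get” secret permission for the reference to resolve (the policy above grants “set” and “list”).

```hcl
# Hypothetical input: the full URI of a KeyVault secret, including its version.
variable "sql_password_secret_uri" {}

locals {
  # AppService resolves this value at runtime using the managed identity
  # of the app, so the secret itself never appears in the app settings.
  sql_password_reference = "@Microsoft.KeyVault(SecretUri=${var.sql_password_secret_uri})"
}
```

The local could then be used as the value of an entry inside the `app_settings` block of the Function App, next to the instrumentation key.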
Azure Devops : Codebase Example
If you are looking for some example code to play with… Take a look at the following GitHub repository, which contains the things covered in today’s post. It contains both ;
- The YAML based pipeline scripts for Azure DevOps
- The Terraform scripts to get everything deployed
As you may notice, the pipeline has been split up into “build” & “deploy”. The reason is that, at the moment, everything is a build pipeline (because Azure DevOps doesn’t support YAML for release pipelines yet). When this becomes available, it’ll be easier to get everything untangled… 😉
Closing Thoughts
I can only imagine that this was quite a bit to digest…? Though I hope it has inspired you to see the security proposition in automating your infrastructure deployment. Having worked with various FSI (Financial Services Industry) customers, I can say that this is typically what they are looking for. And, though this might seem odd to some people, the cloud actually provides them with security capabilities that were never executed upon with their “On Premise” solution. A while back, a customer worded this perfectly in an internal discussion ;
“We have the capability to do so for years! Though we haven’t done it… and we probably will never do so. Though Azure has delivered on this, where we were stuck!”
Hi,
It’s giving the following error for me when referring to tenant_id and principal_id:
keyvaultaccess.tf line 10, in resource "azurerm_key_vault_access_policy" "test":
10: object_id = "${azurerm_function_app.functionapp1.identity.0.principal_id}"
|----------------
| azurerm_function_app.functionapp1.identity is empty list of object
The given key does not identify an element in this collection value.
Does the function app have a managed identity assigned?
I was stuck on the “empty list of object” error for hours until I saw I need to add “SystemAssigned” identity. Thank you very much!