Data Workflows in Azure : Taking an end-to-end look from ingest to reporting!

Introduction

There are a lot of scenario’s where organization are leveraging Azure to process their data at scale. In today’s post I’m going to go through the various pieces that can connect the puzzle for you in such a work flow. Starting from ingesting the data into Azure, and afterwards processing it in a scalable & sustainable manner.

 

High Level Architecture

As always, let’s start with a high level architecture to discuss what we’ll be discussing today ;

 

  • Ingest : The entire story starts here, where the data is being ingested into Azure. This can be done via an offline transfer (Azure DataBox), or online via (Azure DataBox Edge/Gateway, or using the REST API, AzCopy, …).
  • Staging Area : No matter what ingestation method you’re using, the data will end up in a storage location (which we’ll now dub “Staging Area”). From there one we’ll be able to transfer it to it’s “final destination”.
  • Processing Area : This is the “final destination” for the ingested content. Why does this differ from the staging area? Cause there are a variety of reasons to put data in another location. Ranging from business rules and the linked conventions (like naming, folder structure, etc), towards more technical reasons like proximity to other systems or spreading the data across different storage accounts/locations.
  • Azure Data Factory : This service provides a low/no-code way of modelling out your data workflow & having an awesome way of following up your jobs in operations. It’ll serve as the key orchestrator for all your workflows.
  • Azure Functions : Where there are already a good set of activities (“tasks”) available in ADF (Azure Data Factory), the ability to link functions into it extends the possibility for your organization even more. Now you can link your custom business logic right into the workflows.
  • Cosmos DB : As you probably want to keep some metadata on your data, we’ll be using Cosmos DB for that one. Where Functions will serve as the front-end API layer to connect to that data.
  • Azure BatchData Bricks : Both Batch & Data Bricks can be directly called upon from ADF, providing key processing power in your workflows!
  • Azure Key Vault : Having secrets lying around & possibly being exposed is never a good idea. Therefor it’s highly recommended to leverage the Key Vault integration for storing your secrets!
  • Azure DevOps : Next to the above, we’ll be relying on Azure DevOps as our core CI/CD pipeline and trusted code repository. We can use it to build & deploy our Azure Functions & Batch Applications, as for storing our ADF templates & Data Bricks notebooks.
  • Application Insights : Key to any successful application is collecting the much needed telemetry, where Application Insights is more than suited for this task.
  • Log Analytics : ADF provides native integration with Log Analytics. This will provide us with an awesome way to take a look at the status of our pipelines & activities.
  • PowerBI : In terms of reporting, we’ll be using PowerBI to collect the data that was pumped into Log Analytics and joining it with the metadata from Cosmos DB. Thus providing us with live data on the status of our workflow!

 

Now let’s take a look at that End-to-End flow!

Continue reading “Data Workflows in Azure : Taking an end-to-end look from ingest to reporting!”

Landscaping a Secure/Closed Loop Infrastructure in Azure with Terraform & Azure Devops

Introduction

Posts about security are always the ones that make everyone get really excited… Or maybe not everyone. 😉 Anyhow, what is typically the weakest link in any security design? Indeed, the human touch… The effects of this can range from having seen secrets to creating drift (unwanted changes vs de expected baseline). In today’s post, I’ll walk you through an example setup that aims to close some additional holes for you. How will we be doing this? By basically automating the entire infrastructure management with Azure Devops & Terraform. Now you’ll probably think, what does that have to do with security? Good response! We’re going to reduce the points to where human contact can interfere with our security measures. Though we want to do this without putting our agility at risk!

 

Blueprint

For this exercise, we’re going to leverage this blueprint ;

Continue reading “Landscaping a Secure/Closed Loop Infrastructure in Azure with Terraform & Azure Devops”

Using Azure Key Vault with your application settings (environment variables) powering Azure Functions

Introduction

Last week the blog post “Simplifying security for serverless and web apps with Azure Functions and App Service” was published. In essence, it talks about how you can integrate Azure Functions with Azure Key Vault in order to retrieve secrets and import them into the application settings (being environment variables). You can do this in a secure manner, by providing the Azure Functions platform with a Managed Service Identity, and granting its underlying service principle with (limited: list & read) rights to the Key Vault.

 

Let’s take a look!

The first thing we’ll need to do, is to enable the “Managed service identity” for our Azure Function plan. Let’s browse to our Azure Function plan, and then select “Platform features”.

Continue reading “Using Azure Key Vault with your application settings (environment variables) powering Azure Functions”

Azure Functions : Compiled or interpreted C#… What impact does it have on my performance?

Introduction

Last week I did a post about how to integrate Compiled Azure Functions working with VSTS… In the closing thoughts I made a statement about my observation that compiled functions had a performance improvement.

 

Here I should have known Nills would challenge me on that… 😉

 

So… #challengeaccepted

Continue reading “Azure Functions : Compiled or interpreted C#… What impact does it have on my performance?”

VSTS & Compiled Azure Functions – How to set up your basic CI/CD pipeline

Introduction

A question that pops up occasionally is how to setup your Azure Functions DevOps flow when you’re using C# underneath. Today’s post will be a brief one to run you through this process. If you should prefer a video on this… That exists too! Curtosiy of the app service product group.

 

Quick Howto

Let’s take a look at the build process. We have (at least, as this flow did not do any testing => “Shame on me!”) three steps in the build process ;

  • Restore the nuget packages
  • Build the solution (and create a single zip file)
  • Publish the artifact

 

So let’s take a look at one of my own builds… First I kick off with installing NuGet on my build agent (should it not already be present).

Continue reading “VSTS & Compiled Azure Functions – How to set up your basic CI/CD pipeline”

When your Single Page App needs CORS and meets Azure API Management with a Function Backend

Introduction

When you have an SPA (Single Page App), all your code is being run inside of your browser. This means that, from a network perspective, you’ll be talking to the APIs directly. It’s often (rightfully) said that SPAs are an untrusted client, where a typical server-side app is seen as a trusted client. Why is an SPA seen as untrusted? Because from the publisher side (the one providing the service/app), you do not control the device running the code. So this has a huge effect on the security risks involved and how you should mitigate them.

 

One of those mitigations is “CORS” ;

Cross-origin resource sharing (CORS) is a mechanism that allows restricted resources (e.g. fonts) on a web page to be requested from another domain outside the domain from which the first resource was served.[1] A web page may freely embed cross-origin images, stylesheets, scripts, iframes, and videos.[2] Certain “cross-domain” requests, notably Ajax requests, are forbidden by default by the same-origin security policy. (Source : Wikipedia)

 

With CORS, the request will indicate from which domain the calls would originate (and what actions / headers it would like to do). Therefore, the backend can check if the call is warranted or not…

Continue reading “When your Single Page App needs CORS and meets Azure API Management with a Function Backend”

PowerApps & Functions – Where low/no-code meets serverless… organizations can create apps faster!

Introduction

Like many organization, you’re probably also looking for a more “rapid development” track for a subset of your applications. I’ve heard a lot of reasons for this… Going from rapid prototyping to having small apps that make life a lot easier within the organization (like typical approval flows). For this we’re going to see how we can combine PowerApps & Azure Functions! By using PowerApps we want to take a low/no-code approach to creating the front-end, where Functions (or even Logic Apps as an alternative) will allow us to provide specific back-end data.

Recipe for today

Today we’ll be using the following ingredients as a base for the recipe of the day ;

Here we’ll be building a small powerapp that’ll call an API (OpenAPI Spec) that is hosted as an Azure function. So basically connecting a low/no-code app to a serverless API.

Continue reading “PowerApps & Functions – Where low/no-code meets serverless… organizations can create apps faster!”