A few weeks back I posted a blog post on how you can leverage “serverless” components for IoT. Today I’ll show you what it would mean to replace the Azure Functions component from that post with Azure Stream Analytics.
The flow between device and event hub remains untouched; we’ll only replace the Functions part with Azure Stream Analytics. So how will the end result look?
We’ll be using one Stream Analytics job to trigger three flows. One will store the data in Azure Table Storage, another will store it as a JSON file in Azure Blob Storage, and the last one will stream it directly into a Power BI dataset.
So let’s take a look at all the components from within this Stream Analytics Flow we’ll be using…
Input : EventHub
We’ll only be using one input, being our event hub.
The messages on the queue are serialized in JSON format. Now onto our three outputs…
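As an illustration, a telemetry message on the queue might be built like this. This is a minimal sketch: the field names (hostname, timestamp, Humidity, Temperature) are taken from the Stream Analytics query later in this post, while the device name and values are hypothetical.

```python
import json
from datetime import datetime, timezone

def build_telemetry_message(hostname, humidity, temperature):
    """Build a JSON-serialized telemetry message as a device might send it.

    Field names match the ones referenced in the Stream Analytics query:
    hostname, timestamp, Humidity, Temperature.
    """
    payload = {
        "hostname": hostname,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "Humidity": humidity,
        "Temperature": temperature,
    }
    return json.dumps(payload)

message = build_telemetry_message("rpi-livingroom", 47.2, 21.5)
print(message)
```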
Output : Power BI
Our first output will be the one for Power BI.
You can authorize the connection and specify the dataset & table name.
Output : Table Storage
Next up is the table storage, where you specify the storage account settings. In addition, I’ve set the partition key to my “hostname” and the row key to my “timestamp”.
I could increase the batch size, but as my PoC environment is pretty “modest”, I’m keeping this at “1”.
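To make the key choice concrete, here is a small sketch (with a hypothetical device name) of how each message maps onto Table Storage keys. Entities with the same partition key are stored together and sorted lexicographically by row key, so an ISO-8601 timestamp keeps the telemetry in time order per device.

```python
def table_keys(message: dict) -> tuple:
    """Derive Table Storage keys the way the output is configured:
    partition key = hostname, row key = timestamp."""
    return message["hostname"], message["timestamp"]

pk, rk = table_keys({
    "hostname": "rpi-livingroom",       # hypothetical device name
    "timestamp": "2017-05-01T14:30:00Z",
    "Humidity": 47.2,
    "Temperature": 21.5,
})
print(pk, rk)
```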
Output : Blob Storage
For archival purposes, I want to keep a history of all my telemetry (the data sent by my devices). Why do I want this? If at a given point I gain new insights (go figure?!?), then I’m able to replay the data I gathered earlier on.
Do notice that I specified a path pattern. This will save the telemetry data per year/month/day/hour…
So I could even go back quite granular without doing anything fancy!
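The year/month/day/hour layout described above can be sketched as a simple path expansion. This is an illustration only; the “telemetry” prefix is an assumption, not the exact pattern from my job.

```python
from datetime import datetime, timezone

def blob_path(prefix: str, event_time: datetime) -> str:
    """Expand an event time into a year/month/day/hour blob path,
    mirroring the path pattern configured on the blob output."""
    return (f"{prefix}/{event_time:%Y}/{event_time:%m}"
            f"/{event_time:%d}/{event_time:%H}")

path = blob_path("telemetry", datetime(2017, 5, 1, 14, 30, tzinfo=timezone.utc))
print(path)  # telemetry/2017/05/01/14
```

With this layout, going back to (say) one specific hour of one specific day is just a matter of listing blobs under that prefix.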
Now let’s take a look at the query…
This is a rather “modest” query, though there are two non-basic traits in there… One is that I’m overriding the timestamp (normally assigned by the event queue) with the one included in my message (as defined by my device); the other is that there are three queries…
SELECT
    hostname,
    timestamp,
    Humidity,
    Temperature
INTO
    [storageTableTelemetry]
FROM
    [ehQueueTelemetry] TIMESTAMP BY Timestamp

SELECT
    *
INTO
    [pbiDatasetTelemetry]
FROM
    [ehQueueTelemetry] TIMESTAMP BY Timestamp

SELECT
    *
INTO
    [storageBlobTelemetryJSON]
FROM
    [ehQueueTelemetry] TIMESTAMP BY Timestamp
For the table storage I’m doing a limited select, whereas for Power BI & blob storage I’m just redirecting everything.
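The difference between the limited select and the passthrough can be sketched as a dict projection. The extra “firmware” field here is purely hypothetical, added to show that a limited select drops anything not listed while SELECT * passes the full message through.

```python
def project(message: dict, columns=None) -> dict:
    """Mimic the query's column handling: a limited SELECT keeps only the
    listed columns (Table Storage output), while SELECT * passes the
    message through untouched (Power BI and Blob Storage outputs)."""
    if columns is None:  # SELECT *
        return dict(message)
    return {c: message[c] for c in columns}

event = {"hostname": "rpi-livingroom", "timestamp": "2017-05-01T14:30:00Z",
         "Humidity": 47.2, "Temperature": 21.5, "firmware": "1.0.3"}

table_row = project(event, ["hostname", "timestamp", "Humidity", "Temperature"])
full_copy = project(event)
```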
Now, let’s test this… Select the three dots next to your input, and either upload sample data or capture it from the input. I’ll be doing the latter here.
… and done!
Now you’ll be shown the results for each of your outputs. This one being the table storage…
And this one being Power BI.
This looks good! So we’ll run that query…
You promised me Power BI too! Yeah, I did… 🙂 So let’s take a look at that one. Bear in mind that I’m using the preview look & feel, so your screen might look a bit different.
As you can see, the dataset was generated as expected… and I’ve already created a report for my telemetry data.
When we take a look at the dataset, we can see this matches the one we specified in our Stream Analytics job.
And I can now create visualizations for my data… Here is an example for the temperature of this room…
… and for the humidity.
How much did this flow cost me?
- Event Hub : As I’m only using one consumer group, I can limit myself to the basic variant (~9€/month), whereas the standard variant would set me back about ~19€/month.
- Stream Analytics : The 24×7 running job is about 19€/month, with a bit extra (0,0008€/GB) for my (rather modest) bandwidth.
- Storage : At 2 euro cents / GB, I hardly notice this cost. The transactions (IOPS) will cost me a bit more, but still nothing compared to the above (which is itself not that big a deal).
- Bandwidth : All bandwidth is “ingress” (from outside to Azure), so that doesn’t cost me anything.
- Power BI : The free tier is sufficient. In my previous flow I had to use the Pro tier if I wanted my data to be refreshed more than once a day (up to 8 times per day). Now I have “near online” information, as Stream Analytics feeds the data directly into Power BI.
In this flow, I’m no longer using “Azure Functions”… So one might say I no longer have a “serverless” approach. Yet as I’m still using only “PaaS/SaaS” building blocks, we’re still running “serverless”. The term is mainly oriented towards not using the typical IaaS compute components.
Azure provides a lot of services, and it’s sometimes a tough job to find the right fit for your workload. Here I must say the move from Azure Functions to Azure Stream Analytics worked out wonderfully for my workload. I’ve upgraded the visibility of my telemetry to “near online”, whilst keeping the costs manageable. The architecture I have here can scale far beyond the single IoT device I currently have, and that for a cost… that’s still very modest!