Azure Machine Learning : Let’s check our IoT dataset for anomalies!

Introduction

Earlier we set up a basic IoT flow that captured temperature & humidity and stored the readings in various outputs. My objective for this week was to create a new flow that leverages one of those outputs and runs anomaly detection on the data received. As this detection might take some time, I did not want to do it “in-line” with my current flow. So I’ve added a new one… which kinda looks like this.

2017-01-05-15_38_29-job-diagram-microsoft-azure

The details of combining the Machine Learning part with Stream Analytics will be for another post, as I’m still struggling a bit to get it fully operational. 😉 So today we’ll “just” cover the Machine Learning aspect of the flow.

 

Disclaimer

To be very clear up front… I’m by no means an expert at machine learning / big data / etc. In my quest to learn, I played around with the Machine Learning Studio of Azure, and I would like to share my experience with it. 😉

 

The flow

What will we be doing today?

  • Train a model, which we’ll use to detect anomalies in the data we want to test.
  • Set up a service flow that will be able to test data against the trained model.
  • Publish (“deploy”) this service so that we can use it from other services (like Stream Analytics!).

 

Azure Machine Learning Studio

Once we have deployed a “Machine Learning Workspace” in Azure, we can see three links ;

2017-01-05-15_47_58-kvaes-iot-ml-microsoft-azure

  • Launch Machine Learning Studio ; To create experiments, train models, etc
  • Launch Machine Learning Gallery ; If you need some inspiration…
  • Launch Machine Learning Web Service Management ; Here we can deploy / manage services (from saved experiments).

We’ll first train a model & create the service. For this we’ll be using the Machine Learning Studio.

2017-01-05-15_30_00-experiments-microsoft-azure-machine-learning-studio

Here you can see I created two experiments: one to train my model, and another to create a service that leverages this model.

2017-01-05-15_46_45-experiments-microsoft-azure-machine-learning-studio

The GUI is very user-friendly and already provides a lot of modules you can use. And if you need some extra juice… you can run your own R/Python ;

2017-01-05-15_52_01-experiments-microsoft-azure-machine-learning-studio

 

Training the model

So we’ll start by training a model that will be used for the anomaly detection. The outcome of my endeavours was the following ;

2017-01-05-15_31_02-experiments-microsoft-azure-machine-learning-studio

  • Import the data(set), using all the historical data stored in my Azure Table Storage.
    2017-01-05-15_31_20-experiments-microsoft-azure-machine-learning-studio
    Which results in the following ;
    2017-01-05-15_32_33-experiments-microsoft-azure-machine-learning-studio
  • I’ll reduce the number of fields that I present to the training module, as these are also the fields the service will expect later on.
    2017-01-05-15_31_32-experiments-microsoft-azure-machine-learning-studio
    2017-01-05-15_31_42-experiments-microsoft-azure-machine-learning-studio
  • The training is done by a “Train Anomaly Detection Model”-module, which is fed the dataset and an (untrained) learning module.
  • For the learning module, we’ll use “PCA-Based Anomaly Detection”. As I know the number of columns, we’ll use the “Single Parameter” training mode, and set it to the number of columns. Not sure what to use? Check the documentation… 😉
    2017-01-05-15_31_53-experiments-microsoft-azure-machine-learning-studio
  • Afterwards we’ll do a test run by using the trained model in combination with our dataset.
    2017-01-05-15_32_57-experiments-microsoft-azure-machine-learning-studio

So you can see that two columns were added ;

  • Scored Labels
  • Scored Probabilities

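“Scored Probabilities” is the anomaly score of the event. As far as I can tell from the documentation of the PCA-based module, it is a normalized reconstruction error, so a high value indicates that the event is an anomaly. “Scored Labels” is the resulting anomaly / no anomaly verdict.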
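To get a feeling for what such a score means, here is a minimal sketch of PCA-based scoring in plain Python. It is not what the Studio module does internally; it only assumes the general idea from the module documentation: fit the principal direction on historical data, then treat the reconstruction error of a new reading as its anomaly score (higher means more anomalous). The two-column layout (temperature, humidity), the normalization `err / (err + scale)` and the `threshold` value are my own assumptions for illustration.

```python
import math


def reconstruction_error(model, row):
    """Distance between a row and its projection on the principal axis."""
    dx = row[0] - model["mean"][0]
    dy = row[1] - model["mean"][1]
    cx, cy = model["component"]
    t = dx * cx + dy * cy  # coordinate along the principal axis
    return math.hypot(dx - t * cx, dy - t * cy)


def fit_pca_1d(rows):
    """Fit a one-component PCA on two-column rows, e.g. (temperature, humidity)."""
    n = len(rows)
    mean_x = sum(r[0] for r in rows) / n
    mean_y = sum(r[1] for r in rows) / n
    # Entries of the 2x2 sample covariance matrix.
    sxx = sum((r[0] - mean_x) ** 2 for r in rows) / (n - 1)
    syy = sum((r[1] - mean_y) ** 2 for r in rows) / (n - 1)
    sxy = sum((r[0] - mean_x) * (r[1] - mean_y) for r in rows) / (n - 1)
    # Closed-form angle of the largest-variance axis of a symmetric 2x2 matrix.
    angle = 0.5 * math.atan2(2 * sxy, sxx - syy)
    model = {
        "mean": (mean_x, mean_y),
        "component": (math.cos(angle), math.sin(angle)),
    }
    # Assumption: use the largest training error as the normalization scale.
    model["scale"] = max(reconstruction_error(model, r) for r in rows)
    return model


def score(model, row, threshold=0.75):
    """Return a label and a normalized score; higher score = more anomalous."""
    err = reconstruction_error(model, row)
    denominator = err + model["scale"]
    normalized = err / denominator if denominator > 0 else 0.0
    return {
        "Scored Labels": 1 if normalized > threshold else 0,
        "Scored Probabilities": normalized,
    }
```

A reading that follows the usual temperature/humidity relationship stays close to the principal axis and gets a score near 0, while a reading far off that axis gets a score near 1.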

 

Once we are happy with the results, we right-click on the “Train Anomaly Detection Model”-module, and select “Save as Trained Model” ;

2017-01-05-16_03_00

Follow that flow…
2017-01-05-16_03_10-experiments-microsoft-azure-machine-learning-studio
And it will appear in the “Trained Models”-section ;
2017-01-05-16_33_01-trained-models-microsoft-azure-machine-learning-studio

So we’re all ready to use this model in our service!

 

Creating the service

Next up we’ll create the service… which will have the following topology ;

2017-01-05-15_33_27

When you take a look at this picture, you’ll notice a “toggle” just above “Save”. This allows you to switch between the “experiment” and “web service” view. Depending on the view, the input & output will be different…

The “grayed” out blocks (“web service input” and “web service output”) are only active in the “web service”-view, whereas the “import” data flow is only relevant in the “experiment”-view.

What else do we see here? Pretty much the same as in the part where we trained the model. The only difference is that we’ll do the scoring via the trained model we saved.
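The split between “train once, save the model” and “score incoming data with the saved model” can be sketched in a few lines of Python. To keep it short I swapped the PCA module for a simple z-score stand-in, so this is not what the service runs. The column names `temperature` and `humidity`, the `limit` value and the output field names are hypothetical. What it does show is why the service has to receive the same fields the model was trained on.

```python
import json
import statistics

# Hypothetical column names; the service must get the same ones as the training.
FEATURES = ["temperature", "humidity"]


def train(rows):
    """Training experiment: learn the mean and standard deviation per column."""
    return {
        name: {
            "mean": statistics.mean(r[name] for r in rows),
            "stdev": statistics.stdev(r[name] for r in rows),
        }
        for name in FEATURES
    }


def save_model(model):
    """Stand-in for "Save as Trained Model": serialize the learned parameters."""
    return json.dumps(model)


def score(saved_model, record, limit=3.0):
    """Service experiment: score one incoming record with the saved model."""
    model = json.loads(saved_model)
    missing = [name for name in FEATURES if name not in record]
    if missing:
        raise ValueError(f"missing input columns: {missing}")
    # Largest deviation from the training mean, in standard deviations.
    z = max(
        abs(record[name] - model[name]["mean"]) / model[name]["stdev"]
        for name in FEATURES
    )
    return {"largest_z_score": z, "is_anomaly": z > limit}
```

Extra fields in the incoming record are simply ignored, while a missing field is rejected. This sketch also assumes the training data is not constant per column, as a standard deviation of zero would break the division.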

Once we are happy with the results, we’ll click on the “deploy web service”-button, which will take us through a nice wizard…

 

Managing the service

When you go to the “Azure Machine Learning Web Services”-portal, you can see the web services you have published and the plans that are active. Under “Web Services”, you can select your web service…

2017-01-05-15_34_03-web-services-management

And here you’ll be able to test your service. Let’s try that one…

2017-01-05-15_34_43-web-services-management

And that works great!

Want to test it out via another application? Go to the “Consume”-tab and get the information you need! 😉
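As a rough idea of what calling the service from your own code could look like, here is a small Python sketch using only the standard library. Treat everything specific in it as an assumption: the URL and key are placeholders you would copy from the “Consume”-tab, the input name `input1` and the column names depend on your own experiment, and the request body follows the shape of the sample code that tab typically generates. Always check the tab for the exact schema of your service.

```python
import json
import urllib.request


def build_scoring_request(url, api_key, record):
    """Build (but do not send) a request-response call for one record."""
    body = {
        # "input1" is assumed to be the name of the web service input.
        "Inputs": {"input1": [record]},
        "GlobalParameters": {},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_key,
        },
        method="POST",
    )


def call_service(request):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Building the request separately from sending it makes it easy to inspect the payload before anything goes over the wire. To call the service for real, pass the result of `build_scoring_request(...)` (with the URL and primary key from the “Consume”-tab) to `call_service(...)`.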

2017-01-05-16_40_39

 

Closing Thoughts

Bearing in mind that this was my first machine learning experiment, I think the output is still nice. So I can conclude that this service is quite user-friendly. One of the handiest features I found was that each step/module has the ability to visualize the dataset at that point. So you can “debug” very easily… Despite being a “noob” in this, I must say I’m very impressed & inspired to work more with it.

 
