Azure : Benchmarking SQL Database Setups – To measure is to know, and being able to improve…

Introduction

To measure is to know. If you can not measure it, you cannot improve it!

Today’s post will go more in-depth on what performance to expect from different SQL implementations in Azure. We’ll be focussing on two kind of benchmarks ; the storage subsystem and an industry benchmark for SQL. This so that we can compare the different scenario’s to each other in the most neutral way possible.

to-measure-is-to-know-storage-database-performance-kvaes

Test Setup

As a test bed I started from one of my previous posts

kvaes-azure-sql-cluster-sios-datakeeper-high-availability-ha

The machines I used were DS1 v2 machines when using single disks and a DS2 v2 machines when using multiple disks. In terms of OS, I’ll be using Windows 2012 R2 and MSSQL 2014 (12.04100.1) as database.

 

Storage Benchmark

The first part of the benchmark was to see how the storage subsystem performed. Here I did several test scenario’s whilst using the “storage performance benchmarker” ;

  • Using a 1TB standard storage disk with SIOS Sync Enabled
  • Using a 1TB standard storage disk with SIOS Sync Disabled
  • Using a P10 premium storage with SIOS Sync Enabled
  • Using a P20 premium storage with SIOS Sync Enabled
  • Using four P10 premium storage with Storage Spaces & SIOS Sync Disabled
  • Using four P10 premium storage with a traditional software stripe & SIOS Sync Disabled
  • Using four P10 premium storage with Storage Spaces & SIOS Sync Enabled

The results ;

2016-04-29 22_28_27-Test-Results-Summary.xlsx - Excel - kvaes

If you want to go more in-depth, the detailed results can be found here ; http://storage.kvaes.be/system/details/KVAESSQL21/

What can we conclude from this?

  • Using SIOS Datakeeper does have an impact on your write performance.
  • More disks improve your “SmallIO” workloads, but you don’t gain much for your “LargeIO” workloads.
  • Using Storage Spaces has no negative/positive impact when compared to a traditional approach (in terms of performance).

 

TPC-H Benchmark

For testing the database, I’ll be using an industry benchmark set called “TPC-H” ;

The TPC Benchmark™H (TPC-H) is a decision support benchmark. It consists of a suite of business oriented ad-hoc queries and concurrent data modifications. The queries and the data populating the database have been chosen to have broad industry-wide relevance. This benchmark illustrates decision support systems that examine large volumes of data, execute queries with a high degree of complexity, and give answers to critical business questions.

The performance metric reported by TPC-H is called the TPC-H Composite Query-per-Hour Performance Metric (QphH@Size), and reflects multiple aspects of the capability of the system to process queries. These aspects include the selected database size against which the queries are executed, the query processing power when queries are submitted by a single stream, and the query throughput when queries are submitted by multiple concurrent users. The TPC-H Price/Performance metric is expressed as $/QphH@Size.

In regards to tooling, I tried a few and stuck with the “Dell Benchmark Factory for Databases“. It was very intuitive & user friendly whilst using it throughout the extended benchmarking.

What scenario’s did I test?

  • Use a page blob directly located on my (standard) storage account ; one data file, one log file
  • Use six page blobs directly located on my (standard) storage account ; three data files, three log files
  • Use a single local disk of 1TB standard storage (including SIOS Datakeeper); both my log & data file were on the same disk
  • Use a two local disk of 1TB standard storage (including SIOS Datakeeper); One for my log, one for my data
  • An S3 Azure SQL
  • An S2 Azure SQL
  • An S1 Azure SQL
  • One local P10 disk with SIOS Datakeeper ; log & data on same volume
  • One local P20 disk with SIOS Datakeeper ; log & data on same volume
  • Two page blob (127GB = P10) directly located on my premium storage account ; 1x Log, 1x Data
  • Two page blob (511GB = P20) directly located on my premium storage account ; 1x Log, 1x Data
  • Six page blob (127GB = P10) directly located on my premium storage account ; 3x Log, 3x Data
  • Six page blob (511GB = P20) directly located on my premium storage account ; 3x Log, 3x Data
  • Four local P10 disks bundled via Storage Spaces (column count of four) with SIOS Datakeeper on it ; log & data on same volume

 

What were the results?

  • Average Response Times
    2016-04-29 22_06_53-Test-Results-Summary.xlsx - Excel - kvaes
  • TPS
    2016-04-29 22_53_10-Test-Results-Summary.xlsx - Excel -kvaes
  • TPC-H Score
    2016-04-29 22_53_24-Test-Results-Summary.xlsx - Excel -kvaes

 

What can we conclude from this?

  • Using more data files does not directly improve your performance. The low level routines will use a kind of round robin load balancing mechanism. So multi threaded / users applications will benefit from this, though single threaded workloads will actually suffer slightly.
  • Standard storage is to be avoided for SQL workloads.
  • The difference between a P10 and a P20 isn’t that big as one would expect.
  • Striping data via Storage Spaces has clear advantages above a single disk or multiple data files directly on the storage account.
  • The Azure SQL (PaaS) services aren’t bad (starting from S2), but they aren’t great either… Always make sure to benchmark!

 

Other observations

  • Taking database backups directly to a storage account endures the limitation of one backup file. Where the Azure documentation prescribes this as a best practice… My personal experience is that the performance of this setup is bad.
  • Where I really like the concept of data files directly on the storage account… I must be honest and admit that it’s not a good choice in terms of performance. Here I’m still dreaming of a MSSQL inside a container with its data files directly on the storage account. Though there is still a lot of engineering work to be done in order to achieve this in a performant manner.
  • You might think that the direct usage of the storage account might help you with high availability… Though you cannot place all data via this manner. So you are still stuck with the storage replication, like foreseen by SIOS.

 

TL;DR

  • Always benchmark! Before you deploy any production workload on it, always do a benchmark… You want to know how a system performs.
  • Use a benchmark test that you can repeat and that it product agnostic. For instance ; the TPC-H benchmark can be used for MSSQL, but also for Oracle, My/MariaSQL, PostGre, etc…
  • Some results differed greatly from what I would have expected on my blueprint… So again, always benchmark!

 

Anyhow, I hoped you found these benchmark tests useful! I learned a great deal by doing them, where I can say that I was somewhat surprised with the amount of time it took to gather these. As an indication, I can say that the above information took me about a week in terms of benchmarking/tweaking/etc in order to get to make the correct conclusion(s).

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s