Ever tried the mix of Azure, SQL Server, Storage Spaces Direct & Scale Out File Server?

Introduction

A while back I posted a blog post how to setup a High Available SQL cluster on Azure using SIOS Datakeeper. As I’m an avid believer of storage spaces, I was looking for a moment to test drive “storage spaces direct” on Azure. The blog post of today will cover that journey…

UPDATE (01/02/2017) ; At this point, there is no official support for this solution. So do not implement it for production at this point. As soon as this changes, I’ll update this post accordingly!

UPDATE (08/02/2017) ; New official documentation has been released. Though I cannot find official support statements.

 

Solution Blueprint

What do we want to build today?

  • A two node cluster which will be used as a Failover Cluster Instance for MSSQL.
  • As a quorum, we’ll be using the cloud witness feature of Windows 2016 in combination with an Azure storage account.
  • In regards to storage, we’ll create a Scale Out File Server setup which will leverage the local disks of the two servers via Storage Spaces Direct.
  • To achieve a “floating IP”, we’ll be using the Azure LoadBalancer setup (as we did in the last post).

kvaes-sql-cluster-s2d-sofs-azure

 

Continue reading “Ever tried the mix of Azure, SQL Server, Storage Spaces Direct & Scale Out File Server?”

Azure : Benchmarking SQL Database Setups – To measure is to know, and being able to improve…

Introduction

To measure is to know. If you can not measure it, you cannot improve it!

Today’s post will go more in-depth on what performance to expect from different SQL implementations in Azure. We’ll be focussing on two kind of benchmarks ; the storage subsystem and an industry benchmark for SQL. This so that we can compare the different scenario’s to each other in the most neutral way possible.

to-measure-is-to-know-storage-database-performance-kvaes

Test Setup

As a test bed I started from one of my previous posts

kvaes-azure-sql-cluster-sios-datakeeper-high-availability-ha

The machines I used were DS1 v2 machines when using single disks and a DS2 v2 machines when using multiple disks. In terms of OS, I’ll be using Windows 2012 R2 and MSSQL 2014 (12.04100.1) as database.

Continue reading “Azure : Benchmarking SQL Database Setups – To measure is to know, and being able to improve…”

Azure : MSSQL – Impact of parameters on Backup Performance

A while ago we experienced an issue with slow backup performance on an Azure VM. In light of the experience we had on that case, I went into my lab environment to test the impact of different parameters…

What is the test environment we’ll be using?

  • Virtual Machine : D2
  • Storage : Z-drive, based upon storage spaces, with one disk (100GB / LRS / Standard)
  • Paths : SQL Binaries, Data & Log files on Z-drive. Backup destination used was the temporary drive (D).
  • Database : AdventureWorks increased several times up till 1,5GB (using a set of SQL commands by Jonathan Kehayias), compared to the 250MB.

 

Test Run A
Command

BACKUP DATABASE [AdventureWorks2012]
TO DISK = N’d:\Backup\Test.bak’ WITH NOFORMAT, NOINIT,
NAME = N’HC_TST-Full Database Backup’, SKIP, NOREWIND, NOUNLOAD, STATS = 10

Result

Processed 192888 pages for database ‘AdventureWorks2012’, file ‘AdventureWorks2012_Data’ on file 1.
Processed 405530 pages for database ‘AdventureWorks2012’, file ‘AdventureWorks2012_Log’ on file 1.
BACKUP DATABASE successfully processed 598418 pages in 211.441 seconds (22.110 MB/sec).

Comment

We started out using a basic backup command… Nothing fancy, and we got 22MB/s, so not that great either.

 

Test Run B
Command

BACKUP DATABASE [AdventureWorks2012]
TO DISK = N’d:\Backup\Test.bak’ WITH INIT,
NAME = N’HC_TST-Full Database Backup’, SKIP, COMPRESSION, NOREWIND, NOUNLOAD, STATS = 10

Result

Processed 192888 pages for database ‘AdventureWorks2012’, file ‘AdventureWorks2012_Data’ on file 1.
Processed 405535 pages for database ‘AdventureWorks2012’, file ‘AdventureWorks2012_Log’ on file 1.
BACKUP DATABASE successfully processed 598423 pages in 152.066 seconds (30.744 MB/sec).

Comment

Now we added compression, and the backup command went up by 50%.

 

Test Run C
Command

BACKUP DATABASE [AdventureWorks2012original]
TO DISK = N’d:\Backup\Test.bak’ WITH INIT,
NAME = N’HC_TST-Full Database Backup’, SKIP, COMPRESSION, NOREWIND, NOUNLOAD, STATS = 10

Result

Processed 24192 pages for database ‘AdventureWorks2012original’, file ‘AdventureWorks2012_Data’ on file 1.
Processed 2 pages for database ‘AdventureWorks2012original’, file ‘AdventureWorks2012_Log’ on file 1.
BACKUP DATABASE successfully processed 24194 pages in 6.776 seconds (27.894 MB/sec).

Comment

Running the same command on the original (smaller) database had a performance that was a tad slower.

 

Test Run D
Command

BACKUP DATABASE [AdventureWorks2012]
TO DISK = N’d:\Backup\Test.bak’ WITH INIT,
NAME = N’HC_TST-Full Database Backup’, SKIP, COMPRESSION, NOREWIND, NOUNLOAD, STATS = 10, BLOCKSIZE = 65536, MAXTRANSFERSIZE=2097152

Result

Processed 192888 pages for database ‘AdventureWorks2012’, file ‘AdventureWorks2012_Data’ on file 1.
Processed 405539 pages for database ‘AdventureWorks2012’, file ‘AdventureWorks2012_Log’ on file 1.
BACKUP DATABASE successfully processed 598427 pages in 119.499 seconds (39.123 MB/sec).

Comment

Now we’ll add a fixed block size and max transfer size. Be aware that the blocksize is 65k for tape devices and 512 for anything else. So we’ll beefed that up and got a bit less than double from where we started.

 

Test Run E
Command

BACKUP DATABASE [AdventureWorks2012]
TO DISK = N’d:\Backup\Test.bak’ WITH INIT,
NAME = N’HC_TST-Full Database Backup’, SKIP, COMPRESSION, NOREWIND, NOUNLOAD, STATS = 10,BUFFERCOUNT = 20, BLOCKSIZE = 65536, ,MAXTRANSFERSIZE=2097152

Result

Processed 192888 pages for database ‘AdventureWorks2012’, file ‘AdventureWorks2012_Data’ on file 1.
Processed 405546 pages for database ‘AdventureWorks2012’, file ‘AdventureWorks2012_Log’ on file 1.
BACKUP DATABASE successfully processed 598434 pages in 100.559 seconds (46.492 MB/sec).

Comment

Now we’ll add a buffer count and things heat up even more.

 

Test Run F
Command

BACKUP DATABASE [AdventureWorks2012]
TO DISK = N’d:\Backup\Test.bak’ WITH INIT,
NAME = N’HC_TST-Full Database Backup’, SKIP, COMPRESSION, NOREWIND, NOUNLOAD, STATS = 10,BUFFERCOUNT = 100, BLOCKSIZE = 65536, MAXTRANSFERSIZE=2097152

Result

Processed 192888 pages for database ‘AdventureWorks2012’, file ‘AdventureWorks2012_Data’ on file 1.
Processed 405550 pages for database ‘AdventureWorks2012’, file ‘AdventureWorks2012_Log’ on file 1.
BACKUP DATABASE successfully processed 598438 pages in 97.435 seconds (47.983 MB/sec).

Comment

And let’s pump up the buffer count… a tad better.

 

Test Run G
Command

BACKUP DATABASE [AdventureWorks2012original]
TO DISK = N’d:\Backup\Test.bak’ WITH INIT,
NAME = N’HC_TST-Full Database Backup’, SKIP, COMPRESSION, NOREWIND, NOUNLOAD, STATS = 10, BUFFERCOUNT = 100, BLOCKSIZE = 65536, MAXTRANSFERSIZE=2097152

Result

Processed 24192 pages for database ‘AdventureWorks2012original’, file ‘AdventureWorks2012_Data’ on file 1.
Processed 3 pages for database ‘AdventureWorks2012original’, file ‘AdventureWorks2012_Log’ on file 1.
BACKUP DATABASE successfully processed 24195 pages in 4.437 seconds (42.600 MB/sec).

Comment

Doing the same with the smaller database gives us a lower performance.

 

Test Run H
Command

BACKUP DATABASE [AdventureWorks2012]
TO DISK = N’d:\Backup\Test.bak’ WITH INIT,
NAME = N’HC_TST-Full Database Backup’, SKIP, COMPRESSION, NOREWIND, NOUNLOAD, STATS = 10, BUFFERCOUNT = 100, BLOCKSIZE = 65536, MAXTRANSFERSIZE=2097152

Result

Processed 648352 pages for database ‘AdventureWorks2012’, file ‘AdventureWorks2012_Data’ on file 1.
Processed 1592447 pages for database ‘AdventureWorks2012’, file ‘AdventureWorks2012_Log’ on file 1.
BACKUP DATABASE successfully processed 2240799 pages in 260.300 seconds (67.254 MB/sec).

Comment

I ran the “increase” script a few more time to have a database from about 5.2GB. The performance is better than the two other smaller ones.

Conclusion

  • Parameters : Choosing your parameters matters A LOT. We had a case that went from 9MB/s to 90MB/s by using the same set of parameters as above…
  • Database Size : There is a ramp up time involved… Small databases experience a penalty due to this.
  • Design : Be aware of your disk subsystem design. Using one standard disk will provide you with 500 IOPS/s. If you use this disk for both source & target, then you don’t need to be a professor to know that this won’t yield a big performance.

Storage Performance Benchmarker 0.3 – DISKSPD option!

With this, I’m happy to announce the new release of the “Storage Performance Benchmarker“! The previous version was heavily relying on “SQLIO”, where this version offers you the ability to choose between “DISKSPD” (default) and “SQLIO”. The output will still be aggregated in the same manner towards the backend web interface, though the individual output locally will be in the format of the relative output.

2015-08-31 15_42_15-Administrator_ Windows PowerShell

Parameters added ;

  • -TestMethod : Either “DISKSPD” or “SQLIO”, depending on your preference
  • -TestWarmup : The warmup time used if you use “DISKSPD”

If you have any suggestions/comments, feel free to let me know!

chart-55e45a286e95b

Microsoft Azure : Benchmarking SQL Databases

8-4-2013-1-47-24-am1

Introduction
Azure also provides a PaaS-environment for SQL databases. In my effort to get familiar with them, I also ran a benchmark between all editions, which I wanted to share with you. So today we will run a benchmark versus each database edition to see how they will cope!

2015-02-05 11_22_39-Azure SQL Database Service Tiers and Performance Levels

Test/Benchmark Environment

Test System

  • Specs : Azure Basic A3 (4 cores, 7 GB memory)
  • Software : Database Benchmark + SQL Mgmt Studio
  • Network : Internal Virtual Network + Public Internet Connectivity

Test Databases

  • Basic Edition – “BASIC-B0” – 5 DTU
  • Standard Edition S0 – “STD-S0” – 10 DTU
  • Standard Edition S1 – “STD-S1” – 20 DTU
  • Standard Edition S2 – “STD-S2” – 50 DTU
  • Standard Edition S3 – “STD-S3” – 100 DTU
  • Premium Edition P1 – “PREMIUM-P1” – 100 DTU
  • Premium Edition P2 – “PREMIUM-P2” – 200 DTU
  • Premium Edition P3 – “PREMIUM-P3” – 800 DTU

How does that look from the SQL Management Studio?
2015-02-04 14_49_58-191.233.82.148_55325 - Remote Desktop Connection

Test Method

  • All the test were run sequentially
  • 5 tables of 10.000 records were used as test environment
  • The SQL2012 benchmark was used. (so not the compact one)

Results
And now on to the results…

Write
2015-02-04 23_19_44-191.233.84.122_55325 - Remote Desktop Connection
A difference is visible between all database editions, except the “premium”-edition looks to to be on par with eachother. At this time I cannot judge to the cause of this, where I’m assuming the test system or test method reached a limit (to be updated!).

Read
2015-02-04 23_20_12-191.233.84.122_55325 - Remote Desktop Connection
Apart from the “basic”-edition, all editions seem to be on par.

Secondary Read
2015-02-04 23_20_30-191.233.84.122_55325 - Remote Desktop Connection
After the cache has set in, even the “basic”-edition catches up.

And what can I do with it?
You can now run the benchmarks on your own systems and see how they compare… It will provide you with an insight that will render you capable of comparing performance beforehand in regards to a possible implementation.

Implementation Consideration
Once considering the Azure Databases, be ware that you are limited to SQL authentication. Many enterprises have integrated their SQL rights management in Active Directory, which is not possible with the current implementation of Azure.

2015-02-04 14_47_17-Azure SQL Database Security Guidelines and Limitations

Microsoft Azure : Benchmark Tests – Storage – How do the different series relate to each other?

Azure currently has different “series” of machines. The A-series are seen as “general purpose” machines, where the D-series are targeted towards compute optimization. In the US, the G-series have even seen the light! Today I want to know what the effect of this is on storage performance… Typical IT organizations are worried by storage performance in the cloud as their ERP/BI implementation is “quite eager” to obtain the maximum storage performance.

So what will we be covering today?

  • A bit of theory concerning the differences
  • Test Environment Explained
  • Test Method Explained
  • Display of test results
  • Conclusion / analysis of the test results

 

A bit of theory concerning the differences

What does Microsoft say about their series ;

  • General purpose compute (A) – Basic tier : An economical option for development workloads, test servers, and other applications that don’t require load balancing, auto-scaling, or memory-intensive virtual machines.
  • General purpose compute (A) – Standard tier : Offers the most flexibility. Supports all virtual machine configurations and features.
  • Optimized compute (D) : 60% faster CPUs, more memory, and local SSD – D-seres virtual machines feature solid state drives (SSDs) and 60% faster processors than the A-series and are also available for web or worker roles in Azure Cloud Services. This series is ideal for applications that demand faster CPUs, better local disk performance, or higher memories.
  • Performance optimized compute (G) : unparalleled computational performance with latest CPUs, more memory, and more local SSD – G-series virtual machines feature latest Intel® Xeon® processor E5 v3 family, two times more memory and four times more Solid State Drive storage (SSDs) than the D-series. G-series will provide unparalleled computational performance, more memory and more local SSD storage than any current VM size in the public cloud making it very ideal for your most demanding applications.

Sidenote ; Azure has also released “DS” (“Premium Storage“). We won’t be looking into this area, as it is current still under preview.

Today we’ll be checking what we can get out of those machines via benchmarking. Be aware, that Microsoft is open towards the IOPS delivered by each machine. Be sure to check out the support article “Virtual Machine and Cloud Service Sizes for Azure“. Depending on the type of virtual machine, you can attach a maximum amount of disks. Per disk, you are granted a given number of IOPS. The amount of IOPS granted differs by “series”. An “A – Basic” will be granted 300 IOPS per disk. An “A – Standard”, “D” & “G” will be granted 500 IOPS per disk, where the “DS” will be granted 1600 IOPS per disk.

 

Test Environment Explained

We’ll be creating four machines today ;

  • TEST-BSC-A1 : A1 Basic (West Europe)2015-01-27 16_13_03-Virtual machines - Windows Azure
  • TEST-STD-A1 : A1 Standard (West Europe)
    2015-01-27 16_12_48-Virtual machines - Windows Azure
  • TEST-STD-D1 : D1 Standard (West Europe)
    2015-01-27 16_13_24-Virtual machines - Windows Azure
  • TEST-STD-G1 : G1 Standard (West US*)
    2015-01-28 08_48_56-Virtual machines - Windows Azure

Each machine will be installed with Ubuntu 14.04 with the Azure image of 23/01/2015. The system will then be foreseen with two benchmarking tools ;

These packages will be installed from the Azure Ubuntu Repositories by using the following method. First make sure to uncomment all “multiverse” repositories.

sudo vi /etc/apt/sources.list

Then do an update of the packages list and install both softwares

sudo apt-get update && sudo apt-get install bonnie++ iozone3

 

(Disclaimer : For the test with the G1, I created an additional disk, as the base OS disk was not large enough to fit the test file. Bonnie++ advises to create a test file that is twice the size of the memory. This to counter caching mechanisms. / Update : One error I made, was that I the host caching is disabled by default, so some results on the G1 are not aligned with the other tests. This is only relevant towards the Bonnie++ tests, not to the IOzone tests.)

 

Test Environment / Method Explained

Now we are ready to go… On each system the following commands were executed ;

bonnie++ -d /tmp > /tmp/bonnie.txt

iozone -R -l 5 -u 5 -r 4k -s 100m -F /tmp/f1 /tmp/f2 /tmp/f3 /tmp/f4 /tmp/f5 > /tmp/iozone_results.txt

iozone -R -l 5 -u 5 -r 4k -s 100m -F /mnt/f1 /mnt/f2 /mnt/f3 /mnt/f4 /mnt/f5 > /tmp/iozone_results-mnt.txt

So what are we basically going to do? A good description about what IOzone will do can be found in the article “I Feel the Need for Speed: Linux File System Throughput Performance, Part 1” of Linux Magazine. The highlights ;

IOzone

IOzone is open-source and written in ANSI C. It is capable of single thread, multi-threaded, and multi-client testing. The basic idea behind IOzone is to break up a file of a given size into records. Records are written or read in some fashion until the file size is reached. Using this concept, IOzone has a number of tests that can be performed:

  • WriteThis is a fairly simple test that simulates writing to a new file. Because of the need to create new metadata for the file, many times the writing of a new file can be slower than rewriting to an existing file. The file is written using records of a specific length (either specified by the user or chosen automatically by IOzone) until the total file length has been reached.

  • Re-writeThis test is similar to the write test but measures the performance of writing to a file that already exists. Since the file already exists and the metadata is present, it is commonly expected for the re-write performance to be greater than the write performance. This particular test opens the file, puts the file pointer at the beginning of the file, and then writes to the open file descriptor using records of a specified length until the total file size is reached. Then it closes the file which updates the metadata./LI>

  • ReadThis test reads an existing file. It reads the entire file, one record at a time.

  • Re-readThis test reads a file that was recently read. This test is useful because operating systems and file systems will maintain parts of a recently read file in cache. Consequently, re-read performance should be better than read performance because of the cache effects. However, sometimes the cache effect can be mitigated by making the file much larger than the amount of memory in the system.

  • Random ReadThis test reads a file with the accesses being made to random locations within the file. The reads are done in record units until the total reads are the size of the file. The performance of this test is impacted by many factors including the OS cache(s), the number of disks and their configuration, disk seek latency, and disk cache among others.

  • Random WriteThe random write test measures the performance when writing a file with the accesses being made to random locations with the file. The file is opened to the total file size and then the data is written in record sizes to random locations within the file.

  • Backwards ReadThis is a unique file system test that reads a file backwards. There are several applications, notably, MSC Nastran, that read files backwards. There are some file systems and even OS’s that can detect this type of access pattern and enhance the performance of the access. In this test a file is opened and the file pointer is moved 1 record forward and then the file is read backward one record. Then the file pointer is moved 2 records backward in the file, and the process continues.

  • Record RewriteThis test measures the performance when writing and re-writing a particular spot with a file. The test is interesting because it can highlight “hot spot” capabilities within a file system and/or an OS. If the spot is small enough to fit into the various cache sizes; CPU data cache, TLB, OS cache, file system cache, etc., then the performance will be very good.

  • Strided ReadThis test reads a file in what is called a strided manner. For example, you could read data starting at a file offset of zero, for a length of 4 KB, then seek 200 KB forward, then read for 4 KB, then seek 200 KB, and so on. The constant pattern is important and the “distance” between the reads is called the stride (in this simple example it is 200 KB). This access pattern is used by many applications that are reading certain data structures. This test can highlight interesting issues in file systems and storage because the stride could cause the data to miss any striping in a RAID configuration, resulting in poor performance.

  • FwriteThis test measures the performance of writing a file using a library function “fwrite()”. It is a binary stream function (examine the man pages on your system to learn more). Equally important, the routine performs a buffered write operation. This buffer is in user space (i.e. not part of the system caches). This test is performed with a record length buffer being created in a user-space buffer and then written to the file. This is repeated until the entire file is created. This test is similar to the “write” test in that it creates a new file, possibly stressing the metadata performance.

  • FrewriteThis test is similar to the “rewrite” test but using the fwrite() library function. Ideally the performance should be better than “Fwrite” because it uses an existing file so the metadata performance is not stressed in this case.

  • FreadThis is a test that uses the fread() library function to read a file. It opens a file, and reads it in record lengths into a buffer that is in user space. This continues until the entire file is read.

  • FrereadThis test is similar to the “reread” test but uses the “fread()” library function. It reads a recently read file which may allow file system or OS cache buffers to be used, improving performance.

When taking a look at Bonnie++, check out this article by TextualityMy objective is to gain a proper insight towards the latencies with Bonnie++ and use IOzone for the actual thoughput.

 

Display of test results

Latency

2015-01-28 11_57_47-Bonnie.xlsx - Excel  2015-01-28 12_02_31-Bonnie.xlsx - Excel

Throughput

2015-01-28 12_06_15-iometer.xlsx - Excel

2015-01-28 12_01_58-iometer.xlsx - Excel

Download Raw Results Files

Conclusion / analysis of the test results

So what have we learned today?

  • The latency of the A-series is significately higher than those of the D/G-series.
  • There is a performance difference between the “Basic” and “Standard” of the A-series.
  • Whilst the D-series outperform the A-series, the G-series put all of the others in the dark.
  • There is a performance answer to all loads… Just choose wisely!