Qualifying Heats

To confirm that our new hardware will live up to expectations regarding disk throughput means we have to first run the same series of of storage speed tests that we ran to discover our initial problem. Afterwards, we want to run tests that are a reflection of our current workload to see how the new gear will work under real-world conditions. Some utilities are very easy to use and convey the performance improvement very easily to management. Other utilities have you go deep into the weeds to test your gear under different, extreme conditions.

Review of the Rules

Once the rack ‘n’ stack was done for our new gear, we used a combination of three different utilities to test performance and referenced Microsoft’s Fast Track Daa Warehouse white paper regarding proper test configurations. To do a proper test, you’ll be trying multiple configurations for different runtimes with each of the utilities. You’ll be recording them on a spreadsheet to plot them on graphs, and you may need to make some assumptions on how your gear will be used.

For our data warehouse, we had a relatively high use of of TempDB and pre-aggregated tables throughout the day to compensate for the low throughput and minimal RAM footprint. With the newer gear and a superior amount of RAM and throughput, some of these practices can be discontinued potentially.

Also keep in mind, when running performance tests on a host attached to a SAN,that what you do to the SAN to stress test it may affect other hosts using the same SAN. On our older gear, this was a genuine concern so any stress tests we ran in the past had to be done 1) in the off hours and 2) with short runtimes. Fortunately, for the newer gear, since our SAN enclosure is dedicated solely to our production and UAT hosts, and since we were not live yet, performance tests can be done during the day.

When you’re running these performance tests, your goal is to try to get the storage to work at peak capacity to avoid avoid having a skew in the results due to caching. Work with as big as a test file as you can generate on the space available and try to do as many passes as possible to identify and isolate any outliers. If you don’t obtain a consistent set of performance results, you may need to review your configuration.

Gazing into the Crystal ball

Our first utility to test performance of our new disk subsystem was a repeat visit to Crystal DiskMark. It’s a Windows utility that offers basic features to test your disk subsystem, namely the size of your test file and how many tests you want to run. The tests that run are predefined. There are three tests that use 4K-sized IOs, which are smaller than the smallest 8K sized operations for SQL Server. There is also one sequential test that is comparable to doing full table scans of the fact table or running backup operations; this number will be more meaningful for data warehousing workloads. The big take away using Crystal DiskMark is that it will give you a quick number that you can show your management and your SAN team indicating whether the configuration is in the ballpark.

I have the need, the need for DiskSpd

Once you’ve tried a series of tests using Crystal DiskMark, you’ll want to now work with its underlying engine for more precise test configurations. Crystal DiskMark uses Microsoft’s utility DiskSpd, which is the replacement for the venerable console application SQLIO.

You’ll have a love-hate relationship with DiskSpd. You’ll love the fact that there are many different configuration options to set up your tests.

You can set the size of your IOs so you’re not pegged in at 4K or 512 K.
You can set your runtime.
You can set the number of outstanding requests per thread.
You can disable hardware buffering and software buffering which SQL does.
You can set the number of threads to match the number of cores you have.
You can specify the mix between reads and writes.
You can even specify a a warm-up period before you start recording statistics.

You’ll also love the data that you get back from the tests. You’ll have a summary of your configuration, how your individual CPUs did, how many bytes went through each thread, and your overall latency. The part you will hate is having to scribe this information to plot it into graphs. You also don’t have any attractive GUI like Crystal DiskMark so it’s not as compelling to show to management.

Command Line: DiskSpd.exe -c4G -w0 -b64K -F4 -T1b -s8b -o8 -d300 -h D:\testfile.dat
Input parameters:
timespan: 1
 -------------
 duration: 300s
 warm up time: 5s
 cool down time: 0s
 random seed: 0
 path: 'D:\testfile.dat'
 think time: 0ms
 burst size: 0
 software cache disabled
 hardware write cache disabled, writethrough on
 performing read test
 block size: 65536
 using sequential I/O (stride: 524288)
 number of outstanding I/O operations: 8
 thread stride size: 65536
 IO priority: normal
Results for timespan 1:
*******************************************************************************
actual test time: 300.00s
thread count: 4
proc count: 8
CPU | Usage | User | Kernel | Idle
-------------------------------------------
 0| 52.69%| 4.34%| 48.35%| 47.31%
 1| 22.53%| 3.93%| 18.60%| 77.47%
 2| 21.07%| 5.31%| 15.76%| 78.93%
 3| 16.06%| 1.28%| 14.79%| 83.94%
 4| 9.43%| 5.87%| 3.56%| 90.57%
 5| 4.66%| 2.58%| 2.07%| 95.34%
 6| 13.68%| 8.24%| 5.44%| 86.32%
 7| 3.69%| 2.36%| 1.32%| 96.31%
-------------------------------------------
avg.| 17.98%| 4.24%| 13.74%| 82.02%
Total IO
thread | bytes | I/Os | MB/s | I/O per s | file
------------------------------------------------------------------------------
 0 | 120050810880 | 1831830 | 381.63 | 6106.09 | D:\testfile.dat (4096MB)
 1 | 123521007616 | 1884781 | 392.66 | 6282.59 | D:\testfile.dat (4096MB)
 2 | 123122024448 | 1878693 | 391.39 | 6262.30 | D:\testfile.dat (4096MB)
 3 | 126010261504 | 1922764 | 400.58 | 6409.20 | D:\testfile.dat (4096MB)
------------------------------------------------------------------------------
total: 492704104448 | 7518068 | 1566.26 | 25060.18
Read IO
thread | bytes | I/Os | MB/s | I/O per s | file
------------------------------------------------------------------------------
 0 | 120050810880 | 1831830 | 381.63 | 6106.09 | D:\testfile.dat (4096MB)
 1 | 123521007616 | 1884781 | 392.66 | 6282.59 | D:\testfile.dat (4096MB)
 2 | 123122024448 | 1878693 | 391.39 | 6262.30 | D:\testfile.dat (4096MB)
 3 | 126010261504 | 1922764 | 400.58 | 6409.20 | D:\testfile.dat (4096MB)
------------------------------------------------------------------------------
total: 492704104448 | 7518068 | 1566.26 | 25060.18
Write IO
thread | bytes | I/Os | MB/s | I/O per s | file
------------------------------------------------------------------------------
 0 | 0 | 0 | 0.00 | 0.00 | D:\testfile.dat (4096MB)
 1 | 0 | 0 | 0.00 | 0.00 | D:\testfile.dat (4096MB)
 2 | 0 | 0 | 0.00 | 0.00 | D:\testfile.dat (4096MB)
 3 | 0 | 0 | 0.00 | 0.00 | D:\testfile.dat (4096MB)
------------------------------------------------------------------------------
total: 0 | 0 | 0.00 | 0.00

Command Line: DiskSpd.exe -c4G -w0 -b64K -F4 -T1b -s8b -o8 -d300 -h D:\testfile.dat

Input parameters:

timespan: 1

-------------

duration: 300s

warm up time: 5s

cool down time: 0s

random seed: 0

path: 'D:\testfile.dat'

think time: 0ms

burst size: 0

software cache disabled

hardware write cache disabled, writethrough on

performing read test

block size: 65536

using sequential I/O (stride: 524288)

number of outstanding I/O operations: 8

thread stride size: 65536

IO priority: normal

Results for timespan 1:

*******************************************************************************

actual test time: 300.00s

thread count: 4

proc count: 8

CPU | Usage | User | Kernel | Idle

-------------------------------------------

0| 52.69%| 4.34%| 48.35%| 47.31%

1| 22.53%| 3.93%| 18.60%| 77.47%

2| 21.07%| 5.31%| 15.76%| 78.93%

3| 16.06%| 1.28%| 14.79%| 83.94%

4| 9.43%| 5.87%| 3.56%| 90.57%

5| 4.66%| 2.58%| 2.07%| 95.34%

6| 13.68%| 8.24%| 5.44%| 86.32%

7| 3.69%| 2.36%| 1.32%| 96.31%

-------------------------------------------

avg.| 17.98%| 4.24%| 13.74%| 82.02%

Total IO

------------------------------------------------------------------------------

0 | 120050810880 | 1831830 | 381.63 | 6106.09 | D:\testfile.dat (4096MB)

1 | 123521007616 | 1884781 | 392.66 | 6282.59 | D:\testfile.dat (4096MB)

2 | 123122024448 | 1878693 | 391.39 | 6262.30 | D:\testfile.dat (4096MB)

3 | 126010261504 | 1922764 | 400.58 | 6409.20 | D:\testfile.dat (4096MB)

------------------------------------------------------------------------------

total: 492704104448 | 7518068 | 1566.26 | 25060.18

Read IO

------------------------------------------------------------------------------

0 | 120050810880 | 1831830 | 381.63 | 6106.09 | D:\testfile.dat (4096MB)

1 | 123521007616 | 1884781 | 392.66 | 6282.59 | D:\testfile.dat (4096MB)

2 | 123122024448 | 1878693 | 391.39 | 6262.30 | D:\testfile.dat (4096MB)

3 | 126010261504 | 1922764 | 400.58 | 6409.20 | D:\testfile.dat (4096MB)

------------------------------------------------------------------------------

total: 492704104448 | 7518068 | 1566.26 | 25060.18

Write IO

------------------------------------------------------------------------------

0 | 0 | 0 | 0.00 | 0.00 | D:\testfile.dat (4096MB)

1 | 0 | 0 | 0.00 | 0.00 | D:\testfile.dat (4096MB)

2 | 0 | 0 | 0.00 | 0.00 | D:\testfile.dat (4096MB)

3 | 0 | 0 | 0.00 | 0.00 | D:\testfile.dat (4096MB)

------------------------------------------------------------------------------

total: 0 | 0 | 0.00 | 0.00

The Best of Both Worlds

Knowing time was short to test the storage before I had to install SQL and start migrating assets, I wanted to use a utility that had the precision of DiskSpd but with some sort of GUI to make it attractive and easy to explain our metrics to management and our sponsors. Fortunately, Darryl van der Peijl, a Microsoft MVP, developed a PowerShell utility that sets all the key parameters for DskSpd in the background, runs the tests for you and plots them not only to a GUI, but also to a CSV file or HTML report as desired. Simply set the number of different tests you want to run, set the parameters for each individual test, set global parameters and output destination for all your tests, and then click Start Tests.

What Tests Should I Run?

How you should use DiskSpd depends a lot on how you anticipate SQL Server will operate on your host. Are you doing a predominantly OLTP workload? Are you doing cube processing scanning fact tables? How much temp DB activity is expected? To help configure DiskSpd to match expected I/O patterns for SQL Server, consult their Using DiskSpd in SQL Server environments document enclosed in the ZIP download. The document has an excellent summary of the random and sequential I/O patterns for SQL Server as well as several command line settings to test specific SQL Server operation patterns.

Historically for our data warehouse, the emphasis has been building a mix of both operational reports and deep dive analytical reports hitting fact tables with a minimum of indexing. There are indexes on business keys and primary keys to handle incremental updates, but other than the fields being indexed to reduce results sets by calendar period, most filtering criteria has been on un-indexed columns, forcing full table scans. My battery of tests will focus a lot on 512 K sequential reads, typical of SQL Server’s Enterprise Edition read-ahead capability. I’ll want to create a large enough test file to discount any SAN-level caching and maximize the number of outstanding IOs per thread to saturate the HBA cards to find the upper limit for the configuration…