Do you watch the warm-up laps just before every NASCAR and Indy Car race, where everyone is following the pace car and doing slalom turns, making sure the track is clear of debris, warming the tires and checking their gear? Moving data from a single data file setup to a multiple file setup is kinda like that; you’re moving data out of pit row where everything is in single file and spreading out so everyone can get to top speed as soon as the green flag flies.

In my project, I had the dubious distinction of using a RANGE LEFT partition function and failing to split my function and scheme in a timely fashion, leaving my most recent year of data bottled up in a filegroup meant for a single quarter. Code-wise, the easiest way to correct the problem would be to just bite the bullet and do a partition split.

Dan Guzman, who blogs at www.dbdelta.com, has a handy script that reports row counts by object and partition scheme…
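
If you don't have his script at hand, a query against the catalog views along these lines gets you something similar; this is just a rough sketch of that kind of query, not his script:

```sql
-- Rough sketch: row counts per table, partition scheme, partition and filegroup
SELECT
    t.name  AS table_name,
    ps.name AS partition_scheme,
    p.partition_number,
    fg.name AS filegroup_name,
    p.rows
FROM sys.tables AS t
JOIN sys.indexes AS i
    ON i.object_id = t.object_id AND i.index_id IN (0, 1)   -- heap or clustered index
JOIN sys.partitions AS p
    ON p.object_id = i.object_id AND p.index_id = i.index_id
JOIN sys.partition_schemes AS ps
    ON ps.data_space_id = i.data_space_id
JOIN sys.destination_data_spaces AS dds
    ON dds.partition_scheme_id = ps.data_space_id
   AND dds.destination_id = p.partition_number
JOIN sys.filegroups AS fg
    ON fg.data_space_id = dds.data_space_id
ORDER BY t.name, p.partition_number;
```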

Picking up from my last post, here's the current data distribution…

So, let’s add a new filegroup and redistribute the data…
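
In rough terms, the DDL looks something like this; the database, file path, filegroup, function, and scheme names and the boundary value are all stand-ins:

```sql
-- Names, paths and the boundary value below are illustrative only
ALTER DATABASE DataWarehouse ADD FILEGROUP FG_1996;

ALTER DATABASE DataWarehouse
ADD FILE ( NAME = N'DataWarehouse_1996',
           FILENAME = N'E:\SQLData\DataWarehouse_1996.ndf',
           SIZE = 10GB, FILEGROWTH = 1GB )
TO FILEGROUP FG_1996;

-- Tell the scheme which filegroup the next split should use...
ALTER PARTITION SCHEME ps_OrderDate NEXT USED FG_1996;

-- ...then split the function at the new boundary. The function is RANGE LEFT,
-- so the boundary value itself belongs to the partition on its left.
ALTER PARTITION FUNCTION pf_OrderDate() SPLIT RANGE ('1995-12-31');
```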

It looks pretty straightforward: add a new filegroup, point the partition scheme at it as the next used filegroup, then split the function at a new boundary. So how does our data look now?

Well, we set the boundary between the 1995 and 1996 data and moved the 1996 data into its own filegroup, but we still have 1997 data in the 1995 filegroup until we add its filegroup and do another split, incurring another data shuffle. Furthermore, the 1995 filegroup will forever be the spillover filegroup for rows beyond the highest boundary; it looks pretty unintuitive and will be a hassle when we try to age out this filegroup with a sliding window. I'll be stuck with that filegroup for the life of the partition scheme.

Remember, this is a small example with a single table. Imagine the runtime when doing this across 100 tables holding 500 GB of data, with everything locked at one point or another. Can you do this within the maintenance window you're allowed?

A more controlled approach may be in order, where you build a new partition function and scheme, build staging tables for each partition, bulk load the data into each partition based on the indicated boundaries, then do a partition switch to merge all the slices together. Here’s one possible checklist…

  1. Make a Backup
  2. Set to Simple Recovery Model
  3. Create New Partition Function
  4. Create New Filegroups with Proper Data Files
  5. Create New Partition Scheme with New Filegroups
  6. For Each Table
    1. Create New Final Table on New Partition Scheme
    2. Create Staging Tables in Each Partition Scheme Filegroup
    3. Add Check Constraints on Each Staging Table Based on Partition Scheme Boundaries
    4. Bulk Copy Data Into Each Staging Table With Data Sorted on Clustered Key
    5. Switch Staging Tables into New Final Table
    6. Confirm Rowcount Between Old and New Final Table
    7. Drop Staging Tables
    8. Rebuild Non-Clustered Indexes on New Final Table
    9. Rename New Table as Old Table
    10. Run DBCC CHECKIDENT to make sure all identities are set for future inserts
  7. Lather. Rinse. Repeat.
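
To make the back half of step 6 a bit more concrete, here's a rough sketch of sub-steps 6-6, 6-9, and 6-10 against a hypothetical table:

```sql
-- 6-6: confirm rowcounts match between the old table and the new final table
SELECT
    (SELECT COUNT_BIG(*) FROM dbo.FactOrders)     AS old_rows,
    (SELECT COUNT_BIG(*) FROM dbo.FactOrders_New) AS new_rows;

-- 6-9: swap names so downstream code keeps pointing at the same table name
EXEC sp_rename 'dbo.FactOrders', 'FactOrders_Old';
EXEC sp_rename 'dbo.FactOrders_New', 'FactOrders';

-- 6-10: for tables with an identity column, reseed so future inserts
-- pick up where the old table left off
DBCC CHECKIDENT ('dbo.FactOrders', RESEED);
```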

Step 6, as you can tell, will be a very complex scripting exercise. Of particular note is 6-4; I said bulk loading rather than a straight INSERT INTO…SELECT or SELECT…INTO statement because I want to minimize logging as much as possible. You can do the bulk loading with bcp or the BULK INSERT statement, but they are not meant for table-to-table transfers. You can use the Import/Export wizard, but that's one table at a time, and it's single-threaded with one data flow task if you look at the SSIS package it creates.

If you're working with Enterprise edition, you may be familiar with the database engine's read-ahead feature. While SQL Server is running a query and fetching data in 64 KB chunks (the size of an extent), it may anticipate you needing additional data and start reading ahead in 512 KB chunks. If you have a large fact table, with a size in the hundreds of gigabytes, it's a good candidate for persuading SQL Server to do a read-ahead operation. How can we do that using SSIS?

One possible implementation is a Data Flow task with a set of OLE DB data sources running nearly identical SELECT statements, their only difference being a predicate that doesn't overlap any other predicate in the set. Essentially, each is set to the upper and lower boundaries for a given partition. In the case of our example above with data from 1992 through 1997, six years' worth of data, there would be six OLE DB data sources with the same SELECT clause but different WHERE clauses. Each OLE DB destination would then have table lock and fast load enabled to maximize throughput. During execution, you'll have multiple concurrent OLE DB data sources feeding individual target tables with their specific years of data.
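
For example, two of the six source queries might look like this, with table and column names made up for illustration:

```sql
-- Source query for the 1995 slice
SELECT OrderID, OrderDate, Amount
FROM dbo.FactOrders
WHERE OrderDate >= '1995-01-01' AND OrderDate < '1996-01-01';

-- Source query for the 1996 slice; the predicate ranges never overlap
SELECT OrderID, OrderDate, Amount
FROM dbo.FactOrders
WHERE OrderDate >= '1996-01-01' AND OrderDate < '1997-01-01';
```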

Once all the data flows have completed, you'd then do a partition switch operation with the ALTER TABLE statement to stitch all the individual target tables together. Prior to stitching the individual tables, you have to make sure the proper CHECK constraints are in place so that the column used in the partition function and referenced by the partition scheme has no values overlapping any other target table. Here's an example of the staging table, new partition function and partition scheme, and new target table, along with the switch operation…
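
Something along these lines will do it; every object name, column, and boundary below is made up for illustration, and the new function is shown as RANGE RIGHT purely as an example:

```sql
-- Illustrative names throughout: pf_OrderDate_New, ps_OrderDate_New,
-- FactOrders_New, FactOrders_Stage_1996, FG_* and the OrderDate column
CREATE PARTITION FUNCTION pf_OrderDate_New (date)
AS RANGE RIGHT FOR VALUES ('1993-01-01', '1994-01-01', '1995-01-01',
                           '1996-01-01', '1997-01-01');

CREATE PARTITION SCHEME ps_OrderDate_New
AS PARTITION pf_OrderDate_New
TO (FG_1992, FG_1993, FG_1994, FG_1995, FG_1996, FG_1997);

-- New final table created on the new partition scheme
CREATE TABLE dbo.FactOrders_New
(
    OrderID   bigint NOT NULL,
    OrderDate date   NOT NULL,
    Amount    money  NOT NULL,
    CONSTRAINT PK_FactOrders_New PRIMARY KEY CLUSTERED (OrderDate, OrderID)
) ON ps_OrderDate_New (OrderDate);

-- One staging table per partition, on that partition's filegroup, with a
-- CHECK constraint that matches the partition's boundaries exactly
CREATE TABLE dbo.FactOrders_Stage_1996
(
    OrderID   bigint NOT NULL,
    OrderDate date   NOT NULL,
    Amount    money  NOT NULL,
    CONSTRAINT PK_FactOrders_Stage_1996 PRIMARY KEY CLUSTERED (OrderDate, OrderID),
    CONSTRAINT CK_FactOrders_Stage_1996
        CHECK (OrderDate >= '1996-01-01' AND OrderDate < '1997-01-01')
) ON FG_1996;

-- ...bulk load the staging table (covered below), then switch it into the
-- partition that owns 1996, which is partition 5 under this function
ALTER TABLE dbo.FactOrders_Stage_1996
SWITCH TO dbo.FactOrders_New PARTITION 5;
```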

This gets pretty complicated when constructing it for a single table, so imagine what it's going to be like when you have to do it for a hundred tables. This is a good candidate for BIML, which uses SMO and XML markup to dynamically create SSIS packages. If you were pursuing a BIML approach, you'd loop through a collection of source tables and create an individual package for each one to port its data to a new target table on the correct partition scheme. Ultimately, it would be a series of Execute SQL tasks to perform DDL like the statements above, plus the complex Data Flow task described above to load the data in parallel. If it weren't for the Data Flow, you'd be doing the work as a console app in your .NET language of choice or PowerShell.

Well, you actually can mimic the parallel data flow in a console app.

Back in the day, I was a VB developer, starting with VB6 and then moving on to the early .NET versions. While not nearly as popular as C#, Java or PowerShell nowadays, it still gets the job done. You just need to do a little homework to work out the VB syntax for examples that are usually published in those other languages.

To simulate the Data Flow's bulk insert behavior, you're going to use the SqlBulkCopy class. To implement it, it's just a matter of instantiating a SqlDataReader, indicating the number of rows per batch, specifying which target table to insert into, and calling the WriteToServer method.
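
A minimal sketch of that, with placeholder connection strings and made-up table and column names:

```vb
' Minimal sketch: stream one year of rows from the source table into a staging
' table with SqlBulkCopy. Connection strings, table and column names are placeholders.
Imports System.Data.SqlClient

Module SimpleBulkCopy
    Sub Main()
        Using srcConn As New SqlConnection("Server=.;Database=DataWarehouse;Integrated Security=SSPI;"),
              dstConn As New SqlConnection("Server=.;Database=DataWarehouse;Integrated Security=SSPI;")
            srcConn.Open()
            dstConn.Open()

            Dim cmd As New SqlCommand(
                "SELECT OrderID, OrderDate, Amount FROM dbo.FactOrders " &
                "WHERE OrderDate >= '1996-01-01' AND OrderDate < '1997-01-01';", srcConn)

            Using reader As SqlDataReader = cmd.ExecuteReader(),
                  bulk As New SqlBulkCopy(dstConn, SqlBulkCopyOptions.TableLock, Nothing)
                bulk.DestinationTableName = "dbo.FactOrders_Stage_1996"
                bulk.BatchSize = 50000       ' rows per batch
                bulk.BulkCopyTimeout = 0     ' no timeout on a long-running load
                bulk.WriteToServer(reader)   ' streams rows from the reader to the staging table
            End Using
        End Using
    End Sub
End Module
```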

However, to do a parallel load of multiple staging tables, you need a little more code.

By default, a SqlBulkCopy object runs synchronously, meaning it'll finish its one bulk insert operation before code execution continues. However, there is an option to run multiple bulk copy operations asynchronously. This is done by using .NET's Task-based asynchronous functionality to wrap each SqlBulkCopy operation in a Task, building an array of Tasks, and waiting on all of them to finish before continuing with further code execution. Here's a sample stub looping through four years of data.
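
A sketch of what that stub can look like, again with placeholder names, years, and connection strings:

```vb
' Sketch of the parallel load: one Task per year, each streaming its slice
' through its own SqlDataReader into its own SqlBulkCopy. Names, years and
' connection strings are placeholders.
Imports System
Imports System.Data.SqlClient
Imports System.Linq
Imports System.Threading.Tasks

Module ParallelBulkCopy

    Private Const ConnStr As String = "Server=.;Database=DataWarehouse;Integrated Security=SSPI;"

    ' Returns a Task that bulk copies one year's worth of rows into its staging table
    Private Async Function CopyYearAsync(year As Integer) As Task
        Using srcConn As New SqlConnection(ConnStr),
              dstConn As New SqlConnection(ConnStr)
            Await srcConn.OpenAsync()
            Await dstConn.OpenAsync()

            Dim sql As String =
                $"SELECT OrderID, OrderDate, Amount FROM dbo.FactOrders " &
                $"WHERE OrderDate >= '{year}-01-01' AND OrderDate < '{year + 1}-01-01';"

            Using cmd As New SqlCommand(sql, srcConn),
                  reader As SqlDataReader = Await cmd.ExecuteReaderAsync(),
                  bulk As New SqlBulkCopy(dstConn, SqlBulkCopyOptions.TableLock, Nothing)
                bulk.DestinationTableName = $"dbo.FactOrders_Stage_{year}"
                bulk.BatchSize = 50000
                bulk.BulkCopyTimeout = 0
                Await bulk.WriteToServerAsync(reader)   ' streams; never buffers the whole table
            End Using
        End Using
    End Function

    Sub Main()
        ' Build the array of concurrent tasks, one per year of data
        Dim years As Integer() = {1994, 1995, 1996, 1997}
        Dim loads As Task() = years.Select(Function(y) CopyYearAsync(y)).ToArray()

        Task.WaitAll(loads)   ' block until every slice has finished loading
        Console.WriteLine("All partition slices loaded.")
    End Sub
End Module
```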

In brief, you're defining a function that returns a Task executing an asynchronous bulk copy operation, you're creating an array of those Tasks to run concurrently, and you're reading the source data sequentially so that it streams from the source table through the SqlDataReader straight into the SqlBulkCopy object; otherwise you'd risk running out of memory trying to buffer the whole source table before the copy starts.

It'll take some trial and error, but you'll want to find the balance between the amount of memory you can spare to SQL Server and the number of concurrent threads you can run to pump the data from the old table to the staging tables for the new partitioned table. Be careful of memory grant requirements, especially if you have Resource Governor enabled on your instance. If you see RESOURCE_SEMAPHORE waits, one or more of your threads is waiting for a memory grant in order to fulfill its query. For my 768 GB instance, I set a max memory grant of 5% and was able to run eight threads concurrently to bulk load my partition slices.
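
A couple of quick DMV queries will show whether you're hitting that wall:

```sql
-- Any tasks currently waiting on a memory grant?
SELECT session_id, wait_type, wait_duration_ms
FROM sys.dm_os_waiting_tasks
WHERE wait_type = N'RESOURCE_SEMAPHORE';

-- Requested vs. granted memory for active requests
SELECT session_id, requested_memory_kb, granted_memory_kb, wait_time_ms
FROM sys.dm_exec_query_memory_grants
ORDER BY requested_memory_kb DESC;
```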

After all your data has been rewritten to the slices, your partition SWITCH command can be issued to stitch them all together into a new single table. Before you can do that switch, every table slice needs a CHECK constraint on the partitioning column, created or validated with WITH CHECK. This is very important, as the database engine will not accept a partition slice whose data hasn't been verified to fit within the upper and lower boundaries of the target partition. If you didn't do it when creating the staging table in the first place, you'll need to do it before issuing the ALTER TABLE command.
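
If you skipped the constraint at create time, bolting it on afterward looks something like this (same made-up names as the earlier sketch), and you can confirm the engine trusts it before attempting the switch:

```sql
-- WITH CHECK forces a scan of the existing rows so the constraint is trusted
ALTER TABLE dbo.FactOrders_Stage_1996 WITH CHECK
ADD CONSTRAINT CK_FactOrders_Stage_1996
CHECK (OrderDate >= '1996-01-01' AND OrderDate < '1997-01-01');

-- The engine won't accept the slice if the constraint isn't trusted
SELECT name, is_not_trusted
FROM sys.check_constraints
WHERE parent_object_id = OBJECT_ID('dbo.FactOrders_Stage_1996');
```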

Couple this code with your favorite SMO implementation to duplicate table and index objects and to create the new partition functions and schemes, and you end up with a very valuable piece of code for restriping badly partitioned data into new tables without having to do the manual work in SSIS or figure out the mechanics of BIML.

With our new hardware racked ‘n’ stacked, disk subsystem stress-tested, consumption rates calculated for our server and our workload, and data restriped for optimal sequential access, all that’s left is to open the data warehouse to the users for validation and business use.