Monday 29 April 2013

Replacing a SQL Cursor with SSIS

In a recent forum post, the question was asked how to replace a cursor with an SSIS package. This can be done several ways depending on the situation. In this case, a number on each row determines how many times that row needs to be written to the destination.
The source table looks like the following image.

image

The Number of Nights column tells us how many times each row needs to be inserted into the destination table, so the destination should look like the following image after the load is complete. Notice that the number of times each row appears in the destination table matches its number of nights.

image
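If you want to follow along without the screenshots, here is a minimal sketch of the two tables. The column names come from the source query used later in this post; the data types, the sample values, and the destination table name dbo.CursorDestination are my assumptions for illustration only.

```sql
-- Source table. Column names match the post; the data types are assumed.
CREATE TABLE dbo.CursorSource (
    OptionId       INT  NOT NULL,
    StartDate      DATE NOT NULL,
    AllocationID   INT  NOT NULL,
    NumberofNights INT  NOT NULL
);

-- Destination table. The name dbo.CursorDestination is hypothetical;
-- it simply mirrors the source columns.
CREATE TABLE dbo.CursorDestination (
    OptionId       INT  NOT NULL,
    StartDate      DATE NOT NULL,
    AllocationID   INT  NOT NULL,
    NumberofNights INT  NOT NULL
);

-- Made-up sample rows, not the rows from the screenshots.
INSERT INTO dbo.CursorSource (OptionId, StartDate, AllocationID, NumberofNights)
VALUES (1, '2013-04-01', 100, 1),
       (2, '2013-04-02', 101, 3),
       (3, '2013-04-05', 102, 2);

-- After the load, the destination should hold 1 + 3 + 2 = 6 rows:
-- one copy of OptionId 1, three of OptionId 2, and two of OptionId 3.
```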

This can be performed by using a cursor to loop through each row, but a cursor handles one row at a time, which is very slow. If you needed to perform this for millions of rows it would be a very long process. The power of SSIS is in the batch loads it performs in data flows. You can perform this using a small SSIS package. Here is an image of the package Control Flow you can create to perform this kind of cursor work.

image

This SSIS package will have two variables, intCounter and intNumberofNights. The intCounter variable will increment during the loop. The intNumberofNights variable will hold the maximum number of nights from the source table.

image

The first task in the package is an Execute SQL Task. It retrieves the maximum number of nights and saves it in the intNumberofNights variable. This controls the number of times the loop runs.
The query in the Execute SQL Task is:
SELECT MAX(NumberofNights) AS Nights
FROM CursorSource
The result set is set to single row, and the result is mapped to the intNumberofNights variable under Result Set.
 image


image

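The For Loop Container will loop from 1 to the maximum number of nights: intCounter starts at 1, the loop continues while intCounter is less than or equal to intNumberofNights, and intCounter goes up by 1 on each pass. The image below shows how this is set up. This assumes the lowest number of nights will be 1.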

image

The only thing left is the Data Flow. The source will be an OLE DB Source with the following SQL query.
SELECT OptionId, StartDate, AllocationID, NumberofNights
FROM dbo.CursorSource
WHERE NumberofNights >= ?
The question mark is a parameter and is mapped to the intCounter variable. On each pass of the loop, this selects only the rows whose number of nights is greater than or equal to the counter.

image

The destination is an OLE DB Destination. No special setup is needed for this component; just map the source columns to the proper destination columns.

image
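To see the logic of the package in one place, here is a rough T-SQL sketch of what the Execute SQL Task, the For Loop Container, and the Data Flow do together. It is only an illustration of the technique, not the package itself. The variable names match the SSIS variables, and dbo.CursorDestination is a hypothetical name for the destination table, which is assumed to have the same four columns as the source.

```sql
-- Assumes dbo.CursorSource and dbo.CursorDestination (hypothetical name)
-- already exist with the four columns used below.
DECLARE @intCounter        INT = 1;  -- For Loop counter, starts at 1
DECLARE @intNumberofNights INT;      -- upper bound for the loop

-- Execute SQL Task: get the maximum number of nights.
SELECT @intNumberofNights = MAX(NumberofNights)
FROM dbo.CursorSource;

-- For Loop Container: one pass per night, up to the maximum.
WHILE @intCounter <= @intNumberofNights
BEGIN
    -- Data Flow: load every row that still needs another copy.
    INSERT INTO dbo.CursorDestination (OptionId, StartDate, AllocationID, NumberofNights)
    SELECT OptionId, StartDate, AllocationID, NumberofNights
    FROM dbo.CursorSource
    WHERE NumberofNights >= @intCounter;

    SET @intCounter = @intCounter + 1;
END;
```

Each pass is a single set-based insert, so the number of passes depends on the largest number of nights, not on the number of rows in the source.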

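This package will give you the results shown in the first two table images. The parameter in the Data Flow source prevents it from loading a row too many times: a row with N nights is selected only on passes 1 through N, so it is written exactly N times.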
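If you want to check the load, a query along these lines can compare the copies in the destination with the number of nights on each source row. This is a sketch under two assumptions of mine: the destination table is named dbo.CursorDestination (a hypothetical name), and OptionId, StartDate, and AllocationID together identify a source row uniquely.

```sql
-- Lists source rows whose copy count in the destination does not match
-- NumberofNights. An empty result means the load matches.
-- Assumes (OptionId, StartDate, AllocationID) is unique in the source
-- and that dbo.CursorDestination (hypothetical name) is the destination.
SELECT s.OptionId,
       s.StartDate,
       s.AllocationID,
       s.NumberofNights,
       COUNT(d.OptionId) AS RowsLoaded
FROM dbo.CursorSource AS s
LEFT JOIN dbo.CursorDestination AS d
       ON  d.OptionId     = s.OptionId
       AND d.StartDate    = s.StartDate
       AND d.AllocationID = s.AllocationID
GROUP BY s.OptionId, s.StartDate, s.AllocationID, s.NumberofNights
HAVING COUNT(d.OptionId) <> s.NumberofNights;
```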
Let me know if you have any questions or a cursor problem you need solved.
