On a forum post recently the questions was asked how to replace a
cursor with an SSIS package. This can be done several ways depending on
the situation. In this situation there is a number on each row that
determines the number of times a row needs to be written to the
destination.
The source table looks like the following image.
The Number of Nights column tells us how many times this row needs to be inserted into the destination table. So the Destination should look like the following image after the load is complete. Notice the number of nights matches the number of times the row appears on the destination table.
This can be performed by using a cursor to loop through each row, but this is very slow. If you needed to perform this for millions of rows it would be a very long process. The power of SSIS is in the batch loads it performs in data flows. You can perform this using a small SSIS package. Here is an image of the package Control Flow you can create to perform this kind of cursor work.
This SSIS package will have two variables, intCounter and IntNumber of Nights. The counter variable will increment during the loop. The number of nights variable will hold the maximum number of nights from the source table.
The first task in the package is an Execute SQL Task. It retrieves the maximum number of nights and saves it in the number of nights variable. This will control the number of times the loop runs.
The query in the Execute SQL Task is:
Select max(NumberofNights) as Nights
From CursorSource
The result set is single row and intNumberofNights is mapped under result set.
The For Loop Container will loop from 1 to the max number of nights. The image below shows how this is set up. This is assuming the lowest number of nights will be 1.
The only thing left is the Data Flow. The source will be an OLEDB source with the following SQL query.
SELECT OptionId, StartDate, AllocationID, NumberofNights
FROM dbo.CursorSource
WHERE (NumberofNights >= ?)
The question mark is a parameter and is mapped to the intCounter variable. This will only select rows that have the number of nights greater than or equal to the counter.
The destination is an OLEDB Destination. No special setup needed for this task, just map the source columns to the proper destination columns.
This package will give you the results in the first two table images. The parameter in the Data Flow source prevents it from loading a row too many times.
Let me know if you have any questions, or if you have a cursor problem you need solved let me know.
The source table looks like the following image.
The Number of Nights column tells us how many times this row needs to be inserted into the destination table. So the Destination should look like the following image after the load is complete. Notice the number of nights matches the number of times the row appears on the destination table.
This can be performed by using a cursor to loop through each row, but this is very slow. If you needed to perform this for millions of rows it would be a very long process. The power of SSIS is in the batch loads it performs in data flows. You can perform this using a small SSIS package. Here is an image of the package Control Flow you can create to perform this kind of cursor work.
This SSIS package will have two variables, intCounter and IntNumber of Nights. The counter variable will increment during the loop. The number of nights variable will hold the maximum number of nights from the source table.
The first task in the package is an Execute SQL Task. It retrieves the maximum number of nights and saves it in the number of nights variable. This will control the number of times the loop runs.
The query in the Execute SQL Task is:
Select max(NumberofNights) as Nights
From CursorSource
The result set is single row and intNumberofNights is mapped under result set.
The For Loop Container will loop from 1 to the max number of nights. The image below shows how this is set up. This is assuming the lowest number of nights will be 1.
The only thing left is the Data Flow. The source will be an OLEDB source with the following SQL query.
SELECT OptionId, StartDate, AllocationID, NumberofNights
FROM dbo.CursorSource
WHERE (NumberofNights >= ?)
The question mark is a parameter and is mapped to the intCounter variable. This will only select rows that have the number of nights greater than or equal to the counter.
The destination is an OLEDB Destination. No special setup needed for this task, just map the source columns to the proper destination columns.
This package will give you the results in the first two table images. The parameter in the Data Flow source prevents it from loading a row too many times.
Let me know if you have any questions, or if you have a cursor problem you need solved let me know.
No comments:
Post a Comment