Monday, 29 April 2013

SSIS For Each Loop over files with Date in Name

SSIS For Each Loop over files with Date in Name

 
A common request I have received over the years is to move or copy files from one directory to another based on the file name.  This can be accomplished in SSIS without using any script task. You can do this with a for each loop, an empty sequence container, a file system task, and a few expressions on some variables.
Here is a look at the entire package:
image
Now here are the variables, the variables with the red circle have expressions on them. You can build this package with fewer variables but that would cause maintenance headaches in the future. This package can be maintained using configuration files or tables and control all variables without an expression. The variables with expressions are based on the other variables so the configuration changes would up these indirectly.
image
The first thing you need to do is create a for each loop and map in the variables to the directory and file spec properties. In the image below you can see the directory is using the input directory variable and the file spec is using the file extension and has an asterisk in front of the expression for the wild card. Notice also that you only need the file name only.
image
The file name is mapped to the current file variable.
image
Here are the files that you want to loop over and check the date. If the date is older than one day then you want the file to be moved to the backup folder.
image
Now you need a sequence container inside the for each loop. There will be no task in this sequence container because the container is used as an anchor point for the expression between the sequence container and the file system task.
Below you can see the file system task is set up to use the variables and is set to move the files. This could also be changed to copy or delete the file based on your needs.
image
Now for the expressions, the variables with the red circles next to them have expressions on them.
The dtFileDate variable is used to get the date from the current file name in the loop. This takes a combination of right and substring functions to parse out the date as the following format yyyy-mm-dd. Then this is converted from a string to date. This is now used in the precedence constraint expression between the sequence container and the file system task to see if the date meets the move criteria.
dtFileDate=
(DT_DATE) (Substring(Right(@[User::strCurrentFile],8),1,4)+ “-” +
Substring( Right(@[User::strCurrentFile],4),1,2)+”-”+
Right(@[User::strCurrentFile],2))
The dtCutoffDate Variable uses the strDaysBack variable to subtract the number of days from today’s date. This allows you to update the strDaysBack variable using configurations. This is why it is a good idea to never hard code in the values in your expressions. Always use a separate variable to reduce maintenance.
dtCutOffDate =
DATEADD( “d”, @[User::strDaysBack] , GETDATE()  )
The strFullFileName Variable combines three variables to get the complete path to the current file. This is needed because we only get the name of the file in the for each loop. Getting just the name in the for each loop makes it easier to get the date in the file name.
strFullFileName =
@[User::strInputDirectory]+ @[User::strCurrentFile]+ @[User::strFileExtension]
The last thing you need is the expression between the sequence container and the file system task. This compares the dates and if the expression is true then the file system task is performed.
image
Hope this help you when building complex packages looping through files. Keep in mind if your files have a different date format you will need to adjust your expressions to get the date. Let me know if you have any questions.

No comments:

Post a Comment