Monday 29 April 2013

How to configure a Foreach Loop Container: File Enumerator

How do you configure a Foreach Loop Container? The Foreach Loop Container is a repeating container that loops through a collection. It supports several types of collections, each with its own enumerator.

This article explains the File Enumerator. The other enumerators will be covered in later articles.

Foreach File Enumerator
The File Enumerator loops through a collection of files within a folder. This makes it possible to execute, for example, a Data Flow Task for each of the files without manually changing the connection string.

1) Create variable
We need a variable to store the file path in. Create a string variable named FilePath.

Variable

Note: I added a value (the file path of one of the source files) so that it shows up in the expression later on. This is optional.

2) Foreach Loop
Add a Foreach Loop Container to the Control Flow and give it a suitable name. Go to the Collection tab and:
A) Select Foreach File Enumerator. You have to select it explicitly even though it already appears to be selected by default. That is a small bug in the user interface, which was eventually fixed in SQL Server 2012.
B) Select the folder that contains your source files and use wildcard characters to specify which files to include in the collection. One odd behaviour here can cause unexpected results: *.xls also returns files with the extension .xlsx (similar to the DOS command dir *.xls).
C) Choose Fully qualified because that returns the complete filepath and not only the name and extension.
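
Conceptually, the settings above (folder, wildcard, fully qualified names) amount to something like the following sketch. Python is used purely for illustration; SSIS does not run this code, and the function names enumerate_files and has_exact_extension are made up for this example. Note that Python's wildcard matching is exact, so it does not reproduce the *.xls quirk from step B. The second helper shows the kind of check you could apply inside the loop if you want to skip files such as .xlsx.

```python
from pathlib import Path


def enumerate_files(folder, pattern):
    """Return the fully qualified path of every file in folder that matches pattern."""
    # Only files are returned, never folders; no particular order is guaranteed.
    return [str(p.resolve()) for p in Path(folder).glob(pattern) if p.is_file()]


def has_exact_extension(file_path, extension):
    """Return True only if the file's extension equals extension, ignoring case."""
    return Path(file_path).suffix.lower() == extension.lower()
```

With a wildcard of *.xls*, enumerate_files would return both .xls and .xlsx files, and has_exact_extension(path, ".xls") would then keep only the real .xls files.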

Configure Foreach Loop Container

3) Variable mapping
Go to the Variable Mappings tab. Here we use the variable from step 1 and fill it with the file path from the collection.

Map variable to Index 0 of the collection

4) Dataflow
The loop is now ready. Add a Data Flow Task to the loop and create a simple data flow that reads one of your source files and adds the data to a database table. (The Flat File Source/Connection Manager can be replaced by an Excel Source/Connection Manager.)

Add Data Flow Task. => Add Source & Destination

5) Connection Manager Expression
The Flat File Connection Manager YourSourceFile from the previous step is still hardcoded with the path of one of your source files, while the variable FilePath contains the file path supplied by the Foreach Loop.

To replace this hardcoded path with the value of the variable, we use an expression on the Connection Manager. Go to the properties of the Flat File Connection Manager and add a new expression.

Add an expression

Note: This step is the same if you use an Excel Connection Manager.


6) Expression
Select the ConnectionString property in the Property Expressions Editor and press the button to edit the expression. In the Expression Builder, drag the variable FilePath to the expression textbox.


Build the expression. Use Evaluate to test the expression.

Note: If you used an Excel Connection Manager, then don't add an expression on the Connection String, but on ExcelFilePath. The rest is the same.
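
To see how steps 3, 5 and 6 work together at run time, here is a rough model of one loop run. It is only a sketch in Python, not something SSIS executes: the function run_foreach_loop, the dictionary standing in for the package variables, and the two file paths are all assumptions made for illustration.

```python
def run_foreach_loop(file_paths, data_flow_task):
    """Run data_flow_task once per file, the way the configured loop does."""
    variables = {"FilePath": ""}  # stands in for the string variable from step 1
    for path in file_paths:
        # Step 3: index 0 of the collection is mapped to the variable.
        variables["FilePath"] = path
        # Steps 5 and 6: the expression on the connection manager is
        # re-evaluated, so the connection string follows the variable.
        connection_string = variables["FilePath"]
        data_flow_task(connection_string)


# Hypothetical source files; the "task" here just records what it was given.
processed = []
run_foreach_loop([r"C:\Source\a.csv", r"C:\Source\b.csv"], processed.append)
```

After the run, processed holds both paths in loop order, which mirrors the Data Flow Task being executed once for each file with a different connection string.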


7) Testing
Now your Foreach Loop is ready for testing. You could also add a File System Task that moves the imported files to an archive folder, or a Script Task that archives the source files in a zip file.
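
The archiving idea boils down to moving each file out of the source folder once it has been imported. The sketch below shows that logic in Python for illustration only; in the package itself this is what the File System Task would do. The function name archive_file is made up for this example, and the sketch assumes no file with the same name already exists in the archive folder.

```python
import shutil
from pathlib import Path


def archive_file(file_path, archive_folder):
    """Move an imported file into archive_folder and return its new path."""
    target_dir = Path(archive_folder)
    target_dir.mkdir(parents=True, exist_ok=True)  # create the archive folder if needed
    target = target_dir / Path(file_path).name
    shutil.move(str(file_path), str(target))
    return str(target)
```

Because the file leaves the source folder, a later run of the loop will not pick it up again.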

File System Task

Note: the File Enumerator only loops through files. It doesn't return folders. Read this article for a Foreach Folder Enumerator.
Note: the File Enumerator can't be sorted. Read this article for a sorted File Enumerator.
