Monday, 29 April 2013

Integration Services Data Transformations

Integration Services Data Transformations:

SQL Server Integration Services transformations are the components in the data flow of a package that aggregate, merge, distribute, and modify data. Transformations can also perform lookup operations and generate sample datasets. This section describes the transformations that Integration Services includes and explains how they work.
The following transformations perform business intelligence operations such as cleaning data, mining text, and running data mining prediction queries.
Slowly Changing Dimension Transformation
The transformation that configures the updating of a slowly changing dimension. The Slowly Changing Dimension transformation coordinates the updating and inserting of records in data warehouse dimension tables.
Important
The Slowly Changing Dimension Wizard only supports connections to SQL Server.
Fuzzy Grouping Transformation
The Fuzzy Grouping transformation performs data cleaning tasks by identifying rows of data that are likely to be duplicates and selecting a canonical row of data to use in standardizing the data.
Fuzzy Lookup Transformation
The transformation that looks up values in a reference table using a fuzzy match. The Fuzzy Lookup transformation performs data cleaning tasks such as standardizing data, correcting data, and providing missing values.
Term Extraction Transformation
The Term Extraction transformation extracts terms from text in a transformation input column, and then writes the terms to a transformation output column. The transformation works only with English text and it uses its own English dictionary and linguistic information about English.
Term Lookup Transformation
The Term Lookup transformation matches terms extracted from text in a transformation input column with terms in a reference table. It then counts the number of times a term in the lookup table occurs in the input data set, and writes the count together with the term from the reference table to columns in the transformation output. This transformation is useful for creating a custom word list based on the input text, complete with word frequency statistics.
Data Mining Query Transformation
The transformation that runs data mining prediction queries.
 
 
Row Transformations:
The following transformations update column values and create new columns. The transformation is applied to each row in the transformation input.
Character Map Transformation
The Character Map transformation applies string functions, such as conversion from lowercase to uppercase, to character data. This transformation operates only on column data with a string data type.
You configure the Character Map transformation in the following ways:
· Specify the columns to convert.
· Specify the operations to apply to each column.
Copy Column Transformation
The transformation that adds copies of input columns to the transformation output. The Copy Column transformation creates new columns by copying input columns and adding the new columns to the transformation output.
Data Conversion Transformation
The transformation that converts the data type of a column to a different data type.
Derived Column Transformation
The Derived Column transformation creates new column values by applying expressions to transformation input columns
Export Column Transformation
The Export Column transformation reads data in a data flow and inserts the data into a file. For example, if the data flow contains product information, such as a picture of each product, you could use the Export Column transformation to save the images to files.
Import Column Transformation
The Import Column transformation reads data from files and adds the data to columns in a data flow. Using this transformation, a package can add text and images stored in separate files to a data flow
Script Component
The Script component hosts script and enables a package to include and run custom script code
OLE DB Command TransformationThe OLE DB Command transformation runs an SQL statement for each row in a data flow. For example, you can run an SQL statement that inserts, updates, or deletes rows in a database table.
 
Rowset Transformations:
The following transformations create new rowsets. The rowset can include aggregate and sorted values, sample rowsets, or pivoted and unpivoted rowsets.
Aggregate Transformation
The Aggregate transformation applies aggregate functions, such as Average, to column values and copies the results to the transformation output. Besides aggregate functions, the transformation provides the GROUP BY clause, which you can use to specify groups to aggregate across.
Sort Transformation
The Sort transformation sorts input data in ascending or descending order and copies the sorted data to the transformation output. You can apply multiple sorts to an input; each sort is identified by a numeral that determines the sort order
Percentage Sampling Transformation
The Percentage Sampling transformation creates a sample data set by selecting a percentage of the transformation input rows. The sample data set is a random selection of rows from the transformation input, to make the resultant sample representative of the input.
Row Sampling TransformationThe Row Sampling transformation is used to obtain a randomly selected subset of an input dataset. You can specify the exact size of the output sample, and specify a seed for the random number generator.
Pivot Transformation
The Pivot transformation makes a normalized data set into a less normalized but more compact version by pivoting the input data on a column value
Unpivot Transformation
The Unpivot transformation makes an unnormalized dataset into a more normalized version by expanding values from multiple columns in a single record into multiple records with the same values in a single column
 

Split and Join Transformations:
The following transformations distribute rows to different outputs, create copies of the transformation inputs, join multiple inputs into one output, and perform lookup operations.
Conditional Split Transformation
The Conditional Split transformation can route data rows to different outputs depending on the content of the data. The implementation of the Conditional Split transformation is similar to a CASE decision structure in a programming language. The transformation evaluates expressions, and based on the results, directs the data row to the specified output. This transformation also provides a default output, so that if a row matches no expression it is directed to the default output.
Multicast Transformation
The Multicast transformation distributes its input to one or more outputs. This transformation is similar to the Conditional Split transformation. Both transformations direct an input to multiple outputs. The difference between the two is that the Multicast transformation directs every row to every output, and the Conditional Split directs a row to a single output
Union All Transformation
The Union All transformation combines multiple inputs into one output. For example, the outputs from five different Flat File sources can be inputs to the Union All transformation and combined into one output
Merge Transformation
The Merge transformation combines two sorted datasets into a single dataset. The rows from each dataset are inserted into the output based on values in their key columns
Merge Join Transformation
The Merge Join transformation provides an output that is generated by joining two sorted datasets using a FULL, LEFT, or INNER join
Lookup Transformation
The Lookup transformation performs lookups by joining data in input columns with columns in a reference dataset. You use the lookup to access additional information in a related table that is based on values in common columns.
Cache Transform
The transformation that writes data from a connected data source in the data flow to a Cache connection manager that saves the data to a cache file. The Lookup transformation performs lookups on the data in the cache file
Note: The Cache Transform writes only unique rows to the Cache connection manager.

 

Auditing Transformations:

Integration Services includes the following transformations to add audit information and count rows.
Audit Transformation
The Audit transformation enables the data flow in a package to include data about the environment in which the package runs.
Row Count Transformation
The Row Count transformation counts rows as they pass through a data flow and stores the final count in a variable

No comments:

Post a Comment