Lookup transformation is
responsible for joining columns to the data flow/data set, by accessing
additional information in a related table based on values in common columns.
Problem: We have a table EMP which
data are fetching by OLEDB Source, this table contains column FName, LName,
Salary, and Country. In the flow, we need to add column empAddress and empPhone
of Employees which information are in EMPDETAILS table.
Solution: We would use the Lookup
Transformation. Now see below the use of lookup transformation.
Create table EMP(
Id int identity(1,1),
FName VARCHAR(50),
LName VARCHAR(50),
Salary DECIMAL(18,2),
Country VARCHAR(50)
)
insert into EMP (FName,LName,Salary,Country)
Values('Dilip','Singh',20000,'IND')
insert into EMP (FName,LName,Salary,Country)
Values('Viru','Verma',40000,'USA')
insert into EMP (FName,LName,Salary,Country)
Values('Anil','Singh',30000,'UK')
insert into EMP (FName,LName,Salary,Country)
Values('Raju','Shukla',25000,'IND')
insert into EMP (FName,LName,Salary,Country)
Values('Raj','Kushwaha',20000,'AUS')
insert into EMP (FName,LName,Salary,Country)
Values('Salim','Sekh',26000,'PAK')
insert into EMP (FName,LName,Salary,Country)
Values('Nirmit','Katihar',27000,'AUS')
insert into EMP (FName,LName,Salary,Country)
Values('Dilip','Singh',23000,'AUS')
select * from EMP
Create Table EMPDETAILS
(
Id int identity(1,2),
empId int not null,
empAddress varchar(200),
empPhone Varchar(20)
)
Insert into EMPDETAILS(empId,empAddress,empPhone)
Values (1,'Delhi','9015518898')
Insert into EMPDETAILS(empId,empAddress,empPhone)
Values (2,'california','8898283856')
Insert into EMPDETAILS(empId,empAddress,empPhone)
Values (3,'Bristol','8898283226')
Insert into EMPDETAILS(empId,empAddress,empPhone)
Values (4,'Haryana','9073348898')
Insert into EMPDETAILS(empId,empAddress,empPhone)
Values (5,'Kingston','8875647643')
Insert into EMPDETAILS(empId,empAddress,empPhone)
Values (6,'Lahor','8876664646')
Insert into EMPDETAILS(empId,empAddress,empPhone)
Values (7,'Lahor','8876664646')
Insert into EMPDETAILS(empId,empAddress,empPhone)
Values (18,'UP','9087765564')
|
First, drag and drop an OLEDB
Source and make connection manager then select EMP table (I hope you know about
OLEDB Source). In the dataset, you will find only columns like FName, LName,
Salary, Country now we have required additional information like Address and
Phone which belongs to EMPDETAILS table.
Now we will use Lookup
Transformation to add or join column from EMPDETAILS table based on common
column, In EMPDETAILS table empId is the foreign key of EMP it means empId is
common between EMP and EMPDETAILS on this basis we can perform Lookup
Transformation.
Drag and drop Lookup
Transformation and add flow path from Oledb Source and edit it. It will look
like below.
Here I will stop the example and
start the explain about lookup.
GENERAL
Full cache: Full
Cache is responsible for generating and loading the reference dataset into
cache before the Lookup transformation is executed.
Partial cache: Partial
Cache is responsible for generating the reference dataset during the execution
of the Lookup transformation. Load the rows with matching entries in the
reference dataset and the rows with no matching entries in the dataset into the
cache.
No cache: No-cache is
responsible for generating the reference dataset during the execution of the
Lookup transformation. No data is loaded into cache.
Cache connection manager: Cache
connection manager is responsible for configuring the Lookup transformation to
use a Cache connection manager. This option is available only if the Full cache
option is selected.
OLE DB connection manager: OLEDB the connection manager is responsible for configuring the Lookup transformation to
use an OLE DB connection manager.
Specify how to handle rows with no
matching entries
Select an option for handling rows
that do not match at least one entry in the reference data set.
When you select Redirect rows to
no match output, the rows are redirected to a no-match output and are not
handled as errors. The Error option on the Error Output page of the Lookup
Transformation Editor dialog box is not available.
When you select any other option
in the Specify how to handle rows with no matching entries list box, the rows
are treated as errors. The Error option on the Error Output page is available.
CONNECTION
OLE DB connection manager: Here
you can select an existing OLE DB connection manager from the list, or create a
new connection by clicking New.
Use a table or view: Here
you can select a table from the list, or you can create a new table or view on
by clicking New.
Use Result of an SQL Query: By
choosing this option, you can browse to a pre-existing query, build a new query
check query syntax and preview query result.
Build Query:
Create the Transact-SQL statement to run by using Query Builder, a graphical
tool that is used to create queries by browsing through data. It looks like
below.
Browse Query: Using
this option, you can browse a pre-existing query.
Parse Query: This
is responsible for checking the syntax of the query
Preview: This
is responsible to preview query result.
COLUMNS
Available Inputs Column: These
columns come through data flow from a connection source. The input columns and
the lookup column must have matching data types. Use a drag-and-drop operation
to map available input columns to lookup columns. You can also map input
columns to lookup columns using the keyboard, by highlighting a column in the
Available Input Columns table, pressing the Application key, and then clicking
Edit Mappings.
Available Lookup Columns: The
lookup columns are those columns comes through a reference table in which you
want to look up values that match the input column. Use a drag-and-drop the operation to map available input columns to lookup columns. Use the checkboxes
to select lookup columns in the referenced table on which to perform the lookup
operations.
Lookup Column: These
columns comes when you select a column from available lookup columns. The
selections are reflected in the checkbox selections in the Available Lookup
Columns table.
Lookup Operation: Select
the option to perform an operation on the lookup column.
Output Alias: The
selections are reflected in the checkbox selections in the Available Lookup
Columns table.
ADVANCED
Note: This
option available only for Partial and no-cache.
Cache size (32-bit): Adjust
the cache size (in megabytes) for 32-bit computers. The default value is 5
megabytes.
Cache size (64-bit): Adjust
the cache size (in megabytes) for 64-bit computers. The default value is 5
megabytes.
Enable cache for rows with no matching
entries: Cache rows with no matching entries in the reference
dataset.
Allocation from cache: Specify
the percentage of the cache to allocate for rows with no matching entries in
the reference dataset.
Modify the SQL statement: Modify
the SQL statement that is used to generate the reference dataset.
Error and Output: I
will explain it later in Error Handling...
Now I am starting output
explanation...
Please apply Data viewers on
every path, input data looks like below.
Here only on column match with a
column in the referenced table. For example, the id of EMP(input table) is
matched with empId in EMPDETAILS(reference table).
Here show matched output data, you
can see empAddress and empPhone columns are joined in the dataset.
like this, you can see not matched
data as given below.