Skip to main content

How to perform join between data flow/data set in SSIS

Lookup transformation is responsible to join columns to the data flow/data set, by accessing additional information in a related table based on values in common columns.

Problem : We have a table EMP which data are fetching by OLEDB Source, this table contains column FName, LName, Salary  and Country. In the flow we need to add column empAddress and empPhone of Employees which information are in EMPDETAILS table.

Solution : We would use Lookup Transformation. Now see below the use of lookup transformation..


Create table EMP(

Id int identity(1,1),

FName VARCHAR(50),

LName VARCHAR(50),

Salary DECIMAL(18,2),

Country VARCHAR(50)

)


insert into EMP (FName,LName,Salary,Country)

Values('Dilip','Singh',20000,'IND')

insert into EMP (FName,LName,Salary,Country)

Values('Viru','Verma',40000,'USA')

insert into EMP (FName,LName,Salary,Country)

Values('Anil','Singh',30000,'UK')

insert into EMP (FName,LName,Salary,Country)

Values('Raju','Shukla',25000,'IND')

insert into EMP (FName,LName,Salary,Country)

Values('Raj','Kushwaha',20000,'AUS')

insert into EMP (FName,LName,Salary,Country)

Values('Salim','Sekh',26000,'PAK')

insert into EMP (FName,LName,Salary,Country)

Values('Nirmit','Katihar',27000,'AUS')

insert into EMP (FName,LName,Salary,Country)

Values('Dilip','Singh',23000,'AUS')
select * from EMP

Create Table EMPDETAILS

(

Id int identity(1,2),

empId int not null,

empAddress varchar(200),

empPhone Varchar(20)

)

Insert into EMPDETAILS(empId,empAddress,empPhone)

Values (1,'Delhi','9015518898')

Insert into EMPDETAILS(empId,empAddress,empPhone)

Values (2,'california','8898283856')

Insert into EMPDETAILS(empId,empAddress,empPhone)

Values (3,'Bristol','8898283226')

Insert into EMPDETAILS(empId,empAddress,empPhone)

Values (4,'Haryana','9073348898')

Insert into EMPDETAILS(empId,empAddress,empPhone)

Values (5,'Kingston','8875647643')

Insert into EMPDETAILS(empId,empAddress,empPhone)

Values (6,'Lahor','8876664646')

Insert into EMPDETAILS(empId,empAddress,empPhone)

Values (7,'Lahor','8876664646')

Insert into EMPDETAILS(empId,empAddress,empPhone)

Values (18,'UP','9087765564')




First drag and drop an OLEDB Source and make connection manager then select EMP table (I hope you know about OLEDB Source) . In data set you will find only columns like FName, LName, Salary, Country now we have requirement additional information like Address and Phone which belongs to EMPDETAILS table. 

  




Now we will use Lookup Transformation to  add or join column from EMPDETAILS table on the basis of common column, In EMPDETAILS table empId is the foreign key of EMP it means empId is common between EMP and EMPDETAILS on this basis we can perform Lookup Transformation.

Drag and drop Lookup Transformation and add flow path from Oledb Source and edit it. It will look like below .



Here I will stop the example and start the explain about lookup


GENERAL



Full cache: Full Cache is responsible to generate and load the reference dataset into cache before the Lookup transformation is executed.

Partial cache: Partial Cache is responsible to generate the reference dataset during the execution of the Lookup transformation. Load the rows with matching entries in the reference dataset and the rows with no matching entries in the dataset into cache.

No cache: No cache is responsible to generate the reference dataset during the execution of the Lookup transformation. No data is loaded into cache.

Cache connection manager: Cache connection manager is responsible to configure the Lookup transformation to use a Cache connection manager. This option is available only if the Full cache option is selected.
OLE DB connection manager: OLEDB connection manager is responsible to configure the Lookup transformation to use an OLE DB connection manager.

Specify how to handle rows with no matching entries

Select an option for handling rows that do not match at least one entry in the reference data set.

When you select Redirect rows to no match output, the rows are redirected to a no match output and are not handled as errors. The Error option on the Error Output page of the Lookup Transformation Editor dialog box is not available.

When you select any other option in the Specify how to handle rows with no matching entries list box, the rows are handled as errors. The Error option on the Error Output page is available.


CONNECTION



OLE DB connection manager: Here you can select an existing OLE DB connection manager from the list, or create a new connection by clicking New.

Use a table or view : Here you can select table from the list or you can create new table or view on by clicking New.

Use Result of an SQL Query: By choosing this option you can browse to a pre-existing query, build a new query check query syntax and preview query result.

Build Query: Create the Transact-SQL statement to run by using Query Builder, a graphical tool that is used to create queries by browsing through data. It look like below.


   
Browse Query: Using this option you can browse pre-existing query.

Parse Query: This is responsible to check the syntax of the query

Preview: This is responsible to preview query result.

COLUMNS


Available Inputs Column: This columns are come through data flow from a connection source. The input columns and lookup column must have matching data types. Use a drag-and-drop operation to map available input columns to lookup columns. You can also map input columns to lookup columns using the keyboard, by highlighting a column in the Available Input Columns table, pressing the Application key, and then clicking Edit Mappings.

Available Lookup Columns: The lookup columns are those columns comes through reference table in which you want to lookup values that match the input column. Use a drag-and-drop operation to map available input columns to lookup columns. Use the check boxes to select lookup columns in the reference table on which to perform lookup operations.

Lookup Column: These columns comes when you select column from available lookup columns. The selections are reflected in the check box selections in the Available Lookup Columns table.

Lookup Operation:  Select option to perform operation on lookup column.

Output Alias: The selections are reflected in the check box selections in the Available Lookup Columns table.

ADVANCED




Note: This option available only for Partial and no cache.

Cache size (32-bit):  Adjust the cache size (in megabytes) for 32-bit computers. The default value is 5 megabytes.

Cache size (64-bit):  Adjust the cache size (in megabytes) for 64-bit computers. The default value is 5 megabytes.

Enable cache for rows with no matching entries:  Cache rows with no matching entries in the reference dataset.

Allocation from cache: Specify the percentage of the cache to allocate for rows with no matching entries in the reference dataset.
Modify the SQL statement:  Modify the SQL statement that is used to generate the reference dataset.

Error and Output : I will explain it latter in Error Handling..

Now I am starting output explanation..

Please apply Data viewers on every path, input data looks like below.



Here only on column match with column in reference table. for example id of EMP(input table) is matched with empId in EMPDETAILS(reference table).



 Here show matched output data, you can see empAddress and empPhone columns are join in dataset.

like this you can see not matched data as given below..










Popular posts from this blog

What is difference between UNION and UNION ALL in SQL Server

We use UNION and UNION ALL operator to combine multiple results set into one result set.
UNION operator is used to combining multiple results set into one result set but removes any duplicate rows. Basically, UNION is used to performing a DISTINCT operation across all columns in the result set. UNION operator has the extra overhead of removing duplicate rows and sorting result.
UNION ALL operator use to combine multiple results set into one result set but it does not remove any duplicate result. Actually, this does not remove duplicate rows so it is faster than the UNION operator. If you want to combine multiple results and without duplicate records then use UNION otherwise UNION ALL is better.
Following some rules for using UNION/UNION ALL operator
1.The number of the column should be the same in the query's when you want to combine them. 2.The column should be of the same data type. 3.ORDER BY clause can be applied to the overall result set not within each result set.
4.Column name of …

Dropdownlist selectedindexchanged event is not firing

<asp:DropDownList ID="ddlSource" runat="server" DataSourceID="SqlDataSource1" DataTextField="vcSuplierNm" ViewStateMode="Enabled DataValueField="vcSuplierCode" EnableViewState="true" AppendDataBoundItems="true" OnSelectedIndexChanged="ddl_OnSelectedIndexChanged" AutoPostBack="true"></asp:DropDownList>

Add property ViewStateMode="Enabled" and EnableViewState="true"
in drop DropDownList

Remove special characters from string in SQL server

I faced many times an issue to remove special characters from a string. Suppose you are working on searching concept and you have to remove the special characters from search string due to query performance, there are many solution are available but T-SQL is easily resolved this issue.
Following query may help you to resolve your issue.

DECLARE@strVARCHAR(400) DECLARE@expresVARCHAR(50)='%[~,@,#,$,%,&,*,(,),.,!]%' SET@str='(remove) ~special~ *characters. from string in sql!' WHILEPATINDEX(@expres,@str)> 0 BEGIN SET@str=Replace(REPLACE(@str,SUBSTRING(@str,PATINDEX(@expres,@str), 1 ),''),'-',' ') END SELECT@str



Add day to ISODate in MongoDB

We can use $add operator to add days in ISODate in mongodb, $add is the Arithmetic Aggregation Operator which adds number and date in mongodb.
Syntax:

{ $add: [ <expression1>, <expression2>, ... ] }

Note:  If one of the argument is date $add operator treats to other arguments as milliseconds to add to the date.
Example: Suppose we have a Test collection as below.

{"Title" : "Add day to ISODate in MongoBD","CreatedDate" : ISODate("2016-07-07T08:00:00.000Z")}

Query to add 2 days in CreatedDate

db.Test.aggregate([      { $project: { Title: 1, AddedDate: { $add: [ "$CreatedDate", 2*24*60*60000 ] } } }    ])

Result:

{ "_id" : ObjectId("579a1567ac1b3f3732483de0"), "Title" : "Add day to ISODate in MongoBD", "AddedDate" : ISODate("2016-07-09T08:00:00.000Z") }

Note: As mentioned in above note we have to convert days in millisecond because $add operator treat to other arg…

Remove special character from string in MongoDB

Problem: Suppose wehave a collection and one field is type string contains some special character (like !@#$%) and we don’t want these special character.
Solution: We can easily remove the special character from field using script “replace(/[^a-zA-Z 0-9 ]/g, '')” in our query.  How can we remove special character from string using this script please see following example.
Example: Suppose we have a collection “EduSurvey “where we are collecting information from institutions.

{Name:"JB institute”, About:"This is good one collage for MBA", Information:"This $%%institute ##has good faculty etc$$"}
{Name:"MK institute”, About:"This is good one collage for MCA", Information:"This$$%# is the dummy text12"}
{Name:"MG institute”, About:"This is good one collage for B,Tech", Information:"This# institute@ has&* good infrastructure"}

Did you notice Information fields contains some special character so we…

$group (aggregation) in MongoDB

IF we want to group document by specific expression and want to output for each distinct grouping of document here we have to use $group. IF you familiar with Relational Database like SQL Server, It's work same like GROUP BY clause.
Output document contains the _id field which contains the distinct group by key also output document contains computed fields which grasp the value of some accumulator expression grouped by the _id.
Syntax:

{ $group: { _id: <expression>, <field1>: { <accumulator1> : <expression1> }, ... }

Accumulator Operator:
$num : It's returns sum of the numeric value. It can be use in $project also in mongodb 3.2 version.
$avg  : It's returns the average of numeric values. It can be use in $project also in mongodb 3.2 version.
$first : It's returns a value from the first document for each group.
$last : It's returns a value from the first document for each group.
$max : It's returns the highest expression value for each group.…

Add a column, with a default value, to an existing table in SQL Server

You can do it by edit of table design, if you want do it with query see following.
Syntax:
ALTERTABLE{TABLENAME} ADD{COLUMNNAME}{TYPE}{NULL|NOTNULL} CONSTRAINT{CONSTRAINT_NAME}DEFAULT{DEFAULT_VALUE} [WITH VALUES]
Note: Use WITH VALUES to update existing null-able rows.
Example:
ALTERTABLEMyTable ADDMyTableTypeIDINTNOTNULL CONSTRAINTConstraint_MyTableTypeIDDEFAULT0 GO
Note: WITH VALUES handles the NOT NULL part
ALTERTABLEMyTable ADDMyTableTypeIDINT CONSTRAINTConstraint_MyTableTypeIDDEFAULT

Merge and Merge join transformation in SSIS