RDBMS database best design practice never suggests us to keep duplicate records in a table, and If we are working on database schema/table design, we must create the primary key for each. But, many scenarios arise where duplicate records may insert in table, suppose we are working on the Development Database, and data is loaded from different sources where duplicate rows are possible. When we process this data to another database server (PreProd/ Prod), the data should be clean. Therefore, we need to delete the duplicate data.
CREATE TABLE TEMP_EMPLOYEE
(
empId INT IDENTITY(1,1),
FName VARCHAR(100),
LName VARCHAR(100),
Salary DECIMAL(18,2),
Country VARCHAR(100)
)
INSERT INTO TEMP_EMPLOYEE(FName,LName,Salary,Country)
VALUES ('Dilip','Singh',20000,'IND')
INSERT INTO TEMP_EMPLOYEE(FName,LName,Salary,Country)
VALUES ('Raj','Kumar',30000,'USA')
INSERT INTO TEMP_EMPLOYEE(FName,LName,Salary,Country)
VALUES ('Anil','Singh',25000,'AUS')
INSERT INTO TEMP_EMPLOYEE(FName,LName,Salary,Country)
VALUES ('Dilip','Singh',20000,'IND')
INSERT INTO TEMP_EMPLOYEE(FName,LName,Salary,Country)
VALUES ('Dilip','Singh',20000,'IND')
SELECT * FROM TEMP_EMPLOYEE
--Before the
deleting duplicate records, you must check it exists in your table or not
--We can check
duplicate records exist in the table using bellow query
SELECT FName,LName,Salary,Country FROM
TEMP_EMPLOYEE
GROUP BY FName, LName, Salary, Country HAVING COUNT(*)>1
-- To remove
Duplicate records from the table you can use below query
DELETE FROM TEMP_EMPLOYEE WHERE
empId NOT IN ( SELECT MAX(empId) FROM TEMP_EMPLOYEE GROUP
BY FName,LName,Salary,Country )
--**Use above
query, when you want to keep the latest record
--Otherwise you
can use below query
DELETE FROM TEMP_EMPLOYEE WHERE
empId NOT IN ( SELECT MIN(empId) FROM TEMP_EMPLOYEE GROUP
BY FName,LName,Salary,Country )
-- you can
remove duplicate records by using Common Table Expression query given below
;WITH CTE AS (
SELECT FName,LName,Salary,Country,ROW_NUMBER() OVER (PARTITION BY FName,LName,Salary,Country ORDER BY FName,LName,Salary,Country) AS [Rank] FROM TEMP_EMPLOYEE
)
DELETE FROM CTE WHERE [Rank]>1
SELECT * FROM TEMP_EMPLOYEE
DROP TABLE TEMP_EMPLOYEE