Delete duplicate records in SQL Server using ROW_NUMBER

In this SQL Server tutorial, we will learn How to delete duplicate rows in SQL Server, Different methods to delete duplicate rows in SQL Server, and will cover the following topics

Delete duplicate records in SQL Server using ROW_NUMBER

In this SQL Server tutorial, we will learn How to delete duplicate rows in SQL Server, Different methods to delete duplicate rows in SQL Server, and will cover the following topics.

  • How to find duplicate rows in SQL Server
  • Delete Duplicate Rows in SQL Server using Query
  • Delete Duplicate Rows in SQL Server based on one column
  • Delete Duplicate Rows in SQL Server using CTE
  • Delete Duplicate Rows in SQL Server using Row_Number
  • Delete Duplicate Rows in SQL Server using Rank

While developing SQL Server objects, we should adhere to specific best practices, which will eventually help to maintain data integrity and better performance. For example, a table should have primary keys, identity columns, indexes, and constraints.

Also, to ensure that there is no duplicate entry in the SQL Server is a best practice. So in this post, we will understand different ways to find and remove duplicate rows in SQL Server 2019.

Now to remove the duplicate rows from a table, first, we should have a table with duplicate entries. So to implement different ways to remove the duplicate entries, we are going to use the following sample table given below.

Sample table having duplicate rows in SQL Server

Sample Table

And the sample table has a total of 10 records, out of which 4 are duplicate rows.

Here I have used sql server 2019 express edition and sql server management studio.At A Glance - Here's what we'll cover: showHow to find duplicate rows in SQL ServerDelete Duplicate Rows in SQL Server using QueryDelete Duplicate Rows in SQL Server using Row_NumberDelete Duplicate Rows in SQL Server using CTEDelete Duplicate Rows in SQL Server using RankDelete Duplicate Rows in SQL Server based on one column

How to find duplicate rows in SQL Server

Now before directly trying to delete the duplicate rows in SQL Server, lets first try to find the duplicate entries. And for this implementation, we can use the GROUP BY and HAVING clauses in SQL Server 2019.

The GROUP BY clause is used to group multiple rows that have the same values in them. And HAVING is an optional condition used to restrict the rows affected by the GROUP BY clause.

Now to get the duplicate rows using the GROUP BY and HAVING clause, we can follow the following syntax.SELECT col_1, col_2,..,COUNT(*) AS [Count] FROM table_nameGROUP BY col_1, col_2,.. HAVING COUNT(*) > 1;

By using the above syntax, we are grouping the duplicate rows by using the GROUP BY clause. After this, we are using the COUNT function to get the count of multiple duplicate records.

Now lets take an example to understand this implementation, and for this, consider the following query given below.SELECT first_name, last_name, email, Country, COUNT(*) AS [Count] FROM SampleTableGROUP BY first_name, last_name, email, CountryHAVING COUNT(*) > 1;

Now for example demonstration, we are using the sample table. And by using the above query, we are grouping all the records where first_name, last_name, email, and Country values are the same. After this, we are using the Count() function to get the count of duplicated records.

So after successfull query execution, we will get the follwing output.

Finding duplicate rows in sql server

Finding duplicate rows in SQL Server

Read: SQL Server Create Temp Table

Delete Duplicate Rows in SQL Server using Query

So till now, we have learned how to find the number of duplicated records in a SQL Server table. But if we want to remove the duplicate rows or records, first, we need to keep one record out of duplicated ones in our table, and then remove the remaining duplicated values.

For example, in our sample table records at id 2 and 3 are duplicate records. So we only have to delete one of them. Now for this implementation, first we have to select the records, which we want in our table and then remove the remaining ones.

Now for demonstration, consider the following query given below used to delete duplicate records from the sample table.DELETE FROM SampleTable WHERE id NOT IN ( SELECT MIN(id) FROM SampleTable GROUP BY first_name, last_name, email, Country);

In the above example, first, we are using the SELECT statement to select all the duplicate records with a minimum id value. So that we can select the initial records. After this, we are using the DELETE statement to delete the remaining records from the sample table.

After successfully executing the above query, all the duplicate records will be deleted. And now if we query the table, we will get the following records.

removing duplicate rows in sql server

Remaining Table Records

Delete Duplicate Rows in SQL Server using Row_Number

The ROW_NUMBER in SQL Server is a function that returns a sequential integer value to each row within a result sets partition. And it has the following syntax.ROW_NUMBER() OVER( [ PARTITION BY value_expression , ... [ n ] ] order_by_clause )

The Row_Number() function is always used with OVER and the OVER function accepts to parameters. First is the Partition By clause used to partion the data, and second is the Order By clause used to arrange columns in a specific order.

Now we can use the Row_Number() function to assign ranks to duplicate rows in the SQL Server table. And based upon that rank, we can delete that row from the table.

For this demonstration, consider the following example given below.DELETE duplicate_recordsFROM ( SELECT *, DupRank = ROW_NUMBER() OVER( PARTITION BY first_name, last_name, email, Country ORDER BY (SELECT NULL) ) FROM SampleTable ) AS duplicate_recordsWHERE DupRank > 1

In the above example, we are using the Row_Number() function to assign a rank to all the duplicate rows in the sample table. After this, we are using the DELETE statement to delete all the records where the rank is greater than 1.

In our case, there was a total of 4 duplicate records in the sample table, so the query will delete those 4 records from the table.

using row_number to delete duplicate rows

Using Row_Number() to delete duplicate rows

Read: SQL Server Replace Function + Examples

Delete Duplicate Rows in SQL Server using CTE

We can also delete duplicate rows or records in SQL Server, by using Common Table Expression. The Common Table Expression, commonly known as CTE in SQL Server, is used to temporarily store the result set with a name. And later it can be used within SELECT, INSERT, UPDATE, or DELETE statements.

Now to define a CTE in SQL Server, we can use the following syntax.WITH expression_name [ ( column_name [,...n] ) ] AS ( CTE_query_definition )

And to view the result of a common table expression, we use a SELECT query with the CTE expression name.SELECT [column1,column2,..] from expression_name

Now to delete the duplicate records, first, we need to select them, and for this purpose, we are going to use the ROW_NUMBER() function.

For this demonstration, consider the following query given below.WITH CTE( first_name, last_name, email, Country, duplicate_count ) AS (SELECT first_name, last_name, email, Country, ROW_NUMBER() OVER(PARTITION BY first_name, last_name, email, Country                                                     ORDER BY id) AS duplicate_count FROM SampleTable ) SELECT * FROM CTE

In the above query, we are using the SELECT statement as CTE_query to define selected columns. And within the SELECT statement, we are using the Row_Number() function to partition and order the result set based upon the same values. In the end, we are using the CTE expression name to query the table and it will return the following output.

using cte to delete duplicate rows

Duplicate Records

From the above result, it is clearly visible the all the duplicate rows are assigned duplicate_count more than 1. And we can easily use this property to remove the duplicate rows.WITH CTE( first_name, last_name, email, Country, duplicate_count) AS (SELECT first_name, last_name, email, Country, ROW_NUMBER() OVER(PARTITION BY first_name, last_name, email, Country ORDER BY id) AS duplicate_count FROM SampleTable ) DELETE FROM CTE WHERE duplicate_count > 1;

After executing the above query, all the duplicate rows will be deleted (dropped) from the table. In our case, there were 4 duplicate rows in the sample table.

using cte to drop duplicate rows in sql server

Using CTE to drop duplicate rows

Read: IDENTITY_INSERT in SQL Server

Delete Duplicate Rows in SQL Server using Rank

The Rank is also a function in SQL Server, that is used to generate unique values for each row. The Rank() function is very much similar to Row_Number() function, but the Rank() function leaves a gap between the groups.

Now for this implementation, first, we will use the Rank() function to assign ranks to duplicate rows in the table. Then we will use the DELETE statement to delete the duplicate rows whose rank is greater than 1.

For demonstration, consider the following code given below.DELETE duplicate_recordsFROM ( SELECT *, DupRank = RANK() OVER( PARTITION BY first_name, last_name, email, Country ORDER BY Id ASC ) FROM SampleTable ) AS duplicate_recordsWHERE DupRank > 1

In this example, we are using the RANK() function to assign a rank value to each record in the table based upon partition. After this, we are using the DELETE statement to remove all the records where the rank value is greater than 1.

As we are using a sample table in the example which carries 4 duplicate rows. So the query will remove those 4 duplicate rows from the table.

using rank to delete duplicate rows in sql server

Using Rank() to delete duplicate rows

Delete Duplicate Rows in SQL Server based on one column

All the methods discussed to delete duplicate rows in SQL Server till now can work perfectly for multiple as well as single duplicate columns. But for most of the examples discussed until now are based on multiple columns. So in this section, we will use one of the methods to delete duplicate rows based upon one column.

For this implementaion, consider the following query given below to delete duplicate records in SQL Server.DELETE duplicate_records FROM ( SELECT *, DupRank = RANK() OVER( PARTITION BY first_name ORDER BY Id ASC ) FROM SampleTable ) AS duplicate_records WHERE DupRank > 1

In the above-given query, we are using the Rank() function to partition the table based upon the first_name column. And then, the Rank() function will assign rank numbers to each row in the table based upon partition. In the end, we are using the DELETE statement to remove all the rows where the rank number is greater than 1.

Delete Duplicate Rows in SQL Server based on one column

Delete Duplicate Rows based on one column

As in our case, there was a total of 4 duplicate records where the first_name values were the same. So the query will remove 4 rows from the sample table.

You may like the following sql server tutorials:

  • SQL Server Add Column + Examples
  • SQL Server Agent wont start
  • Exception Handling in SQL Server
  • SQL Server convert integer to string
  • How to execute function in SQL with parameters

So in this SQL Server 2019 tutorial, we have learned How to delete duplicate rows in SQL Server, Different methods to delete duplicate rows in SQL Server, and we also covered the following topics.

  • How to find duplicate rows in SQL Server
  • Delete Duplicate Rows in SQL Server using Query
  • Delete Duplicate Rows in SQL Server based on one column
  • Delete Duplicate Rows in SQL Server using CTE
  • Delete Duplicate Rows in SQL Server using Row_Number
  • Delete Duplicate Rows in SQL Server using Rank
Bijay Kumar Sahoo

Bijay Kumar Sahoo

After working for more than 15 years in the Software field, especially in Microsoft technologies, I have decided to share my expert knowledge of SQL Server. Check out all the SQL Server and related database tutorials I have shared here. Most of the readers are from countries like the United States of America, the United Kingdom, New Zealand, Australia, Canada, etc. I am also a Microsoft MVP. Check out more here.

Video liên quan