4+ Ways to Find Duplicates in a Column and Delete Rows in Excel
Duplicate data can be a real pain. It often sneaks up on you when you have to merge two or more sheets. In a small dataset you might be able to spot duplicates manually, but the problem gets bigger as your data grows. To tackle this problem, let’s look at 4 ways to find duplicate values in columns and then delete rows in Excel.
The dataset I have used throughout this article (download at button above ⬆️) contains four columns with details of different holidays in the USA.
Method 1: Filter and Remove Duplicate Values in Excel with the Remove Duplicates Command
The simplest way to remove duplicates in a dataset is by using the built-in Remove Duplicates command. It identifies duplicates, then deletes them, so that only unique values remain. To do that, follow the steps below:
- Select the entire data table (click anywhere in the dataset and press CTRL+A).
- Click on the Data tab.
- Under the Data Tools group, click on the Remove Duplicates command. Alternatively, use the keyboard shortcut Alt, A, M.
The Remove Duplicates dialog box will pop up.
- Tick all the columns that need to match, before a row is considered a duplicate. E.g. in our example, all four columns should match before it is considered a duplicate row.
- Check that the My data has headers checkbox is selected, and hit OK.
After clicking OK, you will see a dialog box that shows the number of duplicates found, i.e. the number of rows that have been deleted, as well as the number of unique items remaining:
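For readers who want to see the logic outside Excel, here is a minimal Python sketch of what the Remove Duplicates command appears to do: keep the first row for each combination of the ticked columns, and report the two counts shown in the final dialog box. The function name `remove_duplicates`, the column layout and the sample rows are all hypothetical; they are not taken from the article's dataset.

```python
def remove_duplicates(rows, key_columns):
    """Keep the first row for each combination of values in key_columns.

    rows: list of tuples, one per worksheet row (header excluded).
    key_columns: 0-based indexes of the columns that must all match.
    Returns (unique_rows, number_removed).
    """
    seen = set()
    unique_rows = []
    for row in rows:
        key = tuple(row[i] for i in key_columns)
        if key not in seen:
            seen.add(key)
            unique_rows.append(row)
    return unique_rows, len(rows) - len(unique_rows)


# Hypothetical sample rows: (date, day, holiday name, type).
holidays = [
    ("2024-01-01", "Monday", "New Year's Day", "Federal"),
    ("2024-07-04", "Thursday", "Independence Day", "Federal"),
    ("2024-01-01", "Monday", "New Year's Day", "Federal"),         # exact copy of the first row
    ("2024-07-04", "Thursday", "Independence Day", "Observance"),  # differs in the last column
]

# Ticking all four columns means a row is a duplicate only if every cell matches.
unique, removed = remove_duplicates(holidays, key_columns=[0, 1, 2, 3])
print(f"{removed} duplicate rows removed; {len(unique)} unique rows remain.")
```

Passing fewer indexes in `key_columns` corresponds to ticking fewer columns in the dialog box, which makes the match looser and removes more rows.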
Method 2: Find Duplicates in Columns and Delete Rows in Excel with Conditional Formatting
You can also get help from Conditional Formatting to detect duplicates in columns, and then delete the extra rows.
But be careful: for Conditional Formatting to pick up a real duplicate row, we need to create an extra column that concatenates all of the values together, and then assess this new column for duplicates.
I can hear you asking, “Why can’t we just apply conditional formatting to the whole table and then look for rows that are completely highlighted?”
The reason is simple… This approach might lead to false positives and as a result, unique rows can be deleted in error.
Look at this example – The table below lists the pupils from a few different schools who scored 100% in a subject:
In this image it looks like there are a number of rows that are duplicates, e.g. Harry from Hogwarts, who scored 100% in Maths.
But if we look closely, we can see that NONE of them are duplicated! The only other Harry is on row 9: this Harry is from Springfield Elementary, and scored 100% in English. If we deleted one of these rows, we would have accidentally deleted a unique record!
To prevent this from happening, we need to create a helper column that contains the values from ALL of the other columns, and then we assess this new column for duplicates.
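The false-positive problem can be reproduced in a few lines of Python. This is only a sketch under stated assumptions: the four pupil rows are hypothetical stand-ins for the table described above, and the `"|"` separator in the helper value is my own addition (the article's formulas join the cells with no separator).

```python
from collections import Counter

# Hypothetical rows in the spirit of the example: (name, school, subject).
pupils = [
    ("Harry", "Hogwarts", "Maths"),
    ("Lisa", "Springfield Elementary", "Maths"),
    ("Hermione", "Hogwarts", "English"),
    ("Harry", "Springfield Elementary", "English"),
]

# Highlighting duplicates across the whole table flags any value that
# occurs more than once anywhere in the selected cells.
cell_counts = Counter(value for row in pupils for value in row)
fully_highlighted = [
    row for row in pupils if all(cell_counts[value] > 1 for value in row)
]

# A helper column joins each row into one value, so only rows that are
# identical in every column produce the same helper value.
helper = ["|".join(row) for row in pupils]
helper_counts = Counter(helper)
true_duplicates = [
    row for row, key in zip(pupils, helper) if helper_counts[key] > 1
]

print(len(fully_highlighted), "rows look fully highlighted")
print(len(true_duplicates), "rows are real duplicates")
```

Both Harry rows end up fully highlighted because each of their values appears somewhere else in the table, yet the helper values differ, so neither row is a real duplicate.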
To use Conditional Formatting to find duplicates in columns and delete rows in Excel, follow these steps:
- First we need to create our helper column. In cell E2, enter the following formula:
- =CONCATENATE(A2, B2, C2, D2)
- Or, if you are using an Excel version after Office 2016:
- =CONCAT(A2:D2)
- Copy the formula down, all the way to cell E16.
- Select the entire data table.
- Now, click on the Home tab from the ribbon.
- Under the Styles group, hit the Conditional Formatting drop-down list.
- Put the cursor on the Highlight Cells Rules option.
- Select the Duplicate Values command.
The Duplicate Values dialog box will pop up.
- Pick a preferable color from the “Values with” drop-down menu.
I chose the Light Red Fill with Dark Red Text option.
- Click OK.
Now, you will see the red-colored boxes. However, not every row has all its cells marked red.
- Find the rows where every cell, including the helper cell in column E, is marked red. These are the true duplicates.
- Right-click on the extra copy of a red-marked row and click Delete.
The Delete dialog box will open up.
- Select the Entire row option and hit OK.
Repeat this process to remove the matched rows one pair at a time. Cells that are still highlighted in otherwise unmatched rows are partial matches only; review them and tidy them up manually if needed.
Method 3: Find Duplicates in Columns and Delete Rows in Excel Using VBA Macros
You can also use a VBA macro to find duplicates in columns and delete rows in Excel. Follow the steps below:
- Select the entire data table.
- Click on the Developer tab from the ribbon.
- Click on the Visual Basic command, under the Code group.
The Visual Basic Editor will open.
- Double-click on the worksheet name you are working on currently from the left menu.
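I am working on Sheet4 (Method 2) now.
- Insert the Macro in the Visual Basic Editor.
Sub ClearDupRows()
    ' Work on whatever range is selected when the macro runs
    Dim pRng1 As Range
    Set pRng1 = Selection
    ' Array(1) compares the first column only; Header:=xlYes skips the header row
    pRng1.RemoveDuplicates Columns:=Array(1), Header:=xlYes
End Sub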
- From the top-left corner, click on the File drop-down list.
- Hit the Close and Return to Microsoft Excel option.
- Again select the Developer tab and select the Macros command under the Code group.
Or, press ALT+F8 to open the Macro dialog box.
The Macro dialog box will appear on the screen.
- Select ClearDupRows and hit the Run button.
Now, the duplicated values are gone from the list.
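The `Columns` argument decides which columns have to match before a row counts as a duplicate. The Python sketch below illustrates that idea only; `remove_duplicates_1_based` is a hypothetical helper, not Excel's implementation, and the sample rows are made up. It uses 1-based column positions to mirror how the macro writes `Array(1)`.

```python
def remove_duplicates_1_based(rows, columns):
    """Sketch of the idea behind RemoveDuplicates Columns:=Array(...).

    columns holds 1-based positions within the selection.
    The first row seen for each key is kept; later ones are dropped.
    """
    kept = {}
    for row in rows:
        key = tuple(row[c - 1] for c in columns)
        kept.setdefault(key, row)  # dicts preserve insertion order
    return list(kept.values())


# Hypothetical rows: (date, holiday name, type).
rows = [
    ("2024-01-01", "New Year's Day", "Federal"),
    ("2024-01-01", "New Year's Day", "Federal"),        # full duplicate
    ("2024-01-01", "Global Family Day", "Observance"),  # same date only
]

first_column_only = remove_duplicates_1_based(rows, [1])
all_columns = remove_duplicates_1_based(rows, [1, 2, 3])

print(len(first_column_only), "row left when only column 1 is compared")
print(len(all_columns), "rows left when all columns are compared")
```

Comparing only the first column also drops the Global Family Day row, because it shares a date with the rows above it. If whole rows should match, as in Method 1, the macro's argument would need to list every column, for example `Array(1, 2, 3, 4)` for a four-column table.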
Method 4: Find Duplicates in Columns and Delete Rows with COUNTIFS Function & FILTER Command
Use the COUNTIFS function to count how many times each row occurs in the dataset.
Syntax
COUNTIFS(criteria_range1, criteria1, ...)
Formula
=COUNTIFS($E$2:$E$16,E2)
Formula Explanation
The formula counts the number of occurrences of the value in cell E2 within the range E2:E16. The absolute references ($E$2:$E$16) keep the range fixed when the formula is copied down.
Formula
=A2&B2&C2&D2
Formula Explanation
This formula joins the values in cells A2, B2, C2, and D2 into a single text string.
To find duplicates in columns and delete rows with the COUNTIFS function, follow the steps below:
- Create two new columns beside the data table (Column E & F).
- Type this formula inside cell E2: =A2&B2&C2&D2
- Press ENTER to join the cells together.
- Double-click on the Fill Handle.
This will copy the formula down the entire column.
- Copy this formula and paste it into cell F2: =COUNTIFS($E$2:$E$16,E2)
- Press ENTER.
- Double-click on the Fill Handle.
The total count of repetitions will show up in column F.
I will filter the duplicates now.
- Select the Data tab from the ribbon.
- Under the Sort & Filter group, click on the Filter command.
- Click on the filter drop-down icon in the header of column F (Repeats).
- Select Number Filters > Equals command.
- In the Custom AutoFilter dialog box, select Equals under the Repeats dropdown and type ‘1’ in the empty box beside it. Then, hit OK.
- Now, select column E and column F.
- Click on the Home tab > Editing group > Clear drop-down menu > Clear All command.
I have removed column E and column F from my dataset to keep it neat and clean.
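As a rough illustration of the two helper columns and the filter, here is a small Python sketch. It assumes the count is taken over the whole helper column, and the sample rows and variable names are hypothetical.

```python
from collections import Counter

# Hypothetical rows: (date, day, holiday name, type).
rows = [
    ("2024-01-01", "Monday", "New Year's Day", "Federal"),
    ("2024-07-04", "Thursday", "Independence Day", "Federal"),
    ("2024-01-01", "Monday", "New Year's Day", "Federal"),
    ("2024-12-25", "Wednesday", "Christmas Day", "Federal"),
]

# Column E: join the four cells into one string, like =A2&B2&C2&D2.
helper = ["".join(row) for row in rows]

# Column F: how often each helper value occurs in the whole helper column.
counts = Counter(helper)
repeats = [counts[key] for key in helper]

# Filter on "Equals 1": only rows that occur exactly once stay visible.
visible = [row for row, n in zip(rows, repeats) if n == 1]

print(repeats)
print(len(visible), "rows pass the filter")
```

One caveat with joining cells without a separator: different rows can produce the same helper value (for example, "ab" and "c" join to the same string as "a" and "bc"). Adding a separator that never appears in the data, such as `=A2&"|"&B2&"|"&C2&"|"&D2`, avoids this.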
How to Remove Duplicates But Keep Rest of the Row Values?
Suppose you have a messy dataset and want to remove the identical values from one particular column but keep the rest of the row, rather than deleting the entire row. Before removing the duplicates, you first need to find them.
Formula
=C2=C3
Formula Explanation
Because the column is sorted alphabetically, repeated text values sit one after another. The formula therefore returns TRUE for a cell whose value is repeated in the row below it, and FALSE otherwise.
To remove duplicates but keep rest of the row values in Excel, go through the steps below:
- Select the entire data range.
- Click on the Data tab > Sort & Filter group > A-Z command.
Now the rows are sorted so that identical values sit next to each other, which the next formula relies on.
- Add a new column header beside the data table and give it a name.
I created a new column in column E named Matches. I want to find the duplicates in the Holiday Name column (column C) of my dataset.
- Type this formula in cell E2: =C2=C3
- Press ENTER.
- Double-click on the Fill Handle icon.
The Fill Handle will copy the formula down the entire column.
- Click on the Data tab > Sort & Filter group > Filter command.
A small filter drop-down icon will appear on the Matches (column E) header.
- Click on the filter drop-down icon and unmark the FALSE checkbox so that only the TRUE rows (the duplicates) stay visible.
- Select the cells from C2 to C16.
- Go to the Home tab > Editing group > Clear drop-down > Clear Contents command.
Make sure to keep the column header (Holiday Name).
- Select the cells of the Matches column and go to the Home tab > Editing group > Clear drop-down > Clear All command.
The rest of the cells in each row are intact, but the duplicate cells in column C are gone.
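The sort-then-compare idea can be sketched in Python as follows. This is an illustration only: the rows are hypothetical, `NAME` is an assumed index for the Holiday Name column, and Python compares text case-sensitively, whereas Excel's `=` comparison ignores case.

```python
# Hypothetical rows: [date, day, holiday name, type].
rows = [
    ["2024-11-28", "Thursday", "Thanksgiving Day", "Federal"],
    ["2024-01-01", "Monday", "New Year's Day", "Federal"],
    ["2024-12-31", "Tuesday", "New Year's Eve", "Observance"],
    ["2025-01-01", "Wednesday", "New Year's Day", "Federal"],
]
NAME = 2  # assumed position of the Holiday Name column (column C)

# Sort by holiday name so that repeated names sit on adjacent rows.
rows.sort(key=lambda row: row[NAME])

# Matches column: TRUE when a row's name equals the name in the row
# below it, like =C2=C3. The last row has nothing below it, so FALSE.
matches = [
    i + 1 < len(rows) and rows[i][NAME] == rows[i + 1][NAME]
    for i in range(len(rows))
]

# Clear only the name cell on the TRUE rows; the other cells stay intact.
for row, is_duplicate in zip(rows, matches):
    if is_duplicate:
        row[NAME] = ""

for row in rows:
    print(row)
```

After the loop, one copy of each holiday name survives (the last one in each run of repeats), and every row still has its date, day and type.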
Conclusion
I have shown you 4 ways to find duplicates in columns and then delete the rows. I hope you’ve found your required solution from this blog. Tell us how you like the solutions in the comment section.
Frequently Asked Questions
What is the formula for finding duplicates in Excel?
The formula for finding duplicates in Excel is =COUNTIF(A:A, A1)>1
Here A:A is the column containing data, and A1 is the first cell in that column; this formula checks if the value in a cell appears more than once in the specified column.
Why should we remove duplicates?
Removing duplicates in data is essential to enhance accuracy and streamline analysis by eliminating redundant information, ensuring data integrity, and preventing errors in statistical calculations or decision-making processes.
How do I delete non duplicates in Excel?
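The Remove Duplicates feature does the opposite: it deletes the repeated values and keeps one copy of each. To delete the non-duplicates (values that appear only once) instead, use a helper column:
- Add a helper column with the formula =COUNTIF(A:A, A1) and copy it down.
- Go to the Data tab and click Filter.
- Filter the helper column to show only the value 1.
- Select the visible rows and delete them, then clear the filter.
Only the values that occur more than once will remain.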