A family of Microsoft spreadsheet software with tools for analyzing, charting, and communicating data
Hi,
In the Data > Remove Duplicates window, just ensure that only the first box is checked.
Hope this helps.
This browser is no longer supported.
Upgrade to Microsoft Edge to take advantage of the latest features, security updates, and technical support.
Hello,
I have a problem with duplicate rows, I know Excel has a function that allows to remove the duplicate rows but it doesnt fit for what I am looking for.
Imagine you have to analyse a big amount of data that has this structure:
| ID Product | Product 1 | Product 2 |
|---|---|---|
| 3726 | Lemon | Orange |
| 3726 | Orange | Lemon |
| 9842 | Meat | Fish |
| 9842 | Fish | Meat |
And you need to remove the duplicates that in this case would be the second row (3726-Orange-Lemon) and the fourth because they are duplicate, but when you introduce this data in the "Remove Duplicate" function it doesnt detect this registers as duplicate, and with 4 rows is ok but with 46000 rows you can not check out row by row, so how can I remove all those rows that are duplicate but Excel doesnt detect them as duplicate?
A family of Microsoft spreadsheet software with tools for analyzing, charting, and communicating data
Locked Question. This question was migrated from the Microsoft Support Community. You can vote on whether it's helpful, but you can't add comments or replies or follow the question.
Do you have Microsoft 365 or Office 2021? If so, you could create a helper column and use that in the Remove Duplicates dialog
Using your example, enter the following formula in D2:
=TEXTJOIN("|",TRUE,SORT(B2:C2,,,TRUE))
and fill down. Result:
| ID Product | Product 1 | Product 2 | All Products | |
|---|---|---|---|---|
| 3726 | Lemon | Orange | Lemon | Orange |
| 3726 | Orange | Lemon | Lemon | Orange |
| 9842 | Meat | Fish | Fish | Meat |
| 9842 | Fish | Meat | Fish | Meat |
In D2 input formula then drop down, it will get the result as below. Then you can sort from dup column and delete the rows with duplicate tagged.
=IF(COUNTIF($A$2:A2,A2)=2,"duplicate","")
| ID Product | Product 1 | Product 2 | Dup |
|---|---|---|---|
| 3726 | Lemon | Orange | |
| 3726 | Orange | Lemon | duplicate |
| 9842 | Meat | Fish | |
| 9842 | Fish | Meat | duplicate |
Help Column=
=IF(COUNTIF($A$2:A2,A2)=2,A2&C2&B2,A2&B2&C2)
Dup Column=
=IF(COUNTIF($D$2:D2,D2)=2,"duplicate","")
| ID Product | Product 1 | Product 2 | Help Column | Dup |
|---|---|---|---|---|
| 3726 | Lemon | Orange | 3726LemonOrange | |
| 3726 | Orange | Lemon | 3726LemonOrange | duplicate |
| 9842 | Meat | Fish | 9842MeatFish | |
| 9842 | Fish | Meat | 9842MeatFish | duplicate |
| 3783 | ABC | ACD | 3783ABCACD | |
| 3783 | ABD | 3783ABD |