Share via

I want to remove duplicate based on the following criteria.

Anonymous
2024-03-31T22:19:12+00:00
I want the formula to return a total of 5 rows( removing the duplicates ) but keep the rows that have a Package code 20 or 25 . If the duplicate orders do not have a row containing 20,25 then it can bring in the row with  first occurrence by Order Number

Here is the Data

Order Number Package Code
A123 20
A123 1
A123 2
A123 3
A456 1
A456 25
A456 4
A456 8
A456 7
B789 1
B789 2
B789 3
B789 4
C567 5
C567 6
D098 9
D098 10
D098 20
D098 1
D098 0

Here is what I want to formula to return:

Order Number Package Code
A123 20
A456 25
B789 1
C567 5
D098 20
Microsoft 365 and Office | Excel | For home | Windows

Locked Question. This question was migrated from the Microsoft Support Community. You can vote on whether it's helpful, but you can't add comments or replies or follow the question.

0 comments No comments

2 answers

Sort by: Most helpful
  1. Anonymous
    2024-04-01T00:16:02+00:00

    //select * from remove_duplicate_ord;

    select * from (

    select * from remove_duplicate_ord order by iif(Package Code in (20,25),0,1)) group by Order Number;

    https://s21.ax1x.com/2024/04/01/pF7QnEV.jpg

    https://e.anyoupin.cn/EData/?s=remove\_duplicate\_ord

    Was this answer helpful?

    0 comments No comments
  2. Anonymous
    2024-03-31T23:07:43+00:00

    Hello,

    I found a solution to the problem of removing duplicates based on specific criteria, specifically prioritizing certain package codes, in Excel. Here's a brief explanation:

    Create a List of Priority Codes: In this case, we're focusing on package codes 20 and 25. We want to keep these codes in our final list.

    Use a Formula to Select the Right Rows: We need a formula that checks for these priority codes within each order number. If the order has a priority code, it should be selected. If not, the formula should default to the first occurrence of that order number.

    Apply this Logic to the Data Set: I applied this logic using a Python script for demonstration, but in Excel, you can use a combination of IF, VLOOKUP, and INDEX/MATCH functions to achieve a similar result.

    Here’s a Python script that demonstrates the logic:

    import pandas as pd

    Your data goes here

    data = {"Order Number": ["A123", "A123", ...], "Package Code": [20, 1, ...]} df = pd. DataFrame(data)

    Defining priority package codes

    priority_codes = [20, 25]

    Logic to select the right row

    def select_row(group): priority_rows = group[group['Package Code'].isin(priority_codes)] return priority_rows.iloc[0] if not priority_rows.empty else group.iloc[0]

    Apply the logic and get the result

    result_df = df.groupby('Order Number').apply(select_row).reset_index(drop=True)

    This script will give you a table like:

    Order Number Package Code A123 20 A456 25 B789 1 C567 5 D098 20 In Excel, you can replicate this logic using formulas, or if you're comfortable with VBA, you can automate this process. Please remember to adjust your Excel functions to follow this logic.

    Best Regards, Saurabh

    Was this answer helpful?

    0 comments No comments