SSIS : If there is a duplicate row then do not load the data from Stage table to Final Table.

kkran 831 Reputation points
2023-03-30T21:39:44.5166667+00:00

Hi Team - I am loading data from CSV file to STAGE table and then to FINAL table via SSIS Package. So if there is duplicate row in the first two columns then it should NOT process/load the data from STAGE to FINAL instead it should send an email to the user saying 'there is a duplicate in the file'.

Below is the query of staging. If the below condition satisfies then the user should receive an email and should not load the data into Final table.

This one should be plugged into execute SQL Task.

Could you please help with this on how to do it in SSIS ?

SELECT Part_Number, Org_Name, count(*) as Qty 
 FROM [dbo].[Staging] 
 GROUP BY Part_Number, Org_Name
 HAVING count(*)> 1 
SQL Server
SQL Server
A family of Microsoft relational database management and analysis systems for e-commerce, line-of-business, and data warehousing solutions.
12,916 questions
SQL Server Integration Services
SQL Server Integration Services
A Microsoft platform for building enterprise-level data integration and data transformations solutions.
2,469 questions
0 comments No comments
{count} votes

Accepted answer
  1. ZoeHui-MSFT 33,781 Reputation points
    2023-04-03T03:16:43.4933333+00:00

    Hi @kkran,

    Not sure if you want to combine the code like below.

    declare @count int select @count = (SELECT sum(CASE WHEN CHARINDEX('.',[Month1],0) > 0 THEN 1 WHEN CHARINDEX('.',[Month3],0) > 0 THEN 1 WHEN CHARINDEX('.',[Month2],0) > 0 THEN 1 WHEN CHARINDEX('.',[Month4],0) > 0 
      THEN 1 ELSE 0 END) FROM stagingtable ) 
    declare @count2 int select @count2=(select count(*) from (SELECT Part_Number, Org_Name
    FROM stagingtable
    GROUP BY Part_Number, Org_Name
    HAVING count(*)> 1) t)
      if @count >0 or @count2>0
      begin raiserror('There is decimal or duplicate in my_table',16,1) end 
      else begin INSERT INTO yourtable SELECT * FROM  stagingtable  end
    
    

    As you said, pass in a variable for the 'Qty' column and see if value is >1 then do not process and if <1 then check the decimals and move to next steps, it is also a good way.

    Regards,

    Zoe Hui


    If the answer is helpful, please click "Accept Answer" and upvote it.

    0 comments No comments

1 additional answer

Sort by: Most helpful
  1. ZoeHui-MSFT 33,781 Reputation points
    2023-03-31T06:28:26.4233333+00:00

    Hi @kkran,

    You may use below code in your execute SQL Task.

    IF EXISTS
    (SELECT Part_Number, Org_Name
    FROM MyTable
    GROUP BY Part_Number, Org_Name
    HAVING count(*)> 1 )
    BEGIN
      RAISERROR('there is duplicate row ',1,1)
    END
    ELSE
    BEGIN
      INSERT INTO FINAL_table SELECT * FROM STAGE_Table
    END
    
    

    And then connect to the Send Mail Task. Remember to set the Constraint to fail.

    User's image

    Regards,

    Zoe Hui


    If the answer is helpful, please click "Accept Answer" and upvote it.