Select single record when rows are duplicated

Jasmine Al 1 Reputation point
2021-01-29T18:45:32.537+00:00

DECLARE @Bluemchen table(TableId int, SId int, Type int, aDate datetime, col1 varchar(200), col2 varchar(200), agent int, ref_table_id int null)
insert into @Bluemchen VALUES
(1, 10, 100, '1/1/2021', 'a', 'aa', 400 null)
,(2, 10, 100, '1/1/2021', 'a', 'aa', 401, 1)
,(3, 10, 200, '1/15/2021', 'b', 'bb', 400, null)
,(4, 10, 200, '1/15/2021', 'b', 'bb', 401, 3)
, (5, 10, 300, '1/16/2021', 'c', 'cc', 400, null)
, (6, 10, 100, '1/1/2021', 'a', 'aa', 403, 1)

This table has some duplicate records but when the duplicate record gets created ref_table_id contains id of the record that triggers the creation of duplicates.
The row ids 2, 6 are duplicates of 1. 4 is duplicate of 3. For the duplicate records, the only column different is agent.

When selecting data only one row of multiple duplicates should be selected and agent column should contain agents separated by commas.

Please note like col1 and col2 there are more columns that contain same data when row gets duplicated.

Output:

61959-result.png

Developer technologies Transact-SQL
{count} votes

3 answers

Sort by: Most helpful
  1. Viorel 122.5K Reputation points
    2021-01-29T19:14:36.297+00:00

    I think that the next query will work:

    select min(TableId) as TableId, Sid, Type, aDate, col1, col2,  
           string_agg(agent, ', ') within group (order by agent) as Agents  
    from @tbl  
    group by Sid, Type, aDate, col1, col2   
    
      
    

  2. Yitzhak Khabinsky 26,586 Reputation points
    2021-01-29T19:26:46.467+00:00

    Please try the following solution.

    SQL

    DECLARE @tbl table(TableId int, SId int, Type int, aDate date, col1 varchar(200), col2 varchar(200), agent int, ref_table_id int null);
    INSERT INTO @tbl VALUES
    (1, 10, 100, '1/1/2021', 'a', 'aa', 400, null)
    ,(2, 10, 100, '1/1/2021', 'a', 'aa', 401, 1)
    ,(3, 10, 200, '1/15/2021', 'b', 'bb', 400, null)
    ,(4, 10, 200, '1/15/2021', 'b', 'bb', 401, 3)
    , (5, 10, 300, '1/16/2021', 'c', 'cc', 400, null)
    , (6, 10, 100, '1/1/2021', 'a', 'aa', 403, 1);
    
    DECLARE @separator CHAR(1) = ',';
    
    SELECT MIN(TableId) AS TableId, SId, Type, aDate, col1, col2
          , STUFF((SELECT @separator + CAST(agent AS VARCHAR(30)) AS [text()]
             FROM @tbl AS O
             WHERE O.SId = C.SId 
     AND o.Type = c.type
             FOR XML PATH('')), 1, 1, NULL) AS AgentList
    FROM @tbl AS c
    GROUP BY SId, Type, aDate, col1, col2;
    

    Output

    +---------+-----+------+------------+------+------+-------------+
    | TableId | SId | Type |   aDate    | col1 | col2 |  AgentList  |
    +---------+-----+------+------------+------+------+-------------+
    |       1 |  10 |  100 | 2021-01-01 | a    | aa   | 400,401,403 |
    |       3 |  10 |  200 | 2021-01-15 | b    | bb   |     400,401 |
    |       5 |  10 |  300 | 2021-01-16 | c    | cc   |         400 |
    +---------+-----+------+------------+------+------+-------------+
    
    0 comments No comments

  3. EchoLiu-MSFT 14,621 Reputation points
    2021-02-01T08:11:27.677+00:00

    Hi @Jasmine Al

    Welcome to the Microsoft TSQL Q&A Forum!

    The method provided by Viorel-1 is applicable to SQL Server 2017 (14.x) and later;The method provided by YitzhakKhabinsky-0887 is applicable to all supported SQL Server versions, and you can choose the method that suits you.

    If you have any question, please feel free to let me know.

    Regards
    Echo


    If the answer is helpful, please click "Accept Answer" and upvote it.
    Note: Please follow the steps in our documentation to enable e-mail notifications if you want to receive the related email notification for this thread.

    0 comments No comments

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.