After testing, we learned that the sorts required to dedupe each column would cause performance issues on large datasets. As a result, we've taken the approach of deduping the data in the database instead of in the data flow.
SSIS 2019 - Identify and Redirect Duplicate Values in Data Flow
libpekin
Hello,
To be clear, this is not a requirement to identify and redirect or dedupe rows in the Data Flow Task by leveraging the Sort Transformation. Instead, the requirement is to identify the duplicate values in a column and then handle the affected rows differently. Using the attached mock-up table, ProductID 1002 will qualify as a duplicate, so rows 3 and 4 should flow down a different path. Likewise, ProductCode A01 is duplicated, so rows 2 and 6 should be redirected.
Any help is appreciated. Thanks!
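To make the requirement concrete, here is a minimal Python sketch of the logic only; it is not SSIS code. The mock-up table is not reproduced in this post, so the `rows` data below is a hypothetical stand-in: only ProductID 1002 (rows 3 and 4) and ProductCode A01 (rows 2 and 6) come from the description above, and every other value, along with the `split_on_duplicates` helper name, is made up for illustration.

```python
from collections import Counter

# Hypothetical stand-in for the attached mock-up table. Only ProductID 1002
# (rows 3, 4) and ProductCode A01 (rows 2, 6) come from the post; the
# remaining values are invented for illustration.
rows = [
    {"Row": 1, "ProductID": 1001, "ProductCode": "A00"},
    {"Row": 2, "ProductID": 1003, "ProductCode": "A01"},
    {"Row": 3, "ProductID": 1002, "ProductCode": "A02"},
    {"Row": 4, "ProductID": 1002, "ProductCode": "A03"},
    {"Row": 5, "ProductID": 1004, "ProductCode": "A04"},
    {"Row": 6, "ProductID": 1005, "ProductCode": "A01"},
]


def split_on_duplicates(rows, columns):
    """Split rows into (unique, duplicates).

    A row goes to the duplicates path if its value in ANY of the given
    columns appears more than once in that column across the whole set.
    """
    # First pass: count how often each value occurs, per column.
    counts = {col: Counter(r[col] for r in rows) for col in columns}

    unique, duplicates = [], []
    # Second pass: route each row based on the per-column counts.
    for r in rows:
        is_dup = any(counts[col][r[col]] > 1 for col in columns)
        (duplicates if is_dup else unique).append(r)
    return unique, duplicates


unique_rows, duplicate_rows = split_on_duplicates(rows, ["ProductID", "ProductCode"])
print("redirected:", [r["Row"] for r in duplicate_rows])
print("normal path:", [r["Row"] for r in unique_rows])
```

No sort is needed, but the counts have to be complete before any row can be routed, so the approach takes two passes over the data. In the database, the same flag could probably be computed with a windowed count such as `COUNT(*) OVER (PARTITION BY ProductID)`, with rows where the count exceeds 1 sent down the other path.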