Sometimes hidden characters or trailing spaces can make values look identical when they're actually different. Try trimming the values and see whether that makes any difference:
SELECT DISTINCT
    LTRIM(RTRIM(col_A)) AS col_A,
    LTRIM(RTRIM(col_B)) AS col_B,
    LTRIM(RTRIM(col_C)) AS col_C,
    LTRIM(RTRIM(col_D)) AS col_D
FROM table_name;
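Keep in mind that LTRIM/RTRIM only strip spaces. If the culprit is a non-printing character such as a tab or a line break, a REPLACE pass before the trim can remove it as well. A rough sketch, reusing the same placeholder column:

-- Strip tab, carriage return, and line feed before trimming spaces.
SELECT DISTINCT
    LTRIM(RTRIM(REPLACE(REPLACE(REPLACE(col_A, CHAR(9), ''), CHAR(13), ''), CHAR(10), ''))) AS col_A_clean
FROM table_name;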
SQL Server (and by extension Azure SQL Database) can be case sensitive or case insensitive depending on the collation setting. If the collation is case-sensitive, 'A' and 'a' are treated as different values. To check whether that's the cause, convert all characters to lowercase (or uppercase) and then apply the DISTINCT:
SELECT DISTINCT LOWER(col_A), LOWER(col_B), LOWER(col_C), LOWER(col_D)
FROM table_name;
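You can also inspect the collation directly rather than inferring it. A quick sketch against the system catalog (using the same placeholder table name); a collation name containing _CS_ is case-sensitive, _CI_ is case-insensitive:

-- Database-level collation
SELECT DATABASEPROPERTYEX(DB_NAME(), 'Collation') AS database_collation;

-- Column-level collations, which can override the database default
SELECT name AS column_name, collation_name
FROM sys.columns
WHERE object_id = OBJECT_ID('table_name');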
It's possible that the way Dataflow reads or writes data differs slightly from the Copy activity, for example if the data format has changed or if there are differences in data types, conversions, or the way null values are handled.
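Null handling is a subtle one worth calling out: DISTINCT treats NULL and an empty string as different values, so if one path writes NULL where the other writes '', the result looks like a duplicate. A small illustration:

-- NULL and '' are distinct values, so this returns two rows, not one.
SELECT DISTINCT col_A
FROM (VALUES (CAST(NULL AS VARCHAR(10))), ('')) AS v(col_A);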
When using Dataflow, it's crucial to ensure that the upsert operation is keyed on the correct primary key columns, so double-check the sink's configuration and the column mapping. It also helps to inspect the output of your Dataflow transformations before they reach the sink; by isolating each transformation you can pinpoint the step at which the duplicates are introduced.
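On the database side, a quick way to confirm whether the sink actually holds duplicate keys is to group on the key column(s). This is only a sketch; key_col is a placeholder for your actual upsert key:

-- Any key appearing more than once means the upsert key is not unique in the sink.
SELECT key_col, COUNT(*) AS row_count
FROM table_name
GROUP BY key_col
HAVING COUNT(*) > 1;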
Make sure there aren't any concurrent operations running against the same dataset. For instance, if another process inserts data at the same time as your Dataflow runs, you might see inconsistencies.
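If you want to rule this out, you can look at the requests that are active while the pipeline runs. A hedged sketch using the standard DMVs (this needs VIEW DATABASE STATE permission on Azure SQL Database):

-- Show other currently executing requests and the statements behind them.
SELECT r.session_id, r.status, r.command, t.text AS sql_text
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.session_id <> @@SPID;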
If you're performing the operation inside a transaction, be aware of the transaction's isolation level. Some isolation levels allow 'dirty reads', which can result in reading uncommitted data and lead to perceived duplicates.
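You can check which isolation level your session is actually using; READ UNCOMMITTED (value 1) is the one that permits dirty reads. A short sketch:

-- 1 = READ UNCOMMITTED, 2 = READ COMMITTED, 3 = REPEATABLE READ,
-- 4 = SERIALIZABLE, 5 = SNAPSHOT
SELECT transaction_isolation_level
FROM sys.dm_exec_sessions
WHERE session_id = @@SPID;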