ADF Copy Data Upsert creates two rows having same composite key

Dan Barbary 36 Reputation points
2022-07-14T21:50:01.397+00:00

I have a Copy Data task with "Upsert" enabled.
220914-image.png

Something very strange is happening: when a row that already exists in the destination data set is also present in the DELTA file to be loaded, ADF does not UPDATE the row and instead adds a second row.

220941-image.png

This, however, only happens with a delta parquet source file that comes from a different server. I believed the columns I declared (cora_acct_id, hostitemid, server_ip_addr) would form a composite key used to find the row in the destination and update it if it is there. That is not happening, but strangely only for my US data and not my Canadian data.

Why in the world is it writing a second row for this and other examples? (I can find 377 such examples.)

Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

1 answer

  1. AnnuKumari-MSFT 34,566 Reputation points Microsoft Employee Moderator
    2022-07-15T13:22:07.177+00:00

    Hi @Dan Barbary,
    Thank you for using the Microsoft Q&A platform and for posting your question.
    As I understand your query, you are trying to perform an upsert inside a Copy Data activity in ADF, but it is not working as expected because it creates duplicate rows. Please let me know if my understanding is incorrect.

    What is happening in your case is certainly not expected. Upsert merges the data on the key columns: if the values of the keyColumns match an existing row, it performs an update, and otherwise it performs an insert. There shouldn't be any duplicates coming in.

    Kindly check whether any of the key column values (cora_acct_id, hostitemid, server_ip_addr) contain trailing spaces, which would cause the two rows to be treated as separate entries.

    Also, kindly make sure that the source data doesn't have duplicate entries.

    Please let us know if either of these is the case. Looking forward to your response. Thanks!
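
The two checks suggested above (stray whitespace in the key columns, and duplicate keys in the source) can be sketched offline in Python. This is only a minimal sketch under stated assumptions: it does not call ADF, it assumes the source and destination rows have already been exported into lists of dictionaries, and the helper names and sample values are hypothetical. Only the three key column names come from the question.

```python
from collections import Counter, defaultdict

# Key columns named in the question; everything else here is illustrative.
KEY_COLUMNS = ("cora_acct_id", "hostitemid", "server_ip_addr")


def raw_key(row, key_columns=KEY_COLUMNS):
    """Composite key exactly as stored, with no cleanup."""
    return tuple(row[c] for c in key_columns)


def normalized_key(row, key_columns=KEY_COLUMNS):
    """Composite key with leading/trailing whitespace stripped from each part."""
    return tuple(str(row[c]).strip() for c in key_columns)


def duplicate_source_keys(source_rows, key_columns=KEY_COLUMNS):
    """Return {key: count} for keys that occur more than once in the source."""
    counts = Counter(normalized_key(r, key_columns) for r in source_rows)
    return {key: n for key, n in counts.items() if n > 1}


def whitespace_mismatches(source_rows, dest_rows, key_columns=KEY_COLUMNS):
    """Find source keys that match a destination key only after stripping.

    Such rows would look like new rows to an exact comparison even though
    they refer to the same logical record.
    """
    dest_by_norm = defaultdict(set)
    for row in dest_rows:
        dest_by_norm[normalized_key(row, key_columns)].add(raw_key(row, key_columns))

    mismatches = []
    for row in source_rows:
        raw = raw_key(row, key_columns)
        candidates = dest_by_norm.get(normalized_key(row, key_columns), set())
        if candidates and raw not in candidates:
            mismatches.append((raw, sorted(candidates, key=repr)))
    return mismatches


if __name__ == "__main__":
    # Hypothetical sample data: one key with a trailing space, one duplicated key.
    source = [
        {"cora_acct_id": "A1 ", "hostitemid": "100", "server_ip_addr": "10.0.0.1"},
        {"cora_acct_id": "A2", "hostitemid": "200", "server_ip_addr": "10.0.0.2"},
        {"cora_acct_id": "A2", "hostitemid": "200", "server_ip_addr": "10.0.0.2"},
    ]
    dest = [
        {"cora_acct_id": "A1", "hostitemid": "100", "server_ip_addr": "10.0.0.1"},
    ]
    print("Whitespace-only mismatches:", whitespace_mismatches(source, dest))
    print("Duplicate source keys:", duplicate_source_keys(source))
```

If either function returns a non-empty result for the affected US rows, that would point at the data rather than at the upsert setting. The sketch only strips whitespace; differences in letter case or data type between the two servers are not covered and would need a separate check.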

