ADF - cosmos upsert question

Stuart Porter 1 Reputation point
2020-10-13T21:00:29.88+00:00

Hi,
I have a file with a million rows, which I upload into Cosmos DB.
I let Cosmos DB generate the id as a GUID.
The next day I get a new complete file that contains some deltas and new records. There are two fields I can check to see if a record has changed.

I thought an upsert would be the way to go, but that seems tied to the generated id, which every record gets on insert.
How do you configure the upsert to use a different key / field combination,

so I can update the records that have changed and still insert the new records?

Thanks
mrP

Azure Cosmos DB
An Azure NoSQL database service for app development.
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

3 answers

  1. Stuart Porter 1 Reputation point
    2020-10-14T21:21:20.57+00:00

    So far my only option is to truncate the container and re-import a million records every night?

    I can create a second source and then do an exists to identify new records, which is good,
    but as for updating any changed records, nothing seems to be working :(

    Any ideas?
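The split described above (new records versus changed records) can be sketched outside ADF in plain Python. This is only an illustration of the comparison logic, not an ADF data flow: `classify_rows`, `key_fields`, and `change_fields` are hypothetical names, and it assumes each row carries some stable key that is the same in both files.

```python
def classify_rows(existing, incoming, key_fields, change_fields):
    """Split incoming rows into new, changed and unchanged relative to existing rows."""

    def key_of(row):
        # The stable key is the tuple of key-field values (assumed unique per record).
        return tuple(row[f] for f in key_fields)

    existing_by_key = {key_of(r): r for r in existing}
    result = {"new": [], "changed": [], "unchanged": []}

    for row in incoming:
        old = existing_by_key.get(key_of(row))
        if old is None:
            result["new"].append(row)          # key not seen before: insert
        elif any(row[f] != old[f] for f in change_fields):
            result["changed"].append(row)      # key seen, tracked fields differ: update
        else:
            result["unchanged"].append(row)    # key seen, nothing to do
    return result
```

Only the `new` and `changed` lists would need to be written back, which avoids truncating and re-importing the full million rows each night.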


  2. HimanshuSinha-msft 19,476 Reputation points Microsoft Employee
    2020-10-28T20:52:52.367+00:00

    Hello @Stuart Porter,
    My sincere apologies for the delay in replying. We asked for an email from you so that we could redirect you to the correct team for an early resolution.
    Anyway, I think I have a resolution: you will have to map the unique id in the CSV to the ID column in the mapping tab. In my example below I have an SSN column. I tested the UPSERT behavior and it worked fine.

    Please do let me know how it goes.

    35777-upsert.gif

    Thanks Himanshu
    Please consider clicking "Accept Answer" and "Up-vote" on the post that helps you, as it can be beneficial to other community members.
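Why mapping a unique source column to the id makes the upsert work can be shown with a minimal sketch. It assumes an in-memory dict standing in for the Cosmos DB container and made-up sample rows; `to_document` and `upsert` are hypothetical helpers, not Cosmos DB or ADF APIs. The SSN column follows the example in the answer above.

```python
def to_document(row, id_field):
    """Copy a CSV row and set the 'id' property from an existing unique column."""
    doc = dict(row)
    doc["id"] = str(row[id_field])
    return doc


def upsert(container, doc):
    """Insert the document, or replace the one that already has the same id."""
    container[doc["id"]] = doc


container = {}  # in-memory stand-in for the Cosmos DB container (assumption)

# Made-up sample data: day 2 changes one record and adds one new record.
day1 = [{"SSN": "111", "city": "Leeds"}, {"SSN": "222", "city": "York"}]
day2 = [{"SSN": "111", "city": "Bath"}, {"SSN": "333", "city": "Hull"}]

for row in day1 + day2:
    upsert(container, to_document(row, id_field="SSN"))

print(len(container))            # 3 documents, not 4
print(container["111"]["city"])  # Bath
```

With a generated GUID as the id, the second day's rows would never match an existing document, so every row would be inserted as new.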


  3. Stuart Porter 1 Reputation point
    2020-11-03T15:05:19.317+00:00

    Not yet :(
    The difference is that our input data does not have a unique key, so I am using the derived column operation to create a key, which gets persisted into the Cosmos DB container.

    When I get a new file I can do an exists to pick out the new inserts, but selecting upserts requires an alter row operation, and I seem to be running into an issue there. How do I reference the incoming import key against the existing import key and do an update?
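One way to make a derived key usable for matching is to compute it deterministically from fields that do not change between files, so the same record gets the same key every day. The sketch below is plain Python rather than a data flow expression. `derive_id` is a hypothetical helper, the choice of SHA-256 and the trim/lower-case normalization are assumptions, and it only works if some combination of source fields identifies a record.

```python
import hashlib


def derive_id(row, key_fields):
    """Build a repeatable id from the values of the chosen key fields."""
    # Normalize each value so cosmetic differences do not produce a new id (assumption).
    parts = [str(row[f]).strip().lower() for f in key_fields]
    # Join with a separator unlikely to occur in data, so ("ab", "c") != ("a", "bc").
    raw = "\x1f".join(parts)
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()
```

If the incoming file and the container both carry an id computed this way, the incoming key can be compared directly with the stored one, and the fields that signal a change stay out of the key so that a changed record still matches its earlier version.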
