Data Flow - "Delete If" setting in Alter Row

arkiboys 9,686 Reputation points
2021-10-05T16:17:29.217+00:00

Hello,
For a while now I have been trying to get this to work, but with no success.
My source is a view on an on-prem SQL Server, say vw_Department.
The data flow has a source, an Alter Row with "Upsert if" set to true(), and a Delta (parquet) sink with "Allow upsert" enabled.
The key column is also set on the sink, i.e. DepartmentID (it is retrieved dynamically with split($dfpKeyColumn, ',')).
The upsert seems to be working fine: if I edit the source table, i.e. update a row or insert a row into the table, the Delta sink reflects those changes.
Then I try to handle the case where a row is deleted at the source, i.e. a row is removed from the table in the on-prem SQL Server.
So I add the "Delete if" condition and enable "Allow delete" accordingly.
When a row is removed from the source, I still see all the rows in the sink, including the deleted ones, whereas I should only see the upserted or unchanged ones.
I know I have to put a condition on "Delete if", but I am not sure what condition would make this delete work.
Note that I cannot use a condition like "Name" == "test", because I want to remove the row from the sink whenever it is deleted at the source.
I placed the "Upsert if" after the "Delete if" and also swapped their order, but that has not solved the issue.
Any thoughts?

Thank you

Azure Data Factory

Accepted answer
  1. ShaikMaheer-MSFT 38,326 Reputation points Microsoft Employee
    2021-10-06T10:26:20.407+00:00

    Hi @arkiboys ,

    Thank you for posting your query on the Microsoft Q&A platform.

    You can implement this by adding both your source and your sink as source transformations, then using an Exists transformation to find the rows in the sink that are missing from the source, and then using an Alter Row transformation to apply the Delete If policy.

    In the example below, I use empFile as the source and empTable as the sink. empId 1 is not in my source, so that row has to be deleted from my sink table. Please follow the detailed implementation below.

    Step1: empTable added as source.

    138156-emptable.gif

    Step2: empFile added as source

    138097-empcsvfile.gif

    Step3: Derived column to typecast the id column in the source file to Int.

    138060-derviedcolumn.gif

    Step4: Exists transformation to find the rows that are missing from the file

    138134-exists.gif

    Step5: Alter row to apply Delete If policy

    138124-alterrow.gif

    Step6: Sink transformation

    138144-sink.gif
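
The logic of the steps above can be sketched outside ADF. This is only a plain-Python illustration of the idea, not data flow script: the function name and the sample rows are made up, and the screenshots are not available here, so it assumes the Exists step keeps sink rows that have no match in the file.

```python
def rows_missing_from_source(sink_rows, source_rows, key):
    """Return sink rows whose key has no match in the source (a "doesn't exist" check)."""
    # The int() cast mirrors the Derived Column step: the file stores ids as text.
    source_keys = {int(row[key]) for row in source_rows}
    return [row for row in sink_rows if row[key] not in source_keys]


# Hypothetical sample data: empId 1 is in the table but not in the file.
emp_file = [{"empId": "2", "name": "Bob"}, {"empId": "3", "name": "Cara"}]
emp_table = [
    {"empId": 1, "name": "Ann"},
    {"empId": 2, "name": "Bob"},
    {"empId": 3, "name": "Cara"},
]

# Exists step: only rows that are missing from the file flow onward.
to_delete = rows_missing_from_source(emp_table, emp_file, "empId")

# Alter Row + sink: every row that reaches this point is marked for delete.
remaining = [row for row in emp_table if row not in to_delete]

print(to_delete)
print(remaining)
```

Because the Exists step has already narrowed the stream to the missing rows, the Delete If condition itself can presumably be as simple as true().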

    Hope this helps. Please let us know if you have any further queries.

    ------------------------------

    • Please don't forget to click on 130616-image.png or upvote 130671-image.png button whenever the information provided helps you. Original posters help the community find answers faster by identifying the correct answer. Here is how
    • Want a reminder to come back and check responses? Here is how to subscribe to a notification
    • If you are interested in joining the VM program and help shape the future of Q&A: Here is how you can be part of Q&A Volunteer Moderators
    1 person found this answer helpful.

2 additional answers

  1. MarkKromer-MSFT 5,206 Reputation points Microsoft Employee
    2021-10-05T16:53:02.287+00:00

    You would need to set a rule for your delete policy in the Alter Row to tell ADF which rows to delete. If you do not have a flag or value that you can check to see if it is to be deleted, you could use an Exists transformation to look for rows that are not present in the source compared to rows in your target and set those rows for delete.
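
To illustrate why no Delete If condition on the source stream alone can work, here is a small plain-Python sketch. It is a simplified model, not ADF code: alter_row, apply_to_sink and the sample rows are hypothetical, and it assumes each row is tagged with the first policy whose condition matches.

```python
def alter_row(rows, delete_if, upsert_if):
    """Tag each incoming row with the first policy whose condition matches."""
    tagged = []
    for row in rows:
        if delete_if(row):
            tagged.append(("delete", row))
        elif upsert_if(row):
            tagged.append(("upsert", row))
    return tagged


def apply_to_sink(sink_rows, tagged, key):
    """Apply the tagged rows to the sink, matching on the key column."""
    result = {row[key]: row for row in sink_rows}
    for policy, row in tagged:
        if policy == "delete":
            result.pop(row[key], None)
        else:
            result[row[key]] = row
    return list(result.values())


# Hypothetical data: DepartmentID 2 was deleted at the source.
sink = [
    {"DepartmentID": 1, "Name": "HR"},
    {"DepartmentID": 2, "Name": "IT"},
    {"DepartmentID": 3, "Name": "Ops"},
]
source = [
    {"DepartmentID": 1, "Name": "HR"},
    {"DepartmentID": 3, "Name": "Operations"},
]

# The deleted row never enters the stream, so delete_if is never evaluated for it.
tagged = alter_row(
    source,
    delete_if=lambda row: row["Name"] == "test",
    upsert_if=lambda row: True,
)
after = apply_to_sink(sink, tagged, "DepartmentID")
print(after)
```

In this model DepartmentID 2 stays in the sink, because the policies only see rows that are still present in the source. The rows to delete have to be brought into the stream first, for example by reading the target as a second source.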


  2. HimanshuSinha-msft 19,381 Reputation points Microsoft Employee
    2021-10-07T00:52:25.493+00:00

    Hello @arkiboys ,

    I tried the delete logic and did not get the required result when using the Exists transformation. Honestly, I thought it should work, but I then tried a Join and it worked.

    In my case the source was SQL and the sink was a Delta table. I think the key here is to determine which rows need to be deleted.

    Source 1 : SQL
    Source 2 : Delta tables
    I used a right outer join

    138298-image.png

    One row, which was deleted in the source, is shown below

    138246-image.png

    I am using two Select transformations and one Filter to get the EMPLOYEEID (this could be improved)

    138269-image.png

    138341-image.png

    138307-image.png

    And I used the expression below in the Alter Row transformation.

    iif(EMPLOYEEID>0,true(),false())

    138276-image.png

    I can confirm that I was able to delete that one row from the sink side

    138313-image.png
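
The join approach above can be sketched in plain Python as follows. This is an illustration only: the exact Select and Filter settings are in the screenshots, so the null check on the source side is an assumption, and the helper name and sample ids are made up.

```python
def right_outer_join(left_rows, right_rows, key):
    """Keep every right row; pair it with the matching left row, or None if there is none."""
    left_by_key = {row[key]: row for row in left_rows}
    return [(left_by_key.get(row[key]), row) for row in right_rows]


# Hypothetical data: EMPLOYEEID 2 was deleted in SQL but still exists in Delta.
sql_source = [{"EMPLOYEEID": 1}, {"EMPLOYEEID": 3}]
delta_sink = [{"EMPLOYEEID": 1}, {"EMPLOYEEID": 2}, {"EMPLOYEEID": 3}]

joined = right_outer_join(sql_source, delta_sink, "EMPLOYEEID")

# Filter + Select: keep pairs with no source match and take the sink-side key.
delete_ids = [
    sink_row["EMPLOYEEID"]
    for source_row, sink_row in joined
    if source_row is None
]

# Alter Row: same test as iif(EMPLOYEEID>0,true(),false()).
marked_for_delete = [emp_id for emp_id in delete_ids if emp_id > 0]
print(marked_for_delete)
```

Since the filter has already isolated the rows to delete, the EMPLOYEEID>0 check matches every row that reaches the Alter Row, as long as the ids are positive.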

    Please do let me know how it goes .
    Thanks
    Himanshu
