SAP CDC transformation doesn't have ODQ columns as output

Dmytro Honcharuk 50 Reputation points
2023-05-04T13:47:07.2233333+00:00

Hello, we are trying to implement an SAP CDC -> ADLS Parquet mapping data flow, but the output of the SAP CDC transformation drops the ODQ metadata columns, such as ODQ_CHANGEMODE; they are not available in the sink file. I can see those columns only in the staging files. We are using auto-mapping in our flow. Is there a way to retrieve the ODQ columns and push them into the sink Parquet file?

Azure Data Factory

Accepted answer
  1. KranthiPakala-MSFT 46,642 Reputation points Microsoft Employee Moderator
    2023-05-10T22:13:47.1766667+00:00

    @Dmytro Honcharuk Thanks for the additional information.

    I wish I had an SAP environment to test this scenario and share screenshots, but per my discussion with the product team: in the source transformation, ADF processes those SAP ODQ fields internally and marks each row as insert, update, or delete. When a row has multiple updates, ADF sorts/orders them and picks the last after-image. Essentially, it de-duplicates the multiple updates coming from SAP.

    The recommendation from the product team is to add a Derived Column transformation that creates a new column, say "row operation", and sets its value to insert, update, or delete based on the mapping data flow row marker (using the isInsert, isUpdate, and isDelete functions). This allows the sink to carry the regular data columns plus a row operation column (set to I / U / D); a sketch follows below.
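    A minimal sketch of such a Derived Column expression, assuming a hypothetical column name "row_operation" and an illustrative 'X' fallback for any other row type (both are my own choices, not prescribed by the product team):

    row_operation = iif(isInsert(), 'I',
                    iif(isUpdate(), 'U',
                    iif(isDelete(), 'D', 'X')))

    In the Derived Column transformation, "row_operation" would be the new column name and the iif(...) expression its value; the sink then receives that column alongside the regular data columns.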


    MS Doc for reference: Expression functions in mapping data flow

    You could then use the added column in Snowflake to perform the final merge.

    Hope this info helps.


    Please don’t forget to Accept Answer and mark "Yes" for "was this answer helpful" wherever the information provided helps you, as this can be beneficial to other community members.


2 additional answers

  1. Dmytro Honcharuk 50 Reputation points
    2023-05-10T08:47:05.7066667+00:00

    We have our DWH in Snowflake and want to read incremental deltas from SAP into the data lake and then merge the data into the Snowflake target tables using SCD Type 2. Hence, we need the ODQ metadata columns to be able to perform a proper merge operation on the Snowflake side, i.e. to identify the change mode of each row. The problem is that we cannot skip the intermediate file storage step due to internal architecture concerns.


  2. Zhangyi Yu 6 Reputation points Microsoft Employee
    2023-06-20T08:13:27.98+00:00

    This is by design. Today ADF has four SAP system columns:

    ODQ_CHANGEMODE: an SAP-generated column; it is replaced by the data flow language's internal "RowMarker" after the dedupe operation.

    ODQ_ENTITYCNTR: an SAP-generated column.

    __PACKAGEID, SEQUENCENUMBER: ADF-generated columns that ensure the ordering of the delta changes we extract.


    After the SAP source in the data flow, if the customer still wants to know the operation type (insert, update, or delete) of each row, they can add a Derived Column transformation and compute it with a data flow expression (RowMarker is an internal column and is invisible to the customer); a sketch follows below.
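    A sketch of how that derived column could look in data flow script, assuming hypothetical stream names sapCdcSource and addRowOperation and an illustrative 'X' fallback (none of these names come from the original answer):

    sapCdcSource derive(row_operation = iif(isInsert(), 'I', iif(isUpdate(), 'U', iif(isDelete(), 'D', 'X')))) ~> addRowOperation

    The downstream sink would then map row_operation like any other column.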

