SAP CDC transformation doesn't have ODQ columns as output

Dmytro Honcharuk 50 Reputation points
2023-05-04T13:47:07.2233333+00:00

Hello, we are trying to implement an SAP CDC -> ADLS Parquet mapping data flow, but the output of the SAP CDC transformation drops the ODQ metadata columns, such as ODQ_CHANGEMODE; they are not available in the sink file. I can see those columns only in the staging files. We are using auto-mapping in our flow. Is there a way to retrieve the ODQ columns and push them into the sink Parquet file?

Azure Data Factory

Accepted answer
  1. KranthiPakala-MSFT 46,642 Reputation points Microsoft Employee Moderator
    2023-05-10T22:13:47.1766667+00:00

    @Dmytro Honcharuk Thanks for the additional information.

    I wish I had an SAP environment to test this scenario and share screenshots, but per my discussion with the product team: in the source transformation, ADF processes those SAP ODQ fields internally and marks each row as insert, update, or delete. When a row has multiple updates, ADF sorts/orders them and picks the last after-image. Essentially, it de-duplicates the multiple updates coming from SAP.

    The recommendation from the product team is to add a Derived Column transformation that creates a new column, say "row operation", and sets its value to insert, update, or delete based on the mapping data flow row marker (using the isInsert, isUpdate, and isDelete functions). This allows the sink to carry the regular data columns plus a row operation column (set to I / U / D); a sketch follows below.
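    A minimal sketch of such a Derived Column expression, assuming a hypothetical column name "row_operation" and an illustrative 'X' fallback for any other row type (both are my own choices, not prescribed by the product team):

    row_operation = iif(isInsert(), 'I',
                    iif(isUpdate(), 'U',
                    iif(isDelete(), 'D', 'X')))

    In the Derived Column transformation, "row_operation" would be the new column name and the iif(...) expression its value; the sink then receives that column alongside the regular data columns.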


    MS Doc for reference: Expression functions in mapping data flow

    You could then use the added column in Snowflake to perform the final merge.

    Hope this info helps.


    Please don’t forget to Accept Answer and mark "Yes" for "was this answer helpful" wherever the information provided helps you, as this can be beneficial to other community members.


2 additional answers

  1. Dmytro Honcharuk 50 Reputation points
    2023-05-10T08:47:05.7066667+00:00

    We have our DWH in Snowflake and want to read incremental deltas from SAP into the data lake and then merge the data into the Snowflake target tables using SCD Type 2. Hence, we need the ODQ metadata columns to be able to perform a proper merge operation on the Snowflake side, i.e. to identify the change mode of each row. The problem is that we cannot skip the intermediate file storage step due to internal architecture concerns.


  2. Zhangyi Yu 6 Reputation points Microsoft Employee
    2023-06-20T08:13:27.98+00:00

    This is by design. Today ADF has four SAP system columns:

    ODQ_CHANGEMODE: an SAP-generated column; it is replaced by the data flow language's internal "RowMarker" after the dedupe operation.

    ODQ_ENTITYCNTR: an SAP-generated column.

    __PACKAGEID, SEQUENCENUMBER: ADF-generated columns that ensure the ordering of the delta changes we extract.


    After the SAP source in the data flow, if the customer still wants to know the operation type (insert, update, or delete) of each row, they can add a Derived Column transformation and compute it with a data flow expression (RowMarker is an internal column and is invisible to the customer); a sketch follows below.
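    A sketch of how that derived column could look in data flow script, assuming hypothetical stream names sapCdcSource and addRowOperation and an illustrative 'X' fallback (none of these names come from the original answer):

    sapCdcSource derive(row_operation = iif(isInsert(), 'I', iif(isUpdate(), 'U', iif(isDelete(), 'D', 'X')))) ~> addRowOperation

    The downstream sink would then map row_operation like any other column.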

