SAP CDC Source in Synapse Dataflow - Full and Incremental Load Discrepancy with Partition Condition

Dev-nimbeo 1 Reputation point
2024-01-24T15:00:13.5966667+00:00

I am writing to report an incident related to the SAP CDC source within a Synapse Dataflow. The issue revolves around the use of full and incremental loads in delta tables, specifically when applying a filter via a partition condition. Currently, the problem observed is that the partition condition is only considered during the full load, and subsequent incremental loads seem to ignore this condition. It is pertinent to mention that the Partition Type selected is "Source".

Incident Details:

  • Platform: Synapse Dataflow
  • Source: SAP CDC
  • Load Types: Full Load and Incremental Load
  • Issue: Partition condition not consistently applied in incremental loads after full load execution.

Steps to Reproduce:

  1. Initiate a full load with a specified partition condition in the SAP CDC source within the Synapse Dataflow.
  2. Observe that the partition condition is correctly applied during the full load.
  3. Execute incremental loads following the full load, and notice that the partition condition appears to be ignored in these subsequent incremental loads.

Expected Behavior: The partition condition should be consistently applied in both full and incremental loads within the Synapse Dataflow when utilizing the SAP CDC source.
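One way to state the expected behavior as a check is sketched below in plain Python. This is a toy model, not the SAP CDC connector's API: the `company_code` field, the sample rows, and the helper `rows_violating_condition` are all hypothetical, and the sketch assumes the output of each load has already been read into a list of dictionaries. If the partition condition were applied consistently, both lists of violations would be empty.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]


def rows_violating_condition(
    records: List[Record], condition: Callable[[Record], bool]
) -> List[Record]:
    """Return the records that do NOT satisfy the partition condition."""
    return [r for r in records if not condition(r)]


# Hypothetical partition condition: keep only company code "1000".
def company_is_1000(record: Record) -> bool:
    return record.get("company_code") == "1000"


# Hypothetical sample output of a full load and a later incremental load.
full_load = [{"id": "1", "company_code": "1000"}]
incremental_load = [
    {"id": "2", "company_code": "1000"},
    {"id": "3", "company_code": "2000"},  # would be filtered out if the condition applied
]

full_violations = rows_violating_condition(full_load, company_is_1000)
incremental_violations = rows_violating_condition(incremental_load, company_is_1000)
```

In this made-up sample the full load has no violations while the incremental load has one, which mirrors the discrepancy described in the steps to reproduce.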

Impact: The current behavior hinders the effective use of partition conditions in incremental loads, potentially leading to data inconsistencies and inaccuracies in the Synapse Dataflow.

Azure Synapse Analytics

1 answer

  1. phemanth 3,680 Reputation points Microsoft Vendor
    2024-01-25T10:31:40.5166667+00:00

    @Dev-nimbeo
    Thanks for the question and for using the MS Q&A platform.

    I understand that you are facing an issue with the partition condition not being consistently applied in incremental loads after full load execution in Synapse Dataflow when utilizing the SAP CDC source.

    Please check whether the SAP CDC connector is configured correctly and whether the partition condition is specified correctly in the SAP CDC source dataset.

    The SAP CDC connector uses the SAP ODP framework to extract data from SAP source systems. Please go through: https://learn.microsoft.com/en-us/azure/data-factory/connector-sap-change-data-capture

    You can use a manual, limited workaround to extract mostly new or updated records. In a process called watermarking, extraction requires a column with monotonically increasing values, such as a timestamp, and continuous tracking of the highest value seen since the last extraction. However, some tables don’t have a column that you can use for watermarking. Please follow the blog: https://techcommunity.microsoft.com/t5/fasttrack-for-azure/metadata-driven-data-ingestion-pipeline-using-the-sap-cdc/ba-p/3940416
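    As a rough illustration of the watermarking idea, here is a minimal sketch in plain Python. It is not tied to Synapse, Data Factory, or the SAP CDC connector: the `Row` type, the `changed_at` column, and the `extract_since` helper are hypothetical names, and the sketch assumes the table has a timestamp column that only ever increases.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Row:
    key: str
    changed_at: datetime  # hypothetical watermark column


def extract_since(
    rows: List[Row], watermark: Optional[datetime]
) -> Tuple[List[Row], Optional[datetime]]:
    """Return rows changed after `watermark` and the new highest value seen."""
    fresh = [r for r in rows if watermark is None or r.changed_at > watermark]
    # Keep the old watermark when nothing new was extracted.
    new_watermark = max((r.changed_at for r in fresh), default=watermark)
    return fresh, new_watermark


# Hypothetical source table contents.
table = [
    Row("A", datetime(2024, 1, 1)),
    Row("B", datetime(2024, 1, 2)),
]

# First run: no watermark yet, so every row is extracted.
first_batch, wm = extract_since(table, None)

# A new row arrives; the second run extracts only rows above the watermark.
table.append(Row("C", datetime(2024, 1, 3)))
second_batch, wm = extract_since(table, wm)

# Third run with no changes extracts nothing and keeps the watermark.
third_batch, wm = extract_since(table, wm)
```

    In a real pipeline the watermark would need to be persisted between runs, and this approach only covers tables that have a suitable column, as noted above.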

    If the issue persists, do let us know. Hope this helps. Do let us know if you have any further queries.