Azure Data Factory's Databricks runtime

Osama Tarek 20 Reputation points
2024-06-03T14:24:46.9933333+00:00

Hello, I am trying to ingest data from SAP into a Fabric lakehouse table using a data flow and a copy activity. However, because my sink table has change data feed enabled, the pipeline throws the following error:

Operation on target Data flow1 failed:

"{"StatusCode":"DFExecutorUserError","Message":"Job failed due to reason: at Sink 'sink1': Cannot write to table with delta.enableChangeDataFeed set. Change data feed from Delta is not yet available.","Details":"org.apache.spark.sql.AnalysisException: Cannot write to table with delta.enableChangeDataFeed set. Change data feed from Delta is not yet available."

This happens even though my Delta table's minReaderVersion is 1 and its minWriterVersion is 4, which are the versions required for change data feed.

When I tried to upgrade the Delta table's protocol version, it threw this error:

"Operation on target Data flow1 failed: {"StatusCode":"DFExecutorUserError","Message":"Job failed due to reason: at Sink 'sink1': Delta protocol version is too new for this version of the Databricks Runtime. Please upgrade to a newer release."

However, I found nothing about a Databricks runtime inside Azure Data Factory.

I would appreciate any information on this, and on whether it is possible to sink into a Delta table with change data feed enabled.

Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

Accepted answer
  1. BhargavaGunnam-MSFT 28,111 Reputation points Microsoft Employee
    2024-06-03T18:09:37.5566667+00:00

    Hello Osama Tarek,

    Welcome to the Microsoft Q&A forum.

    Delta Lake with change data feed (CDF) is only available starting with Delta Lake version 2.0.0.


    Reference document: https://docs.delta.io/2.0.0/versioning.html#features-by-protocol-version

    Delta Lake 2.0 is only supported in the Synapse Spark 3.3 runtime.


    https://learn.microsoft.com/en-us/azure/synapse-analytics/spark/apache-spark-33-runtime

    So data flows do not currently support a Delta sink with change data feed enabled.

    In ADF, data flows mostly run on Synapse Spark 3.1.

    Please note: Data flows will move to Spark 3.3/Delta 2.x, but I don't have an ETA for this feature request.

    I can follow up with PG and update this thread if I hear anything back from them.

    I hope this answers your question. Please let me know if you have any further questions.

    If this answers your question, please consider accepting the answer by selecting Accept answer and up-voting it, as this helps the community find answers to similar questions.
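The two errors in the question can be read as two separate checks made by the writing engine. The following is a minimal sketch of that reasoning in plain Python, not the actual data flow or Delta Lake implementation. The `Protocol` class, the `diagnose` function, and the engine parameters (`engine_max_writer`, `engine_implements_cdf`) are hypothetical names introduced for illustration. The only values taken from the thread are the reader and writer versions 1 and 4 that the question cites for change data feed.

```python
from dataclasses import dataclass

# Protocol versions the question cites as required for change data feed.
CDF_MIN_READER_VERSION = 1
CDF_MIN_WRITER_VERSION = 4


@dataclass(frozen=True)
class Protocol:
    """Hypothetical holder for a Delta table's protocol versions."""
    min_reader_version: int
    min_writer_version: int


def table_allows_cdf(table: Protocol) -> bool:
    """True if the table's protocol is high enough to enable change data feed."""
    return (table.min_reader_version >= CDF_MIN_READER_VERSION
            and table.min_writer_version >= CDF_MIN_WRITER_VERSION)


def diagnose(table: Protocol, cdf_enabled: bool,
             engine_max_writer: int, engine_implements_cdf: bool) -> str:
    """Classify a write attempt; the engine parameters are assumptions."""
    # Second error in the thread: the table protocol exceeds what the engine knows.
    if table.min_writer_version > engine_max_writer:
        return "protocol-too-new"
    # First error in the thread: the protocol is readable, but the engine's
    # Delta Lake version does not implement change data feed.
    if cdf_enabled and not engine_implements_cdf:
        return "cdf-not-available"
    return "ok"


# Hypothetical older engine: understands writer version 4 but has no CDF support.
print(diagnose(Protocol(1, 4), True, engine_max_writer=4, engine_implements_cdf=False))
# Same engine after the table protocol is upgraded past what it supports.
print(diagnose(Protocol(1, 5), True, engine_max_writer=4, engine_implements_cdf=False))
```

Under these assumptions, meeting the table-side protocol requirement is not enough on its own: the engine writing to the table must also implement the feature, which matches the explanation given in the accepted answer.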
