Azure Synapse Link for Dataverse - how to handle schema evolution

alex feng 86 Reputation points
2021-09-27T23:58:45.36+00:00

Hi there,

I have set up a data lake CDM export from my Dynamics 365 CRM. I want to use Spark to read the data, and I was using this library. It all worked until I added new columns to a table:
https://github.com/Azure/spark-cdm-connector/blob/master/documentation/overview.md

Its limitations section states that schema evolution is not supported, which explains why it fails to read the files after a new column is added.

I also checked the FAQ page for Synapse Link:
https://learn.microsoft.com/en-us/powerapps/maker/data-platform/export-data-lake-faq#what-happens-when-i-add-a-column

It seems that when a new column is added, only new/updated rows include the new column; the old rows stay unchanged. That's what I observe in my data folder as well.
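In other words, the exported CSV files end up with ragged rows: rows written before the column was added have one fewer field than rows written after. One workaround, if the connector can't cope, is to read the raw CSV yourself and pad short rows with nulls against the current column list (taken from the table's model.json). A minimal Python sketch of that padding step, using a hypothetical column list and sample rows (not Spark, just to illustrate the idea):

```python
import csv
import io

def pad_rows(csv_text, columns):
    """Parse CSV text and pad any row that is shorter than the
    current column list with None, so rows written before a new
    column was added line up with the latest schema."""
    reader = csv.reader(io.StringIO(csv_text))
    return [row + [None] * (len(columns) - len(row)) for row in reader]

# Hypothetical export: the 'email' column was added after row 1 was written.
columns = ["id", "name", "email"]
data = "1,Alice\n2,Bob,bob@example.com\n"
for row in pad_rows(data, columns):
    print(row)
```

The same idea applies in Spark: supplying the full, current schema explicitly when reading (instead of inferring it per file) lets older rows surface the new column as null rather than failing the read.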

What would be the best way to handle this situation? If not Spark, what other tools could I use?

Thanks in advance.

Regards,
Alex

Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.

1 answer

  1. Saurabh Sharma 23,676 Reputation points Microsoft Employee
    2021-10-18T19:24:24.04+00:00

    Hi @alex feng ,

    We have public documentation on configuring the export with Azure Synapse Analytics and then using Spark to read and transform the data. This is a new integration called Azure Synapse Link for Dataverse, released in May, that bypasses the need for the Spark CDM connector.

    Please let me know if this works or if you have any other questions.

    Thanks
    Saurabh
