Hi there,
I have set up a data lake CDM export from my Dynamics 365 CRM. I want to use Spark to read the data, and I was using this library; everything worked until I added new columns to a table:
https://github.com/Azure/spark-cdm-connector/blob/master/documentation/overview.md
Its limitations section states that schema evolution is not supported, which explains why it fails to read the files after a new column is added.
Checking further, the FAQ page for Synapse Link says:
https://learn.microsoft.com/en-us/powerapps/maker/data-platform/export-data-lake-faq#what-happens-when-i-add-a-column
It seems that when a new column is added, only new/updated rows will have the new column, while the old rows stay the same width. That's what I observe in my data folder as well.
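To illustrate what I mean, here is a minimal sketch in plain Python (not Spark; the entity and column names are made up) of the mixed-width rows I see, and of the kind of padding that would make old rows line up with the current schema:

```python
import csv
import io

# Hypothetical example: the entity originally had 3 columns; a 4th
# ("loyaltytier") was added later. Rows written before the change keep
# their original width, rows written after carry the extra column --
# mirroring what I observe in the exported partition files.
raw = io.StringIO(
    "id1,Alice,2021-01-01\n"        # written before the new column existed
    "id2,Bob,2023-05-01,gold\n"     # written after the new column was added
)

current_schema = ["accountid", "name", "createdon", "loyaltytier"]

def pad_row(row, schema):
    """Pad a short row with None so every row matches the latest schema."""
    return row + [None] * (len(schema) - len(row))

rows = [pad_row(r, current_schema) for r in csv.reader(raw)]
for r in rows:
    print(dict(zip(current_schema, r)))
```

This is just to show the shape of the problem; I would rather not hand-roll this for every table, which is why I'm asking what the recommended approach is.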
What would be the best way to handle this situation? If not Spark, what other tools can I use?
Thanks in advance.
Regards,
Alex