Hi there,
I have set up a data lake CDM export from my Dynamics 365 CRM. I want to use Spark to read the data, and I was using this library; everything worked until I added new columns to a table:
https://github.com/Azure/spark-cdm-connector/blob/master/documentation/overview.md
Its limitations section states that schema evolution is not supported, which explains why it fails to read the files after a new column is added.
Checking further, the FAQ page for Synapse Link says:
https://learn.microsoft.com/en-us/powerapps/maker/data-platform/export-data-lake-faq#what-happens-when-i-add-a-column
It seems that when a new column is added, only new/updated rows will have the new column, while the old rows stay the same width. That's what I observe in my data folder as well.
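To illustrate what I mean, here is a minimal sketch in plain Python (not Spark; the entity and column names are made up) of the mixed-width rows I see, and of the kind of padding that would make old rows line up with the current schema:

```python
import csv
import io

# Hypothetical example: the entity originally had 3 columns; a 4th
# ("loyaltytier") was added later. Rows written before the change keep
# their original width, rows written after carry the extra column --
# mirroring what I observe in the exported partition files.
raw = io.StringIO(
    "id1,Alice,2021-01-01\n"        # written before the new column existed
    "id2,Bob,2023-05-01,gold\n"     # written after the new column was added
)

current_schema = ["accountid", "name", "createdon", "loyaltytier"]

def pad_row(row, schema):
    """Pad a short row with None so every row matches the latest schema."""
    return row + [None] * (len(schema) - len(row))

rows = [pad_row(r, current_schema) for r in csv.reader(raw)]
for r in rows:
    print(dict(zip(current_schema, r)))
```

This is just to show the shape of the problem; I would rather not hand-roll this for every table, which is why I'm asking what the recommended approach is.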
What would be the best way to handle this situation? If not Spark, what other tools can I use?
Thanks in advance.
Regards,
Alex