You'd typically need to add a Spark activity after the Copy Data activity to read the copied files and write them back out in Delta format.
After the copy completes, you can use an Azure Synapse Spark pool to perform the conversion. Here's a rough guide on how you might do this:
- Add a new Spark notebook or Spark job in your Synapse workspace.
- Read the parquet files into a Spark DataFrame.
- Write out the DataFrame in Delta format using the `.format("delta")` option on the DataFrame writer.
You can automate this process by creating a pipeline that includes both the Copy Data activity followed by the Spark job or notebook activity.
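As a rough sketch, the chained pipeline might look like the following Synapse pipeline JSON. The activity names, notebook name, and Spark pool name are placeholders, and the Copy activity's source/sink dataset settings are omitted:

```json
{
  "name": "ParquetToDeltaPipeline",
  "properties": {
    "activities": [
      {
        "name": "CopyToStaging",
        "type": "Copy",
        "typeProperties": {}
      },
      {
        "name": "ConvertToDelta",
        "type": "SynapseNotebook",
        "dependsOn": [
          {
            "activity": "CopyToStaging",
            "dependencyConditions": [ "Succeeded" ]
          }
        ],
        "typeProperties": {
          "notebook": {
            "referenceName": "ConvertParquetToDelta",
            "type": "NotebookReference"
          },
          "sparkPool": {
            "referenceName": "YourSparkPool",
            "type": "BigDataPoolReference"
          }
        }
      }
    ]
  }
}
```

The `dependsOn` entry with the `Succeeded` condition is what ensures the notebook only runs after the copy finishes successfully.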
```python
# Read the Parquet files into a DataFrame
df = spark.read.format("parquet").load("path/to/your/staging/parquet/files")

# Write out the DataFrame in Delta format
df.write.format("delta").save("path/to/your/final/delta/output")
```