In your Synapse Data Flow, start by setting up your source to connect to the Azure Data Lake Storage Gen2 account where your Delta table is stored. You can use parameters in your Data Flow to filter data dynamically, so create a parameter, such as CurrentDate, to hold the date you want to process.
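If the Data Flow is executed from a pipeline's Data Flow activity, one way to supply that value (a sketch, assuming CurrentDate is a string parameter and you want today's date in yyyy-MM-dd format) is a pipeline expression on the activity's parameter:

@formatDateTime(utcNow(), 'yyyy-MM-dd')

This evaluates at run time, so each execution filters on the current date.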
When configuring your source in the Data Flow, you can use a SQL-like query to select only the data in the partition that matches your CurrentDate parameter, for example:
SELECT * FROM your_delta_table WHERE date_column = @CurrentDate
Azure Synapse has native support for Delta Lake, so if the Delta table is partitioned on the date column, querying a single partition should be efficient thanks to partition pruning.
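For partition pruning to apply, the table needs to be physically partitioned by that column. A hypothetical layout in the storage account (assuming the table was written with partitionBy on date_column) would look like this:

your_delta_table/
    _delta_log/
    date_column=2024-06-01/
    date_column=2024-06-02/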
As a backup, you can also add a Filter transformation after the source to make sure only rows from the desired partition pass through (an example expression is shown below). That said, filtering at the source query level is more efficient because it avoids reading data you don't need.
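In the Filter transformation, the condition is written in the Data Flow expression language, where parameters are referenced with a $ prefix. A minimal sketch, assuming the column is named date_column and CurrentDate holds a yyyy-MM-dd string:

toDate(date_column) == toDate($CurrentDate)

Only rows whose date matches the parameter are passed on to the next transformation.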
Finally, don't forget to plan for the incremental load: each run should receive the date (or date range) it needs to process through the CurrentDate parameter.
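For example, a daily-triggered pipeline that processes the previous day's partition could pass (a sketch, assuming a daily schedule):

@formatDateTime(addDays(utcNow(), -1), 'yyyy-MM-dd')

Alternatively, keep a watermark (the last processed date) in a control table and look it up at the start of each run so reruns and late-arriving data are handled consistently.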