How to process multiple delta files using a dataflow, applying a filter on a date column to pick the latest records, without importing columns

2025-04-07T12:33:20.1233333+00:00

Hi Team,

Could you please help with an issue we are facing? We need to pass a date value as a filter condition in a dataflow without importing the columns in the dataflow. The column name is referencedate and the date value is in the yyyy-MM-dd format. We want to pass this value as a parameter to the dataflow to filter records (without importing the columns). Could you please help us with this?

Thanks in advance.

Azure Synapse Analytics

Accepted answer
  1. Anonymous
    2025-04-10T07:45:50.8733333+00:00

    Hello @SaiSekhar, MahasivaRavi (Philadelphia), glad the above steps worked for you. Posting it as an answer for the benefit of community members who face similar issues in the future.

    • First, make sure you have not imported any source columns and that Schema drift is enabled in the source of the dataflow.
    • To filter on the date type column, create a string type parameter in the dataflow without any default value.
    • Now, in the dataflow, add a filter transformation with the below expression.
      
      	toString(byName('mydate'))==$dateparam
      
      
      Here, byName() returns the column's value if the column exists; otherwise it returns null. The expression above converts that value to a string and checks the required condition against the parameter.
    • You can pass the parameter value to the dataflow activity inside a For-each activity, as shown in the sketch below.
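
    As a minimal sketch (not part of the original answer), assuming the For-each iterates over an array in which each item carries a hypothetical loadDate field, the Data flow activity's parameter value can be set with a pipeline expression such as:

      	@formatDateTime(item().loadDate, 'yyyy-MM-dd')

    Assign this expression to the dataflow parameter (dateparam above) on the Parameters tab of the Data flow activity, so each iteration filters on its own date.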

    Refer to this MS documentation to learn about parameterization in dataflows.
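
    For reference (a sketch; dateparam is the parameter name used above), a parameter created this way appears in the generated data flow script in the standard form:

      	parameters{
      	    dateparam as string
      	}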

    This way, you can pass a date value as a filter condition in the dataflow without importing columns in the dataflow.

    Coming to the error that you asked about in the comments, please cross-check the below steps to troubleshoot the issue:

    • You mentioned that you are using a For-each activity to run the task for the 89 tables. The For-each loop is likely running in parallel, which may be the reason for the excessive-cores error. To resolve this, set a batch count for the parallel execution of the For-each; it limits how many dataflow executions run simultaneously, so they fit within the available cores. If you still face the issue, try running the For-each activity sequentially. This reduces the load on the runtime completely, but the pipeline run will take more time to complete. A sketch of the relevant settings follows this list.
    • Also, try the Memory optimized compute type in the dataflow activity's integration runtime.
    • If you frequently work with heavy loads, go to Azure Portal → Synapse Workspace → Usage + quotas, find your MemoryOptimized quota, and request an increase.
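
    For illustration only (a sketch, not part of the original answer; the value 5 is an assumption), parallelism of the For-each activity is controlled by the isSequential and batchCount properties in its pipeline JSON definition:

      	"typeProperties": {
      	    "isSequential": false,
      	    "batchCount": 5
      	}

    Setting isSequential to true forces one-at-a-time execution; otherwise batchCount caps the number of iterations that run in parallel.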

    Hope this helps.

    If the answer is helpful, please click Accept Answer and kindly upvote it. If you have any further questions about this answer, please click Comment.


0 additional answers
