Filtering data for last 24 months in Mapping Data Flows

ADITYA SINGH 0 Reputation points
2024-03-19T01:29:58.9966667+00:00

How can I filter data using Mapping Data Flows for the last 24 months, starting from the max date and going back? The date column is in the mm/dd/yyyy format.

Azure Data Lake Storage
Azure Data Lake Storage
An Azure service that provides an enterprise-wide hyper-scale repository for big data analytic workloads and is integrated with Azure Blob Storage.
1,343 questions
Azure Synapse Analytics
Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.
4,372 questions
Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
9,549 questions
{count} votes

2 answers

Sort by: Most helpful
  1. Amira Bedhiafi 15,216 Reputation points
    2024-03-19T09:38:06.6366667+00:00

    I am assuming that you want to filter based on the maximum date found in your dataset :

    First, you need to identify the maximum date in your dataset. You can use an aggregate transformation to find the max date

    If your date is in a string format like mm/dd/yyyy, use the toDate() function in a derived column transformation to convert it to a date format: toDate(your_date_column, 'MM/dd/yyyy'). If not skip this step

    Now, you will need to calculate the start date for the 24-month range. After getting the max date (let's call this max_date), you can calculate the start date of your 24-month period.

    Here you need to use a derived column transformation to subtract 24 months from max_date. The expression can be like addMonths(max_date, -24)

    Now, use a Filter transformation to keep only the rows within the last 24 months. The condition should check if the date column is between the start date and the max_date.

    0 comments No comments

  2. AnnuKumari-MSFT 30,751 Reputation points Microsoft Employee
    2024-03-28T05:17:33.9+00:00

    @ADITYA SINGH

    Thankyou for using Microsoft Q&A platform.

    I understand that you want to filter data using Mapping Data Flows for the last 24 months, starting from the max date and going back, kindly try the below steps in dataflow:

    1. Add a source transformation to your data flow and select the table that contains the date column you want to filter on.
    2. Add an aggregate transformation to your data flow and group by the date column. In the aggregate transformation, add a max function to get the maximum date value from the date column.
    3. Add a derived column transformation to your data flow and create a new column that calculates the date 24 months ago from the max date. You can use the addMonths function to subtract 24 months from the max date. For example, if your max date column is called MaxDate, you can create a new column 'newCol' with the expression addMonths(MaxDate, -24).
    4. Add a filter transformation to your data flow and filter on the date column using the new column you created in step 3. You can use the greaterThanOrEqual function to filter on dates that are greater than or equal to the date 24 months ago. For example, you can filter on the expression greaterThanOrEqual(DateColumn, newCol).

    Hope it helps. Kindly accept the answer by clicking on Accept answer button. Thankyou

    0 comments No comments