Azure Data Factory - Data Flow filtering function

Andres Esteban 156 Reputation points
2024-01-02T16:03:13.31+00:00

Hello

I have a source table where telemetry data is stored every minute from a device that measures presence (0 or 1) as shown in the image:

User's image

In order to be able to work with the data, I need to create a new table where only the records that involve a change of state are stored, i.e. from the first image to the one shown below:

User's image

To do this, I am configuring a DataFlow in Azure Data Factory, but I am not able to configure the filtering step, could someone help me? I have tried the following but it does not recognise the 'lag' function:

User's image

Presence == 1 || (Presence == 0 && (lag(Presence) != 0 || lag(Presence) is null))

Could someone help me and tell me how to keep the rows that involve a change of state please?

Thank you very much in advance

Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
11,624 questions
{count} votes

Accepted answer
  1. phemanth 15,755 Reputation points Microsoft External Staff Moderator
    2024-01-03T13:09:52.0333333+00:00

    @Andres Esteban

    Thanks for reaching MS Q&A

    Here’s a high-level overview of how you can achieve this:

    Source Transformation: Connect to your source data.

    Window Transformation: Add a window transformation after your source transformation. In the window transformation, you can use the lag function to create a new column that contains the previous state. For example, you could create a new column PreviousPresence with the expression lag(Presence, 1).

    Derived Column Transformation: Add a derived column transformation after the window transformation. In this transformation, create a new column StateChanged with the expression Presence != PreviousPresence. This will give a boolean value indicating whether the state has changed compared to the previous row.

    Filter Transformation: Add a filter transformation after the derived column transformation. In the filter transformation, filter the rows where StateChanged is true. This will keep only the rows where the state has changed.

    Sink Transformation: Connect to your destination data store and write the result.

    please go through: https://learn.microsoft.com/en-us/azure/data-factory/data-flow-window-functions

    Hope this helps. Do let us know if you any further queries.

    1 person found this answer helpful.

0 additional answers

Sort by: Most helpful

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.