How to build an ADF data flow pipeline for complex queries

Aravind 20 Reputation points
2024-01-12T09:54:22.66+00:00

Hi Team, I asked this question previously and did get an answer, but my follow-up query has not been answered yet, so I am raising it separately.

WITH RankedRows AS (
  SELECT
    a.*,
    ROW_NUMBER() OVER (PARTITION BY a.Id ORDER BY a.Id) AS RowNum
  FROM [Lms.Core].[Sections] a WITH (NOLOCK)
  INNER JOIN [Lms.Core].[SectionModuleInstances] b WITH (NOLOCK) ON a.Id = b.SectionId
  INNER JOIN [Lms.Core].[CourseSections] c WITH (NOLOCK) ON b.SectionId = c.SectionInstanceId
  WHERE c.CourseId IN (
    SELECT a.Id
    FROM [Lms.Core].[Courses] a WITH (NOLOCK)
    INNER JOIN [Lms.Core].[CatalogItems] b WITH (NOLOCK) ON a.Id = b.ScopeId
    INNER JOIN [Lms.Core].[CatalogDefintions] c WITH (NOLOCK) ON b.CatalogDefinitionId = c.Id
    WHERE b.CatalogDefinitionId IN (722, 752, 814) AND b.Scope = 50
  )
)
SELECT *
FROM RankedRows
WHERE RowNum = 1;

I know how to build a data flow pipeline for the above query except for one part: the WHERE condition uses a subquery, as shown below.

WHERE
  c.CourseId IN (
    SELECT a.Id
    FROM [Lms.Core].[Courses] a WITH (NOLOCK)
    INNER JOIN [Lms.Core].[CatalogItems] b WITH (NOLOCK) ON a.Id = b.ScopeId
    INNER JOIN [Lms.Core].[CatalogDefintions] c WITH (NOLOCK) ON b.CatalogDefinitionId = c.Id
    WHERE b.CatalogDefinitionId IN (722, 752, 814) AND b.Scope = 50
  )



So how do I implement this in the Filter transformation? It seems nearly impossible because it only supports expressions, so what do I need to build on the ADF side to handle this?
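
For reference, the same statement can also be written with the subquery factored out into its own CTE and joined to the main query. This is only a restatement of the query above (the DISTINCT is added so the join cannot multiply rows), but it makes the two data sets involved easier to see:

-- Equivalent to the original statement, with the IN subquery factored out
WITH CourseIds AS (
  SELECT DISTINCT a.Id
  FROM [Lms.Core].[Courses] a WITH (NOLOCK)
  INNER JOIN [Lms.Core].[CatalogItems] b WITH (NOLOCK) ON a.Id = b.ScopeId
  INNER JOIN [Lms.Core].[CatalogDefintions] c WITH (NOLOCK) ON b.CatalogDefinitionId = c.Id
  WHERE b.CatalogDefinitionId IN (722, 752, 814) AND b.Scope = 50
),
RankedRows AS (
  SELECT
    a.*,
    ROW_NUMBER() OVER (PARTITION BY a.Id ORDER BY a.Id) AS RowNum
  FROM [Lms.Core].[Sections] a WITH (NOLOCK)
  INNER JOIN [Lms.Core].[SectionModuleInstances] b WITH (NOLOCK) ON a.Id = b.SectionId
  INNER JOIN [Lms.Core].[CourseSections] c WITH (NOLOCK) ON b.SectionId = c.SectionInstanceId
  INNER JOIN CourseIds ids ON c.CourseId = ids.Id
)
SELECT *
FROM RankedRows
WHERE RowNum = 1;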

Tags: Azure SQL Database, Azure Virtual Machines, Azure Synapse Analytics, Azure Data Factory

1 answer

  1. Pinaki Ghatak 5,600 Reputation points Microsoft Employee Volunteer Moderator
    2024-01-12T10:03:22.4966667+00:00

    Hello Aravind,

    In Azure Data Factory, the Filter activity in a pipeline applies a filter expression to an input array; it does not directly support SQL-like subqueries. Instead, you can combine Lookup, Filter, and ForEach activities to achieve similar functionality. Here is a high-level approach (a SQL sketch for the Lookup step follows the list):

    1. Lookup Activity: Use a Lookup activity to execute your subquery and retrieve the matching course IDs. This gives you an array of results that you can use in the next steps.
    2. Filter Activity: Use a Filter activity to filter your data based on that condition. The Filter activity keeps the items of an input array that satisfy a boolean expression, much like a WHERE clause in SQL; its data flow counterpart, the filter transformation, plays the same role for row streams.
    3. ForEach Activity: If you need to perform operations on each of the filtered results, use a ForEach activity and pass the Filter activity's output to it as the input.
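
    For step 1, the Lookup activity's source query can be just the inner SELECT from your statement. Below is a minimal sketch using the table names from your question; the DISTINCT and the CourseId alias are my additions, to keep the lookup array free of duplicates and give the value a clear name.

        -- Returns the list of course IDs that the outer query should be restricted to
        SELECT DISTINCT a.Id AS CourseId
        FROM [Lms.Core].[Courses] a WITH (NOLOCK)
        INNER JOIN [Lms.Core].[CatalogItems] b WITH (NOLOCK) ON a.Id = b.ScopeId
        INNER JOIN [Lms.Core].[CatalogDefintions] c WITH (NOLOCK) ON b.CatalogDefinitionId = c.Id
        WHERE b.CatalogDefinitionId IN (722, 752, 814) AND b.Scope = 50;

    With "First row only" unchecked on the Lookup, the result is exposed to later activities as an array of rows (one per CourseId), which is what the Filter and ForEach steps consume.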

    Remember that the Filter activity works with arrays, so you will need to structure your data accordingly, and its condition must be an expression that returns a boolean value. This is a high-level approach, and you might need to adjust it for your specific use case and data structure. It is also a good idea to refer to the official Azure Data Factory documentation for more detailed information and examples. Does this answer your question?
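
    As an addendum to step 2: the rest of your statement, with the IN (...) clause removed, could serve as the source of the data to be filtered. This is only a sketch, with two assumptions on my part: c.CourseId is added to the select list so there is a column to compare against the Lookup results, and the RowNum = 1 deduplication is applied after that filtering (as in your original statement), so the surviving row per section is still one that belongs to a selected course.

        -- Main row set; the CourseId filter and the RowNum = 1 deduplication happen downstream
        SELECT
          a.*,
          c.CourseId
        FROM [Lms.Core].[Sections] a WITH (NOLOCK)
        INNER JOIN [Lms.Core].[SectionModuleInstances] b WITH (NOLOCK) ON a.Id = b.SectionId
        INNER JOIN [Lms.Core].[CourseSections] c WITH (NOLOCK) ON b.SectionId = c.SectionInstanceId;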
