Azure Data Factory for real-time event processing

Sudarshan Kumar 20 Reputation points
2024-02-01T06:07:04.48+00:00

Hi, I have a use case where we receive 300K events every day in an on-premises SQL Server. We need to consume these events from on-premises and process them in Azure. Processing a single event requires talking to several tables and databases. For example, to process one event we need to connect to 3 databases, at least 100 times. After processing, we need to create a Parquet file and save it to OneLake. Processing should be as quick as possible; the SLA is less than 15 seconds. Can someone suggest whether ADF is a good fit for this use case, or should we go with Azure Functions with Logic Apps integration?

Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

2 answers

  1. phemanth 11,900 Reputation points Microsoft Vendor
    2024-02-02T07:41:59.9033333+00:00

    Hi @Sudarshan Kumar

    Thanks for the question and for using MS Q&A.

    Let's look at the options for your use case of processing 300K events daily from an on-premises SQL Server and consuming them in Azure. Based on your requirements, we’ll compare Azure Data Factory (ADF) and Azure Functions with Logic Apps integration.
    Azure Data Factory (ADF)

    • Purpose: ADF is designed for bulk data movement and orchestrating data workflows. It allows you to create data-driven workflows for data movement, transformation, and loading.
    • Strengths:
        • Scalability: ADF can handle large volumes of data efficiently.
        • Data Transformation: It supports data transformations and ETL (Extract, Transform, Load) processes.
        • Optimized for Data Processing: ADF is built and tuned for data-related tasks.
    • Considerations:
        • Data-Centric: ADF focuses on data movement and transformation, so it’s well-suited for your scenario.
        • Integration with Azure Services: You can integrate ADF with other Azure services like Azure SQL Database, Azure Data Lake, etc.
        • Parallel Processing: Consider increasing Data Integration Units (DIUs) and parallel copy settings to improve copy performance.
    • Recommendation: ADF would be a better fit for your data movement and transformation needs.

    Azure Functions with Logic Apps integration

    • Purpose: Azure Functions are serverless compute resources for event-driven scenarios, while Logic Apps are designed for application integration.
    • Strengths:
        • Event-Driven: Logic Apps excel at running recurring activities triggered by events (e.g., file arrival or deletion).
        • Connectivity: Logic Apps unify internal and external services, creating a cohesive infrastructure for business processes.
        • Low/No Code: Logic Apps are intuitive and require minimal coding.
    • Considerations:
        • Application Integration: Logic Apps are more about connectivity and orchestration.
        • Complex Business Workflows: Azure Functions can handle more complex business logic.
        • Asynchronous Processing: Logic Apps are well-suited for asynchronous integration.
    • Recommendation: Use Azure Functions with Logic Apps to orchestrate ETL via ADF for repetitive tasks.
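
As a rough illustration of the event-driven option for the workload in the question, here is a minimal Python sketch of a per-event handler that issues its lookups concurrently and enforces the 15-second budget. It is not Azure Functions code: `process_event`, `make_stub_lookup`, and the `max_workers` value are hypothetical names and settings chosen for illustration, and each lookup is a stub that sleeps instead of querying a real database.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait

SLA_SECONDS = 15.0  # per-event budget taken from the question


def make_stub_lookup(name, delay_seconds):
    """Build a fake lookup (hypothetical stand-in for one database query)."""
    def lookup(event):
        time.sleep(delay_seconds)  # simulates network and query latency
        return (name, event["id"])
    return lookup


def process_event(event, lookups, max_workers=10, budget=SLA_SECONDS):
    """Run all lookups for one event concurrently within a time budget.

    Returns (results, elapsed_seconds). Raises TimeoutError if any lookup
    is still unfinished when the budget expires.
    """
    started = time.monotonic()
    pool = ThreadPoolExecutor(max_workers=max_workers)
    try:
        futures = [pool.submit(lookup, event) for lookup in lookups]
        _, not_done = wait(futures, timeout=budget)
        if not_done:
            raise TimeoutError(
                f"{len(not_done)} lookups unfinished after {budget}s"
            )
        # Results are collected in submission order, not completion order.
        results = [future.result() for future in futures]
    finally:
        # Do not block on stragglers; drop queued work (Python 3.9+).
        pool.shutdown(wait=False, cancel_futures=True)
    return results, time.monotonic() - started


# Demo: 100 stubbed lookups of 10 ms each, 10 at a time.
lookups = [make_stub_lookup(f"query-{i}", 0.01) for i in range(100)]
results, elapsed = process_event({"id": 1}, lookups)
print(f"{len(results)} lookups finished in {elapsed:.2f}s")
```

Running the lookups in a small thread pool rather than one after another is only one possible design; whether it helps in practice depends on how the three databases handle concurrent connections.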

    In summary:

    • Azure Data Factory is ideal for data movement, transformation, and scalability.
    • Azure Functions with Logic Apps are great for event-driven scenarios and application integration.

    Consider combining ADF, Logic Apps, and other services (such as Functions) to meet your specific requirements. Feel free to explore both options further and choose the one that best fits your use case. Hope this helps.

    If this answers your query, please click Accept Answer and Yes for "Was this answer helpful". If you have any further queries, do let us know.


  2. Gabriel Nespoli 0 Reputation points
    2024-03-25T17:00:15.8433333+00:00

    Just to complement @phemanth's answer: Azure Data Factory is a batch processing service, not a real-time one. The ADF SLA documentation states: "We guarantee that at least 99.9% of the time, all activity runs will initiate within 4 minutes of their scheduled execution times."

    Even though activities usually start as soon as they are triggered, this cannot be guaranteed.
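
To put the numbers from this thread side by side, here is a small Python sketch of the arithmetic. It assumes events arrive evenly across the day and reads "at least 100 times" in the question as 100 database lookups per event; `workload_budget` is a hypothetical helper name used only for this illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60
ADF_START_GUARANTEE_SECONDS = 4 * 60  # "within 4 minutes" from the quoted SLA


def workload_budget(events_per_day, lookups_per_event, sla_seconds):
    """Average rates and per-lookup time budget, assuming an even event spread."""
    events_per_second = events_per_day / SECONDS_PER_DAY
    return {
        "events_per_second": events_per_second,
        "lookups_per_second": events_per_second * lookups_per_event,
        # Time available per lookup if all lookups for an event run in sequence.
        "sequential_lookup_budget_ms": sla_seconds * 1000 / lookups_per_event,
    }


budget = workload_budget(events_per_day=300_000, lookups_per_event=100, sla_seconds=15)
start_vs_sla = ADF_START_GUARANTEE_SECONDS / 15  # guarantee window / SLA

print(f"events per second:  {budget['events_per_second']:.2f}")
print(f"lookups per second: {budget['lookups_per_second']:.0f}")
print(f"per-lookup budget:  {budget['sequential_lookup_budget_ms']:.0f} ms")
print(f"ADF start guarantee is {start_vs_sla:.0f}x the 15 s SLA")
```

Under these assumptions the average load is modest (about 3.5 events per second), but the 4-minute start guarantee alone is 16 times the 15-second budget, which is the gap described above.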
