Azure Databricks vs ADF

CzarR 296 Reputation points
2022-05-22T02:36:50.14+00:00

Hi, I have experience using ADF and SSIS, and I am now trying to understand and implement Databricks at work. From what I can see, ADF has a simple drag-and-drop GUI and runs on preconfigured Spark clusters called integration runtimes. With Databricks, you configure the cluster yourself, program the ETL yourself, and can do data analytics at the same time. Databricks is good at processing streaming data; I am not sure whether ADF can process streaming data.

I have a few questions that I need your help to understand:
1.) Is the write-up above accurate, or is anything wrong with my understanding?
2.) When doing data engineering (ETL/ELT), when would you use Databricks over ADF activities in practice? A few example scenarios would help.
3.) Can I say Azure Databricks is like a stored procedure that can do complex data manipulations in just one script?

We are about to start building an Azure Synapse warehouse with internal and external sources from data lakes, so I am wondering in which scenarios I should use Databricks versus ADF activities when creating my pipelines.

Please help. Thanks in advance.


Accepted answer
  1. Samy Abdul 3,371 Reputation points
    2022-05-23T05:17:24.16+00:00

    Hi @CzarR, the notable difference is that Azure Data Factory is a batch-processing service and cannot natively handle heavy, data-intensive real-time ingestion from devices or sensors. For that, other Azure services such as Event Hubs and IoT Hub with Stream Analytics are used. Databricks, by contrast, is an ideal platform for high-volume real-time streaming data. Typically, ADF and Databricks are used in combination as required.
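To make the batch-versus-streaming distinction concrete, here is a minimal plain-Python sketch of a tumbling-window aggregation, the kind of computation a streaming engine runs continuously as events arrive. It does not use the Spark or Databricks API. The event shape, the device names, and the function name `tumbling_window_averages` are assumptions made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical event shape: (epoch_seconds, device_id, reading).
Event = Tuple[int, str, float]


def tumbling_window_averages(
    events: Iterable[Event], window_seconds: int = 60
) -> Iterator[Tuple[int, Dict[str, float]]]:
    """Yield (window_start, {device_id: mean reading}) as each window closes.

    Assumes events arrive in timestamp order; real engines deal with late
    data (for example via watermarks), which this sketch leaves out.
    """
    current_start = None
    sums: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)

    for ts, device, reading in events:
        window_start = ts - (ts % window_seconds)
        if current_start is not None and window_start != current_start:
            # The previous window is complete, so emit it right away
            # instead of waiting for the whole dataset as a batch job would.
            yield current_start, {d: sums[d] / counts[d] for d in sums}
            sums.clear()
            counts.clear()
        current_start = window_start
        sums[device] += reading
        counts[device] += 1

    if current_start is not None:
        # Flush the final, still-open window when the input ends.
        yield current_start, {d: sums[d] / counts[d] for d in sums}


# Made-up sensor readings, purely for demonstration.
sample_events = [
    (0, "sensor-a", 20.0),
    (30, "sensor-a", 22.0),
    (45, "sensor-b", 5.0),
    (60, "sensor-a", 30.0),
    (125, "sensor-b", 7.0),
]

results = list(tumbling_window_averages(sample_events))
for window_start, averages in results:
    print(window_start, averages)
```

On Databricks, this kind of logic would normally be written as a windowed aggregation in Spark Structured Streaming rather than by hand. A scheduled or triggered ADF pipeline would instead process data that has already landed in storage.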

    https://www.mssqltips.com/sqlservertip/6438/azure-data-factory-vs-ssis-vs-azure-databricks/ Thanks


2 additional answers

  1. Samy Abdul 3,371 Reputation points
    2022-05-22T09:18:05.313+00:00

  2. CzarR 296 Reputation points
    2022-05-22T17:46:27.257+00:00

    Hi @Samy Abdul, these articles help. Thank you so much.

    Still, a few real scenarios would help me understand. From what I can see, ADF can do everything that Databricks can do, so why put extra effort into writing code in Databricks? Examples would help. Thanks again.

