Hello @Akthar Hussain,
Welcome to the Microsoft Q&A platform.
Both ADF’s Mapping Data Flows and Azure Databricks use Apache Spark clusters to transform and process big-data and analytics workloads in the cloud.
Mapping data flows are visually designed data transformations in Azure Data Factory. Data flows allow data engineers to develop data transformation logic without writing code. The resulting data flows are executed as activities within Azure Data Factory pipelines that use scaled-out Apache Spark clusters. Data flow activities can be operationalized using existing Azure Data Factory scheduling, control flow, and monitoring capabilities.
Mapping data flows provide an entirely visual experience with no coding required. Your data flows run on ADF-managed execution clusters for scaled-out data processing. Azure Data Factory handles all the code translation, path optimization, and execution of your data flow jobs.
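For example, once a data flow is wired into a pipeline, you can trigger and monitor it programmatically. Below is a minimal sketch using the azure-mgmt-datafactory Python SDK; the resource group, factory, pipeline, and parameter names are placeholders for illustration only:

```python
# A minimal sketch, assuming a pipeline named "RunMyDataFlow" already
# exists in the factory and wraps a Mapping Data Flow activity.
# All resource names below are hypothetical - substitute your own.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

credential = DefaultAzureCredential()
adf_client = DataFactoryManagementClient(credential, "<subscription-id>")

# Trigger the pipeline; ADF spins up the Spark cluster, translates the
# visual data flow into Spark code, and executes it for you.
run = adf_client.pipelines.create_run(
    resource_group_name="my-rg",            # hypothetical
    factory_name="my-data-factory",         # hypothetical
    pipeline_name="RunMyDataFlow",          # hypothetical
    parameters={"inputFolder": "raw/2023"}  # optional pipeline parameters
)

# Poll the run status via the same SDK.
status = adf_client.pipeline_runs.get("my-rg", "my-data-factory", run.run_id)
print(status.status)  # e.g. "InProgress", "Succeeded"
```

The same run can of course also be started from a schedule or tumbling-window trigger rather than from code.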
Azure Databricks is based on Apache Spark and provides in-memory compute with language support for Scala, R, Python, and SQL. Data transformation and engineering can be done in notebooks, with statements in different languages. This makes it a flexible technology for including advanced analytics and machine learning as part of the data transformation process. You can also run each step of the process in its own notebook cell, so step-by-step debugging is easy, and you can watch the process during job execution, which makes it easy to see where a job stops.
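As an illustration, a typical transformation step in a notebook might look like the PySpark cell below; the paths and column names are hypothetical, and each cell can be run and inspected on its own, which is what makes the step-by-step debugging easy:

```python
# Hypothetical notebook cell: read, transform, and inspect in one step.
# `spark` is the SparkSession Databricks provides in every notebook.
from pyspark.sql import functions as F

# Read raw data from the lake (path is a placeholder).
orders = spark.read.format("delta").load("/mnt/datalake/raw/orders")

# Transformation step: filter, derive a date column, and aggregate.
daily_totals = (
    orders
    .filter(F.col("status") == "completed")
    .withColumn("order_date", F.to_date("order_timestamp"))
    .groupBy("order_date")
    .agg(F.sum("amount").alias("total_amount"))
)

# Inspect intermediate results cell by cell while debugging.
daily_totals.show(5)

# Write the result back for downstream consumption.
daily_totals.write.format("delta").mode("overwrite").save(
    "/mnt/datalake/curated/daily_totals"
)
```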
Azure Databricks clusters can be configured in a variety of ways, both in the number and in the type of compute nodes. Sizing a cluster correctly is something of an art, but you can get quite close by configuring the cluster to scale automatically within a defined threshold based on the workload. It can also be set to terminate automatically after a period of inactivity. When used with ADF, the cluster starts up when the activity starts, and parameters can be passed in from ADF and values returned back to it (see the sketch below). Azure Databricks also integrates closely with other Azure services, including Azure Active Directory, Key Vault, and storage options such as Blob Storage, Data Lake Storage, and SQL.
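To illustrate the parameter hand-off, here is a sketch of how a notebook can receive a value from ADF's Databricks Notebook activity (passed via baseParameters) and return a result to the pipeline; the parameter name, path, and return payload are hypothetical:

```python
import json

# `dbutils` is available in every Databricks notebook.
# ADF's Notebook activity passes its baseParameters in as widgets;
# "run_date" is a hypothetical parameter name for this sketch.
run_date = dbutils.widgets.get("run_date")

# Process the partition for the given date (path is a placeholder).
processed = spark.read.format("delta").load(f"/mnt/datalake/raw/{run_date}")
row_count = processed.count()

# dbutils.notebook.exit returns a string to ADF; it surfaces in the
# activity output and can feed downstream activities in the pipeline.
dbutils.notebook.exit(json.dumps({"rowCount": row_count}))
```

On the ADF side, the returned value is available in the activity output as runOutput and can be referenced by downstream activities.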
The biggest drawback of Databricks, in my mind, is that you must write code. Most BI developers are used to more graphical ETL tools like SSIS, Informatica, or similar, and there is a learning curve in switching to writing code instead. Many will say that poorly written code is very hard to maintain, but I’ve seen plenty of examples where graphical ETL isn’t easy to follow either.
Hope this helps. Do let us know if you have any further queries.
----------------------------------------------------------------------------------------
Do click on "Accept Answer" and upvote the post that helps you, as this can be beneficial to other community members.