data factory - databricks

arkiboys 9,621 Reputation points
2020-11-26T08:29:06.64+00:00

Hello,
I am in the process of learning Azure Data Factory.
1-
Is there anything that Data Factory/data flows cannot do, so that it has to be done in Databricks?
I am trying to find out whether, if I know Data Factory really well, there is any point in me learning Databricks.
2-
On another note, how do I set up this post so that I get an automated email notification when someone responds?
Thanks


Accepted answer
  1. PRADEEPCHEEKATLA-MSFT 77,081 Reputation points Microsoft Employee
    2020-11-26T10:59:46.577+00:00

    Hello @arkiboys,

    Is there anything that Data Factory/data flows cannot do, so that it has to be done in Databricks?

    Azure Data Factory data flows internally use Azure Databricks. Data flows handle orchestration, activities, and resource management, while Azure Databricks supplies the compute.

    I am trying to find out whether, if I know Data Factory really well, there is any point in me learning Databricks.

    Azure Data Factory and Azure Databricks are both Azure services, but they serve different purposes.

    What is Azure Data Factory?

    Azure Data Factory is a cloud-based ETL and data integration service that allows you to create data-driven workflows for orchestrating data movement and transforming data at scale. Using Azure Data Factory, you can create and schedule data-driven workflows (called pipelines) that ingest data from disparate data stores. You can build complex ETL processes that transform data visually with data flows or by using compute services such as Azure HDInsight Hadoop, Azure Databricks, and Azure SQL Database.
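
As a rough illustration of that split, here is a minimal sketch in Python that assembles the JSON for a pipeline with a single Databricks Notebook activity: Data Factory schedules and orchestrates, and the notebook runs on Databricks compute. The pipeline name, activity name, notebook path, linked service name, and the `run_date` parameter are all made-up placeholders, so treat this as an outline rather than a definition you can deploy as-is.

```python
import json


def build_pipeline(name, notebook_path, linked_service):
    """Return a pipeline definition with one Databricks Notebook activity."""
    return {
        "name": name,
        "properties": {
            "activities": [
                {
                    # Hypothetical activity name, chosen for this example.
                    "name": "RunTransformNotebook",
                    "type": "DatabricksNotebook",
                    # Points at an Azure Databricks linked service that is
                    # assumed to exist already in the data factory.
                    "linkedServiceName": {
                        "referenceName": linked_service,
                        "type": "LinkedServiceReference",
                    },
                    "typeProperties": {
                        "notebookPath": notebook_path,
                        # Hypothetical parameter passed to the notebook.
                        "baseParameters": {"run_date": "2020-11-26"},
                    },
                }
            ]
        },
    }


pipeline = build_pipeline(
    "DailyTransform",                # hypothetical pipeline name
    "/Shared/transform_sales",       # hypothetical notebook path
    "AzureDatabricksLinkedService",  # hypothetical linked service name
)
print(json.dumps(pipeline, indent=2))
```

The transformation logic itself lives in the notebook, which is why knowing Data Factory well does not remove the need to know Databricks when you go this route.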

    What is Azure Databricks?

    Azure Databricks is a data analytics platform optimized for the Microsoft Azure cloud services platform. Azure Databricks offers two environments for developing data-intensive applications: Azure Databricks SQL Analytics and Azure Databricks Workspace.

    Azure Databricks is an Apache Spark-based analytics service that allows you to build end-to-end machine learning and real-time analytics solutions. It offers all of the components and capabilities of Apache Spark and can be integrated with other Microsoft Azure services.

    Setting up a Spark cluster is straightforward with Azure Databricks, and you can reduce costs by letting the cluster autoscale and terminate automatically after a period of inactivity. It supports Python, Scala, R, and SQL, along with machine learning and deep learning libraries such as TensorFlow, PyTorch, and scikit-learn, for building big data analytics and AI solutions.
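
To make the autoscale and auto-termination options concrete, here is a small sketch in Python that builds a cluster specification of the kind the Databricks Clusters API accepts. The cluster name, runtime version, VM size, and worker counts are example values I picked for illustration; check what is available in your own workspace before using them.

```python
import json


def build_cluster_spec(name, min_workers, max_workers, idle_minutes):
    """Return a cluster spec that autoscales and shuts down when idle."""
    if min_workers > max_workers:
        raise ValueError("min_workers must not exceed max_workers")
    return {
        "cluster_name": name,
        # Example runtime version and VM size (assumptions, not advice).
        "spark_version": "7.3.x-scala2.12",
        "node_type_id": "Standard_DS3_v2",
        # The cluster grows and shrinks between these bounds with load.
        "autoscale": {
            "min_workers": min_workers,
            "max_workers": max_workers,
        },
        # The cluster terminates after this many idle minutes.
        "autotermination_minutes": idle_minutes,
    }


spec = build_cluster_spec("learning-cluster", 1, 4, 30)
print(json.dumps(spec, indent=2))
```

With these values the cluster keeps at least one worker, scales up to four under load, and stops after 30 minutes without activity.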

    On another note, how do I set up this post so that I get an automated email notification when someone responds?

    If you want a reminder to come back and check responses, here is how to subscribe to a notification.

    Hope this helps. Do let us know if you have any further queries.

    ------------

    • Please accept an answer if correct. Original posters help the community find answers faster by identifying the correct answer. Here is how.
