Why does Data Factory only recognize Python scripts from DBFS in Databricks, and not from Repos?

Bruna dos Santos Almeida 25 Reputation points
2023-09-20T19:07:48.2933333+00:00

When you create a Databricks Python script activity in a Data Factory pipeline, the only option for the script source is DBFS. Furthermore, if you have Python scripts in Repos, Data Factory only recognizes them if you change the activity to the notebook type.


Doc: https://learn.microsoft.com/en-us/azure/data-factory/transform-data-databricks-python#add-a-python-activity-for-azure-databricks-to-a-pipeline-with-ui
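If you need to keep using the Python activity today, one workaround is to stage the script from the repo into DBFS before the activity runs, so the activity can reference a dbfs: path. A minimal sketch, assuming a hypothetical repo script at /Workspace/Repos/me/my-repo/jobs/etl.py and a target of dbfs:/scripts/ (both paths are illustrative, and the exact local path layout depends on your workspace and runtime):

```python
# Run in a Databricks notebook cell (for example, triggered by a Notebook
# activity or a one-off job) to copy a script out of Repos into DBFS.
# Both paths below are illustrative assumptions, not values from this thread.

repo_script = "file:/Workspace/Repos/me/my-repo/jobs/etl.py"  # repo checkout on the driver's filesystem
dbfs_target = "dbfs:/scripts/etl.py"                          # path the ADF Python activity can reference

# dbutils is available inside Databricks notebooks; this copies the file into DBFS.
dbutils.fs.cp(repo_script, dbfs_target)

# Quick sanity check: print the first 200 bytes of the copied file.
print(dbutils.fs.head(dbfs_target, 200))
```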

Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

Accepted answer
  1. Amira Bedhiafi 20,571 Reputation points
    2023-09-20T19:10:36.38+00:00

    Databricks Repos is relatively new compared to DBFS. Microsoft and Databricks may still be improving the integration, and it can take time for new features to be supported in integrated products like ADF.

    The primary use case for ADF is data orchestration. Directly running scripts or notebooks is a common task, but resolving which version of a script to run from a Git repo would introduce complexities that don't align with that primary purpose.
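
    In the meantime, the behavior noted in the question (repo files being recognized when treated as notebooks) can be used as a bridge: point an ADF Databricks Notebook activity at a small wrapper notebook that lives in the repo and imports the Python code. A minimal sketch, assuming a hypothetical repo layout with a module jobs/etl.py exposing a main() function (all names are illustrative):

    ```python
    # Wrapper notebook stored in the repo (e.g. a hypothetical path like
    # /Repos/me/my-repo/run_etl). An ADF Databricks Notebook activity can point
    # at this repo path, which side-steps the DBFS-only restriction of the
    # Python activity.
    import os
    import sys

    # Assumption: the notebook's working directory is its folder in the repo
    # checkout, so sibling .py files become importable once it is on sys.path.
    # Adjust if your repo layout or runtime behaves differently.
    repo_dir = os.getcwd()
    if repo_dir not in sys.path:
        sys.path.append(repo_dir)

    # Hypothetical module and entry point kept as a plain Python file in the repo.
    from jobs.etl import main

    # Read a parameter passed from the ADF Notebook activity's base parameters.
    env = dbutils.widgets.get("env")

    main(env=env)
    ```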


0 additional answers
