Git integration with Databricks Repos

Learn how to integrate Git source control with Databricks Repos. To support best practices for data science and engineering code development, Databricks Repos provides repository-level integration with Git providers. You can develop code in an Azure Databricks notebook, sync it with a remote Git repository, and use Git commands for updates and source control.


Support for arbitrary files in Databricks Repos is now in Public Preview. For details, see Work with files in the UI and Import Python and R modules.

What can you do with Databricks Repos?

Databricks Repos provides source control for data and AI projects by integrating with Git providers.

In Databricks Repos, you can use Git functionality to:

  • Clone, push to, and pull from a remote Git repository.
  • Create and manage branches for development work.
  • Create notebooks, and edit notebooks and other files.
  • Visually compare differences upon commit.

For step-by-step instructions, see Work with notebooks and project files in Azure Databricks Repos.

For other tasks, work in your Git provider:

  • Create a pull request.
  • Resolve merge conflicts.
  • Merge or delete branches.
  • Rebase a branch.

Databricks Repos also has an API that you can integrate with your CI/CD pipeline. For example, you can programmatically update a Databricks repo so that it always has the most recent code version.
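As a rough illustration of that CI/CD use case, the sketch below builds a request that asks a workspace to check out the latest commit of a branch in an existing repo. It assumes the Repos REST API's PATCH /api/2.0/repos/{repo_id} endpoint with a JSON body naming the branch, and bearer-token authentication. The workspace URL, token, repo ID, and the helper names build_repo_update_request and update_repo are placeholders invented for this example, not part of the product.

```python
import json
import urllib.request


def build_repo_update_request(host, token, repo_id, branch):
    """Build (but do not send) a PATCH request that points a repo at `branch`.

    Assumption: the workspace exposes PATCH /api/2.0/repos/{repo_id} and
    accepts a personal access token as a bearer token.
    """
    url = f"{host.rstrip('/')}/api/2.0/repos/{repo_id}"
    body = json.dumps({"branch": branch}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def update_repo(host, token, repo_id, branch):
    """Send the update request and return the decoded JSON response."""
    request = build_repo_update_request(host, token, repo_id, branch)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical values; a real pipeline would read these from its secret store.
    req = build_repo_update_request(
        "https://example-workspace.azuredatabricks.net", "<token>", 123, "main"
    )
    print(req.get_method(), req.full_url)
```

A CI/CD job could call update_repo after each merge to the main branch so that the workspace copy tracks the remote. Keeping request construction separate from sending makes the logic easy to check without network access.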

For information about best practices for code development using Databricks Repos, see CI/CD workflows with Databricks Repos and Git integration.

Security and audit logging

Databricks Repos provides security features, such as allow lists that control access to Git repositories and detection of cleartext secrets in source code.

When audit logging is enabled, audit events are logged when you interact with a Databricks repo. For example, an audit event is logged when you create, update, or delete a Databricks repo, when you list all Databricks Repos associated with a workspace, and when you sync changes between your Databricks repo and the remote Git repo.

Supported Git providers

Azure Databricks supports these Git providers:

  • GitHub
  • Bitbucket Cloud
  • GitLab
  • Azure DevOps (not available in Azure China regions)
  • AWS CodeCommit
  • GitHub AE

Databricks Repos also supports integration with Bitbucket Server, GitHub Enterprise Server, or a GitLab self-managed subscription instance, provided the server is internet-accessible.

To integrate with a private Git server instance that is not internet-accessible, contact your Databricks representative.

Support for arbitrary files in Databricks Repos is available in Databricks Runtime 8.4 and above.