Developer tools
Databricks provides an ecosystem of tools to help you develop applications and solutions that integrate with Azure Databricks and programmatically manage Databricks resources and data.
This article provides an overview of these tools and recommends which ones to use for common developer scenarios.
What tools does Databricks provide for developers?
The following table lists the developer tools that Databricks provides.
Tool | Description |
---|---|
Authentication and authorization | Configure authentication and authorization for your tools, scripts, and apps to work with Azure Databricks. |
Databricks Connect | Connect to Azure Databricks using popular integrated development environments (IDEs) such as PyCharm, IntelliJ IDEA, Eclipse, RStudio, and JupyterLab. If you are using Visual Studio Code, Databricks recommends the Databricks extension for Visual Studio Code, which is built on top of Databricks Connect and provides additional features that make configuration easier. |
Databricks extension for Visual Studio Code | Connect to your remote Azure Databricks workspaces from the Visual Studio Code integrated development environment (IDE). |
PyCharm Databricks plugin | Configure a connection to a remote Databricks workspace and run files on Databricks clusters from PyCharm. This plugin is developed and provided by JetBrains in partnership with Databricks. |
Databricks SDKs | Automate Azure Databricks from code libraries written for popular languages such as Python, Java, Go, and R. Instead of sending REST API calls directly using curl or Postman, you can use an SDK to interact with Databricks in a programming language of your choice. |
SQL drivers and tools | Connect to Azure Databricks to run SQL commands and scripts, interact programmatically with Azure Databricks, and integrate Azure Databricks SQL functionality into applications written in popular languages such as Python, Go, JavaScript, and TypeScript. |
Databricks CLI | Access Azure Databricks functionality using the Databricks command-line interface (CLI). The CLI wraps the Databricks REST API, so instead of sending REST API calls directly using curl or Postman, you can use the Databricks CLI to interact with Databricks. |
Databricks Asset Bundles | Implement industry-standard development, testing, and deployment (CI/CD) best practices for your Azure Databricks data and AI projects using Databricks Asset Bundles (DABs). |
Databricks Terraform provider and Terraform CDKTF for Databricks | Provision Azure Databricks infrastructure and resources using Terraform. |
Pulumi Databricks resource provider | Provision Azure Databricks infrastructure and resources using Pulumi infrastructure-as-code (IaC). |
CI/CD tools | Integrate popular CI/CD systems and frameworks such as GitHub Actions, Jenkins, and Apache Airflow. |
Tip
You can also connect many additional popular third-party tools to clusters and SQL warehouses to access data in Azure Databricks. See Technology partners.
Which developer tool should I use?
The following table outlines Databricks tool recommendations for common developer scenarios.
Scenarios | Recommendation |
---|---|
- Interactive development and debugging from a local IDE | - Databricks extension for Visual Studio Code<br>- PyCharm Databricks plugin<br>- For other IDEs, use the Databricks CLI with Databricks Connect |
- Direct interaction with Databricks from the command line<br>- Shell scripting<br>- Experimentation<br>- Invoke the REST API directly<br>- Manage local authentication profiles<br>- Sync code from the IDE to the Databricks workspace | Databricks CLI |
- Manage workflows and deploy projects to Databricks<br>- Apply CI/CD best practices<br>- Co-version, co-author, and co-deploy your resources and assets as one unit<br>- Supports the most common resources | Databricks Asset Bundles (a feature of the CLI) |
- Infrastructure as code and CI/CD<br>- Administer and create workspaces, catalogs, and metastores, and enforce permissions<br>- Guarantee environment portability and disaster recovery<br>- Many supported resources | Databricks Terraform provider |
- Application development<br>- Integrate with existing deployment systems<br>- Create custom Databricks workflows and new web services | - Databricks Python SDK<br>- Databricks Java SDK<br>- Databricks Go SDK<br>- Databricks R SDK |
- Advanced scenarios only<br>- Almost all Databricks resources are available | Databricks REST API |
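For the advanced scenario of invoking the REST API directly, the following is a minimal sketch using only the Python standard library. It assumes your workspace URL and a personal access token are available in the `DATABRICKS_HOST` and `DATABRICKS_TOKEN` environment variables, and uses the `/api/2.0/clusters/list` endpoint as an illustration; consult the REST API reference for the endpoints and versions available in your workspace.

```python
# Sketch: call the Databricks REST API directly with the standard library.
# Assumes DATABRICKS_HOST (workspace URL) and DATABRICKS_TOKEN (personal
# access token) are set; /api/2.0/clusters/list is used as an illustration.
import json
import os
import urllib.request

def clusters_list_request(host: str, token: str) -> urllib.request.Request:
    """Build an authenticated GET request for the cluster list endpoint."""
    return urllib.request.Request(
        f"{host.rstrip('/')}/api/2.0/clusters/list",
        headers={"Authorization": f"Bearer {token}"},
    )

def fetch_clusters() -> dict:
    """Send the request and decode the JSON response (requires network access)."""
    req = clusters_list_request(
        os.environ["DATABRICKS_HOST"], os.environ["DATABRICKS_TOKEN"]
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

This is exactly the boilerplate (URL construction, bearer-token headers, JSON decoding, plus error handling and paging not shown here) that the Databricks CLI and SDKs wrap for you, which is why the REST API is recommended only for advanced scenarios.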