Databricks provides an ecosystem of tools to help you develop applications and solutions that integrate with Azure Databricks and programmatically manage Databricks resources and data.
This article provides an overview of these tools and recommends the best tool for common developer scenarios.
What tools does Databricks provide for developing locally?
The following table lists the developer tools that Databricks provides.
| Tool | Description |
| --- | --- |
| Databricks Connect | Connect to Azure Databricks using popular integrated development environments (IDEs) such as PyCharm, IntelliJ IDEA, Eclipse, RStudio, and JupyterLab. If you use Visual Studio Code, Databricks recommends the Databricks extension for Visual Studio Code, which is built on top of Databricks Connect and provides additional features that make configuration easier (see the Databricks Connect example after this table). |
| Databricks plugin for PyCharm | Configure a connection to a remote Databricks workspace and run files on Databricks clusters from PyCharm. This plugin is developed and provided by JetBrains in partnership with Databricks. |
| Databricks SDKs | Automate Azure Databricks from code libraries written for popular languages such as Python, Java, Go, and R. Instead of sending REST API calls directly using curl or Postman, you can use an SDK to interact with Databricks in the programming language of your choice. The Databricks SDKs support the complete REST API and provide other features, including unified authentication and pagination, that make them easy to use and extend to cover many scenarios (see the Python SDK example after this table). |
| Databricks SQL connectors and drivers | Connect to Azure Databricks to run SQL commands and scripts, interact programmatically with Azure Databricks, and integrate Azure Databricks SQL functionality into applications written in popular languages such as Python, Go, JavaScript, and TypeScript (see the SQL connector example after this table). |
| Databricks CLI | Access Azure Databricks functionality using the Databricks command-line interface (CLI). The CLI wraps the Databricks REST API, so instead of sending REST API calls directly using curl or Postman, you can use the Databricks CLI to interact with Databricks. |
| Databricks Asset Bundles | Implement industry-standard development, testing, and deployment (CI/CD) best practices for your Azure Databricks data and AI projects using Databricks Asset Bundles (DABs). |
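To make the Databricks Connect row concrete, here is a minimal Python sketch. It assumes the databricks-connect package is installed and that authentication details come from a local Databricks configuration profile or environment variables; the table name used is the Databricks sample dataset and is only a placeholder for your own data.

```python
# Minimal Databricks Connect sketch (assumes databricks-connect is installed
# and authentication is configured via a local profile or environment variables).
from databricks.connect import DatabricksSession

# Build a Spark session whose queries run on remote Databricks compute.
spark = DatabricksSession.builder.getOrCreate()

# "samples.nyctaxi.trips" is an example dataset; substitute your own table.
df = spark.read.table("samples.nyctaxi.trips")
df.select("trip_distance", "fare_amount").show(5)
```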
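As a sketch of the Databricks SDKs row, the following snippet uses the Databricks SDK for Python (the databricks-sdk package) to list the clusters in a workspace. It assumes unified authentication is already configured, for example through a .databrickscfg profile or environment variables.

```python
# Minimal Databricks SDK for Python sketch (assumes the databricks-sdk package
# is installed and unified authentication is configured, e.g. via ~/.databrickscfg).
from databricks.sdk import WorkspaceClient

# Picks up credentials from the environment or a configuration profile.
w = WorkspaceClient()

# List clusters in the workspace; the SDK handles pagination for you.
for cluster in w.clusters.list():
    print(cluster.cluster_name)
```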
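And as a sketch of the SQL connectors row, this example uses the Databricks SQL Connector for Python (the databricks-sql-connector package) to run a query against a SQL warehouse. The host name, HTTP path, and access token shown are placeholders you would replace with your own values.

```python
# Minimal Databricks SQL Connector for Python sketch (assumes the
# databricks-sql-connector package is installed). The connection values
# below are placeholders for your workspace host, warehouse HTTP path,
# and personal access token.
from databricks import sql

with sql.connect(
    server_hostname="adb-1234567890123456.7.azuredatabricks.net",
    http_path="/sql/1.0/warehouses/abcdef1234567890",
    access_token="<personal-access-token>",
) as connection:
    with connection.cursor() as cursor:
        cursor.execute("SELECT current_date()")
        print(cursor.fetchall())
```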
You can also connect many additional popular third-party tools to clusters and SQL warehouses to access data in Azure Databricks. See Technology partners.
Which developer tool should I use?
The following table outlines Databricks tool recommendations for common developer scenarios.
| Recommended tool | Developer scenarios |
| --- | --- |
| Databricks CLI | Direct interaction with Databricks from the command line; shell scripting; experimentation; invoking the REST API directly; managing local authentication profiles; syncing code from the IDE to the Databricks workspace. |
| Databricks Asset Bundles | Managing workflows and deploying projects to Databricks; applying CI/CD best practices; co-versioning, co-authoring, and co-deploying your resources and assets as one unit. Supports the most common resources. |
| Databricks Terraform provider | Infrastructure as code and CI/CD; administering and creating workspaces, catalogs, and metastores, and enforcing permissions; guaranteeing environment portability and disaster recovery. Many supported resources. |
| Databricks REST API | Automating processes where an SDK in your preferred programming language is not available; advanced scenarios only. Almost all Databricks resources are available (see the example after this table). |
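To make the REST API row concrete, here is a rough Python sketch that calls the Clusters API directly with the requests library. The workspace URL and token are placeholders, and the endpoint version shown (2.0) may differ from the one documented for your workspace.

```python
# Rough sketch of calling the Databricks REST API directly from Python.
# The host and token values are placeholders; authentication uses a
# personal access token sent as a bearer token.
import requests

host = "https://adb-1234567890123456.7.azuredatabricks.net"  # placeholder workspace URL
token = "<personal-access-token>"  # placeholder

response = requests.get(
    f"{host}/api/2.0/clusters/list",
    headers={"Authorization": f"Bearer {token}"},
)
response.raise_for_status()

# Print the name of each cluster returned by the API.
for cluster in response.json().get("clusters", []):
    print(cluster["cluster_name"])
```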