Introduction

When you deploy Azure Databricks for data engineering workloads, understanding its architecture becomes essential for making informed decisions about data organization, security configurations, and resource management. The architecture determines where your data resides, how compute resources are allocated, and how different components interact to provide the unified analytics platform you rely on.

Azure Databricks organizes resources through a hierarchical structure that spans from account-level governance down to individual data objects. At the foundation, the control plane manages orchestration and configuration, while the compute plane processes your data. Compute runs either in serverless environments fully managed by Azure Databricks or as classic compute within your Azure subscription. This separation enables you to maintain security and governance while scaling workloads efficiently.

Storage options in Azure Databricks have evolved to meet different organizational needs. Default storage simplifies getting started by providing fully managed storage in serverless workspaces without configuration overhead. External storage connects Unity Catalog to your existing cloud storage accounts, enabling you to work with data managed outside Azure Databricks while maintaining governance. Unity Catalog managed storage lets you define where Unity Catalog stores data at the catalog or schema level, providing fine-grained control over data placement while Unity Catalog handles the lifecycle.
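To make the catalog-level and schema-level placement concrete, here's a minimal Python sketch of how a managed storage location might be resolved. It's an illustrative model only, not an Azure Databricks API: the `Catalog` and `Schema` classes, the `resolve_managed_location` function, and the `abfss://` paths are hypothetical names chosen for this example. The sketch assumes that a location set on a schema takes precedence over one set on its catalog, and it leaves out any fallback beyond the catalog.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Catalog:
    """Hypothetical model of a catalog with an optional managed location."""
    name: str
    managed_location: Optional[str] = None


@dataclass
class Schema:
    """Hypothetical model of a schema that belongs to one catalog."""
    name: str
    catalog: Catalog
    managed_location: Optional[str] = None


def resolve_managed_location(schema: Schema) -> Optional[str]:
    """Return the most specific location: the schema's, else the catalog's."""
    # Assumption: the more specific level (schema) overrides the catalog.
    if schema.managed_location is not None:
        return schema.managed_location
    # May be None if neither level defines a location in this simplified model.
    return schema.catalog.managed_location


# Placeholder storage paths, invented for illustration.
sales = Catalog(
    "sales",
    managed_location="abfss://managed@examplestorage.dfs.core.windows.net/sales",
)
raw = Schema("raw", sales)  # inherits the catalog's location
pii = Schema(
    "pii",
    sales,
    managed_location="abfss://restricted@examplestorage.dfs.core.windows.net/pii",
)

for schema in (raw, pii):
    print(f"{schema.catalog.name}.{schema.name} -> {resolve_managed_location(schema)}")
```

In this sketch, the `raw` schema inherits the catalog's location, while `pii` is placed in a separate container. That mirrors the kind of fine-grained data placement described above, with Unity Catalog still handling the data lifecycle.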

Throughout this module, you explore how these architectural components work together. You learn how the account hierarchy organizes resources across workspaces and metastores, how control and compute planes separate responsibilities, and how different storage patterns serve specific use cases. By understanding these fundamentals, you'll be equipped to configure Azure Databricks environments that align with your organization's security, governance, and operational requirements.
