databricks architecture

Question

Vineet S 1,390

How should the Databricks architecture be described: in terms of RDDs, or in an advanced architecture format like the one below?

https://learn.microsoft.com/en-us/azure/databricks/getting-started/overview

The modern architecture also mentions a classic compute plane. What is that?

Accepted answer

Answer 1

Azure Databricks operates with two key planes: Control Plane and Compute Plane, each serving distinct roles. As requested, here's a breakdown of the high-level and modern enterprise architecture of Azure Databricks:

1. Control Plane

Purpose: Handles backend services and management operations.
Scope: Azure Databricks manages this plane within its own cloud infrastructure.
Components:
- Web application interface: Used to manage and interact with Databricks resources.
- Metadata storage: Manages workspace configurations, job schedules, and cluster metadata.
- Cluster management services: Responsible for starting, stopping, and scaling compute clusters.
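
To make the control plane's role concrete, here is a minimal sketch of how a client might address the cluster management service through the workspace URL. It only builds the HTTP request and does not send it. The workspace URL and token below are hypothetical placeholders, and the `/api/2.0/clusters/list` path is assumed from the Databricks Clusters REST API; check the API reference for the version your workspace supports.

```python
import urllib.request


def build_list_clusters_request(workspace_url: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the workspace's cluster list.

    The request targets the workspace URL, which is served by the control plane;
    the clusters it describes run in a compute plane.
    """
    url = f"{workspace_url.rstrip('/')}/api/2.0/clusters/list"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


# Hypothetical workspace URL and token, for illustration only.
req = build_list_clusters_request(
    "https://adb-1234567890123456.7.azuredatabricks.net",
    "<personal-access-token>",
)
print(req.get_method(), req.full_url)
```

Sending the request (for example with `urllib.request.urlopen(req)`) would need a real workspace and a valid token.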

2. Compute Plane

This is where data processing happens. There are two types of compute planes based on how compute resources are deployed:

a. Classic Compute Plane

Location: Runs inside the customer’s Azure subscription and virtual network (VNet).
Isolation: Each customer’s environment is naturally isolated because all resources are confined to their subscription and VNet.
Use Case: Provides full control over network and security configurations. Ideal for organizations needing strict compliance or custom networking.
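Networking: By default, Azure Databricks creates and manages a VNet in the customer’s subscription. Private endpoints, firewalls, and a customer-managed VNet (VNet injection) are optional and must be configured by the customer.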
Examples:
- Standard Databricks clusters running in customer-managed networks.
- Integrating with other Azure services like Data Lake and Synapse Analytics within the same VNet.

b. Serverless Compute Plane

Location: Resources run in a shared, managed compute layer within Databricks’ environment.
Security: Provides isolation at the cluster and workspace levels, ensuring that customer data remains secure.
Use Case: Recommended for workloads that benefit from faster setup, lower operational overhead, and elasticity.
Networking: Simplifies network configuration by offloading management to Azure Databricks.
Examples:
- Ad-hoc analytics with minimal configuration requirements.
- Lightweight experimentation or proof-of-concept workloads.

Workspace Storage Account

Location: Created in the customer’s Azure subscription during workspace setup.
Purpose: Stores system data and files used within Databricks.
Content:
1. Workspace system data: Logs, command results, job run history, and notebook versions.
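2. DBFS (Databricks File System): A distributed file system accessible under the dbfs:/ namespace. Storing and accessing data through the DBFS root or DBFS mounts is a deprecated pattern.
3. Unity Catalog workspace catalog: The default workspace catalog, present if the workspace was enabled for Unity Catalog automatically.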
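
Since DBFS is deprecated as a place to store data, it can help to find where existing code still points at it. The sketch below flags paths that use the `dbfs:/` URI scheme or the `/dbfs/` local mount. The function names and the sample paths are hypothetical and only illustrate the idea.

```python
def uses_dbfs(path: str) -> bool:
    """Return True if the path points into DBFS (URI form or the /dbfs local mount)."""
    normalized = path.strip()
    return normalized.startswith("dbfs:/") or normalized.startswith("/dbfs/")


def find_dbfs_paths(paths: list[str]) -> list[str]:
    """Keep only the paths that still rely on DBFS."""
    return [p for p in paths if uses_dbfs(p)]


# Hypothetical sample paths, for illustration only.
sample = [
    "dbfs:/mnt/raw/events.json",
    "abfss://container@account.dfs.core.windows.net/raw/events.json",
    "/dbfs/tmp/model.pkl",
    "/Volumes/main/default/landing/events.json",
]
print(find_dbfs_paths(sample))
```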

Differences Between Classic and Serverless Compute Planes

Feature	Classic Compute Plane	Serverless Compute Plane
Location	Customer’s Azure subscription	Databricks-managed shared layer
Control	Full network and security control	Minimal management overhead
Network Config	Customer-managed; private endpoints and firewalls optional	Simplified networking
Use Case	Long-term, complex workloads	Fast, elastic workloads
Isolation	By customer VNet	By workspace and cluster boundaries

Use Cases for Classic Compute Plane

The Classic Compute Plane is ideal when:

- You need strict network control, such as using private endpoints, NSGs, or custom VNets.
- You must comply with security regulations requiring data to stay in a customer-controlled environment.
- There are dependencies on other Azure services running in the same VNet, like Azure Data Lake or Synapse Analytics.
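
The three criteria above can be sketched as a small decision helper. This is only an illustration of the reasoning, not an official sizing rule. The class, field, and function names are hypothetical, and a real decision would also weigh cost, region availability, and feature support.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkloadRequirements:
    """Hypothetical flags mirroring the three criteria listed above."""
    needs_strict_network_control: bool = False       # private endpoints, NSGs, custom VNets
    must_stay_in_customer_environment: bool = False  # compliance requirement
    depends_on_same_vnet_services: bool = False      # services reachable only inside the VNet


def suggest_compute_plane(req: WorkloadRequirements) -> tuple[str, list[str]]:
    """Return ("classic" or "serverless", reasons) based on the criteria above."""
    reasons = []
    if req.needs_strict_network_control:
        reasons.append("strict network control")
    if req.must_stay_in_customer_environment:
        reasons.append("data must stay in a customer-controlled environment")
    if req.depends_on_same_vnet_services:
        reasons.append("dependencies on services in the same VNet")
    if reasons:
        return "classic", reasons
    # None of the classic-specific criteria apply, so default to serverless.
    return "serverless", ["no classic-specific requirement"]


print(suggest_compute_plane(WorkloadRequirements(needs_strict_network_control=True)))
```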

This modern architecture allows organizations to choose between serverless (convenience and elasticity) and classic compute planes (control and compliance) based on their specific workload requirements.

hth

Marcin
