Applies to: SQL analytics endpoint and Warehouse in Microsoft Fabric
This article describes the architecture and workload management behind data warehousing in Microsoft Fabric.
The Warehouse and SQL analytics endpoint share the same underlying processing architecture. As data is retrieved or ingested, both use a distributed engine built for both small and large-scale data and computational functions.
The processing system is serverless in that backend compute capacity scales up and down autonomously to meet workload demands.
When a query is submitted, the SQL frontend (FE) performs query optimization to determine the best plan based on the data size and complexity. Once the plan is generated, it is handed to the Distributed Query Processing (DQP) engine. The DQP orchestrates distributed execution of the query by splitting it into smaller queries that are executed on backend compute nodes. Each smaller query is called a task and represents a distributed execution unit. A task reads files from OneLake, joins results from other tasks, and groups or orders data retrieved from other tasks. For ingestion jobs, tasks also write data to the proper destination tables.
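As an illustrative sketch only (not the actual DQP implementation), the distributed plan for a join-and-aggregate query can be modeled as a directed acyclic graph of tasks, where scan tasks read files from OneLake and downstream tasks consume their results. The task names below are invented for the example:

```python
from graphlib import TopologicalSorter

# Hypothetical task DAG for a query that joins two tables and aggregates.
# Each key is a task; its value is the set of tasks it depends on.
dag = {
    "scan_sales": set(),                   # read sales files from OneLake
    "scan_items": set(),                   # read items files from OneLake
    "join": {"scan_sales", "scan_items"},  # join results of the two scans
    "aggregate": {"join"},                 # group/order the joined rows
}

# A valid execution order runs each task only after its dependencies.
order = list(TopologicalSorter(dag).static_order())
print(order)  # both scans first, then join, then aggregate
```

Because the two scan tasks have no dependency on each other, a scheduler is free to run them in parallel on different backend nodes.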
When data is processed, results are returned to the SQL frontend for serving back to the user or calling application.
Elasticity and resiliency
Backend compute capacity benefits from a fast provisioning architecture. Although there is no SLA on resource assignment, new nodes are typically acquired within a few seconds. As resource demand increases, new workloads use the scaled-out capacity. Scaling is an online operation and query processing goes uninterrupted.
The system is fault tolerant and if a node becomes unhealthy, operations executing on the node are redistributed to healthy nodes for completion.
Warehouse and SQL analytics endpoint provide burstable capacity, which allows workloads to use more resources to achieve better performance, and smoothing, which offers relief to customers whose workloads spike suddenly at peak times while leaving substantial capacity idle at other times. Smoothing simplifies capacity management by spreading the evaluation of compute usage over time, helping ensure that customer jobs run smoothly and efficiently.
Scheduling and resourcing
The distributed query processing scheduler operates at a task level. Queries are represented to the scheduler as a directed acyclic graph (DAG) of tasks. This concept is familiar to Spark users. A DAG allows for parallelism and concurrency as tasks that do not depend on each other can be executed simultaneously or out of order.
As queries arrive, their tasks are scheduled based on first-in-first-out (FIFO) principles. If there is idle capacity, the scheduler might use a "best fit" approach to optimize concurrency.
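A minimal sketch of FIFO scheduling with an opportunistic "best fit" pass might look like the following. All names, the slot model, and the completion behavior are assumptions made for illustration; the real scheduler is far more sophisticated:

```python
from collections import deque

def schedule(tasks, slots):
    """FIFO scheduling sketch: tasks queue in arrival order; when capacity
    is free, the oldest task runs. If the head task does not fit the idle
    capacity, a 'best fit' pass may start a later task that does fit."""
    queue = deque(tasks)       # (name, slots_needed) in arrival order
    running, order = [], []
    free = slots
    while queue:
        head, need = queue[0]
        if need <= free:
            queue.popleft()
            free -= need
            order.append(head)
            running.append((head, need))
        else:
            # Best fit: start the first queued task that fits idle capacity.
            fit = next((t for t in queue if t[1] <= free), None)
            if fit is None:
                # Nothing fits: simulate the oldest running task finishing.
                _, freed = running.pop(0)
                free += freed
            else:
                queue.remove(fit)
                free -= fit[1]
                order.append(fit[0])
                running.append(fit)
    return order

# q2 needs more capacity than is idle, so the smaller q3 starts first.
print(schedule([("q1", 3), ("q2", 4), ("q3", 1)], slots=4))
# ['q1', 'q3', 'q2']
```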
When the scheduler identifies resourcing pressure, it invokes a scale operation. Scaling is managed autonomously and backend topology grows as concurrency increases. As it takes a few seconds to acquire nodes, the system is not optimized for consistent subsecond performance of queries that require distributed processing.
When pressure subsides, backend topology scales back down and releases resources back to the region.
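The scale-out and scale-in behavior described above can be sketched as a simple sizing policy. The thresholds, node counts, and function name here are invented for the example and do not reflect the actual service configuration:

```python
def target_nodes(queued_tasks, tasks_per_node=8, min_nodes=1, max_nodes=32):
    """Illustrative autoscaling policy: grow the backend topology when
    queued work exceeds what current nodes can absorb, and shrink it
    (releasing resources back to the region) when pressure subsides."""
    needed = max(min_nodes, -(-queued_tasks // tasks_per_node))  # ceil division
    return min(max_nodes, needed)

print(target_nodes(100))   # pressure: scale out to 13 nodes
print(target_nodes(0))     # pressure subsided: scale in to the 1-node minimum
```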
Applies to: Warehouse in Microsoft Fabric
In the backend compute pool of Warehouse in Microsoft Fabric, loading activities are provided resource isolation from analytical workloads. This improves performance and reliability, as ingestion jobs can run on dedicated nodes that are optimized for ETL and do not compete with other queries or applications for resources.
The Microsoft Fabric workspace provides a natural isolation boundary of the distributed compute system. Workloads can take advantage of this boundary to manage both cost and performance.
OneLake shortcuts can be used to create read-only replicas of tables in other workspaces to distribute load across multiple SQL engines, creating an isolation boundary. This can effectively increase the maximum number of sessions performing read-only queries.
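As a hypothetical sketch of how an application might spread read-only sessions across such replicas, a client could rotate connections among the SQL endpoints of each workspace. The endpoint names below are placeholders, not real connection strings:

```python
from itertools import cycle

# Hypothetical endpoints: the primary warehouse plus shortcut-based
# read-only replicas of its tables in other workspaces.
endpoints = cycle([
    "workspace-a.example-fabric-endpoint",
    "workspace-b.example-fabric-endpoint",
])

def next_read_endpoint():
    """Round-robin read-only sessions across replicas, raising the
    total number of concurrent read sessions the workloads can use."""
    return next(endpoints)
```

Each workspace's SQL engine then serves only its share of the read traffic, which is the isolation boundary the paragraph above describes.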