Overview of Apache Spark compute in Microsoft Fabric

Applies to: ✅ Fabric Data Engineering and Data Science

Fabric Data Engineering and Data Science run on a fully managed Apache Spark compute platform. Starter pools provide fast session startup, typically in 5 to 10 seconds, with no manual setup. Custom Spark pools let you tune node size, scaling behavior, and other compute settings for your workload. In short, starter pools provide fast, preconfigured Spark, while custom Spark pools provide deeper control and flexibility.

Starter pools

Starter pools are a fast and easy way to use Spark on the Microsoft Fabric platform within seconds. You can use Spark sessions right away, instead of waiting for Spark to set up the nodes for you, which helps you do more with data and get insights quicker.

Starter pools have Apache Spark clusters with sessions that are always on and ready for your requests. They use medium nodes that dynamically scale up based on your Spark job needs.

When you use a starter pool without any extra library dependencies or custom Spark properties, your session typically starts in 5 to 10 seconds. This fast startup is possible because the cluster is already running and doesn't require provisioning time.

Important

Starter pools are a Microsoft-managed, best-effort optimization that reduces Spark startup time by using prewarmed capacity. Starter pool capacity isn't guaranteed for every run. When prewarmed capacity is available, sessions can typically start in seconds. When it isn't, Fabric starts the session by using standard on-demand capacity, which can take longer. For workloads that need consistent, predictable session start, use a custom live pool.

Note

Starter pools support only Medium node size. If you select a different node size or customize compute configurations, Fabric uses on-demand session startup, which can take 2 to 5 minutes.

Several scenarios can make your session take longer to start:

Custom libraries or Spark properties: If you've configured libraries or custom settings in your environment, Spark has to personalize the session once it's created. The additional time depends on your library publishing mode:

Quick mode: Libraries install at session start. Expect an additional 30 seconds to 5 minutes, depending on the number and size of your dependencies.
Full mode: The environment snapshot deploys at session start, typically adding 1 to 3 minutes.
Full mode with a custom live pool: The snapshot is already preinstalled on hydrated clusters, so library personalization adds minimal overhead and sessions can start in approximately 5 seconds.

Note

The notebook Resources folder and inline library installation commands (such as %pip install) are manual, per-session approaches. They aren't affected by environment publishing and always install during the active session.

Starter pools in your region are fully used: In rare cases, a region's starter pools might be temporarily exhausted due to high traffic. When that happens, Fabric spins up a new cluster to accommodate your request, which takes about 2 to 5 minutes. Once the new cluster is available, your session starts. If you also have custom libraries to install, add the additional 30 seconds to 5 minutes required for personalization.

Advanced networking or security features (Private Links or Managed VNets): When your workspace has networking features such as Tenant Private Links or Managed VNets, starter pools aren't supported. In this situation, Fabric must create a cluster on demand, which adds 2 to 5 minutes to your session start time. If you also have library dependencies, that personalization step can add another 30 seconds to 5 minutes.

Tip

When you need predictable, fast session starts - for example, for scheduled Spark job definitions or other latency-sensitive workloads - use a custom live pool instead of relying on starter pool capacity. Custom live pools keep dedicated clusters warm on a schedule that you define (the active window), so sessions start consistently in approximately 5 seconds during that window. Because the clusters are hydrated in advance, your environment libraries are already preinstalled on the cluster, which removes the per-session library personalization time.

Here are a few example scenarios to illustrate potential start times:

Scenario	Typical Startup Time
Default settings, no libraries	5 – 10 seconds
Default settings + library dependencies	5 – 10 seconds + 30 seconds – 5 min (for library setup)
High traffic in region, no libraries	2 – 5 minutes
High traffic + library dependencies	2 – 5 minutes + 30 seconds – 5 min (for libraries)
Network security (Private Links/VNet), no libraries	2 – 5 minutes
Network security + library dependencies	2 – 5 minutes + 30 seconds – 5 min (for libraries)
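The table rows combine a base startup range with an optional library setup range. A minimal sketch of that arithmetic follows, with durations in seconds taken from the table. The names `Range` and `estimate_startup` are hypothetical helpers for illustration, not part of any Fabric API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """Inclusive duration range in seconds."""
    low: int
    high: int

    def __add__(self, other: "Range") -> "Range":
        # Adding two ranges adds their lower and upper bounds.
        return Range(self.low + other.low, self.high + other.high)


# Ranges copied from the table above.
PREWARMED = Range(5, 10)        # starter pool with default settings
ON_DEMAND = Range(120, 300)     # high traffic or network security: 2 to 5 minutes
LIBRARY_SETUP = Range(30, 300)  # library personalization: 30 seconds to 5 minutes
NO_EXTRA = Range(0, 0)


def estimate_startup(on_demand: bool, has_libraries: bool) -> Range:
    """Estimate the typical startup range for one scenario in the table."""
    base = ON_DEMAND if on_demand else PREWARMED
    extra = LIBRARY_SETUP if has_libraries else NO_EXTRA
    return base + extra
```

For example, `estimate_startup(on_demand=True, has_libraries=True)` yields a range of 150 to 600 seconds, which corresponds to the "2 – 5 minutes + 30 seconds – 5 min" rows.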

You're charged for capacity consumption from the moment your notebook or Apache Spark job definition starts executing. You aren't charged for the time the clusters are idle in the pool.

For example, if you submit a notebook job to a starter pool, you're billed only for the time period where the notebook session is active. The billed time doesn't include idle time or the time taken to personalize the session with the Spark context. To learn more, see Configure starter pools in Fabric.

Spark pools

A Spark pool is a way of telling Spark what kind of resources you need for your data analysis tasks. You can give your Spark pool a name, and choose how many and how large the nodes (the machines that do the work) are. You can also tell Spark how to adjust the number of nodes depending on how much work you have. Creating a Spark pool is free; you only pay when you run a Spark job on the pool, and then Spark sets up the nodes for you.

If you don't use your Spark pool for 2 minutes after your session expires, the pool is deallocated. The default session expiration period is 20 minutes, and you can change it.

If you're a workspace admin, you can also create custom Spark pools for your workspace and make them the default option for other users. This way, you save time and avoid setting up a new Spark pool every time you run a notebook or a Spark job. Custom Spark pools take about three minutes to start, because Spark must get the nodes from Azure. The exception is when you use a custom Spark pool configured as a custom live pool with a Full mode environment; in that case, sessions can start in approximately 5 seconds because the cluster is already hydrated with your library snapshot.
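The session expiry and pool deallocation timing described in this section can be sketched as simple date arithmetic. This sketch assumes the expiration period is measured from the last activity in the session, which the text doesn't state explicitly. The function name `pool_deallocation_time` is hypothetical.

```python
from datetime import datetime, timedelta

DEFAULT_SESSION_EXPIRY = timedelta(minutes=20)  # default from the text; configurable
POOL_IDLE_GRACE = timedelta(minutes=2)          # unused time after session expiry


def pool_deallocation_time(
    last_activity: datetime,
    session_expiry: timedelta = DEFAULT_SESSION_EXPIRY,
) -> datetime:
    """Return when an unused pool would be deallocated.

    Assumption: the session expires `session_expiry` after the last activity,
    and the pool is deallocated after a further 2 minutes without use.
    """
    session_expires_at = last_activity + session_expiry
    return session_expires_at + POOL_IDLE_GRACE
```

Under these assumptions, a session whose last activity was at 09:00 would expire at 09:20 with the default setting, and the pool would be deallocated at 09:22 if nothing else uses it.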

You can even create single-node Spark pools by setting the minimum number of nodes to one. The driver and executor then run on a single node that comes with restorable high availability (HA) and is suited for small workloads.

The size and number of nodes you can have in your custom Spark pool depend on your Microsoft Fabric capacity. Capacity is a measure of how much computing power you can use. One way to think about it is that two Apache Spark vCores (a unit of Spark compute) equal one capacity unit.

Note

In Apache Spark, users get two Apache Spark vCores for every capacity unit they reserve as part of their SKU. One capacity unit = two Spark vCores. For example, F64 gives 128 Spark vCores, and a 3x burst multiplier increases this value to 384 Spark vCores.

For example, a Fabric capacity SKU F64 has 64 capacity units, which provide 128 Spark vCores, or up to 384 Spark vCores with the 3x burst multiplier (64 * 2 * 3). You can use these Spark vCores to create nodes of different sizes for your custom Spark pool, as long as the total number of Spark vCores doesn't exceed 384.
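The conversion from capacity units to Spark vCores can be written as a one-line calculation. The sketch below uses the two-vCores-per-capacity-unit ratio and the 3x burst multiplier from the F64 example. The function name `spark_vcores` is hypothetical, and the multiplier is a parameter because this article only states the value for F64.

```python
SPARK_VCORES_PER_CAPACITY_UNIT = 2  # ratio stated in this article


def spark_vcores(capacity_units: int, burst_multiplier: int = 1) -> int:
    """Return the Spark vCores available for a number of capacity units."""
    if capacity_units <= 0 or burst_multiplier <= 0:
        raise ValueError("capacity_units and burst_multiplier must be positive")
    return capacity_units * SPARK_VCORES_PER_CAPACITY_UNIT * burst_multiplier


base_f64 = spark_vcores(64)                       # 128 Spark vCores
burst_f64 = spark_vcores(64, burst_multiplier=3)  # 384 Spark vCores
```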

Spark pools are billed like starter pools; you don't pay for the custom Spark pools that you have created unless you have an active Spark session created for running a notebook or Spark job definition. You're only billed for the duration of your job runs. You aren't billed for stages like the cluster creation and deallocation after the job is complete.

For example, if you submit a notebook job to a custom Spark pool, you're only charged for the time period when the session is active. The billing for that notebook session stops once the Spark session has stopped or expired. You aren't charged for the time taken to acquire cluster instances from the cloud or for the time taken for initializing the Spark context.
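To illustrate which parts of a run count toward billing, the following sketch sums only the phases this section describes as billable. The phase durations are made-up illustration values, and `Phase` and `billed_seconds` are hypothetical names. Actual billing is calculated by Fabric, not by code like this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    name: str
    seconds: int
    billable: bool


def billed_seconds(phases: list[Phase]) -> int:
    """Sum the duration of the billable phases only."""
    return sum(p.seconds for p in phases if p.billable)


# Hypothetical timeline for one notebook run on a custom Spark pool.
run = [
    Phase("acquire cluster instances", 180, billable=False),
    Phase("initialize Spark context", 20, billable=False),
    Phase("active session", 900, billable=True),
    Phase("deallocation after the job", 30, billable=False),
]

total = billed_seconds(run)  # only the 900-second active session counts
```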

The following table shows possible custom pool configurations for F64, based on the previous example. Smaller node sizes spread the capacity across more nodes, so the maximum number of nodes is higher. Larger nodes are resource-rich, so fewer nodes are needed:

Fabric capacity SKU	Capacity units	Max Spark vCores with burst factor	Node size	Max number of nodes
F64	64	384	Small	96
F64	64	384	Medium	48
F64	64	384	Large	24
F64	64	384	X-Large	12
F64	64	384	XX-Large	6
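The maximum node counts in the table follow from dividing the 384 Spark vCores by the vCores per node. The sketch below reproduces the column, using the per-node vCore counts listed in the Node sizes section of this article. The names `NODE_VCORES` and `max_nodes` are hypothetical.

```python
# vCores per node, as listed in the Node sizes section of this article.
NODE_VCORES = {
    "Small": 4,
    "Medium": 8,
    "Large": 16,
    "X-Large": 32,
    "XX-Large": 64,
}


def max_nodes(max_spark_vcores: int, node_size: str) -> int:
    """Return how many whole nodes of the given size fit in the vCore budget."""
    return max_spark_vcores // NODE_VCORES[node_size]


F64_MAX_VCORES = 384  # F64 with the burst factor, from the table above
limits = {size: max_nodes(F64_MAX_VCORES, size) for size in NODE_VCORES}
```

The resulting `limits` dictionary matches the table: 96, 48, 24, 12, and 6 nodes for Small through XX-Large.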

Note

To create custom pools, you need Admin permissions for the workspace. The Microsoft Fabric capacity admin must also grant permissions that allow workspace admins to size custom Spark pools. To learn more, see Get started with custom Spark pools in Fabric.

Nodes

An Apache Spark pool instance consists of one head node and one or more worker nodes. A Spark instance can start with a minimum of one node. The head node runs management services such as Livy, YARN Resource Manager, ZooKeeper, and the Apache Spark driver. All nodes run services such as Node Agent and YARN Node Manager. All worker nodes run the Apache Spark Executor service.

Note

In Fabric, the ratio of nodes to executors is always 1:1. When you set up a pool, one node is dedicated to the driver, and the remaining nodes are used for the executors. The only exception is in a single-node configuration, where the resources for both the driver and the executor are halved.
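The node-to-executor rule in the note can be sketched as follows. The sketch reads "halved" as the driver and the executor each receiving half of the single node's vCores and memory, which is one interpretation of the note. It also ignores any resources reserved for system services. The function name `pool_layout` is hypothetical.

```python
def pool_layout(node_count: int, node_vcores: int, node_memory_gb: int) -> dict:
    """Return a simplified driver/executor layout for a pool of identical nodes.

    The "driver" and "executor" entries are (vCores, memory in GB) tuples.
    """
    if node_count < 1:
        raise ValueError("a pool needs at least one node")
    if node_count == 1:
        # Single-node pool: the driver and executor share the node, so this
        # sketch gives each of them half of the node's resources.
        share = (node_vcores // 2, node_memory_gb // 2)
        return {"executors": 1, "driver": share, "executor": share}
    # Multi-node pool: one node for the driver, one executor per remaining node.
    full = (node_vcores, node_memory_gb)
    return {"executors": node_count - 1, "driver": full, "executor": full}
```

With Medium nodes (8 vCores, 64 GB), `pool_layout(5, 8, 64)` reports four executors, while `pool_layout(1, 8, 64)` reports one executor with 4 vCores and 32 GB for each of the driver and the executor.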

Node sizes

A Spark pool can be defined with node sizes that range from a small compute node (with 4 vCores and 32 GB of memory) to a double extra large compute node (with 64 vCores and 512 GB of memory per node). Node sizes can be altered after pool creation, although the active session must be restarted.

Size	vCore	Memory
Small	4	32 GB
Medium	8	64 GB
Large	16	128 GB
X-Large	32	256 GB
XX-Large	64	512 GB

Note

Node sizes X-Large and XX-Large are only allowed for non-trial Fabric SKUs.

Autoscale

Autoscale for Apache Spark pools allows automatic scale up and down of compute resources based on the amount of activity. When you enable the autoscale feature, you set the minimum and maximum number of nodes to scale. When you disable the autoscale feature, the number of nodes set remains fixed. You can alter this setting after pool creation, although you might need to restart the instance.

Note

By default, spark.yarn.executor.decommission.enabled is set to true, which enables the automatic shutdown of underutilized nodes to optimize compute efficiency. If you prefer less aggressive scale-down, set this configuration to false.

Dynamic allocation

Dynamic allocation allows the Apache Spark application to request more executors if the tasks exceed the load that current executors can bear. It also releases executors when the jobs are completed and when the Spark application moves to an idle state. Enterprise users often find it hard to tune executor configurations because they differ vastly across the stages of a Spark job execution process. These configurations also depend on the volume of data processed, which changes from time to time. You can enable the dynamic allocation of executors option as part of the pool configuration, which automatically allocates executors to the Spark application based on the nodes available in the Spark pool.

When you enable the dynamic allocation option, the system reserves executors for every submitted Spark application during the job submission step, based on the minimum nodes. You specify the maximum nodes to support successful automatic scale scenarios.
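The effect of the minimum and maximum settings can be pictured as clamping the number of executors an application needs into a configured range. The sketch below is a deliberately simplified model under that assumption. It isn't the scheduling algorithm that Spark or Fabric actually uses, and the function name `target_executors` is hypothetical.

```python
def target_executors(needed: int, min_executors: int, max_executors: int) -> int:
    """Clamp the executors a workload needs into the configured range.

    Simplified model: at least `min_executors` stay reserved, and the
    application never receives more than `max_executors`.
    """
    if not 0 <= min_executors <= max_executors:
        raise ValueError("expected 0 <= min_executors <= max_executors")
    return max(min_executors, min(needed, max_executors))
```

In this model, an idle application with a range of 1 to 9 keeps one executor, a stage that needs 5 executors gets 5, and a stage that needs 20 is capped at 9.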

Last updated on 2026-07-01
