Apache Spark workspace administration settings FAQ

This article lists answers to frequently asked questions about Apache Spark workspace administration settings.

How do I use the RBAC roles to configure my Spark workspace settings?

Use the Manage Access menu to add Admin permissions for specific users, distribution groups, or security groups. You can also use this menu to make changes to the workspace and to grant access to add, modify, or delete the Spark workspace settings.

Do changes made to the Spark properties at the environment level apply to active notebook sessions or scheduled Spark jobs?

When you make a configuration change at the workspace level, it's not applied to active Spark sessions, whether batch or notebook based. For the new settings to take effect, start a new notebook or batch session after you save the configuration.

Can I configure the node family, Spark runtime, and Spark properties at a capacity level?

Yes, you can change the runtime or manage the Spark properties by using the Data Engineering/Science settings on the capacity admin settings page. You need capacity admin access to view and change these capacity settings.

Can I choose different node families for different notebooks and Spark job definitions in my workspace?

Currently, you can select only the Memory Optimized node family for the entire workspace.

Can I configure these settings at a notebook level?

Yes, you can use %%configure to customize properties at the Spark session level in notebooks.
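
As a rough illustration, the following Python sketch assembles the text of a %%configure cell that sets one Spark property for the session. It assumes the magic accepts a JSON body with a "conf" object of Spark properties, so check that shape against the current notebook documentation. The helper name build_configure_cell and the property value are hypothetical and chosen only for illustration.

```python
import json


def build_configure_cell(spark_conf):
    """Return the text of a %%configure cell for the given Spark properties.

    Assumption: the cell body is JSON with a "conf" object that maps
    Spark property names to string values.
    """
    # Convert every value to a string, as in the assumed cell format.
    body = {"conf": {name: str(value) for name, value in spark_conf.items()}}
    return "%%configure\n" + json.dumps(body, indent=4)


# Hypothetical example value; pick settings that suit your workload.
cell_text = build_configure_cell({"spark.sql.shuffle.partitions": 200})
print(cell_text)
```

Paste the printed text into the first cell of a notebook and run it before any other cell, so that the properties are in place when the Spark session starts.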

Can I configure the minimum and maximum number of nodes for the selected node family?

Yes, you can choose the minimum and maximum number of nodes, within the maximum burst limits allowed by the Fabric capacity linked to the Fabric workspace.

Can I enable autoscaling for Spark pools in a memory-optimized or hardware-accelerated GPU-based node family?

Autoscaling is available for Spark pools. When you enable it, the system automatically scales up the compute based on the job stages during runtime. GPU-based node families are currently unavailable. This capability will be enabled in future releases.

Is Intelligent Caching for the Spark Pools supported or enabled by default for a workspace?

Intelligent Caching is enabled by default for the Spark pools for all workspaces.