Configure high concurrency mode for Fabric notebooks in pipelines
When you run a notebook step in a pipeline, an Apache Spark session is started and used to run the queries submitted from the notebook. When you enable high concurrency mode for pipelines, the system automatically packs your notebooks into existing high concurrency Spark sessions, giving you session sharing across all notebooks within a single user boundary.
Note
Session sharing with high concurrency mode is always within a single user boundary. To share a single Spark session, notebooks must have matching Spark configurations, belong to the same workspace, and share the same default lakehouse and libraries.
For notebooks to share a single Spark session, they must:
- Be run by the same user.
- Have the same default lakehouse. Notebooks without a default lakehouse can share sessions with other notebooks that don't have a default lakehouse.
- Have the same Spark compute configurations (see the example session configuration after this list).
- Have the same library packages. Inline library installations in notebook cells can differ; notebooks with different inline library dependencies can still share a session.
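For example, each notebook can pin the same default lakehouse and Spark compute settings in a session configuration cell at the top of the notebook. The sketch below uses the Fabric `%%configure` magic command; the lakehouse name, the IDs, and the Spark property shown are placeholders for illustration, and the exact set of supported fields may vary by runtime.

```
%%configure
{
    "defaultLakehouse": {
        "name": "Sales_Lakehouse",
        "id": "<lakehouse-id>",
        "workspaceId": "<workspace-id>"
    },
    "conf": {
        "spark.sql.shuffle.partitions": "200"
    }
}
```

Notebooks that declare the same default lakehouse and equivalent Spark settings remain eligible to be packed into one shared session, while a mismatch in either means a separate session is used.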
Fabric workspace admins can enable high concurrency mode for pipelines in the workspace settings. Use the following steps to configure the feature:
1. Select the Workspace Settings option in your Fabric workspace.
2. Navigate to the Data Engineering and Science section > Spark Compute > High Concurrency.
3. In the High Concurrency section, enable the For pipeline running multiple notebooks setting.
Enabling this option runs all notebook sessions triggered by pipelines as high concurrency sessions.
The system automatically packs incoming notebook sessions into active high concurrency sessions. If there are no active high concurrency sessions, a new high concurrency session is created and the concurrently submitted notebooks are packed into it.
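If you prefer to script this setting rather than use the UI, workspace Spark settings can also be updated through the Fabric REST API. The following Python sketch is illustrative only: the spark/settings endpoint path and the highConcurrency field name used here are assumptions, and the call requires a valid Microsoft Entra access token, so confirm the names against the current API reference.

```python
import requests

# Illustrative sketch only: the endpoint path and the highConcurrency field
# names are assumptions and may not match the current Fabric REST API.
WORKSPACE_ID = "<workspace-id>"        # placeholder
ACCESS_TOKEN = "<entra-access-token>"  # placeholder Microsoft Entra token

url = f"https://api.fabric.microsoft.com/v1/workspaces/{WORKSPACE_ID}/spark/settings"
payload = {
    "highConcurrency": {
        # Assumed field enabling session sharing for pipeline-triggered notebooks.
        "notebookPipelineRunEnabled": True
    }
}

response = requests.patch(
    url,
    json=payload,
    headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
)
response.raise_for_status()
print("Updated workspace Spark settings:", response.status_code)
```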
To configure a session tag for notebook activities in a pipeline, use the following steps:
1. Open the Fabric workspace.
2. Create a pipeline item from the Create menu.
3. Navigate to the Activities tab in the menu ribbon and add a Notebook activity.
4. From Advanced settings, specify any string value for the Session tag property.
After the session tag is added, notebook activities that share the same session tag are bundled into the same high concurrency session.
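The session tag becomes part of the notebook activity definition in the pipeline. The sketch below models that definition as a Python dictionary for illustration; the activity type and property names (notably sessionTag) are assumptions, so verify them against the activity's JSON view in the pipeline editor.

```python
# Illustrative shape of a pipeline Notebook activity that carries a session tag.
# The type name and property names below are assumptions for this sketch;
# check the activity's JSON view in the pipeline editor for the real names.
notebook_activity = {
    "name": "Run sales notebook",
    "type": "TridentNotebook",           # assumed activity type name
    "typeProperties": {
        "notebookId": "<notebook-id>",    # placeholder
        "workspaceId": "<workspace-id>",  # placeholder
        "sessionTag": "nightly-etl",      # notebooks with the same tag share a session
    },
}

# Activities that use the same sessionTag value are bundled into one
# high concurrency Spark session when the pipeline runs them.
print(notebook_activity["typeProperties"]["sessionTag"])
```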
Monitoring and debugging can be challenging when multiple notebooks are running within a shared session. In high concurrency mode, log separation is provided, enabling you to trace logs from Spark events for each individual notebook.
While the session is in progress or after it completes, you can view its status by navigating to the Run menu and selecting the All Runs option.
This opens the run history of the notebook, listing the currently active and historic Spark sessions.
By selecting a session, you can access the monitoring detail view, which displays a list of all Spark jobs executed within that session.
For a high concurrency session, you can identify the jobs and their associated logs from different notebooks by using the Related notebook tab, which shows the notebook from which each job was run.
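When you trace logs back to a specific notebook, it can also help to emit an explicit marker from each notebook in the shared session. This is a minimal PySpark sketch, assuming the standard Spark session available in a Fabric notebook; the notebook label is a hypothetical value you set yourself.

```python
import logging

from pyspark.sql import SparkSession

# Minimal sketch: tag log output with a per-notebook label so that lines from
# different notebooks in the same shared session are easy to tell apart.
NOTEBOOK_LABEL = "ingest_orders"  # hypothetical label, set manually per notebook

# In a Fabric notebook a Spark session already exists; getOrCreate reuses it.
spark = SparkSession.builder.getOrCreate()

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(NOTEBOOK_LABEL)

# Every notebook packed into the same high concurrency session shares one Spark
# application, so the application ID alone does not identify the notebook.
app_id = spark.sparkContext.applicationId
logger.info("notebook=%s sharing Spark application %s", NOTEBOOK_LABEL, app_id)
```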
- To learn more about High Concurrency mode in Microsoft Fabric, see Overview on High Concurrency Mode in Microsoft Fabric
- To get started with High Concurrency mode for notebooks, see How to use High Concurrency Mode in Notebooks