Attach and manage a Synapse Spark pool in Azure Machine Learning

APPLIES TO: Azure CLI ml extension v2 (current), Python SDK azure-ai-ml v2 (current)

In this article, you learn how to attach a Synapse Spark pool in Azure Machine Learning. You can attach the pool in one of these ways:

  • Using Azure Machine Learning studio UI
  • Using Azure Machine Learning CLI
  • Using Azure Machine Learning Python SDK

Prerequisites

Attach a Synapse Spark pool in Azure Machine Learning

Azure Machine Learning offers different ways to attach and manage a Synapse Spark pool.

To attach a Synapse Spark pool from the Compute tab in Azure Machine Learning studio:

Screenshot showing creation of a new Synapse Spark Pool.

  1. In the Manage section of the left pane, select Compute.
  2. Select Attached computes.
  3. On the Attached computes screen, select New, to see the options for attaching different types of computes.
  4. Select Synapse Spark pool.

The Attach Synapse Spark pool panel opens on the right side of the screen. In this panel:

  1. Enter a Name. This name refers to the attached Synapse Spark pool inside the Azure Machine Learning resource.

  2. Select an Azure Subscription from the dropdown menu.

  3. Select a Synapse workspace from the dropdown menu.

  4. Select a Spark Pool from the dropdown menu.

  5. Toggle the Assign a managed identity option to enable it.

  6. Select the managed identity type to use with this attached Synapse Spark pool.

  7. Select Update to complete the attach process.
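The same attach operation can also be scripted with the Python SDK. The following is a minimal sketch, not a definitive implementation: it assumes the `azure-ai-ml` package is installed and that `ml_client` is an authenticated `MLClient` for your workspace. The helper names `synapse_pool_resource_id` and `attach_synapse_pool` are hypothetical and used only for illustration.

```python
def synapse_pool_resource_id(subscription_id, resource_group, synapse_workspace, spark_pool):
    """Build the Azure resource ID of a Synapse Spark pool from its parts."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Synapse/workspaces/{synapse_workspace}"
        f"/bigDataPools/{spark_pool}"
    )


def attach_synapse_pool(ml_client, name, resource_id):
    """Attach the pool with a system-assigned identity (assumes azure-ai-ml is installed)."""
    # Imported lazily so the pure helper above works without the SDK.
    from azure.ai.ml.entities import IdentityConfiguration, SynapseSparkCompute

    compute = SynapseSparkCompute(
        name=name,  # name of the attached compute inside Azure Machine Learning
        resource_id=resource_id,
        identity=IdentityConfiguration(type="system_assigned"),
    )
    # begin_create_or_update returns a poller; result() waits for completion.
    return ml_client.compute.begin_create_or_update(compute).result()
```

The values passed to `synapse_pool_resource_id` correspond to the subscription, Synapse workspace, and Spark pool chosen from the dropdown menus in the steps above, plus the resource group that contains the Synapse workspace.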

Add role assignments in Azure Synapse Analytics

For the attached Synapse Spark pool to work properly, assign it the Synapse Administrator role from the Azure Synapse Analytics studio UI. Follow these steps:

  1. Open your Synapse Workspace in Azure portal.

  2. In the left pane, select Overview.

    Screenshot showing Open Synapse Studio.

  3. Select Open Synapse Studio.

  4. In the Azure Synapse Analytics studio, select Manage in the left pane.

  5. In the Security section of the second pane from the left, select Access Control.

  6. Select Add.

  7. The Add role assignment panel opens on the right side of the screen. In this panel:

    1. Select Workspace item for Scope.

    2. In the Item type dropdown menu, select Apache Spark pool.

    3. In the Item dropdown menu, select your Apache Spark pool.

    4. In the Role dropdown menu, select Synapse Administrator.

    5. In the Select user search box, start typing the name of your Azure Machine Learning workspace. A list of attached Synapse Spark pools appears. Select the Synapse Spark pool you want from the list.

    6. Select Apply.

      Screenshot showing Add Role Assignment.

Update the Synapse Spark pool

You can manage the attached Synapse Spark pool from the Azure Machine Learning studio UI. Management includes updating the managed identity associated with an attached Synapse Spark pool. You can assign a system-assigned or a user-assigned identity when you update the pool. Create the user-assigned managed identity in the Azure portal before you assign it to a Synapse Spark pool.

To update the managed identity for the attached Synapse Spark pool:

Screenshot showing Synapse Spark Pool managed identity update.

  1. Open the Details page for the Synapse Spark pool in the Azure Machine Learning studio.

  2. Find the edit icon, located on the right side of the Managed identity section.

  3. To assign a managed identity for the first time, toggle Assign a managed identity to enable it.

  4. To assign a system-assigned managed identity:

    1. Select System-assigned as the Identity type.
    2. Select Update.
  5. To assign a user-assigned managed identity:

    1. Select User-assigned as the Identity type.
    2. Select an Azure Subscription from the dropdown menu.
    3. Type the first few letters of the name of the user-assigned managed identity in the box that shows the text Search by name. A list of matching user-assigned managed identity names appears. Select the identity you want from the list. You can select multiple user-assigned managed identities and assign them to the attached Synapse Spark pool.
    4. Select Update.
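The identity update above can be approximated with the Python SDK. This is a hedged sketch that assumes `azure-ai-ml` is installed, `ml_client` is an authenticated `MLClient`, and the user-assigned managed identities already exist. The helper names `user_assigned_identity_id` and `assign_user_identities` are hypothetical.

```python
def user_assigned_identity_id(subscription_id, resource_group, identity_name):
    """Build the Azure resource ID of a user-assigned managed identity."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.ManagedIdentity/userAssignedIdentities/{identity_name}"
    )


def assign_user_identities(ml_client, name, pool_resource_id, identity_ids):
    """Assign one or more user-assigned identities to an attached pool."""
    # Imported lazily so the pure helper above works without the SDK.
    from azure.ai.ml.entities import (
        IdentityConfiguration,
        ManagedIdentityConfiguration,
        SynapseSparkCompute,
    )

    identity = IdentityConfiguration(
        type="user_assigned",
        user_assigned_identities=[
            ManagedIdentityConfiguration(resource_id=identity_id)
            for identity_id in identity_ids
        ],
    )
    compute = SynapseSparkCompute(
        name=name, resource_id=pool_resource_id, identity=identity
    )
    # Re-submitting the compute definition updates the attached pool.
    return ml_client.compute.begin_create_or_update(compute).result()
```

To switch to a system-assigned identity instead, pass `IdentityConfiguration(type="system_assigned")` and omit the `user_assigned_identities` list.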

Detach the Synapse Spark pool

You might want to detach an attached Synapse Spark pool to clean up a workspace.

The Azure Machine Learning studio UI provides a way to detach an attached Synapse Spark pool. Follow these steps:

  1. Open the Details page for the Synapse Spark pool in the Azure Machine Learning studio.

  2. Select Detach to detach the attached Synapse Spark pool.
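Detaching can also be scripted. The sketch below assumes `ml_client` is an authenticated `MLClient` from the `azure-ai-ml` package; the wrapper name `detach_synapse_pool` is hypothetical. It passes `action="Detach"` so that the attachment is removed from the workspace rather than the underlying pool being deleted.

```python
def detach_synapse_pool(ml_client, name):
    """Detach an attached Synapse Spark pool from the workspace."""
    # action="Detach" removes only the attachment, not the Synapse resource.
    poller = ml_client.compute.begin_delete(name=name, action="Detach")
    # Block until the detach operation finishes.
    poller.result()
```

Here `name` is the name given to the attached compute inside Azure Machine Learning, not the name of the Spark pool in the Synapse workspace.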

Serverless Spark compute in Azure Machine Learning

Some scenarios require Spark compute during an Azure Machine Learning job submission, without attaching a Spark pool. The Azure Synapse Analytics integration with Azure Machine Learning also provides a serverless Spark compute experience, which gives a job access to Spark compute without first attaching the compute to a workspace. Learn more about the serverless Spark compute experience.

Next steps