How is the division of Spark applications submitted from a pipeline handled?

asciibscii 45 Reputation points
2023-01-18T16:16:47.5866667+00:00

In a Synapse pipeline I have 4 notebooks attached to the same Spark pool, each using 1 executor. The Spark pool has 3-8 nodes, so it should have enough capacity to run all notebooks on the same Spark instance. The pipeline has a trigger that runs it once every hour. Because the pool has enough capacity and all notebooks are started by the same pipeline, I expected the submitter of the Apache Spark application to be the same for all of them. Instead, I see that the pipeline has 2 different submitters: 3 notebooks get one submitter, while the 4th notebook gets the other. How can I make all notebooks get the same submitter? Or rather, what is the issue here, and how does the user/submitter work when a Synapse pipeline submits to a Spark pool?

Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.

Accepted answer
  1. BhargavaGunnam-MSFT 26,136 Reputation points Microsoft Employee
    2023-01-27T23:05:38.52+00:00

    Hello @asciibscii,

    <update>

    The internal team confirmed that the submitter ID is neither the workspace MSI nor a user object ID.

    It is a system principal ID at the service level.

    The system principal ID is a unique identifier for the Azure service instance that is enabled with a system-assigned managed identity. It is used to authenticate the service instance with Azure resources that support Azure AD authentication.

    The system principal ID is automatically created by Azure when you enable a system-assigned managed identity on the service instance.

    Please refer to the document below on how to view the system principal.

    https://learn.microsoft.com/en-us/azure/active-directory/managed-identities-azure-resources/how-to-view-managed-identity-service-principal-portal#view-the-service-principal

    Below is a screenshot of the notebook job submitter. I can find this ID in Azure AD (AAD) under Enterprise applications.

    User's image

    User's image
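The same lookup can also be done programmatically instead of through the portal. The snippet below is only a minimal sketch: it assumes the submitter ID shown in the screenshot is the object ID of a service principal, and that you already have a Microsoft Graph access token that is allowed to read service principals. The function names `service_principal_url` and `fetch_service_principal` are hypothetical helpers, not part of any SDK. If the ID turns out to be an application (client) ID instead, pass `by_app_id=True`.

```python
import json
import urllib.request
import uuid

GRAPH_BASE = "https://graph.microsoft.com/v1.0"


def service_principal_url(submitter_id: str, by_app_id: bool = False) -> str:
    """Build the Microsoft Graph URL for looking up a service principal.

    Hypothetical helper: assumes submitter_id is the GUID shown as the
    Spark application submitter.
    """
    # uuid.UUID raises ValueError if the string is not a well-formed GUID.
    guid = str(uuid.UUID(submitter_id))
    if by_app_id:
        # Lookup by application (client) ID.
        return f"{GRAPH_BASE}/servicePrincipals(appId='{guid}')"
    # Lookup by the service principal's object ID (assumed default).
    return f"{GRAPH_BASE}/servicePrincipals/{guid}"


def fetch_service_principal(
    submitter_id: str, access_token: str, by_app_id: bool = False
) -> dict:
    """Call Microsoft Graph and return the service principal as a dict.

    Assumes access_token is a valid bearer token obtained elsewhere.
    """
    request = urllib.request.Request(
        service_principal_url(submitter_id, by_app_id),
        headers={"Authorization": f"Bearer {access_token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

If the call succeeds, fields such as `displayName` and `servicePrincipalType` in the returned object should help tell which service the submitter belongs to. Running this for both submitter IDs seen on the pipeline would show whether they map to the same kind of principal.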

    Unfortunately, these details are not covered on the documentation page, but I will ask the product group (PG) to document them.

    I hope this helps. Please let me know if you have any further questions.

    If this answers your question, please consider accepting the answer by selecting Accept answer and up-voting it, as this helps the community find answers to similar questions.

