How to pass the correct audience when calling mssparkutils.credentials.getToken on Azure China (Mooncake) cloud?

Abhishek Bhatt 0 Reputation points Microsoft Employee
2024-05-21T04:20:59.6566667+00:00

I'm using Microsoft Spark Utilities (MSSparkUtils) with a linked service to authenticate to Azure SQL using the Synapse workspace's system-assigned managed identity on the Azure China (Mooncake) cloud. However, when I call getToken with the audience type AzureOSSDB as described here, the function targets the endpoint https://ossrdbms-aad.database.windows.net/, which is only valid for the Azure public cloud (CORP tenant). The correct endpoint on the Mooncake cloud is *.database.chinacloudapi.cn, as described here. Please advise how I can set the correct cloud environment or override the default endpoints in mssparkutils.

Steps to reproduce the behavior

Call mssparkutils.credentials.getToken('AzureOSSDB') on a Synapse Notebook on Azure China (Mooncake) cloud.


Note: Passing the audience URI directly as mssparkutils.credentials.getToken("https://ossrdbms-aad.database.chinacloudapi.cn") also doesn't work and raises an exception.

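For reference, the two attempts look like this in the notebook (a minimal repro sketch; mssparkutils is pre-imported in Synapse notebooks and the explicit import is shown only for completeness):

```python
from notebookutils import mssparkutils  # already available in a Synapse notebook

# Attempt 1: built-in audience key. On Mooncake this still resolves to the
# public-cloud resource https://ossrdbms-aad.database.windows.net/ and fails
# with the Py4JJavaError shown in the traceback below.
token = mssparkutils.credentials.getToken('AzureOSSDB')

# Attempt 2: pass the Mooncake audience URI directly. This also raises an exception.
token = mssparkutils.credentials.getToken("https://ossrdbms-aad.database.chinacloudapi.cn")
```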

Expected behavior

The function call should return a bearer token to be used for SQL authentication as described here.
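To illustrate the intended use, a token returned for the database resource would typically be passed to the Apache Spark connector for SQL Server via its accessToken option. This is a hedged sketch only: the server, database, and table names are placeholders, the Mooncake audience URI is an assumption, and the connector must be available on the Spark pool.

```python
# Assumed audience URI for Azure SQL on Azure China; not verified to work here.
token = mssparkutils.credentials.getToken("https://database.chinacloudapi.cn/")

# Placeholder connection details; the connector accepts a pre-acquired
# Microsoft Entra ID token via the "accessToken" option.
df = (spark.read
          .format("com.microsoft.sqlserver.jdbc.spark")
          .option("url", "jdbc:sqlserver://<server>.database.chinacloudapi.cn:1433;databaseName=<db>")
          .option("dbtable", "dbo.<table>")
          .option("accessToken", token)
          .load())
```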

Additional context

As per security guidelines, we depend on mssparkutils for credential-free authentication with managed identities, since the Azure Identity library does not work in a Synapse workspace.

Exception Traceback:

```
---------------------------------------------------------------------------
Py4JJavaError                             Traceback (most recent call last)
Cell In [31], line 1
----> 1 mssparkutils.credentials.getToken('AzureOSSDB')

File ~/cluster-env/clonedenv/lib/python3.10/site-packages/notebookutils/mssparkutils/credentials.py:8, in getToken(audience, name)
      7 def getToken(audience, name=''):
----> 8     return creds.getToken(audience, name)

File ~/cluster-env/clonedenv/lib/python3.10/site-packages/py4j/java_gateway.py:1321, in JavaMember.__call__(self, *args)
   1315 command = proto.CALL_COMMAND_NAME +\
   1316     self.command_header +\
   1317     args_command +\
   1318     proto.END_COMMAND_PART
   1320 answer = self.gateway_client.send_command(command)
-> 1321 return_value = get_return_value(
   1322     answer, self.gateway_client, self.target_id, self.name)
   1324 for temp_arg in temp_args:
   1325     temp_arg._detach()

File /opt/spark/python/lib/pyspark.zip/pyspark/sql/utils.py:190, in capture_sql_exception.
```

1 answer

  1. PRADEEPCHEEKATLA 90,261 Reputation points
    2024-05-28T11:37:46.48+00:00

    @Abhishek Bhatt - One observation: you are using the incorrect audience. "AzureOSSDB" is for the open-source databases offered in Azure and does not include SQL Server (taken from the same link you provided). You could instead try following this document, which should help (a hedged sketch follows below).


    Hope this helps. Do let us know if you have any further queries.
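    A minimal sketch of what this suggests, assuming the target is Azure SQL rather than one of the OSS databases; the audience value below (the Azure China SQL resource URI) is an assumption and is not confirmed by the linked document:

    ```python
    # Assumption: for Azure SQL on Azure China (Mooncake), request a token for
    # the database resource itself rather than the OSS-database audience key.
    token = mssparkutils.credentials.getToken("https://database.chinacloudapi.cn/")
    ```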

