Azure Synapse Managed Private Endpoints and Microsoft Spark Utilities / File Mount API

gmfx 21 Reputation points
2022-02-02T11:09:00.723+00:00

When using Microsoft Spark Utilities and the File Mount API from a Synapse Notebook in a Synapse Workspace with Managed Virtual Network and Data Exfiltration Protection enabled, the Data Lake Storage FQDNs resolve to the public endpoints and not the private endpoints.

Steps to reproduce:

  1. Provision an Azure Synapse Workspace with Data Exfiltration Protection enabled and a Managed Private Endpoint to the related Storage Account.
  2. Provision a Spark pool, create a Notebook, and attach it to the Spark pool.
  3. Enter the following code snippet in a notebook cell:

mssparkutils.fs.mount(
"abfss://mycontainer@myaccountname.dfs.core.windows.net",
"/test",
{"linkedService":"mylinkedservice"}
)

The result is a timeout and an error message showing that the command tried to resolve the public storage account endpoint:

blob.core.windows.net/mycontainer

The expected behavior is that the FQDN resolves to the Private Link FQDN from the managed virtual network.

The exact same setup of resources and connections works from a Synapse Workspace without Data Exfiltration Protection.
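To see which endpoint each name resolves to from inside the notebook, a small DNS check along these lines may help. This is only a sketch: the helper names `is_private_ip` and `check_endpoints` are made up for illustration, and it assumes that plain DNS lookups from the Spark pool reflect what the mount call sees, which may not hold in every managed virtual network setup.

```python
import ipaddress
import socket


def is_private_ip(ip: str) -> bool:
    """Return True when Python's ipaddress module treats the address as private
    (for example 10.0.0.0/8 or 192.168.0.0/16)."""
    return ipaddress.ip_address(ip).is_private


def check_endpoints(account: str, resolve=socket.gethostbyname) -> dict:
    """Resolve the blob and dfs FQDNs of a storage account.

    `resolve` is injectable (hypothetical parameter) so the function can be
    exercised without network access.
    """
    results = {}
    for kind in ("blob", "dfs"):
        host = f"{account}.{kind}.core.windows.net"
        try:
            ip = resolve(host)
            results[host] = (ip, "private" if is_private_ip(ip) else "public")
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            results[host] = (None, f"unresolved: {exc}")
    return results
```

Calling `check_endpoints("myaccountname")` in a notebook cell returns one entry per endpoint. A "public" or "unresolved" result for the blob name would be consistent with the behavior described above.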


Accepted answer
  1. ShaikMaheer-MSFT 37,896 Reputation points Microsoft Employee
    2022-02-15T16:25:58+00:00

    Hi @gmfx ,

    I got a response from the product group (PG). The details are below.

    Currently, the file mount API always mounts through the blob endpoint instead of the dfs endpoint, so please make sure to create an MPE (Managed Private Endpoint) to the blob endpoint instead of dfs.

    A change that makes mounting always use the dfs endpoint for Gen2 storage will be available soon. There is no ETA at this moment. Thank you.
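As a rough illustration of which hostname the mount would need a Managed Private Endpoint for, the blob FQDN can be derived from the `abfss://` URI passed to the mount call. The helper `blob_host_for` is hypothetical, and the sketch assumes the standard `<account>.dfs.core.windows.net` naming shown in the question.

```python
from urllib.parse import urlparse


def blob_host_for(abfss_uri: str) -> str:
    """Map the dfs hostname in an abfss:// URI to the matching blob hostname."""
    host = urlparse(abfss_uri).hostname  # drops the "container@" part
    if host is None or ".dfs." not in host:
        raise ValueError(f"not a dfs endpoint URI: {abfss_uri!r}")
    # Swap only the first ".dfs." label; the account name is left untouched.
    return host.replace(".dfs.", ".blob.", 1)


print(blob_host_for("abfss://mycontainer@myaccountname.dfs.core.windows.net"))
```

For the URI in the question this prints `myaccountname.blob.core.windows.net`, which is the name the blob Managed Private Endpoint would have to cover.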

    Hope this helps.

    -----------------

    Please consider hitting Accept Answer. Accepted answers help the community as well.

