When using Microsoft Spark Utilities (mssparkutils) and the File Mount API from a Synapse notebook in a Synapse Workspace with Managed Virtual Network and Data Exfiltration Protection enabled, the Data Lake Storage FQDNs resolve to the public endpoints rather than the private endpoints.
Steps to reproduce:
- Provision an Azure Synapse Workspace with Data Exfiltration Protection enabled and a Managed Private Endpoint to the related Storage Account.
- Provision a Spark pool, create a notebook, and attach the notebook to the Spark pool.
- Enter the following code snippet in a notebook cell:
mssparkutils.fs.mount(
"abfss://mycontainer@myaccountname.dfs.core.windows.net",
"/test",
{"linkedService":"mylinkedservice"}
)
The result is a timeout and an error message showing that the command tried to resolve the public storage account endpoint:
blob.core.windows.net/mycontainer
The expected behavior is that the FQDN resolves to the Private Link FQDN from within the managed virtual network.
The exact same setup of resources and connections works from a Synapse Workspace without Data Exfiltration Protection.
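To help narrow down where name resolution goes wrong, the following is a minimal diagnostic sketch that could be run in a notebook cell. It uses only the Python standard library. The helper names `classify_address` and `resolve_and_classify` are hypothetical and are not part of mssparkutils. It assumes that a private endpoint would show up as an address in a private range, which is a heuristic and not a confirmed check for Private Link.

```python
import ipaddress
import socket


def classify_address(ip: str) -> str:
    """Return "private" or "public" for an IP address string.

    Note: is_private also covers loopback and link-local ranges,
    so this is only a rough heuristic for "private endpoint".
    """
    return "private" if ipaddress.ip_address(ip).is_private else "public"


def resolve_and_classify(hostname: str) -> dict:
    """Resolve a hostname to IPv4 addresses and classify each one."""
    results = socket.getaddrinfo(
        hostname, 443, family=socket.AF_INET, proto=socket.IPPROTO_TCP
    )
    # sockaddr is the last tuple element; its first item is the IP address.
    addresses = sorted({entry[4][0] for entry in results})
    return {ip: classify_address(ip) for ip in addresses}


# Example usage (hostnames taken from the snippet above; uncomment to run):
# print(resolve_and_classify("myaccountname.dfs.core.windows.net"))
# print(resolve_and_classify("myaccountname.blob.core.windows.net"))
```

Comparing the output for the dfs and blob hostnames, in workspaces with and without Data Exfiltration Protection, may show whether the difference is in DNS resolution or elsewhere in the mount call.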