Azure Synapse Analytics - Data exfiltration error when storage account and Synapse workspace are in the same tenant and resource group
We have a Synapse Workspace whose primary/default data lake storage account is private. Both the workspace and the storage account are in the same tenant and same resource group:
- Workspace has a managed private endpoint to the storage account
- Workspace has Storage Blob Data Contributor access to the storage account
- I have Storage Blob Data Contributor access to the storage account, am a Synapse Administrator, have Contributor access to the workspace, and am the SQL AD Administrator
- Data exfiltration protection is on
We created the workspace and the storage account with an ARM template (stored alongside other resources as infra-as-code), and we're trying to read a dedicated SQL pool table into a Spark pool using the Azure Synapse Apache Spark to Synapse SQL connector. When we first deploy the template and then add the managed private endpoint to the storage account, we can read from the dedicated SQL pool table into the Spark pool without issues. However, after we reapply the ARM template to change some other resource, the read fails with:
com.microsoft.spark.sqlanalytics.SQLAnalyticsConnectorException: com.microsoft.sqlserver.jdbc.SQLServerException: Data exfiltration to '<storage account name>.dfs.core.windows.net' is blocked. Add destination to allowed list for data exfiltration and try again.
I ran some checks to see whether the managed private endpoint connectivity broke, but all the checks are green in both Synapse Studio and the storage account's private endpoint connections. We are also able to resolve the storage account DNS name to an IP that I assume resides inside the managed VNet where the Spark pool operates.
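For reference, the DNS check can be sketched roughly as below. This is only an illustration using the Python standard library, not necessarily the exact check we ran; the function names (`is_private_address`, `resolve_addresses`, `check_private_resolution`) are made up for this post. A private address only suggests that the name resolves through a private endpoint. It does not prove the address belongs to the managed VNet.

```python
import ipaddress
import socket


def is_private_address(ip: str) -> bool:
    """Return True if the IP literal falls in a private (non-public) range."""
    return ipaddress.ip_address(ip).is_private


def resolve_addresses(hostname: str, port: int = 443) -> list[str]:
    """Resolve a hostname and return its unique IP addresses, sorted."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def check_private_resolution(hostname: str) -> dict[str, bool]:
    """Map each resolved IP of the hostname to whether it is a private address."""
    return {ip: is_private_address(ip) for ip in resolve_addresses(hostname)}
```

Run from a notebook cell attached to the Spark pool, `check_private_resolution("<storage account name>.dfs.core.windows.net")` returns each resolved IP together with a flag showing whether it is private.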
We are able to fix this issue by deleting the managed private endpoint and creating it again, but the moment we re-apply our infra-as-code we're back to a broken integration. Is this a bug or are we missing something?