Hey, I know it’s been a while since I posted this question, but I finally had some time to revisit it and found a solution.
Config:
My Synapse workspace is not in a Managed VNet, so there shouldn’t be any restriction on outbound network access. I also verified that the “Allow Azure services and resources to access this workspace” option is enabled in the networking tab.
On my ADLS Gen2 storage account, I enabled “Allow trusted Microsoft services to access this resource” under the networking settings. Additionally, I added my Synapse workspace as a resource instance with access to the storage account. The workspace’s managed identity has Storage Blob Data Owner, Contributor, and Reader roles assigned. I also granted full rwx ACL permissions to this managed identity for the entire container.
Despite all these configurations, I was still getting the following error when trying to read or write files using Spark from Synapse:
Operation failed: "This request is not authorized to perform this operation.", 403, HEAD
What made this confusing was that the issue only occurred when running the notebook interactively (using the managed identity in the Spark session), and not when running it through a pipeline.
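For anyone trying to reproduce this, here is a rough sketch of the kind of read involved. The container, account, and path names are placeholders I made up for illustration, the Parquet format is an assumption, and `try_read` is just a hypothetical helper that makes the 403 easier to spot in the notebook output:

```python
def abfss_uri(container: str, account: str, path: str = "") -> str:
    """Build an ADLS Gen2 URI: abfss://<container>@<account>.dfs.core.windows.net/<path>."""
    return f"abfss://{container}@{account}.dfs.core.windows.net/{path.lstrip('/')}"


def try_read(spark, uri: str):
    """Attempt a Parquet read and flag an authorization failure before re-raising."""
    try:
        return spark.read.parquet(uri)
    except Exception as exc:
        # The storage error text is wrapped inside the Spark exception message.
        if "403" in str(exc) or "not authorized" in str(exc):
            print(f"Authorization failure reading {uri}")
        raise


# Placeholder names, not real resources.
uri = abfss_uri("mycontainer", "mystorageaccount", "/raw/events")
print(uri)

# In a Synapse notebook the `spark` session is already defined, so the call would be:
# df = try_read(spark, uri)
```

Running the same cell interactively and then from a pipeline is a quick way to confirm whether you are seeing the same difference in behavior.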
Solution:
After some investigation, I discovered that adding a private endpoint from Synapse to the ADLS Gen2 storage account resolves the issue. You can create it from the Synapse Manage tab under Security → Managed private endpoints. Once it is created, make sure to approve the pending private endpoint connection on the storage account side.
This solution worked in my case.
TL;DR: Add a private endpoint from Synapse to your ADLS Gen2 storage account.