Can a Synapse Spark pool only connect to a storage account with public network access?

Guilr 20 Reputation points
2024-03-28T23:00:59.8966667+00:00

Hello,

I'm following the "Unsupported scenarios" section of the doc "Resolve Azure Synapse Analytics Apache Spark pool storage access issues": https://learn.microsoft.com/en-us/troubleshoot/azure/synapse-analytics/spark/spark-jobexec-storage-access#unsupported-scenarios

According to it, a Spark job apparently cannot connect to an ADLS Gen2 storage account unless the account is enabled for public access.

I've read conflicting answers in other posts, so I would like to understand the current situation.

Thanks!


PS: In my own tests on Synapse (with or without a managed VNet), I only succeeded in connecting a Spark job to a storage account when the account was enabled for public access.

If the storage account is IP-restricted or VNet-restricted, the error is:
azure.core.exceptions.HttpResponseError: This request is not authorized to perform this operation.

ErrorCode:AuthorizationFailure
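
For anyone reproducing this, here is a minimal sketch of how the error code can be pulled out of a message like the one above and mapped to a likely cause. The helper names (`extract_error_code`, `explain_storage_error`) and the cause descriptions are assumptions made for illustration and are not part of any Azure SDK. The mapping is only a rough guide: `AuthorizationFailure` often points to network rules, while `AuthorizationPermissionMismatch` usually points to missing role assignments or ACLs, but neither is guaranteed.

```python
# Hypothetical lookup table: storage error code -> likely (not certain) cause.
LIKELY_CAUSES = {
    "AuthorizationFailure": (
        "network rules (firewall, IP restriction, or VNet) may have blocked the request"
    ),
    "AuthorizationPermissionMismatch": (
        "the identity may lack a data-plane role or ACL on the path"
    ),
}


def extract_error_code(message):
    """Return the value following 'ErrorCode:' in an error message, or None."""
    for line in message.splitlines():
        line = line.strip()
        if line.startswith("ErrorCode:"):
            return line.split(":", 1)[1].strip() or None
    return None


def explain_storage_error(error_code):
    """Return a hedged guess at the cause for a given storage error code."""
    return LIKELY_CAUSES.get(
        error_code, "unrecognised error code; inspect the full response"
    )


# The message text reported in this question.
message = """This request is not authorized to perform this operation.

ErrorCode:AuthorizationFailure"""

code = extract_error_code(message)
print(code, "->", explain_storage_error(code))
```

In a notebook, the same helpers could be applied to `str(exc)` inside an `except` block that catches the `HttpResponseError` shown above.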

Azure Blob Storage
An Azure service that stores unstructured data in the cloud as blobs.
Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.

Accepted answer
  1. Smaran Thoomu 9,610 Reputation points Microsoft Vendor
    2024-03-29T06:18:22.7233333+00:00

    Hi @Guilr

    Thank you for reaching out to the Microsoft Q&A platform.

    It's great to hear that you are exploring Azure Synapse Analytics Apache Spark pool storage access issues.

    Regarding your question: yes, a Spark job cannot connect to an ADLS Gen2 storage account unless the account is enabled for public access. When a storage account is not publicly accessible, it has to be reached through a private endpoint, and private endpoints are not currently supported for Spark jobs in Azure Synapse Analytics.

    As you mentioned, your tests returned an error when connecting to a storage account that is not publicly enabled. The error occurs because, without a private endpoint, the Spark job is not authorized to access the storage account.

    We would appreciate it if you could share this feedback on our feedback channel, which is open for the user community to upvote and comment on. This allows our product teams to prioritize your request against the existing feature backlog and gives insight into the potential impact of implementing the suggested feature.

    Hope this helps. Do let us know if you have any further queries.

    If this answers your query, please click Accept Answer and Yes for "Was this answer helpful".
