Azure Synapse Managed Private Endpoint Data Transfer Cost

E P 0 Reputation points
2023-08-31T14:59:28.8233333+00:00

We have a Synapse workspace under one subscription (dev) with a managed private endpoint to a Storage Account in another subscription (production). Data from production is processed in serverless Spark jobs.

We replicated the Synapse setup from dev to the production subscription, and it works functionally. But we are seeing data transfer costs. We don't see these kinds of costs in the dev subscription, even though we were processing the same amount of data.

We are trying to understand the difference and would appreciate any help.

EDIT: One difference we've found is that, in the workspace in the production subscription, there are no "fqdns" in the default private endpoint. This one was created using Bicep, and the field was not set. The dev one was created using the Azure Portal.
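
For anyone who wants to repeat this kind of comparison, here is a minimal sketch in Python that lists the properties that differ between two endpoint definitions. It assumes the properties of each endpoint have already been exported into plain dictionaries (for example from JSON). The sample values, the "groupId" key, and the function name diff_properties are hypothetical placeholders; only "fqdns" comes from the setup described above. It only shows where two definitions differ and says nothing about whether a given difference affects cost.

```python
import json


def diff_properties(dev, prod):
    """Return {key: (dev_value, prod_value)} for every key whose values differ.

    A key that is absent on one side is reported with None for that side.
    """
    differences = {}
    for key in sorted(set(dev) | set(prod)):
        dev_value = dev.get(key)
        prod_value = prod.get(key)
        if dev_value != prod_value:
            differences[key] = (dev_value, prod_value)
    return differences


# Hypothetical sample data, not taken from a real workspace.
dev_endpoint = {"groupId": "dfs", "fqdns": ["example.dfs.core.windows.net"]}
prod_endpoint = {"groupId": "dfs"}  # "fqdns" was never set here

if __name__ == "__main__":
    for key, (dev_value, prod_value) in diff_properties(dev_endpoint, prod_endpoint).items():
        print(f"{key}: dev={json.dumps(dev_value)} prod={json.dumps(prod_value)}")
```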

Azure Synapse Analytics

1 answer

  1. PRADEEPCHEEKATLA 90,641 Reputation points Moderator
    2023-09-27T12:49:47.8366667+00:00

    @E P - I'm glad that you were able to resolve your issue, and thank you for posting your solution so that others experiencing the same thing can easily reference it! Since the Microsoft Q&A community has a policy that "The question author cannot accept their own answer. They can only accept answers by others", I'll repost your solution in case you'd like to accept the answer.

    Ask: We have a Synapse workspace under one subscription (dev) with a managed private endpoint to a Storage Account in another subscription (production). Data from production is processed in serverless Spark jobs.

    We replicated the Synapse setup from dev to the production subscription, and it works functionally. But we are seeing data transfer costs. We don't see these kinds of costs in the dev subscription, even though we were processing the same amount of data.

    We are trying to understand the difference and would appreciate any help.

    Solution: We found that, when accessing a restricted storage account, there is a functional difference between running a "standalone" Spark job and running it through an activity in a pipeline. The standalone job requires the managed private endpoint. When the job runs from a pipeline, access is authorized solely through the workspace managed identity.

    If I missed anything please let me know and I'd be happy to add it to my answer, or feel free to comment below with any additional information.

    If you have any other questions, please let me know. Thank you again for your time and patience throughout this issue.


    Please don’t forget to Accept Answer and Yes for "was this answer helpful" wherever the information provided helps you, this can be beneficial to other community members.

