Synapse Analytics reference architecture for on-prem data source without using a Self-Hosted IR

pmscorca 1,052 Reputation points
2023-07-24T17:27:42.07+00:00

Hi,

To better understand a Synapse Analytics architecture based on a workspace with the managed virtual network feature enabled, I'm looking for a good, simple example of a related reference architecture.

I'd like to avoid installing a self-hosted IR to read data from an on-premises data source, so I'd like to learn more about using a VNet.

Any suggestions, please? Thanks.

Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.

1 answer

  1. PRADEEPCHEEKATLA 90,641 Reputation points Moderator
    2023-07-25T06:00:45.5033333+00:00

    @pmscorca - Thanks for the question and using MS Q&A platform.

    Synapse Analytics provides a managed virtual network feature that allows you to securely connect to on-premises data sources without the need for a self-hosted integration runtime (IR). Here is a simple reference architecture for using Synapse Analytics with an on-premises data source without a self-hosted IR:

    Create a virtual network in Azure and configure a site-to-site VPN connection between the virtual network and your on-premises network.

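    Create a Synapse workspace with the managed virtual network feature enabled; this option can only be chosen when the workspace is created. The workspace's Azure integration runtime then runs inside the managed virtual network and reaches data sources through managed private endpoints.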

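    Create a managed private endpoint from the Synapse workspace for your on-premises data source. A managed private endpoint targets an Azure resource or a Private Link service, so the on-premises source has to be exposed through a Private Link service in the virtual network from the first step, from which traffic reaches the data source securely over the VPN connection.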

    Create a linked service in Synapse Analytics that points to the private endpoint for your on-premises data source.

    Create a dataset in Synapse Analytics that references the linked service and points to the data you want to analyze.

    Create a pipeline in Synapse Analytics that uses the dataset as a source and performs the desired data transformations and analysis.
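
    The last three steps can be sketched as the JSON-style definitions that Synapse uses for linked services, datasets and pipelines, written here as Python dictionaries. This is only a minimal sketch under stated assumptions: a SQL Server source is assumed, every name (OnPremSqlServer, onprem-sql.example.internal, SalesDb, OnPremOrders, LakeOrdersParquet, CopyOnPremOrders) is hypothetical, authentication settings are omitted, and the helper unresolved_references is an illustration, not part of any Azure SDK.

```python
# Hypothetical definitions; all names and hosts are placeholders.
LINKED_SERVICE = {
    "name": "OnPremSqlServer",
    "properties": {
        "type": "SqlServer",
        "typeProperties": {
            # Host name assumed to resolve through the private endpoint.
            # Credentials are deliberately left out of this sketch.
            "connectionString": (
                "Data Source=onprem-sql.example.internal,1433;"
                "Initial Catalog=SalesDb;"
            )
        },
        # Azure integration runtime instead of a self-hosted IR.
        "connectVia": {
            "referenceName": "AutoResolveIntegrationRuntime",
            "type": "IntegrationRuntimeReference",
        },
    },
}

DATASET = {
    "name": "OnPremOrders",
    "properties": {
        "type": "SqlServerTable",
        "linkedServiceName": {
            "referenceName": "OnPremSqlServer",
            "type": "LinkedServiceReference",
        },
        "typeProperties": {"schema": "dbo", "table": "Orders"},
    },
}

PIPELINE = {
    "name": "CopyOnPremOrders",
    "properties": {
        "activities": [
            {
                "name": "CopyOrders",
                "type": "Copy",
                "inputs": [
                    {"referenceName": "OnPremOrders", "type": "DatasetReference"}
                ],
                # The sink dataset is assumed to be defined elsewhere.
                "outputs": [
                    {"referenceName": "LakeOrdersParquet", "type": "DatasetReference"}
                ],
                "typeProperties": {
                    "source": {"type": "SqlServerSource"},
                    "sink": {"type": "ParquetSink"},
                },
            }
        ]
    },
}


def unresolved_references(linked_services, datasets, pipeline):
    """Return references that do not match any supplied definition."""
    ls_names = {ls["name"] for ls in linked_services}
    ds_names = {ds["name"] for ds in datasets}
    missing = []
    # Each dataset must point at a known linked service.
    for ds in datasets:
        ref = ds["properties"]["linkedServiceName"]["referenceName"]
        if ref not in ls_names:
            missing.append(f"linkedService:{ref}")
    # Each activity input/output must point at a known dataset.
    for activity in pipeline["properties"]["activities"]:
        for ref in activity.get("inputs", []) + activity.get("outputs", []):
            if ref["referenceName"] not in ds_names:
                missing.append(f"dataset:{ref['referenceName']}")
    return missing
```

    Calling unresolved_references([LINKED_SERVICE], [DATASET], PIPELINE) reports the sink dataset, which this sketch intentionally leaves undefined. A check like this is only a convenience before publishing the definitions; it does not verify network connectivity.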

    Instead of using a self-hosted integration runtime, you can use proxy machines. We will not go into the details of these solutions in this article, but the following documentation provides a step-by-step guide:

    By following these steps, you can securely connect to your on-premises data source without the need for a self-hosted IR. Let me know if you have any further questions!

    Hope this helps. Do let us know if you have any further queries.


    If this answers your query, please click "Accept Answer" and "Yes" for "Was this answer helpful?". If you have any further queries, do let us know.

