How can I connect to an external data source when Synapse data exfiltration protection is enabled?

Paul 1 Reputation point
2024-02-05T17:15:42.0966667+00:00

Hey folks, I'm trying to access external services (Salesforce, for example) from Synapse pipelines and Spark pools while data exfiltration protection is enabled. For pipelines, this has been possible using self-hosted integration runtimes secured with proxies to meet our security needs; however, that leaves us unable to use Spark notebooks to query the same external sources. We've had some success using Azure App Gateways with private link connections and referencing the private IP that is provisioned within Synapse. We could probably work with that and set up listeners on multiple ports, although managing all of the ip:port -> external hostname mappings can become quite cumbersome. Is there a way to influence the DNS that is published along with the private link connections on the managed virtual network in Synapse? Otherwise, is there a better way to give Spark pools access to external data sources?

Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.
Azure Private Link
An Azure service that provides private connectivity from a virtual network to Azure platform as a service, customer-owned, or Microsoft partner services.

1 answer

  1. Harishga 4,175 Reputation points Microsoft Vendor
    2024-02-06T09:24:14.31+00:00

    Hi @Paul
    Welcome to Microsoft Q&A platform and thanks for posting your question here.

    Yes, it is possible to access external data sources when Synapse data exfiltration protection is enabled. The options to combine are private endpoints, outbound firewall rules, and DNS.

    In your case, you have already had some success using Azure App Gateways with private link connections and referencing the private IP that is provisioned within Synapse. As you note, managing all of the IP:port -> external hostname mappings can become quite cumbersome.
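One way to keep the IP:port -> external hostname mapping manageable in a notebook is to hold it in a single table and rewrite URLs through one helper. The following is only a minimal sketch: the hostnames, the private IP `10.0.0.5`, the ports, and the names `GATEWAY_ROUTES` and `route_via_gateway` are all hypothetical placeholders, not values from a real deployment or part of any Synapse API.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical mapping: external hostname -> (private IP of the gateway
# endpoint, listener port). Every value here is a placeholder.
GATEWAY_ROUTES = {
    "login.salesforce.com": ("10.0.0.5", 8443),
    "api.example.com": ("10.0.0.5", 8444),
}


def route_via_gateway(url, routes=GATEWAY_ROUTES):
    """Return (rewritten_url, original_host) for a URL whose host is mapped.

    The rewritten URL targets the private ip:port; the original host is
    returned so the caller can decide how to present it to the gateway.
    """
    parts = urlsplit(url)
    host = parts.hostname
    if host not in routes:
        raise KeyError(f"No gateway route configured for {host!r}")
    ip, port = routes[host]
    rewritten = urlunsplit(
        (parts.scheme, f"{ip}:{port}", parts.path, parts.query, parts.fragment)
    )
    return rewritten, host
```

A call such as `route_via_gateway("https://login.salesforce.com/services/oauth2/token")` returns the URL pointed at the mapped ip:port together with the original hostname. Depending on how the gateway listeners are configured, the client may still need to send the original hostname (for example as the `Host` header), and TLS certificate validation against a bare IP address may need separate handling.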

    Regarding your question about influencing the DNS published along with the private link connections on the managed virtual network in Synapse, you can use Azure Private DNS zones to manage custom domain names and map them to private IP addresses. Create a private DNS zone linked to your virtual network and add a record set for the external hostname you want to use. Then create a private endpoint for the external service and associate it with the private DNS zone. This way, when you access the external service by its custom domain name, the private DNS zone resolves the name to the private IP address of the private endpoint.
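To check from a notebook whether a given name actually resolves to a private address, a small helper like the one below can be used. This is a sketch using only the Python standard library; the function names and the hostname in the usage note are hypothetical, and the check only reports what the resolver available to the session returns.

```python
import ipaddress
import socket


def is_private_address(ip):
    """Return True if the IP literal falls in a private (non-public) range."""
    # Drop an IPv6 zone identifier such as "%eth0" before parsing.
    return ipaddress.ip_address(ip.split("%")[0]).is_private


def resolves_privately(hostname):
    """Resolve hostname with the system resolver.

    Returns (addresses, all_private): the set of resolved addresses and
    whether every one of them is a private address.
    """
    infos = socket.getaddrinfo(hostname, None)
    addresses = {info[4][0] for info in infos}
    return addresses, all(is_private_address(a) for a in addresses)
```

For example, `resolves_privately("myservice.example.com")` (a placeholder name) returns the resolved addresses and a flag; a `False` flag suggests the name is still resolving to a public address rather than through the private DNS record.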

    If you want to control the DNS that is published with the private link connections on the managed virtual network in Synapse, you can also create a custom DNS alias for dedicated SQL pools. This alias can be used to redirect client programs during a disaster.

    References:

    https://techcommunity.microsoft.com/t5/azure-synapse-analytics-blog/create-dns-alias-for-dedicated-sql-pool-in-synapse-workspace-for/ba-p/3675676
    https://learn.microsoft.com/en-us/azure/private-link/private-endpoint-dns

    I hope this information helps. Let me know if you have any further questions or concerns.
