Azure Data Factory CopyData from DB on-premise to Azure using SHIR

Yudha Cipta Maulana 0 Reputation points
2023-06-07T09:02:10.1233333+00:00

Hi, I had set up Self Hosted Integration Runtime on my on-premise VM with azure cloud using Azure Data Factory. The connection is already been tested and checked green on both integration runtime and linked services. My goal is to be able to copy data from DB on-premise (as table data source) to either ADLS Gen 2 or SQL Server in Azure (as data Sink). The connection between services also had been tested without errors, but when I try to debug the pipeline of copy data between those two locations, it still shown error.

I've been trying to looked up into the log, but its confusing since the error shows on Azure Services even though there is no trace of read data when I debug the pipeline in Azure Data Factory.

Attached the screenshot of the error, thank you!

Screenshot 2023-06-07 155807

Azure Data Lake Storage
Azure Data Lake Storage
An Azure service that provides an enterprise-wide hyper-scale repository for big data analytic workloads and is integrated with Azure Blob Storage.
1,563 questions
Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
11,672 questions
SQL Server | Other
{count} votes

2 answers

Sort by: Most helpful
  1. KranthiPakala-MSFT 46,647 Reputation points Microsoft Employee Moderator
    2023-06-09T07:32:39.96+00:00

    @Yudha Cipta Maulana Welcome to Microsoft Q&A forum and thanks for reaching out here.

    As you mentioned that your source is On prem Database (I assume you are using SHIR for this connector) and sink is ADLS Gen2 (I assume you are using Azure IR) and the individual test connection to respective data sources went well with out any issues and the issue is happening on when you try to debug the copy activity pipeline.

    Based on above confirmation and looking at the error message, my understanding is that the issue could be related to ADLS Gen2 networking.

    In order validate that, please check if your ADLS Gen2 is firewall enabled. If that is the case, then you will have to whitelist your SHIR hosted IP address to have access to your ADLS Gen2. You may ask that you are using Azure IR for ADLS Gen2 connection but what is the need to whitelist SHIR IP address. The reason is that when either of the source or sink linked service points to a self-hosted IR, the copy activity is executed on the self-hosted IR even though one of the data store is connected to Azure IR. Hence if you have any firewall restrictions, then you will have to whitelist SHIR IP address so that it is allowed to acccess to data store.

    This has been called our here: ADF - Determining which IR to useUser's image

    Please refer to this troubleshooting guide which has related info: Troubleshoot the Azure Data Lake Storage connectors in Azure Data Factory and Azure Synapse

    User's image

    User's image

    In case if your sink is Azure SQL then you will have to consider below pre-requisites:

    User's image

    Ref doc: Copy data from a source to a sink

    Hope this info helps.


    Please don’t forget to Accept Answer and Yes for "was this answer helpful" wherever the information provided helps you, this can be beneficial to other community members.


  2. Yudha Cipta Maulana 0 Reputation points
    2023-06-13T00:51:18.8766667+00:00

    Update! It turns out I need to whitelist the IP/Domain of Azure Data Lake Storage Gen 2 Services to be able to execute copy data on ADF. And whitelist the SQL Server with its port


Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.