Azure SQL Database Port 1433 Workaround - Data Factory Enable Staging option not working

Chandrasekaran, Raji 1 Reputation point
2020-08-13T16:51:59.177+00:00

Hello,

I developed data pipelines and a database to store data while I was working from home. However, I am not able to access the database from the corporate network. After some research, I found that outbound port 1433 must be open for traffic to reach Azure SQL Database from the corporate network, and our IT department does not allow opening this port. So I asked Azure Support for a workaround, and someone from the support team recommended creating a staged copy before importing data into the database. But when I enable staging in my Azure Data Factory pipelines, the data gets shifted across columns, causing the data mapping to fail.

When I checked the raw data, it looked right. This is how the data should look:

[Image attachment: 17417-beforeenablestagingdata.png]

After enabling staging, the data is converted into:

[Image attachment: 17440-afterenablestagingdata.png]

I have tried changing the escape character and other settings, but the data gets shifted only when the Enable Staging option is turned ON. I reached out to Azure Support three weeks ago and have not received any response so far. I would like to know whether there is a bug in the Enable Staging option, or whether there is another workaround that avoids opening port 1433.
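
For reference, here is a sketch of the kind of delimited-text dataset settings I have been adjusting (the names and values are illustrative, not my exact configuration):

    {
        "name": "DataLakeDelimitedText",
        "properties": {
            "type": "DelimitedText",
            "linkedServiceName": {
                "referenceName": "AzureDataLakeStorage",
                "type": "LinkedServiceReference"
            },
            "typeProperties": {
                "location": {
                    "type": "AzureBlobFSLocation",
                    "fileSystem": "data",
                    "fileName": "input.csv"
                },
                "columnDelimiter": ",",
                "quoteChar": "\"",
                "escapeChar": "\\",
                "firstRowAsHeader": true
            }
        }
    }

No matter which quoteChar and escapeChar combination I use, the columns only shift when staging is enabled.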

Thanks,

Raji

Azure SQL Database
Azure Data Factory

2 answers

  1. Chandrasekaran, Raji 1 Reputation point
    2020-08-18T16:35:27.577+00:00

    Hi Martin,

Thanks for the reply! Answering your questions:

1. Yes, Azure SQL Database is the sink. The actual source is an SFTP server; a copy of that data is stored in Azure Data Lake using a separate pipeline.
    2. Staging is enabled in the copy activity, for the step that moves the data from the Data Lake into the database.
    3. Two pipelines: one copies from SFTP into the Data Lake, and the other copies data from the Data Lake into the database.

The actual problem is that outbound port 1433 is not open in the corporate network, and I am looking for a workaround. Someone from the Azure Support team suggested using Enable Staging to move the data without opening port 1433. A sketch of my copy activity configuration is below.
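
    For context, the copy activity with staging enabled looks roughly like this (the dataset, linked service, and path names are placeholders, not my exact setup):

        {
            "name": "CopyDataLakeToSqlDb",
            "type": "Copy",
            "inputs": [ { "referenceName": "DataLakeDelimitedText", "type": "DatasetReference" } ],
            "outputs": [ { "referenceName": "AzureSqlTable", "type": "DatasetReference" } ],
            "typeProperties": {
                "source": { "type": "DelimitedTextSource" },
                "sink": { "type": "AzureSqlSink" },
                "enableStaging": true,
                "stagingSettings": {
                    "linkedServiceName": {
                        "referenceName": "StagingBlobStorage",
                        "type": "LinkedServiceReference"
                    },
                    "path": "staging"
                }
            }
        }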

    Regards,

    Raji


  2. MartinJaffer-MSFT 26,046 Reputation points
    2020-08-20T23:57:18.627+00:00

    I see. I was confused when you mentioned corporate network. Is your corporate network a virtual network in Azure?

I noticed the connection strings included the port number, so I had the thought, 'what if we changed the port number?'

    I asked my colleague

    Can Azure SQL Database be configured to use a port other than 1433?

    and he replied

    Not possible...on a VM where you have control of the server, you can change the server port the sql engine listens to but with Azure SQL DB as a PaaS service...this is not possible.

    I did find the origin of the staging suggestion:

    You don't want to open ports other than port 80 and port 443 in your firewall because of corporate IT policies. For example, when you copy data from an on-premises data store to an Azure SQL Database sink or an Azure Synapse Analytics sink, you need to activate outbound TCP communication on port 1433 for both the Windows firewall and your corporate firewall. In this scenario, staged copy can take advantage of the self-hosted integration runtime to first copy data to a Blob storage staging instance over HTTP or HTTPS on port 443. Then it can load the data into SQL Database or Azure Synapse Analytics from Blob storage staging. In this flow, you don't need to enable port 1433.

So, first step: do you have a self-hosted integration runtime in your corporate network?
    Second step: go to your linked services and ensure that the selected integration runtime is not the default Azure-hosted one, but the self-hosted one you created.
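
    As a sketch (your linked service and runtime names will differ), the Azure SQL Database linked service should reference the self-hosted runtime through its connectVia property, for example:

        {
            "name": "AzureSqlDatabaseLinkedService",
            "properties": {
                "type": "AzureSqlDatabase",
                "typeProperties": {
                    "connectionString": "Server=tcp:<yourserver>.database.windows.net,1433;Database=<yourdb>;..."
                },
                "connectVia": {
                    "referenceName": "MySelfHostedIR",
                    "type": "IntegrationRuntimeReference"
                }
            }
        }

    With that in place, and staging enabled on the copy activity, the self-hosted runtime writes the interim data to Blob storage over port 443, and the load into SQL Database happens on the Azure side, so your corporate firewall does not need outbound 1433.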