Azure Data Flow Unable to connect to Managed Instance in Private Cloud

Harshal Awasthi 16 Reputation points
2021-04-04T16:14:22.477+00:00

Hello Experts,

I have Managed Instance under Express Route peering Subnet, Opened Public endpoint, and provisioned port 3342 to allow public endpoint access, Linked Services are working as expected and copy activity is working great, however, when using dataflow sink is throwing following error

StatusCode":"DFExecutorUserError","Message":"Job failed due to reason: at Sink 'Group': java.lang.RuntimeException: Cannot connect to SQL database: 'jdbc:sqlserver://<servername>:3342;database=<dbname>', 'User: <username>'.[SQL Exception]Error Code:0, Error Message: The TCP/IP connection to the host <servername>.database.windows.net, port 3342 has failed. Error: \"Connection timed out: no further information.. Verify the connection properties. Make sure that an instance of SQL Server is running on the host and accepting TCP/IP connections at the port. Make sure that TCP connections to the port are not blocked by a firewall.\"., error stack:shaded.msdataflow.com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDriverError(SQLServerException.java:228)\nshaded.msdataflow.com.microsoft.sqlserver.jdbc.SQLServerException.ConvertConnectExceptionToSQLServerException(SQLServerException.java:285)\nshaded.msdataflow.com.microsoft.sq

I'm using Memory Optimized Azure Hosted IR and As far as I understand the Spark cluster is unable to see the database server/database. As I have mentioned above linked service test connection is green.

Please provide your expert advice to overcome this challenge.

Thanks in advance

Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
9,525 questions
{count} votes

1 answer

Sort by: Most helpful
  1. Harshal Awasthi 16 Reputation points
    2021-04-05T19:14:41.333+00:00

    Got the solution, maybe someone else will be benefited

    Issue:
    How to open and connect to the public endpoint of a SQL MI and connect to a data factory for copy/dataflow activity.

    Cause:
    Linked Service network connection uses "Data Factory IP's/service tag" while Data Flow uses spark cluster in the backend and they have a dynamic IP which is not part of Data Factory Service Tag.

    Suggested solutions:
    Due to this limitation, there are 2 options to use SQL MI in Data Flow:

    • Open Port 3342 to the Source Internet and Destination Virtual Network in MI Subnet NSG
    • Stage the Data in Data Flow to Azure Blob/Table Storage Copy activity to another source (SQL or Blob) and then continue with Data Flow
    3 people found this answer helpful.