I have a Python script running in the Batch service as an element of an Azure Data Factory pipeline (like here: https://learn.microsoft.com/en-us/azure/batch/tutorial-run-python-batch-azure-data-factory ).
The ADF pipeline has only one element, the mentioned script. The script connects to an Azure SQL Database, fetches data, preprocesses it, and inserts the result into a table in the same Azure
SQL Database, like below:
import urllib
import pandas as pd
import sqlalchemy

sql_query = 'SELECT TOP (1000) * FROM [dbo].[Table]'
server = '<servername>.database.windows.net'
database = '<dbname>'
username = '<username>'
password = '<password>'
driver = '{ODBC Driver 17 for SQL Server}'

# Build the raw ODBC connection string, then URL-encode it for SQLAlchemy
odbc_str = f'DRIVER={driver};SERVER={server};PORT=1433;UID={username};DATABASE={database};PWD={password}'
connect_str = 'mssql+pyodbc:///?odbc_connect=' + urllib.parse.quote_plus(odbc_str)

engine = sqlalchemy.create_engine(connect_str)
df = pd.read_sql(sql_query, engine)
engine.dispose()
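For completeness, the "preprocess and insert" part I mentioned looks roughly like the sketch below. The table name, column names, and the transformation are placeholders, not my real logic, and an in-memory SQLite engine stands in for the mssql+pyodbc engine above so the snippet is self-contained:

```python
import pandas as pd
import sqlalchemy

# In production this is the mssql+pyodbc engine built above; SQLite is
# used here only so the sketch runs without a database server.
engine = sqlalchemy.create_engine("sqlite://")

# Placeholder for the frame the script actually reads with pd.read_sql.
df = pd.DataFrame({"value": [1, 2, 3]})

# Placeholder preprocessing step.
df["value_doubled"] = df["value"] * 2

# Write the processed frame back, replacing the table if it exists.
df.to_sql("ProcessedTable", engine, if_exists="replace", index=False)

# Read back to confirm the insert succeeded.
result = pd.read_sql("SELECT * FROM ProcessedTable", engine)
print(len(result))

engine.dispose()
```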
However, the script can't reach the SQL Server because its public network access is disabled, and I cannot enable 'Allow Azure services and resources to access this server'; only private access options are available. Everything works fine when I enable that option temporarily for testing.
How can I make this ADF pipeline (one element, the Python script) access the SQL Server? The ADF resource is attached to the relevant network, and pipelines based solely on SQL scripts work fine; the issue affects only the Python script. I'm not even sure where to start: is this related to the Integration Runtime, or to the network settings of the Batch account used by ADF?
I would really appreciate some help.