Thank you so much for your reply!
But PySpark may not be the solution to what we are trying to do:
- I am trying to use Python code locally in our project to connect and make API calls to Azure Synapse where I would like to query the external tables. I am trying to use the pyodbc to connect to Azure Synapse
import pyodbc
server = <azure synapse serverless endpoint>
database = <db_name>
username = <user_name>
password = <password>
driver= '{ODBC Driver 17 for SQL Server}'with pyodbc.connect('DRIVER='+driver+';SERVER='+server+';PORT=1433;DATABASE='+database+';UID='+username+';PWD='+ password+';') as conn:
with conn.cursor() as cursor:
cursor.execute("SELECT top 10 * FROM [dbo].[db_name]")
row = cursor.fetchone()
while row:
print (str(row[0]) + " " + str(row[1]))
row = cursor.fetchone()
But I am running into the issue:
pyodbc.ProgrammingError: ('42000', "[42000] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]External table 'dbo' is not accessible because content of directory cannot be listed. (16561) (SQLExecDirectW)")
So, Is this possible to query the external tables created in Azure Synapse from local?
Can we grant 'select' permission on the external tables or on the database?
Point of doing this is to query and get the csv data in tabular form (within Azure Synapse) and use Python locally to use T-SQL commands to retrieve the data and pump it in our own database. But I do not see support to connect to Azure Synapse and query Azure Synapse external tables locally using Python.
Please advise on this? Thanks!