How to delete a table in Azure Synapse Dedicated Pool from a Synapse Spark notebook

Jimmy Dobbins 41 Reputation points Microsoft Employee
2024-04-10T19:47:22.7+00:00

I have a notebook that loads several tables in Azure Synapse Dedicated pool. I am using the Apache Spark connector for Synapse dedicated pools. I am using mode("overwrite"), but this does not drop the target table. I need to do something similar to "drop table if exists stg.customer" before loading from a dataframe in the notebook. I have tried everything I can think of, including a stored procedure, but nothing works. Any advice is greatly appreciated! A screenshot of my cell and what I have tried follows (note that the procedure works if I run it from a SQL script, but the table remains even after this cell runs successfully...nothing really happened):

[Screenshot of the notebook cell; image not available]


Accepted answer
  1. phemanth 15,755 Reputation points Microsoft External Staff Moderator
    2024-04-13T00:03:11.6766667+00:00

    @Jimmy Dobbins Welcome to Microsoft Q&A platform and thanks for posting your question.

    I'm glad you were able to resolve your issue, and thank you for posting your solution so that others experiencing the same thing can easily reference it. Since the Microsoft Q&A community has a policy that "the question author cannot accept their own answer; they can only accept answers by others", I'll repost your solution in case you'd like to accept the answer.

    Solution: I got this to work by using PyODBC. Relevant snippets of my code are below; I hope this helps!

    import pyodbc

    # imports for the Synapse dedicated SQL pool connector (Constants, synapsesql)
    import com.microsoft.spark.sqlanalytics
    from com.microsoft.spark.sqlanalytics.Constants import Constants

    # define constants (synapse_server, synapse_ded_pool, username, mypassword
    # and synapse_temp_folder are set elsewhere in the notebook)
    SERVER = synapse_server
    DATABASE = synapse_ded_pool
    USERNAME = username
    PASSWORD = mypassword

    connectionString = f'DRIVER={{ODBC Driver 18 for SQL Server}};SERVER={SERVER};DATABASE={DATABASE};UID={USERNAME};PWD={PASSWORD}'

    # establish connection
    conn = pyodbc.connect(connectionString)

    target_table = "stg.customer"

    drop_query = f"""
            IF OBJECT_ID('{target_table}') is not null
                BEGIN
                    drop table {target_table}
                END;
            """

    # drop table in Synapse dedicated pool before loading
    conn.execute(drop_query)

    # load the dataframe into the (now dropped) target table
    df.write \
        .option(Constants.SERVER, synapse_server) \
        .option(Constants.TEMP_FOLDER, synapse_temp_folder) \
        .options(mergeSchema=True) \
        .mode("overwrite") \
        .synapsesql(synapse_ded_pool + '.' + target_table) # 3-part notation required

    conn.close()
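
The drop step above interpolates the table name straight into the T-SQL string. Below is a minimal sketch of one way to wrap that step in a reusable helper. The function names `build_drop_statement` and `drop_table_if_exists` are hypothetical, the name check assumes simple two-part `schema.table` identifiers, and the explicit `commit()` is an assumption based on pyodbc opening connections with autocommit off by default. It has not been run against a dedicated pool.

```python
import re

# Assumption: table names are simple two-part identifiers such as "stg.customer".
_TWO_PART_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\.[A-Za-z_][A-Za-z0-9_]*")


def build_drop_statement(table_name: str) -> str:
    """Return T-SQL that drops schema.table only if it exists."""
    if not _TWO_PART_NAME.fullmatch(table_name):
        raise ValueError(f"Expected schema.table, got: {table_name!r}")
    schema, table = table_name.split(".")
    quoted = f"[{schema}].[{table}]"
    return (
        f"IF OBJECT_ID(N'{quoted}', N'U') IS NOT NULL\n"
        f"    DROP TABLE {quoted};"
    )


def drop_table_if_exists(conn, table_name: str) -> None:
    """Run the drop on an open DB-API connection (for example pyodbc) and commit."""
    cursor = conn.cursor()
    try:
        cursor.execute(build_drop_statement(table_name))
        # Assumption: the connection was opened with autocommit off,
        # so the drop is committed explicitly before the Spark write starts.
        conn.commit()
    finally:
        cursor.close()
```

With the connection from the snippet above, this would be called as `drop_table_if_exists(conn, target_table)` just before `df.write`.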
    

    If I missed anything please let me know and I'd be happy to add it to my answer, or feel free to comment below with any additional information.

    If you have any other questions, please let me know. Thank you again for your time and patience throughout this issue.


    Please don't forget to click Accept Answer and Yes for "Was this answer helpful?" wherever the information provided helps you; this can be beneficial to other community members.


1 additional answer

  1. Jimmy Dobbins 41 Reputation points Microsoft Employee
    2024-04-12T20:48:10.12+00:00

    I got this to work by using PyODBC. Relevant snippets of my code are below; I hope this helps!

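    import pyodbc

    # imports for the Synapse dedicated SQL pool connector (Constants, synapsesql)
    import com.microsoft.spark.sqlanalytics
    from com.microsoft.spark.sqlanalytics.Constants import Constants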
    
    # define constants
    SERVER = synapse_server
    DATABASE = synapse_ded_pool
    USERNAME = username
    PASSWORD = mypassword
    
    connectionString = f'DRIVER={{ODBC Driver 18 for SQL Server}};SERVER={SERVER};DATABASE={DATABASE};UID={USERNAME};PWD={PASSWORD}'
    
    # establish connection
    conn = pyodbc.connect(connectionString)
    
    target_table = "stg.customer"
    
    drop_query = f"""
            IF OBJECT_ID('{target_table}') is not null
                BEGIN
                    drop table {target_table}
                END;
            """
    
    # drop table in Synapse dedicated pool before loading
    conn.execute(drop_query)
    
    df.write \
        .option(Constants.SERVER, synapse_server) \
        .option(Constants.TEMP_FOLDER, synapse_temp_folder) \
        .options(mergeSchema=True) \
        .mode("overwrite") \
        .synapsesql(synapse_ded_pool + '.' + target_table) # 3-part notation required
    
    conn.close()
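
The connection string above places the password directly inside the f-string, so a password containing `;` would be split into separate keywords. Below is a small sketch of one way to build the string more defensively. It assumes the ODBC convention that a value wrapped in braces is taken literally and that a `}` inside it is written as `}}`. `odbc_value` and `build_connection_string` are hypothetical helper names, not part of pyodbc.

```python
def odbc_value(value: str) -> str:
    """Wrap a value in braces, doubling any '}' so that ';' and '}' are taken literally."""
    return "{" + value.replace("}", "}}") + "}"


def build_connection_string(server: str, database: str, username: str, password: str,
                            driver: str = "ODBC Driver 18 for SQL Server") -> str:
    """Assemble a DRIVER/SERVER/DATABASE/UID/PWD connection string."""
    parts = {
        "DRIVER": odbc_value(driver),
        "SERVER": server,
        "DATABASE": database,
        # Credentials are the values most likely to contain special characters.
        "UID": odbc_value(username),
        "PWD": odbc_value(password),
    }
    return ";".join(f"{key}={value}" for key, value in parts.items())
```

In the notebook this could replace the f-string, for example `conn = pyodbc.connect(build_connection_string(synapse_server, synapse_ded_pool, username, mypassword))`.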
    
    
