How can we run a SQL UPDATE query in Synapse using Delta tables?

Devender 61 Reputation points
2022-11-09T10:56:08.173+00:00

Hi Community
I have a SQL query with some joins and multiple WHERE conditions that I am trying to reproduce in an Azure Synapse notebook using Delta tables.

I have loaded the file data into DataFrames first, and from the DataFrames I am loading it into Delta tables:
df_1.write.mode("overwrite").format("delta").saveAsTable('TABLE1')
df_2.write.mode("overwrite").format("delta").saveAsTable('TABLE2')

After that I am running this query:

%sql

MERGE INTO campaign_data using Institute_data
ON TABLE1.STUDENTID=TABLE2.COLLEGEID and TABLE1.ATTENDANCE =='' and TABLE2.PERFORMANCE = '' and TABLE1.PLAYSPORTS !=''
WHEN MATCHED THEN UPDATE SET TABLE1.GIVE_MARKS = '1';

The actual SQL query:

UPDATE TABLE1 INNER JOIN TABLE2 ON TABLE1.STUDENTID = TABLE2.COLLEGEID
SET TABLE1.GIVE_MARKS = "1"
WHERE (((TABLE1.ATTENDANCE) Is Null) AND ((TABLE2.PERFORMANCE) Is Null) AND
((TABLE1.PLAYSPORTS) Is Not Null));
When I run the query in the Synapse notebook, it throws an error like:

MERGE INTO campaign_data using Institute_data
^
SyntaxError: invalid syntax
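For reference, the Access-style UPDATE above maps onto Delta Lake's MERGE syntax if the row filters are moved into a `WHEN MATCHED AND ...` clause instead of being mixed into the `ON` clause with Python-style `==` / `!=` operators. A sketch, assuming the tables were saved as TABLE1 and TABLE2 as shown above:

```python
# Sketch: the Access-style UPDATE rewritten as a Delta Lake MERGE statement.
# Note that the row filters go in "WHEN MATCHED AND ...", and Spark SQL uses a
# single "=" for equality and "IS NULL" / "IS NOT NULL" for null tests.
merge_sql = """
MERGE INTO TABLE1
USING TABLE2
  ON TABLE1.STUDENTID = TABLE2.COLLEGEID
WHEN MATCHED
  AND TABLE1.ATTENDANCE IS NULL
  AND TABLE2.PERFORMANCE IS NULL
  AND TABLE1.PLAYSPORTS IS NOT NULL
THEN UPDATE SET TABLE1.GIVE_MARKS = '1'
"""

# In a notebook cell with a Delta-enabled Spark session this would be run as:
# spark.sql(merge_sql)
print(merge_sql)
```

(If the data actually stores empty strings rather than NULLs, as the sample data later in this thread does, replace the `IS NULL` / `IS NOT NULL` tests with `= ''` / `!= ''`.)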

Azure Synapse Analytics
Azure Databricks

1 answer

  1. HimanshuSinha-msft 19,476 Reputation points Microsoft Employee
    2022-11-15T00:29:03.577+00:00

    Hello @Devender ,
    Just to add to what @AnnuKumari-MSFT called out:
    The SQL MERGE feature was implemented in Spark 3.0 and above, so please make sure you check your Spark pool version (it took me half a day).
    I was checking on a Synapse Spark pool.

    This is what I tried .

    import pyspark
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import expr
    from delta.tables import *

    # Create the Spark session (in a Synapse notebook, `spark` is already available)
    spark = SparkSession.builder.getOrCreate()

    table1 = [(1,'100','Yes',10),(2,'','Yes',10),(3,'50','',10)]
    table2 = [(1,'Good'),(2,''),(3,'bad')]
    table1columns= ["STUDENTID","ATTENDANCE","PLAYSPORTS","GIVEMARKS"]
    table2columns= ["COLLEGEID","PERFORMANCE"]
    df1 = spark.createDataFrame(data = table1, schema = table1columns)
    df2 = spark.createDataFrame(data = table2, schema = table2columns)
    df1.printSchema()
    df1.show(truncate=False)
    df2.printSchema()
    df2.show(truncate=False)
    df1.write.mode("overwrite").format("delta").saveAsTable('campaign_data')
    df2.write.mode("overwrite").format("delta").saveAsTable('Institute_data')

    In a different notebook cell I tried this Spark SQL (you can add the logic as you mentioned):

    MERGE INTO campaign_data a
    USING Institute_data b
    ON a.STUDENTID = b.COLLEGEID
    WHEN MATCHED THEN
      UPDATE SET a.GIVEMARKS = '1'


    Please do let me know if you have any queries.
    Thanks
    Himanshu


