Synapse Spark notebook reading dedicated pool using synapsesql() - how to query with SQL?

alex feng 86 Reputation points

I'm trying to query dedicated pool tables using the Synapse Spark notebooks. I'm referring to the document here:

It seems to me that I can only pass the table name into the method, for example, if I need to join 2 tables, I need to read both tables into 2 data frame and join them using Spark.

val df_1 ="<DBName>.<Schema>.<TableName_1>")
val df_2 ="<DBName>.<Schema>.<TableName_2>")

then do the join in Spark.

I'm wondering if it's possible to pass the SQL join statement into the Like this

val df ="SQL that join TableName_1 and TableName_2")

I have tried to put a simple SQL selection like 'select * from <DBName>.<Schema>.<TableName_1>' and it complains for 'SyntaxError: invalid syntax'.

Any advice is greatly appreciated!

Azure Synapse Analytics
Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.
4,451 questions
0 comments No comments
{count} votes

Accepted answer
  1. PRADEEPCHEEKATLA-MSFT 79,141 Reputation points Microsoft Employee

    Hello @alex feng ,

    Thanks for the question and using MS Q&A platform.

    You cannot pass SQL join statement into the

    You can try the below to join two data frames as shown below.

    For the demo purpose, I had created two tables in the Azure Synapse Dedicated pool named chepra and the tables employee which contains (EMP_ID and EMP_NAME) and employeedetails which contains (EMP_ID, EMP_EMAIL and EMP_CONTACT) as shown below.


    You can use any one of the below methods to join two data frames:


    val empdf ="chepra.dbo.employee")   
    val empdetailsdf ="chepra.dbo.employeedetails")   
    val join = empdf.join(empdetailsdf, Seq("EMP_ID"), "inner")  


    val empdf ="chepra.dbo.employee")   
    val empdetailsdf ="chepra.dbo.employeedetails")   
    empdf.join(empdetailsdf,empdf("EMP_ID") === empdetailsdf("EMP_ID"),"inner").show(false)  


    For more details, refer to the below links:

    Spark SQL Join Types with examples

    Joins in Apache Spark

    Hope this helps. Do let us know if you any further queries.


    Please "Accept the answer" if the information helped you. This will help us and others in the community as well.

    0 comments No comments

0 additional answers

Sort by: Most helpful