List all the common columns between two large tables.

Nilesh Patel 111 Reputation points

How can I list all the common columns between two large tables using Spark SQL?

Note: both tables have more than 1M records, and the number of columns and the number of rows differ between the tables.

Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.

1 answer

  1. ShaikMaheer-MSFT 34,536 Reputation points Microsoft Employee

    Hi @Nilesh Patel ,

    Thank you for posting your query on the Microsoft Q&A platform.

    Could you please help me understand where the tables are stored? Also, is there a specific reason for using only Spark SQL here?

    You can also use PySpark to create a DataFrame for each table and then call printSchema() on it to see its schema. That way you can easily compare the column names.
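As a minimal sketch of that comparison, assuming both tables are registered in the metastore: the helper names `common_columns` and `common_columns_of_tables` are hypothetical, as are any table names you pass in. The comparison ignores case, which matches Spark's default setting (`spark.sql.caseSensitive=false`); adjust it if your workspace enables case sensitivity.

```python
def common_columns(left_cols, right_cols):
    """Return the names in left_cols that also appear in right_cols.

    The comparison is case-insensitive and the result keeps the
    order and spelling used in left_cols.
    """
    right_lower = {c.lower() for c in right_cols}
    return [c for c in left_cols if c.lower() in right_lower]


def common_columns_of_tables(spark, left_table, right_table):
    """Compare two tables by name, given an active SparkSession.

    DataFrame.columns is taken from the schema, so this should only
    need table metadata and not depend on how many rows each table has.
    """
    left = spark.table(left_table).columns
    right = spark.table(right_table).columns
    return common_columns(left, right)
```

In a Databricks notebook, where `spark` is already defined, this could be called as `common_columns_of_tables(spark, "db.table_a", "db.table_b")` with your own table names. If you prefer to stay in Spark SQL, `SHOW COLUMNS IN <table>` returns the same names in a `col_name` column, which can be fed to `common_columns` in the same way.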

    Hope this helps. Please let me know if you have any further queries.


    Please consider hitting the Accept Answer and Up-Vote buttons. Accepted answers help the community as well.