List all the common columns between two large tables

Nilesh Patel 111 Reputation points
2022-09-09T09:32:33.19+00:00

How can I list all the columns that two large tables have in common using Spark SQL?

Note: Both tables have more than 1M records, and the number of columns and rows differs between the tables.

Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.

1 answer

  1. ShaikMaheer-MSFT 38,441 Reputation points Microsoft Employee
    2022-09-09T11:06:12.827+00:00

    Hi @Nilesh Patel ,

    Thank you for posting your query on the Microsoft Q&A platform.

    Could you please help me understand where the tables are stored? Also, is there a specific reason for using only Spark SQL here?

    You can also use PySpark to create a DataFrame for each table and then call printSchema() on it to see its schema. That way you can easily compare the column names.

    Hope this helps. Please let me know if you have any further queries.

    ----------

    Please consider hitting the Accept Answer and Up-Vote buttons. Accepted answers help the community as well.
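
Comparing printSchema() output by eye gets tedious for wide tables, so the comparison suggested in the answer can also be done programmatically. The following is a minimal sketch, not a confirmed solution from the thread. The helper names `common_columns` and `common_table_columns` and the sample column names are hypothetical. The matching is case-insensitive by default on the assumption that Spark's default setting (`spark.sql.caseSensitive=false`) is in effect. Only the table schemas are read, so the row counts of the tables should not matter.

```python
def common_columns(cols_a, cols_b, case_sensitive=False):
    """Return the names from cols_a that also appear in cols_b, in cols_a order."""
    # Normalise names to lower case unless exact matching is requested.
    key = (lambda c: c) if case_sensitive else str.lower
    seen_b = {key(c) for c in cols_b}
    return [c for c in cols_a if key(c) in seen_b]


def common_table_columns(spark, table_a, table_b):
    """Hypothetical wrapper: compare two tables by name using an existing SparkSession."""
    # DataFrame.columns is derived from the schema only; no rows are scanned.
    return common_columns(spark.table(table_a).columns,
                          spark.table(table_b).columns)


# Plain lists standing in for df.columns (made-up column names).
orders_cols = ["order_id", "customer_id", "amount", "created_at"]
customers_cols = ["Customer_ID", "name", "created_at"]
print(common_columns(orders_cols, customers_cols))
# ['customer_id', 'created_at']
```

In a Databricks notebook, where a `spark` session is already defined, this could be called as `common_table_columns(spark, "table_a", "table_b")` with your own table names. If you need to stay in Spark SQL, `SHOW COLUMNS IN <table>` returns the column names of a single table, and the two results can then be intersected.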

