How to connect data from one cluster to another cluster in Databricks SQL

Agrani Nimshani 0 Reputation points
2023-04-03T06:38:39.8233333+00:00

I need to load a table from one cluster to another cluster.

Azure SQL Database
Azure Databricks
Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.
2,259 questions
Azure Database for MySQL
Azure Database for MySQL
An Azure managed MySQL database service for app development and deployment.
862 questions
Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
10,967 questions
{count} votes

1 answer

Sort by: Most helpful
  1. Susheel Bhatt 346 Reputation points
    2023-04-03T07:43:58.7066667+00:00

    To connect data from one Azure Databricks cluster to another Azure Databricks cluster in Databricks SQL, you can use the following steps. But keep in mind that you need to ensure these two clusters are connected to the same Azure Data Lake Storage account, and that the appropriate permissions are set for the two clusters to access the same data in the storage account.

    • Connect to the source cluster and create an external table to represent the source data. You can use the CREATE TABLE statement with the USING clause to create the external table, and specify the location of the data in the source cluster. For example:
    CREATE TABLE source_table
    USING parquet
    OPTIONS (
      'path' 'dbfs:/mnt/source-data'
    )
    
    • Connect to the target cluster and create a database to hold the imported data. You can use the CREATE DATABASE statement to create the database. For example:
    CREATE DATABASE target_db;
    
    • Create a table in the target cluster to hold the imported data. You can use the CREATE TABLE statement with the USING clause to create the table, and specify the location where the data will be imported. For example:
    CREATE TABLE target_table
    USING parquet
    OPTIONS (
      'path' 'dbfs:/mnt/target-data'
    )
    
    • In the target cluster, use the INSERT INTO statement to import the data from the external table in the source cluster into the table in the target cluster. You can use a subquery to select the data from the external table. For example:
    INSERT INTO target_table SELECT * FROM source_table
    
    0 comments No comments

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.