ADF pipeline to read data from a UC table into an ADLS Gen2 account

Ashwini Gaikwad 130 Reputation points
2024-04-11T19:05:05.9733333+00:00

Hello Team,

We have a requirement to create an Azure Data Factory pipeline that reads data from a Unity Catalog (UC) table and copies it into ADLS Gen2. Access on the table has been granted to the Azure Data Factory managed identity. Is there a way, or an article, to implement this? I can create a linked service for ADLS Gen2, but how do I read the data from the UC table in ADF? Please let me know the ways to implement this requirement.

Kind Regards,

Ashwini G

Tags: Azure Data Lake Storage, Azure Databricks, Azure Data Factory

Accepted answer
  PRADEEPCHEEKATLA-MSFT 89,296 Reputation points · Microsoft Employee
    2024-04-12T07:42:34.7466667+00:00

    @Ashwini Gaikwad - Thanks for the question and for using the MS Q&A platform.

    To read data from a UC table and copy it into ADLS Gen2 using Azure Data Factory, you can follow these steps:

    • Create a linked service for the UC table in Azure Data Factory. Use the Azure Databricks Delta Lake linked service to connect to your Databricks workspace, supplying the workspace URL, a cluster ID, and authentication details. Since access on the table has been granted to the Data Factory managed identity, managed identity authentication is the natural fit here (see the first sketch after this list).
    • Create a dataset for the UC table in Azure Data Factory. Use the Azure Databricks Delta Lake dataset to read data from the UC table, supplying the schema (database) and table name; for a Unity Catalog table the catalog name comes into play as well.
    • Create a linked service for the ADLS Gen2 account in Azure Data Factory. Use the Azure Data Lake Storage Gen2 linked service (rather than the Azure Blob Storage one) to connect to your account, providing the storage endpoint URL and authentication details (see the second sketch after this list).
    • Create a dataset for the ADLS Gen2 account in Azure Data Factory, choosing the output format (for example Parquet or delimited text) and providing the file system (container), folder path, and any other required information.
    • Create a pipeline in Azure Data Factory with a Copy activity that uses the UC table dataset as the source and the ADLS Gen2 dataset as the sink, and configure column mapping if the default mapping is not sufficient (see the pipeline sketch after this list).

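    As an illustration, here is a minimal sketch of the source-side definitions. It assumes managed identity authentication (the workspaceResourceId property), since access on the table was granted to the Data Factory managed identity, and an existing Databricks cluster for the connector to run on. All names and placeholder values (LS_DatabricksDeltaLake, DS_UcDeltaTable, and so on) are hypothetical:

    ```json
    {
        "name": "LS_DatabricksDeltaLake",
        "properties": {
            "type": "AzureDatabricksDeltaLake",
            "typeProperties": {
                "domain": "https://adb-<workspace-id>.<n>.azuredatabricks.net",
                "clusterId": "<existing-cluster-id>",
                "workspaceResourceId": "/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Databricks/workspaces/<workspace-name>"
            }
        }
    }
    ```

    ```json
    {
        "name": "DS_UcDeltaTable",
        "properties": {
            "type": "AzureDatabricksDeltaLakeDataset",
            "linkedServiceName": {
                "referenceName": "LS_DatabricksDeltaLake",
                "type": "LinkedServiceReference"
            },
            "typeProperties": {
                "database": "<catalog>.<schema>",
                "table": "<table-name>"
            }
        }
    }
    ```

    Note that the dataset exposes only database and table properties; supplying the catalog-qualified schema name in the database property is one approach for Unity Catalog tables, but do verify this against the connector documentation for your setup.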
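    Similarly, a minimal sketch of the sink-side definitions (again with hypothetical names), assuming the Data Factory managed identity has been granted an appropriate role such as Storage Blob Data Contributor on the storage account, and Parquet as the output format:

    ```json
    {
        "name": "LS_AdlsGen2",
        "properties": {
            "type": "AzureBlobFS",
            "typeProperties": {
                "url": "https://<storage-account>.dfs.core.windows.net"
            }
        }
    }
    ```

    ```json
    {
        "name": "DS_AdlsGen2Parquet",
        "properties": {
            "type": "Parquet",
            "linkedServiceName": {
                "referenceName": "LS_AdlsGen2",
                "type": "LinkedServiceReference"
            },
            "typeProperties": {
                "location": {
                    "type": "AzureBlobFSLocation",
                    "fileSystem": "<container>",
                    "folderPath": "uc-export"
                }
            }
        }
    }
    ```

    With the ADLS Gen2 (AzureBlobFS) linked service, omitting explicit credentials as above makes the service authenticate with the factory's system-assigned managed identity.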
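    Finally, a sketch of the pipeline wiring the two together with a Copy activity. Direct copy from Delta Lake into Parquet on ADLS Gen2 is supported when the connector's prerequisites are met; otherwise the service requires a staging linked service, so check the Azure Databricks Delta Lake connector documentation for your scenario:

    ```json
    {
        "name": "PL_CopyUcTableToAdls",
        "properties": {
            "activities": [
                {
                    "name": "CopyUcTable",
                    "type": "Copy",
                    "inputs": [
                        { "referenceName": "DS_UcDeltaTable", "type": "DatasetReference" }
                    ],
                    "outputs": [
                        { "referenceName": "DS_AdlsGen2Parquet", "type": "DatasetReference" }
                    ],
                    "typeProperties": {
                        "source": {
                            "type": "AzureDatabricksDeltaLakeSource",
                            "exportSettings": { "type": "AzureDatabricksDeltaLakeExportCommand" }
                        },
                        "sink": {
                            "type": "ParquetSink",
                            "storeSettings": { "type": "AzureBlobFSWriteSettings" }
                        }
                    }
                }
            ]
        }
    }
    ```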
    Regarding your second question: yes, you can reuse the same Azure Databricks Delta Lake linked service to read all the Delta-format tables under a specific catalog and schema. Rather than creating one dataset per table, parameterize the dataset's table name and drive it from a ForEach activity over the list of tables, so that a single dataset serves every table.
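
    A minimal sketch of such a parameterized dataset (the tableName parameter and the dataset name are hypothetical); a ForEach activity can then invoke the Copy activity once per table, passing each table name down to the dataset:

    ```json
    {
        "name": "DS_UcDeltaTableParam",
        "properties": {
            "type": "AzureDatabricksDeltaLakeDataset",
            "linkedServiceName": {
                "referenceName": "LS_DatabricksDeltaLake",
                "type": "LinkedServiceReference"
            },
            "parameters": {
                "tableName": { "type": "string" }
            },
            "typeProperties": {
                "database": "<catalog>.<schema>",
                "table": {
                    "value": "@dataset().tableName",
                    "type": "Expression"
                }
            }
        }
    }
    ```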

    Hope this helps. Do let us know if you have any further queries.


    If this answers your query, do click Accept Answer and Yes for "was this answer helpful". And, if you have any further queries, do let us know.

