To load data from Cosmos DB into Azure Synapse Analytics using Azure Data Factory (ADF), follow these steps:
Step 1: Set Up Azure Synapse Workspace
- Go to your Azure Synapse Analytics workspace.
- Ensure you have set up your dedicated SQL pool or serverless SQL pool.
Step 2: Create a Linked Service to Cosmos DB in Azure Data Factory
- Navigate to Azure Data Factory and create or open your pipeline.
- Under Manage on the left pane, select Linked Services.
- Click New and search for Cosmos DB.
- Select Azure Cosmos DB (SQL API) and configure the connection:
- Set the Account Endpoint and Account Key.
- Test the connection and click Create.
Step 3: Create a Linked Service to Synapse Analytics in Azure Data Factory
- In the same Linked Services section, click New and search for Azure Synapse Analytics.
- Choose the appropriate connection (dedicated SQL pool or serverless SQL pool).
- Enter the connection details:
- SQL pool name, server, and database.
- Provide username and password.
- Test the connection and click Create.
Step 4: Set Up a Data Flow or Copy Activity in the Pipeline
- Go to the Author tab and create a new Pipeline.
- Add a Copy Data Activity (or a Data Flow if transformation is needed).
- Configure the Source:
- Select Azure Cosmos DB as the source dataset.
- Choose the container or table you want to load from Cosmos DB.
- Configure the Sink:
- Select Azure Synapse Analytics as the sink dataset.
- Map the fields from Cosmos DB to the corresponding columns in Synapse Analytics.
Step 5: Monitor and Execute the Pipeline
- Once the pipeline is configured, click Debug to test or Trigger Now to run the pipeline.
- Monitor the progress under the Monitor tab in Azure Data Factory.
Let me know if you'd like any specific part of the process elaborated on.