Connect to Azure Databricks from Microsoft Power Platform

Important

This feature is in Public Preview.

This page explains how to connect to Azure Databricks from Microsoft Power Platform by adding Azure Databricks as a data connection. When connected, you can use your Azure Databricks data from the following platforms:

  • Power Apps: Build applications that can read from and write to Azure Databricks, while preserving your Azure Databricks governance controls.
  • Power Automate: Build flows with actions that run custom SQL statements and return the results.
  • Copilot Studio: Build custom agents using your Azure Databricks data as a knowledge source.

Before you begin

Before you connect to Azure Databricks from Power Platform, you must meet the following requirements:

  • You have a Microsoft Entra ID (formerly Azure Active Directory) account.
  • You have a premium Power Apps license.
  • You have an Azure Databricks account.
  • You have access to a SQL warehouse in Azure Databricks.

Optional: Connect with Azure Virtual Networks

If your Azure Databricks workspace uses Virtual Networks, there are two ways to connect:

  • Integrate Power Platform with resources inside your virtual network without exposing them over the public internet. To connect to the private endpoint of your Azure Databricks workspace, do the following after you configure private connectivity to Azure Databricks:

    1. Set up Virtual Network support for Power Platform.
    2. If your Power Platform virtual network (whether Primary or Secondary) is different from your Azure Databricks virtual network, use virtual network peering to connect it to the Azure Databricks virtual network.

    For more information about virtual networks, see Virtual Network support overview.

  • Enable access with a hybrid deployment, where a front-end Private Link connection coexists with a public endpoint protected by a workspace IP access list. To enable access, do the following:

    1. Enable public access at the workspace level. For more details, see Configure IP access lists for workspaces.
    2. Add the AzureConnectors IP range, or the Power Platform IP range specific to your environment’s region, to your workspace IP access list.

Optional: Create a Microsoft Entra Service Principal

Important

If Azure Databricks and Power Platform are in different tenants, you must use a service principal for authentication.

Before connecting, create a Microsoft Entra service principal, then set it up and assign it to your Azure Databricks account or workspace.

Step 1: Add an Azure Databricks connection to Power Platform

Note

If you're using Copilot Studio, the Azure Databricks connection must first be created in Power Apps or Power Automate. Then it can be used in Copilot Studio.

To add an Azure Databricks connection, do the following:

  1. In Power Apps or Power Automate, from the sidebar, click Connections.

  2. Click + New connection in the upper-left corner.

  3. Search for "Azure Databricks" using the search bar in the upper-right.

  4. Select the Azure Databricks tile.

  5. Select your Authentication type from the drop-down menu.

  6. Select your authentication method and enter your authentication information.

    • If your Power Platform deployment and Azure Databricks account are in the same Microsoft Entra tenant, you can use an OAuth connection. Enter the following information:

      • For Server Hostname, enter the Azure Databricks SQL warehouse hostname.
      • For HTTP Path, enter the SQL warehouse HTTP path.
      • Click Create.
      • Sign in with your Microsoft Entra ID.
    • A service principal connection can be used in any scenario. Before connecting, create a Microsoft Entra service principal. Enter the following information:

      • For Client ID, enter the service principal ID.
      • For Client Secret, enter the service principal secret.
      • For Tenant, enter the service principal tenant.
      • For Hostname, enter the Azure Databricks SQL warehouse hostname.
      • For HTTP Path, enter the SQL warehouse HTTP path.
      • (Optional) After the connection is created, you can rename it or share it with your team members.
    • To find your Azure Databricks SQL warehouse connection details, see Get connection details for an Azure Databricks compute resource. Example value formats are shown after these steps.

  7. Click Create.
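
For reference, the connection values use the following formats. These are illustrative placeholder values, not real workspace details:

Server Hostname: adb-1234567890123456.7.azuredatabricks.net
HTTP Path: /sql/1.0/warehouses/1234567890abcdef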

Step 2: Use the Azure Databricks connection

After you create an Azure Databricks connection in Power Apps or Power Automate, you can use your Azure Databricks data to build Power canvas apps, Power Automate flows, and Copilot Studio agents.

Use your Azure Databricks data to build Power canvas apps

Important

You can create only canvas apps in Power Apps because Azure Databricks tables can't be saved as Dataverse tables.

To add your Azure Databricks data to your application, do the following:

  1. From the leftmost navigation bar, click Create.
  2. Click Start with a blank canvas and select your desired canvas size to create a new canvas app.
  3. From your application, click Add data > Connectors > Azure Databricks. Select the Azure Databricks connection you created.
  4. Select a catalog from the Choose a dataset sidebar.
  5. From the Choose a dataset sidebar, select all the tables you want to connect your canvas app to.
  6. Click Connect.

Data operations in Power Apps

The connector supports create, update, and delete operations, but only for tables that have a primary key defined. When you perform create operations, you must specify the primary key value unless the key is a generated identity column, as described below.

Azure Databricks supports generated identity columns. In this case, primary key values are automatically generated on the server during row creation and cannot be manually specified.
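
As a minimal sketch, the following hypothetical Unity Catalog table (catalog, schema, table, and column names are placeholders) accepts create, update, and delete operations from Power Apps because it defines a primary key:

-- Hypothetical table; Power Apps create, update, and delete operations require the primary key.
CREATE TABLE main.sales.customers (
  customer_id   BIGINT NOT NULL,   -- for server-generated keys, use: BIGINT GENERATED ALWAYS AS IDENTITY
  customer_name STRING,
  email         STRING,
  CONSTRAINT customers_pk PRIMARY KEY (customer_id)
);

With the NOT NULL key shown, create operations must supply customer_id. With a generated identity key, Azure Databricks assigns the value and the create operation must not supply it.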

Use your Azure Databricks data to build Power Automate flows

You can write SQL statements in Power Automate because the connector exposes the Azure Databricks Statement Execution API. To create a Power Automate flow using Azure Databricks as an action, do the following:

  1. From the leftmost navigation bar, click Create.

  2. Create a flow and add any trigger type.

  3. From your new flow, click + and search for "Databricks" to see the available actions.

  4. Select your desired action from the four available options:

    • Execute a SQL Statement: Write and run a SQL statement. Enter the following:

      • For Body/warehouse_id, enter the ID of the warehouse upon which to execute the SQL statement.
      • For Body/statement, enter the SQL statement to execute.

      For more about the advanced parameters, see here. A sample statement is shown after this list.

    • Check status and get results: Check the status of a SQL statement and gather results. Enter the following:

      • For Statement ID, enter the ID returned when the SQL statement was executed.

      For more about the parameter, see here.

    • Cancel the execution of a statement: Terminate execution of a SQL statement. Enter the following:

      • For Statement ID, enter the ID of the SQL statement to terminate.

      For more about the parameter, see here.

    • Get result by chunk index: Get results by chunk index, which is suitable for large result sets. Enter the following:

      • For Statement ID, enter the ID of the SQL statement whose results you want to retrieve.
      • For Chunk index, enter the target chunk index.

      For more about the parameters, see here.
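
For illustration, the Execute a SQL Statement action accepts any statement your SQL warehouse can run. The following query is a hypothetical example (the table name is a placeholder):

-- Hypothetical query for the Execute a SQL Statement action;
-- returns the last seven days of orders from a placeholder table.
SELECT order_id, order_date, total_amount
FROM main.sales.orders
WHERE order_date >= current_date() - INTERVAL 7 DAYS
ORDER BY order_date DESC
LIMIT 100;

The statement ID returned by this action can be passed to the Check status and get results, Cancel the execution of a statement, and Get result by chunk index actions.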

Use Azure Databricks as a knowledge source in Copilot Studio

To add your Azure Databricks data as a knowledge source to a Copilot Studio agent, do the following:

  1. From the sidebar, click Agent.

  2. Select an existing agent or create a new agent by clicking + New agent.

    • Describe the agent by entering a message, and then click Create.
    • Or, click Skip to enter the agent’s information manually.
  3. In the Knowledge tab, click + Knowledge.

  4. Click Advanced.

  5. Select Azure Databricks as the knowledge source.

  6. Enter the name of the catalog that contains your data.

  7. Click Connect.

  8. Select the tables you want your agent to use as a knowledge source and click Add.

Conduct batch updates

If you need to perform bulk create, update, or delete operations in response to Power Apps inputs, Databricks recommends implementing a Power Automate flow. To accomplish this, do the following:

  1. Create a canvas app using your Azure Databricks connection in Power Apps.

  2. Create a Power Automate flow using the Azure Databricks connection and use Power Apps as the trigger.

  3. In the Power Automate trigger, add the input fields that you want to pass from Power Apps to Power Automate.

  4. Create a collection object within Power Apps to collect all of your changes.

  5. Add the Power Automate flow to your canvas app.

  6. Call the Power Automate flow from your canvas app and iterate over the collection using a ForAll command.

    ForAll(collectionName, FlowName.Run(inputField1, inputField2, inputField3, …))
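
Inside the flow, the Azure Databricks action then runs a statement built from the trigger inputs. A minimal sketch, assuming hypothetical table and column names, where the literal values would be replaced by dynamic content from the Power Apps trigger:

-- Hypothetical statement for the flow's Execute a SQL Statement action;
-- replace the literal values with dynamic content from the Power Apps trigger.
UPDATE main.sales.customers
SET customer_name = 'New Name',
    email = 'new.name@example.com'
WHERE customer_id = 1042;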
    

Concurrent writes

Row-level concurrency reduces conflicts between concurrent write operations by detecting changes at the row level and automatically resolving conflicts that occur when concurrent writes update or delete different rows in the same data file.

Row-level concurrency is included in Databricks Runtime 14.2 or above. Row-level concurrency is supported by default for the following types of tables:

  • Tables with deletion vectors enabled and without partitioning
  • Tables with liquid clustering, unless deletion vectors are disabled
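
For example, a hypothetical table created with liquid clustering (placeholder names) gets row-level concurrency by default, provided deletion vectors aren't disabled:

-- Hypothetical liquid clustered table; row-level concurrency applies by default.
CREATE TABLE main.sales.orders (
  order_id     BIGINT,
  order_date   DATE,
  total_amount DECIMAL(10, 2)
)
CLUSTER BY (order_date);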

To enable deletion vectors, run the following SQL command.

ALTER TABLE table_name SET TBLPROPERTIES ('delta.enableDeletionVectors' = true);

For more information about concurrent write conflicts in Azure Databricks, see Isolation levels and write conflicts on Azure Databricks.

Limitations

  • The Power Platform connector does not support government clouds.

Power Apps limitations

The following PowerFx formulas calculate values using only the data that has been retrieved locally:

  • Table functions: GroupBy, Distinct
  • Aggregation: CountRows, StdevP, StdevS