Azure Databricks (Preview)

Azure Databricks offers a unified platform for scalable data management, governance, and analytics, combining streamlined workflows with the ability to handle diverse data types efficiently.
This connector is available in the following products and regions:
Service | Class | Regions |
---|---|---|
Copilot Studio | Premium | All Power Automate regions except the following: - US Government (GCC) - US Government (GCC High) - China Cloud operated by 21Vianet - US Department of Defense (DoD) |
Power Apps | Premium | All Power Apps regions except the following: - US Government (GCC) - US Government (GCC High) - China Cloud operated by 21Vianet - US Department of Defense (DoD) |
Power Automate | Premium | All Power Automate regions except the following: - US Government (GCC) - US Government (GCC High) - China Cloud operated by 21Vianet - US Department of Defense (DoD) |
Contact | |
---|---|
Name | Databricks Support |
URL | https://help.databricks.com |
Email | eng-partner-eco-help@databricks.com |
Connector Metadata | |
---|---|
Publisher | Databricks Inc. |
Website | https://www.databricks.com/ |
Privacy policy | https://www.databricks.com/legal/privacynotice |
Categories | Data |
Before you begin
Before you connect to Azure Databricks from Power Platform, you must meet the following requirements:
- You need a Microsoft Entra ID (formerly Azure Active Directory) account to sign in.
- You need a premium Power Apps license.
- You need an Azure Databricks account.
- You need access to a SQL warehouse in Azure Databricks.
Optional: Virtual Network support
If your Azure Databricks workspace uses a virtual network, there are two ways to connect. If this doesn't apply to your environment, skip this section.
Integrate Power Platform with resources inside your virtual network without exposing them over the public internet. To connect to the private endpoint of your Azure Databricks workspace, do the following after you configure private connectivity to Azure Databricks:
Set up Virtual Network support for Power Platform.
Note: If your Power Platform virtual network (whether primary or secondary) is different from your Azure Databricks virtual network, you need to use virtual network peering to connect it with the Azure Databricks virtual network.
For more information about virtual networks, see Virtual Network support overview.
Alternatively, enable access with a hybrid deployment, where a front-end Private Link connection with a public endpoint is protected by a workspace IP access list. To enable access, do the following:
- Enable public access at workspace level. For more details, see Configure IP access lists for workspaces.
- Add the AzureConnectors IP range, or the specific Power Platform IP range for your environment's region, to your workspace IP access list, as shown in the sketch below.
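If you prefer to manage the workspace IP access list programmatically, the Databricks REST API exposes endpoints for it. The following Python sketch is a minimal, hedged example: the workspace URL, token, label, and CIDR range are placeholders, and you should substitute the AzureConnectors or Power Platform ranges for your region and verify the request shapes against the current Databricks IP access list documentation.

```python
# Minimal sketch: enable IP access list enforcement and add an ALLOW entry
# via the Databricks workspace REST API. The workspace URL, token, label,
# and IP range below are placeholders for illustration only.
import requests

WORKSPACE_URL = "https://adb-1234567890123456.7.azuredatabricks.net"  # placeholder
HEADERS = {"Authorization": "Bearer <workspace-admin-token>"}          # placeholder

# 1. Turn on IP access list enforcement for the workspace.
requests.patch(
    f"{WORKSPACE_URL}/api/2.0/workspace-conf",
    headers=HEADERS,
    json={"enableIpAccessLists": "true"},
).raise_for_status()

# 2. Allow the connector IP ranges (replace with the AzureConnectors or
#    Power Platform ranges published for your environment's region).
requests.post(
    f"{WORKSPACE_URL}/api/2.0/ip-access-lists",
    headers=HEADERS,
    json={
        "label": "power-platform-connectors",   # placeholder label
        "list_type": "ALLOW",
        "ip_addresses": ["203.0.113.0/24"],      # placeholder CIDR
    },
).raise_for_status()
```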
Optional: Create a Microsoft Entra Service Principal
Before connecting, complete the following steps to create, set up, and assign a Microsoft Entra Service Principal to your Azure Databricks account or workspace:
- Register a new service principal in Microsoft Entra ID.
- Add service principals to your account.
- Assign a service principal to a workspace.
Step 1: Create a connection to Azure Databricks
Note: If you're only using Copilot Studio, the Azure Databricks connection must be created in Power Apps or Power Automate. Then it can be used in Copilot Studio.
To add an Azure Databricks connection, do the following:
- In Power Apps, from the sidebar, click Connections.
- Click + New connection in the upper-right corner.
- Search for "Azure Databricks" and select the Azure Databricks tile.
- Select your authentication type from the drop-down menu.
- Select your authentication method and enter your authentication information.
- If your Power Platform deployment and Azure Databricks account are in the same Microsoft Entra tenant, you can use an OAuth connection. Enter the following information:
- For Server Hostname, enter the Azure Databricks SQL warehouse hostname. Example:
adb-3980263885549757139.2.azuredatabricks.net
- For HTTP Path, enter the SQL warehouse HTTP path. Example:
/sql/1.0/warehouses/a9c4e781bd29f315
- Click Create.
- Sign in with your Microsoft Entra ID.
- A service principal connection can be used in any scenario. Before connecting, create a Microsoft Entra service principal. Enter the following information:
- For Client ID, enter the service principal ID.
- For Client Secret, enter the service principal secret.
- For Tenant, enter the service principal tenant.
- For Server Hostname, enter the Azure Databricks SQL warehouse hostname.
- For HTTP Path, enter the SQL warehouse HTTP path.
- Optional: After you create the connection, you can rename or share the service principal connection with your team members.
- To find your Azure Databricks SQL warehouse connection details, see Get connection details for an Azure Databricks compute resource, or use the sketch after this list.
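As an alternative to looking these values up in the workspace UI, the hedged Python sketch below lists SQL warehouses through the Databricks REST API and derives each warehouse's HTTP path from its ID, following the /sql/1.0/warehouses/[warehouse_id] pattern shown above. The workspace URL and token are placeholders.

```python
# Minimal sketch: list SQL warehouses and print the connection details the
# connector needs (Server Hostname and HTTP Path).
# WORKSPACE_URL and the token are placeholders for illustration only.
import requests

WORKSPACE_URL = "https://adb-1234567890123456.7.azuredatabricks.net"  # placeholder
HEADERS = {"Authorization": "Bearer <personal-access-token>"}          # placeholder

resp = requests.get(f"{WORKSPACE_URL}/api/2.0/sql/warehouses", headers=HEADERS)
resp.raise_for_status()

# The SQL warehouse Server Hostname is the workspace hostname.
hostname = WORKSPACE_URL.removeprefix("https://")
for wh in resp.json().get("warehouses", []):
    # The connector's HTTP Path follows the /sql/1.0/warehouses/<warehouse_id> pattern.
    print(wh["name"], hostname, f"/sql/1.0/warehouses/{wh['id']}")
```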
Step 2: Use the Azure Databricks connection
Use Azure Databricks to build Power Apps
To add data in Azure Databricks to your canvas app, do the following:
- From the leftmost navigation bar, click Create.
- Click Start with a blank canvas and select your desired canvas size to create a new canvas app.
- From your application, click Add data > Connectors > Azure Databricks. Select the Azure Databricks connection you created.
- Select a catalog from the Choose a dataset sidebar.
- From the Choose a dataset sidebar, select all the tables you want to connect your app to.
- Click Connect.
Use Azure Databricks to build Power Automate flows
The Azure Databricks Statement Execution API is exposed within Power Automate, allowing you to run SQL statements from your flows (a REST-level sketch of these actions follows the list below). To create a Power Automate flow that uses Azure Databricks as an action, do the following:
- From the leftmost navigation bar, click Create.
- Create a flow and add any trigger type.
- From your new flow, click + and search for Azure Databricks to see the available actions.
- Select your desired action from the four available options:
Execute a SQL Statement: Write and run a SQL statement. For more information, see here.
Note: The warehouse_id is the segment that appears after the final slash in your warehouse's HTTP path. For example, in the path /sql/1.0/warehouses/[warehouse_id], the value [warehouse_id] represents your warehouse ID.
Check status and get results: Check the status of a SQL statement and gather results. For more information, see here.
Cancel the execution of a statement: Terminate execution of a SQL statement. For more information, see here.
Get result by chunk index: Get results by chunk index, which is suitable for large result sets. For more information, see here.
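These actions correspond to the Databricks SQL Statement Execution REST API. For orientation, the hedged Python sketch below performs the same round trip directly against that API (execute a parameterized statement, poll its status, fetch a result chunk). The workspace URL, token, HTTP path, and table name are placeholders, not values from this article; the request options are described in the action parameters later in this article.

```python
# Minimal sketch of the REST calls behind the four connector actions.
# WORKSPACE_URL, the token, the HTTP path, and the table name are placeholders.
import time
import requests

WORKSPACE_URL = "https://adb-1234567890123456.7.azuredatabricks.net"  # placeholder
HEADERS = {"Authorization": "Bearer <token>"}                          # placeholder

http_path = "/sql/1.0/warehouses/a9c4e781bd29f315"     # placeholder HTTP path
warehouse_id = http_path.rsplit("/", 1)[-1]            # segment after the final slash

# "Execute a SQL statement": submit a parameterized statement.
exec_resp = requests.post(
    f"{WORKSPACE_URL}/api/2.0/sql/statements/",
    headers=HEADERS,
    json={
        "warehouse_id": warehouse_id,
        "statement": "SELECT * FROM main.demo.orders WHERE order_date >= :start_date",
        "parameters": [{"name": "start_date", "value": "2024-01-01", "type": "DATE"}],
        "wait_timeout": "30s",
        "on_wait_timeout": "CONTINUE",
    },
).json()
statement_id = exec_resp["statement_id"]

# "Check status and get results": poll until the statement reaches a terminal state.
while exec_resp["status"]["state"] in ("PENDING", "RUNNING"):
    time.sleep(2)
    exec_resp = requests.get(
        f"{WORKSPACE_URL}/api/2.0/sql/statements/{statement_id}",
        headers=HEADERS,
    ).json()

# "Get result by chunk index": fetch chunk 0 once the statement has SUCCEEDED.
if exec_resp["status"]["state"] == "SUCCEEDED":
    chunk = requests.get(
        f"{WORKSPACE_URL}/api/2.0/sql/statements/{statement_id}/result/chunks/0",
        headers=HEADERS,
    ).json()
    print(chunk.get("data_array"))

# "Cancel the execution of a statement" maps to:
#   POST /api/2.0/sql/statements/{statement_id}/cancel
```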
Use Azure Databricks as a knowledge source in Copilot Studio
To add data in Azure Databricks as a knowledge source to a Copilot Studio agent, do the following:
- From the sidebar, click Agent.
- Select an existing agent or create a new agent by clicking + New agent.
- When creating a new agent, you can optionally describe the agent by inputting a message. Click Create.
- Or, click Skip to manually specify the agent’s information.
- In the Knowledge tab, click + Knowledge.
- Click Advanced.
- Select Azure Databricks as the knowledge source.
- Enter the name of the catalog that contains your data.
- Click Connect.
- Select the tables you want your agent to use as a knowledge source and click Add.
Create/Update/Delete in Power Apps
The connector supports create, update, and delete operations, but only for tables that have a primary key defined. When performing create operations, you must always specify the primary key.
Note: Azure Databricks also supports generated identity columns. If an identity column is used as the primary key, its value is generated automatically on the server when a new row is created, and you do not (and cannot) specify a value for the primary key when creating a new record.
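As a hedged illustration, the Python sketch below creates a table whose primary key is a generated identity column by submitting DDL through the Statement Execution API. The workspace URL, token, warehouse ID, and catalog/schema/table names are placeholders, and primary key constraints require Unity Catalog.

```python
# Minimal sketch: create a table whose primary key is a generated identity
# column, so Power Apps create operations never supply the key value.
# The workspace URL, token, warehouse ID, and table name are placeholders.
import requests

WORKSPACE_URL = "https://adb-1234567890123456.7.azuredatabricks.net"  # placeholder
HEADERS = {"Authorization": "Bearer <token>"}                          # placeholder

ddl = """
CREATE TABLE main.demo.customers (
  customer_id BIGINT NOT NULL GENERATED ALWAYS AS IDENTITY,
  customer_name STRING,
  CONSTRAINT customers_pk PRIMARY KEY (customer_id)
)
"""

requests.post(
    f"{WORKSPACE_URL}/api/2.0/sql/statements/",
    headers=HEADERS,
    json={"warehouse_id": "a9c4e781bd29f315",  # placeholder warehouse ID
          "statement": ddl,
          "wait_timeout": "30s"},
).raise_for_status()
```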
Conduct batch updates
Azure Databricks does not support conducting concurrent updates on the same row.
When you need to perform bulk create, update, or delete operations in response to Power App inputs, the best practice is to implement a Power Automate flow. To accomplish this, do the following:
- Create your canvas Power App using your Azure Databricks connection.
- Create a Power Automate flow using the Azure Databricks connection and use Power Apps as the trigger.
- In the Power Automate trigger, add the input fields that you want to pass from Power Apps to Power Automate.
- Create a collection object within Power Apps to collect all of your changes.
- Add the Power Automate flow to your Power Apps.
- Call the Power Automate flow from your Power App and iterate over the collection using a ForAll command. For example:
ForAll(collectionName, FlowName.Run(input field 1, input field 2, input field 3, …))
Concurrent write conflicts
Row-level concurrency reduces conflicts between concurrent write operations by detecting changes at the row-level and automatically resolving conflicts that occur when concurrent writes update or delete different rows in the same data file.
Row-level concurrency is generally available on Databricks Runtime 14.2 and above. Row-level concurrency is supported by default for the following conditions:
- Tables with deletion vectors enabled and without partitioning.
- Tables with liquid clustering, unless you have disabled deletion vectors.
To enable deletion vectors, execute the following SQL:
-- Enable a feature using a table property and update the table protocol.
ALTER TABLE table_name SET TBLPROPERTIES ('delta.enableDeletionVectors' = true);
For more information about concurrent write conflicts in Azure Databricks, see here.
Limitations
- The Power Platform connector does not support government clouds.
Power App Limitations
The following PowerFx formulas perform calculations only on the local result set:
Category | Formula |
---|---|
Table function | - GroupBy - Distinct |
Aggregation | - CountRows - StdevP - StdevS |
Power Automate Limitations
- The data returned from a Power Automate query must not be null; if any value is null, Power Automate raises an error.
Creating a connection
The connector supports the following authentication types:
Name | Description | Applicable regions | Shareable |
---|---|---|---|
OAuth Connection | OAuth Connection | All regions | Not shareable |
Service Principal Connection | Service Principal Connection | All regions | Shareable |
Default [DEPRECATED] | This option is only for older connections without an explicit authentication type, and is only provided for backward compatibility. | All regions | Not shareable |
OAuth Connection
Auth ID: oauth2-auth
Applicable: All regions
OAuth Connection
This is not a shareable connection. If the power app is shared with another user, the other user will be prompted to create a new connection explicitly.
Name | Type | Description | Required |
---|---|---|---|
Server Hostname (Example: adb-3980263885549757139.2.azuredatabricks.net) | string | Server name of Databricks workspace | True |
HTTP Path (Example: /sql/1.0/warehouses/a9c4e781bd29f315) | string | HTTP Path of Databricks SQL Warehouse | True |
Service Principal Connection
Auth ID: oAuthClientCredentials
Applicable: All regions
Service Principal Connection
This is a shareable connection. If the power app is shared with another user, the connection is shared as well. For more information, see Connectors overview for canvas apps - Power Apps | Microsoft Docs.
Name | Type | Description | Required |
---|---|---|---|
Client ID | string | | True |
Client Secret | securestring | | True |
Tenant | string | | True |
Server Hostname (Example: adb-3980263885549757139.2.azuredatabricks.net) | string | Server name of Databricks workspace | True |
HTTP Path (Example: /sql/1.0/warehouses/a9c4e781bd29f315) | string | HTTP Path of Databricks SQL Warehouse | True |
Default [DEPRECATED]
Applicable: All regions
This option is only for older connections without an explicit authentication type, and is only provided for backward compatibility.
This is not a shareable connection. If the power app is shared with another user, the other user will be prompted to create a new connection explicitly.
Throttling Limits
Name | Calls | Renewal Period |
---|---|---|
API calls per connection | 100 | 60 seconds |
Actions
Action | Description |
---|---|
Cancel statement execution | Requests that an executing statement be canceled. Callers must poll for status to see the terminal state. |
Check status and get results | Get the status, manifest and results of the statement |
Execute a SQL statement | Execute a SQL statement and optionally await its results for a specified time. |
Get result by chunk index | After the statement execution has SUCCEEDED, this request can be used to fetch any chunk by index. |
Cancel statement execution
Requests that an executing statement be canceled. Callers must poll for status to see the terminal state.
Parameters
Name | Key | Required | Type | Description |
---|---|---|---|---|
Statement ID | statement_id | True | string | Statement ID |
Check status and get results
Get the status, manifest and results of the statement
Parameters
Name | Key | Required | Type | Description |
---|---|---|---|---|
Statement ID | statement_id | True | string | Statement ID |
Returns
Statement execution response
- Body
- SqlStatementResponse
Execute a SQL statement
Execute a SQL statement and optionally await its results for a specified time.
Parameters
Name | Key | Required | Type | Description |
---|---|---|---|---|
warehouse_id | warehouse_id | True | string | Target warehouse ID |
statement | statement | True | string | The SQL statement to execute. The statement can optionally be parameterized, see parameters |
name | name | True | string | Parameter marker name |
type | type | | string | Parameter data type |
value | value | | string | Parameter value |
catalog | catalog | | string | Default catalog for execution |
schema | schema | | string | Default schema for execution |
disposition | disposition | | string | Result fetching mode |
format | format | | string | Result set format |
on_wait_timeout | on_wait_timeout | | string | Action on timeout |
wait_timeout | wait_timeout | | string | Result wait timeout |
byte_limit | byte_limit | | integer | Result byte limit |
row_limit | row_limit | | integer | Result row limit |
Returns
Statement execution response
- Body
- SqlStatementResponse
Get result by chunk index
After the statement execution has SUCCEEDED, this request can be used to fetch any chunk by index.
Parameters
Name | Key | Required | Type | Description |
---|---|---|---|---|
Statement ID | statement_id | True | string | Statement ID |
Chunk index | chunk_index | True | string | Chunk index |
Returns
- Body
- SqlResultData
Definitions
SqlBaseChunkInfo
Metadata for a result set chunk
Name | Path | Type | Description |
---|---|---|---|
byte_count | byte_count | integer | Number of bytes in the result chunk |
chunk_index | chunk_index | integer | Position in the sequence of result set chunks |
row_count | row_count | integer | Number of rows in the result chunk |
row_offset | row_offset | integer | Starting row offset in the result set |
SqlColumnInfo
Name | Path | Type | Description |
---|---|---|---|
name | name | string | Column name |
position | position | integer | Column position (0-based) |
type_interval_type | type_interval_type | string | Interval type format |
type_name | type_name | SqlColumnInfoTypeName | The name of the base data type. This doesn't include details for complex types such as STRUCT, MAP or ARRAY. |
type_precision | type_precision | integer | Number of digits for DECIMAL type |
type_scale | type_scale | integer | Number of decimal places for DECIMAL type |
type_text | type_text | string | Full SQL type specification |
SqlColumnInfoTypeName
The name of the base data type. This doesn't include details for complex types such as STRUCT, MAP or ARRAY.
SqlStatementResponse
Statement execution response
Name | Path | Type | Description |
---|---|---|---|
manifest | manifest | SqlResultManifest | Result set schema and metadata |
result | result | SqlResultData | |
statement_id | statement_id | string | Statement ID |
status | status | SqlStatementStatus | Statement execution status |
SqlResultManifest
Result set schema and metadata
Name | Path | Type | Description |
---|---|---|---|
chunks | chunks | array of SqlBaseChunkInfo | Result chunk metadata |
format | format | string | |
schema | schema | SqlResultSchema | Result set column definitions |
total_byte_count | total_byte_count | integer | Total bytes in result set |
total_chunk_count | total_chunk_count | integer | Total number of chunks |
total_row_count | total_row_count | integer | Total number of rows |
truncated | truncated | boolean | Result truncation status |
SqlStatementStatus
Statement execution status
Name | Path | Type | Description |
---|---|---|---|
error | error | SqlServiceError | |
state | state | SqlStatementState | Statement execution state |
SqlStatementState
SqlServiceError
Name | Path | Type | Description |
---|---|---|---|
error_code | error_code | string | |
message | message | string | Error message |
SqlResultSchema
Result set column definitions
Name | Path | Type | Description |
---|---|---|---|
column_count | column_count | integer | |
columns | columns | array of SqlColumnInfo | |
SqlResultData
Name | Path | Type | Description |
---|---|---|---|
byte_count | byte_count | integer | Bytes in result chunk |
chunk_index | chunk_index | integer | Chunk position |
data_array | data_array | SqlJsonArray | Array of arrays with string values |
external_links | external_links | array of SqlExternalLink | |
next_chunk_index | next_chunk_index | integer | Next chunk index |
next_chunk_internal_link | next_chunk_internal_link | string | Next chunk link |
row_count | row_count | integer | Rows in chunk |
row_offset | row_offset | integer | Starting row offset |
SqlJsonArray
Array of arrays with string values
Name | Path | Type | Description |
---|---|---|---|
Items | | array of | |
SqlExternalLink
Name | Path | Type | Description |
---|---|---|---|
byte_count | byte_count | integer | Bytes in chunk |
chunk_index | chunk_index | integer | Chunk position |
expiration | expiration | date-time | Link expiration time |
external_link | external_link | string | |
http_headers | http_headers | object | Required HTTP headers |
next_chunk_index | next_chunk_index | integer | Next chunk index |
next_chunk_internal_link | next_chunk_internal_link | string | Next chunk link |
row_count | row_count | integer | Rows in chunk |
row_offset | row_offset | integer | Starting row offset |