This page describes how the SQL Server connector works, including its core concepts.
How does Azure Databricks connect to SQL Server?
Azure Databricks connects to SQL Server over a JDBC connection secured with Transport Layer Security (TLS). Credentials are stored securely in Unity Catalog and can only be retrieved if the user running the ingestion flow has the appropriate permissions. Databricks recommends creating a separate SQL Server user for data ingestion. If there are databases or tables you don't want to expose to this user, you can restrict access using built-in SQL Server permissions.
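As a minimal sketch of the recommended setup, the following T-SQL creates a dedicated ingestion user and limits it to read access on a single schema. The names (`ingestion_user`, `SalesDB`, the `dbo` schema) are placeholders for illustration, not values the connector requires.

```sql
-- Placeholder names: replace ingestion_user and SalesDB with your own.
-- Create a dedicated login and a database user for ingestion.
CREATE LOGIN ingestion_user WITH PASSWORD = '<strong-password>';
USE SalesDB;
CREATE USER ingestion_user FOR LOGIN ingestion_user;

-- Grant read access only on the schema you want to expose.
GRANT SELECT ON SCHEMA::dbo TO ingestion_user;

-- Reading change tracking data additionally requires this permission.
GRANT VIEW CHANGE TRACKING ON SCHEMA::dbo TO ingestion_user;
```

Tables and schemas that are not granted to this user remain invisible to the ingestion flow, which is how the built-in SQL Server permission model limits exposure.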
How does the connector incrementally pull data?
The connector uses Microsoft Change Tracking and Microsoft Change Data Capture (CDC). If a table has a primary key, Databricks recommends using change tracking for optimal performance. If a table doesn't have a primary key, Databricks recommends using CDC. If change tracking and CDC are both enabled, the connector uses change tracking.
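Both mechanisms must be enabled on the SQL Server side before the connector can use them. The following T-SQL shows the standard commands for each; the database and table names (`SalesDB`, `dbo.Orders`, `dbo.LineItems`) are placeholders for illustration.

```sql
-- Change tracking: enable at the database level, then per table.
-- Suitable for tables with a primary key.
ALTER DATABASE SalesDB
SET CHANGE_TRACKING = ON (CHANGE_RETENTION = 3 DAYS, AUTO_CLEANUP = ON);

ALTER TABLE dbo.Orders
ENABLE CHANGE_TRACKING;

-- CDC: enable for the database, then per table.
-- An option for tables without a primary key.
EXEC sys.sp_cdc_enable_db;
EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name   = N'LineItems',
    @role_name     = NULL;
```

Passing `@role_name = NULL` skips creating a gating database role for CDC access; in production you may instead name a role to control who can read the change data.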
Database variations
The connector supports Azure SQL Database and Amazon RDS for SQL Server. It also supports SQL Server running on Azure virtual machines (VMs) and Amazon EC2 instances, as well as on-premises SQL Server connected over Azure ExpressRoute, AWS Direct Connect, or VPN, provided sufficient bandwidth is available.
Authentication methods
The connector supports the following authentication methods:
- User-to-machine (U2M) OAuth with Microsoft Entra ID
- Username and password
SQL Server ingestion vs. query federation
In addition to the SQL Server ingestion connector in Lakeflow Connect, Databricks offers a zero-copy connector in Lakehouse Federation. The zero-copy connector allows you to query data in SQL Server without moving it.