Share via

JDBC Connection from Databricks to Azure SQL Hyperscale Using Service Principal

Anonymous
2025-08-04T14:19:13.64+00:00

We are using Azure Databricks to load curated CDC data into Azure SQL Hyperscale as part of a batch and streaming ingestion pipeline. Our JDBC connection string is stored in Azure Key Vault and injected securely into Databricks notebooks.

We are using ActiveDirectoryServicePrincipal–based authentication (SPN), as recommended.

However, we are observing connection failures and inconsistent behavior. The jdbc_url retrieved from Key Vault is structured correctly and includes authentication type and credentials. We use .option("url", jdbc_url) along with .option("dbtable", "dbo.<table_name>") in Databricks.

Could Microsoft please clarify:

Are there any known constraints or additional required parameters (like driver or user/password) for SPN-based JDBC connectivity to Azure SQL Hyperscale from Databricks?

Does the JDBC string need to explicitly include user and password as .option() parameters even if they are embedded within the url?

Is there a requirement to explicitly specify the JDBC driver class (com.microsoft.sqlserver.jdbc.SQLServerDriver) for reliable connectivity?

Is .mode("overwrite") supported and safe for use in production-grade JDBC writes to Azure SQL Hyperscale, or are there schema-related concerns we should be aware of?

Azure Databricks
Azure Databricks

An Apache Spark-based analytics platform optimized for Azure.


1 answer

Sort by: Most helpful
  1. Vinodh247-1375 43,021 Reputation points Volunteer Moderator
    2025-08-04T15:50:07.7866667+00:00

    Hi ,

    Thanks for reaching out to Microsoft Q&A.

    Yes, there are a few constraints and prerequisites for AD ServicePrincipal authentication with Azure SQL:

    • The Azure AD admin must be configured on the Azure SQL Hyperscale server.
    • The service principal must be created as a contained user in the Azure SQL db, and must be granted appropriate permissions (CREATE, INSERT, UPDATE, etc.).
    • The Databricks cluster must use the correct JDBC driver versionm Microsoft SQL Server JDBC Driver 9.2+ is recommended to avoid authentication bugs.

    Should user/password also be included separately as .option() if embedded in JDBC URL?

    No, if the credentials are embedded in the jdbc_url, you do not need to pass them again in .option("user", ...) or .option("password", ...).

    However, ensure:

    • Secrets are retrieved correctly (not empty or expired).
    • URL is not accidentally URL-encoded.
    • There is no line-break or whitespace in the Key Vault entry.

    Is it mandatory to specify the JDBC driver class explicitly?

    Not mandatory, but recommended in some environments for stability, especially with custom clusters or older DBR versions.

    Is .mode("overwrite") safe and supported for Azure SQL Hyperscale JDBC writes?

    Technically supported, but with important caveats:

    .mode("overwrite") issues a DROP TABLE + CREATE TABLE + INSERT pattern under the hood.

    Azure SQL Hyperscale supports DDL well, but:

    Schema recreation might cause issues with foreign keys, indexes, or permissions.

      If the table exists and is large, `DROP + CREATE` can lead to long execution times and locking issues.
    
      
         You may lose constraints, triggers, or computed columns if the schema is not preserved explicitly.
    ```Best Practice:
    
    - Prefer `.mode("append")` or `.mode("overwrite")` only with staging tables, not production-facing ones.
    
    Please 'Upvote'(Thumbs-up) and 'Accept' as answer if the reply was helpful. This will be benefitting other community members who face the same issue.
    

    Was this answer helpful?

    0 comments No comments

Your answer

Answers can be marked as 'Accepted' by the question author and 'Recommended' by moderators, which helps users know the answer solved the author's problem.