Best practices for security, compliance, and privacy

The Azure Databricks Security Best Practices and Threat Model can be downloaded as a PDF document from the Security & Trust Center. The sections in this article list the best practices from that PDF, organized by the principles of this pillar.

1. Manage identity and access using least privilege

  • Leverage multi-factor authentication
  • Use SCIM to synchronize users and groups
  • Limit the number of admin users
  • Enforce segregation of duties between administrative accounts
  • Restrict workspace admins
  • Manage access according to the principle of least privilege
  • Use OAuth or Microsoft Entra ID token authentication
  • Enforce token management
  • Restrict cluster creation rights
  • Use compute policies (see the sketch after this list)
  • Use service principals to run administrative tasks and production workloads
  • Use compute that supports user isolation
  • Store and use secrets securely

Details are in the PDF referenced at the beginning of this article.
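
The compute policy item above is easiest to reason about with a concrete definition in hand. The following is a minimal sketch of a cluster policy definition, not an official template: the node types, limits, and access mode shown are assumptions to adapt, and the policy still has to be created and assigned through the account console, the Cluster Policies REST API, the Databricks SDK, or Terraform. The constraint structure ("fixed", "allowlist", "range") follows the documented policy JSON format.

```python
import json

# A minimal sketch of a cluster (compute) policy definition.
# Node types, limits, and values below are illustrative assumptions.
policy_definition = {
    # Enforce the access mode that supports user isolation on shared compute.
    "data_security_mode": {"type": "fixed", "value": "USER_ISOLATION"},
    # Cap idle time so compute does not run unattended.
    "autotermination_minutes": {"type": "range", "maxValue": 120, "defaultValue": 60},
    # Restrict the instance types users can select.
    "node_type_id": {
        "type": "allowlist",
        "values": ["Standard_DS3_v2", "Standard_DS4_v2"],
    },
}

# The serialized JSON is what you pass when creating the policy via the UI,
# REST API, SDK, or Terraform provider.
print(json.dumps(policy_definition, indent=2))
```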

2. Protect data in transit and at rest

  • Centralize data governance with Unity Catalog
  • Use Azure Managed Identities to access storage
  • Plan your data isolation model
  • Avoid storing production data in DBFS (see the sketch after this list)
  • Configure Azure Storage firewalls
  • Prevent anonymous read access and apply other protections
  • Enable soft deletes and other data protection features
  • Back up your Azure Storage data
  • Configure customer-managed keys for managed services
  • Configure customer-managed keys for storage
  • Use Delta Sharing
  • Configure a Delta Sharing recipient token lifetime
  • Additionally encrypt sensitive data at rest using Advanced Encryption Standard (AES)
  • Leverage data exfiltration prevention settings within the workspace
  • Use Clean Rooms to collaborate in a privacy-safe environment

Details are in the PDF referenced at the beginning of this article.
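
To make the DBFS item above concrete, the sketch below reads and writes through Unity Catalog tables and volumes instead of the DBFS root, so access is governed and audited centrally. It assumes a notebook attached to Unity Catalog-enabled compute (where spark and dbutils are available); all catalog, schema, table, volume, and column names are illustrative and must already exist with the required permissions.

```python
# Read and write governed tables instead of files under the DBFS root
# (illustrative three-level names: catalog.schema.table).
orders = spark.read.table("sales_prod.raw.orders")

cleaned = orders.dropDuplicates(["order_id"])
cleaned.write.mode("overwrite").saveAsTable("sales_prod.curated.orders")

# Non-tabular files belong in a Unity Catalog volume rather than /dbfs or
# dbfs:/ paths (volume paths below are illustrative).
dbutils.fs.cp(
    "/Volumes/sales_prod/raw/landing/orders_2024.csv",
    "/Volumes/sales_prod/curated/archive/orders_2024.csv",
)
```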

3. Secure your network and protect endpoints

  • Use Secure Cluster Connectivity (No Public IP)
  • Deploy Azure Databricks into your own Azure virtual network
  • Configure IP access lists (see the sketch after this list)
  • Use Azure Private Link
  • Implement network exfiltration protections
  • Isolate Azure Databricks workspaces into different networks
  • Configure a firewall for serverless compute access
  • Restrict access to valuable codebases to only trusted networks
  • Use Virtual network encryption

Details are in the PDF referenced at the beginning of this article.
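
As a sketch of the IP access list item above, the call below uses the workspace IP access list REST API (POST /api/2.0/ip-access-lists). The host and token environment variables, the label, and the CIDR ranges are placeholder assumptions; the feature must also be enabled for the workspace, and the calling identity needs workspace admin rights.

```python
import os
import requests

# Placeholder workspace URL and token (personal access token or Entra ID token).
host = os.environ["DATABRICKS_HOST"]   # e.g. https://adb-1234567890123456.7.azuredatabricks.net
token = os.environ["DATABRICKS_TOKEN"]

# Allow access only from the listed corporate ranges (illustrative CIDRs).
resp = requests.post(
    f"{host}/api/2.0/ip-access-lists",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "label": "corp-networks",
        "list_type": "ALLOW",
        "ip_addresses": ["203.0.113.0/24", "198.51.100.10/32"],
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```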

4. Meet compliance and data privacy requirements

  • Restart compute on a regular schedule
  • Isolate sensitive workloads into different workspaces
  • Assign Unity Catalog securables to specific workspaces
  • Implement fine-grained access controls (see the sketch after this list)
  • Apply tags
  • Use lineage
  • Use Enhanced Security Monitoring or Compliance Security Profile
  • Control and monitor workspace access for Azure Databricks personnel
  • Implement and test a Disaster Recovery strategy
  • Consider the use of Azure confidential computing

Details are in the PDF referenced at the beginning of this article.
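
For the fine-grained access control item above, a hedged sketch using a Unity Catalog column mask follows. The catalog, schema, table, column, and group names are illustrative assumptions; the statements assume Unity Catalog is enabled and that you hold the privileges to create the function and alter the table.

```python
# Define a masking function: members of an (assumed) 'pii_readers' group see
# the real value, everyone else sees a redacted placeholder.
spark.sql("""
    CREATE OR REPLACE FUNCTION sales_prod.governance.mask_email(email STRING)
    RETURN CASE
        WHEN is_account_group_member('pii_readers') THEN email
        ELSE '***REDACTED***'
    END
""")

# Attach the mask to a sensitive column so the rule is enforced for every
# query path against the table.
spark.sql("""
    ALTER TABLE sales_prod.curated.customers
    ALTER COLUMN email SET MASK sales_prod.governance.mask_email
""")
```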

5. Monitor system security

  • Leverage system tables (see the sketch after this list)
  • Monitor system activities via Azure logs
  • Enable verbose audit logging
  • Manage code versions with Git folders
  • Restrict usage to trusted code repositories
  • Provision infrastructure via infrastructure-as-code
  • Manage code via CI/CD
  • Control library installation
  • Use models and data from only trusted or reputable sources
  • Implement DevSecOps processes
  • Use lakehouse monitoring
  • Use inference tables and AI Guardrails
  • Use tagging as part of your cost monitoring and charge-back strategy
  • Use budgets to monitor account spending
  • Use Azure Policy to create “upper limit” resource controls

Details are in the PDF referenced at the beginning of this article.
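
The system tables item above can be put to work with a query like the sketch below, which surfaces recently denied requests from the audit log system table. It assumes the system.access schema is enabled and shared with your workspace; the 7-day window and status codes are arbitrary choices to adapt.

```python
# Query the audit log system table for requests rejected in the last 7 days.
recent_denials = spark.sql("""
    SELECT event_time,
           user_identity.email AS user_email,
           service_name,
           action_name,
           response.status_code
    FROM system.access.audit
    WHERE event_date >= date_sub(current_date(), 7)
      AND response.status_code IN (401, 403)
    ORDER BY event_time DESC
    LIMIT 100
""")

display(recent_denials)   # display() is available in Databricks notebooks
```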

Additional resources

- Download and review the Databricks AI Security Framework (DASF) to understand how to mitigate AI security threats based on real-world attack scenarios