Hi Janice Chi
Is Unity Catalog critical? What are the pros/cons and costs?
You can implement access controls without Unity Catalog (UC), but for a team of this size and in a regulated environment like healthcare, we strongly recommend using it.
Benefits of Unity Catalog:
- Centralized governance across all data, notebooks, and compute.
- Fine-grained access controls (table, column, view level).
- Lineage, audit logging, and data masking capabilities.
- Better support for scalable role-based access using Azure AD groups.
Cost: Unity Catalog itself has no separate cost, but on Azure it does require the Premium tier of Databricks.
More info:
- Unity Catalog best practices (Microsoft Learn)
- Access control in Unity Catalog
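To make the table-level access control above concrete, here is a minimal sketch that builds the Unity Catalog GRANT statements a group would need to read a single table. The catalog, schema, table, and group names are placeholders I made up for illustration, so substitute your own.

```python
def build_read_grants(catalog: str, schema: str, table: str, group: str) -> list[str]:
    """Return the GRANT statements a group needs to read one Unity Catalog table."""
    # Backticks let the principal name contain hyphens or spaces.
    principal = f"`{group}`"
    return [
        f"GRANT USE CATALOG ON CATALOG {catalog} TO {principal}",
        f"GRANT USE SCHEMA ON SCHEMA {catalog}.{schema} TO {principal}",
        f"GRANT SELECT ON TABLE {catalog}.{schema}.{table} TO {principal}",
    ]


# Placeholder names (assumptions, not taken from your environment).
statements = build_read_grants("healthcare", "curated", "claims", "data-engineers")
for stmt in statements:
    print(stmt)
    # In a notebook on Unity Catalog-enabled compute you would run: spark.sql(stmt)
```

Granting to an Azure AD group rather than to individual users keeps the grants stable as people join or leave the team.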
If you decide not to use Unity Catalog, you'll need to manage permissions via:
- Workspace folder-level ACLs.
- External tools like Microsoft Purview (formerly Azure Purview).
- Manual handling of cluster and job permissions, which becomes harder to manage as the team grows.
Should DevOps own cluster/job management?
Yes. Since you're using ADF to trigger notebooks on job-scoped clusters, it makes sense to restrict cluster and job creation to DevOps or leads.
Best Practice:
- Use Cluster Policies to limit the types of clusters users can launch.
- Grant engineers only the Can Manage Run permission on jobs (the job-level permission that allows triggering runs), not Can Manage.
- Disable personal cluster creation for all non-admin users.
This standardizes your environment, reduces errors, and aligns with least-privilege access.
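As a rough illustration of the cluster policy and job permission points above, the sketch below builds a policy definition and a job ACL payload as plain Python dictionaries. The node types, limits, and group name are assumptions chosen for the example. The find_violations helper is a hypothetical local check for trying out the rules and is not how Databricks enforces policies.

```python
import json

# Illustrative policy: all values are assumptions, tune them for your workloads.
policy_definition = {
    "node_type_id": {"type": "allowlist", "values": ["Standard_DS3_v2", "Standard_DS4_v2"]},
    "num_workers": {"type": "range", "maxValue": 8},
    "autotermination_minutes": {"type": "range", "maxValue": 60, "defaultValue": 30},
}

# The Cluster Policies API expects the definition as a JSON string.
policy_json = json.dumps(policy_definition, indent=2)

# Job ACL payload for the Permissions API: engineers may trigger runs only.
job_acl = {
    "access_control_list": [
        {"group_name": "data-engineers", "permission_level": "CAN_MANAGE_RUN"}
    ]
}


def find_violations(policy: dict, cluster: dict) -> list[str]:
    """Hypothetical local check of a flat cluster spec against allowlist/range rules."""
    problems = []
    for attr, rule in policy.items():
        if attr not in cluster:
            continue
        value = cluster[attr]
        if rule["type"] == "allowlist" and value not in rule["values"]:
            problems.append(f"{attr}={value!r} is not in the allowlist")
        elif rule["type"] == "range":
            if "maxValue" in rule and value > rule["maxValue"]:
                problems.append(f"{attr}={value} exceeds maxValue {rule['maxValue']}")
            if "minValue" in rule and value < rule["minValue"]:
                problems.append(f"{attr}={value} is below minValue {rule['minValue']}")
    return problems


requested = {"node_type_id": "Standard_E64s_v3", "num_workers": 20, "autotermination_minutes": 30}
for problem in find_violations(policy_definition, requested):
    print(problem)
```

Keeping the policy in source control next to your job definitions lets DevOps review changes to it through the same pull request process as everything else.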
How to collaborate with Git and avoid overwrites?
To enable safe collaboration across multiple engineers:
- Use Databricks Repos with Git integration (GitHub or Azure Repos).
- Organize work around feature branches.
- Use pull requests and enforce branch protections to avoid accidental overwrites.
- Consider using notebooks in source format (.py, .sql) to make Git diffs cleaner and version control more robust.
More info: https://learn.microsoft.com/en-us/azure/databricks/repos/
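For reference, this is roughly what a notebook looks like when stored in .py source format. The first comment line marks the file as a notebook and the COMMAND comment lines separate cells, so Git sees an ordinary text file. The helper function and its name are made up for this example.

```python
# Databricks notebook source
# MAGIC %md
# MAGIC ## Historical load helpers

# COMMAND ----------

def normalize_column_name(name: str) -> str:
    """Trim a column name, lower-case it, and replace spaces with underscores."""
    return name.strip().lower().replace(" ", "_")

# COMMAND ----------

print(normalize_column_name(" Patient ID "))
```

Because each cell is just a block of text between separators, a pull request shows exactly which lines changed instead of a diff of notebook JSON.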
Folder organization and RBAC setup
Organize your workspace into logical folders by pipeline stage:
/Shared
/HistoricalLoad
/CatchUpCDC
/RealtimeStreaming
/CommonLibs
/DevOps
RBAC Mapping by Folder:
- Data Engineers: Can Edit on specific pipeline folders.
- Leads: Can Manage on shared folders.
- DevOps: Full access to /DevOps and jobs/clusters.
- Use Workspace ACLs or Unity Catalog roles to enforce this model.
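One way to keep the folder mapping above reviewable is to hold it as data and generate the ACL payloads from it. The sketch below does only that and makes no API calls. The group names, and which folders count as "shared", are my assumptions, and the folder paths are written as listed above, so adjust them to match your actual workspace layout.

```python
# Folder -> [(group, permission level)]; group names are hypothetical.
FOLDER_ACLS = {
    "/HistoricalLoad": [("data-engineers", "CAN_EDIT")],
    "/CatchUpCDC": [("data-engineers", "CAN_EDIT")],
    "/RealtimeStreaming": [("data-engineers", "CAN_EDIT")],
    "/CommonLibs": [("leads", "CAN_MANAGE")],
    "/DevOps": [("devops", "CAN_MANAGE")],
}


def build_acl_payload(entries: list[tuple[str, str]]) -> dict:
    """Build the request body shape used by the workspace Permissions API."""
    return {
        "access_control_list": [
            {"group_name": group, "permission_level": level}
            for group, level in entries
        ]
    }


payloads = {folder: build_acl_payload(entries) for folder, entries in FOLDER_ACLS.items()}

for folder, payload in payloads.items():
    # Each payload would be sent with PATCH to
    # /api/2.0/permissions/directories/{directory_id}, where directory_id is the
    # folder's object ID (not its path).
    print(folder, payload)
```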
Hope this helps. If this answers your question, please click Accept Answer and select Yes for "Was this answer helpful". If you have any further questions, do let us know.