Ranjith Edwards-Data Platform Architect, Databricks Asset Bundles (DAB) are generally not worth adopting solely to move notebooks from Dev → Test/Prod while you are still orchestrating everything through Azure Data Factory (ADF).
The following diagram provides a high-level view of a development and CI/CD pipeline with bundles:
If your current architecture uses Azure Data Factory (ADF) to orchestrate notebook execution and you do not use Databricks Jobs/Pipelines, adopting Databricks Asset Bundles (DAB) solely to deploy notebooks typically introduces more process overhead than benefit. A simpler Git + Workspace API/CLI approach is usually the best fit for notebook‑only promotion.
Why DAB may not be ideal for notebook‑only scenarios
- Purpose of DAB: DAB is designed to manage full project deployments (jobs, pipelines, resources, environment configs, tests) as infrastructure‑as‑code.
- Overhead for engineers: Even with minimal templates, DAB requires a project bundle, targets, and a databricks.yml lifecycle. For teams that create notebooks frequently, this adds steps that do not directly contribute to ADF‑centric orchestration.
When DAB is a good fit
Consider DAB if you plan to:
- Standardize CI/CD across Jobs, Workflows, Pipelines, and notebooks.
- Define environment‑specific configurations (Dev/Test/Prod) in code.
- Enforce governance and auditability for Databricks resources.
- Transition orchestration toward Databricks Workflows/Jobs in the future.
Recommended approach for your team today
Given your requirements and a small engineering team:
Notebook‑only promotion (recommended now):
- Use Git as the source of truth.
- In CI/CD (Azure DevOps/GitHub Actions), call the Databricks Workspace API or CLI (e.g., databricks workspace import-dir, or workspace import_dir in the legacy CLI) to sync notebooks to the target workspace/folder paths for Test and Prod.
- Keep ADF Notebook activities pointed at those paths.
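As a rough illustration of the Workspace API route above, the sketch below walks a local notebooks folder and builds one request body per file for the `POST /api/2.0/workspace/import` endpoint. It only prepares the payloads; sending them (with your workspace URL and token) is left to the pipeline. The function names, the extension-to-language mapping, and the `/Shared/demo` target are assumptions made for this example, not part of any official tooling.

```python
import base64
from pathlib import Path, PurePosixPath

# Assumption: only these source extensions are treated as notebooks.
LANGUAGE_BY_SUFFIX = {".py": "PYTHON", ".sql": "SQL", ".scala": "SCALA", ".r": "R"}


def build_import_payload(local_file: Path, repo_root: Path, target_root: str) -> dict:
    """Build the JSON body for POST /api/2.0/workspace/import for one file."""
    # Workspace notebook paths carry no file extension, so drop the suffix.
    relative = local_file.relative_to(repo_root).with_suffix("")
    workspace_path = PurePosixPath(target_root) / relative.as_posix()
    return {
        "path": str(workspace_path),
        "format": "SOURCE",
        "language": LANGUAGE_BY_SUFFIX[local_file.suffix.lower()],
        "content": base64.b64encode(local_file.read_bytes()).decode("ascii"),
        "overwrite": True,
    }


def plan_imports(repo_root: Path, target_root: str) -> list[dict]:
    """Return one import payload per notebook source file under repo_root."""
    files = sorted(
        p for p in repo_root.rglob("*")
        if p.is_file() and p.suffix.lower() in LANGUAGE_BY_SUFFIX
    )
    return [build_import_payload(p, repo_root, target_root) for p in files]
```

A CI job could call `plan_imports(Path("notebooks"), "/Shared/demo")` once per environment and post each payload to that environment's workspace. Parent folders may need to be created first (for example through the workspace mkdirs endpoint), so treat this as a starting point rather than a complete deployer.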
Future‑ready path (if/when you adopt Jobs/Pipelines):
- Introduce DAB with a minimal template so you can declaratively manage Jobs/Pipelines plus notebooks and gain environment presets, permissions, and standardized deployments.
Practical options side‑by‑side:
Option A — Minimal overhead (Recommended now):
- What: Git + Workspace API/CLI to import/sync notebooks to /Shared/<project> per environment.
- Pros: Fast, low ceremony; no bundle YAML; aligns with ADF orchestration.
- Cons: Less structure; fewer IaC benefits until you adopt Jobs/Pipelines.
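For the CLI flavor of Option A, a pipeline step can be as small as one command per environment. The sketch below only assembles that command; it assumes the current Databricks CLI (`databricks workspace import-dir` with `--overwrite` and `--profile`), and the profile names and the `sales_etl` project are hypothetical placeholders. If you are on the legacy CLI, the subcommand and flags differ, so check `databricks workspace --help` first.

```python
import shlex

# Hypothetical mapping from environment name to a CLI profile
# configured on the build agent.
PROFILE_BY_ENV = {"test": "test-workspace", "prod": "prod-workspace"}


def import_dir_command(env: str, source_dir: str, project: str) -> list[str]:
    """Assemble the CLI call that syncs source_dir to /Shared/<project>."""
    if env not in PROFILE_BY_ENV:
        raise ValueError(f"unknown environment: {env}")
    return [
        "databricks", "workspace", "import-dir",
        source_dir, f"/Shared/{project}",
        "--overwrite",                       # replace notebooks that already exist
        "--profile", PROFILE_BY_ENV[env],    # selects the target workspace
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so this is safe to try locally.
    for env in PROFILE_BY_ENV:
        print(shlex.join(import_dir_command(env, "notebooks", "sales_etl")))
```

In a real pipeline the returned list would be passed to `subprocess.run(cmd, check=True)` after the Test or Prod approval gate, which keeps the ADF Notebook activity paths stable across releases.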
Option B — Minimal DAB (Future‑proof, controlled):
- What: One repo with default‑minimal bundle; single databricks.yml syncing all notebooks; targets for Dev/Test/Prod.
- Pros: Environment‑aware, versioned deployments; easy to extend to Jobs/Pipelines later.
- Cons: Introduces DAB toolchain and YAML maintenance even if you only deploy notebooks.
Addressing the “process overhead” concern
- Engineers do not need YAML per notebook. With a minimal bundle, you maintain one small databricks.yml per project that syncs an entire notebooks folder.
- However, if engineers primarily create notebooks and ADF remains the orchestrator, even that small YAML is extra ceremony compared to the CLI/Workspace API approach.
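To make "one small databricks.yml per project" concrete, here is a sketch that renders such a file from Python. It uses only the `bundle.name`, `targets.<name>.default`, and `targets.<name>.workspace.host` / `root_path` settings; the bundle name, host URLs, and `/Shared/...` root paths are placeholders invented for this example. It relies on the bundle uploading files under the bundle root on deploy, so no per-notebook entries appear; please verify the resulting workspace paths against the bundle documentation before pointing ADF at them.

```python
def render_bundle_yaml(bundle_name: str, hosts: dict[str, str]) -> str:
    """Render a minimal databricks.yml with one target per environment.

    hosts maps a target name (e.g. "dev") to its workspace URL.
    The first target is marked as the default.
    """
    lines = [
        "bundle:",
        f"  name: {bundle_name}",
        "",
        "targets:",
    ]
    for index, (target, host) in enumerate(hosts.items()):
        lines.append(f"  {target}:")
        if index == 0:
            lines.append("    default: true")
        lines.append("    workspace:")
        lines.append(f"      host: {host}")
        # Assumption: each environment gets its own folder under /Shared.
        lines.append(f"      root_path: /Shared/{bundle_name}/{target}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical workspace URLs; replace with your own.
    print(render_bundle_yaml("sales_etl", {
        "dev": "https://dev.example",
        "test": "https://test.example",
        "prod": "https://prod.example",
    }))
```

With a file like this committed at the repo root, promotion would be a single `databricks bundle deploy -t <target>` per environment. That is the whole of the YAML engineers would maintain, which is the trade-off to weigh against the plain CLI approach.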
For more details, refer to Azure Databricks Asset Bundles.