DAB for only notebook deployments from dev to test/prod?

2025-12-22T14:22:26.08+00:00

Is it worth adopting Databricks Asset Bundles (DAB) solely for deploying notebooks from Dev to Test/Prod workspaces, given that our current architecture uses Azure Data Factory (ADF) to orchestrate notebook execution and we do not use Databricks Jobs or Pipelines? We are currently building a new data platform in a new Microsoft tenant. Also, how can a small team of data engineers and data product engineers use DAB to quickly create notebooks and deploy them to a main folder that is then promoted automatically to Test/Prod? Would it not be process overhead for each engineer to create an asset bundle and a lot of YAML every time they want to create and deploy a new notebook?

Azure Databricks

An Apache Spark-based analytics platform optimized for Azure.


Answer accepted by question author

PRADEEPCHEEKATLA 91,866 Reputation points
2025-12-22T15:24:54.94+00:00

Ranjith Edwards-Data Platform Architect, Databricks Asset Bundles (DAB) are generally not worth adopting only to move notebooks from Dev → Test/Prod while you are still orchestrating everything through Azure Data Factory (ADF).

[Diagram: high-level view of a development and CI/CD pipeline with bundles]

If your current architecture uses Azure Data Factory (ADF) to orchestrate notebook execution and you do not use Databricks Jobs/Pipelines, adopting Databricks Asset Bundles (DAB) solely to deploy notebooks typically introduces more process overhead than benefit. A simpler Git + Workspace API/CLI approach is usually the best fit for notebook‑only promotion.

Why DAB may not be ideal for notebook-only scenarios

  • Purpose of DAB: DAB is designed to manage full project deployments (jobs, pipelines, resources, environment configs, tests) as infrastructure‑as‑code.
  • Overhead for engineers: Even with minimal templates, DAB requires a project bundle, targets, and a databricks.yml lifecycle. For teams creating frequent notebooks, this adds steps that don’t directly contribute to your ADF‑centric orchestration.

When DAB is a good fit

Consider DAB if you plan to:
  • Standardize CI/CD across Jobs, Workflows, Pipelines, and notebooks.
  • Define environment‑specific configurations (Dev/Test/Prod) in code.
  • Enforce governance and auditability for Databricks resources.
  • Transition orchestration toward Databricks Workflows/Jobs in the future.

Recommended approach for your team today

Given your requirements and a small engineering team:

Notebook‑only promotion (recommended now):

  • Use Git as the source of truth.
  • In CI/CD (Azure DevOps or GitHub Actions), call the Databricks Workspace API or CLI (e.g., databricks workspace import-dir, spelled import_dir in the legacy CLI) to sync notebooks to the target workspace folder paths for Test and Prod.
  • Keep ADF Notebook activities pointed at those paths.
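
As a rough sketch of the CI/CD step above, a small Python wrapper could assemble the CLI call per environment. This assumes the current Databricks CLI, where the command is spelled workspace import-dir and accepts --overwrite and --profile. The ENVIRONMENTS mapping, the profile names, the notebooks source folder, and /Shared/my_project are hypothetical placeholders, not part of your setup.

```python
import subprocess

# Hypothetical mapping: CLI profile name and target folder per environment.
ENVIRONMENTS = {
    "test": {"profile": "test", "target": "/Shared/my_project"},
    "prod": {"profile": "prod", "target": "/Shared/my_project"},
}


def build_import_command(env: str, source_dir: str = "notebooks") -> list:
    """Build the CLI call that copies a local folder into the workspace."""
    cfg = ENVIRONMENTS[env]
    return [
        "databricks", "workspace", "import-dir",
        source_dir, cfg["target"],
        "--overwrite",                 # replace notebooks that already exist
        "--profile", cfg["profile"],   # selects the workspace credentials
    ]


def promote(env: str, source_dir: str = "notebooks", dry_run: bool = True) -> list:
    """Print the command (dry run) or execute it; returns the command."""
    cmd = build_import_command(env, source_dir)
    if dry_run:
        print(" ".join(cmd))
    else:
        subprocess.run(cmd, check=True)  # raises if the CLI exits non-zero
    return cmd


if __name__ == "__main__":
    promote("test")  # dry run only: prints the command, changes nothing
```

In a pipeline, the Test stage would call promote("test", dry_run=False) after a merge to main, and the Prod stage would do the same behind an approval gate.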

Future‑ready path (if/when you adopt Jobs/Pipelines):

  • Introduce DAB with a minimal template so you can declaratively manage Jobs/Pipelines plus notebooks and gain environment presets, permissions, and standardized deployments.

Practical options side‑by‑side:

Option A — Minimal overhead (Recommended now):

  • What: Git + Workspace API/CLI to import/sync notebooks to /Shared/<project> per environment.
  • Pros: Fast, low ceremony; no bundle YAML; aligns with ADF orchestration.
  • Cons: Less structure; fewer IaC benefits until you adopt Jobs/Pipelines.
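
If you prefer the Workspace API over the CLI for Option A, the import call could look roughly like this. It is a sketch that assumes the POST /api/2.0/workspace/import endpoint with base64-encoded SOURCE content. The host, token, and notebook path are hypothetical, and the request is only built here, not sent.

```python
import base64
import json
import urllib.request


def build_import_payload(source: str, workspace_path: str,
                         language: str = "PYTHON") -> dict:
    """Body for the workspace import call: notebook source goes in as base64."""
    return {
        "path": workspace_path,
        "format": "SOURCE",
        "language": language,
        "content": base64.b64encode(source.encode("utf-8")).decode("ascii"),
        "overwrite": True,  # replace the notebook if it already exists
    }


def build_import_request(host: str, token: str,
                         payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) the authenticated POST request."""
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/api/2.0/workspace/import",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical values; in CI these would come from pipeline secrets.
    payload = build_import_payload("print('hello')", "/Shared/my_project/etl")
    request = build_import_request("https://example.invalid", "dummy-token", payload)
    print(request.get_method(), request.full_url)
```

Sending it is a matter of passing the request to urllib.request.urlopen. The parent folder may need to exist first, which the separate workspace/mkdirs call can handle.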

Option B — Minimal DAB (Future‑proof, controlled):

  • What: One repo with default‑minimal bundle; single databricks.yml syncing all notebooks; targets for Dev/Test/Prod.
  • Pros: Environment‑aware, versioned deployments; easy to extend to Jobs/Pipelines later.
  • Cons: Introduces DAB toolchain and YAML maintenance even if you only deploy notebooks.

Addressing the “process overhead” concern

  • Engineers do not need YAML per notebook. With a minimal bundle, you maintain one small databricks.yml per project that syncs an entire notebooks folder.
  • However, if engineers primarily create notebooks and ADF remains the orchestrator, even that small YAML is extra ceremony compared to the CLI/Workspace API approach.
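
To make the "one small databricks.yml per project" point concrete, here is a sketch that renders such a file for three targets. The project name, workspace URLs, and the /Shared root path are hypothetical. The keys used (bundle.name, targets, workspace.host, workspace.root_path) are standard bundle settings, but treat the layout as an illustration and not a vetted production config.

```python
def render_bundle_yaml(project: str, hosts: dict) -> str:
    """Render a minimal databricks.yml: one bundle, one target per environment."""
    lines = ["bundle:", f"  name: {project}", "", "targets:"]
    for env, host in hosts.items():
        lines.append(f"  {env}:")
        if env == "dev":
            lines.append("    default: true")  # target used when -t is omitted
        lines.append("    workspace:")
        lines.append(f"      host: {host}")
        # Assumption: every environment deploys under the same shared folder.
        lines.append(f"      root_path: /Shared/{project}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical workspace URLs; replace with your own.
    print(render_bundle_yaml("my_project", {
        "dev": "https://<dev-workspace-url>",
        "test": "https://<test-workspace-url>",
        "prod": "https://<prod-workspace-url>",
    }))
```

No notebook is listed individually: databricks bundle deploy -t test uploads the files in the bundle folder, so adding a notebook needs no YAML change. As far as I know, the files land under a files subfolder of root_path by default, so the ADF Notebook activity paths would need to account for that.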

For more details, refer to Azure Databricks Asset Bundles.


1 additional answer

  1. AI answer

    2025-12-22T14:22:36.4+00:00

    Adopting Databricks Asset Bundles (DAB) for deploying notebooks from Development to Test/Production can be beneficial, even if your current architecture primarily uses Azure Data Factory (ADF) for orchestrating notebook execution. DAB facilitates the adoption of software engineering best practices, including source control, code review, testing, and continuous integration and delivery (CI/CD).

    By using DAB, you can streamline the deployment process of your notebooks, ensuring that they are consistently managed and versioned across different environments. This can help reduce errors and improve collaboration among team members during the development lifecycle.

    While you may not be utilizing Databricks Jobs or Pipelines currently, DAB can still provide a structured approach to managing your notebook deployments, which can enhance your overall data platform's operational excellence as you transition to a new Microsoft tenant.

    In summary, even for notebook-only deployments, adopting DAB can offer significant advantages in terms of organization, consistency, and adherence to best practices in your development workflow.

