Редагувати

Поділитися через


Tutorial: Set up dbt for Fabric Data Warehouse

Applies to: ✅ Warehouse in Microsoft Fabric

This tutorial guides you through setting up dbt and deploying your first project to a Fabric Warehouse.

Introduction

The dbt (Data Build Tool) open-source framework simplifies data transformation and analytics engineering. It focuses on SQL-based transformations within the analytics layer, treating SQL as code. dbt supports version control, modularization, testing, and documentation.

The dbt adapter for Microsoft Fabric can be used to create dbt projects, which can then be deployed to a Fabric Synapse Data Warehouse.

You can also change the target platform for the dbt project by simply changing the adapter, for example; a project built for Azure Synapse dedicated SQL pool can be upgraded in a few seconds to a Fabric Synapse Data Warehouse.

Prerequisites for the dbt adapter for Microsoft Fabric

Follow this list to install and set up the dbt prerequisites:

  1. Python version 3.7 (or higher).

  2. The Microsoft ODBC Driver for SQL Server.

  3. Latest version of the dbt-fabric adapter from the PyPI (Python Package Index) repository using pip install dbt-fabric.

    pip install dbt-fabric
    

    Note

    By changing pip install dbt-fabric to pip install dbt-synapse and using the following instructions, you can install the dbt adapter for Synapse dedicated SQL pool.

  4. Make sure to verify that dbt-fabric and its dependencies are installed by using pip list command:

    pip list
    

    A long list of the packages and current versions should be returned from this command.

  5. If you don't already have one, create a Warehouse. You can use the trial capacity for this exercise: sign up for the Microsoft Fabric free trial, create a workspace, and then create a warehouse.

Get started with dbt-fabric adapter

This tutorial uses Visual Studio Code, but you can use your preferred tool of your choice.

  1. Clone the jaffle_shop demo dbt project onto your machine.

    git clone https://github.com/dbt-labs/jaffle_shop.git
    
  2. Open the jaffle_shop project folder in Visual Studio Code.

    Screenshot from the Visual Studio Code, showing the open project.

  3. You can skip the sign-up if you have created a Warehouse already.

  4. Create a profiles.yml file. Add the following configuration to profiles.yml. This file configures the connection to your warehouse in Microsoft Fabric using the dbt-fabric adapter.

    config:
      partial_parse: true
    jaffle_shop:
      target: fabric-dev
      outputs:    
        fabric-dev:
          authentication: CLI
          database: <put the database name here>
          driver: ODBC Driver 18 for SQL Server
          host: <enter your SQL analytics endpoint here>
          schema: dbo
          threads: 4
          type: fabric
    

    Note

    Change the type from fabric to synapse to switch the database adapter to Azure Synapse Analytics, if desired. Any existing dbt project's data platform can be updated by changing the database adapter. For more information, see the dbt list of supported data platforms.

  5. Authenticate yourself to Azure in the Visual Studio Code terminal.

  6. Now you're ready to test the connectivity. To test the connectivity to your warehouse, run dbt debug in the Visual Studio Code terminal.

    dbt debug
    

    Screenshot from the Visual Studio Code, showing the dbt debug command.

    All checks are passed, which means you can connect your warehouse using dbt-fabric adapter from the jaffle_shop dbt project.

  7. Now, it's time to test if the adapter is working or not. First run dbt seed to insert sample data into the warehouse.

    Screenshot from the Visual Studio Code, showing a dbt seed command.

  8. Run dbt run to validate data against some tests.

    dbt run
    

    Screenshot from the Visual Studio Code, showing a dbt run command.

  9. Run dbt test to run the models defined in the demo dbt project.

    dbt test
    

    Screenshot from the Visual Studio Code, showing a dbt test command.

You have now deployed a dbt project to Synapse Data Warehouse in Fabric.

Move between different warehouses

It's simple move the dbt project between different warehouses. A dbt project on any supported warehouse can be quickly migrated with this three step process:

  1. Install the new adapter. For more information and full installation instructions, see dbt adapters.

  2. Update the type property in the profiles.yml file.

  3. Build the project.

dbt in Fabric Data Factory

When integrated with Apache Airflow, a popular workflow management system, dbt becomes a powerful tool for orchestrating data transformations. Airflow's scheduling and task management capabilities allow data teams to automate dbt runs. It ensures regular data updates and maintains a consistent flow of high-quality data for analysis and reporting. This combined approach, using dbt's transformation expertise with Airflow's workflow management, delivers efficient and robust data pipelines, ultimately leading to faster and more insightful data-driven decisions.

Apache Airflow is an open-source platform used to programmatically create, schedule, and monitor complex data workflows. It allows you to define a set of tasks, called operators, that can be combined into directed acyclic graphs (DAGs) to represent data pipelines.

For more information to operationalize dbt with your warehouse, see Transform data using dbt with Data Factory in Microsoft Fabric.

Considerations

Important things to consider when using dbt-fabric adapter:

  • Review the current limitations in Microsoft Fabric data warehousing.

  • Fabric supports Microsoft Entra ID (formerly Azure Active Directory) authentication for user principals, user identities, and service principals. The recommended authentication mode to interactively work on warehouse is CLI (command-line interfaces) and use service principals for automation.

  • Review the T-SQL (Transact-SQL) commands not supported in Synapse Data Warehouse in Microsoft Fabric.

  • Some T-SQL commands are supported by dbt-fabric adapter using Create Table as Select (CTAS), DROP, and CREATE commands, such as ALTER TABLE ADD/ALTER/DROP COLUMN, MERGE, TRUNCATE, sp_rename.

  • Review Unsupported data types to learn about the supported and unsupported data types.

  • You can log issues on the dbt-fabric adapter on GitHub by visiting Issues · microsoft/dbt-fabric · GitHub.

Next step