Facing an issue installing dbt-core==1.6.0 and dbt-snowflake==1.6.0

2024-07-12T04:30:08.1+00:00

We are currently facing an issue installing dbt-core==1.6.0 and dbt-snowflake==1.6.0. We found that the ADF UI appears to allow only packages from apache-airflow-providers, and not arbitrary packages that can be installed with pip.

(The ones listed here https://airflow.apache.org/docs/apache-airflow-providers/packages-ref.html)

What is the difference between them? Is this an actual limitation of ADF Managed Airflow?

 

We reached this conclusion because we were able to install dbt-core and dbt-snowflake inside the container using BashOperators within Airflow, which is essentially the same as doing it from the CLI (sketched below). If it works from there, it should also work from the ADF UI.
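For reference, this is roughly the shape of that workaround (a simplified sketch; the DAG id, task names, and dbt project path are placeholders, and the install happens at task runtime rather than when the environment is built):

    # Simplified sketch of the BashOperator workaround (placeholder names/paths):
    # install dbt at task runtime, then run it from the same worker environment.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash import BashOperator

    with DAG(
        dag_id="dbt_install_workaround",
        start_date=datetime(2024, 1, 1),
        schedule_interval=None,
        catchup=False,
    ) as dag:
        install_dbt = BashOperator(
            task_id="install_dbt",
            bash_command="pip install dbt-core==1.6.0 dbt-snowflake==1.6.0",
        )
        dbt_run = BashOperator(
            task_id="dbt_run",
            bash_command="cd /path/to/dbt_project && dbt run --profiles-dir .",
        )
        install_dbt >> dbt_run

Doing the install per task like this is exactly why we would prefer the packages to be available from the ADF UI instead.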


1 answer

  1. annie1987j 0 Reputation points
    2024-07-12T04:56:13.8366667+00:00

    Hello, @Siemens Healthineers Madivalappa_Azure Account

    Let’s dive into the differences between Apache Airflow provider packages and general pip-installable packages, and how they relate to Managed Airflow in Azure Data Factory (ADF).

    Apache Airflow Providers Packages:

    What Are They?: Apache Airflow is modular, and its core functionality (scheduler, basic tasks) is delivered as the apache-airflow package. However, additional capabilities can be added by installing separate packages called providers.

    What Do Providers Contain?: Providers include operators, hooks, sensors, and transfer operators that interface with various external systems. They can also extend Airflow core with new features.

    Community-Managed Providers: The Apache Airflow community maintains over 80 provider packages. These packages are versioned separately from the core Airflow releases.

    Custom Providers: You can even create your own custom providers with the same capabilities as community-provided ones.

    Example: For specific services like Amazon or Google, you’ll find provider packages like apache-airflow-providers-amazon or apache-airflow-providers-google.
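    To make the distinction concrete, here is a minimal sketch of what a provider package contributes: once apache-airflow-providers-snowflake is installed, its SnowflakeOperator can be imported in a DAG (the DAG id, connection id, and SQL below are placeholders, not values from your environment):

        # Minimal sketch (placeholder names): a provider package such as
        # apache-airflow-providers-snowflake contributes operators and hooks,
        # e.g. SnowflakeOperator, that become importable once the package is installed.
        from datetime import datetime

        from airflow import DAG
        from airflow.providers.snowflake.operators.snowflake import SnowflakeOperator

        with DAG(
            dag_id="snowflake_provider_example",
            start_date=datetime(2024, 1, 1),
            schedule_interval=None,
            catchup=False,
        ) as dag:
            run_query = SnowflakeOperator(
                task_id="run_query",
                snowflake_conn_id="snowflake_default",  # Airflow connection configured separately
                sql="SELECT CURRENT_VERSION();",
            )

    By contrast, dbt-core and dbt-snowflake are ordinary PyPI packages: they do not ship Airflow operators, they just need to be importable and runnable in the worker environment.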

    Pip-Installable Packages:

    These are Python packages that can be installed using pip. They include not only core Airflow (apache-airflow) and its providers, but also any other dependency published on PyPI, such as dbt-core and dbt-snowflake.

    Decoupling: Starting from dbt version 1.8, installing an adapter (like dbt-snowflake) no longer automatically installs dbt-core. Adapter and dbt Core versions are now decoupled, so you should pin both packages explicitly to avoid overwriting an existing installation.

    Installing dbt-snowflake: You can install the dbt-snowflake adapter using pip install dbt-snowflake, pinning the version you need (for example, dbt-snowflake==1.6.0).
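    For your specific case, a sketch of what the pins could look like, assuming the Airflow requirements field of the ADF environment accepts standard pip requirement specifiers (the same syntax as a requirements.txt file):

        # Pin both packages explicitly; from dbt 1.8 onward the adapter no longer
        # pulls in dbt-core automatically, and explicit pins keep the versions in sync.
        dbt-core==1.6.0
        dbt-snowflake==1.6.0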

    Managed Airflow in Azure Data Factory (ADF):

    What Is It?: ADF offers a managed orchestration service for Apache Airflow called Workflow Orchestration Manager.

    Integration: It allows you to run Apache Airflow DAGs (Directed Acyclic Graphs) within ADF, providing extensibility for orchestrating Python-based workflows at scale on Azure.

    Benefits:

    Azure Reliability: ADF combines Azure’s reliability, scale, security, and ease of management with Airflow’s extensibility.

    Multi-Orchestration: ADF now supports both visual, UI-based pipelines and code-centric, Python-based DAGs (like those in Airflow).

    Use Cases:

    If you’re familiar with Apache Airflow or currently use it, you might prefer Managed Airflow within ADF.

    If you prefer not to write/manage Python-based DAGs, stick with ADF pipelines.

    I hope this info is helpful to you.

    Best regards,
    Annie Johnston
    Annie Johnston