Azure Data Factory Managed Airflow integration with Purview to get Lineage

Ankit Devani 0 Reputation points
2023-10-26T10:56:51.7933333+00:00

I am trying to use Azure Data Factory Managed Airflow for my use cases. I want to use Azure Purview as the data governance tool and have Airflow emit lineage to it. I am following the Microsoft document here. However, I am stuck after creating the event hub: how do I configure the .yml file, since Azure manages the Airflow environment for me and I have no access to it? When I entered openlineage-airflow==1.2.0, the Airflow scheduler stopped working. How can I add the yml file in Managed Airflow?

Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
Microsoft Purview
A Microsoft data governance service that helps manage and govern on-premises, multicloud, and software-as-a-service data. Previously known as Azure Purview.

1 answer

  1. Amira Bedhiafi 22,691 Reputation points
    2023-10-26T12:11:45.6266667+00:00

    As mentioned in the documentation, you need to configure your Azure Event Hubs instance as the target to which OpenLineage sends events. Create an 'openlineage.yml' file under your Airflow root path with the following content:

    transport:
      type: "kafka"
      config:
        bootstrap.servers: "{EVENTHUB_SERVER}:9093"
        security.protocol: "SASL_SSL"
        sasl.mechanism: "PLAIN"
        sasl.username: "$ConnectionString"
        sasl.password: "{PASSWORD}"
        client.id: "airflow-client"
      topic: "microsoft_internal_openlineage"
      flush: True
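
The placeholders in the file above can be filled in from an Event Hubs connection string. The following is a minimal sketch, not part of the Microsoft documentation: the helper names `parse_connection_string` and `render_openlineage_yaml`, the sample namespace, the sample key, and the topic name `my-eventhub` are all hypothetical. It assumes the usual Event Hubs Kafka convention in which the host comes from the `Endpoint` field, the username is the literal `$ConnectionString`, and the password is the full connection string. The `flush` key follows the naming used by the OpenLineage Kafka transport.

```python
import json
from urllib.parse import urlparse


def parse_connection_string(conn_str: str) -> dict:
    """Split 'Key=Value;Key=Value' pairs; values may themselves contain '='."""
    parts = {}
    for segment in conn_str.strip().split(";"):
        if not segment:
            continue
        key, _, value = segment.partition("=")
        parts[key] = value
    return parts


def render_openlineage_yaml(conn_str: str, topic: str) -> str:
    """Render openlineage.yml text for the Kafka transport (hypothetical helper)."""
    parts = parse_connection_string(conn_str)
    # Endpoint is expected to look like sb://<namespace>.servicebus.windows.net/
    host = urlparse(parts["Endpoint"]).hostname
    if host is None:
        raise ValueError("connection string has no usable Endpoint")

    def q(value: str) -> str:
        # A JSON string literal is also a valid YAML double-quoted scalar.
        return json.dumps(value)

    lines = [
        "transport:",
        f"  type: {q('kafka')}",
        "  config:",
        f"    bootstrap.servers: {q(host + ':9093')}",
        f"    security.protocol: {q('SASL_SSL')}",
        f"    sasl.mechanism: {q('PLAIN')}",
        f"    sasl.username: {q('$ConnectionString')}",
        f"    sasl.password: {q(conn_str)}",
        f"    client.id: {q('airflow-client')}",
        f"  topic: {q(topic)}",
        "  flush: true",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder values only; substitute your own connection string and topic.
    sample = (
        "Endpoint=sb://example-ns.servicebus.windows.net/;"
        "SharedAccessKeyName=RootManageSharedAccessKey;"
        "SharedAccessKey=abc123="
    )
    print(render_openlineage_yaml(sample, "my-eventhub"))
```

If the file cannot be placed in the Airflow root path, the OpenLineage Python client can also read the file location from the `OPENLINEAGE_CONFIG` environment variable. Whether a Managed Airflow environment lets you set that variable and upload the file is something to verify for your setup.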
    
