View/Tables lineage view as report

chandu 0 Reputation points
2024-03-30T02:52:37.4433333+00:00

I have requirement to create Power BI report for views/tables lineage

current project design is source > ADF>Snowflake>power BI

Requirement is suppose if I select VW_TEST , it should give details from source level to view level like source job timings/details/ADF pipeline name/ refresh type etc
This report for Management - need for Operational work

Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
9,471 questions
{count} votes

1 answer

Sort by: Most helpful
  1. Amira Bedhiafi 14,571 Reputation points
    2024-03-30T11:17:01.54+00:00

    I will break your requirement into 3 steps :

    1. Collect the data :

    • Start with collecting metadata from your source systems : table names, column data types, ETL jobs that populate these tables. You can use Azure Purview to automate the collection of this metadata.
    • You can use Azure Monitor or the ADF REST API to extract metadata about pipelines, activities, datasets, and linked services
    • Snowflake Information Schema and Account Usage views can provide metadata about tables, views, and query performance (including the last refreshed tables/views, their sizes, and how they're queried)
    • Power BI REST API can help you list datasets, reports, and their dependencies.

    2. Design the Lineage Model

    With the metadata collected, the next step is to design a model that can represent the lineage from source to report. This might involve tables for sources, ETL jobs (with timings and statuses), ADF pipelines, Snowflake tables/views, and Power BI reports/datasets. You'd also need to establish relationships between these tables to reflect the flow of data through your stack.

    You can know more about DM in PBI :

    https://powerbi.microsoft.com/en-us/what-is-data-modeling/

    3. Implement in Power BI

    • Load the collected metadata into Power BI : setting up multiple data sources and configuring scheduled refreshes to ensure the report remains up-to-date
    • Create the report itself. In my opinion, a good starting point would be a filterable tree or flow diagram that lets users select a Snowflake view (like VW_TEST) and see its entire lineage.
    • Interactive Features: Implement slicers or other interactive elements that allow management to filter by specific criteria, such as pipeline name, refresh type, or source system. This makes the report not just informative but also a powerful tool for operational work.

    Learn how to design a report in PBI :

    https://learn.microsoft.com/en-us/training/modules/power-bi-effective-reports/

    0 comments No comments