Hi Mohammed Aamer,
Good observation. What you're seeing is expected given how Databricks lineage is captured in Purview today.
In Purview, the notebook nodes that appear in lineage are often tied to the execution/run context, not just the logical notebook file. So when the same notebook runs multiple times (for example through ADF orchestration), Purview can show multiple notebook assets with different IDs. These IDs are system-generated and can look like new notebooks even though the source notebook is the same.
So yes, it can result in many notebook IDs over time, especially in active production environments.
Right now, there isn’t a way to “anchor” all those run-based IDs to one stable notebook asset in Purview. The lineage is reflecting execution history rather than a single static notebook object.
For governance, most teams handle this by:
- Using the notebook path/name in Databricks as the main reference
- Adding descriptions at the table or data asset level instead of each notebook node
- Treating notebook nodes more as technical lineage artifacts than governed assets
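If it helps, here is a minimal sketch of the first point: collapsing the run-based notebook nodes back to one entry per notebook path for your own reporting. It assumes you already have the notebook nodes exported as a list of records. The field names `asset_id` and `notebook_path`, the sample values, and the helper `group_by_notebook_path` are all hypothetical and are not Purview's actual schema or API, so adjust them to whatever your export contains.

```python
from collections import defaultdict

# Hypothetical export of notebook nodes taken from lineage.
# Field names and values are illustrative only, not Purview's schema.
notebook_nodes = [
    {"asset_id": "run-001", "notebook_path": "/Shared/etl/load_sales"},
    {"asset_id": "run-002", "notebook_path": "/Shared/etl/load_sales"},
    {"asset_id": "run-003", "notebook_path": "/Shared/etl/clean_customers"},
]


def group_by_notebook_path(nodes):
    """Map each notebook path to the run-based asset IDs recorded for it."""
    grouped = defaultdict(list)
    for node in nodes:
        grouped[node["notebook_path"]].append(node["asset_id"])
    return dict(grouped)


by_path = group_by_notebook_path(notebook_nodes)

# One line per logical notebook, regardless of how many run-based nodes exist.
for path, ids in sorted(by_path.items()):
    print(f"{path}: {len(ids)} run-based node(s)")
```

This only changes how you report on the nodes outside Purview. It does not merge or rename anything in the catalog itself.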
This is current product behavior rather than a scan issue, so switching between incremental and full scans doesn't change it much.
If Microsoft improves notebook normalization in lineage in the future, this should become easier to manage.
Hope this clarifies what you’re seeing.