@JF - Thanks for the question and for using the MS Q&A platform.
It seems you are having trouble with the automatic mapping of lineage from Synapse Analytics pipelines and dataflows onto existing data assets in your Gen 2 Data Lake. As I understand it, you have several custom pattern rules that override Purview's default resource set detection, and when you scan the Gen 2 Data Lake, the parquet resource sets are identified correctly. However, the lineage information from Synapse dataflows and pipelines is not automatically mapped onto these data assets, and many duplicate Gen 2 data assets are created under the root folder in Purview.
To answer your question: as far as I am aware, there have been no changes in the lineage mechanism between Purview and Synapse that would cause this issue. It is possible that the move from Azure Purview to Microsoft Purview introduced some changes, but I cannot say for sure without more information.
To troubleshoot this issue, I would recommend checking the following:
- Check if the data assets that are not being mapped already exist in the data map. If they do, then the lineage information from Synapse should be automatically added to them. If they don't, then Purview should create new data assets for them.
- Check if the data assets that are being created as duplicates have the same name and path as the existing data assets. If they do, then Purview might be creating new data assets instead of mapping the lineage information to the existing ones.
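
If it helps with the second check, here is a minimal Python sketch for comparing asset paths offline. It assumes you have already exported the relevant assets into a list of dictionaries with `guid` and `qualifiedName` keys; that flat shape, the sample values, and the helper names are hypothetical, and the normalization (lowercasing the host, dropping a trailing slash) is only a guess at differences worth looking for, not Purview's actual matching logic.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def normalize_qualified_name(qualified_name: str) -> str:
    """Lowercase the scheme and host and drop trailing slashes.

    The path keeps its original case. This rule is an assumption made
    for comparison purposes, not how Purview itself matches assets.
    """
    parts = urlsplit(qualified_name.strip())
    path = parts.path.rstrip("/")
    return f"{parts.scheme.lower()}://{parts.netloc.lower()}{path}"


def find_duplicate_assets(assets):
    """Group asset GUIDs by normalized path; keep groups with more than one entry."""
    groups = defaultdict(list)
    for asset in assets:
        key = normalize_qualified_name(asset["qualifiedName"])
        groups[key].append(asset["guid"])
    return {name: guids for name, guids in groups.items() if len(guids) > 1}


# Hypothetical sample data: two entries that differ only in host case
# and a trailing slash, plus one unrelated asset.
sample_assets = [
    {"guid": "a1", "qualifiedName": "https://myaccount.dfs.core.windows.net/raw/sales/"},
    {"guid": "b2", "qualifiedName": "https://MyAccount.dfs.core.windows.net/raw/sales"},
    {"guid": "c3", "qualifiedName": "https://myaccount.dfs.core.windows.net/raw/orders"},
]

for name, guids in find_duplicate_assets(sample_assets).items():
    print(name, guids)
```

Any group it prints is a set of assets that point at what looks like the same location, which would be useful detail to include if you end up raising a support ticket.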
If none of these steps helps resolve the issue, I would recommend opening a support ticket for further assistance. The support team should be able to help you troubleshoot the issue and provide a solution.