Problem scanning Data Sources in Purview

Love Book 20 Reputation points
2025-03-10T16:06:27.8466667+00:00

Hi!

We are currently trying to set up Purview for our organization. Whenever we try to scan a data source, we get an ambiguous "internal server error". Note that "test connection" succeeds without any problem. The error occurs for every type of data source; so far we have tried to scan our analytic lake (Storage V2), Power BI, Azure dedicated pools (our DWH), and an SQL database.

Taking the analytic lake as an example: the connection is established successfully, as mentioned, and we can see all containers and their blobs before we start the scan.

We use:

  • Azure AutoResolveIntegrationRuntime
  • Microsoft Purview-MSI (system) for authentication

Purview also has the Contributor role at resource group level and Storage Blob Data Reader on the storage account for the analytic lake.

We have also double-checked the firewall settings for the resource group we are trying to access, and as far as we can see there are no restrictions there.

Any idea what the issue could be?

Microsoft Security | Microsoft Purview

Accepted answer
  1. phemanth 15,755 Reputation points Microsoft External Staff Moderator
    2025-03-17T09:32:12.84+00:00

    @Love Book

    I'm glad that you were able to resolve your issue, and thank you for posting your solution so that others experiencing the same thing can easily reference it! Since the Microsoft Q&A community has a policy that "The question author cannot accept their own answer. They can only accept answers by others", I'll repost your solution in case you'd like to accept the answer.

    Ask: We are currently trying to set up Purview for our organization. Whenever we try to scan a data source, we get an ambiguous "internal server error". Note that "test connection" succeeds without any problem. The error occurs for every type of data source; so far we have tried to scan our analytic lake (Storage V2), Power BI, Azure dedicated pools (our DWH), and an SQL database.

    Taking the analytic lake as an example: the connection is established successfully, as mentioned, and we can see all containers and their blobs before we start the scan.

    We use:

    Azure AutoResolveIntegrationRuntime

    Microsoft Purview-MSI (system) for authentication

    Purview also has the Contributor role at resource group level and Storage Blob Data Reader on the storage account for the analytic lake.

    We have also double-checked the firewall settings for the resource group we are trying to access, and as far as we can see there are no restrictions there.

    Any idea what the issue could be?

    Solution: We had already implemented the suggested steps at the time we posted this question. However, we did manage to scan a couple of data sources after we switched back to the old Purview version (we still can't get it to work on the new one). We also noticed that "problematic" files in ADLS could cause this issue, for example when we tried to scan a folder containing a QID file (maybe this is because Purview thinks it is a normal Parquet file, but then fails when it tries to find a mapping of fields/columns?). With these insights we can find workarounds, and we consider the problem solved for now.
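
One way to act on the "problematic files" observation is to look for folders that contain unexpected file types before scoping a scan, so those folders can be excluded or scanned separately. The sketch below is only an illustration: the function name, the allow-list of extensions, and the sample paths (including the `.qid` file name) are assumptions made for this example and do not come from Purview. The path list would have to be obtained separately, for instance from a storage inventory export.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Assumption: an illustrative allow-list of extensions we expect the scan to
# handle. Adjust it to the file types your own scans are known to cope with.
EXPECTED_EXTENSIONS = {".parquet", ".csv", ".json", ".avro", ".orc", ".txt", ".xml"}


def find_suspect_folders(blob_paths, expected=EXPECTED_EXTENSIONS):
    """Group files with unexpected (or missing) extensions by parent folder.

    blob_paths: iterable of blob paths using '/' as the separator.
    Returns a dict mapping folder path -> list of suspect file names.
    """
    suspects = defaultdict(list)
    for path in blob_paths:
        p = PurePosixPath(path)
        # Compare case-insensitively; files without an extension are flagged too.
        if p.suffix.lower() not in expected:
            suspects[str(p.parent)].append(p.name)
    return dict(suspects)


# Hypothetical sample paths, for illustration only.
sample_paths = [
    "raw/sales/2025/part-0000.parquet",
    "raw/sales/2025/part-0001.parquet",
    "raw/survey/export.qid",
    "raw/survey/readme",
]

for folder, files in find_suspect_folders(sample_paths).items():
    print(f"{folder}: {files}")
```

Folders reported this way are candidates to leave out of the scan scope while testing whether the remaining folders scan cleanly.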

    If I missed anything please let me know and I'd be happy to add it to my answer, or feel free to comment below with any additional information.

    If you have any other questions, please let me know. Thank you again for your time and patience throughout this issue.


    Please don’t forget to Accept Answer and Yes for "was this answer helpful" wherever the information provided helps you, this can be beneficial to other community members.

    1 person found this answer helpful.