Catalog Parquet files with Pandas

Caio Tomy Ono Kaminari 101 Reputation points
2021-11-29T21:05:02.587+00:00

Is it possible to identify the schema of Parquet files that were written with pandas?

Azure Data Catalog
An Azure service that serves as a system of registration and system of discovery for enterprise data assets.
Microsoft Purview
A Microsoft data governance service that helps manage and govern on-premises, multicloud, and software-as-a-service data. Previously known as Azure Purview.

Accepted answer
  1. PRADEEPCHEEKATLA-MSFT 78,331 Reputation points Microsoft Employee
    2021-11-30T09:20:14.103+00:00

    Hello @Caio Tomy Ono Kaminari ,

    Thanks for the question and for using the MS Q&A platform.

    The Purview scanner supports schema extraction, and classification where applicable, only for the following structured file types: AVRO, ORC, PARQUET, CSV, JSON, PSV, SSV, TSV, TXT, XML, GZIP.

    Note: For AVRO, ORC, and PARQUET file types, the Purview scanner does not support schema extraction for files that contain complex data types (for example, MAP, LIST, STRUCT).

    For more details, refer to Supported data sources and file types in Azure Purview.

    Hope this helps. Please let us know if you have any further queries.
