What is the Microsoft Purview Data Catalog?

Note

We're rolling out the new data catalog experience across our environments. If you don't see it yet, don't worry! You'll have it soon. If you haven't upgraded to the enterprise version of the new Microsoft Purview experience, you need to do so to be able to access the new data catalog experience when it's available in your region.

The goal of Microsoft Purview Data Catalog is to not only provide a platform for data governance, but to drive business value creation in your organization.

Historically, data governance has been a defense mechanism, a way to make sure your data is secure and compliant. But good data governance makes your data more visible to your users and provides many opportunities to reunite your business with the data that fuels it.

Most importantly, all of these new features are available in a single, integrated SaaS framework. Users shouldn't need to switch between applications to practice good data governance. We're working toward a solution where everything is in one place with experiences for data consumers, data stewards, and data owners.

Data governance for the era of AI

This era of artificial intelligence means that we have more data than ever, more ways to use it, and even more motivation to make sure it's properly used and secured. It was a task many organizations were already struggling with, without the added complexity of using AI tools.

Governing data for an entire organization requires both rigor and flexibility. Clean, secure data requires consistency, but your teams' needs are unique for access and management. Because of this, we believe in a federated governance approach: providing a centralized place to develop data safety, quality, and standards, but providing tools to create self-service access control, discoverability, and maintenance. Federated data governance spreads ownership across your business, reducing bottlenecks and encouraging participation in the life cycle of managing, governing, consuming, and applying data.

Data governance isn't just a gatekeeping function; when practiced well, it also accelerates data value creation. Your data users, stakeholders, and subject matter experts have crucial insights into the day-to-day operations of your organization. Good data governance can apply that expertise to reasonably scale the practice of data governance across your entire organization, including your nontechnical functions and business users. Said another way, good data governance utilizes your whole team and matches your data with your day-to-day business functions.

The point is both to reveal your data's business value and to simplify its management, even as your organization and data estate grow. Here's how it can help each sector of your business:

  • For organization-wide data consumers:

    • Data discovery - helps you easily find the data you need
    • Secure access - facilitates safe access to your data
    • Data understanding - provides what you need to know about the data before and while you use it
  • For data owners and stewards:

    • Data curation and management - helps you deliver high quality data that's easy to understand and safely access for organization-wide applications
    • Responsible data use - helps you ensure that your data is used by intended users for intended purposes
    • Impact analysis - helps you understand the actions and anomaly states that affect your data
  • For data officers and CxO stakeholders:

    • Data value creation - helps you maximize value creation from your data while reducing operations spend
    • Data estate standardization - helps you create common controls across your data estate with federated accountability so your data is healthy and safe

Governance with the Microsoft Purview Data Catalog

The new Microsoft Purview Data Catalog experience allows you to explore and understand your data categorized by governance domains, search through an AI-powered copilot, and subscribe to data products that come equipped with all the data you need and the tools to safely access it. Over the last couple of years, we've invested in a strong platform that has an inventory of all your data assets, their metadata, and their lineage so you can understand the topography of your data estate. Now we're providing better tools to manage it as it grows, and more points to surface that data to your business, so you can make use of it in the day-to-day.

Here are the tools that the Microsoft Purview Data Catalog provides to meet these data governance basics:

| Data governance principle | Catalog solution | Description |
|---|---|---|
| Data access - quickly provide right access and enforce right use to balance safety and innovation. | Data catalog access policies | Provides tools to manage self-service access requests that pair data with compliance standards and right-use requirements. |
| | Critical data elements | Establish access policies on critical types of information to be applied across your data estate. |
| | Glossary terms | Attach access policies to your business vocabulary to promote right use. |
| Data curation - organizing, annotating, and publishing your data so that it's safely accessible, reusable, and protected. | Governance domains | Organize data by business concepts, to make data more accessible and distribute ownership. |
| | Data products | Group related data assets so users can easily find the full data picture. |
| | Health actions | Take action to bring your data up to good governance standards. |
| Data discovery - users can find the data they need for day-to-day business and innovation. | Search & browse the data catalog | Search by governance domain, by data product, by keyword, or use the AI-powered copilot to find what you need. |
| | Self-service access requests | Get access to all the data you need with a single request from inside the data catalog. |
| Data health - data quality standards are maintained across your estate, and there's an active data lifecycle keeping your data fresh and secure. | Health management | Ready-made reports provide the status of your data estate, and your data governance progress, at a glance. |
| | Critical data elements | Track important information to standardize and govern use. |
| | Data quality | Set quality rules from the top down to simplify distribution and provide snapshot progress tracking. |
| | Health controls | See how your data estate measures up to governance standards. |
| | OKRs | Map data health and governance goals to your business objectives. |
| Data understanding - data has quality descriptors that help users understand what the data is and how it should be used. | Data products | Provide business context for a set of data assets. |
| | Glossary terms | Attach your day-to-day business vocabulary to your data assets. |
| | OKRs | Link data usage directly to your business objectives. |

Get started

Ready to get started making your information clean, valuable, and accessible to your business users? We've created a guide to take you from zero to a fully integrated data catalog: Get started with the Microsoft Purview Data Catalog.

Data Catalog features

Now that you understand how the data catalog supports good data governance practice, and how the features promote data governance principles, explore the features in more detail:

  • Governance domains - an organizational object that provides context for your data assets and makes it easier to scale data governance practices.
    • Data catalog access policies - provide secure, self-service access to data products.
    • Critical data elements - logically describe your key business data to govern them.
    • Glossary terms - active values that provide context but also apply policies that determine how your data should be managed, governed, and made discoverable for use.
  • Data products - a kit of data assets (tables, files, Power BI reports, etc.) that provides assets with a use case for ease of discovery and understanding.
  • OKRs - drive business value from your data and directly promote data governance objectives.
  • Data Estate Health updates - expanded features to provide new insights and encourage governance ownership.
    • Health controls - track your governance progress with a health score.
    • Health actions - follow these actions in your data estate to improve your governance score.
  • Data Quality

Governance domains

Governance domains allow you to explore data through business concepts, like Marketing or Finance. This helps make data more accessible to everyone.

Governance domains are a new way of organizing your data estate. You want and need to access your entire data estate, but a single, uncategorized list is overwhelming. A governance domain is a boundary that aligns your data estate to your organization; think of it as a mini catalog inside your data catalog.

Inside your governance domain are the terms that define users' day-to-day work. There are objectives and key results (OKRs) to align your organization's goals with your data. There are data products that make data discovery more straightforward for your users.

The goal of the governance domain is to organize your data catalog so that it not only drives healthy data governance, but ties your data directly to its business value and your objectives.

For more information, see the overview of governance domains.

Access policies

Data catalog access policies allow you to manage access to your data products and set up a system to provide access to users who request it. Promote innovation and flexibility in your data estate by creating self-service access opportunities, while upholding security and right-use standards, all in the data catalog.

For more information, see how to create and manage data catalog access policies.

Critical data elements

Critical data elements are a logical grouping of important pieces of information across your data estate. For example, a "Customer ID" critical data element can map "CustID" from one table and "CID" from another table into the same logical container. These groupings can make data easier to understand and promote standardization. Data quality rules and access policies can be attached to these elements to further secure sensitive information across your data estate.
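
The "Customer ID" example can be pictured as a small data structure. The following is only an illustrative sketch, not the Microsoft Purview API: the class names, the method names, and the table names sales.orders and support.tickets are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ColumnRef:
    """A physical column in a table (hypothetical model, not a Purview type)."""
    table: str
    column: str


@dataclass
class CriticalDataElement:
    """A logical container grouping physical columns that share one meaning."""
    name: str
    columns: set[ColumnRef] = field(default_factory=set)

    def map_column(self, table: str, column: str) -> None:
        # Register a physical column under this logical element.
        self.columns.add(ColumnRef(table, column))

    def covers(self, table: str, column: str) -> bool:
        # True if the given physical column belongs to this element.
        return ColumnRef(table, column) in self.columns


# Two differently named columns map to the same logical element.
customer_id = CriticalDataElement("Customer ID")
customer_id.map_column("sales.orders", "CustID")      # hypothetical table
customer_id.map_column("support.tickets", "CID")      # hypothetical table
```

In this sketch, a rule or policy attached to customer_id would apply to both physical columns, which is the standardization benefit the paragraph describes.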

For more information, see how to create and manage critical data elements.

Glossary terms

If you've been using Microsoft Purview, you're familiar with glossary terms and how they can provide critical business context to your data assets. We've taken them from static objects to active objects that help define how your data assets should be managed, governed, and made discoverable. Policies within these terms allow data stewards to scale governance across your entire data estate. Terms applied to data products trickle down to the data assets and automatically secure those resources with their attached policies.
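
To make the "trickle down" idea concrete, here is a minimal sketch of how policies attached to a term might be inherited by every asset in a data product. It is an assumption-laden illustration, not how the service is implemented: the classes, the effective_policies method, the policy names, and the asset names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class GlossaryTerm:
    """A business term carrying policy names (hypothetical model)."""
    name: str
    policies: list[str] = field(default_factory=list)


@dataclass
class DataProduct:
    """A group of assets with glossary terms applied at the product level."""
    name: str
    assets: list[str] = field(default_factory=list)
    terms: list[GlossaryTerm] = field(default_factory=list)

    def effective_policies(self) -> dict[str, list[str]]:
        """Return, per asset, the policies inherited from the product's terms."""
        inherited: list[str] = []
        for term in self.terms:
            for policy in term.policies:
                if policy not in inherited:  # keep order, drop duplicates
                    inherited.append(policy)
        return {asset: list(inherited) for asset in self.assets}


# Hypothetical term, policies, and assets for illustration only.
personal_data = GlossaryTerm("Personal data", ["mask-email", "manager-approval"])
product = DataProduct(
    "Customer 360",
    assets=["crm.customers", "web.sessions"],
    terms=[personal_data],
)
policies = product.effective_policies()
```

Applying the term once at the product level gives both assets the same policy list, so a steward doesn't have to tag each asset individually.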

For more information, see the overview of glossary terms.

Data products

Data products are an improvement on catalog organization: they group data assets together (tables, files, reports, etc.) so users can discover them in one place. No more requesting access to 15 different tables you might need to build a data model. Once one user does the research to create a viable data product, all other users can benefit from that work. They can find (and request access to) the data in that product and have everything they need in one place.
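
The difference between per-table requests and a single product-level request can be sketched as follows. This is a simplified, hypothetical model rather than the catalog's actual access workflow: the class, its methods, the user name, and the asset names are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """A named bundle of assets with product-level access (hypothetical model)."""
    name: str
    assets: list[str]
    approved_users: set[str] = field(default_factory=set)

    def request_access(self, user: str, approved: bool) -> None:
        # One request covers the whole product; approval is decided elsewhere.
        if approved:
            self.approved_users.add(user)

    def can_read(self, user: str, asset: str) -> bool:
        # Access to any asset follows from access to the product.
        return asset in self.assets and user in self.approved_users


# Hypothetical product with three assets.
sales_model = DataProduct(
    "Sales data model",
    assets=["orders", "customers", "regions"],
)
sales_model.request_access("avery", approved=True)  # a single request

readable = [a for a in sales_model.assets if sales_model.can_read("avery", a)]
```

After the one approved request, readable contains all three assets, with no per-table request needed.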

For more information, see the overview of data products.

OKRs

Objectives and key results link data products directly to your objectives to bridge the gap between your business and your data catalog. You use data to discover and track objectives in your business, and your data catalog should make it easy to see those connections and track your goals.

For more information, see the overview of OKRs.

New Data Estate Health features

Health management has a few new features for you to implement to enhance your data governance strategy and management.

Health controls

Monitor health controls to track your progress toward complete data governance. Health controls measure your current governance practices against standards and give your data estate a score.

For more information, see the health controls article.

Health actions

Health actions are concrete steps you can take to improve data governance across your data estate. The actions are provided in a single list that can focus your data governance journey, and democratize ownership. Completing these actions will improve data quality and discoverability across your data estate.

For more information, see the health actions article.

Data Quality

Data quality enables your organization to set rules through your governance domains, data products, and the data assets themselves. These rules trickle down across your environment so you can better evaluate your data across your data estate. Data quality scores are generated at the asset, data product, and governance domain levels, to give you full insight into your data estate.
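
One way to picture scores at three levels is a simple roll-up. The sketch below assumes an unweighted average at each level, which is purely an assumption for illustration; this article doesn't state how Microsoft Purview aggregates scores, and the domain, product, and asset names and values are hypothetical.

```python
from statistics import mean

# Hypothetical asset-level scores (0-100): domain -> data product -> asset.
asset_scores = {
    "Finance": {
        "Quarterly revenue": {"fact_revenue": 90.0, "dim_region": 80.0},
        "Invoices": {"invoices_raw": 70.0},
    },
}


def product_scores(domain: str) -> dict[str, float]:
    """Average asset scores within each data product (assumed aggregation)."""
    return {
        product: mean(list(assets.values()))
        for product, assets in asset_scores[domain].items()
    }


def domain_score(domain: str) -> float:
    """Average the product scores within a governance domain (assumed aggregation)."""
    return mean(list(product_scores(domain).values()))


finance_products = product_scores("Finance")
finance_score = domain_score("Finance")
```

With these sample values, the "Quarterly revenue" product averages 85.0, "Invoices" stays at 70.0, and the Finance domain rolls up to 77.5. A real deployment might weight assets or rules differently.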

For more information, see the data quality overview article.