Get started with Microsoft Purview data governance
This article takes you through technical steps to get started building your data governance solution in Microsoft Purview and integrating data governance with your day-to-day business operations.
You might find it helpful to review Plan for Unified Catalog as an orientation to the material below. For a step-by-step illustration of how to set up your environment, visit the sample setup walkthrough.
Prerequisites
You need a Microsoft Purview Enterprise instance, either by:
The Data Governance Administrator role delegates the first level of access for Microsoft Purview Unified Catalog users. Granting this role to a user in your organization is your first step as you get started. Find details on how to assign roles for data governance.
Governance domain
A governance domain is a boundary that enables the common governance, ownership, and discovery of data products, and business concepts like glossary terms and objectives and key results (OKRs). The goal is to empower a governance domain owner to manage their data products, and establish rules for their access, usage, and distribution. Governance domains can be aligned per the following examples:
Corporate / business areas (Human Resources, Sales, Finance, Supply Chain, etc.)
Boundaries based on organizational functions (Customer Experience, Cloud Supply Chain, Business Intelligence).
Data product
A data product is a group of data assets, such as tables, files, or Power BI reports, packaged together with a clear use case and shared with data consumers in the organization. A governance domain can house many data products, but each data product is managed by a single governance domain and can be discovered across many governance domains.
Use our guide to strategize your governance domain structure.
Assign at least one governance domain owner on each governance domain. This user will be the point of support and decision authority for data being consumed in this governance domain.
Assign data product owners to create data products in your governance domains. These should be business and data experts who can pair data with day-to-day scenarios in their governance domain.
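The relationships described above can be sketched as a small in-memory model. This is an illustration only, not the Microsoft Purview API: the `Catalog`, `GovernanceDomain`, and `DataProduct` classes and all of their fields are hypothetical names chosen for this example. It assumes two rules taken from the text: each governance domain has at least one owner, and each data product is managed by exactly one governance domain while remaining discoverable across all of them.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """Hypothetical model: a group of data assets packaged with a use case."""
    name: str
    use_case: str
    owner: str
    assets: list[str] = field(default_factory=list)


@dataclass
class GovernanceDomain:
    """Hypothetical model: a boundary for ownership and discovery."""
    name: str
    owners: list[str]
    data_products: dict[str, DataProduct] = field(default_factory=dict)


class Catalog:
    """Toy catalog (not a Purview client) that enforces the two rules above."""

    def __init__(self) -> None:
        self.domains: dict[str, GovernanceDomain] = {}
        self._managed_by: dict[str, str] = {}  # product name -> domain name

    def add_domain(self, domain: GovernanceDomain) -> None:
        # Every governance domain needs at least one owner.
        if not domain.owners:
            raise ValueError(f"{domain.name!r} needs at least one owner")
        self.domains[domain.name] = domain

    def register(self, domain_name: str, product: DataProduct) -> None:
        # A data product is managed by a single governance domain.
        if product.name in self._managed_by:
            raise ValueError(
                f"{product.name!r} is already managed by "
                f"{self._managed_by[product.name]!r}"
            )
        self.domains[domain_name].data_products[product.name] = product
        self._managed_by[product.name] = domain_name

    def search(self, keyword: str) -> list[tuple[str, str]]:
        # Discovery spans every governance domain in the catalog.
        needle = keyword.lower()
        return [
            (domain.name, product.name)
            for domain in self.domains.values()
            for product in domain.data_products.values()
            if needle in product.name.lower() or needle in product.use_case.lower()
        ]


catalog = Catalog()
catalog.add_domain(GovernanceDomain("Sales", owners=["alice"]))
catalog.add_domain(GovernanceDomain("Finance", owners=["bob"]))
catalog.register(
    "Sales",
    DataProduct(
        name="Quarterly revenue",
        use_case="Track revenue by region",
        owner="carol",
        assets=["sales.orders", "Revenue report"],
    ),
)
results = catalog.search("revenue")
print(results)
```

Registering the same data product under a second governance domain raises an error in this sketch, which mirrors the single-managing-domain rule. The search still finds the product regardless of which domain manages it.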
Glossary terms provide vocabulary for business users. These terms allow users to discover and work with data in vocabulary that is familiar to them, rather than in abstract technical terminology inherited from physical data sources.
Objectives and key results (OKRs) are the goals or desired outcomes of a governance domain (for example, a 10% rise in sales or a 3% reduction in support cases). Objectives should relate to everything an organization does and should define how it achieves its outcomes.
Health management actions give you and your users steps to improve data health and governance across your data estate. These actions correspond to the checks made to calculate a data product's data governance health control score. Addressing these actions raises your health score and makes Unified Catalog more usable and discoverable overall. Understanding the value of your data products improves the trust others place in that data and helps you prioritize which data to improve first.
Review health actions to start considering next steps for your data governance journey.
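The link between checks, health actions, and the health control score can be illustrated with a minimal sketch. The source does not describe how Microsoft Purview actually calculates the score, so everything here is an assumption: the check names are invented, and the score is simply the percentage of checks that pass.

```python
def health_score(checks: dict[str, bool]) -> float:
    """Return the percentage of passing checks (assumed formula, not Purview's)."""
    if not checks:
        return 0.0
    return 100.0 * sum(checks.values()) / len(checks)


def open_actions(checks: dict[str, bool]) -> list[str]:
    """Each failing check becomes an action someone needs to address."""
    return [name for name, passed in checks.items() if not passed]


# Hypothetical checks for one data product.
checks = {
    "has_description": True,
    "has_owner": True,
    "has_glossary_term": False,
    "has_data_quality_score": False,
}

score_before = health_score(checks)
actions = open_actions(checks)

# Addressing an action flips its check to passing and raises the score.
checks["has_glossary_term"] = True
score_after = health_score(checks)

print(score_before, actions, score_after)
```

In this toy model, resolving one of the two open actions moves the score from 50 to 75, which is the behavior the paragraph above describes: addressing actions raises the health score.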
Improve data quality and remove data issues
Data quality is the measurement of the quality of data in an organization, based on data quality rules that are configured and defined in Unified Catalog.
Data quality rules describe the state of the data along dimensions such as accuracy, completeness, conformity, consistency, timeliness, and uniqueness. Each rule, when it runs, produces a score that describes how close the data is to its desired state.
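As a rough illustration of a rule producing a score, the sketch below computes two of the dimensions named above, completeness and uniqueness, for a single column. The function names, the sample data, and the scoring formulas are assumptions made for this example. They are not the rule definitions that Unified Catalog uses.

```python
from typing import Optional, Sequence


def completeness(values: Sequence[Optional[str]]) -> float:
    """Share of values that are present (not None)."""
    if not values:
        return 0.0
    present = [v for v in values if v is not None]
    return len(present) / len(values)


def uniqueness(values: Sequence[Optional[str]]) -> float:
    """Share of present values that are distinct."""
    present = [v for v in values if v is not None]
    if not present:
        return 0.0
    return len(set(present)) / len(present)


# Hypothetical column: one missing value and one duplicate.
emails = ["a@example.com", None, "b@example.com", "a@example.com"]

completeness_score = completeness(emails)  # 3 of 4 values present
uniqueness_score = uniqueness(emails)      # 2 distinct among 3 present

print(f"completeness={completeness_score:.2f} uniqueness={uniqueness_score:.2f}")
```

A score of 1.0 means the column fully meets the rule. Lower values show how far the data is from its desired state, which is what makes scores comparable across rules and assets.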
Data profiling is the process of examining the data available in your data sources, collecting statistics and information about it, and assessing its quality level against a defined set of goals. If data is of poor quality or managed in structures that can't be integrated to meet the needs of the organization, it can affect business processes and decision-making.
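To make the idea of profiling concrete, here is a minimal sketch that collects basic statistics for one numeric column. The `profile_column` function and the statistics it reports are choices made for this example, under the assumption that missing values are represented as `None`. It does not reproduce the profiling output of Microsoft Purview.

```python
from statistics import mean
from typing import Optional, Sequence


def profile_column(values: Sequence[Optional[float]]) -> dict[str, object]:
    """Collect simple statistics for a numeric column with optional nulls."""
    present = [v for v in values if v is not None]
    return {
        "row_count": len(values),
        "null_count": len(values) - len(present),
        "distinct_count": len(set(present)),
        "min": min(present) if present else None,
        "max": max(present) if present else None,
        "mean": mean(present) if present else None,
    }


# Hypothetical "age" column with one missing value and one repeated value.
ages = [34, None, 29, 41, 29]
profile = profile_column(ages)
print(profile)
```

Statistics like these can inform which data quality rules are worth defining. For example, a high null count suggests a completeness rule, and an unexpected minimum or maximum suggests an accuracy or conformity rule.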
The following is a reference example to assist with planning the new Microsoft Purview data governance solution areas, scenarios, tasks, and personas with key stakeholders.
Week 1-2
| Area | Scenario | Task | Description/Outcomes | Persona |
| --- | --- | --- | --- | --- |
| Data management | Catalog setup | Set up first governance domain | Identify governance domain scope, usage, and owners. Assign accountability to governance domain owner, define/create your first governance domain, description, and assign data owners. Capture feedback establishing the governance domain. | Governance domain owner |
| | Catalog curation | Create data products in the governance domain | Identify scope of data to manage, publish, and owners. Create data products, descriptions, use cases, assign ownership, create and assign glossary terms to help increase usability for data consumers. Map data assets to data products, create access policies for data consumers to attest to when requesting access – capture feedback (ease of use for business unit to manage/understand/own curation) | Data product owner/data steward |
| | Publication | Publish the governance domain and data products | Publish the governance domain and associated data products to make available for discovery, understanding, and access through the Unified Catalog experience. Assign data consumers permissions to access and view the first governance domain by adding them to the Unified Catalog reader role and capture feedback with publication. | Governance domain owner and data product owner |
| | Operations | Data governance and management operations | Assess operational tasks, stakeholders, processes, and procedures to enable data governance and management, evaluate against current state data governance policies, practice, and culture to identify potential areas for improvement/change. | Data governance office |
Week 2-3
| Area | Scenario | Task | Description/Outcomes | Persona |
| --- | --- | --- | --- | --- |
| Data discovery, understanding, and access | Discover and access | Unified Catalog product search | Exercise the Unified Catalog product search experience to help data consumers and users discover and understand data products that are curated and developed for a specific business purpose. Assess data product metadata to determine proper usage, data quality, and applicability to data consumer business outcomes, and then request access. Assess ease of access to data products the user recently reviewed and data products subscribed to. Capture feedback on the full data consumer experience. | Data consumer |
| Data management | Access management | Access request management | Review access requests for data products in the first governance domain and approve or reject them. Engage with IT owners for approvals (as appropriate) to data assets. | Data product owner |
| | Catalog curation | Review data product discoverability | Review discoverability and usability of published data products along with data consumer feedback to inform semantic knowledge improvement opportunities (for example, glossary terms, attention to items in the action center, etc.). | Data product owner/data steward |
Week 3-4
| Area | Scenario | Task | Description/Outcomes | Persona |
| --- | --- | --- | --- | --- |
| Data management | Data quality | Improve data quality and reduce data issues | Assess top-level data quality by the first governance domain, and evaluate/set up data quality for associated data assets by data product (via connections). Use data quality profiling data to inform quality rules and dimensions to establish key data assets in the data product. Run data quality scans (ad-hoc or scheduled), monitor data quality activity and scans, and set up alerts to be informed of changes in data asset health (via target thresholds). Capture feedback on overall data quality experience. | Data quality steward |
| | Operations | Data governance and management operations | Assess operational tasks, stakeholders, processes, and procedures to enable data quality in the context of data governance and management. Evaluate against current state data governance policies, practice, and culture to identify potential areas for improvement or change. | Data governance office |
Week 4-5
| Area | Scenario | Task | Description/Outcomes | Persona |
| --- | --- | --- | --- | --- |
| Health management | Reports | Manage data governance | Review the controls with business data domain owners and set up a regular review of reporting on those controls. The goal of each review meeting is to review issues and prioritize solutions or data products that are needed to meet business needs. | Data governance office |
| | Health actions | Improve data governance | Take actions based on the controls to improve data governance and ensure standards are being met. | Data stewards/data product owners |
Unified Catalog overview page
The Overview page in Unified Catalog helps users in an organization get started with their data governance journey. It uses step-by-step instructions and video demos to help them understand and navigate the steps outlined in this article.