Managing data quality for critical data elements
Critical data elements (CDEs) are a logical grouping of important columns across tables in your data sources that allow you to strategically focus your governance efforts where you'll have the most effect.
Microsoft Purview Data Quality offers an integrated solution for measuring the quality of CDEs, helping organizations ensure that these key data elements meet the required standards for accuracy, completeness, consistency, and integrity.
Organizations can establish specific quality thresholds that CDEs must meet to maintain their quality. Those thresholds are applied at the logical CDE level, but trickle down to all the individual columns that comprise the CDE. The data quality rules behind these thresholds can encompass various aspects of data quality, including validation, cleansing, standardization, and enrichment. For example, data quality rules might specify that customer addresses must be standardized to a specific format, or that employee IDs must adhere to a certain pattern.
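As a rough illustration of how a rule defined once at the logical CDE level can fan out to every physical column in the CDE, consider the following minimal sketch. The `ColumnRef` and `CdeRule` classes, the `expand_rules` function, and the asset and column names are all hypothetical; they aren't part of Microsoft Purview and only model the idea described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnRef:
    """A physical column that belongs to a CDE (hypothetical model)."""
    asset: str
    column: str


@dataclass(frozen=True)
class CdeRule:
    """A rule and threshold defined once at the logical CDE level (hypothetical model)."""
    rule_type: str
    threshold: float  # minimum acceptable score, from 0 to 100


def expand_rules(columns, rules):
    """Pair every CDE-level rule with every physical column in the CDE."""
    return [(col, rule) for col in columns for rule in rules]


# Illustrative "Employee ID" CDE made up of two physical columns.
employee_id_columns = [
    ColumnRef("hr_db.employees", "emp_id"),
    ColumnRef("payroll_lake.salaries", "employee_id"),
]
employee_id_rules = [
    CdeRule("Unique values", 100.0),
    CdeRule("Empty/blank fields", 99.0),
]

assignments = expand_rules(employee_id_columns, employee_id_rules)
for col, rule in assignments:
    print(f"{col.asset}.{col.column}: {rule.rule_type} >= {rule.threshold}")
```

In this sketch, two rules on a CDE with two columns yield four column-level checks, which is the "trickle down" behavior in miniature.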
Once data quality rules are applied to CDEs, Microsoft Purview Data Quality systematically evaluates the underlying physical data elements to assess their compliance with these rules. By using Purview Data Quality's integrated approach, organizations can proactively monitor and manage the quality of their critical data elements, ensuring that they remain reliable, accurate, and fit for purpose. This not only enhances decision-making processes but also helps mitigate risks associated with data errors or inconsistencies, ultimately driving better business outcomes.
Supported asset types
- Azure Data Lake Storage Gen2
  - File types: Delta format
- Azure SQL Database
- Microsoft Fabric lakehouse (delta table)
Available data quality rules for CDEs
Microsoft Purview Data Quality supports the following rules for CDEs. For more information about each rule, see the general data quality rules article.
Rule | Definition |
---|---|
Unique values | Confirms that the values in a column are unique. |
Data type match | Confirms that the values in a column match their data type requirements. |
Empty/blank fields | Looks for blank and empty fields in a column where there should be values. |
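To make the three rule definitions concrete, the following sketch scores a column of values against each rule. It's an assumption-laden illustration, not how Microsoft Purview computes scores: the function names are hypothetical, and each score is simply the percentage of values that pass the check.

```python
from collections import Counter


def _percentage(passing, total):
    """Return a 0-100 score; an empty column is treated as fully passing."""
    return 100.0 if total == 0 else 100.0 * passing / total


def unique_values_score(values):
    """Share of values that occur exactly once in the column."""
    counts = Counter(values)
    passing = sum(1 for v in values if counts[v] == 1)
    return _percentage(passing, len(values))


def data_type_match_score(values, parser):
    """Share of values that the given parser (for example, int) accepts."""
    passing = 0
    for v in values:
        try:
            parser(v)
            passing += 1
        except (ValueError, TypeError):
            pass  # value doesn't match the expected type
    return _percentage(passing, len(values))


def empty_blank_score(values):
    """Share of values that are neither None nor blank or whitespace-only."""
    passing = sum(1 for v in values if v is not None and str(v).strip() != "")
    return _percentage(passing, len(values))


unique_score = unique_values_score(["a", "b", "b", "c"])
type_score = data_type_match_score(["1", "2", "x", "4"], int)
blank_score = empty_blank_score(["a", "", None, "  "])
print(unique_score, type_score, blank_score)
```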
Configure data quality for CDEs
If you haven't already, create a critical data element (CDE) and add columns to it.
Open your CDE:
- Open the data catalog, select the Data management drop-down, and then select the Governance domains submenu.
- Select a governance domain from the list.
- Select the Critical data elements tile.
- Select a critical data element from the list.
Select the Data quality tab in your critical data element.
Add a new rule to the critical data element by selecting New rule.
Select the data quality rule type you want to use and select Next.
Provide the details necessary for your rule type.
Choose whether to turn the rule On or Off.
Select Create.
Execute data quality rules for CDEs
When a data quality scan is run for an available data asset that has a column associated with a CDE, the data quality rules you've configured for that CDE will produce a score.
Schedule or run a data quality scan for your data assets associated with your CDE.
Monitor the progress of the data quality scanning job as it executes to confirm that it completes without errors or interruptions. In the history snapshot, check that the applied data quality rules ran successfully.
Review the results of the scanning job to assess the quality of the CDE data asset based on the applied rules.
Analyze the findings from the data quality scanning job to identify any issues, anomalies, or areas of improvement related to the CDE data asset. Remediation could involve cleansing, standardizing, or enriching the data to improve its quality.
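One way to approach the review and analysis steps is to list the rule results that fall below their thresholds, lowest score first, so the weakest columns are remediated first. The sketch below assumes a hypothetical result shape (a list of dictionaries with `column`, `rule`, and `score` keys) and hypothetical threshold values; it doesn't reflect any actual Microsoft Purview export format or API.

```python
def find_failures(results, thresholds):
    """Return (column, rule, score) for each result under its rule's threshold,
    sorted from the lowest score to the highest."""
    failures = [
        (r["column"], r["rule"], r["score"])
        for r in results
        if r["score"] < thresholds[r["rule"]]
    ]
    return sorted(failures, key=lambda failure: failure[2])


# Hypothetical scores for two physical columns that belong to one CDE.
results = [
    {"column": "hr_db.employees.emp_id", "rule": "Unique values", "score": 100.0},
    {"column": "hr_db.employees.emp_id", "rule": "Empty/blank fields", "score": 97.5},
    {"column": "payroll_lake.salaries.employee_id", "rule": "Unique values", "score": 92.0},
    {"column": "payroll_lake.salaries.employee_id", "rule": "Empty/blank fields", "score": 99.8},
]
thresholds = {"Unique values": 100.0, "Empty/blank fields": 99.0}

failures = find_failures(results, thresholds)
for column, rule, score in failures:
    print(f"{column}: {rule} scored {score}, below {thresholds[rule]}")
```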