Review anomaly detection logged results

By default, data quality monitoring scan results are stored in the system.data_quality_monitoring.table_results table. Only account admins can access this table, and they must grant access to others as needed. The anomaly detection results are kept in default storage, and you are not billed for that storage.

Important

The results table system.data_quality_monitoring.table_results contains all results across the entire metastore and includes sample values from tables in each catalog. Use caution when granting access to this table.
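
As a rough sketch of what granting access might look like, the helper below only builds the SQL text for an admin to review and run. It assumes standard Unity Catalog GRANT syntax and that a reader needs USE CATALOG and USE SCHEMA on the parent objects in addition to SELECT on the table. The group name data-quality-reviewers and both function names are hypothetical.

```python
# Fully qualified name of the results table, as given in this article.
RESULTS_TABLE = "system.data_quality_monitoring.table_results"


def quote_principal(principal: str) -> str:
    """Backtick-quote a principal name, doubling any embedded backticks."""
    return "`" + principal.replace("`", "``") + "`"


def build_grant_statements(principal: str) -> list[str]:
    """Build the GRANT statements assumed necessary to read the results table.

    Assumption: the principal needs USE CATALOG and USE SCHEMA on the parent
    objects as well as SELECT on the table itself.
    """
    catalog, schema, _table = RESULTS_TABLE.split(".")
    who = quote_principal(principal)
    return [
        f"GRANT USE CATALOG ON CATALOG {catalog} TO {who}",
        f"GRANT USE SCHEMA ON SCHEMA {catalog}.{schema} TO {who}",
        f"GRANT SELECT ON TABLE {RESULTS_TABLE} TO {who}",
    ]


# "data-quality-reviewers" is a made-up group name used for illustration.
for statement in build_grant_statements("data-quality-reviewers"):
    print(statement)
```

Because the table includes sample values from every catalog in the metastore, granting to a small, purpose-built group is likely safer than granting to a broad one.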

Anomaly detection result table schema

Each row in the results table corresponds to a single table in the schema that was scanned.

The table has the following schema:

| Column name | Contents (for struct data type) | Data type | Description | Example data |
| --- | --- | --- | --- | --- |
| event_time | | timestamp | Time when the row was generated. | 2025-06-27T12:00:00 |
| catalog_name | | string | Name of the catalog. Used to identify the table. | main |
| schema_name | | string | Name of the schema. Used to identify the table. | default |
| table_name | | string | Name of the table. Used to identify the table. | events |
| catalog_id | | string | Stable ID for the catalog. | 3f1a7d6e-9c59-4b76-8c32-8d4c74e289fe |
| schema_id | | string | Stable ID for the schema. | 3f1a7d6e-9c59-4b76-8c32-8d4c74e289fe |
| table_id | | string | Stable ID for the table. | 3f1a7d6e-9c59-4b76-8c32-8d4c74e289fe |
| status | | string | Consolidated health status at the table level. Unhealthy if any check or group is unhealthy. | Healthy, Unhealthy, Unknown |
| freshness | | struct | Freshness checks. | |
| | status | string | Overall freshness status. | Unhealthy |
| | commit_freshness | struct | Commit freshness check results. | |
| completeness | | struct | Completeness check results. | |
| | status | string | Status of completeness check. | Unhealthy |
| | total_row_count | struct | Total number of rows in the table over time. | |
| | daily_row_count | struct | Number of rows added each day. | |
| downstream_impact | | struct | Summary of downstream impact based on dependency graph. | |
| | impact_level | int | Severity indicator (0 = none, 1 = low, 2 = medium, 3 = high, 4 = very high). | 2 |
| | num_downstream_tables | int | Number of downstream tables affected. | 5 |
| | num_queries_on_affected_tables | int | Number of queries run on affected downstream tables over the last 30 days. | 120 |
| root_cause_analysis | | struct | Information about upstream jobs contributing to the issue. | |
| | upstream_jobs | array | Metadata for each upstream job. | |
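
To illustrate how the top-level columns fit together, here is a minimal sketch that keeps the most recent row per table_id and lists the tables whose consolidated status is Unhealthy. It works on plain dictionaries, such as those produced by calling PySpark's Row.asDict(recursive=True) on rows read from the results table. The function name and the sample rows are invented for illustration.

```python
from datetime import datetime


def latest_unhealthy(rows):
    """Return full names of tables whose newest result row is Unhealthy."""
    latest = {}
    for row in rows:
        key = row["table_id"]  # stable ID, so renames do not split a table's history
        if key not in latest or row["event_time"] > latest[key]["event_time"]:
            latest[key] = row
    newest_first = sorted(latest.values(), key=lambda r: r["event_time"], reverse=True)
    return [
        f'{r["catalog_name"]}.{r["schema_name"]}.{r["table_name"]}'
        for r in newest_first
        if r["status"] == "Unhealthy"
    ]


# Invented sample rows that use the documented column names.
sample_rows = [
    {"event_time": datetime(2025, 6, 26, 12), "catalog_name": "main",
     "schema_name": "default", "table_name": "events", "table_id": "t1",
     "status": "Healthy"},
    {"event_time": datetime(2025, 6, 27, 12), "catalog_name": "main",
     "schema_name": "default", "table_name": "events", "table_id": "t1",
     "status": "Unhealthy"},
    {"event_time": datetime(2025, 6, 27, 12), "catalog_name": "main",
     "schema_name": "default", "table_name": "orders", "table_id": "t2",
     "status": "Healthy"},
]

print(latest_unhealthy(sample_rows))  # prints ['main.default.events']
```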

commit_freshness struct fields

The commit_freshness struct contains the following:

| Item name | Data type | Description | Example data |
| --- | --- | --- | --- |
| status | string | Status of commit freshness check. | Unhealthy |
| error_code | string | Error code returned when the check encounters an error. | FAILED_TO_FIT_MODEL |
| last_value | timestamp | Last commit timestamp. | 2025-06-27T11:30:00 |
| predicted_value | timestamp | Predicted time by which the table should have been updated. | 2025-06-27T11:45:00 |
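
One way to read these fields together is to measure how far past predicted_value a given point in time is. The sketch below does that for a commit_freshness struct represented as a dictionary. The function name is hypothetical, and the status field remains the authoritative result of the check; this calculation is only a convenience for reporting.

```python
from datetime import datetime, timedelta


def overdue_by(commit_freshness, as_of):
    """Return how long `as_of` is past predicted_value (zero if not yet past).

    Returns None when no prediction is available, for example when the
    check reported an error_code instead of a prediction.
    """
    predicted = commit_freshness.get("predicted_value")
    if predicted is None:
        return None
    return max(as_of - predicted, timedelta(0))


# Values taken from the example data column above.
sample = {
    "status": "Unhealthy",
    "error_code": None,
    "last_value": datetime(2025, 6, 27, 11, 30),
    "predicted_value": datetime(2025, 6, 27, 11, 45),
}

print(overdue_by(sample, as_of=datetime(2025, 6, 27, 12, 0)))  # prints 0:15:00
```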

total_row_count and daily_row_count struct fields

The total_row_count and daily_row_count structs contain the following:

| Item name | Data type | Description | Example data |
| --- | --- | --- | --- |
| status | string | Status of the check. | Unhealthy |
| error_code | string | Error code returned when the check encounters an error. | FAILED_TO_FIT_MODEL |
| last_value | int | Number of rows observed in the last 24 hours. | 500 |
| min_predicted_value | int | Minimum expected number of rows in the last 24 hours. | 10 |
| max_predicted_value | int | Maximum expected number of rows in the last 24 hours. | 1000 |
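
When triaging a completeness result, it can help to see on which side of the predicted range the observed count fell. The following sketch compares last_value with min_predicted_value and max_predicted_value for either struct, represented as a dictionary. The function name and the returned labels are assumptions made for illustration, and the comparison does not replace the status field.

```python
def position_in_range(check):
    """Classify last_value relative to the predicted [min, max] range."""
    last = check.get("last_value")
    low = check.get("min_predicted_value")
    high = check.get("max_predicted_value")
    if last is None or low is None or high is None:
        return "unknown"  # for example, when the check reported an error_code
    if last < low:
        return "below"
    if last > high:
        return "above"
    return "within"


# Numbers taken from the example data column above.
print(position_in_range(
    {"last_value": 500, "min_predicted_value": 10, "max_predicted_value": 1000}
))  # prints within
```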

upstream_jobs array structure

Each element of the upstream_jobs array, a field of the root_cause_analysis struct, has the following structure:

| Item name | Data type | Description | Example data |
| --- | --- | --- | --- |
| job_id | string | Job ID. | 12345 |
| workspace_id | string | Workspace ID. | 6051921418418893 |
| job_name | string | Job display name. | daily_refresh |
| last_run_status | string | Status of the most recent run. | SUCCESS |
| run_page_url | string | URL of Databricks job run page. | https://<workspace_url>/runs/123 |
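
A typical use of this array is to find the upstream jobs worth inspecting first. The sketch below returns the name and run page URL of every job whose last run status is anything other than SUCCESS. SUCCESS is the only status value shown in this article, so the FAILED value, the second job, and its URL in the sample are invented, as is the function name.

```python
def jobs_needing_attention(root_cause_analysis):
    """Return (job_name, run_page_url) for jobs whose last run was not SUCCESS."""
    jobs = (root_cause_analysis or {}).get("upstream_jobs") or []
    return [
        (job["job_name"], job["run_page_url"])
        for job in jobs
        if job.get("last_run_status") != "SUCCESS"
    ]


# The first job mirrors the example data above; the second is invented.
sample = {
    "upstream_jobs": [
        {"job_id": "12345", "workspace_id": "6051921418418893",
         "job_name": "daily_refresh", "last_run_status": "SUCCESS",
         "run_page_url": "https://<workspace_url>/runs/123"},
        {"job_id": "67890", "workspace_id": "6051921418418893",
         "job_name": "hourly_ingest", "last_run_status": "FAILED",
         "run_page_url": "https://<workspace_url>/runs/456"},
    ]
}

print(jobs_needing_attention(sample))
```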

Downstream impact information

In the logged results table, the column downstream_impact is a struct with the following fields:

| Field | Type | Description |
| --- | --- | --- |
| impact_level | int | Integer severity indicator for the data quality issue, from 0 (none) to 4 (very high). Higher values indicate greater disruption. |
| num_downstream_tables | int | Number of downstream tables that might be affected by the identified issue. |
| num_queries_on_affected_tables | int | Total number of queries that have referenced the affected and downstream tables in the past 30 days. |
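
For reports or alerts, the numeric impact_level is often easier to read as a label. The sketch below maps it using the labels listed in the result table schema (0 = none through 4 = very high) and formats a one-line summary of a downstream_impact struct represented as a dictionary. The function name and the summary wording are assumptions.

```python
# Labels as listed for impact_level in the result table schema.
IMPACT_LABELS = {0: "none", 1: "low", 2: "medium", 3: "high", 4: "very high"}


def describe_impact(downstream_impact):
    """Format a downstream_impact struct as a short human-readable summary."""
    label = IMPACT_LABELS.get(downstream_impact.get("impact_level"), "unknown")
    tables = downstream_impact.get("num_downstream_tables")
    queries = downstream_impact.get("num_queries_on_affected_tables")
    return (
        f"{label} impact: {tables} downstream tables, "
        f"{queries} queries in the last 30 days"
    )


# Values taken from the example data in the schema table.
print(describe_impact({
    "impact_level": 2,
    "num_downstream_tables": 5,
    "num_queries_on_affected_tables": 120,
}))  # prints medium impact: 5 downstream tables, 120 queries in the last 30 days
```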