This page contains Python reference documentation for pipeline expectations.
Expectation decorators declare data quality constraints on materialized views, streaming tables, or temporary views created in a pipeline.
The `dp` module includes six decorators to control expectation behavior. The following table describes the dimensions on which these permutations differ:

| Behavior | Options |
|---|---|
| Action on violation | Retain invalid records and write a warning to metrics (`expect`), drop invalid records (`expect_or_drop`), or fail the update when an invalid record is found (`expect_or_fail`). |
| Number of expectations | A single expectation or multiple expectations. |
You can add multiple expectation decorators to your datasets, providing flexibility in strictness for your data quality constraints.
When you use `expect_all` decorators, each expectation has its own description and reports granular metrics.
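For example, the following minimal sketch mixes strictness levels on a single dataset. The source table `raw_orders`, its column names, and the ambient `spark` session available in pipeline source files are assumptions for illustration:

```python
from pyspark import pipelines as dp

@dp.materialized_view()
@dp.expect("valid order timestamp", "order_ts IS NOT NULL")  # warn: invalid records are retained and counted
@dp.expect_or_drop("positive amount", "amount > 0")          # drop: invalid records are removed from the output
def orders():
    # `raw_orders` is a hypothetical upstream table
    return spark.read.table("raw_orders")
```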
## Syntax

Expectation decorators come after a `@dp.table()`, `@dp.materialized_view()`, or `@dp.temporary_view()` decorator and before a dataset definition function, as in the following example:
```python
from pyspark import pipelines as dp

@dp.table()
@dp.expect(description, constraint)
@dp.expect_or_drop(description, constraint)
@dp.expect_or_fail(description, constraint)
@dp.expect_all({description: constraint, ...})
@dp.expect_all_or_drop({description: constraint, ...})
@dp.expect_all_or_fail({description: constraint, ...})
def <function-name>():
    return (<query>)
```
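As a concrete instance of this pattern, the following sketch fails the update whenever a record with a NULL key arrives; the source table `raw_events` and the `id` column are illustrative assumptions:

```python
from pyspark import pipelines as dp

@dp.table()
@dp.expect_or_fail("valid id", "id IS NOT NULL")
def events():
    # `raw_events` is a hypothetical streaming source
    return spark.readStream.table("raw_events")
```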
## Parameters
| Parameter | Type | Description |
|---|---|---|
| `description` | `str` | Required. A description that identifies the constraint. Constraint descriptions must be unique for each dataset. |
| `constraint` | `str` | Required. A SQL conditional statement that must evaluate to true or false for each record. The constraint contains the actual logic for what is being validated. When a record fails this condition, the expectation is triggered. |
The `expect_all` decorators require descriptions and constraints to be passed as a `dict` of key-value pairs, as in the sketch below.
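The following sketch drops any record that fails either constraint; the table and column names are illustrative assumptions:

```python
from pyspark import pipelines as dp

@dp.materialized_view()
@dp.expect_all_or_drop({
    "valid email": "email IS NOT NULL",    # keys are unique descriptions
    "valid age": "age BETWEEN 0 AND 120",  # values are SQL constraint clauses
})
def users():
    # `raw_users` is a hypothetical upstream table
    return spark.read.table("raw_users")
```

Each constraint in the dict reports its own metrics under its description.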