Purview Data Quality - Custom Rules

Donofrio, Nicolai 5 Reputation points
2024-12-04T20:32:21.4866667+00:00

We can set up out-of-the-box reference checks in Purview DQ, but we're struggling to implement similar rules in the custom section.

Specifically, we can't find a way to reference other tables at all.

We've asked for documentation but have not received anything useful to this point and have exhausted the resources that are publicly available.

One MS page says, "The Custom rule enables specifying rules that try to validate rows based on one or more values in that row".

Does this imply that referencing other tables is not supported in the custom section? Why would that be?

If so, will it be supported at a later date? Is there any documentation on how the expression builder maps to the actual code being run?

The syntax/supported functions are not the same as our Synapse environment. It's really confusing that this is the instruction, yet there is seemingly no documentation to support it:

[User's image]

Has anyone had any luck here? What are we missing?

Thank you!!

Microsoft Purview
A Microsoft data governance service that helps manage and govern on-premises, multicloud, and software-as-a-service data. Previously known as Azure Purview.

1 answer

  1. Smaran Thoomu 21,600 Reputation points Microsoft External Staff
    2024-12-05T12:44:30.3366667+00:00

    Hi @Donofrio, Nicolai
    Welcome to Microsoft Q&A platform.
    Thank you for your detailed question about implementing custom rules in Microsoft Purview Data Quality (DQ). I'll address your concerns step-by-step:

    • Purview DQ custom rules are currently designed to validate rows within a single dataset, focusing on row-level or column-level validation. Referencing other tables is not supported in the custom rules section.

    The statement, "The Custom rule enables specifying rules that try to validate rows based on one or more values in that row", indicates that custom rules are limited to single-table validation.

    This limitation might stem from Purview's current focus on ensuring data quality within individual datasets rather than implementing cross-dataset validation, which can significantly increase complexity.

    As of now, Microsoft has not officially announced plans to support cross-table references in custom rules. However, it's a common feature request, and I recommend submitting or upvoting it on the Azure Feedback Portal to ensure visibility to the product team.

    • The Purview expression builder uses a distinct syntax specific to the Data Quality (DQ) module and is not the same as T-SQL or Synapse SQL. Unfortunately, the documentation for this syntax is currently sparse. The expression builder maps directly to the underlying data quality engine, which executes these rules during validation.
    • If referencing other tables is essential, you could preprocess your data outside of Purview (e.g., in Synapse or Azure Data Factory) to create a unified view that includes data from both tables. Then, apply Purview DQ rules on this view.
    • For custom rules to work effectively, ensure that:
      • The dataset schema in Purview is accurately defined.
      • The rule expressions align with the supported syntax in Purview.
    • If the syntax or supported functions differ significantly from Synapse, it’s likely because Purview's DQ engine operates independently of Synapse’s SQL runtime.
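
    To illustrate the preprocessing workaround mentioned above, here is a minimal sketch. It uses Python's built-in sqlite3 as a stand-in for Synapse so it can be run anywhere. The table names (orders, customers), the view name (orders_dq), and the flag column (customer_exists) are all hypothetical and not from Purview or your environment. The idea is to resolve the cross-table lookup upstream and expose the result as an ordinary column, so that a row-level rule only needs to inspect values in its own row.

```python
import sqlite3

# In-memory database standing in for the upstream SQL environment (assumption).
conn = sqlite3.connect(":memory:")

conn.executescript(
    """
    -- Hypothetical reference table and fact table.
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER);

    INSERT INTO customers VALUES (1), (2);
    INSERT INTO orders VALUES (100, 1), (101, 2), (102, 99);

    -- Unified view: the cross-table check becomes a plain 0/1 column.
    CREATE VIEW orders_dq AS
    SELECT o.order_id,
           o.customer_id,
           CASE WHEN c.customer_id IS NULL THEN 0 ELSE 1 END AS customer_exists
    FROM orders AS o
    LEFT JOIN customers AS c ON c.customer_id = o.customer_id;
    """
)

# Each row now carries its own reference-check result.
rows = conn.execute(
    "SELECT order_id, customer_exists FROM orders_dq ORDER BY order_id"
).fetchall()

# A row-level rule would only need to test customer_exists == 1.
failing = [order_id for order_id, exists in rows if exists == 0]
print(failing)
```

    With a view like this registered as the data asset, the custom rule itself would reduce to a single-row check on the flag column. How that check is written depends on the expression builder's supported syntax, which this sketch does not attempt to reproduce.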

    For clarity on syntax, you can submit a support request via Azure Support. They may provide additional guidance or internal documentation not yet published.

    I hope this helps! Let me know if you have any other questions.
