Hi @Donofrio, Nicolai
Welcome to the Microsoft Q&A platform.
Thank you for your detailed question about implementing custom rules in Microsoft Purview Data Quality (DQ). I'll address your concerns point by point:
- Currently, Purview DQ custom rules are designed to validate rows within the same dataset, focusing on row-level or column-level validation. Unfortunately, referencing other tables is not supported in the custom rules section as of now.
The statement "The Custom rule enables specifying rules that try to validate rows based on one or more values in that row" is consistent with this: a custom rule evaluates values within a single row, and therefore within a single table.
This limitation might stem from Purview's current focus on ensuring data quality within individual datasets rather than implementing cross-dataset validation, which can significantly increase complexity.
As of now, Microsoft has not officially announced plans to support cross-table references in custom rules. However, it's a common feature request, and I recommend submitting or upvoting it on the Azure Feedback Portal to ensure visibility to the product team.
- The Purview expression builder uses a distinct syntax specific to the Data Quality (DQ) module and is not the same as T-SQL or Synapse SQL. Unfortunately, the documentation for this syntax is currently sparse. The expression builder maps directly to the underlying data quality engine, which executes these rules during validation.
- If referencing other tables is essential, you could preprocess your data outside of Purview (e.g., in Synapse or Azure Data Factory) to create a unified view that includes data from both tables. Then, apply Purview DQ rules on this view.
- For custom rules to work effectively, ensure that:
  - The dataset schema in Purview is accurately defined.
  - The rule expressions align with the supported syntax in Purview.
- If the syntax or supported functions differ significantly from Synapse, it’s likely because Purview's DQ engine operates independently of Synapse’s SQL runtime.
For clarity on syntax, you can submit a support request via Azure Support. They may provide additional guidance or internal documentation not yet published.
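To make the preprocessing workaround more concrete, here is a minimal sketch of the "unified view" idea. It uses Python's built-in sqlite3 module purely as a stand-in for Synapse, and the table, column, and view names (orders, customers, orders_dq, customer_exists) are hypothetical. The point is only that the join happens upstream, so the cross-table check becomes an ordinary column that a single-table rule can inspect:

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for the real warehouse.
# All table, column, and view names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER);

    INSERT INTO customers VALUES (1, 'Contoso'), (2, 'Fabrikam');
    INSERT INTO orders VALUES (100, 1), (101, 2), (102, 99);

    -- The view flattens the cross-table lookup into a plain column,
    -- so a row-level rule only has to look at customer_exists.
    CREATE VIEW orders_dq AS
    SELECT o.order_id,
           o.customer_id,
           CASE WHEN c.customer_id IS NULL THEN 0 ELSE 1 END AS customer_exists
    FROM orders AS o
    LEFT JOIN customers AS c ON c.customer_id = o.customer_id;
""")

rows = conn.execute(
    "SELECT order_id, customer_id, customer_exists FROM orders_dq ORDER BY order_id"
).fetchall()

for row in rows:
    print(row)
```

In this sketch, order 102 points at a customer that does not exist, so its customer_exists value is 0. Once an equivalent view is registered as a data asset, the check on that column stays within a single row, which is the kind of validation custom rules are described as supporting.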
I hope this helps! Let me know if you have any other questions.