Share via

Is it possible for a Custom SQL Data Quality Rule to be written using subqueries?

Luke Pettigrew 80 Reputation points
2026-03-19T15:13:39.4266667+00:00

Is it possible for a Custom SQL Data Quality Rule to be written using subqueries instead of joins (which are not available to use)?

Eg. I want measure the completeness of an email address where the customer has opted in to receive email correspondence.

Where the mail address is held on a Customer table and the opt in flag is on preferences table

Microsoft Security | Microsoft Purview
0 comments No comments

Answer accepted by question author
  1. SAI JAGADEESH KUDIPUDI 2,210 Reputation points Microsoft External Staff Moderator
    2026-03-23T03:03:59.9066667+00:00

    Hi Luke Pettigrew,
    The statement that joins and subqueries are not supported in Microsoft Purview Unified Catalog Custom SQL Data Quality rules is correct.

    This aligns with current product limitations, where custom rules are evaluated as row-level expressions on a single data asset only, and do not support cross-table operations such as joins or subqueries.

    To meet your requirement, you need to pre-join the Customer and Preferences data into a single table or view, and then apply the data quality rule on that combined dataset.

    Reference Links : https://learn.microsoft.com/purview/unified-catalog-data-quality-rules#custom-rules

    Hope this helps. If you have any follow-up questions, please let me know. I would be happy to help.

    Please do not forget to "Accept Answer" and "up-vote" wherever the information provided helps you, as this can be beneficial to other community members.

    0 comments No comments

1 additional answer

Sort by: Most helpful
  1. Q&A Assist
    2026-03-19T15:13:55.49+00:00
    AI generated content. This question contains an answer created with AI from Q&A Assist. Learn more

    Yes. Custom SQL Data Quality Rules in Microsoft Purview can use subqueries.

    Custom rules are authored as Spark SQL predicates, and Spark SQL fully supports subqueries, including in WHERE and SELECT clauses. The documentation explicitly shows a custom rule using a subquery, for example:

    distance_km > 
    ( 
      SELECT min(distance_km) 
      FROM Tripdata t  
      WHERE t.payment_type = payment_type
    )
    

    This pattern can be adapted to measure completeness of an email address for customers who have opted in, even when joins are not directly available in the rule UI, by expressing the logic as a subquery predicate.

    When doing so, it is important to avoid ambiguous column references and column shadowing. If a column name exists both in the outer query and the subquery, qualify or alias columns so the outer reference is unambiguous, as shown in the corrected example from the documentation:

    distance_km > 
    ( 
      SELECT min(u.distance_km) 
      FROM ( 
        SELECT distance_km, payment_type AS pt 
        FROM Tripdata 
      ) u 
      WHERE u.pt = payment_type   -- binds to outer payment_type
    )
    

    The same technique can be used for an email/opt‑in scenario: place the appropriate subquery in the rule’s row expression, and carefully qualify or alias columns to ensure the predicate uses the intended outer row values.


    References:

Your answer

Answers can be marked as 'Accepted' by the question author and 'Recommended' by moderators, which helps users know the answer solved the author's problem.