I'm glad that you were able to resolve your issue, and thank you for posting your solution so that others experiencing the same thing can easily reference it! Since the Microsoft Q&A community has a policy that "The question author cannot accept their own answer. They can only accept answers by others", I'll repost your solution in case you'd like to accept the answer.
Ask: I need assistance in creating a functional Sensitive Info Type (SIT) to prevent the unintended or unauthorized sharing of South African ID numbers. I’ve tried using the existing SIT, but it doesn't detect the ID numbers during testing. Additionally, I attempted to create a custom SIT using a regular expression, and although I've found some expressions that are typically used to validate South African ID numbers, they aren't working as expected. The regular expressions work fine on https://regex101.com, but when I apply them in the SIT configuration, they fail.
**Solution:** After further investigation, I discovered that Microsoft Purview utilizes a subset of .NET regex, which can cause certain features to behave differently. For instance, I had to avoid using \b (word boundary) and instead opted for (?<!\d) as an opener and (?!\d) as a closer for the regex.
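To make the difference between the two boundary styles concrete, here is a minimal Python sketch. It only shows how \b and the (?<!\d) / (?!\d) pair behave in a standard regex engine; it does not reproduce Purview's engine. The 13-digit pattern is a simplified stand-in rather than the full ID regex, and the sample strings are made up for illustration.

```python
import re

# Simplified stand-in pattern: 13 consecutive digits.
# This is NOT the full ID regex from the post, only a boundary demo.
WORD_BOUNDARY = re.compile(r"\b\d{13}\b")
DIGIT_BOUNDARY = re.compile(r"(?<!\d)\d{13}(?!\d)")

# Illustrative sample strings (assumptions, not real data).
samples = [
    "ID: 8001015009087",    # separated by a space
    "ID8001015009087",      # letters directly before the digits
    "ref_8001015009087_x",  # underscores count as word characters
    "80010150090871",       # 14 digits, should never match
]

for text in samples:
    word = bool(WORD_BOUNDARY.search(text))
    digit = bool(DIGIT_BOUNDARY.search(text))
    print(f"{text!r}: word_boundary={word} digit_lookaround={digit}")
```

The lookaround pair only asserts that the neighbouring character is not a digit, so it still matches when the number is glued to letters or underscores, while both styles reject a longer run of digits.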
Here’s an example comparing the original regex we use on Mimecast and the modified version for Purview:
Original Regex (Mimecast):
\b(([0-57-9]\d(0[1-9]|1[012]))|(61-9)|(601[012]))(0[1-9]|[12][0-9]|3[01])[ -]?\d\d\d\d[ -]?\d\d\d\b
Modified Regex (Purview):
(?<!\d)(([0-57-9]\d(0[1-9]|1[0-2]))|(61-9)|(601[0-2]))(0[1-9]|[12][0-9]|3[01])[ -]?\d{4}[ -]?\d{3}(?!\d)
The modified regex successfully worked in my SIT environment and can distinguish between valid and invalid ID numbers.
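For anyone who wants to sanity-check the modified pattern locally before pasting it into a SIT, a rough sketch in Python follows. Python's `re` module is not the engine Purview uses, so a pass here is only a first filter, not proof that the SIT will behave the same way. The pattern is copied as posted, and the test strings are made-up examples. They deliberately use a birth year outside the 1960s, because the (61-9) and (601[0-2]) alternatives may have lost characters in posting and are not exercised here.

```python
import re

# Pattern copied verbatim from the "Modified Regex (Purview)" line above.
PURVIEW_PATTERN = re.compile(
    r"(?<!\d)(([0-57-9]\d(0[1-9]|1[0-2]))|(61-9)|(601[0-2]))"
    r"(0[1-9]|[12][0-9]|3[01])[ -]?\d{4}[ -]?\d{3}(?!\d)"
)

# Illustrative inputs (assumptions) mapped to the expected result.
cases = {
    "8001015009087": True,     # YYMMDD = 800101, no separators
    "800101 5009 087": True,   # space separators
    "800101-5009-087": True,   # hyphen separators
    "8013015009087": False,    # month 13 is rejected
    "8001325009087": False,    # day 32 is rejected
    "80010150090875": False,   # 14 digits, trailing digit blocks the match
}

for text, expected in cases.items():
    matched = PURVIEW_PATTERN.search(text) is not None
    status = "ok" if matched == expected else "UNEXPECTED"
    print(f"{text!r}: matched={matched} ({status})")
```

Note that this pattern only checks the date portion and the overall digit layout; it does not verify any check digit.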
Thank you once again, and I appreciate the community's continued support.
If I missed anything please let me know and I'd be happy to add it to my answer, or feel free to comment below with any additional information.
If you have any other questions, please let me know. Thank you again for your time and patience throughout this issue.
Please don’t forget to "Accept Answer" and select "Yes" for "Was this answer helpful" wherever the information provided helps you, as this can be beneficial to other community members.