Regular Expressions in Transport Rules
Applies to: Exchange Server 2010
You can use regular expressions in Exchange Server 2010 transport rule predicates to match text patterns in different parts of a message (such as message headers, sender, recipients, message subject, and body). Predicates are used by conditions and exceptions to determine whether a configured action should be applied to an e-mail message.
Looking for management tasks related to transport rules? See Managing Transport Rules.
Contents
Simple Expressions vs. Regular Expressions
Regular Expressions in Exchange 2010
Creating a Transport Rule That Uses a Regular Expression
Simple Expressions vs. Regular Expressions
To understand regular expressions, you must first understand simple expressions. A simple expression is a specific value that you want to match exactly in a message. Predicates using simple expressions match specific words or strings. An example of a simple expression is the title of a document that your organization doesn't want to be distributed outside the organization, such as Yearly Sales Forecast.doc. A piece of data in an e-mail message must exactly match a simple expression to satisfy a condition or exception in transport rules.
A regular expression is a concise and flexible notation for finding patterns of text in a message. The notation consists of two basic character types:
- Literal characters: Text that must exist in the target string. These are normal characters, as typed.
- Metacharacters: One or more special characters that aren't interpreted literally. These indicate how the text can vary in the target string.
You can use regular expressions to quickly parse e-mail messages to find specific text patterns. This enables you to detect messages with specific types of content, such as social security numbers (SSNs), patent numbers, and phone numbers.
You can't reasonably match this data with a simple expression because a simple expression requires that you enter every possible variation of the value that you want to detect. In many cases, using simple expressions for such applications becomes a logistical challenge, and matching a large number of simple expressions in message content can be resource-intensive. Using regular expressions is generally more efficient. Instead of specifying all possible variations, you can configure the transport rule predicate to search for a text pattern.
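The difference can be sketched outside Exchange. The following Python example is an illustration only: it uses Python's re module as a stand-in for the transport rule engine, treats a simple expression as plain substring containment (an assumption), and uses made-up sample values and hypothetical helper names.

```python
import re

# Hypothetical simple expressions: every value to detect must be listed explicitly.
SIMPLE_EXPRESSIONS = ["Yearly Sales Forecast.doc", "000-12-3456"]

# One pattern that covers every value shaped like ddd-dd-dddd.
SSN_PATTERN = re.compile(r"\d\d\d-\d\d-\d\d\d\d")


def matches_simple(text):
    """Return True if any listed simple expression appears in the text."""
    return any(expression in text for expression in SIMPLE_EXPRESSIONS)


def matches_pattern(text):
    """Return True if the text contains anything shaped like ddd-dd-dddd."""
    return SSN_PATTERN.search(text) is not None


if __name__ == "__main__":
    for text in ["ID 000-12-3456", "ID 000-98-7654"]:
        print(text, matches_simple(text), matches_pattern(text))
```

The second sample value isn't in the list of simple expressions, so only the pattern detects it.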
Regular Expressions in Exchange 2010
In the Exchange Management Shell, you can use regular expressions in any predicate that accepts the Patterns predicate property. In the Exchange Management Console, you can use regular expressions with any condition or exception that contains the words "with text patterns". For more information about predicates, see Transport Rule Predicates.
Warning
You must carefully test the regular expressions that you construct to make sure that they yield the expected results. An incorrectly configured regular expression could yield unexpected matches and cause unwanted transport rule behavior. This may result in undesirable actions being taken on messages and message content, potentially resulting in data loss when actions such as rejecting or bouncing a message are used. Test your regular expressions in a test environment before you implement them in production.
The following table lists the pattern strings that you can use to create a pattern-matching regular expression in Exchange 2010.
Pattern strings
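Pattern string | Description |
---|---|
`\S` | The `\S` pattern string matches any single character that's not a space. |
`\s` | The `\s` pattern string matches any single white-space character. |
`\D` | The `\D` pattern string matches any single character that isn't a numeric digit. |
`\d` | The `\d` pattern string matches any single numeric digit. |
`\w` | The `\w` pattern string matches any single Unicode character categorized as a letter or decimal digit. |
`\W` | The `\W` pattern string matches any single Unicode character not categorized as a letter or a decimal digit. |
`\|` | The pipe ( `\|` ) character performs an OR function. |
`*` | The asterisk ( `*` ) character matches zero or more instances of the previous character. For example, `ab*c` matches the following strings: ac, abc, abbbbc. |
`( )` | Parentheses act as grouping delimiters. For example, `a(bc)*` matches the following strings: a, abc, abcbc, abcbcbc, and so on. |
`\` | A backslash is used as an escaping character before a special character. Special characters are characters used in pattern strings: backslash ( `\` ), pipe ( `\|` ), asterisk ( `*` ), opening parenthesis ( `(` ), closing parenthesis ( `)` ), caret ( `^` ), and dollar sign ( `$` ). For example, if you want to match a string that contains (525), you would type `\(525\)`. |
`^` | The caret ( `^` ) character indicates that the pattern string that follows the caret must exist at the start of the text string being matched. For example, `^fred@contoso` matches fred@contoso.com and fred@contoso.co.uk but not alfred@contoso.com. This character can also be used with the dollar sign ( `$` ) character to specify an exact string to match. For example, `^kim@contoso.com$` matches only kim@contoso.com and doesn't match anything else, such as kim@contoso.com.au or sarakim@contoso.com. |
`$` | The dollar sign ( `$` ) character indicates that the preceding pattern string must exist at the end of the text string being matched. For example, `contoso.com$` matches adam@contoso.com and kim@research.contoso.com, but doesn't match kim@contoso.com.au. This character can also be used with the caret ( `^` ) character to specify an exact string to match. For example, `^kim@contoso.com$` matches only kim@contoso.com and doesn't match anything else, such as kim@contoso.com.au or sarakim@contoso.com. |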
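One way to try out pattern strings before using them in a rule is to approximate them in a general-purpose regular expression engine. The following Python sketch is not part of Exchange and rests on an assumption: that only the backslash, pipe, asterisk, parentheses, caret, and dollar sign are special in transport rule patterns, so characters such as the period are literal. The helper names to_python_regex and matches are hypothetical. Python's \w also matches the underscore, so results for \w and \W may differ slightly.

```python
import re

# Characters that Python's re module treats as special but that are
# ASSUMED to be literal in transport rule pattern strings.
PYTHON_ONLY_SPECIALS = set(".+?[]{}")


def to_python_regex(pattern):
    """Translate a transport-rule-style pattern into a Python regex string."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            if i + 1 < len(pattern):
                # Keep escape sequences such as \d, \s, or \( unchanged.
                out.append(pattern[i:i + 2])
                i += 2
            else:
                # A trailing lone backslash is treated as a literal backslash.
                out.append("\\\\")
                i += 1
            continue
        # Escape characters that only Python would interpret specially.
        out.append("\\" + ch if ch in PYTHON_ONLY_SPECIALS else ch)
        i += 1
    return "".join(out)


def matches(pattern, text):
    """Return True if the translated pattern is found anywhere in the text."""
    return re.search(to_python_regex(pattern), text) is not None


if __name__ == "__main__":
    print(matches(r"ab*c", "abbbbc"))
    print(matches(r"^fred@contoso", "alfred@contoso.com"))
    print(matches(r"contoso.com$", "kim@contoso.com.au"))
```

Results from a stand-in engine are only a first check. The Warning earlier in this topic still applies: test the rule itself in a test environment.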
Constructing Regular Expressions
By using the preceding table, you can construct a regular expression that matches the pattern of the data that you want to match. Working from left to right, examine each character or group of characters in the data that you want to match. Read the description of each pattern string to determine how it's applied to the data that you're matching. Then, determine which pattern string in the table represents that character or group of characters, and add that pattern string to the regular expression. When finished, you have a fully constructed regular expression.
This example of a regular expression matches North American telephone numbers in the 425 area code that use formats such as 425 555-0100 and 425.555.0100.
425(\s|.)\d\d\d(-|.)\d\d\d\d
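The behavior of this pattern can be checked with a short Python sketch. It's an approximation, not Exchange itself: the period is written as \. because Python treats an unescaped period as a wildcard, whereas the pattern above is assumed to use it as a literal character. The names PHONE_425 and contains_425_number are hypothetical.

```python
import re

# Python translation of 425(\s|.)\d\d\d(-|.)\d\d\d\d with literal periods.
PHONE_425 = re.compile(r"425(\s|\.)\d\d\d(-|\.)\d\d\d\d")


def contains_425_number(text):
    """Return True if the text contains a number in one of the two formats."""
    return PHONE_425.search(text) is not None


if __name__ == "__main__":
    for sample in ["425 555-0100", "425.555.0100", "(425) 555-0100", "206 555-0100"]:
        print(sample, contains_425_number(sample))
```

The last two samples aren't matched: the pattern has no alternative for a closing parenthesis after the area code, and it accepts only the literal area code 425.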
You can expand on this example by adding the telephone format (425) 555-0100, which uses parentheses around the area code. This example of a regular expression matches all three telephone number formats.
\d\d\d(\s|.|-|\)|\)\s)\d\d\d(\s|.|-)\d\d\d\d
You can analyze the previous example as follows:
- \d\d\d This portion requires that exactly three numeric digits appear first.
- (\s|.|-|\)|\)\s) This portion requires that one of the following appears after the three-digit area code: a space, a period, a hyphen, a closing parenthesis, or a closing parenthesis followed by a space. The alternatives are contained in the grouping delimiters and are separated by the pipe character, so exactly one of them must exist at this location in the string being matched. Each closing parenthesis is escaped with a backslash so that it's matched literally instead of being treated as a grouping delimiter.
- \d\d\d This portion requires that exactly three numeric digits appear next.
- (\s|.|-) This portion requires that a space, a period, or a hyphen exists after the three-digit number.
- \d\d\d\d This portion requires that exactly four numeric digits appear next.
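The preceding regular expression matches the following sample values. The pattern doesn't include the opening parenthesis, so for values such as (425) 555-0100 the match begins at the first digit of the area code.
- (425)555.0100
- 425 555 0100
- 425.555-0100
- (425) 555-0100
- 425-555-0100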
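Sample values like these can be checked with a short Python sketch. It's an approximation under stated assumptions: it uses a Python translation of the pattern with balanced grouping parentheses, writes the period as \. so that it's matched literally, and searches anywhere in the text. The names PHONE_PATTERN and contains_phone_number are hypothetical.

```python
import re

# Python translation of \d\d\d(\s|.|-|\)|\)\s)\d\d\d(\s|.|-)\d\d\d\d
# with the period escaped so that it matches only a literal period.
PHONE_PATTERN = re.compile(r"\d\d\d(\s|\.|-|\)|\)\s)\d\d\d(\s|\.|-)\d\d\d\d")


def contains_phone_number(text):
    """Return True if any supported telephone format appears in the text."""
    return PHONE_PATTERN.search(text) is not None


if __name__ == "__main__":
    samples = [
        "(425)555.0100",
        "425 555 0100",
        "425.555-0100",
        "(425) 555-0100",
        "425-555-0100",
        "4255550100",  # no separators, so not expected to match
    ]
    for sample in samples:
        print(sample, contains_phone_number(sample))
```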
Creating a Transport Rule That Uses a Regular Expression
This example creates a transport rule in the Shell that uses a regular expression to match SSNs in the subject or body of an e-mail message and rejects any message that matches.
New-TransportRule -Name "Social Security Number Block Rule" -SubjectOrBodyMatchesPatterns '\d\d\d-\d\d-\d\d\d\d' -RejectMessageEnhancedStatusCode "5.7.1" -RejectMessageReasonText "This message has been rejected because of content restrictions"
This example lets you view the new transport rule.
Get-TransportRule "Social Security Number Block Rule" | Format-List
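Before you enable a rule like this one, it can help to run the pattern against sample text. The following Python sketch only approximates a subject-or-body pattern check and isn't how Exchange evaluates the rule. The function name rule_would_match and the sample messages are hypothetical.

```python
import re

# Same pattern string as in the transport rule above.
SSN_PATTERN = re.compile(r"\d\d\d-\d\d-\d\d\d\d")


def rule_would_match(subject, body):
    """Return True if the pattern is found in the subject or the body."""
    return any(SSN_PATTERN.search(part) is not None for part in (subject, body))


if __name__ == "__main__":
    samples = [
        ("Payroll update", "SSN: 000-12-3456"),
        ("Lunch on Friday?", "See you at noon."),
        ("Order 1234-56-78901 shipped", ""),  # longer number that contains the pattern
    ]
    for subject, body in samples:
        print(repr(subject), rule_would_match(subject, body))
```

In this sketch the third sample also matches, because the digits 234-56-7890 inside the longer order number fit the pattern. This is the kind of unexpected match that testing is meant to reveal.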