UNIX-Style Regular Expressions

Note

Indexing Service is no longer supported as of Windows XP and is unavailable for use as of Windows 8. Instead, use Windows Search for client-side search and Microsoft Search Server Express for server-side search.

 

The {regex} tag specifies a match using UNIX-style regular expressions. The tag has the following syntax.

{regex} regular expression {/regex}

Any character except an asterisk (*), period (.), question mark (?), or vertical bar (|) matches itself. A regular expression can be enclosed in matching quotes ("…"), and must be enclosed in quotes if it contains a space or a closing parenthesis, that is, the ")" character.
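
The quoting rule above can be sketched as a small Python helper. The function name regex_tag is hypothetical and is not part of any Indexing Service API; it only builds the query text. Because quoting is always permitted, the sketch quotes whenever it sees a space or a closing parenthesis, and it does not try to handle expressions that contain embedded quotation marks.

```python
def regex_tag(expression: str) -> str:
    """Wrap an expression in {regex} ... {/regex}, quoting it when required.

    Hypothetical helper for illustration only.
    """
    # Per the rule above, a space or ")" means the expression must be quoted.
    needs_quotes = " " in expression or ")" in expression
    already_quoted = (
        len(expression) >= 2
        and expression[0] == '"'
        and expression[-1] == '"'
    )
    if needs_quotes and not already_quoted:
        expression = f'"{expression}"'
    return "{regex} " + expression + " {/regex}"


print(regex_tag("*.txt"))     # no space or ")", left unquoted
print(regex_tag("my file*"))  # contains a space, so it is quoted
```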

The asterisk, period, and question mark behave as they do in Windows. The asterisk matches any number of characters. The period matches end of string. The question mark matches any one character. The vertical bar (|) is an escape character, which indicates special behavior for the open bracket character ([). The following table explains the meanings of special characters in regular expressions.

Character Meaning
( An opening parenthesis opens a group. It must be followed by a matching closing parenthesis.
) A closing parenthesis closes a group. It must be preceded by a matching opening parenthesis.
| [ An opening square bracket preceded (escaped) by a vertical pipe character opens a character class. It must be followed by a matching (unescaped) closing square bracket.
{ An opening brace opens a counted match. It must be followed by a matching closing brace.
} A closing brace closes a counted match. It must be preceded by a matching opening brace.
, A comma separates OR clauses.
* An asterisk matches zero or more occurrences of the preceding expression.
? A question mark matches zero or one occurrence of the preceding expression.
+ A plus sign matches one or more occurrences of the preceding expression.
Other All other characters match themselves.

 

The following table describes characters which, when located between square brackets ([ ]), have special meanings.

Character Meaning
^ A caret matches everything except the characters and ranges that follow it. (It must be the first character in the string.)
] A closing square bracket matches a literal closing square bracket. To be treated this way, it may be preceded only by a caret (^); otherwise it closes the class.
- A hyphen is a range operator. It is preceded and followed by normal characters.
Other All other characters match themselves (or begin or end a range).
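
Python's re module follows the same conventions inside square brackets, so the rules in this table can be tried out there. This is only an analogy: the sketch exercises Python's engine, not Indexing Service, and in Python a class opens with a plain "[" rather than an escaped one.

```python
import re

# A hyphen between two ordinary characters denotes a range.
assert re.fullmatch(r"[a-c]", "b")
assert not re.fullmatch(r"[a-c]", "d")

# A leading caret negates the class.
assert re.fullmatch(r"[^a-c]", "d")
assert not re.fullmatch(r"[^a-c]", "a")

# A closing bracket placed first, or right after the caret, is a literal "]".
assert re.fullmatch(r"[]x]", "]")
assert re.fullmatch(r"[^]x]", "y")
assert not re.fullmatch(r"[^]x]", "]")
```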

 

The following table describes the syntax used between braces ({ }).

Character Meaning
{m} Matches exactly m occurrences of the preceding expression (0 < m < 256).
{m,} Matches at least m occurrences of the preceding expression (1 < m < 256).
{m, n} Matches between m and n occurrences of the preceding expression, inclusive (0 < m < 256, 0 < n < 256).
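
Python's re module uses the same three brace forms, so the counting semantics can be illustrated with a short sketch. It runs against Python's engine rather than Indexing Service, and the bounds on m and n listed above are specific to Indexing Service.

```python
import re

# {m}: exactly m occurrences of the preceding expression.
assert re.fullmatch(r"a{3}", "aaa")
assert not re.fullmatch(r"a{3}", "aa")

# {m,}: at least m occurrences.
assert re.fullmatch(r"a{2,}", "aaaaa")
assert not re.fullmatch(r"a{2,}", "a")

# {m,n}: between m and n occurrences, inclusive.
assert re.fullmatch(r"a{2,4}", "aaa")
assert not re.fullmatch(r"a{2,4}", "aaaaa")
```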

 

To match the asterisk and question mark, enclose them within brackets. For example, [*]sample matches "*sample". The following table illustrates some additional examples of pattern-matching queries.

To search for: Documents with extensions that match several patterns.
Example: {prop name=filename} {regex} *.|(do?|,xl?|,p?t|,mdb|) {/regex}
Or: #filename *.|(do?|,xl?|,p?t|,mdb|)
Results: Microsoft Office documents, including files with extensions "doc", "dot", "xla", "xls", "xlt", "pot", "ppt", and "mdb".

To search for: Paths with long names.
Example: {prop name=path} {regex} "*\|[^\]|{14,|}\*" {/regex}
Or: #path "*\|[^\]|{14,|}\*"
Results: Paths with a directory component containing 14 or more characters.
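
To make the two example queries easier to read, the following sketch hand-translates them into Python regular expressions. The translation is an interpretation, not output from Indexing Service. It assumes that an unescaped asterisk means "any number of characters", an unescaped question mark means "any one character", the period in the first pattern is a literal period before the extension, and the sequences |(, |, and |) open a group, separate alternatives, and close the group. The names OFFICE_EXT and LONG_COMPONENT are hypothetical, and matching here is case-sensitive.

```python
import re

# Hand translation of:  *.|(do?|,xl?|,p?t|,mdb|)
OFFICE_EXT = re.compile(r".*\.(do.|xl.|p.t|mdb)")

# Hand translation of:  *\|[^\]|{14,|}\*
# A backslash, then 14 or more non-backslash characters, then a backslash.
LONG_COMPONENT = re.compile(r".*\\[^\\]{14,}\\.*")

for name in ["report.doc", "budget.xls", "slides.ppt", "data.mdb", "notes.txt"]:
    print(name, bool(OFFICE_EXT.fullmatch(name)))

for path in [r"C:\projects\quarterly_reports_2001\q1.doc", r"C:\docs\q1.doc"]:
    print(path, bool(LONG_COMPONENT.fullmatch(path)))
```

Under these assumptions the first pattern accepts the four Office file names and rejects notes.txt, and the second accepts only the path whose directory component is at least 14 characters long.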