Quantifiers
Quantifiers specify how many instances of a character, group, or character class must be present in the input for a match to be found. The following table lists the quantifiers supported by the .NET Framework.
Greedy quantifier |
Lazy quantifier |
Description |
---|---|---|
* |
*? |
Match zero or more times. |
+ |
+? |
Match one or more times. |
? |
?? |
Match zero or one time. |
{n} |
{n}? |
Match exactly n times. |
{n,} |
{n,}? |
Match at least n times. |
{n,m} |
{n,m}? |
Match from n to m times. |
The quantities n and m are integer constants. Ordinarily, quantifiers are greedy; they cause the regular expression engine to match as many occurrences of particular patterns as possible. Appending the ? character to a quantifier makes it lazy; it causes the regular expression engine to match as few occurrences as possible. For a complete description of the difference between greedy and lazy quantifiers, see the section Greedy and Lazy Quantifiers later in this topic.
Important Note: |
---|
Nesting quantifiers (for example, as the regular expression pattern (a*)* does) can increase the number of comparisons that the regular expression engine must perform, as an exponential function of the number of characters in the input string. For more information about this behavior and its workarounds, see Backtracking. |
Regular Expression Quantifiers
The following sections list the quantifiers supported by .NET Framework regular expressions.
Note
If the *, +, ?, {, and } characters are encountered in a regular expression pattern, the regular expression engine interprets them as quantifiers or part of quantifier constructs unless they are included in a character class. To interpret these as literal characters outside a character class, you must escape them by preceding them with a backslash. For example, the string \* in a regular expression pattern is interpreted as a literal asterisk ("*") character.
Match Zero or More Times: *
The * quantifier matches the preceding element zero or more times. It is equivalent to the {0,} quantifier. * is a greedy quantifier whose lazy equivalent is *?.
The following example illustrates this regular expression. Of the nine digits in the input string, five match the pattern and four (95, 929, 9129, and 9919) do not.
Dim pattern As String = "\b91*9*\b"
Dim input As String = "99 95 919 929 9119 9219 999 9919 91119"
For Each match As Match In Regex.Matches(input, pattern)
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next
' The example displays the following output:
' '99' found at position 0.
' '919' found at position 6.
' '9119' found at position 14.
' '999' found at position 24.
' '91119' found at position 33.
string pattern = @"\b91*9*\b";
string input = "99 95 919 929 9119 9219 999 9919 91119";
foreach (Match match in Regex.Matches(input, pattern))
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
// The example displays the following output:
// '99' found at position 0.
// '919' found at position 6.
// '9119' found at position 14.
// '999' found at position 24.
// '91119' found at position 33.
The regular expression pattern is defined as shown in the following table.
Pattern |
Description |
---|---|
\b |
Start at a word boundary. |
91* |
Match a "9" followed by zero or more "1" characters. |
9* |
Match zero or more "9" characters. |
\b |
End at a word boundary. |
Match One or More Times: +
The + quantifier matches the preceding element one or more times. It is equivalent to {1,}. + is a greedy quantifier whose lazy equivalent is +?.
For example, the regular expression \ban+\w*?\b tries to match entire words that begin with the letter a followed by one or more instances of the letter n. The following example illustrates this regular expression. The regular expression matches the words an, annual, announcement, and antique, and correctly fails to match autumn and all.
Dim pattern As String = "\ban+\w*?\b"
Dim input As String = "Autumn is a great time for an annual announcement to all antique collectors."
For Each match As Match In Regex.Matches(input, pattern, RegexOptions.IgnoreCase)
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next
' The example displays the following output:
' 'an' found at position 27.
' 'annual' found at position 30.
' 'announcement' found at position 37.
' 'antique' found at position 57.
string pattern = @"\ban+\w*?\b";
string input = "Autumn is a great time for an annual announcement to all antique collectors.";
foreach (Match match in Regex.Matches(input, pattern, RegexOptions.IgnoreCase))
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
// The example displays the following output:
// 'an' found at position 27.
// 'annual' found at position 30.
// 'announcement' found at position 37.
// 'antique' found at position 57.
The regular expression pattern is defined as shown in the following table.
Pattern |
Description |
---|---|
\b |
Start at a word boundary. |
an+ |
Match an "a" followed by one or more "n" characters. |
\w*? |
Match a word character zero or more times, but as few times as possible. |
\b |
End at a word boundary. |
Match Zero or One Time: ?
The ? quantifier matches the preceding element zero or one time. It is equivalent to {0,1}. ? is a greedy quantifier whose lazy equivalent is ??.
For example, the regular expression \ban?\b tries to match entire words that begin with the letter a followed by zero or one instances of the letter n. In other words, it tries to match the words a and an. The following example illustrates this regular expression.
Dim pattern As String = "\ban?\b"
Dim input As String = "An amiable animal with a large snount and an animated nose."
For Each match As Match In Regex.Matches(input, pattern, RegexOptions.IgnoreCase)
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next
' The example displays the following output:
' 'An' found at position 0.
' 'a' found at position 23.
' 'an' found at position 42.
string pattern = @"\ban?\b";
string input = "An amiable animal with a large snount and an animated nose.";
foreach (Match match in Regex.Matches(input, pattern, RegexOptions.IgnoreCase))
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
// The example displays the following output:
// 'An' found at position 0.
// 'a' found at position 23.
// 'an' found at position 42.
The regular expression pattern is defined as shown in the following table.
Pattern |
Description |
---|---|
\b |
Start at a word boundary. |
an? |
Match an "a" followed by zero or one "n" character. |
\b |
End at a word boundary. |
Match Exactly n Times: {n}
The {n} quantifier matches the preceding element exactly n times, where n is any integer. {n} is a greedy quantifier whose lazy equivalent is {n}?.
For example, the regular expression \b\d+\,\d{3}\b tries to match a word boundary followed by one or more decimal digits followed by three decimal digits followed by a word boundary. The following example illustrates this regular expression.
Dim pattern As String = "\b\d+\,\d{3}\b"
Dim input As String = "Sales totaled 103,524 million in January, " + _
"106,971 million in February, but only " + _
"943 million in March."
For Each match As Match In Regex.Matches(input, pattern)
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next
' The example displays the following output:
' '103,524' found at position 14.
' '106,971' found at position 45.
string pattern = @"\b\d+\,\d{3}\b";
string input = "Sales totaled 103,524 million in January, " +
"106,971 million in February, but only " +
"943 million in March.";
foreach (Match match in Regex.Matches(input, pattern))
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
// The example displays the following output:
// '103,524' found at position 14.
// '106,971' found at position 45.
The regular expression pattern is defined as shown in the following table.
Pattern |
Description |
---|---|
\b |
Start at a word boundary. |
\d+ |
Match one or more decimal digits. |
\, |
Match a comma character. |
\d{3} |
Match three decimal digits. |
\b |
End at a word boundary. |
Match at Least n Times: {n,}
The {n,} quantifier matches the preceding element at least n times, where n is any integer. {n,} is a greedy quantifier whose lazy equivalent is {n}?.
For example, the regular expression \b\d{2,}\b\D+ tries to match a word boundary followed by at least two digits followed by a word boundary and a non-digit character. The following example illustrates this regular expression. The regular expression fails to match the phrase "7 days" because it contains just one decimal digit, but it successfully matches the phrases "10 weeks and 300 years".
Dim pattern As String = "\b\d{2,}\b\D+"
Dim input As String = "7 days, 10 weeks, 300 years"
For Each match As Match In Regex.Matches(input, pattern)
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next
' The example displays the following output:
' '10 weeks, ' found at position 8.
' '300 years' found at position 18.
string pattern = @"\b\d{2,}\b\D+";
string input = "7 days, 10 weeks, 300 years";
foreach (Match match in Regex.Matches(input, pattern))
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
// The example displays the following output:
// '10 weeks, ' found at position 8.
// '300 years' found at position 18.
The regular expression pattern is defined as shown in the following table.
Pattern |
Description |
---|---|
\b |
Start at a word boundary. |
\d{2,} |
Match at least two decimal digits. |
\b |
Match a word boundary. |
\D+ |
Match at least one non-decimal digit. |
Match Between n and m Times: {n,m}
The {n,m} quantifier matches the preceding element at least n times, but no more than m times, where n and m are integers. {n,m} is a greedy quantifier whose lazy equivalent is {n,m}?.
In the following example, the regular expression (00\s){2,4} tries to match between two and four occurrences of two zero digits followed by a space. Note that the final portion of the input string includes this pattern five times rather than the maximum of four. However, only the initial portion of this substring (up to the space and the fifth pair of zeros) matches the regular expression pattern.
Dim pattern As String = "(00\s){2,4}"
Dim input As String = "0x00 FF 00 00 18 17 FF 00 00 00 21 00 00 00 00 00"
For Each match As Match In Regex.Matches(input, pattern)
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next
' The example displays the following output:
' '00 00 ' found at position 8.
' '00 00 00 ' found at position 23.
' '00 00 00 00 ' found at position 35.
string pattern = @"(00\s){2,4}";
string input = "0x00 FF 00 00 18 17 FF 00 00 00 21 00 00 00 00 00";
foreach (Match match in Regex.Matches(input, pattern))
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
// The example displays the following output:
// '00 00 ' found at position 8.
// '00 00 00 ' found at position 23.
// '00 00 00 00 ' found at position 35.
Match Zero or More Times (Lazy Match): *?
The *? quantifier matches the preceding element zero or more times, but as few times as possible. It is the lazy counterpart of the greedy quantifier *.
In the following example, the regular expression \b\w*?oo\w*?\b matches all words that contain the string oo.
Dim pattern As String = "\b\w*?oo\w*?\b"
Dim input As String = "woof root root rob oof woo woe"
For Each match As Match In Regex.Matches(input, pattern, RegexOptions.IgnoreCase)
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next
' The example displays the following output:
' 'woof' found at position 0.
' 'root' found at position 5.
' 'root' found at position 10.
' 'oof' found at position 19.
' 'woo' found at position 23.
string pattern = @"\b\w*?oo\w*?\b";
string input = "woof root root rob oof woo woe";
foreach (Match match in Regex.Matches(input, pattern, RegexOptions.IgnoreCase))
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
// The example displays the following output:
// 'woof' found at position 0.
// 'root' found at position 5.
// 'root' found at position 10.
// 'oof' found at position 19.
// 'woo' found at position 23.
The regular expression pattern is defined as shown in the following table.
Pattern |
Description |
---|---|
\b |
Start at a word boundary. |
\w*? |
Match zero or more word characters, but as few characters as possible. |
oo |
Match the string "oo". |
\w*? |
Match zero or more word characters, but as few characters as possible. |
\b |
End on a word boundary. |
Match One or More Times (Lazy Match): +?
The +? quantifier matches the preceding element one or more times, but as few times as possible. It is the lazy counterpart of the greedy quantifier +.
For example, the regular expression \b\w+?\b matches one or more characters separated by word boundaries. The following example illustrates this regular expression.
Dim pattern As String = "\b\w+?\b"
Dim input As String = "Aa Bb Cc Dd Ee Ff"
For Each match As Match In Regex.Matches(input, pattern)
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next
' The example displays the following output:
' 'Aa' found at position 0.
' 'Bb' found at position 3.
' 'Cc' found at position 6.
' 'Dd' found at position 9.
' 'Ee' found at position 12.
' 'Ff' found at position 15.
string pattern = @"\b\w+?\b";
string input = "Aa Bb Cc Dd Ee Ff";
foreach (Match match in Regex.Matches(input, pattern))
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
// The example displays the following output:
// 'Aa' found at position 0.
// 'Bb' found at position 3.
// 'Cc' found at position 6.
// 'Dd' found at position 9.
// 'Ee' found at position 12.
// 'Ff' found at position 15.
Match Zero or One Time (Lazy Match): ??
The ?? quantifier matches the preceding element zero or one time, but as few times as possible. It is the lazy counterpart of the greedy quantifier ?.
For example, the regular expression ^\s*(System.)??Console.Write(Line)??\(?? attempts to match the strings "Console.Write" or "Console.WriteLine". The string can also include "System." before "Console", and it can be followed by an opening parenthesis. The string must be at the beginning of a line, although it can be preceded by white space. The following example illustrates this regular expression.
Dim pattern As String = "^\s*(System.)??Console.Write(Line)??\(??"
Dim input As String = "System.Console.WriteLine(""Hello!"")" + vbCrLf + _
"Console.Write(""Hello!"")" + vbCrLf + _
"Console.WriteLine(""Hello!"")" + vbCrLf + _
"Console.ReadLine()" + vbCrLf + _
" Console.WriteLine"
For Each match As Match In Regex.Matches(input, pattern, _
RegexOptions.IgnorePatternWhitespace Or RegexOptions.IgnoreCase Or RegexOptions.MultiLine)
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next
' The example displays the following output:
' 'System.Console.Write' found at position 0.
' 'Console.Write' found at position 36.
' 'Console.Write' found at position 61.
' ' Console.Write' found at position 110.
string pattern = @"^\s*(System.)??Console.Write(Line)??\(??";
string input = "System.Console.WriteLine(\"Hello!\")\n" +
"Console.Write(\"Hello!\")\n" +
"Console.WriteLine(\"Hello!\")\n" +
"Console.ReadLine()\n" +
" Console.WriteLine";
foreach (Match match in Regex.Matches(input, pattern,
RegexOptions.IgnorePatternWhitespace |
RegexOptions.IgnoreCase |
RegexOptions.Multiline))
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
// The example displays the following output:
// 'System.Console.Write' found at position 0.
// 'Console.Write' found at position 36.
// 'Console.Write' found at position 61.
// ' Console.Write' found at position 110.
The regular expression pattern is defined as shown in the following table.
Pattern |
Description |
---|---|
^ |
Match the start of the input stream. |
\s* |
Match zero or more white-space characters. |
(System.)?? |
Match zero or one occurrence of the string "System.". |
Console.Write |
Match the string "Console.Write". |
(Line)?? |
Match zero or one occurrence of the string "Line". |
\(?? |
Match zero or one occurrence of the opening parenthesis. |
Match Exactly n Times (Lazy Match): {n}?
The {n}? quantifier matches the preceding element exactly n times, where n is any integer. It is the lazy counterpart of the greedy quantifier {n}+.
In the following example, the regular expression \b(\w{3,}?\.){2}?\w{3,}?\b is used to identify a Web site address. Note that it matches "www.microsoft.com" and "msdn.microsoft.com", but does not match "mywebsite" or "mycompany.com".
Dim pattern As String = "\b(\w{3,}?\.){2}?\w{3,}?\b"
Dim input As String = "www.microsoft.com msdn.microsoft.com mywebsite mycompany.com"
For Each match As Match In Regex.Matches(input, pattern)
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next
' The example displays the following output:
' 'www.microsoft.com' found at position 0.
' 'msdn.microsoft.com' found at position 18.
string pattern = @"\b(\w{3,}?\.){2}?\w{3,}?\b";
string input = "www.microsoft.com msdn.microsoft.com mywebsite mycompany.com";
foreach (Match match in Regex.Matches(input, pattern))
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
// The example displays the following output:
// 'www.microsoft.com' found at position 0.
// 'msdn.microsoft.com' found at position 18.
The regular expression pattern is defined as shown in the following table.
Pattern |
Description |
---|---|
\b |
Start at a word boundary. |
(\w{3,}?\.) |
Match at least 3 word characters, but as few characters as possible, followed by a dot or period character. This is the first capturing group. |
(\w{3,}?\.){2}? |
Match the pattern in the first group two times, but as few times as possible. |
\b |
End the match on a word boundary. |
Match at Least n Times (Lazy Match): {n,}?
The {n,}? quantifier matches the preceding element at least n times, where n is any integer, but as few times as possible. It is the lazy counterpart of the greedy quantifier {n,}.
See the example for the {n}? quantifier in the previous section for an illustration. The regular expression in that example uses the {n,} quantifier to match a string that has at least three characters followed by a period.
Match Between n and m Times (Lazy Match): {n,m}?
The {n,m}? quantifier matches the preceding element between n and m times, where n and m are integers, but as few times as possible. It is the lazy counterpart of the greedy quantifier {n,m}.
In the following example, the regular expression \b[A-Z](\w*\s){1,10}?[.!?] matches sentences that contain between one and ten words. It matches all the sentences in the input string except for one sentence that contains 18 words.
Dim pattern As String = "\b[A-Z](\w*\s?){1,10}?[.!?]"
Dim input As String = "Hi. I am writing a short note. Its purpose is " + _
"to test a regular expression that attempts to find " + _
"sentences with ten or fewer words. Most sentences " + _
"in this note are short."
For Each match As Match In Regex.Matches(input, pattern)
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index)
Next
' The example displays the following output:
' 'Hi.' found at position 0.
' 'I am writing a short note.' found at position 4.
' 'Most sentences in this note are short.' found at position 132.
string pattern = @"\b[A-Z](\w*?\s*?){1,10}[.!?]";
string input = "Hi. I am writing a short note. Its purpose is " +
"to test a regular expression that attempts to find " +
"sentences with ten or fewer words. Most sentences " +
"in this note are short.";
foreach (Match match in Regex.Matches(input, pattern))
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
// The example displays the following output:
// 'Hi.' found at position 0.
// 'I am writing a short note.' found at position 4.
// 'Most sentences in this note are short.' found at position 132.
The regular expression pattern is defined as shown in the following table.
Pattern |
Description |
---|---|
\b |
Start at a word boundary. |
[A-Z] |
Match an uppercase character from A to Z. |
(\w*\s) |
Match zero or more word characters, followed by one or more white-space characters. This is the first capture group. |
{1,10}? |
Match the previous pattern between 1 and 10 times, but as few times as possible. |
[.!?] |
Match any one of the punctuation characters ".", "!", or "?". |
Greedy and Lazy Quantifiers
A number of the quantifiers have two versions:
A greedy version.
A greedy quantifier tries to match an element as many times as possible.
A non-greedy (or lazy) version.
A non-greedy quantifier tries to match an element as few times as possible. You can turn a greedy quantifier into a lazy quantifier by simply adding a ?.
Consider a simple regular expression that is intended to extract the last four digits from a string of numbers such as a credit card number. The version of the regular expression that uses the * greedy quantifier is \b.*([0-9]{4})\b. However, if a string contains two numbers, this regular expression matches the last four digits of the second number only, as the following example shows.
Dim greedyPattern As String = "\b.*([0-9]{4})\b"
Dim input1 As String = "1112223333 3992991999"
For Each match As Match In Regex.Matches(input1, greedypattern)
Console.WriteLine("Account ending in ******{0}.", match.Groups(1).Value)
Next
' The example displays the following output:
' Account ending in ******1999.
string greedyPattern = @"\b.*([0-9]{4})\b";
string input1 = "1112223333 3992991999";
foreach (Match match in Regex.Matches(input1, greedyPattern))
Console.WriteLine("Account ending in ******{0}.", match.Groups[1].Value);
// The example displays the following output:
// Account ending in ******1999.
The regular expression fails to match the first number because the * quantifier tries to match the previous element as many times as possible in the entire string, and so it finds its match at the end of the string.
This is not the desired behavior. Instead, you can use the *?lazy quantifier to extract digits from both numbers, as the following example shows.
Dim lazyPattern As String = "\b.*?([0-9]{4})\b"
Dim input2 As String = "1112223333 3992991999"
For Each match As Match In Regex.Matches(input2, lazypattern)
Console.WriteLine("Account ending in ******{0}.", match.Groups(1).Value)
Next
' The example displays the following output:
' Account ending in ******3333.
' Account ending in ******1999.
string lazyPattern = @"\b.*?([0-9]{4})\b";
string input2 = "1112223333 3992991999";
foreach (Match match in Regex.Matches(input2, lazyPattern))
Console.WriteLine("Account ending in ******{0}.", match.Groups[1].Value);
// The example displays the following output:
// Account ending in ******3333.
// Account ending in ******1999.
In most cases, regular expressions with greedy and lazy quantifiers return the same matches. They most commonly return different results when they are used with the wildcard (.) metacharacter, which matches any character.
See Also
Concepts
Regular Expression Language - Quick Reference
Change History
Date |
History |
Reason |
---|---|---|
February 2010 |
Added note on backtracking. |
Information enhancement. |
August 2009 |
Revised extensively. |
Information enhancement. |