Next Match After an Empty Match

When you search a string for successive matches, either by calling NextMatch repeatedly or by enumerating the collection returned by Regex.Matches, the regular expression engine gives empty matches special treatment.

Usually, NextMatch begins the next match exactly where the previous match left off. However, after an empty match, NextMatch advances by one extra character before trying the next match. This rule guarantees that the matching engine progresses through the string. (If it did not advance an extra character, the next match would start in exactly the same place as the previous match, and it would match the same empty string repeatedly.)

For example, a search for "a*" in the string "abaabb" returns the following sequence of matches.

"a", "", "aa", "", "", ""

Here is the same sequence shown in context, with each match enclosed in parentheses:

(a)()b(aa)()b()b()
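The sequence can be reproduced with a short console program. The following is a minimal sketch rather than an official sample: the class name EmptyMatchExample is chosen only for illustration, and each value is printed in quotation marks so that the empty matches are visible.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EmptyMatchExample
{
    public static void Main()
    {
        // Find the first match, then keep calling NextMatch until it fails.
        Match match = Regex.Match("abaabb", "a*");
        while (match.Success)
        {
            // Quote each value so that empty matches show up as "".
            Console.WriteLine("\"{0}\"", match.Value);
            match = match.NextMatch();
        }
    }
}
```

The loop ends when NextMatch returns a Match object whose Success property is false, which happens after the empty match at the end of the string.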

The first match consumes the first a. The second match starts exactly where the first match ended, before the first b; it finds zero occurrences of a and returns the empty string.

The third match does not begin exactly where the second match ended, because the second match returned the empty string. Instead, it begins one character later, after the first b. The third match finds two occurrences of a and returns "aa".

The fourth match begins where the third match ended, before the second b, and finds the empty string. Then the fifth match begins before the last b and finds the empty string again. The sixth match begins after the last b and finds the empty string yet again.
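To check this walkthrough, it can help to print where each match starts and how long it is. The sketch below uses Regex.Matches with the Index and Length properties of each Match; the class name EmptyMatchPositions is an illustrative assumption.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EmptyMatchPositions
{
    public static void Main()
    {
        // Enumerate every match of "a*" in the sample string.
        foreach (Match match in Regex.Matches("abaabb", "a*"))
        {
            // Index is the zero-based start position; Length is 0 for an empty match.
            Console.WriteLine("Index {0}, Length {1}: \"{2}\"",
                              match.Index, match.Length, match.Value);
        }
    }
}
```

According to the description above, the matches should start at indexes 0, 1, 2, 4, 5, and 6. Index 3 never appears as a starting position because the third match, "aa", covers indexes 2 and 3.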

See Also

Other Resources

.NET Framework Regular Expressions