Regular Expression Classes

The following sections describe the .NET Framework regular expression classes.

Regex

The Regex class represents an immutable (read-only) regular expression. It also contains static methods that allow use of other regular expression classes without explicitly creating instances of the other classes.

The following code example creates an instance of the Regex class and defines a simple regular expression when the object is initialized. Note the use of an additional backslash as an escape character that designates the backslash in the \s matching character class as a literal character.

    ' Declare object variable of type Regex.
    Dim r As Regex 
    ' Create a Regex object and define its regular expression.
    r = New Regex("\s2000")
[C#]
    // Declare object variable of type Regex.
    Regex r; 
    // Create a Regex object and define its regular expression.
    r = new Regex("\\s2000"); 

Match

The Match class represents the results of a regular expression matching operation. The following example uses the Match method of the Regex class to return an object of type Match in order to find the first match in the input string. The example uses the Match.Success property of the Match class to indicate whether a match was found.

    ' cCreate a new Regex object.
    Dim r As New Regex("abc") 
    ' Find a single match in the input string.
    Dim m As Match = r.Match("123abc456") 
    If m.Success Then
        ' Print out the character position where a match was found. 
        ' (Character position 3 in this case.)
        Console.WriteLine("Found match at position " & m.Index.ToString())
    End If
[C#]
    // Create a new Regex object.
    Regex r = new Regex("abc"); 
    // Find a single match in the string.
    Match m = r.Match("123abc456"); 
    if (m.Success) 
    {
        // Print out the character position where a match was found. 
        // (Character position 3 in this case.)
        Console.WriteLine("Found match at position " + m.Index);
    }

MatchCollection

The MatchCollection class represents a sequence of successful non-overlapping matches. The collection is immutable (read-only) and has no public constructor. Instances of MatchCollection are returned by the Regex.Matches property.

The following example uses the Matches method of the Regex class to fill a MatchCollection with all the matches found in the input string. The example copies the collection to a string array that holds each match and an integer array that indicates the position of each match.

    Dim mc As MatchCollection
    Dim results(20) As String
    Dim matchposition(20) As Integer

    ' Create a new Regex object and define the regular expression.
    Dim r As New Regex("abc")
    ' Use the Matches method to find all matches in the input string.
    mc = r.Matches("123abc4abcd")
    ' Loop through the match collection to retrieve all 
    ' matches and positions.
    Dim i As Integer
    For i = 0 To mc.Count - 1
        ' Add the match string to the string array.
        results(i) = mc(i).Value
        ' Record the character position where the match was found.
        matchposition(i) = mc(i).Index
    Next i
[C#]
    MatchCollection mc;
    String[] results = new String[20];
    int[] matchposition = new int[20];
    
    // Create a new Regex object and define the regular expression.
    Regex r = new Regex("abc"); 
    // Use the Matches method to find all matches in the input string.
    mc = r.Matches("123abc4abcd");
    // Loop through the match collection to retrieve all 
    // matches and positions.
    for (int i = 0; i < mc.Count; i++) 
    {
        // Add the match string to the string array.   
        results[i] = mc[i].Value;
        // Record the character position where the match was found.
        matchposition[i] = mc[i].Index;   
    }

GroupCollection

The GroupCollection class represents a collection of captured groups and returns the set of captured groups in a single match. The collection is immutable (read-only) and has no public constructor. Instances of GroupCollection are returned in the collection that the Match.Groups property returns.

The following console application example finds and prints the number of groups captured by a regular expression. For an example of how to extract the individual captures in each member of a group collection, see the Capture Collection example in the following section.

    Imports System
    Imports System.Text.RegularExpressions

    Public Class RegexTest
        Public Shared Sub RunTest()
            ' Define groups "abc", "ab", and "b".
            Dim r As New Regex("(a(b))c") 
            Dim m As Match = r.Match("abdabc")
            Console.WriteLine("Number of groups found = " _
            & m.Groups.Count.ToString())
        End Sub    
    
        Public Shared Sub Main()
            RunTest()
        End Sub
    End Class
[C#]
    using System;
    using System.Text.RegularExpressions;

    public class RegexTest 
    {
        public static void RunTest() 
        {
            // Define groups "abc", "ab", and "b".
            Regex r = new Regex("(a(b))c"); 
            Match m = r.Match("abdabc");
            Console.WriteLine("Number of groups found = " + m.Groups.Count);
        }
        public static void Main() 
        {
            RunTest();
        }
    }

This example produces the following output.

    Number of groups found = 3
[C#]
    Number of groups found = 3

CaptureCollection

The CaptureCollection class represents a sequence of captured substrings and returns the set of captures done by a single capturing group. A capturing group can capture more than one string in a single match because of quantifiers. The Captures property, an object of the CaptureCollection class, is provided as a member of the Match and Group classes to facilitate access to the set of captured substrings.

For instance, if you use the regular expression ((a(b))c)+ (where the + quantifier specifies one or more matches) to capture matches from the string "abcabcabc", the CaptureCollection for each matching Group of substrings will contain three members.

The following console application example uses the regular expression (Abc)+ to find one or more matches in the string "XYZAbcAbcAbcXYZAbcAb". The example illustrates the use of the Captures property to return multiple groups of captured substrings.

    Imports System
    Imports System.Text.RegularExpressions

    Public Class RegexTest
        Public Shared Sub RunTest()
            Dim counter As Integer
            Dim m As Match
            Dim cc As CaptureCollection
            Dim gc As GroupCollection
            ' Look for groupings of "Abc".
            Dim r As New Regex("(Abc)+") 
            ' Define the string to search.
            m = r.Match("XYZAbcAbcAbcXYZAbcAb")
            gc = m.Groups
            
            ' Print the number of groups.
            Console.WriteLine("Captured groups = " & gc.Count.ToString())
            
            ' Loop through each group.
            Dim i, ii As Integer
            For i = 0 To gc.Count - 1
                cc = gc(i).Captures
                counter = cc.Count
                
                ' Print number of captures in this group.
                Console.WriteLine("Captures count = " & counter.ToString())
                
                ' Loop through each capture in group.            
                For ii = 0 To counter - 1
                    ' Print capture and position.
                    Console.WriteLine(cc(ii).ToString() _
                        & "   Starts at character " & cc(ii).Index.ToString())
                Next ii
            Next i
        End Sub
    
        Public Shared Sub Main()
            RunTest()
         End Sub
    End Class
[C#]
    using System;
    using System.Text.RegularExpressions;

    public class RegexTest 
        {
        public static void RunTest() 
        {
            int counter;
            Match m;
            CaptureCollection cc;
            GroupCollection gc;

            // Look for groupings of "Abc".
            Regex r = new Regex("(Abc)+"); 
            // Define the string to search.
            m = r.Match("XYZAbcAbcAbcXYZAbcAb"); 
            gc = m.Groups;

            // Print the number of groups.
            Console.WriteLine("Captured groups = " + gc.Count.ToString());

            // Loop through each group.
            for (int i=0; i < gc.Count; i++) 
            {
                cc = gc[i].Captures;
                counter = cc.Count;
                
                // Print number of captures in this group.
                Console.WriteLine("Captures count = " + counter.ToString());
                
                // Loop through each capture in group.
                for (int ii = 0; ii < counter; ii++) 
                {
                    // Print capture and position.
                    Console.WriteLine(cc[ii] + "   Starts at character " + 
                        cc[ii].Index);
                }
            }
        }

        public static void Main() {
            RunTest();
        }
    }

This example returns the following output.

    Captured groups = 2
    Captures count = 1
    AbcAbcAbc   Starts at character 3
    Captures count = 3
    Abc   Starts at character 3
    Abc   Starts at character 6
    Abc   Starts at character 9
[C#]
    Captured groups = 2
    Captures count = 1
    AbcAbcAbc   Starts at character 3
    Captures count = 3
    Abc   Starts at character 3
    Abc   Starts at character 6
    Abc   Starts at character 9

Group

The Group class represents the results from a single capturing group. Because Group can capture zero, one, or more strings in a single match (using quantifiers), it contains a collection of Capture objects. Because Group inherits from Capture, the last substring captured can be accessed directly (the Group instance itself is equivalent to the last item of the collection returned by the Captures property).

Instances of Group are returned by the property Match.Groups(groupnum), or Match.Groups("groupname") if the "(?<groupname>)" grouping construct is used.

The following code example uses nested grouping constructs to capture substrings into groups.

    Dim matchposition(20) As Integer
    Dim results(20) As String
    ' Define substrings abc, ab, b.
    Dim r As New Regex("(a(b))c") 
    Dim m As Match = r.Match("abdabc")
    Dim i As Integer = 0
    While Not (m.Groups(i).Value = "")    
       ' Copy groups to string array.
       results(i) = m.Groups(i).Value     
       ' Record character position. 
       matchposition(i) = m.Groups(i).Index 
        i = i + 1
    End While
[C#]
    int[] matchposition = new int[20];
    String[] results = new String[20];
    // Define substrings abc, ab, b.
    Regex r = new Regex("(a(b))c"); 
    Match m = r.Match("abdabc");
    for (int i = 0; m.Groups[i].Value != ""; i++) 
    {
        // Copy groups to string array.
        results[i]=m.Groups[i].Value; 
        // Record character position.
        matchposition[i] = m.Groups[i].Index; 
    }

This example returns the following output.

    results(0) = "abc"   matchposition(0) = 3
    results(1) = "ab"    matchposition(1) = 3
    results(2) = "b"     matchposition(2) = 4
[C#]
    results[0] = "abc"   matchposition[0] = 3
    results[1] = "ab"    matchposition[1] = 3
    results[2] = "b"     matchposition[2] = 4

The following code example uses named grouping constructs to capture substrings from a string containing data in a "DATANAME:VALUE" format that the regular expression splits at the colon (:).

    Dim r As New Regex("^(?<name>\w+):(?<value>\w+)")
    Dim m As Match = r.Match("Section1:119900")
[C#]
    Regex r = new Regex("^(?<name>\\w+):(?<value>\\w+)");
    Match m = r.Match("Section1:119900");

This regular expression returns the following output.

    m.Groups("name").Value = "Section1"
    m.Groups("value").Value = "119900"
[C#]
    m.Groups["name"].Value = "Section1"
    m.Groups["value"].Value = "119900"

Capture

The Capture class contains the results from a single subexpression capture.

The following example loops through a Group collection, extracts the Capture collection from each member of Group, and assigns the variables posn and length to the character position in the original string where each string was found and the length of each string, respectively.

    Dim r As Regex
    Dim m As Match
    Dim cc As CaptureCollection
    Dim posn, length As Integer

    r = New Regex("(abc)*")
    m = r.Match("bcabcabc")
    Dim i, j As Integer
    i = 0
    While m.Groups(i).Value <> ""
        ' Grab the Collection for Group(i).
        cc = m.Groups(i).Captures
        For j = 0 To cc.Count - 1
            ' Position of Capture object.
            posn = cc(j).Index
            ' Length of Capture object.
            length = cc(j).Length
        Next j
        i += 1
    End While
[C#]
    Regex r;
    Match m;
    CaptureCollection cc;
    int posn, length;

    r = new Regex("(abc)*");
    m = r.Match("bcabcabc");
    for (int i=0; m.Groups[i].Value != ""; i++) 
    {
        // Capture the Collection for Group(i).
        cc = m.Groups[i].Captures; 
        for (int j = 0; j < cc.Count; j++) 
        {
            // Position of Capture object.
            posn = cc[j].Index; 
            // Length of Capture object.
            length = cc[j].Length; 
        }
    }

See Also

.NET Framework Regular Expressions | System.Text.RegularExpressions