正则表达式中的定位点
定位点或原子零宽度断言指定字符串中必须发生匹配的位置。 在搜索表达式中使用定位点时,正则表达式引擎不会在字符串中前进或使用字符;仅在指定位置查找匹配。 例如,^ 指定必须从行或字符串开头开始的匹配。 因此,正则表达式 ^http: 仅在其出现在行开头时与“http:”匹配。 下表列出了 .NET Framework 中正则表达式支持的定位点。
定位点 |
说明 |
---|---|
^ |
匹配必须出现在字符串或行的开头。 有关更多信息,请参见字符串或行的开头。 |
$ |
匹配必须出现在字符串或行的末尾,或出现在字符串或行末尾的 \n 之前。 有关更多信息,请参见字符串或行的结尾。 |
\A |
匹配必须出现在仅字符串的开头(不支持多行)。 有关更多信息,请参见仅字符串的开头。 |
\Z |
匹配必须出现在字符串的末尾或出现在字符串末尾的 \n 之前。 有关更多信息,请参见字符串的结尾或结束换行符之前。 |
\z |
匹配必须出现在仅字符串的末尾。 有关更多信息,请参见仅字符串的结尾。 |
\G |
匹配必须从上一个匹配结束的位置开始。 有关更多信息,请参见连续匹配。 |
\b |
匹配必须出现在单词边界上。 有关更多信息,请参见单词边界。 |
\B |
匹配不得出现在单词边界上。 有关更多信息,请参见非单词边界。 |
字符串或行的开头:^
^ 定位标记指定下面的模式必须出现在字符串的第一个字符位置的开头。 如果将 ^ 与 RegexOptions.Multiline 选项一起使用(请参见正则表达式选项),则匹配必须发生在每行的开头。
下面的示例将 ^ 定位点用于正则表达式中,该正则表达式提取某些专业棒球队存在期间的年的相关信息。 此示例可以调用 Regex.Matches() 方法的两个重载:
Matches(String, String) 重载的调用只发现了与正则表达式模式匹配的输入字符串中的第一个子字符串。
Matches(String, String, RegexOptions) 重载与为 RegexOptions.Multiline 设置的 options 参数的调用发现所有五个子字符串。
Imports System.Text.RegularExpressions
Module Example
Public Sub Main()
Dim startPos As Integer = 0
Dim endPos As Integer = 70
Dim input As String = "Brooklyn Dodgers, National League, 1911, 1912, 1932-1957" + vbCrLf + _
"Chicago Cubs, National League, 1903-present" + vbCrLf + _
"Detroit Tigers, American League, 1901-present" + vbCrLf + _
"New York Giants, National League, 1885-1957" + vbCrLf + _
"Washington Senators, American League, 1901-1960" + vbCrLf
Dim pattern As String = "^((\w+(\s*)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))*,*)+"
Dim match As Match
' Provide minimal validation in the event the input is invalid.
If input.Substring(startPos, endPos).Contains(",") Then
match = Regex.Match(input, pattern)
Do While match.Success
Console.Write("The {0} played in the {1} in", _
match.Groups(1).Value, match.Groups(4).Value)
For Each capture As Capture In match.Groups(5).Captures
Console.Write(capture.Value)
Next
Console.WriteLine(".")
startPos = match.Index + match.Length
endPos = CInt(IIf(startPos + 70 <= input.Length, 70, input.Length - startPos))
If Not input.Substring(startPos, endPos).Contains(",") Then Exit Do
match = match.NextMatch()
Loop
Console.WriteLine()
End If
startPos = 0
endPos = 70
If input.Substring(startPos, endPos).Contains(",") Then
match = Regex.Match(input, pattern, RegexOptions.Multiline)
Do While match.Success
Console.Write("The {0} played in the {1} in", _
match.Groups(1).Value, match.Groups(4).Value)
For Each capture As Capture In match.Groups(5).Captures
Console.Write(capture.Value)
Next
Console.WriteLine(".")
startPos = match.Index + match.Length
endPos = CInt(IIf(startPos + 70 <= input.Length, 70, input.Length - startPos))
If Not input.Substring(startPos, endPos).Contains(",") Then Exit Do
match = match.NextMatch()
Loop
Console.WriteLine()
End If
' For Each match As Match in Regex.Matches(input, pattern, RegexOptions.Multiline)
' Console.Write("The {0} played in the {1} in", _
' match.Groups(1).Value, match.Groups(4).Value)
' For Each capture As Capture In match.Groups(5).Captures
' Console.Write(capture.Value)
' Next
' Console.WriteLine(".")
' Next
End Sub
End Module
' The example displays the following output:
' The Brooklyn Dodgers played in the National League in 1911, 1912, 1932-1957.
'
' The Brooklyn Dodgers played in the National League in 1911, 1912, 1932-1957.
' The Chicago Cubs played in the National League in 1903-present.
' The Detroit Tigers played in the American League in 1901-present.
' The New York Giants played in the National League in 1885-1957.
' The Washington Senators played in the American League in 1901-1960.
using System;
using System.Text.RegularExpressions;
public class Example
{
public static void Main()
{
int startPos = 0, endPos = 70;
string input = "Brooklyn Dodgers, National League, 1911, 1912, 1932-1957\n" +
"Chicago Cubs, National League, 1903-present\n" +
"Detroit Tigers, American League, 1901-present\n" +
"New York Giants, National League, 1885-1957\n" +
"Washington Senators, American League, 1901-1960\n";
string pattern = @"^((\w+(\s*)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))*,*)+";
Match match;
if (input.Substring(startPos, endPos).Contains(",")) {
match = Regex.Match(input, pattern);
while (match.Success) {
Console.Write("The {0} played in the {1} in",
match.Groups[1].Value, match.Groups[4].Value);
foreach (Capture capture in match.Groups[5].Captures)
Console.Write(capture.Value);
Console.WriteLine(".");
startPos = match.Index + match.Length;
endPos = startPos + 70 <= input.Length ? 70 : input.Length - startPos;
if (! input.Substring(startPos, endPos).Contains(",")) break;
match = match.NextMatch();
}
Console.WriteLine();
}
if (input.Substring(startPos, endPos).Contains(",")) {
match = Regex.Match(input, pattern, RegexOptions.Multiline);
while (match.Success) {
Console.Write("The {0} played in the {1} in",
match.Groups[1].Value, match.Groups[4].Value);
foreach (Capture capture in match.Groups[5].Captures)
Console.Write(capture.Value);
Console.WriteLine(".");
startPos = match.Index + match.Length;
endPos = startPos + 70 <= input.Length ? 70 : input.Length - startPos;
if (! input.Substring(startPos, endPos).Contains(",")) break;
match = match.NextMatch();
}
Console.WriteLine();
}
}
}
// The example displays the following output:
// The Brooklyn Dodgers played in the National League in 1911, 1912, 1932-1957.
//
// The Brooklyn Dodgers played in the National League in 1911, 1912, 1932-1957.
// The Chicago Cubs played in the National League in 1903-present.
// The Detroit Tigers played in the American League in 1901-present.
// The New York Giants played in the National League in 1885-1957.
// The Washington Senators played in the American League in 1901-1960.
正则表达式模式 ^((\w+(\s*)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))*,*)+ 的定义如下表所示。
模式 |
说明 |
---|---|
^ |
从输入字符串开头开始进行匹配(如果使用 RegexOptions.Multiline 选项调用方法,则为行的开头)。 |
((\w+(\s*)){2,} |
完全匹配两次零个或多个后跟零个或一个空格的单词字符。 这是第一个捕获组。 此表达式还定义第二个和第三个捕获组:第二个包含捕获的单词,第三个包含捕获的空格。 |
,\s |
匹配后跟空格字符的逗号。 |
(\w+\s\w+) |
匹配零个或多个后跟空格、一个或多个单词字符的单词字符。 这是第四个捕获组。 |
, |
匹配逗号。 |
\s\d{4} |
匹配后跟四个十进制数字的空格。 |
(-(\d{4}|present))* |
匹配零个或一个后跟四个十进制数字或字符串“present”的连字符匹配项。 这是第六个捕获组。 它还包括第七个捕获组。 |
,* |
匹配零个或一个逗号匹配项。 |
(\s\d{4}(-(\d{4}|present))*,*)+ |
匹配以下一个或多个匹配项:空格、四个十进制数字、零个或一个后跟四个十进制数字或字符串“present”的连字符匹配项,以及零个或一个逗号。 这是第五个捕获组。 |
返回页首
字符串或行的结尾:$
$ 定位标记指定前面的模式必须出现在输入字符串的结尾或输入字符串的结尾处的 \n 之前。
如果将 $ 与 RegexOptions.Multiline 选项一起使用,则匹配也会出现在一行的末尾。 请注意 $ 匹配 \n,但不匹配 \r\n(回车换行组合,或 CR/LF)。 若要匹配 CR/LF 字符组合,请在正则表达式模式中包含 \r?$。
下面的示例将 $ 定位点添加到字符串或行的开头部分中示例中使用的正则表达式模式。 用于原始输入字符串(其中包括五行文本)时,Regex.Matches(String, String) 方法无法找到匹配,因为第一行的结尾与 $ 模式不匹配。 原始输入字符串被拆分为字符串数组时,Regex.Matches(String, String) 方法将成功匹配五行中的每一行。 通过将 options 参数设置为 RegexOptions.Multiline 调用 Regex.Matches(String, String, RegexOptions) 方法时,没有找到匹配项,因为正则表达式模式不考虑回车元素 (\u+000D)。 但是,通过将 $ 替换为 \r?$ 来修改正则表达式模式时,调用 options 参数再次设置为 RegexOptions.Multiline 的 Regex.Matches(String, String, RegexOptions) 方法时将查找五个匹配项。
Imports System.Text.RegularExpressions
Module Example
Public Sub Main()
Dim startPos As Integer = 0
Dim endPos As Integer = 70
Dim input As String = "Brooklyn Dodgers, National League, 1911, 1912, 1932-1957" + vbCrLf + _
"Chicago Cubs, National League, 1903-present" + vbCrLf + _
"Detroit Tigers, American League, 1901-present" + vbCrLf + _
"New York Giants, National League, 1885-1957" + vbCrLf + _
"Washington Senators, American League, 1901-1960" + vbCrLf
Dim basePattern As String = "^((\w+(\s*)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))*,*)+"
Dim match As Match
Dim pattern As String = basePattern + "$"
Console.WriteLine("Attempting to match the entire input string:")
' Provide minimal validation in the event the input is invalid.
If input.Substring(startPos, endPos).Contains(",") Then
match = Regex.Match(input, pattern)
Do While match.Success
Console.Write("The {0} played in the {1} in", _
match.Groups(1).Value, match.Groups(4).Value)
For Each capture As Capture In match.Groups(5).Captures
Console.Write(capture.Value)
Next
Console.WriteLine(".")
startPos = match.Index + match.Length
endPos = CInt(IIf(startPos + 70 <= input.Length, 70, input.Length - startPos))
If Not input.Substring(startPos, endPos).Contains(",") Then Exit Do
match = match.NextMatch()
Loop
Console.WriteLine()
End If
Dim teams() As String = input.Split(New String() { vbCrLf }, StringSplitOptions.RemoveEmptyEntries)
Console.WriteLine("Attempting to match each element in a string array:")
For Each team As String In teams
If team.Length > 70 Then Continue For
match = Regex.Match(team, pattern)
If match.Success Then
Console.Write("The {0} played in the {1} in", _
match.Groups(1).Value, match.Groups(4).Value)
For Each capture As Capture In match.Groups(5).Captures
Console.Write(capture.Value)
Next
Console.WriteLine(".")
End If
Next
Console.WriteLine()
startPos = 0
endPos = 70
Console.WriteLine("Attempting to match each line of an input string with '$':")
' Provide minimal validation in the event the input is invalid.
If input.Substring(startPos, endPos).Contains(",") Then
match = Regex.Match(input, pattern, RegexOptions.Multiline)
Do While match.Success
Console.Write("The {0} played in the {1} in", _
match.Groups(1).Value, match.Groups(4).Value)
For Each capture As Capture In match.Groups(5).Captures
Console.Write(capture.Value)
Next
Console.WriteLine(".")
startPos = match.Index + match.Length
endPos = CInt(IIf(startPos + 70 <= input.Length, 70, input.Length - startPos))
If Not input.Substring(startPos, endPos).Contains(",") Then Exit Do
match = match.NextMatch()
Loop
Console.WriteLine()
End If
startPos = 0
endPos = 70
pattern = basePattern + "\r?$"
Console.WriteLine("Attempting to match each line of an input string with '\r?$':")
' Provide minimal validation in the event the input is invalid.
If input.Substring(startPos, endPos).Contains(",") Then
match = Regex.Match(input, pattern, RegexOptions.Multiline)
Do While match.Success
Console.Write("The {0} played in the {1} in", _
match.Groups(1).Value, match.Groups(4).Value)
For Each capture As Capture In match.Groups(5).Captures
Console.Write(capture.Value)
Next
Console.WriteLine(".")
startPos = match.Index + match.Length
endPos = CInt(IIf(startPos + 70 <= input.Length, 70, input.Length - startPos))
If Not input.Substring(startPos, endPos).Contains(",") Then Exit Do
match = match.NextMatch()
Loop
Console.WriteLine()
End If
End Sub
End Module
' The example displays the following output:
' Attempting to match the entire input string:
'
' Attempting to match each element in a string array:
' The Brooklyn Dodgers played in the National League in 1911, 1912, 1932-1957.
' The Chicago Cubs played in the National League in 1903-present.
' The Detroit Tigers played in the American League in 1901-present.
' The New York Giants played in the National League in 1885-1957.
' The Washington Senators played in the American League in 1901-1960.
'
' Attempting to match each line of an input string with '$':
'
' Attempting to match each line of an input string with '\r+$':
' The Brooklyn Dodgers played in the National League in 1911, 1912, 1932-1957.
' The Chicago Cubs played in the National League in 1903-present.
' The Detroit Tigers played in the American League in 1901-present.
' The New York Giants played in the National League in 1885-1957.
' The Washington Senators played in the American League in 1901-1960.
using System;
using System.Text.RegularExpressions;
public class Example
{
public static void Main()
{
int startPos = 0, endPos = 70;
string cr = Environment.NewLine;
string input = "Brooklyn Dodgers, National League, 1911, 1912, 1932-1957" + cr +
"Chicago Cubs, National League, 1903-present" + cr +
"Detroit Tigers, American League, 1901-present" + cr +
"New York Giants, National League, 1885-1957" + cr +
"Washington Senators, American League, 1901-1960" + cr;
Match match;
string basePattern = @"^((\w+(\s*)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))*,*)+";
string pattern = basePattern + "$";
Console.WriteLine("Attempting to match the entire input string:");
if (input.Substring(startPos, endPos).Contains(",")) {
match = Regex.Match(input, pattern);
while (match.Success) {
Console.Write("The {0} played in the {1} in",
match.Groups[1].Value, match.Groups[4].Value);
foreach (Capture capture in match.Groups[5].Captures)
Console.Write(capture.Value);
Console.WriteLine(".");
startPos = match.Index + match.Length;
endPos = startPos + 70 <= input.Length ? 70 : input.Length - startPos;
if (! input.Substring(startPos, endPos).Contains(",")) break;
match = match.NextMatch();
}
Console.WriteLine();
}
string[] teams = input.Split(new String[] { cr }, StringSplitOptions.RemoveEmptyEntries);
Console.WriteLine("Attempting to match each element in a string array:");
foreach (string team in teams)
{
if (team.Length > 70) continue;
match = Regex.Match(team, pattern);
if (match.Success)
{
Console.Write("The {0} played in the {1} in",
match.Groups[1].Value, match.Groups[4].Value);
foreach (Capture capture in match.Groups[5].Captures)
Console.Write(capture.Value);
Console.WriteLine(".");
}
}
Console.WriteLine();
startPos = 0;
endPos = 70;
Console.WriteLine("Attempting to match each line of an input string with '$':");
if (input.Substring(startPos, endPos).Contains(",")) {
match = Regex.Match(input, pattern, RegexOptions.Multiline);
while (match.Success) {
Console.Write("The {0} played in the {1} in",
match.Groups[1].Value, match.Groups[4].Value);
foreach (Capture capture in match.Groups[5].Captures)
Console.Write(capture.Value);
Console.WriteLine(".");
startPos = match.Index + match.Length;
endPos = startPos + 70 <= input.Length ? 70 : input.Length - startPos;
if (! input.Substring(startPos, endPos).Contains(",")) break;
match = match.NextMatch();
}
Console.WriteLine();
}
startPos = 0;
endPos = 70;
pattern = basePattern + "\r?$";
Console.WriteLine(@"Attempting to match each line of an input string with '\r?$':");
if (input.Substring(startPos, endPos).Contains(",")) {
match = Regex.Match(input, pattern, RegexOptions.Multiline);
while (match.Success) {
Console.Write("The {0} played in the {1} in",
match.Groups[1].Value, match.Groups[4].Value);
foreach (Capture capture in match.Groups[5].Captures)
Console.Write(capture.Value);
Console.WriteLine(".");
startPos = match.Index + match.Length;
endPos = startPos + 70 <= input.Length ? 70 : input.Length - startPos;
if (! input.Substring(startPos, endPos).Contains(",")) break;
match = match.NextMatch();
}
Console.WriteLine();
}
}
}
// The example displays the following output:
// Attempting to match the entire input string:
//
// Attempting to match each element in a string array:
// The Brooklyn Dodgers played in the National League in 1911, 1912, 1932-1957.
// The Chicago Cubs played in the National League in 1903-present.
// The Detroit Tigers played in the American League in 1901-present.
// The New York Giants played in the National League in 1885-1957.
// The Washington Senators played in the American League in 1901-1960.
//
// Attempting to match each line of an input string with '$':
//
// Attempting to match each line of an input string with '\r+$':
// The Brooklyn Dodgers played in the National League in 1911, 1912, 1932-1957.
// The Chicago Cubs played in the National League in 1903-present.
// The Detroit Tigers played in the American League in 1901-present.
// The New York Giants played in the National League in 1885-1957.
// The Washington Senators played in the American League in 1901-1960.
返回页首
仅字符串的开头:\A
\A 定位标记指定匹配必须出现在输入字符串的开头。 它与 ^ 定位标记相同,除非 \A 忽略 RegexOptions.Multiline 选项。 因此,它只能匹配多行输入字符串中第一行的开头。
下面的示例类似于 ^ 和 $ 定位点的示例。 它将 \A 定位标记用于正则表达式,该正则表达式提取期间存在某些专业棒球队的年份的相关信息。 输入字符串包含五行。 Regex.Matches(String, String, RegexOptions) 方法的调用只发现了与正则表达式模式匹配的输入字符串中的第一个子字符串。 如示例所示,Multiline 选项将无效。
Imports System.Text.RegularExpressions
Module Example
Public Sub Main()
Dim startPos As Integer = 0
Dim endPos As Integer = 70
Dim input As String = "Brooklyn Dodgers, National League, 1911, 1912, 1932-1957" + vbCrLf + _
"Chicago Cubs, National League, 1903-present" + vbCrLf + _
"Detroit Tigers, American League, 1901-present" + vbCrLf + _
"New York Giants, National League, 1885-1957" + vbCrLf + _
"Washington Senators, American League, 1901-1960" + vbCrLf
Dim pattern As String = "\A((\w+(\s*)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))*,*)+"
Dim match As Match
' Provide minimal validation in the event the input is invalid.
If input.Substring(startPos, endPos).Contains(",") Then
match = Regex.Match(input, pattern, RegexOptions.Multiline)
Do While match.Success
Console.Write("The {0} played in the {1} in", _
match.Groups(1).Value, match.Groups(4).Value)
For Each capture As Capture In match.Groups(5).Captures
Console.Write(capture.Value)
Next
Console.WriteLine(".")
startPos = match.Index + match.Length
endPos = CInt(IIf(startPos + 70 <= input.Length, 70, input.Length - startPos))
If Not input.Substring(startPos, endPos).Contains(",") Then Exit Do
match = match.NextMatch()
Loop
Console.WriteLine()
End If
End Sub
End Module
' The example displays the following output:
' The Brooklyn Dodgers played in the National League in 1911, 1912, 1932-1957.
using System;
using System.Text.RegularExpressions;
public class Example
{
public static void Main()
{
int startPos = 0, endPos = 70;
string input = "Brooklyn Dodgers, National League, 1911, 1912, 1932-1957\n" +
"Chicago Cubs, National League, 1903-present\n" +
"Detroit Tigers, American League, 1901-present\n" +
"New York Giants, National League, 1885-1957\n" +
"Washington Senators, American League, 1901-1960\n";
string pattern = @"\A((\w+(\s*)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))*,*)+";
Match match;
if (input.Substring(startPos, endPos).Contains(",")) {
match = Regex.Match(input, pattern, RegexOptions.Multiline);
while (match.Success) {
Console.Write("The {0} played in the {1} in",
match.Groups[1].Value, match.Groups[4].Value);
foreach (Capture capture in match.Groups[5].Captures)
Console.Write(capture.Value);
Console.WriteLine(".");
startPos = match.Index + match.Length;
endPos = startPos + 70 <= input.Length ? 70 : input.Length - startPos;
if (! input.Substring(startPos, endPos).Contains(",")) break;
match = match.NextMatch();
}
Console.WriteLine();
}
}
}
// The example displays the following output:
// The Brooklyn Dodgers played in the National League in 1911, 1912, 1932-1957.
返回页首
字符串的结尾或结束换行符之前:\Z
\Z 定位标记指定匹配必须出现在输入字符串的结尾或输入字符串的结尾处的 \n 之前。 它与 $ 定位标记相同,除非 \Z 忽略 RegexOptions.Multiline 选项。 因此,在多行字符串中,它只能匹配最后一行或 \n 之前最后一行的结尾。
请注意 \Z 匹配 \n,但不匹配 \r\n(CR/LF 字符组合)。 要匹配 CR/LF,请将 \r?\Z 包含在正则表达式模式中。
下面的示例将 \Z 定位点用于与字符串或行的开头节中的示例类似的正则表达式中,该正则表达式提取有关某些专业棒球队存在期间的年的信息。 正则表达式 ^((\w+(\s*)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))*,*)+\r?\Z 中子表达式 \r?\Z 与字符串末端相匹配,也与以 \n 或 \r\n 结尾的字符串相匹配。 因此,数组中的每个元素匹配正则表达式模式。
Imports System.Text.RegularExpressions
Module Example
Public Sub Main()
Dim inputs() As String = { "Brooklyn Dodgers, National League, 1911, 1912, 1932-1957", _
"Chicago Cubs, National League, 1903-present" + vbCrLf, _
"Detroit Tigers, American League, 1901-present" + vbLf, _
"New York Giants, National League, 1885-1957", _
"Washington Senators, American League, 1901-1960" + vbCrLf }
Dim pattern As String = "^((\w+(\s*)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))*,*)+\r?\Z"
For Each input As String In inputs
If input.Length > 70 Or Not input.Contains(",") Then Continue For
Console.WriteLine(Regex.Escape(input))
Dim match As Match = Regex.Match(input, pattern)
If match.Success Then
Console.WriteLine(" Match succeeded.")
Else
Console.WriteLine(" Match failed.")
End If
Next
End Sub
End Module
' The example displays the following output:
' Brooklyn\ Dodgers,\ National\ League,\ 1911,\ 1912,\ 1932-1957
' Match succeeded.
' Chicago\ Cubs,\ National\ League,\ 1903-present\r\n
' Match succeeded.
' Detroit\ Tigers,\ American\ League,\ 1901-present\n
' Match succeeded.
' New\ York\ Giants,\ National\ League,\ 1885-1957
' Match succeeded.
' Washington\ Senators,\ American\ League,\ 1901-1960\r\n
' Match succeeded.
using System;
using System.Text.RegularExpressions;
public class Example
{
public static void Main()
{
string[] inputs = { "Brooklyn Dodgers, National League, 1911, 1912, 1932-1957",
"Chicago Cubs, National League, 1903-present" + Environment.NewLine,
"Detroit Tigers, American League, 1901-present" + Regex.Unescape(@"\n"),
"New York Giants, National League, 1885-1957",
"Washington Senators, American League, 1901-1960" + Environment.NewLine};
string pattern = @"^((\w+(\s*)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))*,*)+\r?\Z";
foreach (string input in inputs)
{
if (input.Length > 70 || ! input.Contains(",")) continue;
Console.WriteLine(Regex.Escape(input));
Match match = Regex.Match(input, pattern);
if (match.Success)
Console.WriteLine(" Match succeeded.");
else
Console.WriteLine(" Match failed.");
}
}
}
// The example displays the following output:
// Brooklyn\ Dodgers,\ National\ League,\ 1911,\ 1912,\ 1932-1957
// Match succeeded.
// Chicago\ Cubs,\ National\ League,\ 1903-present\r\n
// Match succeeded.
// Detroit\ Tigers,\ American\ League,\ 1901-present\n
// Match succeeded.
// New\ York\ Giants,\ National\ League,\ 1885-1957
// Match succeeded.
// Washington\ Senators,\ American\ League,\ 1901-1960\r\n
// Match succeeded.
返回页首
仅字符串的结尾:\z
\z 定位标记指定匹配必须出现在输入字符串的结尾。 和 $ 语言元素一样,\z 忽略 RegexOptions.Multiline 选项。 与 \Z 语言元素不同,\z 与字符串结尾的 \n 字符不匹配。 因此,它只能匹配输入字符串的最后一行。
下面的示例将 \z 定位点用于原本与上一节中的示例相同的正则表达式中,该正则表达式提取有关某些专业棒球队存在期间的年的信息。 此示例尝试匹配字符串数组中的每五个元素,方式为采用正则表达式模式 ^((\w+(\s*)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))*,*)+\r?\z。 其中两个字符串以回车符和换行符结尾,一个以换行符结尾,并且两个均不以回车符或换行符结尾。 如输出所示,仅不包含回车符或换行符的字符串匹配该模式。
Imports System.Text.RegularExpressions
Module Example
Public Sub Main()
Dim inputs() As String = { "Brooklyn Dodgers, National League, 1911, 1912, 1932-1957", _
"Chicago Cubs, National League, 1903-present" + vbCrLf, _
"Detroit Tigers, American League, 1901-present" + vbLf, _
"New York Giants, National League, 1885-1957", _
"Washington Senators, American League, 1901-1960" + vbCrLf }
Dim pattern As String = "^((\w+(\s*)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))*,*)+\r?\z"
For Each input As String In inputs
If input.Length > 70 Or Not input.Contains(",") Then Continue For
Console.WriteLine(Regex.Escape(input))
Dim match As Match = Regex.Match(input, pattern)
If match.Success Then
Console.WriteLine(" Match succeeded.")
Else
Console.WriteLine(" Match failed.")
End If
Next
End Sub
End Module
' The example displays the following output:
' Brooklyn\ Dodgers,\ National\ League,\ 1911,\ 1912,\ 1932-1957
' Match succeeded.
' Chicago\ Cubs,\ National\ League,\ 1903-present\r\n
' Match failed.
' Detroit\ Tigers,\ American\ League,\ 1901-present\n
' Match failed.
' New\ York\ Giants,\ National\ League,\ 1885-1957
' Match succeeded.
' Washington\ Senators,\ American\ League,\ 1901-1960\r\n
' Match failed.
using System;
using System.Text.RegularExpressions;
public class Example
{
public static void Main()
{
string[] inputs = { "Brooklyn Dodgers, National League, 1911, 1912, 1932-1957",
"Chicago Cubs, National League, 1903-present" + Environment.NewLine,
"Detroit Tigers, American League, 1901-present\\r",
"New York Giants, National League, 1885-1957",
"Washington Senators, American League, 1901-1960" + Environment.NewLine };
string pattern = @"^((\w+(\s*)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))*,*)+\r?\z";
foreach (string input in inputs)
{
if (input.Length > 70 || ! input.Contains(",")) continue;
Console.WriteLine(Regex.Escape(input));
Match match = Regex.Match(input, pattern);
if (match.Success)
Console.WriteLine(" Match succeeded.");
else
Console.WriteLine(" Match failed.");
}
}
}
// The example displays the following output:
// Brooklyn\ Dodgers,\ National\ League,\ 1911,\ 1912,\ 1932-1957
// Match succeeded.
// Chicago\ Cubs,\ National\ League,\ 1903-present\r\n
// Match failed.
// Detroit\ Tigers,\ American\ League,\ 1901-present\n
// Match failed.
// New\ York\ Giants,\ National\ League,\ 1885-1957
// Match succeeded.
// Washington\ Senators,\ American\ League,\ 1901-1960\r\n
// Match failed.
返回页首
连续匹配:\G
\G 定位标记指定匹配必须出现在上一个匹配结束的地方。 通过 Regex.Matches 或 Match.NextMatch 方法使用此定位点时,它确保所有匹配项是连续的。
下面的示例使用正则表达式从以逗号分隔的字符串提取啮齿类的名称。
Imports System.Text.RegularExpressions
Module Example
Public Sub Main()
Dim input As String = "capybara,squirrel,chipmunk,porcupine,gopher," + _
"beaver,groundhog,hamster,guinea pig,gerbil," + _
"chinchilla,prairie dog,mouse,rat"
Dim pattern As String = "\G(\w+\s?\w*),?"
Dim match As Match = Regex.Match(input, pattern)
Do While match.Success
Console.WriteLine(match.Groups(1).Value)
match = match.NextMatch()
Loop
End Sub
End Module
' The example displays the following output:
' capybara
' squirrel
' chipmunk
' porcupine
' gopher
' beaver
' groundhog
' hamster
' guinea pig
' gerbil
' chinchilla
' prairie dog
' mouse
' rat
using System;
using System.Text.RegularExpressions;
public class Example
{
public static void Main()
{
string input = "capybara,squirrel,chipmunk,porcupine,gopher," +
"beaver,groundhog,hamster,guinea pig,gerbil," +
"chinchilla,prairie dog,mouse,rat";
string pattern = @"\G(\w+\s?\w*),?";
Match match = Regex.Match(input, pattern);
while (match.Success)
{
Console.WriteLine(match.Groups[1].Value);
match = match.NextMatch();
}
}
}
// The example displays the following output:
// capybara
// squirrel
// chipmunk
// porcupine
// gopher
// beaver
// groundhog
// hamster
// guinea pig
// gerbil
// chinchilla
// prairie dog
// mouse
// rat
正则表达式模式 \G(\w+\s?\w*),? 的含义如下表所示。
模式 |
说明 |
---|---|
\G |
在最后一个匹配结束的位置开始。 |
\w+ |
匹配一个或多个单词字符。 |
\s? |
匹配零个或一个空格。 |
\w* |
匹配零个或多个单词字符。 |
(\w+\s? \w*) |
匹配零个或多个后跟零个或一个空格、零个或多个单词字符的单词字符。 这是第一个捕获组。 |
,? |
匹配零个或一个文本逗号字符匹配项。 |
返回页首
单词边界:\b
\b 定位标记指定匹配必须出现在单词字符(\w 语言元素)和非单词字符(\W 语言元素)之间的边界上。 单词字符包含字母数字字符和下划线;非单词字符是非字母数字或下划线的任何字符。 (有关更多信息,请参见字符类。)匹配也可能出现在字符串的开头或结尾的单词边界。
\b 定位标记经常用于确保子表达式匹配整个单词而不只是匹配单词的开头或结尾。 下例中的正则表达式 \bare\w*\b 演示此用法。 它匹配以子字符串“are”开头的任意单词。 示例的输出还演示 \b 同时匹配输入字符串的开头和结尾。
Imports System.Text.RegularExpressions
Module Example
Public Sub Main()
Dim input As String = "area bare arena mare"
Dim pattern As String = "\bare\w*\b"
Console.WriteLine("Words that begin with 'are':")
For Each match As Match In Regex.Matches(input, pattern)
Console.WriteLine("'{0}' found at position {1}", _
match.Value, match.Index)
Next
End Sub
End Module
' The example displays the following output:
' Words that begin with 'are':
' 'area' found at position 0
' 'arena' found at position 10
using System;
using System.Text.RegularExpressions;
public class Example
{
public static void Main()
{
string input = "area bare arena mare";
string pattern = @"\bare\w*\b";
Console.WriteLine("Words that begin with 'are':");
foreach (Match match in Regex.Matches(input, pattern))
Console.WriteLine("'{0}' found at position {1}",
match.Value, match.Index);
}
}
// The example displays the following output:
// Words that begin with 'are':
// 'area' found at position 0
// 'arena' found at position 10
正则表达式模式的解释如下表所示。
模式 |
说明 |
---|---|
\b |
在单词边界处开始匹配。 |
are |
匹配子字符串“are”。 |
\w* |
匹配零个或多个单词字符。 |
\b |
在单词边界处结束匹配。 |
返回页首
非字边界:\B
\B 定位标记指定匹配不得出现在单词边界。 它是与 \b 定位点相反的定位点。
下面的示例使用 \B 定位点找到单词中子字符串“qu”的匹配项。 正则表达式模式 \Bqu\w+ 匹配以“qu”开头的子字符串,该字符串不会开始单词并延续到字符串的结尾。
Imports System.Text.RegularExpressions
Module Example
Public Sub Main()
Dim input As String = "equity queen equip acquaint quiet"
Dim pattern As String = "\Bqu\w+"
For Each match As Match In Regex.Matches(input, pattern)
Console.WriteLine("'{0}' found at position {1}", _
match.Value, match.Index)
Next
End Sub
End Module
' The example displays the following output:
' 'quity' found at position 1
' 'quip' found at position 14
' 'quaint' found at position 21
using System;
using System.Text.RegularExpressions;
public class Example
{
public static void Main()
{
string input = "equity queen equip acquaint quiet";
string pattern = @"\Bqu\w+";
foreach (Match match in Regex.Matches(input, pattern))
Console.WriteLine("'{0}' found at position {1}",
match.Value, match.Index);
}
}
// The example displays the following output:
// 'quity' found at position 1
// 'quip' found at position 14
// 'quaint' found at position 21
正则表达式模式的解释如下表所示。
模式 |
说明 |
---|---|
\B |
不从单词边界开始进行匹配。 |
qu |
匹配子字符串“qu”。 |
\w+ |
匹配一个或多个单词字符。 |
返回页首