How to: Extract a Protocol and Port Number from a URL
The following example extracts a protocol and port number from a URL.
Warning
When using System.Text.RegularExpressions to process untrusted input, pass a timeout. A malicious user can provide input to RegularExpressions, causing a Denial-of-Service attack. ASP.NET Core framework APIs that use RegularExpressions pass a timeout.
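When a timeout is supplied, a match that runs too long throws a RegexMatchTimeoutException instead of blocking indefinitely. The following is a minimal sketch of handling that exception, not a definitive implementation. The class name TimeoutSketch and the helper TryExtract are hypothetical names introduced for illustration, and the 150-millisecond interval simply mirrors the example in this article.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TimeoutSketch
{
    // Hypothetical helper: returns false when the URL does not match
    // or when matching exceeds the timeout.
    public static bool TryExtract(string url, out string result)
    {
        result = string.Empty;
        try
        {
            Regex r = new Regex(@"^(?<proto>\w+)://[^/]+?(?<port>:\d+)?/",
                                RegexOptions.None, TimeSpan.FromMilliseconds(150));
            Match m = r.Match(url);
            if (!m.Success)
                return false;

            result = m.Result("${proto}${port}");
            return true;
        }
        catch (RegexMatchTimeoutException ex)
        {
            // MatchTimeout reports the interval that was exceeded.
            Console.Error.WriteLine($"Matching timed out after {ex.MatchTimeout}.");
            return false;
        }
    }

    public static void Main()
    {
        if (TryExtract("http://www.contoso.com:8080/letters/readme.html", out string value))
            Console.WriteLine(value);   // http:8080
    }
}
```

Treating a timeout the same as a failed match is only one possible policy; an application that handles untrusted input might prefer to log the input or reject the request outright.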
Example
The example uses the Match.Result method to return the protocol followed by a colon followed by the port number.
using System;
using System.Text.RegularExpressions;

public class Example
{
    public static void Main()
    {
        string url = "http://www.contoso.com:8080/letters/readme.html";
        Regex r = new Regex(@"^(?<proto>\w+)://[^/]+?(?<port>:\d+)?/",
                            RegexOptions.None, TimeSpan.FromMilliseconds(150));
        Match m = r.Match(url);
        if (m.Success)
            Console.WriteLine(m.Result("${proto}${port}"));
    }
}
// The example displays the following output:
//       http:8080
Imports System.Text.RegularExpressions

Module Example
    Public Sub Main()
        Dim url As String = "http://www.contoso.com:8080/letters/readme.html"
        Dim r As New Regex("^(?<proto>\w+)://[^/]+?(?<port>:\d+)?/",
                           RegexOptions.None, TimeSpan.FromMilliseconds(150))
        Dim m As Match = r.Match(url)
        If m.Success Then
            Console.WriteLine(m.Result("${proto}${port}"))
        End If
    End Sub
End Module
' The example displays the following output:
'       http:8080
The regular expression pattern ^(?<proto>\w+)://[^/]+?(?<port>:\d+)?/ can be interpreted as shown in the following table.
Pattern | Description |
---|---|
^ | Begin the match at the start of the string. |
(?<proto>\w+) | Match one or more word characters. Name this group proto. |
:// | Match a colon followed by two slash marks. |
[^/]+? | Match one or more occurrences (but as few as possible) of any character other than a slash mark. |
(?<port>:\d+)? | Match zero or one occurrence of a colon followed by one or more digit characters. Name this group port. |
/ | Match a slash mark. |
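Because the port group is optional and the pattern ends with a required slash mark, it behaves differently for URLs that omit the port or the trailing path. The following sketch, using the same pattern, illustrates those cases. The class name OptionalPortSketch, the helper Describe, and the sample URLs other than the first are assumptions made for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OptionalPortSketch
{
    // Same pattern as the example in this article.
    private static readonly Regex UrlPattern = new Regex(
        @"^(?<proto>\w+)://[^/]+?(?<port>:\d+)?/",
        RegexOptions.None, TimeSpan.FromMilliseconds(150));

    // Hypothetical helper that reports what the pattern captured.
    public static string Describe(string url)
    {
        Match m = UrlPattern.Match(url);
        if (!m.Success)
            return "(no match)";

        // Group.Success is false when the optional port group did not participate.
        Group port = m.Groups["port"];
        return port.Success
            ? m.Groups["proto"].Value + port.Value
            : m.Groups["proto"].Value + " (no explicit port)";
    }

    public static void Main()
    {
        Console.WriteLine(Describe("http://www.contoso.com:8080/letters/readme.html"));
        Console.WriteLine(Describe("https://www.contoso.com/index.html"));
        // No slash mark after the host, so the final "/" in the pattern cannot match.
        Console.WriteLine(Describe("http://www.contoso.com:8080"));
    }
}
// The sketch displays the following output:
//       http:8080
//       https (no explicit port)
//       (no match)
```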
The Match.Result method expands the ${proto}${port} replacement sequence, which concatenates the values of the two named groups captured by the regular expression pattern. It is a convenient alternative to explicitly concatenating the strings retrieved from the GroupCollection object returned by the Match.Groups property, as the following code shows.
Console.WriteLine(m.Groups["proto"].Value + m.Groups["port"].Value);
Console.WriteLine(m.Groups("proto").Value + m.Groups("port").Value)
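Working with the GroupCollection object directly is also useful when the captured text needs further processing. As an illustration only, the following sketch converts the captured port to an integer by trimming the leading colon that the port group includes. The class name PortNumberSketch and the helper TryGetPort are hypothetical names, not part of the original example.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class PortNumberSketch
{
    // Same pattern as the example in this article.
    private static readonly Regex UrlPattern = new Regex(
        @"^(?<proto>\w+)://[^/]+?(?<port>:\d+)?/",
        RegexOptions.None, TimeSpan.FromMilliseconds(150));

    // Hypothetical helper: returns true only when the URL matches,
    // contains an explicit port, and the port text parses as an int.
    public static bool TryGetPort(string url, out int port)
    {
        port = 0;
        Match m = UrlPattern.Match(url);
        if (!m.Success || !m.Groups["port"].Success)
            return false;

        // The port group captures the colon as well, for example ":8080".
        string digits = m.Groups["port"].Value.TrimStart(':');
        return int.TryParse(digits, NumberStyles.None,
                            CultureInfo.InvariantCulture, out port);
    }

    public static void Main()
    {
        if (TryGetPort("http://www.contoso.com:8080/letters/readme.html", out int port))
            Console.WriteLine(port);   // 8080
    }
}
```

Using int.TryParse rather than int.Parse means that an oversized or otherwise unparsable port yields false instead of an exception.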