How to: Strip Invalid Characters from a String
The following example uses the static Regex.Replace method to strip invalid characters from a string.
Warning
When using System.Text.RegularExpressions to process untrusted input, pass a timeout. A malicious user can provide input to RegularExpressions
, causing a Denial-of-Service attack. ASP.NET Core framework APIs that use RegularExpressions
pass a timeout.
Example
You can use the CleanInput
method defined in this example to strip potentially harmful characters that have been entered into a text field that accepts user input. In this case, CleanInput
strips out all nonalphanumeric characters except periods (.), at symbols (@), and hyphens (-), and returns the remaining string. However, you can modify the regular expression pattern so that it strips out any characters that should not be included in an input string.
using System;
using System.Text.RegularExpressions;
public class Example
{
static string CleanInput(string strIn)
{
// Replace invalid characters with empty strings.
try {
return Regex.Replace(strIn, @"[^\w\.@-]", "",
RegexOptions.None, TimeSpan.FromSeconds(1.5));
}
// If we timeout when replacing invalid characters,
// we should return Empty.
catch (RegexMatchTimeoutException) {
return String.Empty;
}
}
}
Imports System.Text.RegularExpressions
Module Example
Function CleanInput(strIn As String) As String
' Replace invalid characters with empty strings.
Try
Return Regex.Replace(strIn, "[^\w\.@-]", "")
' If we timeout when replacing invalid characters,
' we should return String.Empty.
Catch e As RegexMatchTimeoutException
Return String.Empty
End Try
End Function
End Module
The regular expression pattern [^\w\.@-]
matches any character that is not a word character, a period, an @ symbol, or a hyphen. A word character is any letter, decimal digit, or punctuation connector such as an underscore. Any character that matches this pattern is replaced by String.Empty, which is the string defined by the replacement pattern. To allow additional characters in user input, add those characters to the character class in the regular expression pattern. For example, the regular expression pattern [^\w\.@-\\%]
also allows a percentage symbol and a backslash in an input string.