Character Matching (Scripting) 

The period (.) matches any single printing or non-printing character in a string, except a newline character (\n). The following JScript regular expression matches 'aac', 'abc', 'acc', 'adc', and so on, as well as 'a1c', 'a2c', 'a-c', and 'a#c':

Example

/a.c/

The equivalent VBScript regular expression is:

"a.c"

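The behavior of the period can be checked with a minimal sketch. It assumes standard JavaScript RegExp behaves like JScript for this pattern, and it uses console.log for output (an assumption; under Windows Script Host, WScript.Echo would take its place).

```javascript
// Sketch: test /a.c/ against a few candidate strings.
var pattern = /a.c/;

var hits = [
  pattern.test("abc"),   // true: '.' matches 'b'
  pattern.test("a1c"),   // true: '.' matches a digit
  pattern.test("a#c"),   // true: '.' matches punctuation
  pattern.test("a\tc"),  // true: a tab is a non-printing character
  pattern.test("a\nc"),  // false: '.' does not match a newline
  pattern.test("ac")     // false: '.' must consume exactly one character
];

console.log(hits.join(", "));  // true, true, true, true, false, false
```
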
If you are trying to match a string containing a file name where a period (.) is part of the input string, you do so by preceding the period in the regular expression with a backslash (\) character. To illustrate, the following JScript regular expression matches 'filename.ext':

/filename\.ext/

For VBScript, the equivalent expression appears as follows:

"filename\.ext"

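The difference between the escaped and unescaped period shows up when the input has some other character where the period should be. The following is a sketch under the assumption that standard JavaScript RegExp matches JScript here; the sample strings are made up for illustration.

```javascript
// Sketch: escaped vs. unescaped period (console.log stands in for WScript.Echo).
var escaped = /filename\.ext/;    // '\.' matches only a literal period
var unescaped = /filename.ext/;   // '.' matches any character except newline

// When the pattern is built from a string, the backslash must be doubled
// so that the regular expression engine still sees '\.'.
var fromString = new RegExp("filename\\.ext");

var results = [
  escaped.test("filename.ext"),     // true
  escaped.test("filenameXext"),     // false: 'X' is not a period
  unescaped.test("filename.ext"),   // true
  unescaped.test("filenameXext"),   // true: '.' accepted 'X'
  fromString.test("filenameXext")   // false: same pattern as 'escaped'
];

console.log(results.join(", "));  // true, false, true, true, false
```
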
These expressions are still limited: each position matches either one fixed character or any single character at all. Often it is more useful to match one character from a specified list. For example, if you have an input text that contains chapter headings that are expressed numerically as Chapter 1, Chapter 2, and so on, you might want to find those chapter headings.

Bracket Expressions

You can create a list of matching characters by placing one or more individual characters within square brackets ([ and ]). When characters are enclosed in brackets, the list is called a bracket expression. Within brackets, as anywhere else, ordinary characters represent themselves, that is, they match an occurrence of themselves in the input text. Most special characters lose their meaning when they occur inside a bracket expression. Here are some exceptions:

  • The ']' character ends a list if it is not the first item. To match the ']' character in a list, place it first, immediately following the opening '['.

  • The '\' character continues to be the escape character. To match the '\' character, use '\\'.
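
Two of the rules above can be illustrated with a short sketch: special characters such as '.', '*', and '+' are ordinary inside brackets, while the backslash still escapes. The sample strings are illustrative, and standard JavaScript RegExp is assumed to behave like JScript for these patterns.

```javascript
// Sketch: special characters inside a bracket expression.
var literalSpecials = /[.*+]/;  // matches a literal '.', '*', or '+'
var backslash = /[\\]/;         // '\\' inside brackets matches one backslash

var results = [
  literalSpecials.test("a.c"),  // true: the input contains a literal '.'
  literalSpecials.test("abc"),  // false: '.' in brackets is not a wildcard
  backslash.test("C:\\temp"),   // true: the string holds one backslash
  backslash.test("C:/temp")     // false: no backslash in the input
];

console.log(results.join(", "));  // true, false, true, false
```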

Characters enclosed in a bracket expression match only a single character for the position in the regular expression where the bracket expression appears. The following JScript regular expression matches 'Chapter 1', 'Chapter 2', 'Chapter 3', 'Chapter 4', and 'Chapter 5':

/Chapter [12345]/

To match those same chapter headings in VBScript, use the following:

"Chapter [12345]"

Notice that the word 'Chapter' and the space that follows are fixed in position relative to the characters within brackets. The bracket expression, then, specifies only the set of characters that can match the single character position immediately following the word 'Chapter' and a space. That is the ninth character position.
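
The single-position behavior can be seen in a small sketch. The input strings are hypothetical, and standard JavaScript RegExp is assumed to behave like JScript here. Note that 'Chapter 12' still matches, because the pattern is satisfied by 'Chapter 1' and says nothing about what follows.

```javascript
// Sketch: the bracket expression fills exactly one character position.
var chapter = /Chapter [12345]/;

var match = chapter.exec("See Chapter 3 for details");
var matchedText = match[0];                  // "Chapter 3"
var ninthChar = matchedText.charAt(8);       // "3" (index 8 is the ninth character)

var accepts7 = chapter.test("Chapter 7");    // false: '7' is not in the list
var accepts12 = chapter.test("Chapter 12");  // true: the match is "Chapter 1"

console.log(matchedText + " | " + ninthChar + " | " + accepts7 + " | " + accepts12);
// Chapter 3 | 3 | false | true
```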

If you want to express the matching characters using a range instead of the characters themselves, you can separate the beginning and ending characters in the range using the hyphen (-) character. The character value of the individual characters determines their relative order within a range. The following JScript regular expression contains a range expression that is equivalent to the bracketed list shown above.

/Chapter [1-5]/

The same expression for VBScript appears as follows:

"Chapter [1-5]"

When a range is specified in this manner, both the starting and ending values are included in the range. The starting value must precede the ending value in Unicode sort order; that is, it must have the lower character value.
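
A brief sketch can confirm that the range and the explicit list accept the same digits, and that a reversed range is rejected. It assumes standard JavaScript RegExp, where a range whose start follows its end raises an error when the pattern is compiled; the variable names are illustrative.

```javascript
// Sketch: [1-5] versus [12345], plus a reversed range.
var listed = /Chapter [12345]/;
var ranged = /Chapter [1-5]/;

var agree = true;   // stays true if both patterns give the same answer
var matched = [];   // digits accepted by the range form
for (var n = 0; n <= 9; n++) {
  var text = "Chapter " + n;
  if (listed.test(text) !== ranged.test(text)) { agree = false; }
  if (ranged.test(text)) { matched.push(n); }
}

// The starting value must precede the ending value, so [5-1] is invalid.
var reversedRejected = false;
try {
  new RegExp("Chapter [5-1]");
} catch (e) {
  reversedRejected = true;
}

console.log(agree, matched.join(","), reversedRejected);
```

Both endpoints, 1 and 5, appear in the accepted digits, which reflects the inclusive nature of the range.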

If you want to include the hyphen character in your bracket expression, you must do one of the following:

  • Escape it with a backslash:

    [\-]
    
  • Put the hyphen character at the beginning or the end of the bracketed list. The following expressions match all lowercase letters and the hyphen:

    [-a-z]
    
    [a-z-]
    
  • Create a range where the beginning character value is lower than the hyphen character and the ending character value is equal to or greater than the hyphen. Both of the following regular expressions satisfy this requirement:

    [!--]
    
    [!-~]
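
The three approaches above can be compared in a short sketch, assuming standard JavaScript RegExp treats these bracket expressions as JScript does. All five patterns match a hyphen, but the range-based forms also match every other character in the range, which may or may not be what you want.

```javascript
// Sketch: ways to include a literal hyphen in a bracket expression.
var forms = [/[\-]/, /[-a-z]/, /[a-z-]/, /[!--]/, /[!-~]/];

var allMatchHyphen = true;
for (var i = 0; i < forms.length; i++) {
  if (!forms[i].test("-")) { allMatchHyphen = false; }
}

// Side effect of the range approach: '+' lies between '!' and '-',
// so [!--] accepts it, whereas [-a-z] does not.
var rangeAcceptsPlus = /[!--]/.test("+");  // true
var edgeAcceptsPlus = /[-a-z]/.test("+");  // false

console.log(allMatchHyphen, rangeAcceptsPlus, edgeAcceptsPlus);
```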
    

You can also find all the characters not in the list or range by placing the caret (^) character at the beginning of the list. If the caret character appears in any other position within the list, it matches itself, that is, it has no special meaning. The following JScript regular expression matches 'Chapter ' followed by any single character other than 1, 2, 3, 4, or 5, which includes chapter headings with numbers greater than 5:

/Chapter [^12345]/

For VBScript use:

"Chapter [^12345]"

In the examples shown above, the expression matches any character in the ninth position except 1, 2, 3, 4, or 5. So, for example, 'Chapter 7' is a match and so is 'Chapter 9'. Because the negated list is not restricted to digits, 'Chapter 0' and 'Chapter A' match as well.
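
A negated list accepts any character that is not listed, but it still requires a character to be present. The sketch below, which assumes standard JavaScript RegExp behaves like JScript here, also shows a caret that is not in first position matching itself. The test strings are illustrative.

```javascript
// Sketch: negated bracket expression and a literal caret.
var notOneToFive = /Chapter [^12345]/;

var results = [
  notOneToFive.test("Chapter 7"),  // true: '7' is not in the list
  notOneToFive.test("Chapter 3"),  // false: '3' is excluded
  notOneToFive.test("Chapter 0"),  // true: '0' is not in the list either
  notOneToFive.test("Chapter X"),  // true: non-digits are accepted too
  notOneToFive.test("Chapter ")    // false: there is no ninth character
];

// A caret that is not the first item in the list is an ordinary character.
var caretLiteral = /[1^]/;
var matchesCaret = caretLiteral.test("^");  // true
var matchesTwo = caretLiteral.test("2");    // false: the list is not negated

console.log(results.join(", "), matchesCaret, matchesTwo);
```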

The same expressions can be written with a range, using the hyphen character (-). For JScript:

/Chapter [^1-5]/

or for VBScript:

"Chapter [^1-5]"

A typical use of a bracket expression is to specify matches of any upper- or lowercase alphabetic characters or any digits. The following JScript expression specifies such a match:

/[A-Za-z0-9]/

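As a rough illustration of the JScript form, the sketch below locates the first letter or digit in a string and checks a few characters that fall outside the three ranges. It assumes standard JavaScript RegExp behaves like JScript for this pattern; the inputs are made up.

```javascript
// Sketch: match one ASCII letter or digit.
var alnum = /[A-Za-z0-9]/;

var found = alnum.exec("--- 42");
var firstChar = found[0];     // "4": the first character in any of the ranges
var position = found.index;   // 4: zero-based offset of that character

var results = [
  alnum.test("x"),       // true: lowercase letter
  alnum.test("_"),       // false: underscore is in none of the ranges
  alnum.test("\u00E9"),  // false: an accented letter lies outside A-Z and a-z
  alnum.test("---")      // false: no letter or digit anywhere
];

console.log(firstChar, position, results.join(", "));
```
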
The equivalent expression for VBScript is:

"[A-Za-z0-9]"