mdq.NGrams (Transact-SQL)
Outputs a stream of tokens based on an input string. This function is in the mdq schema and is available only in the Master Data Services database.
Transact-SQL Syntax Conventions
Syntax
mdq.NGrams(input,n,padSpace)
Arguments
input
Is the input string to create tokens from. input is nvarchar(4000) with no default.
n
Specifies the length of each token. n is tinyint with a default value of 3. Valid values are 1 through 255.
padSpace
Specifies whether to left-pad and right-pad the input. padSpace is bit with a default value of 0. A value of 0 pads the beginning and end of the input with characters. A value of 1 pads the beginning and end of the input with space characters.
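As a minimal sketch of how the three arguments fit together, the call below requests two-character tokens with space padding. It assumes the MDM_Sample database used elsewhere on this page; the input string is only illustrative, and no output is shown because the exact tokens depend on the padding behavior described above.

```sql
USE MDM_Sample;
GO
-- input = N'Northwind', n = 2 (bigrams), padSpace = 1 (pad with spaces).
-- Ordering by Sequence returns the tokens in stream order.
SELECT Sequence, Token
FROM mdq.NGrams(N'Northwind', 2, 1)
ORDER BY Sequence;
```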
Table Returned
Column name | Column type | Description |
---|---|---|
Sequence | int | Is the sequence of the tokens in the result stream. |
Token | nvarchar(255) | Is a single token of the specified length. |
Remarks
The result is a stream of tokens, also known as a set of n-grams, in the length specified by n. n-grams can be used to compare strings and determine approximate matches between those strings.
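One way to use n-grams for approximate matching is to count the tokens that two strings have in common. The following is a rough sketch, not a built-in matching method: the second string (N'Northwest') is a hypothetical comparison value, and the shared-token count is only a simple similarity signal that you would normally scale by the total number of tokens.

```sql
USE MDM_Sample;
GO
-- Count the distinct trigrams that appear in both strings.
-- INTERSECT returns each shared token once, so repeated tokens are not double-counted.
SELECT COUNT(*) AS SharedTrigrams
FROM (
    SELECT Token FROM mdq.NGrams(N'Northwind', 3, 0)
    INTERSECT
    SELECT Token FROM mdq.NGrams(N'Northwest', 3, 0)  -- hypothetical second string
) AS shared;
```

A higher count suggests the strings overlap more; strings with no trigrams in common return 0.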
Permissions
This function is available to the public role.
Examples
The following example splits the input string into a stream of trigrams (tokens that are three characters in length).
USE MDM_Sample;
GO
SELECT * FROM mdq.NGrams(N'Northwind', 3, 0);