共用方式為


什麼是模糊字串比對?

適用於: SQL Server 2025(17.x)Azure SQL DatabaseAzure SQL Managed InstanceMicrosoft Fabric 中的 SQL 資料庫

使用模糊或近似字串比對來檢查兩個字元串是否相似,並計算兩個字串之間的差異。 使用這項功能來識別因字元損毀而可能不同的字串。 損壞包括拼字錯誤、字元調換、字元遺失或縮寫。 模糊字串比對會使用演算法來偵測類似的音效字串。

Note

模糊字串匹配目前正處於 SQL Server 2025(17.x)的預覽階段,需要啟用 預覽功能資料庫範圍設定

模糊字串比對可在 Azure SQL 受控執行個體中使用 SQL Server 2025始終保持最新更新原則

模糊函數

Function Description
EDIT_DISTANCE 計算將一個字串轉換成另一個字串所需的插入、刪除、替代和轉換數目。
EDIT_DISTANCE_SIMILARITY 計算相似度值,範圍從0(表示不相符)到100(表示完整相符)。
JARO_WINKLER_DISTANCE 計算兩個字串之間的編輯距離,偏好從開頭匹配的一組前綴長度的字串。
JARO_WINKLER_SIMILARITY 計算相似度值,範圍從0(表示不相符)到100(表示完整相符)。

Note

目前,這些函式並未遵循由排序設定定義的比較語意,例如大小寫不敏感性及其他特定排序規則。 一旦對排序規則的支援實現,函式的輸出將反映這些語意,並可能相應改變。

Examples

下列範例示範模糊字串比對函式。

範例表格

執行範例查詢之前,請先建立並填入範例數據表。

若要建立並填入範例數據表,請連線到非生產使用者資料庫,然後執行下列腳本:

-- Step 1: Create the table
CREATE TABLE WordPairs
(
    WordID INT IDENTITY (1, 1) PRIMARY KEY, -- Auto-incrementing ID
    WordUK NVARCHAR (50), -- UK English word
    WordUS NVARCHAR (50)  -- US English word
);

-- Step 2: Insert the data
INSERT INTO WordPairs (WordUK, WordUS)
VALUES ('Colour', 'Color'),
       ('Flavour', 'Flavor'),
       ('Centre', 'Center'),
       ('Theatre', 'Theater'),
       ('Organise', 'Organize'),
       ('Analyse', 'Analyze'),
       ('Catalogue', 'Catalog'),
       ('Programme', 'Program'),
       ('Metre', 'Meter'),
       ('Honour', 'Honor'),
       ('Neighbour', 'Neighbor'),
       ('Travelling', 'Traveling'),
       ('Grey', 'Gray'),
       ('Defence', 'Defense'),
       ('Practise', 'Practice'), -- Verb form in UK
       ('Practice', 'Practice'), -- Noun form in both
       ('Aluminium', 'Aluminum'),
       ('Cheque', 'Check'); -- Bank cheque vs. check

範例 EDIT_DISTANCE

SELECT WordUK,
       WordUS,
       EDIT_DISTANCE(WordUK, WordUS) AS Distance
FROM WordPairs
WHERE EDIT_DISTANCE(WordUK, WordUS) <= 2
ORDER BY Distance ASC;

Returns:

WordUK                         WordUS                         Distance
------------------------------ ------------------------------ -----------
Practice                       Practice                       0
Aluminium                      Aluminum                       1
Honour                         Honor                          1
Neighbour                      Neighbor                       1
Travelling                     Traveling                      1
Grey                           Gray                           1
Defence                        Defense                        1
Practise                       Practice                       1
Colour                         Color                          1
Flavour                        Flavor                         1
Organise                       Organize                       1
Analyse                        Analyze                        1
Catalogue                      Catalog                        2
Programme                      Program                        2
Metre                          Meter                          2
Centre                         Center                         2
Theatre                        Theater                        2

範例 EDIT_DISTANCE_SIMILARITY

SELECT WordUK,
       WordUS,
       EDIT_DISTANCE_SIMILARITY(WordUK, WordUS) AS Similarity
FROM WordPairs
WHERE EDIT_DISTANCE_SIMILARITY(WordUK, WordUS) >= 75
ORDER BY Similarity DESC;

Returns:

WordUK                         WordUS                         Similarity
------------------------------ ------------------------------ -----------
Practice                       Practice                       100
Travelling                     Traveling                      90
Aluminium                      Aluminum                       89
Neighbour                      Neighbor                       89
Organise                       Organize                       88
Practise                       Practice                       88
Defence                        Defense                        86
Analyse                        Analyze                        86
Flavour                        Flavor                         86
Colour                         Color                          83
Honour                         Honor                          83
Catalogue                      Catalog                        78
Programme                      Program                        78
Grey                           Gray                           75

範例 JARO_WINKLER_DISTANCE

SELECT WordUK,
       WordUS,
       JARO_WINKLER_DISTANCE(WordUK, WordUS) AS Distance
FROM WordPairs
WHERE JARO_WINKLER_DISTANCE(WordUK, WordUS) <= .05
ORDER BY Distance ASC;

Returns:

WordUK                         WordUS                         Distance
------------------------------ ------------------------------ -----------
Practice                       Practice                       0
Travelling                     Traveling                      0.02
Neighbour                      Neighbor                       0.0222222222222223
Aluminium                      Aluminum                       0.0222222222222223
Theatre                        Theater                        0.0285714285714286
Flavour                        Flavor                         0.0285714285714286
Centre                         Center                         0.0333333333333333
Colour                         Color                          0.0333333333333333
Honour                         Honor                          0.0333333333333333
Catalogue                      Catalog                        0.0444444444444444
Programme                      Program                        0.0444444444444444
Metre                          Meter                          0.0466666666666667

範例 JARO_WINKLER_SIMILARITY

SELECT WordUK,
       WordUS,
       JARO_WINKLER_SIMILARITY(WordUK, WordUS) AS Similarity
FROM WordPairs
WHERE JARO_WINKLER_SIMILARITY(WordUK, WordUS) > 90
ORDER BY Similarity DESC;

Returns:

WordUK                         WordUS                         Similarity
------------------------------ ------------------------------ -----------
Practice                       Practice                       100
Aluminium                      Aluminum                       98
Neighbour                      Neighbor                       98
Travelling                     Traveling                      98
Colour                         Color                          97
Flavour                        Flavor                         97
Centre                         Center                         97
Theatre                        Theater                        97
Honour                         Honor                          97
Catalogue                      Catalog                        96
Programme                      Program                        96
Metre                          Meter                          95
Organise                       Organize                       95
Practise                       Practice                       95
Analyse                        Analyze                        94
Defence                        Defense                        94

具有所有函式的範例查詢

下列查詢示範目前可用的所有正則表達式函式。

SELECT T.source_string,
       T.target_string,
       EDIT_DISTANCE(T.source_string, T.target_string) AS ED_Distance,
       JARO_WINKLER_DISTANCE(T.source_string, T.target_string) AS JW_Distance,
       EDIT_DISTANCE_SIMILARITY(T.source_string, T.target_string) AS ED_Similarity,
       JARO_WINKLER_SIMILARITY(T.source_string, T.target_string) AS JW_Similarity
FROM (VALUES ('Black', 'Red'),
             ('Colour', 'Yellow'),
             ('Colour', 'Color'),
             ('Microsoft', 'Msft'),
             ('Regex', 'Regex')
     ) AS T(source_string, target_string);

Returns:

source_string  target_string  ED_Distance    JW_Distance           ED_Similarity  JW_Similarity
-------------- -------------- -------------- --------------------- -------------- --------------
Black          Red            5              1                     0              0
Colour         Yellow         5              0.444444444444445     17             55
Colour         Color          1              0.0333333333333333    83             96
Microsoft      Msft           5              0.491666666666667     44             50
Regex          Regex          0              0                     100            100

收拾整理

使用完範例資料後,刪除範例資料表:

IF OBJECT_ID('dbo.WordPairs', 'U') IS NOT NULL
BEGIN
    DROP TABLE dbo.WordPairs;
END