適用於: SQL Server 2025(17.x)
Azure SQL Database
Azure SQL Managed Instance
Microsoft Fabric 中的 SQL 資料庫
使用模糊或近似字串比對來檢查兩個字元串是否相似,並計算兩個字串之間的差異。 使用這項功能來識別因字元損毀而可能不同的字串。 損壞包括拼字錯誤、字元調換、字元遺失或縮寫。 模糊字串比對會使用演算法來偵測類似的音效字串。
Note
模糊字串匹配目前正處於 SQL Server 2025(17.x)的預覽階段,需要啟用 預覽功能資料庫範圍設定。
模糊字串比對可在 Azure SQL 受控執行個體中使用 SQL Server 2025 或 始終保持最新 的 更新原則。
模糊函數
| Function | Description |
|---|---|
| EDIT_DISTANCE | 計算將一個字串轉換成另一個字串所需的插入、刪除、替代和轉換數目。 |
| EDIT_DISTANCE_SIMILARITY | 計算相似度值,範圍從0(表示不相符)到100(表示完整相符)。 |
| JARO_WINKLER_DISTANCE | 計算兩個字串之間的編輯距離,偏好從開頭匹配的一組前綴長度的字串。 |
| JARO_WINKLER_SIMILARITY | 計算相似度值,範圍從0(表示不相符)到100(表示完整相符)。 |
Note
目前,這些函式並未遵循由排序設定定義的比較語意,例如大小寫不敏感性及其他特定排序規則。 一旦對排序規則的支援實現,函式的輸出將反映這些語意,並可能相應改變。
Examples
下列範例示範模糊字串比對函式。
範例表格
執行範例查詢之前,請先建立並填入範例數據表。
若要建立並填入範例數據表,請連線到非生產使用者資料庫,然後執行下列腳本:
-- Step 1: Create the table
CREATE TABLE WordPairs
(
WordID INT IDENTITY (1, 1) PRIMARY KEY, -- Auto-incrementing ID
WordUK NVARCHAR (50), -- UK English word
WordUS NVARCHAR (50) -- US English word
);
-- Step 2: Insert the data
INSERT INTO WordPairs (WordUK, WordUS)
VALUES ('Colour', 'Color'),
('Flavour', 'Flavor'),
('Centre', 'Center'),
('Theatre', 'Theater'),
('Organise', 'Organize'),
('Analyse', 'Analyze'),
('Catalogue', 'Catalog'),
('Programme', 'Program'),
('Metre', 'Meter'),
('Honour', 'Honor'),
('Neighbour', 'Neighbor'),
('Travelling', 'Traveling'),
('Grey', 'Gray'),
('Defence', 'Defense'),
('Practise', 'Practice'), -- Verb form in UK
('Practice', 'Practice'), -- Noun form in both
('Aluminium', 'Aluminum'),
('Cheque', 'Check'); -- Bank cheque vs. check
範例 EDIT_DISTANCE
SELECT WordUK,
WordUS,
EDIT_DISTANCE(WordUK, WordUS) AS Distance
FROM WordPairs
WHERE EDIT_DISTANCE(WordUK, WordUS) <= 2
ORDER BY Distance ASC;
Returns:
WordUK WordUS Distance
------------------------------ ------------------------------ -----------
Practice Practice 0
Aluminium Aluminum 1
Honour Honor 1
Neighbour Neighbor 1
Travelling Traveling 1
Grey Gray 1
Defence Defense 1
Practise Practice 1
Colour Color 1
Flavour Flavor 1
Organise Organize 1
Analyse Analyze 1
Catalogue Catalog 2
Programme Program 2
Metre Meter 2
Centre Center 2
Theatre Theater 2
範例 EDIT_DISTANCE_SIMILARITY
SELECT WordUK,
WordUS,
EDIT_DISTANCE_SIMILARITY(WordUK, WordUS) AS Similarity
FROM WordPairs
WHERE EDIT_DISTANCE_SIMILARITY(WordUK, WordUS) >= 75
ORDER BY Similarity DESC;
Returns:
WordUK WordUS Similarity
------------------------------ ------------------------------ -----------
Practice Practice 100
Travelling Traveling 90
Aluminium Aluminum 89
Neighbour Neighbor 89
Organise Organize 88
Practise Practice 88
Defence Defense 86
Analyse Analyze 86
Flavour Flavor 86
Colour Color 83
Honour Honor 83
Catalogue Catalog 78
Programme Program 78
Grey Gray 75
範例 JARO_WINKLER_DISTANCE
SELECT WordUK,
WordUS,
JARO_WINKLER_DISTANCE(WordUK, WordUS) AS Distance
FROM WordPairs
WHERE JARO_WINKLER_DISTANCE(WordUK, WordUS) <= .05
ORDER BY Distance ASC;
Returns:
WordUK WordUS Distance
------------------------------ ------------------------------ -----------
Practice Practice 0
Travelling Traveling 0.02
Neighbour Neighbor 0.0222222222222223
Aluminium Aluminum 0.0222222222222223
Theatre Theater 0.0285714285714286
Flavour Flavor 0.0285714285714286
Centre Center 0.0333333333333333
Colour Color 0.0333333333333333
Honour Honor 0.0333333333333333
Catalogue Catalog 0.0444444444444444
Programme Program 0.0444444444444444
Metre Meter 0.0466666666666667
範例 JARO_WINKLER_SIMILARITY
SELECT WordUK,
WordUS,
JARO_WINKLER_SIMILARITY(WordUK, WordUS) AS Similarity
FROM WordPairs
WHERE JARO_WINKLER_SIMILARITY(WordUK, WordUS) > 90
ORDER BY Similarity DESC;
Returns:
WordUK WordUS Similarity
------------------------------ ------------------------------ -----------
Practice Practice 100
Aluminium Aluminum 98
Neighbour Neighbor 98
Travelling Traveling 98
Colour Color 97
Flavour Flavor 97
Centre Center 97
Theatre Theater 97
Honour Honor 97
Catalogue Catalog 96
Programme Program 96
Metre Meter 95
Organise Organize 95
Practise Practice 95
Analyse Analyze 94
Defence Defense 94
具有所有函式的範例查詢
下列查詢示範目前可用的所有正則表達式函式。
SELECT T.source_string,
T.target_string,
EDIT_DISTANCE(T.source_string, T.target_string) AS ED_Distance,
JARO_WINKLER_DISTANCE(T.source_string, T.target_string) AS JW_Distance,
EDIT_DISTANCE_SIMILARITY(T.source_string, T.target_string) AS ED_Similarity,
JARO_WINKLER_SIMILARITY(T.source_string, T.target_string) AS JW_Similarity
FROM (VALUES ('Black', 'Red'),
('Colour', 'Yellow'),
('Colour', 'Color'),
('Microsoft', 'Msft'),
('Regex', 'Regex')
) AS T(source_string, target_string);
Returns:
source_string target_string ED_Distance JW_Distance ED_Similarity JW_Similarity
-------------- -------------- -------------- --------------------- -------------- --------------
Black Red 5 1 0 0
Colour Yellow 5 0.444444444444445 17 55
Colour Color 1 0.0333333333333333 83 96
Microsoft Msft 5 0.491666666666667 44 50
Regex Regex 0 0 100 100
收拾整理
使用完範例資料後,刪除範例資料表:
IF OBJECT_ID('dbo.WordPairs', 'U') IS NOT NULL
BEGIN
DROP TABLE dbo.WordPairs;
END