StringInfo.ParseCombiningCharacters(String) 方法
定義
重要
部分資訊涉及發行前產品,在發行之前可能會有大幅修改。 Microsoft 對此處提供的資訊,不做任何明確或隱含的瑕疵擔保。
傳回所指定字串內各個基底字元、高 Surrogate 或控制字元的索引。
public:
static cli::array <int> ^ ParseCombiningCharacters(System::String ^ str);
public static int[] ParseCombiningCharacters (string str);
static member ParseCombiningCharacters : string -> int[]
Public Shared Function ParseCombiningCharacters (str As String) As Integer()
參數
- str
- String
要搜尋的字串。
傳回
整數陣列,包含所指定字串內各個基底字元、高 Surrogate 或控制字元的以零起始的索引。
例外狀況
str
為 null
。
範例
下列程式碼範例示範如何呼叫 ParseCombiningCharacters 方法。 此程式代碼範例是針對 類別提供的較大範例的 StringInfo 一部分。
using namespace System;
using namespace System::Text;
using namespace System::Globalization;
// Show how to enumerate each real character (honoring surrogates)
// in a string.
void EnumTextElements(String^ combiningChars)
{
// This StringBuilder holds the output results.
StringBuilder^ sb = gcnew StringBuilder();
// Use the enumerator returned from GetTextElementEnumerator
// method to examine each real character.
TextElementEnumerator^ charEnum =
StringInfo::GetTextElementEnumerator(combiningChars);
while (charEnum->MoveNext())
{
sb->AppendFormat("Character at index {0} is '{1}'{2}",
charEnum->ElementIndex, charEnum->GetTextElement(),
Environment::NewLine);
}
// Show the results.
Console::WriteLine("Result of GetTextElementEnumerator:");
Console::WriteLine(sb);
}
// Show how to discover the index of each real character
// (honoring surrogates) in a string.
void EnumTextElementIndexes(String^ combiningChars)
{
// This StringBuilder holds the output results.
StringBuilder^ sb = gcnew StringBuilder();
// Use the ParseCombiningCharacters method to
// get the index of each real character in the string.
array <int>^ textElemIndex =
StringInfo::ParseCombiningCharacters(combiningChars);
// Iterate through each real character showing the character
// and the index where it was found.
for (int i = 0; i < textElemIndex->Length; i++)
{
sb->AppendFormat("Character {0} starts at index {1}{2}",
i, textElemIndex[i], Environment::NewLine);
}
// Show the results.
Console::WriteLine("Result of ParseCombiningCharacters:");
Console::WriteLine(sb);
}
int main()
{
// The string below contains combining characters.
String^ combiningChars = L"a\u0304\u0308bc\u0327";
// Show each 'character' in the string.
EnumTextElements(combiningChars);
// Show the index in the string where each 'character' starts.
EnumTextElementIndexes(combiningChars);
};
// This code produces the following output.
//
// Result of GetTextElementEnumerator:
// Character at index 0 is 'a-"'
// Character at index 3 is 'b'
// Character at index 4 is 'c,'
//
// Result of ParseCombiningCharacters:
// Character 0 starts at index 0
// Character 1 starts at index 3
// Character 2 starts at index 4
using System;
using System.Text;
using System.Globalization;
public sealed class App {
static void Main() {
// The string below contains combining characters.
String s = "a\u0304\u0308bc\u0327";
// Show each 'character' in the string.
EnumTextElements(s);
// Show the index in the string where each 'character' starts.
EnumTextElementIndexes(s);
}
// Show how to enumerate each real character (honoring surrogates) in a string.
static void EnumTextElements(String s) {
// This StringBuilder holds the output results.
StringBuilder sb = new StringBuilder();
// Use the enumerator returned from GetTextElementEnumerator
// method to examine each real character.
TextElementEnumerator charEnum = StringInfo.GetTextElementEnumerator(s);
while (charEnum.MoveNext()) {
sb.AppendFormat(
"Character at index {0} is '{1}'{2}",
charEnum.ElementIndex, charEnum.GetTextElement(),
Environment.NewLine);
}
// Show the results.
Console.WriteLine("Result of GetTextElementEnumerator:");
Console.WriteLine(sb);
}
// Show how to discover the index of each real character (honoring surrogates) in a string.
static void EnumTextElementIndexes(String s) {
// This StringBuilder holds the output results.
StringBuilder sb = new StringBuilder();
// Use the ParseCombiningCharacters method to
// get the index of each real character in the string.
Int32[] textElemIndex = StringInfo.ParseCombiningCharacters(s);
// Iterate through each real character showing the character and the index where it was found.
for (Int32 i = 0; i < textElemIndex.Length; i++) {
sb.AppendFormat(
"Character {0} starts at index {1}{2}",
i, textElemIndex[i], Environment.NewLine);
}
// Show the results.
Console.WriteLine("Result of ParseCombiningCharacters:");
Console.WriteLine(sb);
}
}
// This code produces the following output:
//
// Result of GetTextElementEnumerator:
// Character at index 0 is 'ā̈'
// Character at index 3 is 'b'
// Character at index 4 is 'ç'
//
// Result of ParseCombiningCharacters:
// Character 0 starts at index 0
// Character 1 starts at index 3
// Character 2 starts at index 4
Imports System.Text
Imports System.Globalization
Public Module Example
Public Sub Main()
' The string below contains combining characters.
Dim s As String = "a" + ChrW(&h0304) + ChrW(&h0308) + "bc" + ChrW(&h0327)
' Show each 'character' in the string.
EnumTextElements(s)
' Show the index in the string where each 'character' starts.
EnumTextElementIndexes(s)
End Sub
' Show how to enumerate each real character (honoring surrogates) in a string.
Sub EnumTextElements(s As String)
' This StringBuilder holds the output results.
Dim sb As New StringBuilder()
' Use the enumerator returned from GetTextElementEnumerator
' method to examine each real character.
Dim charEnum As TextElementEnumerator = StringInfo.GetTextElementEnumerator(s)
Do While charEnum.MoveNext()
sb.AppendFormat("Character at index {0} is '{1}'{2}",
charEnum.ElementIndex,
charEnum.GetTextElement(),
Environment.NewLine)
Loop
' Show the results.
Console.WriteLine("Result of GetTextElementEnumerator:")
Console.WriteLine(sb)
End Sub
' Show how to discover the index of each real character (honoring surrogates) in a string.
Sub EnumTextElementIndexes(s As String)
' This StringBuilder holds the output results.
Dim sb As New StringBuilder()
' Use the ParseCombiningCharacters method to
' get the index of each real character in the string.
Dim textElemIndex() As Integer = StringInfo.ParseCombiningCharacters(s)
' Iterate through each real character showing the character and the index where it was found.
For i As Int32 = 0 To textElemIndex.Length - 1
sb.AppendFormat("Character {0} starts at index {1}{2}",
i, textElemIndex(i), Environment.NewLine)
Next
' Show the results.
Console.WriteLine("Result of ParseCombiningCharacters:")
Console.WriteLine(sb)
End Sub
End Module
' The example displays the following output:
'
' Result of GetTextElementEnumerator:
' Character at index 0 is 'ā̈'
' Character at index 3 is 'b'
' Character at index 4 is 'ç'
'
' Result of ParseCombiningCharacters:
' Character 0 starts at index 0
' Character 1 starts at index 3
' Character 2 starts at index 4
備註
Unicode Standard 會將 Surrogate 配對定義為單一抽象字元的編碼字元表示法,其中包含兩個程式碼單位的序列,其中配對的第一個單位是高 Surrogate,而第二個則是低 Surrogate。 高 Surrogate 是 U+D800 到 U+DBFF 範圍中的 Unicode 字碼點,低 Surrogate 是 U+DC00 到 U+DFFF 範圍中的 Unicode 字碼點。
控制字元是 Unicode 值為 U+007F 或 U+0000 到 U+001F 範圍或 U+0080 到 U+009F 的字元。
.NET 會將文字元素定義為顯示為單一字元的文字單位,也就是 grapheme。 文字元素可以是基底字元、Surrogate 配對或合併字元序列。 Unicode 標準會將結合字元序列定義為基底字元和一或多個結合字元的組合。 Surrogate 配對可以代表基底字元或合併字元。
如果合併字元序列無效,也會傳回該序列中的每個合併字元。
產生的陣列中的每個索引都是文字項目的開頭,也就是基底字元或高 Surrogate 的索引。
每個元素的長度很容易計算為後續索引之間的差異。 陣列的長度一律會小於或等於字串的長度。 例如,假設字串串 “\u4f00\u302a\ud800\udc00\u4f01”,這個方法會傳回索引 0、2 和 4。
對等成員
從 2.0 版的 .NET Framework 開始,SubstringByTextElements方法和LengthInTextElements屬性提供方法所提供的ParseCombiningCharacters功能實作容易使用。