UTF8Encoding.GetEncoder Method
Definition
Obtains an encoder that converts a sequence of Unicode characters into a UTF-8 encoded sequence of bytes.
public:
override System::Text::Encoder ^ GetEncoder();
public override System.Text.Encoder GetEncoder ();
override this.GetEncoder : unit -> System.Text.Encoder
Public Overrides Function GetEncoder () As Encoder
Returns
An Encoder that converts a sequence of Unicode characters into a UTF-8 encoded sequence of bytes.
Examples
The following example uses the GetEncoder method to obtain an encoder to convert a sequence of characters into a UTF-8 encoded sequence of bytes.
using namespace System;
using namespace System::Text;
using namespace System::Collections;
int main()
{
    array<Char>^chars = {'a','b','c',L'\u0300',L'\ua0a0'};
    array<Byte>^bytes;

    // Obtain a stateful encoder from the shared UTF-8 encoding.
    Encoder^ utf8Encoder = Encoding::UTF8->GetEncoder();

    // Size the buffer for the three characters starting at index 2.
    int byteCount = utf8Encoder->GetByteCount( chars, 2, 3, true );
    bytes = gcnew array<Byte>(byteCount);

    // Encode the same range; flush = true clears the encoder state afterward.
    int bytesEncodedCount = utf8Encoder->GetBytes( chars, 2, 3, bytes, 0, true );
    Console::WriteLine( "{0} bytes used to encode characters.", bytesEncodedCount );
    Console::Write( "Encoded bytes: " );
    IEnumerator^ myEnum = bytes->GetEnumerator();
    while ( myEnum->MoveNext() )
    {
        Byte b = safe_cast<Byte>(myEnum->Current);
        Console::Write( "[{0}]", b );
    }
    Console::WriteLine();
}
using System;
using System.Text;
class UTF8EncodingExample {
    public static void Main() {
        Char[] chars = new Char[] {'a', 'b', 'c', '\u0300', '\ua0a0'};
        Byte[] bytes;

        // Obtain a stateful encoder from the shared UTF-8 encoding.
        Encoder utf8Encoder = Encoding.UTF8.GetEncoder();

        // Size the buffer for the three characters starting at index 2.
        int byteCount = utf8Encoder.GetByteCount(chars, 2, 3, true);
        bytes = new Byte[byteCount];

        // Encode the same range; flush = true clears the encoder state afterward.
        int bytesEncodedCount = utf8Encoder.GetBytes(chars, 2, 3, bytes, 0, true);
        Console.WriteLine(
            "{0} bytes used to encode characters.", bytesEncodedCount
        );
        Console.Write("Encoded bytes: ");
        foreach (Byte b in bytes) {
            Console.Write("[{0}]", b);
        }
        Console.WriteLine();
    }
}
Imports System
Imports System.Text
Imports Microsoft.VisualBasic.Strings
Class UTF8EncodingExample
    Public Shared Sub Main()
        'Characters:
        ' ChrW(97) = a
        ' ChrW(98) = b
        ' ChrW(99) = c
        ' ChrW(768) = U+0300, combining grave accent
        ' ChrW(41120) = U+A0A0, a code point in the Yi Syllables block
        Dim chars() As Char = {ChrW(97), ChrW(98), ChrW(99), ChrW(768), ChrW(41120)}
        Dim bytes() As Byte

        ' Obtain a stateful encoder from the shared UTF-8 encoding.
        Dim utf8Encoder As Encoder = Encoding.UTF8.GetEncoder()

        ' Size the buffer for the three characters starting at index 2.
        Dim byteCount As Integer = utf8Encoder.GetByteCount(chars, 2, 3, True)
        bytes = New Byte(byteCount - 1) {}

        ' Encode the same range; flush = True clears the encoder state afterward.
        Dim bytesEncodedCount As Integer = utf8Encoder.GetBytes( _
            chars, 2, 3, bytes, 0, True _
        )
        Console.WriteLine("{0} bytes used to encode characters.", bytesEncodedCount)
        Console.Write("Encoded bytes: ")
        For Each b As Byte In bytes
            Console.Write("[{0}]", b)
        Next b
        Console.WriteLine()
    End Sub
End Class
Remarks
The Encoder.GetBytes method converts sequential blocks of characters into sequential blocks of bytes, in a manner similar to the GetBytes method of this class. However, an Encoder maintains state information between calls so it can correctly encode character sequences that span blocks. The Encoder also preserves trailing characters at the end of data blocks and uses them in the next encoding operation. For example, a data block might end with an unmatched high surrogate, and the matching low surrogate might be in the next data block. Therefore, GetDecoder and GetEncoder are useful for network transmission and file operations, because those operations often deal with blocks of data instead of a complete data stream.
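The block-spanning behavior described above can be illustrated with a minimal sketch. The class name SplitSurrogateExample, the two-block split, and the 16-byte buffer are assumptions made for illustration only; the sketch assumes the default replacement fallback of Encoding.UTF8 and is not taken from the original samples.

```csharp
using System;
using System.Text;

class SplitSurrogateExample
{
    static void Main()
    {
        // U+1F600 is represented in UTF-16 by the surrogate pair D83D DE00.
        // The pair is deliberately split across two blocks (illustrative data).
        char[] block1 = { 'a', '\uD83D' };  // ends with an unmatched high surrogate
        char[] block2 = { '\uDE00', 'b' };  // begins with the matching low surrogate

        Encoder encoder = Encoding.UTF8.GetEncoder();
        byte[] buffer = new byte[16];       // assumed large enough for this input

        // flush = false: the trailing high surrogate is kept in the encoder's state.
        int n1 = encoder.GetBytes(block1, 0, block1.Length, buffer, 0, false);

        // flush = true on the final block: the pair is completed and the state cleared.
        int n2 = encoder.GetBytes(block2, 0, block2.Length, buffer, n1, true);

        Console.WriteLine("Block 1: {0} byte(s); block 2: {1} byte(s).", n1, n2);
        Console.WriteLine(BitConverter.ToString(buffer, 0, n1 + n2));
    }
}
```

Under these assumptions, the first call should write only the byte for 'a', and the second call should write the four-byte UTF-8 sequence for U+1F600 followed by the byte for 'b'. Passing flush = true on every block instead would treat the trailing high surrogate as invalid rather than carrying it forward.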
If error detection is enabled, that is, the throwOnInvalidBytes parameter of the constructor is set to true, error detection is also enabled in the Encoder returned by this method. If error detection is enabled and an invalid sequence is encountered, the state of the encoder is undefined and processing must stop.
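As a hedged illustration of error detection, the following sketch creates a UTF8Encoding with its second constructor argument set to true and feeds the resulting encoder a lone high surrogate. The class name StrictEncoderExample and the input data are hypothetical; the sketch assumes the invalid sequence surfaces as an EncoderFallbackException.

```csharp
using System;
using System.Text;

class StrictEncoderExample
{
    static void Main()
    {
        // Second constructor argument = true enables error detection.
        UTF8Encoding strictUtf8 = new UTF8Encoding(false, true);
        Encoder encoder = strictUtf8.GetEncoder();

        // Illustrative input: a high surrogate with no matching low surrogate.
        char[] chars = { 'a', '\uD800' };
        byte[] bytes = new byte[8];

        try
        {
            // flush = true forces the unmatched surrogate to be handled now.
            encoder.GetBytes(chars, 0, chars.Length, bytes, 0, true);
            Console.WriteLine("No exception thrown.");
        }
        catch (EncoderFallbackException e)
        {
            // The encoder's state is undefined at this point, so it is
            // discarded here rather than reused.
            Console.WriteLine("Invalid sequence detected: {0}", e.Message);
        }
    }
}
```

After the exception, the sketch stops using the encoder altogether; obtaining a fresh instance from GetEncoder is one way to continue with new input.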