UnicodeEncoding Class
Microsoft Silverlight will reach end of support after October 2021.
Represents a UTF-16 encoding of Unicode characters.
Inheritance Hierarchy
System.Object
System.Text.Encoding
System.Text.UnicodeEncoding
Namespace: System.Text
Assembly: mscorlib (in mscorlib.dll)
Syntax
'Declaration
<ComVisibleAttribute(True)> _
Public Class UnicodeEncoding _
    Inherits Encoding
[ComVisibleAttribute(true)]
public class UnicodeEncoding : Encoding
The UnicodeEncoding type exposes the following members.
Constructors
Name | Description | |
---|---|---|
UnicodeEncoding() | Initializes a new instance of the UnicodeEncoding class. | |
UnicodeEncoding(Boolean, Boolean) | Initializes a new instance of the UnicodeEncoding class. Parameters specify whether to use the big-endian byte order and whether to provide a Unicode byte order mark. | |
UnicodeEncoding(Boolean, Boolean, Boolean) | Initializes a new instance of the UnicodeEncoding class. Parameters specify whether to use the big-endian byte order, whether to provide a Unicode byte order mark, and whether to throw an exception when an invalid encoding is detected. |
Properties
Name | Description | |
---|---|---|
WebName | When overridden in a derived class, gets the name registered with the Internet Assigned Numbers Authority (IANA) for the current encoding. (Inherited from Encoding.) |
Methods
Name | Description | |
---|---|---|
Clone | When overridden in a derived class, creates a shallow copy of the current Encoding object. (Inherited from Encoding.) | |
Equals | Determines whether the specified Object is equal to the current UnicodeEncoding object. (Overrides Encoding.Equals(Object).) | |
Finalize | Allows an object to try to free resources and perform other cleanup operations before the Object is reclaimed by garbage collection. (Inherited from Object.) | |
GetByteCount(Char[]) | When overridden in a derived class, calculates the number of bytes produced by encoding all the characters in the specified character array. (Inherited from Encoding.) | |
GetByteCount(String) | Calculates the number of bytes produced by encoding the characters in the specified string. (Overrides Encoding.GetByteCount(String).) | |
GetByteCount(Char[], Int32, Int32) | Calculates the number of bytes produced by encoding a set of characters from the specified character array. (Overrides Encoding.GetByteCount(Char[], Int32, Int32).) | |
GetBytes(Char[]) | When overridden in a derived class, encodes all the characters in the specified character array into a sequence of bytes. (Inherited from Encoding.) | |
GetBytes(String) | When overridden in a derived class, encodes all the characters in the specified string into a sequence of bytes. (Inherited from Encoding.) | |
GetBytes(Char[], Int32, Int32) | When overridden in a derived class, encodes a set of characters from the specified character array into a sequence of bytes. (Inherited from Encoding.) | |
GetBytes(Char*, Int32, Byte*, Int32) | Security Critical. Encodes a set of characters starting at the specified character pointer into a sequence of bytes that are stored starting at the specified byte pointer. (Overrides Encoding.GetBytes(Char*, Int32, Byte*, Int32).) | |
GetBytes(Char[], Int32, Int32, Byte[], Int32) | Encodes a set of characters from the specified character array into the specified byte array. (Overrides Encoding.GetBytes(Char[], Int32, Int32, Byte[], Int32).) | |
GetBytes(String, Int32, Int32, Byte[], Int32) | Encodes a set of characters from the specified String into the specified byte array. (Overrides Encoding.GetBytes(String, Int32, Int32, Byte[], Int32).) | |
GetCharCount(Byte[]) | When overridden in a derived class, calculates the number of characters produced by decoding all the bytes in the specified byte array. (Inherited from Encoding.) | |
GetCharCount(Byte[], Int32, Int32) | Calculates the number of characters produced by decoding a sequence of bytes from the specified byte array. (Overrides Encoding.GetCharCount(Byte[], Int32, Int32).) | |
GetChars(Byte[]) | When overridden in a derived class, decodes all the bytes in the specified byte array into a set of characters. (Inherited from Encoding.) | |
GetChars(Byte[], Int32, Int32) | When overridden in a derived class, decodes a sequence of bytes from the specified byte array into a set of characters. (Inherited from Encoding.) | |
GetChars(Byte[], Int32, Int32, Char[], Int32) | Decodes a sequence of bytes from the specified byte array into the specified character array. (Overrides Encoding.GetChars(Byte[], Int32, Int32, Char[], Int32).) | |
GetDecoder | Obtains a decoder that converts a UTF-16 encoded sequence of bytes into a sequence of Unicode characters. (Overrides Encoding.GetDecoder().) | |
GetEncoder | Obtains an encoder that converts a sequence of Unicode characters into a UTF-16 encoded sequence of bytes. (Overrides Encoding.GetEncoder().) | |
GetHashCode | Returns the hash code for the current instance. (Overrides Encoding.GetHashCode().) | |
GetMaxByteCount | Calculates the maximum number of bytes produced by encoding the specified number of characters. (Overrides Encoding.GetMaxByteCount(Int32).) | |
GetMaxCharCount | Calculates the maximum number of characters produced by decoding the specified number of bytes. (Overrides Encoding.GetMaxCharCount(Int32).) | |
GetPreamble | Returns a Unicode byte order mark encoded in UTF-16 format. (Overrides Encoding.GetPreamble().) | |
GetString | Decodes a range of bytes from a byte array into a string. (Overrides Encoding.GetString(Byte[], Int32, Int32).) | |
GetType | Gets the Type of the current instance. (Inherited from Object.) | |
MemberwiseClone | Creates a shallow copy of the current Object. (Inherited from Object.) | |
ToString | Returns a string that represents the current object. (Inherited from Object.) |
Remarks
Encoding is the process of transforming a set of Unicode characters into a sequence of bytes. Decoding is the process of transforming a sequence of encoded bytes into a set of Unicode characters. The Unicode Standard assigns a code point (a number) to each character in every supported script. A Unicode Transformation Format (UTF) is a way to encode that code point. UTF-16 encoding represents each code point as a sequence of one to two 16-bit integers.
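As a rough illustration of the "one to two 16-bit integers" rule, the sketch below compares a character from the Basic Multilingual Plane with a supplementary character that needs a surrogate pair. The class and method names (Utf16CodeUnitSketch, Utf16ByteCount, Describe) are hypothetical and chosen only for this example.

```csharp
using System;
using System.Text;

static class Utf16CodeUnitSketch
{
    // Returns the number of bytes UTF-16 needs for the given text
    // (two bytes per 16-bit code unit).
    public static int Utf16ByteCount(string text)
    {
        return new UnicodeEncoding().GetByteCount(text);
    }

    // Builds a short summary comparing a one-unit and a two-unit code point.
    public static string Describe()
    {
        string bmp = "A";                    // U+0041: one 16-bit code unit
        string supplementary = "\U0001F600"; // U+1F600: a surrogate pair (D83D DE00)

        return string.Format(
            "U+0041: {0} code unit(s), {1} bytes; U+1F600: {2} code unit(s), {3} bytes",
            bmp.Length, Utf16ByteCount(bmp),
            supplementary.Length, Utf16ByteCount(supplementary));
    }
}
```

Because String.Length counts 16-bit code units rather than code points, the supplementary character reports a length of 2.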
The encoder can use the big-endian byte order (the most significant byte first) or the little-endian byte order (the least significant byte first). For example, the Latin Capital Letter A (code point U+0041) is serialized as follows (in hexadecimal):
Big-endian byte order: 00 41
Little-endian byte order: 41 00
It is generally more efficient to store Unicode characters using the native byte order. For example, it is better to use the little-endian byte order on little-endian platforms, such as Intel computers, and big-endian byte order on big-endian platforms.
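The byte-order difference can be checked directly. The following is a minimal sketch, assuming a standard .NET runtime; ByteOrderSketch, ToHex, and NativeOrderEncoding are hypothetical names introduced for illustration. BitConverter.IsLittleEndian is used here as one possible way to pick the native byte order.

```csharp
using System;
using System.Text;

static class ByteOrderSketch
{
    // Encodes the text as UTF-16 in the requested byte order and
    // returns the bytes as a hyphen-separated hexadecimal string.
    public static string ToHex(bool bigEndian, string text)
    {
        // The second argument (byteOrderMark) only affects GetPreamble;
        // GetBytes never writes the byte order mark itself.
        var encoding = new UnicodeEncoding(bigEndian, false);
        return BitConverter.ToString(encoding.GetBytes(text));
    }

    // Picks the predefined encoding that matches the platform's byte order.
    public static Encoding NativeOrderEncoding()
    {
        return BitConverter.IsLittleEndian
            ? Encoding.Unicode
            : Encoding.BigEndianUnicode;
    }
}
```

For example, ToHex(true, "A") yields "00-41" and ToHex(false, "A") yields "41-00".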
You can instantiate a UnicodeEncoding object in any of the following ways:
By retrieving the UnicodeEncoding object returned by the Encoding.Unicode or Encoding.BigEndianUnicode property. The former returns an encoding object that uses little-endian byte order, while the latter returns one that uses big-endian byte order.
By calling the Encoding.GetEncoding method with either "utf-16" (for little-endian byte order) or "utf-16BE" (for big-endian byte order) as the value of its name parameter.
By calling one of the overloads of the UnicodeEncoding class constructor. Unlike the other ways to instantiate a UnicodeEncoding object, which return a default UnicodeEncoding object, overloads of the class constructor allow you to define the encoding's byte order, as well as whether encodings include a preamble, and whether an exception is thrown if an invalid encoding is encountered.
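The three approaches listed above might look like the following sketch. InstantiationSketch and its method names are hypothetical; the named arguments match the constructor's parameter names (bigEndian, byteOrderMark, throwOnInvalidBytes).

```csharp
using System;
using System.Text;

static class InstantiationSketch
{
    // 1. Predefined property: little-endian UTF-16 with default settings.
    public static Encoding FromProperty()
    {
        return Encoding.Unicode;
    }

    // 2. Lookup by name: "utf-16BE" selects big-endian byte order.
    public static Encoding FromName()
    {
        return Encoding.GetEncoding("utf-16BE");
    }

    // 3. Constructor: byte order, preamble, and error detection are explicit.
    public static UnicodeEncoding FromConstructor()
    {
        return new UnicodeEncoding(
            bigEndian: false,
            byteOrderMark: true,
            throwOnInvalidBytes: true);
    }
}
```

Only the constructor lets the caller turn on error detection; the other two return objects with default behavior.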
The GetByteCount method determines how many bytes result from encoding a set of Unicode characters, and the GetBytes method performs the actual encoding.
Likewise, the GetCharCount method determines how many characters result from decoding a sequence of bytes, and the GetChars and GetString methods perform the actual decoding.
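A common pattern is to call the counting method first to size the destination buffer exactly, then call the converting method to fill it. The sketch below shows that pattern in both directions; RoundTripSketch, Encode, and Decode are hypothetical names.

```csharp
using System;
using System.Text;

static class RoundTripSketch
{
    // Sizes the byte buffer with GetByteCount, then fills it with GetBytes.
    public static byte[] Encode(char[] chars)
    {
        var encoding = new UnicodeEncoding();
        byte[] bytes = new byte[encoding.GetByteCount(chars, 0, chars.Length)];
        encoding.GetBytes(chars, 0, chars.Length, bytes, 0);
        return bytes;
    }

    // Sizes the char buffer with GetCharCount, then fills it with GetChars.
    public static char[] Decode(byte[] bytes)
    {
        var encoding = new UnicodeEncoding();
        char[] chars = new char[encoding.GetCharCount(bytes, 0, bytes.Length)];
        encoding.GetChars(bytes, 0, bytes.Length, chars, 0);
        return chars;
    }
}
```

When only a rough upper bound is needed, GetMaxByteCount and GetMaxCharCount avoid scanning the input, at the cost of a possibly larger buffer.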
Optionally, the UnicodeEncoding object provides a preamble, which is an array of bytes that can be prefixed to the sequence of bytes resulting from the encoding process. If the preamble contains a byte order mark (BOM), it helps the decoder determine the byte order and the transformation format or UTF. The GetPreamble method retrieves an array of bytes that can include the BOM. For more information on byte order and the byte order mark, see The Unicode Standard at the Unicode home page.
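To make the preamble concrete, the sketch below inspects what GetPreamble returns for different constructor arguments and shows one way to prefix it manually, since GetBytes does not write the preamble on its own. PreambleSketch, PreambleHex, and EncodeWithPreamble are hypothetical names.

```csharp
using System;
using System.Text;

static class PreambleSketch
{
    // Returns the preamble as hexadecimal text; the string is empty
    // when the encoding was created with byteOrderMark set to false.
    public static string PreambleHex(bool bigEndian, bool byteOrderMark)
    {
        var encoding = new UnicodeEncoding(bigEndian, byteOrderMark);
        return BitConverter.ToString(encoding.GetPreamble());
    }

    // Concatenates the preamble and the encoded text into one array.
    public static byte[] EncodeWithPreamble(UnicodeEncoding encoding, string text)
    {
        byte[] preamble = encoding.GetPreamble();
        byte[] body = encoding.GetBytes(text);

        byte[] result = new byte[preamble.Length + body.Length];
        Buffer.BlockCopy(preamble, 0, result, 0, preamble.Length);
        Buffer.BlockCopy(body, 0, result, preamble.Length, body.Length);
        return result;
    }
}
```

The little-endian preamble is FF FE and the big-endian preamble is FE FF, which is the code point U+FEFF serialized in each byte order.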
Note: |
---|
To enable error detection and to make the class instance more secure, the application should use the UnicodeEncoding constructor that takes a throwOnInvalidBytes parameter, and set that parameter to true. With error detection, a method that detects an invalid sequence of characters or bytes throws an ArgumentException. Without error detection, no exception is thrown, and the invalid sequence is generally ignored. |
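The effect of throwOnInvalidBytes can be sketched with a deliberately malformed input: a lone high surrogate (U+D800) that is not followed by a low surrogate. ErrorDetectionSketch and ThrowsOnInvalidBytes are hypothetical names; the example only checks whether an exception is raised and makes no claim about what the decoder produces when error detection is off.

```csharp
using System;
using System.Text;

static class ErrorDetectionSketch
{
    // Returns true if decoding the malformed input raises an ArgumentException.
    public static bool ThrowsOnInvalidBytes(bool throwOnInvalidBytes)
    {
        // Little-endian bytes: U+D800 (unpaired high surrogate), then 'A'.
        byte[] invalid = { 0x00, 0xD8, 0x41, 0x00 };

        var encoding = new UnicodeEncoding(false, true, throwOnInvalidBytes);
        try
        {
            encoding.GetString(invalid, 0, invalid.Length);
            return false; // decoding completed without an exception
        }
        catch (ArgumentException)
        {
            return true;  // the invalid sequence was detected
        }
    }
}
```

Catching ArgumentException follows the wording of the note above; on the full .NET Framework the concrete exception type is a subclass of ArgumentException.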
Examples
The following example demonstrates how to encode a string of Unicode characters into a byte array, using UnicodeEncoding. The byte array is decoded into a string to demonstrate that there is no loss of data.
Imports System.Text
Imports Microsoft.VisualBasic.Strings

Class Example
    Public Shared Sub Demo(ByVal outputBlock As System.Windows.Controls.TextBlock)
        ' The encoding.
        Dim uni As New UnicodeEncoding()

        ' Create a string that contains Unicode characters.
        Dim unicodeString As String = _
            "This Unicode string contains two characters " & _
            "with codes outside the traditional ASCII code range, " & _
            "Pi (" & ChrW(928) & ") and Sigma (" & ChrW(931) & ")."
        outputBlock.Text &= "Original string:" & vbCrLf
        outputBlock.Text &= unicodeString & vbCrLf

        ' Encode the string.
        Dim encodedBytes As Byte() = uni.GetBytes(unicodeString)
        outputBlock.Text &= vbCrLf
        outputBlock.Text &= "Encoded bytes:" & vbCrLf
        Dim b As Byte
        For Each b In encodedBytes
            outputBlock.Text &= String.Format("[{0}]", b)
        Next b
        outputBlock.Text &= vbCrLf

        ' Decode bytes back to string.
        ' Notice Pi and Sigma characters are still present.
        Dim decodedString As String = uni.GetString(encodedBytes, _
            0, encodedBytes.Length)
        outputBlock.Text &= vbCrLf
        outputBlock.Text &= "Decoded bytes:" & vbCrLf
        outputBlock.Text &= decodedString & vbCrLf
    End Sub
End Class
using System;
using System.Text;

class Example
{
    public static void Demo(System.Windows.Controls.TextBlock outputBlock)
    {
        // The encoding.
        UnicodeEncoding unicode = new UnicodeEncoding();

        // Create a string that contains Unicode characters.
        String unicodeString =
            "This Unicode string contains two characters " +
            "with codes outside the traditional ASCII code range, " +
            "Pi (\u03a0) and Sigma (\u03a3).";
        outputBlock.Text += "Original string:" + "\n";
        outputBlock.Text += unicodeString + "\n";

        // Encode the string.
        Byte[] encodedBytes = unicode.GetBytes(unicodeString);
        outputBlock.Text += "\n";
        outputBlock.Text += "Encoded bytes:" + "\n";
        foreach (Byte b in encodedBytes)
        {
            outputBlock.Text += String.Format("[{0}]", b);
        }
        outputBlock.Text += "\n";

        // Decode bytes back to string.
        // Notice Pi and Sigma characters are still present.
        String decodedString = unicode.GetString(encodedBytes,
            0, encodedBytes.Length);
        outputBlock.Text += "\n";
        outputBlock.Text += "Decoded bytes:" + "\n";
        outputBlock.Text += decodedString + "\n";
    }
}
Version Information
Silverlight
Supported in: 5, 4, 3
Silverlight for Windows Phone
Supported in: Windows Phone OS 7.1, Windows Phone OS 7.0
XNA Framework
Supported in: Xbox 360, Windows Phone OS 7.0
Platforms
For a list of the operating systems and browsers that are supported by Silverlight, see Supported Operating Systems and Browsers.
Thread Safety
Any public static (Shared in Visual Basic) members of this type are thread safe. Any instance members are not guaranteed to be thread safe.