Encoding.GetMaxByteCount(Int32) Method
Definition
When overridden in a derived class, calculates the maximum number of bytes produced by encoding the specified number of characters.
public:
abstract int GetMaxByteCount(int charCount);
public abstract int GetMaxByteCount (int charCount);
abstract member GetMaxByteCount : int -> int
Public MustOverride Function GetMaxByteCount (charCount As Integer) As Integer
Parameters
- charCount
- Int32
The number of characters to encode.
Returns
The maximum number of bytes produced by encoding the specified number of characters.
Exceptions
ArgumentOutOfRangeException
charCount is less than zero.
EncoderFallbackException
A fallback occurred (for more information, see Character Encoding in .NET).
-and-
EncoderFallback is set to EncoderExceptionFallback.
Examples
The following example determines the number of bytes required to encode a character array, encodes the characters, and displays the resulting bytes.
using namespace System;
using namespace System::Text;
void PrintCountsAndBytes( array<Char>^chars, Encoding^ enc );
void PrintHexBytes( array<Byte>^bytes );
int main()
{
    // The characters to encode:
    //    Latin Small Letter Z (U+007A)
    //    Latin Small Letter A (U+0061)
    //    Combining Breve (U+0306)
    //    Latin Small Letter AE With Acute (U+01FD)
    //    Greek Small Letter Beta (U+03B2)
    //    a high-surrogate value (U+D8FF)
    //    a low-surrogate value (U+DCFF)
    array<Char>^myChars = gcnew array<Char>{
        L'z', L'a', L'\u0306', L'\u01FD', L'\u03B2', L'\xD8FF', L'\xDCFF'
    };

    // Get different encodings.
    Encoding^ u7 = Encoding::UTF7;
    Encoding^ u8 = Encoding::UTF8;
    Encoding^ u16LE = Encoding::Unicode;
    Encoding^ u16BE = Encoding::BigEndianUnicode;
    Encoding^ u32 = Encoding::UTF32;

    // Encode the entire array, and print out the counts and the resulting bytes.
    PrintCountsAndBytes( myChars, u7 );
    PrintCountsAndBytes( myChars, u8 );
    PrintCountsAndBytes( myChars, u16LE );
    PrintCountsAndBytes( myChars, u16BE );
    PrintCountsAndBytes( myChars, u32 );
}

void PrintCountsAndBytes( array<Char>^chars, Encoding^ enc )
{
    // Display the name of the encoding used.
    Console::Write( "{0,-30} :", enc );

    // Display the exact byte count.
    int iBC = enc->GetByteCount( chars );
    Console::Write( " {0,-3}", iBC );

    // Display the maximum byte count.
    int iMBC = enc->GetMaxByteCount( chars->Length );
    Console::Write( " {0,-3} :", iMBC );

    // Encode the array of chars.
    array<Byte>^bytes = enc->GetBytes( chars );

    // Display all the encoded bytes.
    PrintHexBytes( bytes );
}

void PrintHexBytes( array<Byte>^bytes )
{
    if ( (bytes == nullptr) || (bytes->Length == 0) )
    {
        Console::WriteLine( "<none>" );
    }
    else
    {
        for ( int i = 0; i < bytes->Length; i++ )
            Console::Write( "{0:X2} ", bytes[i] );
        Console::WriteLine();
    }
}
/*
This code produces the following output.
System.Text.UTF7Encoding : 18 23 :7A 61 2B 41 77 59 42 2F 51 4F 79 32 50 2F 63 2F 77 2D
System.Text.UTF8Encoding : 12 24 :7A 61 CC 86 C7 BD CE B2 F1 8F B3 BF
System.Text.UnicodeEncoding : 14 16 :7A 00 61 00 06 03 FD 01 B2 03 FF D8 FF DC
System.Text.UnicodeEncoding : 14 16 :00 7A 00 61 03 06 01 FD 03 B2 D8 FF DC FF
System.Text.UTF32Encoding : 24 32 :7A 00 00 00 61 00 00 00 06 03 00 00 FD 01 00 00 B2 03 00 00 FF FC 04 00
*/
using System;
using System.Text;
public class SamplesEncoding
{
    public static void Main()
    {
        // The characters to encode:
        //    Latin Small Letter Z (U+007A)
        //    Latin Small Letter A (U+0061)
        //    Combining Breve (U+0306)
        //    Latin Small Letter AE With Acute (U+01FD)
        //    Greek Small Letter Beta (U+03B2)
        //    a high-surrogate value (U+D8FF)
        //    a low-surrogate value (U+DCFF)
        char[] myChars = new char[] { 'z', 'a', '\u0306', '\u01FD', '\u03B2', '\uD8FF', '\uDCFF' };

        // Get different encodings.
        Encoding u7 = Encoding.UTF7;
        Encoding u8 = Encoding.UTF8;
        Encoding u16LE = Encoding.Unicode;
        Encoding u16BE = Encoding.BigEndianUnicode;
        Encoding u32 = Encoding.UTF32;

        // Encode the entire array, and print out the counts and the resulting bytes.
        PrintCountsAndBytes( myChars, u7 );
        PrintCountsAndBytes( myChars, u8 );
        PrintCountsAndBytes( myChars, u16LE );
        PrintCountsAndBytes( myChars, u16BE );
        PrintCountsAndBytes( myChars, u32 );
    }

    public static void PrintCountsAndBytes( char[] chars, Encoding enc )
    {
        // Display the name of the encoding used.
        Console.Write( "{0,-30} :", enc.ToString() );

        // Display the exact byte count.
        int iBC = enc.GetByteCount( chars );
        Console.Write( " {0,-3}", iBC );

        // Display the maximum byte count.
        int iMBC = enc.GetMaxByteCount( chars.Length );
        Console.Write( " {0,-3} :", iMBC );

        // Encode the array of chars.
        byte[] bytes = enc.GetBytes( chars );

        // Display all the encoded bytes.
        PrintHexBytes( bytes );
    }

    public static void PrintHexBytes( byte[] bytes )
    {
        if (( bytes == null ) || ( bytes.Length == 0 ))
        {
            Console.WriteLine( "<none>" );
        }
        else
        {
            for ( int i = 0; i < bytes.Length; i++ )
                Console.Write( "{0:X2} ", bytes[i] );
            Console.WriteLine();
        }
    }
}
/*
This code produces the following output.
System.Text.UTF7Encoding : 18 23 :7A 61 2B 41 77 59 42 2F 51 4F 79 32 50 2F 63 2F 77 2D
System.Text.UTF8Encoding : 12 24 :7A 61 CC 86 C7 BD CE B2 F1 8F B3 BF
System.Text.UnicodeEncoding : 14 16 :7A 00 61 00 06 03 FD 01 B2 03 FF D8 FF DC
System.Text.UnicodeEncoding : 14 16 :00 7A 00 61 03 06 01 FD 03 B2 D8 FF DC FF
System.Text.UTF32Encoding : 24 32 :7A 00 00 00 61 00 00 00 06 03 00 00 FD 01 00 00 B2 03 00 00 FF FC 04 00
*/
Imports System.Text
Public Class SamplesEncoding

    Public Shared Sub Main()
        ' The characters to encode:
        '    Latin Small Letter Z (U+007A)
        '    Latin Small Letter A (U+0061)
        '    Combining Breve (U+0306)
        '    Latin Small Letter AE With Acute (U+01FD)
        '    Greek Small Letter Beta (U+03B2)
        '    a high-surrogate value (U+D8FF)
        '    a low-surrogate value (U+DCFF)
        Dim myChars() As Char = {"z"c, "a"c, ChrW(&H0306), ChrW(&H01FD), ChrW(&H03B2), ChrW(&HD8FF), ChrW(&HDCFF)}

        ' Get different encodings.
        Dim u7 As Encoding = Encoding.UTF7
        Dim u8 As Encoding = Encoding.UTF8
        Dim u16LE As Encoding = Encoding.Unicode
        Dim u16BE As Encoding = Encoding.BigEndianUnicode
        Dim u32 As Encoding = Encoding.UTF32

        ' Encode the entire array, and print out the counts and the resulting bytes.
        PrintCountsAndBytes(myChars, u7)
        PrintCountsAndBytes(myChars, u8)
        PrintCountsAndBytes(myChars, u16LE)
        PrintCountsAndBytes(myChars, u16BE)
        PrintCountsAndBytes(myChars, u32)
    End Sub

    Public Shared Sub PrintCountsAndBytes(chars() As Char, enc As Encoding)
        ' Display the name of the encoding used.
        Console.Write("{0,-30} :", enc.ToString())

        ' Display the exact byte count.
        Dim iBC As Integer = enc.GetByteCount(chars)
        Console.Write(" {0,-3}", iBC)

        ' Display the maximum byte count.
        Dim iMBC As Integer = enc.GetMaxByteCount(chars.Length)
        Console.Write(" {0,-3} :", iMBC)

        ' Encode the array of chars.
        Dim bytes As Byte() = enc.GetBytes(chars)

        ' Display all the encoded bytes.
        PrintHexBytes(bytes)
    End Sub

    Public Shared Sub PrintHexBytes(bytes() As Byte)
        If bytes Is Nothing OrElse bytes.Length = 0 Then
            Console.WriteLine("<none>")
        Else
            Dim i As Integer
            For i = 0 To bytes.Length - 1
                Console.Write("{0:X2} ", bytes(i))
            Next i
            Console.WriteLine()
        End If
    End Sub
End Class
'This code produces the following output.
'
'System.Text.UTF7Encoding : 18 23 :7A 61 2B 41 77 59 42 2F 51 4F 79 32 50 2F 63 2F 77 2D
'System.Text.UTF8Encoding : 12 24 :7A 61 CC 86 C7 BD CE B2 F1 8F B3 BF
'System.Text.UnicodeEncoding : 14 16 :7A 00 61 00 06 03 FD 01 B2 03 FF D8 FF DC
'System.Text.UnicodeEncoding : 14 16 :00 7A 00 61 03 06 01 FD 03 B2 D8 FF DC FF
'System.Text.UTF32Encoding : 24 32 :7A 00 00 00 61 00 00 00 06 03 00 00 FD 01 00 00 B2 03 00 00 FF FC 04 00
Remarks
The charCount parameter actually specifies the number of Char objects that represent the Unicode characters to encode, because .NET internally uses UTF-16 to represent Unicode characters. Consequently, most Unicode characters can be represented by one Char object, but a Unicode character represented by a surrogate pair, for example, requires two Char objects.
To calculate the exact array size required by GetBytes to store the resulting bytes, you should use the GetByteCount method. To calculate the maximum array size, use the GetMaxByteCount method. The GetByteCount method generally allows allocation of less memory, while the GetMaxByteCount method generally executes faster.
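For instance, the following sketch (not part of the original sample; the class name is invented for illustration) contrasts the two approaches for UTF-8: GetByteCount must scan the actual characters, while GetMaxByteCount needs only the character count and may overallocate.
using System;
using System.Text;

class ByteCountComparison
{
    static void Main()
    {
        // Same characters as in the example above.
        string text = "za\u0306\u01FD\u03B2\uD8FF\uDCFF";
        Encoding utf8 = Encoding.UTF8;

        // Exact size: requires examining the input characters.
        byte[] exact = new byte[utf8.GetByteCount(text)];
        utf8.GetBytes(text, 0, text.Length, exact, 0);

        // Worst-case size: computed from the length alone, so it can be larger than needed.
        byte[] generous = new byte[utf8.GetMaxByteCount(text.Length)];
        int written = utf8.GetBytes(text, 0, text.Length, generous, 0);

        Console.WriteLine($"Exact buffer: {exact.Length} bytes");
        Console.WriteLine($"Worst-case buffer: {generous.Length} bytes, {written} actually used");
    }
}
For this input, the exact and worst-case sizes correspond to the 12 and 24 shown for UTF8Encoding in the sample output above.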
GetMaxByteCount retrieves a worst-case number, including the worst case for the currently selected EncoderFallback. If a fallback is chosen with a potentially large string, GetMaxByteCount retrieves large values, particularly in cases where the worst case for the encoding involves switching modes for every character. For example, this can happen for ISO-2022-JP. For more information, see the blog post "What's with Encoding.GetMaxByteCount() and Encoding.GetMaxCharCount()?".
In most cases, this method retrieves reasonable values for small strings. For large strings, you might have to choose between using very large buffers and catching errors in the rare case when a more reasonable buffer is too small. You might also want to consider a different approach using GetByteCount or Encoder.Convert.
When using GetMaxByteCount, you should allocate the output buffer based on the maximum size of the input buffer. If the output buffer is constrained in size, you might use the Convert method.
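If allocating a buffer of GetMaxByteCount(input.Length) bytes is impractical, one hedged sketch of the Encoder.Convert approach looks like the following; the class name and the fixed 8-byte buffer size are arbitrary choices for illustration.
using System;
using System.Text;

class ConvertSketch
{
    static void Main()
    {
        char[] chars = "za\u0306\u01FD\u03B2\uD8FF\uDCFF".ToCharArray();
        Encoder encoder = Encoding.UTF8.GetEncoder();

        byte[] buffer = new byte[8];   // deliberately smaller than the worst case
        int charIndex = 0;
        bool completed = false;

        while (!completed)
        {
            // Convert as much input as fits into the small output buffer.
            encoder.Convert(chars, charIndex, chars.Length - charIndex,
                            buffer, 0, buffer.Length,
                            flush: true,
                            out int charsUsed, out int bytesUsed, out completed);
            charIndex += charsUsed;

            // Process the bytes produced in this pass (here, just print them).
            for (int i = 0; i < bytesUsed; i++)
                Console.Write("{0:X2} ", buffer[i]);
        }
        Console.WriteLine();
    }
}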
Note that GetMaxByteCount considers a potential leftover surrogate from a previous encoder operation. Because of this, passing a value of 1 to the method retrieves 2 even for a single-byte encoding such as ASCII. If you need to know whether an encoding uses one byte per character, check the IsSingleByte property instead of relying on GetMaxByteCount.
Note
GetMaxByteCount(N) is not necessarily the same value as N * GetMaxByteCount(1).
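The short sketch below illustrates both points for Encoding.ASCII; the first value printed is the 2 mentioned above, and the exact figures for larger counts depend on the encoding and the current fallback, so treat them as an assumption rather than a guarantee.
using System;
using System.Text;

class MaxByteCountNote
{
    static void Main()
    {
        Encoding ascii = Encoding.ASCII;

        // Worst case for one character includes a possible leftover surrogate,
        // so this prints 2 rather than 1.
        Console.WriteLine(ascii.GetMaxByteCount(1));

        // The worst case does not scale linearly with the character count.
        Console.WriteLine(ascii.GetMaxByteCount(10));       // worst case for 10 chars
        Console.WriteLine(10 * ascii.GetMaxByteCount(1));   // naive linear estimate

        // The right way to ask whether an encoding uses one byte per character.
        Console.WriteLine(ascii.IsSingleByte);
    }
}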
Notes to Implementers
All Encoding implementations must guarantee that no buffer overflow exceptions occur if buffers are sized according to the results of this method's calculations.
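As a rough illustration of that contract, the following is a hypothetical derived encoding (the class name and mapping are invented for this sketch, and it bypasses the EncoderFallback mechanism for simplicity): because GetBytes writes at most one byte per input character, a GetMaxByteCount that returns charCount is always a safe buffer size, and no buffer overflow can occur.
using System;
using System.Text;

// Hypothetical single-byte encoding: characters at or below U+00FF map to one byte,
// everything else becomes '?'.
public class SimpleLatinEncoding : Encoding
{
    public override int GetByteCount(char[] chars, int index, int count) => count;

    public override int GetBytes(char[] chars, int charIndex, int charCount,
                                 byte[] bytes, int byteIndex)
    {
        for (int i = 0; i < charCount; i++)
        {
            char c = chars[charIndex + i];
            bytes[byteIndex + i] = c <= '\u00FF' ? (byte)c : (byte)'?';
        }
        return charCount;
    }

    public override int GetCharCount(byte[] bytes, int index, int count) => count;

    public override int GetChars(byte[] bytes, int byteIndex, int byteCount,
                                 char[] chars, int charIndex)
    {
        for (int i = 0; i < byteCount; i++)
            chars[charIndex + i] = (char)bytes[byteIndex + i];
        return byteCount;
    }

    // The reported worst case must never be smaller than what GetBytes can actually write.
    public override int GetMaxByteCount(int charCount)
    {
        if (charCount < 0) throw new ArgumentOutOfRangeException(nameof(charCount));
        return charCount;   // one byte per character is this encoding's true upper bound
    }

    public override int GetMaxCharCount(int byteCount)
    {
        if (byteCount < 0) throw new ArgumentOutOfRangeException(nameof(byteCount));
        return byteCount;
    }
}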