UnicodeEncoding.Preamble Property
Definition
Important
Some information relates to prerelease product that may be substantially modified before it’s released. Microsoft makes no warranties, express or implied, with respect to the information provided here.
Gets a Unicode byte order mark encoded in UTF-16 format, if this object is configured to supply one.
public:
virtual property ReadOnlySpan<System::Byte> Preamble { ReadOnlySpan<System::Byte> get(); };
public override ReadOnlySpan<byte> Preamble { get; }
member this.Preamble : ReadOnlySpan<byte>
Public Overrides ReadOnly Property Preamble As ReadOnlySpan(Of Byte)
Property Value
A byte span containing the Unicode byte order mark, if this object is configured to supply one; otherwise, the default span.
Remarks
The UnicodeEncoding object can provide a preamble, which is a byte span that can be prepended to the sequence of bytes resulting from the encoding process. Prefacing a sequence of encoded bytes with a byte order mark (code point U+FEFF) helps the decoder determine the byte order and the transformation format, or UTF. The Unicode byte order mark (BOM) is serialized as follows (in hexadecimal):
Big endian byte order:
FE FF
Little endian byte order:
FF FE
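The two serializations above can be checked directly. The following is a minimal sketch, assuming a runtime where the Preamble property is available; the class name PreambleBytesExample and the ToHex helper are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Text;

public static class PreambleBytesExample
{
    // Hypothetical helper: formats a byte span as space-separated hex, e.g. "FF FE".
    public static string ToHex(ReadOnlySpan<byte> bytes) =>
        BitConverter.ToString(bytes.ToArray()).Replace("-", " ");

    public static void Main()
    {
        // Both encodings are configured to supply a BOM (byteOrderMark: true).
        var littleEndian = new UnicodeEncoding(bigEndian: false, byteOrderMark: true);
        var bigEndian = new UnicodeEncoding(bigEndian: true, byteOrderMark: true);

        Console.WriteLine("Little endian: " + ToHex(littleEndian.Preamble)); // FF FE
        Console.WriteLine("Big endian:    " + ToHex(bigEndian.Preamble));    // FE FF
    }
}
```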
You can instantiate a UnicodeEncoding object whose Preamble is a valid BOM in the following ways:
By retrieving the UnicodeEncoding object returned by the Encoding.Unicode or Encoding.BigEndianUnicode property.
By calling the parameterless UnicodeEncoding() constructor to instantiate a UnicodeEncoding object.
By supplying true as the value of the byteOrderMark argument to the UnicodeEncoding(Boolean, Boolean) or UnicodeEncoding(Boolean, Boolean, Boolean) constructors.
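As an illustration of the three options listed above, the following sketch creates an encoding each way and inspects its Preamble. For contrast, it also shows an instance created with byteOrderMark set to false, which returns the default (empty) span. The class name PreambleSourcesExample is hypothetical.

```csharp
using System;
using System.Text;

public static class PreambleSourcesExample
{
    public static void Main()
    {
        // 1. Static Encoding properties.
        Encoding fromProperty = Encoding.Unicode;
        Encoding fromBigEndianProperty = Encoding.BigEndianUnicode;

        // 2. Parameterless constructor.
        var fromDefaultCtor = new UnicodeEncoding();

        // 3. Constructor with byteOrderMark set to true.
        var fromExplicitCtor = new UnicodeEncoding(bigEndian: true, byteOrderMark: true);

        // For contrast: byteOrderMark set to false, so no BOM is supplied.
        var withoutBom = new UnicodeEncoding(bigEndian: false, byteOrderMark: false);

        Console.WriteLine(fromProperty.Preamble.Length);          // 2
        Console.WriteLine(fromBigEndianProperty.Preamble.Length); // 2
        Console.WriteLine(fromDefaultCtor.Preamble.Length);       // 2
        Console.WriteLine(fromExplicitCtor.Preamble.Length);      // 2
        Console.WriteLine(withoutBom.Preamble.IsEmpty);           // True
    }
}
```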
We recommend that you use the BOM, since it provides nearly certain identification of an encoding for files that have otherwise lost a reference to their encoding, such as untagged or improperly tagged web data, or text files stored when a business did not have international concerns. User problems can often be avoided if data is consistently and properly tagged.
For standards that provide an encoding type, a BOM is somewhat redundant. However, it can be used to help a server send the correct encoding header. Alternatively, it can be used as a fallback in case the encoding is otherwise lost.
There are some disadvantages to using a BOM. For example, knowing how to limit the database fields that use a BOM can be difficult. Concatenation of files can also be a problem: when files are merged, a BOM from a later file can end up as an unnecessary character in the middle of the data. Despite these few disadvantages, the use of a BOM is highly recommended.
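The concatenation problem can be reproduced in a few lines. The sketch below is one possible mitigation, not a prescribed approach: it strips the preamble from the second input before merging. The class name ConcatenationExample and the StripPreamble and Concat helpers are hypothetical, and the byte arrays stand in for file contents.

```csharp
using System;
using System.Text;

public static class ConcatenationExample
{
    // Hypothetical helper: returns data without a leading preamble, if one is present.
    public static byte[] StripPreamble(Encoding encoding, byte[] data)
    {
        ReadOnlySpan<byte> preamble = encoding.Preamble;
        ReadOnlySpan<byte> span = data;
        if (!preamble.IsEmpty && span.StartsWith(preamble))
        {
            span = span.Slice(preamble.Length);
        }
        return span.ToArray();
    }

    // Hypothetical helper: joins two byte arrays.
    public static byte[] Concat(byte[] first, byte[] second)
    {
        var result = new byte[first.Length + second.Length];
        first.CopyTo(result, 0);
        second.CopyTo(result, first.Length);
        return result;
    }

    public static void Main()
    {
        var encoding = new UnicodeEncoding(bigEndian: false, byteOrderMark: true);
        byte[] bom = encoding.Preamble.ToArray();
        byte[] fileA = Concat(bom, encoding.GetBytes("AB"));
        byte[] fileB = Concat(bom, encoding.GetBytes("CD"));

        // Naive merge: the second BOM is decoded as U+FEFF in the middle of the text.
        string naive = encoding.GetString(StripPreamble(encoding, Concat(fileA, fileB)));
        Console.WriteLine(naive.IndexOf('\uFEFF') >= 0); // True

        // Stripping the preamble from the second input avoids the stray character.
        byte[] cleanMerge = Concat(fileA, StripPreamble(encoding, fileB));
        string merged = encoding.GetString(StripPreamble(encoding, cleanMerge));
        Console.WriteLine(merged); // ABCD
    }
}
```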
Important
To ensure that the encoded bytes are decoded properly, you should prefix the beginning of a stream of encoded bytes with a preamble. Note that the GetBytes method does not prepend a BOM to a sequence of encoded bytes; supplying a BOM at the beginning of an appropriate byte stream is the developer's responsibility.
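Because GetBytes does not add the BOM, the caller has to combine the preamble and the encoded bytes. The following is a minimal sketch of one way to do that in memory; the class name PrependPreambleExample and the EncodeWithPreamble helper are hypothetical. The StreamReader at the end is used only to show that a BOM-aware reader can detect the encoding from the prefixed bytes.

```csharp
using System;
using System.IO;
using System.Text;

public static class PrependPreambleExample
{
    // Hypothetical helper: encodes text and prefixes the encoding's preamble, if any.
    public static byte[] EncodeWithPreamble(Encoding encoding, string text)
    {
        ReadOnlySpan<byte> preamble = encoding.Preamble;
        byte[] body = encoding.GetBytes(text); // contains no BOM

        var result = new byte[preamble.Length + body.Length];
        preamble.CopyTo(result);               // BOM first (may be empty)
        body.CopyTo(result, preamble.Length);  // then the encoded text
        return result;
    }

    public static void Main()
    {
        var encoding = new UnicodeEncoding(bigEndian: false, byteOrderMark: true);
        byte[] bytes = EncodeWithPreamble(encoding, "Hi");
        Console.WriteLine(BitConverter.ToString(bytes)); // FF-FE-48-00-69-00

        // A reader that detects BOMs picks UTF-16 and skips the preamble.
        using (var reader = new StreamReader(new MemoryStream(bytes), detectEncodingFromByteOrderMarks: true))
        {
            Console.WriteLine(reader.ReadToEnd()); // Hi
        }
    }
}
```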