UnicodeEncoding.Preamble Property

Definition

Gets a Unicode byte order mark encoded in UTF-16 format, if this object is configured to supply one.

public:
 virtual property ReadOnlySpan<System::Byte> Preamble { ReadOnlySpan<System::Byte> get(); };
public override ReadOnlySpan<byte> Preamble { get; }
member this.Preamble : ReadOnlySpan<byte>
Public Overrides ReadOnly Property Preamble As ReadOnlySpan(Of Byte)

Property Value

A byte span containing the Unicode byte order mark, if this object is configured to supply one; otherwise, the default span.

Remarks

The UnicodeEncoding object can provide a preamble, which is a byte span that can be prepended to the sequence of bytes resulting from the encoding process. Prefacing a sequence of encoded bytes with a byte order mark (code point U+FEFF) helps the decoder determine the byte order and the transformation format or UTF. The Unicode byte order mark (BOM) is serialized as follows (in hexadecimal):

  • Big endian byte order: FE FF

  • Little endian byte order: FF FE

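You can instantiate a UnicodeEncoding object whose Preamble is a valid BOM in the following ways:

  • By retrieving the UnicodeEncoding object returned by the Encoding.Unicode or Encoding.BigEndianUnicode property.

  • By calling the parameterless UnicodeEncoding() constructor.

  • By supplying true as the value of the byteOrderMark argument to the UnicodeEncoding(Boolean, Boolean) or UnicodeEncoding(Boolean, Boolean, Boolean) constructor.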
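As a minimal sketch of how the Preamble value depends on how the encoding object was created, the following C# example prints the preamble of several UnicodeEncoding instances. The PreambleDemo class and its Describe helper are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Text;

public static class PreambleDemo
{
    // Formats an encoding's preamble as space-separated hex bytes,
    // or "(empty)" when the encoding is not configured to supply a BOM.
    public static string Describe(Encoding encoding)
    {
        ReadOnlySpan<byte> preamble = encoding.Preamble;
        return preamble.IsEmpty
            ? "(empty)"
            : BitConverter.ToString(preamble.ToArray()).Replace("-", " ");
    }

    public static void Main()
    {
        // Parameterless constructor: little endian, BOM supplied.
        Console.WriteLine(Describe(new UnicodeEncoding()));              // FF FE

        // byteOrderMark: true with big endian byte order.
        Console.WriteLine(Describe(new UnicodeEncoding(
            bigEndian: true, byteOrderMark: true)));                     // FE FF

        // byteOrderMark: false yields the default (empty) span.
        Console.WriteLine(Describe(new UnicodeEncoding(
            bigEndian: false, byteOrderMark: false)));                   // (empty)

        // Static properties on Encoding also supply a BOM.
        Console.WriteLine(Describe(Encoding.Unicode));                   // FF FE
        Console.WriteLine(Describe(Encoding.BigEndianUnicode));          // FE FF
    }
}
```

Because Preamble returns a ReadOnlySpan<byte>, checking IsEmpty is a simple way to tell whether the object is configured to supply a BOM.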

We recommend that you use the BOM, because it provides nearly certain identification of an encoding for files that have otherwise lost a reference to their encoding, such as untagged or improperly tagged web data, or random text files stored when a business did not have international concerns. User problems can often be avoided if data is consistently and properly tagged.

For standards that provide an encoding type, a BOM is somewhat redundant. However, it can be used to help a server send the correct encoding header. Alternatively, it can be used as a fallback in case the encoding is otherwise lost.

There are some disadvantages to using a BOM. For example, knowing how to limit the database fields that use a BOM can be difficult. Concatenation of files can also be a problem, for example, when files are merged in such a way that an unnecessary character ends up in the middle of the data. Despite these few disadvantages, however, the use of a BOM is highly recommended.
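One possible way to avoid a stray BOM when merging files is to drop the preamble from every chunk after the first. The following C# sketch assumes that both byte arrays were produced with the same encoding; BomConcat, StripPreamble, and Concat are hypothetical helper names, not part of the .NET API.

```csharp
using System;
using System.Text;

public static class BomConcat
{
    // Returns data without a leading preamble, if one is present.
    // If the encoding supplies no preamble, data is returned unchanged.
    public static ReadOnlySpan<byte> StripPreamble(Encoding encoding, ReadOnlySpan<byte> data)
    {
        ReadOnlySpan<byte> preamble = encoding.Preamble;
        return !preamble.IsEmpty && data.StartsWith(preamble)
            ? data.Slice(preamble.Length)
            : data;
    }

    // Appends second to first, removing a BOM at the start of second
    // so that it does not end up in the middle of the merged data.
    public static byte[] Concat(Encoding encoding, byte[] first, byte[] second)
    {
        ReadOnlySpan<byte> tail = StripPreamble(encoding, second);
        byte[] result = new byte[first.Length + tail.Length];
        first.CopyTo(result, 0);
        tail.CopyTo(result.AsSpan(first.Length));
        return result;
    }

    public static void Main()
    {
        Encoding encoding = Encoding.Unicode; // little endian, BOM supplied
        byte[] a = { 0xFF, 0xFE, 0x41, 0x00 }; // BOM + "A"
        byte[] b = { 0xFF, 0xFE, 0x42, 0x00 }; // BOM + "B"
        Console.WriteLine(BitConverter.ToString(Concat(encoding, a, b)));
        // FF-FE-41-00-42-00
    }
}
```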

Important

To ensure that the encoded bytes are decoded properly, you should prefix the beginning of a stream of encoded bytes with a preamble. Note that the GetBytes method does not prepend a BOM to a sequence of encoded bytes; supplying a BOM at the beginning of an appropriate byte stream is the developer's responsibility.
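Because GetBytes does not emit the BOM, the caller has to write the preamble first. The following C# example is a minimal sketch of one way to do that in memory; BomWriter and EncodeWithPreamble are hypothetical names introduced for this illustration.

```csharp
using System;
using System.Text;

public static class BomWriter
{
    // Returns the encoding's preamble followed by the encoded bytes of text.
    // If the encoding supplies no preamble, only the encoded bytes are returned.
    public static byte[] EncodeWithPreamble(Encoding encoding, string text)
    {
        ReadOnlySpan<byte> preamble = encoding.Preamble;
        byte[] result = new byte[preamble.Length + encoding.GetByteCount(text)];

        // Copy the BOM to the start of the buffer.
        preamble.CopyTo(result);

        // Encode the text directly after the BOM.
        encoding.GetBytes(text, 0, text.Length, result, preamble.Length);
        return result;
    }

    public static void Main()
    {
        var encoding = new UnicodeEncoding(bigEndian: false, byteOrderMark: true);

        // GetBytes alone produces no BOM.
        Console.WriteLine(BitConverter.ToString(encoding.GetBytes("Hi")));
        // 48-00-69-00

        // With the preamble prepended explicitly.
        Console.WriteLine(BitConverter.ToString(EncodeWithPreamble(encoding, "Hi")));
        // FF-FE-48-00-69-00
    }
}
```

When writing to a stream instead of a buffer, the same idea applies: write the preamble bytes once at the start of the stream, then write the output of GetBytes.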

Applies to