2.2.5.3.2 Encoding Strings

Some strings require encoding before they can be used in XML output, for example to remove invalid surrogate pairs. In the sections that follow, the description of each string that requires encoding explicitly cites this section; a string whose description lacks such a citation can be serialized without encoding it first.

The encoding replaces certain characters with escaped numeric representations of their character codes.

The escape character is "_". Control characters and surrogate characters are escaped as _xHHHH_, where HHHH stands for the four-digit hexadecimal UTF-16 code for the character, in most significant bit first order.

For example, the string "Order\nDetails", in which \n denotes a line feed (U+000A), is encoded as:

 Order_x000A_Details
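
The escape format can be illustrated with a minimal Python sketch. The function name escape_code_unit is hypothetical and not part of this specification; the uppercase hexadecimal digits follow the _x000A_ example above and are otherwise an assumption.

```python
def escape_code_unit(unit: int) -> str:
    """Format one UTF-16 code unit as _xHHHH_ (hypothetical helper)."""
    if not 0 <= unit <= 0xFFFF:
        raise ValueError("expected a single 16-bit UTF-16 code unit")
    # Four hexadecimal digits, zero-padded, most significant digit first.
    # Uppercase digits are an assumption based on the _x000A_ example.
    return "_x{:04X}_".format(unit)


# The line feed in "Order\nDetails" has the UTF-16 code 0x000A.
encoded = "Order" + escape_code_unit(ord("\n")) + "Details"
print(encoded)  # Order_x000A_Details
```

Because the field is always four digits wide, a code such as 0x20 would be written as _x0020_, never as a shorter form.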

The underscore character requires escaping only when it is followed by a character sequence that, together with the underscore, could be misinterpreted as an escape sequence when the string is decoded. For example, Order_Details is not encoded, but Order_x0020_ is encoded as Order_x005f_x0020_. No short forms are allowed. For example, the forms _x20_ and __ are not generated.
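
The rules in this section can be combined into a short Python sketch. It is an illustration under stated assumptions, not a normative implementation: the name encode_string and its helpers are hypothetical, "control characters" is assumed to mean U+0000 through U+001F and U+007F through U+009F, every UTF-16 code unit in the surrogate range is escaped (a literal reading of the text above), and the underscore escape is written as _x005f_ exactly as it appears in the example.

```python
import re

# An underscore followed by text shaped like xHHHH_ would read as an escape.
# Accepting hex digits in either case is an assumption.
_AMBIGUOUS_UNDERSCORE = re.compile(r"_(?=x[0-9A-Fa-f]{4}_)")


def _utf16_units(text: str) -> list:
    """Split text into 16-bit UTF-16 code units, keeping lone surrogates."""
    data = text.encode("utf-16-be", "surrogatepass")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def _needs_escape(unit: int) -> bool:
    # Assumed control ranges, plus the full surrogate range D800-DFFF.
    return unit <= 0x1F or 0x7F <= unit <= 0x9F or 0xD800 <= unit <= 0xDFFF


def encode_string(text: str) -> str:
    """Hypothetical encoder for the scheme described in this section."""
    # Step 1: protect literal underscores that look like the start of an
    # escape sequence. This runs on the original text, before any real
    # escapes are generated, so generated escapes are never re-escaped.
    protected = _AMBIGUOUS_UNDERSCORE.sub("_x005f_", text)
    # Step 2: escape control and surrogate code units as _xHHHH_.
    out = []
    for unit in _utf16_units(protected):
        out.append("_x{:04X}_".format(unit) if _needs_escape(unit) else chr(unit))
    return "".join(out)


print(encode_string("Order\nDetails"))  # Order_x000A_Details
print(encode_string("Order_Details"))   # Order_Details
print(encode_string("Order_x0020_"))    # Order_x005f_x0020_
```

The sketch examines only the original text when deciding whether an underscore is ambiguous; an implementation might need additional care where a literal underscore is followed by characters that are themselves escaped.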