Whitespace Processing in XAML

Extensible Application Markup Language (XAML) has language rules that state how significant whitespace must be processed by a XAML processor implementation. This topic documents these language rules, as well as additional whitespace handling defined by the Windows Presentation Foundation (WPF) implementation of the XAML processor, and the XAML writer for serialization.

Whitespace Processing

Whitespace Definition

Consistent with XML, whitespace characters in XAML are space, linefeed, and tab. These correspond to the Unicode values U+0020, U+000A, and U+0009, respectively.

Whitespace Normalization

By default, the following whitespace normalization occurs when a XAML processor processes a XAML file:

  1. Linefeed characters between East Asian characters are removed. See the East Asian Characters section later in this topic for a definition of "East Asian characters".

  2. All whitespace characters (space, linefeed, tab) are converted into spaces.

  3. Each run of consecutive spaces is replaced by a single space.

  4. A space immediately following the start tag is deleted.

  5. A space immediately before the end tag is deleted.
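
As a minimal sketch of these steps, consider inner text that spans several lines with uneven spacing. The WPF presentation namespace and the choice of Button are assumptions made only for illustration.

```xml
<!-- Illustrative only: the inner text mixes linefeeds, indentation, and repeated spaces. -->
<Button xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation">
    Save     all
    changes
</Button>
```

Under the default rules, the linefeeds and indentation become spaces (step 2), each run of spaces collapses to one (step 3), and the spaces next to the start and end tags are removed (steps 4 and 5). The resulting content should therefore be the string "Save all changes".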

"Default" corresponds to the state denoted by the default value of the xml:space attribute.

Whitespace in Inner Text, and String Primitives

The above normalization rules apply to inner text found within XAML elements. After normalization, a XAML processor will convert any inner text into an appropriate type as follows:

  • If the type of the property is not a collection, and is not directly an Object type, the XAML processor attempts to convert to that type using its type converter. A failed conversion here will result in a compile-time error.

  • If the type of the property is a collection, and the inner text is contiguous (no intervening element tags), the inner text is parsed as a single String. If the collection type cannot accept String, this also results in a compile-time error.

  • If the type of the property is Object, then the inner text is parsed as a single String. If there are intervening element tags, this results in a compile-time error, because the Object type implies a single object (String or otherwise).

  • If the type of the property is a collection, and the inner text is not contiguous, then the first substring is converted into a String and added as a collection item, the intervening element is added as a collection item, and finally the trailing substring (if any) is added to the collection as a third String item.
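
The non-contiguous case can be sketched with a small fragment. TextBlock and Bold are used here only as a convenient example of a collection-typed content property; the text itself is hypothetical.

```xml
<!-- Illustrative only: inner text interrupted by one element inside a collection-typed content property. -->
<TextBlock xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation">Total: <Bold>42</Bold> items</TextBlock>
```

Following the last rule above, the processor should add three items to the collection in document order: the text before the Bold element, the Bold element itself, and the text after it.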

Whitespace and Text Content Models

In practice, preserving whitespace is only of concern for a subset of all possible content models. That subset is composed of content models that can take a singleton String type in some form, a dedicated String collection, or a mixture of String and other types in an IList or ICollection<T> collection.

Even for content models that can take strings, the default behavior is that any whitespace that remains after normalization is not treated as significant. For instance, ListBox takes an IList, but whitespace (such as linefeeds between each ListBoxItem) is neither preserved nor rendered. Attempting to use linefeeds as separators between strings for ListBoxItem items does not work at all; the strings separated by the linefeeds are treated as one string and one item.
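
The ListBox behavior described here can be sketched as follows. The item text is hypothetical, and the StackPanel is included only so that the fragment has a single root element.

```xml
<StackPanel xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation">
    <!-- Linefeeds are normalized to spaces, so this list receives a single string item. -->
    <ListBox>
        Red
        Green
        Blue
    </ListBox>
    <!-- Explicit item elements produce three items; the whitespace between them is not significant. -->
    <ListBox>
        <ListBoxItem>Red</ListBoxItem>
        <ListBoxItem>Green</ListBoxItem>
        <ListBoxItem>Blue</ListBoxItem>
    </ListBox>
</StackPanel>
```

The first list should contain one item with the text "Red Green Blue"; the second should contain three separate items.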

Those collections that do treat whitespace as significant are typically part of the flow document model. The primary collection that supports whitespace preservation behavior is InlineCollection. This collection class is declared with the WhitespaceSignificantCollectionAttribute; when this attribute is found, the XAML processor treats whitespace within the collection as significant. When xml:space="preserve" applies to a collection marked with WhitespaceSignificantCollectionAttribute, all whitespace is preserved and rendered. When xml:space="default" applies to such a collection, the initial whitespace normalization described earlier still occurs; it leaves a single space in certain positions, and those spaces are preserved and rendered. Which behavior is desirable is up to you, and you should use xml:space selectively to enable the behavior that you want.
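
The two combinations can be compared side by side in a short sketch. It assumes a TextBlock, whose inline content is held in an InlineCollection; the Run text is hypothetical.

```xml
<StackPanel xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation">
    <!-- Default handling: the linefeed and indentation between the two Run elements
         normalize to one space, and the whitespace-significant collection keeps it. -->
    <TextBlock>
        <Run>Hello</Run>
        <Run>world</Run>
    </TextBlock>
    <!-- Preserve handling: every linefeed and indentation space inside this element is kept. -->
    <TextBlock xml:space="preserve">
        <Run>Hello</Run>
        <Run>world</Run>
    </TextBlock>
</StackPanel>
```

The first TextBlock should render "Hello world" with a single space between the words. The second should also render the authored linefeeds and indentation, which is rarely what pretty-printed markup intends.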

Also, certain inline elements that represent a line break in a flow document model should deliberately not introduce an extra space, even in a whitespace-significant collection. For instance, the LineBreak element has the same purpose as the <BR/> tag in HTML, and for readability in markup a LineBreak is typically separated from any subsequent text by an authored linefeed. That linefeed should not be normalized to become a leading space in the subsequent line. To enable that behavior, the class definition for the LineBreak element applies the TrimSurroundingWhitespaceAttribute, which the XAML processor interprets to mean that whitespace surrounding LineBreak is always trimmed.
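
A short sketch of the LineBreak case, with hypothetical text and the WPF presentation namespace assumed:

```xml
<!-- Illustrative only: an authored linefeed follows LineBreak purely for readability. -->
<TextBlock xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation">
    First line<LineBreak/>
    Second line
</TextBlock>
```

Because whitespace surrounding LineBreak is trimmed, the second rendered line should begin with "Second" rather than with a leading space.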

Preserving Whitespace

Several techniques are available for preserving whitespace in the source XAML for eventual presentation, none of which is affected by XAML processor whitespace normalization.

xml:space="preserve": Specify this attribute at the level of the element where whitespace preservation is desired. Note that this preserves all whitespace, including the spaces that code editing applications might add to "pretty-print" elements into a visually intuitive nesting. Whether those spaces render is again a matter of the content model for the containing element. Specifying xml:space="preserve" at the root level is not recommended, because the majority of object models do not consider whitespace significant one way or another. It is better practice to set the attribute only at the level of elements that render whitespace within strings, or that are whitespace-significant collections.

Entities and nonbreaking spaces: XAML supports placing any Unicode entity within a text object model. You can use dedicated entities such as the nonbreaking space (&#160;, the numeric character reference for U+00A0). You can also use rich text controls that support nonbreaking space characters. Be cautious if you use entities to simulate layout characteristics such as indentation, because the run-time output of the entities varies based on a greater number of factors than the general layout facilities do, such as proper use of panels and margins. For instance, entities are mapped to fonts and can change size in response to user font selection.
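
Both techniques can be sketched in one fragment. The text content is hypothetical, and the StackPanel root is included only to show that the attribute is scoped to a single child element.

```xml
<StackPanel xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation">
    <!-- xml:space is set on the one element that renders text, not on the root.
         The text is written flush left so that editor indentation is not preserved with it. -->
    <TextBlock xml:space="preserve">Item     Qty
Pencil   12</TextBlock>
    <!-- The nonbreaking space is not one of the three XAML whitespace characters,
         so normalization should leave these references alone. -->
    <TextBlock>Item:&#160;&#160;&#160;Pencil</TextBlock>
</StackPanel>
```

In the first TextBlock, the repeated spaces and the linefeed should be kept as authored. In the second, the three nonbreaking spaces should survive normalization, although their rendered width depends on the font in use.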

East Asian Characters

"East Asian characters" is defined as a set of Unicode character ranges U+20000 to U+2FFFD and U+30000 to U+3FFFD. This subset is also sometimes referred to as "CJK ideographs". For more information, see http://www.unicode.org.

See Also

Concepts

XAML Overview

Reference

XML Character Entities and XAML

xml:space Handling in XAML