2.2.12 BinXml

BinXml is a token representation of text XML 1.0, which is specified in [XML10]. Here, BinXml encodes an XML document so that the original XML text can be correctly reproduced from the encoding. For information about the encoding algorithm, see section 3.1.4.7.

The binary format for all numeric values is always little-endian. No alignment is required for any data. The format is given in the following Augmented Backus-Naur Form (ABNF) example, as specified in [RFC4234].

In addition to defining the layout of the binary XML binary large objects (BLOBs), the following ABNF example has additional annotations that suggest a way to convert the binary to text. To convert to text, a tool is needed to evaluate the BinXml according to ABNF and to emit text for certain key rules. That text is emitted before evaluating the rule. The actual text to emit is defined in the sections as noted.

When processing the Attribute rule, the text generated is as specified in section 2.2.12.2.

Note When the emit rules specify emitting a literal string, that string is surrounded by quotes. The quotation marks shown are not part of the output. They are included in the text to delineate the characters that are sent on the wire. For example, an instruction might specify that "/>" is output.

            
 ; ==== Top-level Definitions ======================================
 ;
 Document = 0*1Prolog Fragment 0*1Misc EOFToken
 Prolog = PI
 Misc = PI
 Fragment = 0*FragmentHeader ( Element / TemplateInstance )
 FragmentHeader = FragmentHeaderToken MajorVersion MinorVersion Flags
 MajorVersion = OCTET
 MinorVersion = OCTET
 Flags = OCTET
            
 ;
 ; ==== Basic XML Definitions ======================================
 ;
 Element = 
  ( StartElement CloseStartElementToken Content EndElementToken ) / 
  ( StartElement CloseEmptyElementToken ) ; Emit using Element Rule
 Content =  
  0*(Element / CharData / CharRef / EntityRef / CDATASection / PI)
 CharData = ValueText / Substitution
 StartElement = 
  OpenStartElementToken 0*1DependencyId ElementByteLength 
  Name 0*1AttributeList
 DependencyId = WORD
 ElementByteLength = DWORD
 AttributeList = AttributeListByteLength 1*Attribute
 Attribute = 
  AttributeToken Name AttributeCharData ; Emit using Attribute Rule
 AttributeCharData = 
  0*(ValueText / Substitution / CharRef / EntityRef)
 AttributeListByteLength = DWORD
 ValueText = ValueTextToken StringType LengthPrefixedUnicodeString
 Substitution = 
  NormalSubstitution / OptionalSubstitution 
 ; Emit using Substitution Rule
 NormalSubstitution = 
  NormalSubstitutionToken SubstitutionId ValueType
 OptionalSubstitution = 
  OptionalSubstitutionToken SubstitutionId ValueType
 SubstitutionId = WORD
 CharRef = CharRefToken WORD ; Emit using CharRef Rule
 EntityRef = EntityRefToken Name ; Emit using EntityRef Rule
 CDATASection = CDATASectionToken LengthPrefixedUnicodeString 
 ; Emit using CDATA Section Rule
 PI = PITarget PIData
 PITarget = PITargetToken Name ; Emit using PITarget Rule
 PIData = PIDataToken LengthPrefixedUnicodeString 
 ; Emit using PIData Rule
 Name = NameHash NameNumChars NullTerminatedUnicodeString
 NameHash = WORD
 NameNumChars = WORD
            
 ; 
 ; ==== Token Types ================================================
 ;
 EOFToken = %x00
 OpenStartElementToken = %x01 / %x41
 CloseStartElementToken = %x02 ;Emit using CloseStartElementToken Rule
 CloseEmptyElementToken = %x03 ;Emit using CloseEmptyElementToken Rule
 EndElementToken = %x04 ; Emit using EndElementToken Rule
            
 ValueTextToken = %x05 / %x45
 AttributeToken = %x06 / %x46
 CDATASectionToken = %x07 / %x47
 CharRefToken = %x08 / %x48
 EntityRefToken = %x09 / %x49
            
 PITargetToken = %x0A
 PIDataToken = %x0B
 TemplateInstanceToken = %x0C
 NormalSubstitutionToken = %x0D
 OptionalSubstitutionToken = %x0E
 FragmentHeaderToken = %x0F
            
 ;
 ; ==== Template-related definitions ===============================
 ;
 TemplateInstance = 
  TemplateInstanceToken TemplateDef TemplateInstanceData
 TemplateDef = 
  %b0 TemplateId TemplateDefByteLength 
  0*FragmentHeader Element EOFToken
 TemplateId = GUID
            
 ;
 ; The full length of the value section of the TemplateInstanceData 
 ; can be obtained by adding up all the lengths described in the 
 ; value spec.
 ;
 TemplateInstanceData = 
  ValueSpec *Value; Emit using TemplateInstanceDataRule
 ValueSpec = NumValues *ValueSpecEntry
 NumValues = DWORD
 ValueSpecEntry = ValueByteLength ValueType %x00
 ValueByteLength = WORD
            
 TemplateDefByteLength = DWORD
            
 ;
 ; ==== Value Types =================================================
 ;
 ValueType = 
  NullType / StringType / AnsiStringType / Int8Type / UInt8Type / 
  Int16Type / UInt16Type / Int32Type / UInt32Type / Int64Type / 
  UInt64Type / Real32Type / Real64Type / BoolType / BinaryType / 
  GuidType / SizeTType / FileTimeType / SysTimeType / SidType / 
  HexInt32Type / HexInt64Type / BinXmlType / StringArrayType / 
  AnsiStringArrayType / Int8ArrayType / UInt8ArrayType / 
  Int16ArrayType / UInt16ArrayType / Int32ArrayType / UInt32ArrayType/
  Int64ArrayType / UInt64ArrayType / Real32ArrayType / 
  Real64ArrayType / BoolArrayType / GuidArrayType / SizeTArrayType / 
  FileTimeArrayType / SysTimeArrayType / SidArrayType / 
  HexInt32ArrayType / HexInt64ArrayType
 NullType = %x00
 StringType = %x01
 AnsiStringType = %x02
 Int8Type = %x03
 UInt8Type = %x04
 Int16Type = %x05
 UInt16Type = %x06
 Int32Type = %x07
 UInt32Type = %x08
 Int64Type = %x09
 UInt64Type = %x0A
 Real32Type = %x0B
 Real64Type = %x0C
 BoolType = %x0D
 BinaryType = %x0E
 GuidType = %x0F
 SizeTType = %x10
 FileTimeType = %x11
 SysTimeType = %x12
 SidType = %x13
 HexInt32Type = %x14
 HexInt64Type = %x15
 BinXmlType = %x21
 StringArrayType = %x81
 AnsiStringArrayType = %x82
 Int8ArrayType = %x83
 UInt8ArrayType = %x84
 Int16ArrayType = %x85
 UInt16ArrayType = %x86
 Int32ArrayType = %x87
 UInt32ArrayType = %x88
 Int64ArrayType = %x89
 UInt64ArrayType = %x8A
 Real32ArrayType = %x8B
 Real64ArrayType = %x8C
 BoolArrayType = %x8D
 GuidArrayType = %x8F
 SizeTArrayType = %x90
 FileTimeArrayType = %x91
 SysTimeArrayType = %x92
 SidArrayType = %x93
 HexInt32ArrayType = %x00 %x94
 HexInt64ArrayType = %x00 %x95
            
 ;
 ; === Value Formats ================================================
 ;
 Value = 
  StringValue / AnsiStringValue / Int8Value / UInt8Value / 
  Int16Value / UInt16Value / Int32Value / UInt32Value / Int64Value /
  UInt64Value / Real32Value / Real64Value / BoolValue / BinaryValue / 
  GuidValue / SizeTValue / FileTimeValue / SysTimeValue / SidValue /
  HexInt32Value / HexInt64Value / BinXmlValue / StringArrayValue / 
  AnsiStringArrayValue / Int8ArrayValue / UInt8ArrayValue / 
  Int16ArrayValue / UInt16ArrayValue / Int32ArrayValue / 
  UInt32ArrayValue / Int64ArrayValue / UInt64ArrayValue / 
  Real32ArrayValue / Real64ArrayValue / BoolArrayValue / 
  GuidArrayValue / SizeTArrayValue / FileTimeArrayValue / 
  SysTimeArrayValue / SidArrayValue / HexInt32ArrayValue / 
  HexInt64ArrayValue
 StringValue = 0*WORD
 AnsiStringValue = 0*OCTET
 Int8Value = OCTET
 UInt8Value = OCTET
 Int16Value = 2*2OCTET
 UInt16Value = 2*2OCTET
 Int32Value = 4*4OCTET
 UInt32Value = 4*4OCTET
 Int64Value = 8*8OCTET
 UInt64Value = 8*8OCTET
 Real32Value = 4*4OCTET
 Real64Value = 8*8OCTET
 BoolValue = OCTET
 BinaryValue = *OCTET
 GuidValue = GUID
 SizeTValue = UInt32Value / UInt64Value
 FileTimeValue = 8*8OCTET
 SysTimeValue = 16*16OCTET
 SidValue = *OCTET
 HexInt32Value = UInt32Value
 HexInt64Value = UInt64Value
 BinXmlValue = Fragment EOFToken
            
 StringArrayValue = *NullTerminatedUnicodeString
 AnsiStringArrayValue = *NullTerminatedAnsiString
 Int8ArrayValue = *Int8Value
 UInt8ArrayValue = *UInt8Value
 Int16ArrayValue = *Int16Value
 UInt16ArrayValue = *UInt16Value
 Int32ArrayValue = *Int32Value
 UInt32ArrayValue = *UInt32Value
 Int64ArrayValue = *Int64Value
 UInt64ArrayValue = *UInt64Value
 Real32ArrayValue = *Real32Value
 Real64ArrayValue = *Real64Value
 BoolArrayValue = *BoolValue
 GuidArrayValue = *GuidValue
 SizeTArrayValue = *SizeTValue
 FileTimeArrayValue = *FileTimeValue
 SysTimeArrayValue = *SysTimeValue
 SidArrayValue = *SidValue
 HexInt32ArrayValue = *HexInt32Value
 HexInt64ArrayValue = *HexInt64Value
            
 ;
 ; ==== Base Types =================================================
 ; 
 NullTerminatedUnicodeString = StringValue %x00 %x00
 NullTerminatedAnsiString = AnsiStringValue %x00
 LengthPrefixedUnicodeString = NumUnicodeChars StringValue
 NumUnicodeChars = WORD
 OCTET = %x00-FF
 WORD = 2*2OCTET
 DWORD = 4*4OCTET
 GUID = 16*16OCTET
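
As a minimal sketch of how the base types and the FragmentHeader rule in the grammar above might be read, the following Python uses the standard struct module with little-endian, unaligned formats. The function names (read_word, read_dword, read_fragment_header) are illustrative only and are not defined by this specification.

```python
import struct

FRAGMENT_HEADER_TOKEN = 0x0F  # FragmentHeaderToken in the grammar


def read_word(buf, offset):
    """Return (value, next_offset) for a little-endian WORD (2 octets)."""
    (value,) = struct.unpack_from("<H", buf, offset)
    return value, offset + 2


def read_dword(buf, offset):
    """Return (value, next_offset) for a little-endian DWORD (4 octets)."""
    (value,) = struct.unpack_from("<I", buf, offset)
    return value, offset + 4


def read_fragment_header(buf, offset=0):
    """Parse FragmentHeaderToken MajorVersion MinorVersion Flags (4 octets)."""
    token, major, minor, flags = struct.unpack_from("<4B", buf, offset)
    if token != FRAGMENT_HEADER_TOKEN:
        raise ValueError("not a FragmentHeader")
    return {"major": major, "minor": minor, "flags": flags}, offset + 4
```

Because no alignment is required, each reader simply advances the offset by the size of the item it consumed.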

Entity

Description

MajorVersion

The major version of BinXml. MUST be set to 1.

MinorVersion

The minor version of BinXml. MUST be set to 1.

Flags

The reserved value in the BinXml header. Not used currently and MUST be 0.

DependencyId

Specifies the index into the ValueSpec list of an instance of the TemplateDefinition (TemplateInstance). If the ValueType at that index is NullType, the element MUST NOT be included for rendering purposes. If the index is 0xFFFF, there is no dependency for the element.
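
The DependencyId check described above could be sketched as follows. This is an illustration under assumptions: the function name element_is_rendered and the value_types list (the ValueType octets taken from the ValueSpec entries, in order) are hypothetical, and a missing DependencyId is represented here as None.

```python
NO_DEPENDENCY = 0xFFFF  # index value meaning "no dependency"
NULL_TYPE = 0x00        # NullType in the grammar


def element_is_rendered(dependency_id, value_types):
    """Decide whether an element is included when rendering.

    dependency_id: the DependencyId WORD, or None if the element has none.
    value_types:   ValueType octets from the ValueSpec entries, in order.
    """
    if dependency_id is None or dependency_id == NO_DEPENDENCY:
        return True
    return value_types[dependency_id] != NULL_TYPE
```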

ElementByteLength

The number of bytes that follow ElementByteLength and make up the rest of the element definition, up to and including the EndElementToken or CloseEmptyElementToken for the element.

AttributeListByteLength

The number of bytes in the attribute list, counted from the byte after AttributeListByteLength up to, but not including, the CloseStartElementToken or CloseEmptyElementToken. It is typically used for jumping to the end of the enclosing start element tag.

AttributeCharData

The character data that appears in an attribute value.

SubstitutionId

A 0-based positional identifier into the set of substitution values. Zero indicates the first substitution value; 1 indicates the second substitution value, and so on.

CharRef

An XML 1.0 character reference value.

NameHash

The low order 16 bits of the value that is generated by performing a hash of the binary representation of Name (in which NameNumChars * 2 is the hash input length).

The hash function is implemented by initially setting the value of the hash to zero. For each character in Name, multiply the previous value of the hash by 65599 and add the binary representation of the character to the hash value.

The following pseudocode shows how to implement this hash function.

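 hash(str, strLen)   // strLen is the number of characters in Name (NameNumChars)
 {
        hashVal = 0;
        for(i=0; i < strLen; i++ )
            hashVal = hashVal*65599 + str[i];
        return hashVal;   // NameHash is the low-order 16 bits of hashVal
 }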

NameNumChars

The number of Unicode characters in the Name string, not including the null terminator.
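
To make the Name layout concrete, the following Python sketch computes NameHash and assembles Name = NameHash NameNumChars NullTerminatedUnicodeString. It assumes that each "character" is one UTF-16 little-endian code unit and that the running hash wraps at 32 bits; the wrap width does not change the low-order 16 bits. The function names are illustrative and not part of this specification.

```python
import struct


def name_hash(name):
    """Low-order 16 bits of the multiply-by-65599 hash of Name."""
    data = name.encode("utf-16-le")
    hash_val = 0
    for (unit,) in struct.iter_unpack("<H", data):
        # 32-bit wrap is an assumption; only the low 16 bits are kept below.
        hash_val = (hash_val * 65599 + unit) & 0xFFFFFFFF
    return hash_val & 0xFFFF


def encode_name(name):
    """Build NameHash (WORD), NameNumChars (WORD), string, 2-octet terminator."""
    data = name.encode("utf-16-le")
    num_chars = len(data) // 2  # terminator is not counted
    return struct.pack("<HH", name_hash(name), num_chars) + data + b"\x00\x00"
```

For example, under these assumptions `encode_name("a")` yields the eight octets 61 00 01 00 61 00 00 00.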

OpenStartElementToken

A value of 0x01 indicates that the element start tag contains no attributes; a value of 0x41 indicates that an attribute list can be expected in the element start tag.

ValueTextToken

A value of 0x45 indicates that more data can be expected to follow in the current content of the element or attribute; a value of 0x05 indicates that no more such data follows.

AttributeToken

A value of 0x46 indicates that there is another attribute in the attribute list; a value of 0x06 indicates that no more attributes exist.

CDATASectionToken

A value of 0x47 indicates that more data can be expected to follow in the current content of the element or attribute; a value of 0x07 indicates that no more such data follows.

CharRefToken

A value of 0x48 indicates that more data can be expected to follow in the current content of the element or attribute; a value of 0x08 indicates that no more such data follows.

EntityRefToken

A value of 0x49 indicates that more data can be expected to follow in the current content of the element or attribute; a value of 0x09 indicates that no more such data follows.
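
The paired token values described above differ only in the 0x40 bit. A parser might therefore separate the token kind from that bit, as in the following sketch. The helper name split_token and the table FLAGGED_TOKENS are hypothetical; only the token values themselves come from the grammar.

```python
MORE_FLAG = 0x40  # set in 0x41, 0x45, 0x46, 0x47, 0x48, 0x49

# Tokens that the grammar lists with both a plain and a flagged value.
FLAGGED_TOKENS = {
    0x01: "OpenStartElementToken",
    0x05: "ValueTextToken",
    0x06: "AttributeToken",
    0x07: "CDATASectionToken",
    0x08: "CharRefToken",
    0x09: "EntityRefToken",
}


def split_token(octet):
    """Return (token_name, flag_set) for a token octet that may carry 0x40."""
    base = octet & ~MORE_FLAG & 0xFF
    if base not in FLAGGED_TOKENS:
        raise ValueError("token 0x%02X has no flagged form" % octet)
    return FLAGGED_TOKENS[base], bool(octet & MORE_FLAG)
```

For OpenStartElementToken the flag means that an attribute list follows; for the other tokens it means that more data or another attribute follows.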

TemplateId

The raw data of the GUID that identifies a template definition.

NumValues

The number of substitution values that make up the Template Instance Data.

ValueByteLength

The length, in bytes, of a substitution value as it appears in the Template Instance Data.

TemplateDefByteLength

The number of bytes after the TemplateDefByteLength up to and including the EOFToken (end of fragment or document) element for the template definition.

ValueType

The type of a substitution value, as it appears in the Template Instance Data.

Value

The raw data of the substitution value.
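
The relationship among NumValues, ValueByteLength, ValueType, and Value can be sketched as a small reader for TemplateInstanceData. This is an illustration, not a reference implementation: the function name is hypothetical, and the raw value octets are returned without interpreting them by type.

```python
import struct


def parse_template_instance_data(buf, offset=0):
    """Return ([(value_type, raw_bytes), ...], next_offset)."""
    (num_values,) = struct.unpack_from("<I", buf, offset)  # NumValues (DWORD)
    offset += 4

    spec = []
    for _ in range(num_values):
        # ValueSpecEntry = ValueByteLength (WORD) ValueType (OCTET) %x00
        length, value_type, pad = struct.unpack_from("<HBB", buf, offset)
        if pad != 0:
            raise ValueError("ValueSpecEntry must end with 0x00")
        spec.append((length, value_type))
        offset += 4

    values = []
    for length, value_type in spec:
        raw = bytes(buf[offset:offset + length])
        if len(raw) != length:
            raise ValueError("truncated value data")
        values.append((value_type, raw))
        offset += length
    return values, offset
```

The total size of the value section is the sum of the ValueByteLength fields, which is why the whole ValueSpec is read before any Value.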

NumUnicodeChars

The number of wide characters in LengthPrefixedUnicodeString. The length MUST include the null terminator if one is present in the string; however, length-prefixed strings are not required to have a null terminator.
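
A reader for LengthPrefixedUnicodeString might look like the following sketch. It assumes that the characters are UTF-16 little-endian code units, and it drops one trailing null terminator, if present, as a convenience for the caller. The function name is illustrative only.

```python
import struct


def read_length_prefixed_string(buf, offset=0):
    """Return (text, next_offset) for NumUnicodeChars followed by the string."""
    (num_chars,) = struct.unpack_from("<H", buf, offset)  # NumUnicodeChars
    start = offset + 2
    end = start + num_chars * 2  # two octets per wide character
    if end > len(buf):
        raise ValueError("truncated string")
    text = bytes(buf[start:end]).decode("utf-16-le")
    # The count includes a null terminator when one is present; strip it here.
    if text.endswith("\x00"):
        text = text[:-1]
    return text, end
```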