Working with XML Schemas in InfoPath
In this article
Introduction
Using externally-authored XML Schemas
Unsupported XSD Constructs
XSD Constructs with Reduced Functionality
XSD Constructs with Special Meaning in InfoPath
Debugging Common XSD Errors
How to Edit or Author an XSD for InfoPath
Conclusion
Introduction
A form template that you create with Microsoft Office InfoPath 2007 uses an XML Schema (XSD) to perform structural and data validation on the XML that is input, edited, and output from an InfoPath form. Every form template created in InfoPath design mode contains at least one XSD schema file (.xsd) that is used for validation at run time.
Note |
---|
The information contained in this article applies to form templates designed for use in the Office InfoPath 2007 client. Browser-compatible form templates have stricter XSD Schema requirements. For more information, see the documentation about XML Schemas in browser-compatible form templates available on MSDN. |
Using externally-authored XML Schemas
To load an XSD schema file that was authored outside of InfoPath, follow these steps:
- On the File menu, click Design a Form Template.
- In the Design a Form Template dialog box, click XML or Schema, and click OK.
Note |
---|
You can specify any XSD file you want to use by manually editing your template files outside of design mode. The XSD file you specify, however, must be one that Microsoft XML Core Services (MSXML) 5.0 considers valid. |
For additional information about XML Schema standards and implementation, see the following resources:
- W3C XML Schema
- W3C Schema Primer
- W3C Schema Structure Reference
- W3C Schema Datatypes Reference
- XML Schema Tutorial
- XML Developer Center: XML Schemas
Unsupported XSD Constructs
The following section describes XSD constructs that InfoPath cannot handle at run time. Avoid these constructs when creating a form template in InfoPath design mode.
ENTITY and ENTITIES Types
The ENTITY and ENTITIES types require a document type definition (DTD) for validation, which InfoPath does not support. InfoPath does not allow you to design a form template against such a schema and displays an error message that recommends changing the ENTITY type to the NCName type from which ENTITY derives.
Note |
---|
If you manually author an InfoPath form template outside of design mode, and it uses an XSD that includes ENTITY and ENTITIES types, the form template may work at run time if the Template.xml file contains the required DTD for these types. |
Required xsd:any Element
An occurrence of an xsd:any
wildcard element, that is, an occurrence of an xsd:any
element with a minOccurs
attribute value greater than zero ("required any"), prevents InfoPath from deterministically creating a valid instance for this schema fragment. InfoPath must be able to create a valid instance when generating a form that uses this schema fragment. As part of running the Data Source Wizard, schemas with required xsd:any
elements require you to choose which schema element you want to use in place of the required xsd:any
element.
Elements with an Abstract Complex Type
InfoPath design mode supports designing a form template against schemas that use abstract complex types. For example, if an element named shippingAddress has an abstract complex type named address that has two derivations, USAddress and CanadianAddress, InfoPath treats any instance of shippingAddress as a choice between shippingAddress with type USAddress and shippingAddress with type CanadianAddress. In this example, if the provided schemas contain no types that derive from address, then InfoPath requests an additional schema that fulfills this requirement.
XSD Constructs with Reduced Functionality
The following section describes XSD constructs that have reduced functionality when used to create a form template in InfoPath design mode.
Substitution Groups
All members of the substitution group appear in the Data Source task pane. InfoPath represents the substitution possibilities as a choice of all the substitution groups (including the defining element, if it is not abstract). If there are no substitution groups for an abstract element, InfoPath prompts you to provide a schema that contains at least one element that is a substitution group.
Unbounded Choice Elements
The following schema fragment shows an unbounded choice element:
|
InfoPath displays repeating choice elements as repeating choices in the Data Source task pane. There is a Repeating Choice Group control that you can use to represent the heterogeneous list defined by the repeating choice element in the XSD.
Repeating Sequence
The following schema fragment shows a repeating sequence:
|
As long as the repeating sequence contains a required element, InfoPath loads the XSD without modifying it and allows you to bind repeating section controls to the repeating sequence.
Choice of Model Groups
The following schema fragment shows the choice element containing several model groups:
|
InfoPath design mode supports such XSD constructs without requiring any modification by the form designer. While InfoPath does not modify the meaning of the schema, it simplifies the choice construct above into an equivalent collapsed single choice in the Data Source task pane.
Optional Sibling with Same Qualified Name
The following schema fragment shows an optional sibling with same qualified name (QName
):
|
XPath expressions for these nodes can be complex because every potential XML instance must be accounted for in the InfoPath design mode. InfoPath does not expose parts of the schema for which it may have difficulty creating correct XPath bindings. Warnings appear regarding the portions of the schema that are ignored.
XSD Constructs with Special Meaning in InfoPath
The following sections describe XSD constructs that have special meaning when used in creating a form template in design mode. These sections describe how you can use the constructs in your schema to enable certain behaviors.
Adding New Element Fields and Groups with the Data Source Task Pane
You can construct your schema so that you can use the Data Source task pane to add new element fields and groups to an element at design time. To do so, you declare an element in your schema with an optional, unbounded xsd:any
element that specifies the namespace attribute with the ##any
wildcard. Then, in design mode, you can use the Data Source task pane to add new element fields and groups to that element. For example, you could add new content to the following element:
|
Adding New Attribute Fields with the Data Source Task Pane
Similarly to the element case, you can declare an attribute with an anyAttribute
element that has the namespace attribute specified as the ##any
wildcard. At design time, you can use the Data Source task pane to add new content to that schema attribute.
|
Storing XML Signatures in the Data Source
To enable users to digitally sign a form at run time, the schema of the data source must declare an element named signature for storing the XML Signatures (digital signature) information that is created when a user signs the form. You make this declaration by using the xsd:any
element with the namespace attribute specified as the XML Signatures namespace with a wildcard character, as follows: "http://www.w3c.org/2000/09/xmldsig#"
|
Binding a Field to a Rich Text Box Control
Rich Text Box controls in InfoPath generate generic XHTML; consequently, your schema must specify that any number of text and XHTML nodes is valid in the XML of the form instance. You can achieve this specification with the following XSD construct:
|
Note |
---|
InfoPath never modifies the content of the schema file (.xsd), but it may logically infer a subset of it for design purposes. The schema file is always untouched within the form template at both design time and run time. |
Debugging Common XSD Errors
If you load externally authored XSD files to create form templates in InfoPath design mode, you may receive either of two types of error messages: MSXML error messages or InfoPath error messages. MSXML error messages appear in the Details section of an InfoPath error message dialog box, and they always begin with a reference to the name or path of the schema file that is raising the error. Some valid XSD schema constructs are not supported by InfoPath; these are discussed in the Unsupported XSD Constructs section. This section describes some common errors that can cause schemas to fail to load successfully in InfoPath.
The XSD Namespace Declaration
Similar to all W3C standards, XML Schemas (XSD) went through a lengthy review process on its way to becoming a recommendation. There were many working drafts, and consequently, many XSD files were written based on these evolving standards. During this process, Microsoft created a proprietary schema language called XML-Data Reduced (XDR) that was included with MSXML 3.0. With the release of MSXML 4.0 and later, Microsoft XML Core Services supports the full recommendation of XSD. Many programs for creating schemas did not wait for XSD to become a full recommendation. Older versions of these programs may produce outdated XSD files that the MSXML 5.0 infrastructure, on which InfoPath depends, does not support.
To ensure that an XSD file supports the full XSD recommendation, it should contain the following XML namespace declaration in the <schema> tag:
|
Similar to all XML namespace declarations, the XML prefix (in this case 'xsd') can be any valid prefix string. Some common prefixes you may see in practice are 'xsd', 'xs', and '' (no prefix). MSXML 5.0 usually reports an error about the root not being properly defined if this namespace declaration is missing.
Importing and Including Schemas
XSD schemas are extensible and can import and include other schemas. Generally, you should import a schema if the schema specified in the targetNamespace
attribute differs from the current schema. You should include it if the schema specified in the targetNamespace
attribute is the same as the current schema.
The semantics for importing and including schemas are as follows:
|
If the schemaLocation
attribute is missing (as happens with some converters), then MSXML 5.0 raises an error because it cannot find the file. If you get this error, also check to make sure that the resource or location specified in the schemaLocation
attribute is accessible by users of the form template. Obviously, errors occur if the schemaLocation
attribute references a server or directory that is down or nonexistent or if users do not have access permissions. Also, be sure to examine all imported and included schemas to make sure they are valid.
Note |
---|
Errors caused by problems with the schemaLocation attribute are an issue only when InfoPath first imports the schemas; that is, when you first start designing a form based on an existing schema. After that, InfoPath works with cached versions of the schema files that are stored in the form template. |
An empty namespace attribute is allowed when importing a schema, if that schema does not specify a targetNamespace
attribute. In general, the namespace on the import must match the targetNamespace
specified in the schema that you import.
Nondeterministic Schemas
The MSXML 5.0 infrastructure that InfoPath depends upon can reliably detect and raise errors to alert you to nondeterministic schemas, but the resultant error message does not provide a line number to tell you which part of the schema is raising the error. This section discusses why it is important for XSD schema files to be deterministic and what it means to be nondeterministic. It also shows some common errors to avoid.
XSD schemas exist for the purpose of validating XML data structure and type semantics. To accomplish this task, the validating system (in this case, MSXML 5.0) must map XML nodes to XSD declarations. Without this mapping, the validating system cannot accomplish its task. If a mapping can be guaranteed, then the schema is deterministic. If there is a single XML instance that makes this mapping impossible, then the schema is nondeterministic.
The following example schema is nondeterministic:
|
To illustrate why this XSD segment is nondeterministic, assume you have the following XML fragment that you want to validate with this schema:
|
In this XML fragment, it is not clear whether the <file_path> element is the required node from the first part of the choice
declaration or the optional one from the second part of the choice
declaration. This distinction is important for the following reasons:
- If the XML fragment is validated against the first part of the
choice
declaration, then the XML is valid against the schema. - If the XML fragment is validated against the second part of the
choice
declaration, then the schema is not valid, because the required <URI> node is missing.
Some XSD validation systems err toward validating against this schema because there is a valid path. MSXML 5.0 is stricter and raises an error stating that the schema is nondeterministic.
What follows are a few more examples of nondeterministic schemas. The first deals with optional elements. These cases often arise from XDR to XSD converters because of differences in the default cardinalities in the two schema languages. The first case to consider is optional elements declared with xsd:choice
and xsd:sequence
elements. Optional elements declared in an xsd:sequence
element usually validate properly, as long as you do not have elements with the same name more than once, with only optional elements in between. For example:
|
To understand why this schema segment is nondeterministic, assume you have the following invalid XML fragment:
|
Looking at this fragment, you can see why it is invalid: there are two <aNode>
elements before the <anotherNode>
element, when only one is allowed.
Now assume that you have the following XML instance to validate:
|
The challenge is to determine whether this instance is valid. Do you have two <aNode>
elements where only one is allowed, or do you have an <aNode> element where it is allowed and another where it is allowed? The schema is nondeterministic because there is no way to know.
Similarly, optional elements declared in an xsd:choice
element are usually problematic. In the following simplified example, there is no way to determine whether the choice occurred once with the optional element not being there or whether it never occurred at all.
|
The final questionable practice is using an xsd:any
element without a namespace definition, as in <xsd:any namespace="##other"/>
, after an xsd:sequence
element. This construct is especially troublesome when it follows an optional element. If you revisit the previous example and change just the last node to an xsd:any
element, you can see that all the previous arguments about nondeterminism still apply, as follows:
|
Illegal Enumeration Values
XSD schemas typically do not perform any type validation until you validate an actual instance document. An exception to this is when you have an enumeration in your schema. In this case, the schema validates the enumeration values against the enumeration types to ensure they are proper node values. Here are two examples:
|
This schema is invalid because "eleven o'clock" is not a valid value for an element of type xsd:time
.
The following is a more complex example:
|
To understand why this example is invalid, you must understand how the type xsd:NMTOKEN
is defined. The W3C datatypes specification defines the NMTOKEN
type as follows: "An NMTOKEN (name token) is any mixture of name characters."
If you investigate further, you find that '&' is not a valid name character, and therefore "M&Ms" does not validate as an NMTOKEN
type.
Empty Sequence or Choice Elements
MSXML 5.0 sometimes raises errors about schema declarations that contain empty xsd:choice
or xsd:sequence
elements, as shown in the following example:
|
Removing the empty <xsd:choice /> tag should fix this problem.
Regular Expressions
MSXML 5.0 can have problems validating regular expression patterns on load. Regular expressions can be complicated, and you should be careful when using them. Every XSD parser seems to have flexible regular expression languages; that is, they implement the official XSD regular expression language plus elements from other regular expression languages. If InfoPath design mode has a problem parsing a regular expression, then the sample data InfoPath generates might be invalid or might not be generated at all. This is acceptable at design time, because InfoPath uses only sample data for formatting. However, if you use a regular expression that MSXML 5.0 does not support, then InfoPath cannot validate a value against it when a user is filling out a form in edit mode. XML Schema Part 0: Primer Second Edition describes what is supported in XSD regular expressions. For more information about XSD regular expressions and Unicode level 1 regular expressions, see the Unicode Home Page.
targetNamespace Attribute Issues
XSD is interesting in that, by default, the targetNamespace
attribute refers to only the top-level declarations, although you can set attributeFormDefault=qualified
and elementFormDefault=qualified
to override this default behavior. As an example, assume you have the following XSD:
|
Your XML instance document looks like the following:
|
Local definitions do not require the target namespace because qualification is turned off by default. However, if you change your local definition to be global, then your reference must be qualified with the namespace prefix. For example, the following schema is invalid:
|
This schema is invalid because "global" is in the namespace "http://ns". The simple ref="global"
is not recognized because the default namespace is not "http://ns". To fix this, you must add a prefix for the target namespace and use that for all global references and type uses. The corrected schema looks like the following:
|
If your schema has the targetNamespace
attribute specified, ensure that all global references are qualified with the correct namespace prefix.
XML Processing Instruction Encoding (Unicode vs. ANSII)
XML supports only Unicode character sets, so you may lose information if you save files that use ANSII characters. However, saving files as UTF-16 may be excessive for your particular use. To reduce the implementation cost of an XML reader, the XML author must state which encoding they are using in the top-level XML processing instruction. You may recognize something that looks like this:
|
This processing instruction tag specifies that the encoding of the file is UTF-8. You must ensure that the file encoding is the same as the encoding stated in the processing instruction tag. You can determine the encoding by looking at the bytes of the file and looking for the Unicode byte order marks, but there is an easier way. If you are having problems opening an XSD schema, specify the encoding as "UTF-8", open it in a text editor such as Notepad, and then save the file using UTF-8 encoding (Notepad provides the Encoding drop-down list in the Save As dialog box). If you still have problems opening the file, it is not an encoding issue.
maxOccurs Attribute Inside the xsd:all Element
Due to the way nondeterminism is defined in the XML Schema recommendation, the only legal value for the maxOccurs
attribute of an xsd:element
element inside of an xsd:all
element is 1. For example, the following is valid:
|
However, this example is not valid:
|
This example is invalid because the validation system cannot determine whether two occurrences of <x/>
map to the single declaration or to the declaration and another illegal definition. Along the same lines, you cannot have two elements of the same name in an <xsd:all>
tag.
This example is also interesting because it allows you to have any number of <x/>
and <docs/>
nodes inside a containing element in any order. Although this construct is illegal, there is a workaround. By using the xsd:choice
element, you can achieve the same result, as demonstrated in the following example:
|
How to Edit or Author an XSD for InfoPath
The two examples in this section demonstrate how to edit or author a schema to produce specific results in InfoPath.
Allowing User-defined Elements to Be Inserted in the Data Source Task Pane
To allow user-defined elements to appear under a parent element in the Data Source task pane, you must insert an xsd:any element under the parent element. To allow user-defined elements to be inserted inside <your_node_name>, the XSD declaration should look something like the following:
|
If you want to allow user-defined attributes as well, then you need to add <xsd:anyAttribute namespace="##any | ##other"/>
to the element declaration.
Allowing Rich Text Elements to be Bound in InfoPath Design and Edit Modes
If you want to declare an element that can be bound to a Rich Text Box control, then it should have the following form, which includes the xsd:any
element with a namespace
attribute set to "http://www.w3.org/1999/xhtml"
.
|
Conclusion
By taking advantage of InfoPath support for designing XML form solutions based on externally authored XML Schema (.xsd) files, you can create a form template that works with an industry-standard schema or custom schema created by your company or organization. With the information in this article, you can create custom XSD schema files that are compatible with InfoPath, and you can troubleshoot common issues that you may encounter when loading externally authored XSD files into the InfoPath design environment.