SQL Server 2000 XML Overview
Updated : August 23, 2001
Abstract: In a world where mission-critical data is stored in disparate sources, extensible markup language (XML) provides a powerful mechanism to exchange data among different applications and integrate different systems using a standardized format.
XML has been tightly integrated into the Microsoft® SQL Server™ relational database management system (RDBMS) to help developers build the next generation of Web and enterprise applications. This white paper describes XML features in SQL Server 2000 and how these features compare with those in Oracle.
On This Page
Why Integrate XML with RDBMS?
XML Features in SQL Server 2000
Oracle Competitive Analysis
Why Integrate XML with RDBMS?
XML enables information exchange in the following business scenarios:
Business to Business (B2B)
Integrating supply chains of different vendors.
Business to Consumer (B2C)
Browser-based applications that require data from a database.
Integrating, for example, enterprise resource planning (ERP) and customer relationship management (CRM) software from different vendors.
These business scenarios require the following XML features for information exchange:
Mapping loosely coupled business systems.
SQL Server 2000 provides XML view of the relational data through XML views, XPath queries, and SELECT statements. The mapping from XML data to relational tables is provided using OPENXML and annotated schemas.
Secure HTTP connectivity to the database.
The ability to query, update, and load XML data regardless of its origin.
SQL Server 2000 provides URL, Microsoft OLE DB Provider for SQL Server (SQLOLEDB), and Microsoft ActiveX® Data Objects (ADO) access to the data stored in the SQL Server database.
XML Features in SQL Server 2000
Microsoft® SQL Server™ 2000 is a highly scalable and reliable platform for building XML-based applications. SQL Server 2000 provides these XML features:
New SELECT statement options to retrieve results in XML (FOR XML)
OLE DB and ADO access
Microsoft® SQL Server™ 2000 provides three ways to access the data through HTTP:
SQL statements at the URL
SQL statements and stored procedures can be directly executed from the URL. This allows quick ad hoc access to data in the database.
A template is a valid XML document that contains one or more embedded SQL statements. Like SQL statements, templates can be specified at the URL or, alternatively, in a file. Templates enable SQL data to be retrieved using a URL without exposing the structure of the database to the requester; only the template name is known. This eliminates the need for typing long SQL statements on the URL. SQL Server 2000 supports the following within a template:
SELECT statements and entity references.
Execution of a stored procedure.
Parameter passing and specification of default parameters.
HTML post event integration
HTML form input variables can be programmatically passed as parameters in the template specified in the URL. An Extensible Stylesheet Language (XSL) style sheet can be used to process the output from either access mechanism on the client or on the Web server.
Options Available Using HTTP Access
The HTTP access method supports the following options:
This keyword specifies the content-type of the document returned. text/XML is the default content-type of the document.
This option allows you to process the results of the query using an XSL file.
Security is set per virtual root and permissions are also set on SQL Server. The ISAPI dynamic-link library (DLL) has three authentication options:
Microsoft Windows® or SQL Server login/password set directly on virtual root and used for all users.
Clear text SQL Server login/password is sent over the network when using this option and, therefore, should be used in conjunction with Secure Sockets Layer (SSL).
Uses Windows access control lists (ACLs).
New SELECT Statement Options to Retrieve Results in XML (FOR XML)
You can execute SQL queries against existing relational databases to return results as XML rather than standard rowsets. These queries can be executed directly or from within stored procedures. To retrieve XML results, use the FOR XML clause of the SELECT statement and specify an XML mode of RAW, AUTO, or EXPLICIT.
The shape of the returned XML document is determined by the XML mode specified. The following XML modes can be specified:
The RAW mode takes the query result and transforms each row in the result set into an XML element with a generic identifier row as the element tag and the columns in the SELECT as attributes.
The AUTO mode returns query results in a simple, nested XML tree. Each table in the FROM clause for which at least one column is listed in the SELECT clause is represented as an XML element. The columns listed in the SELECT clause are mapped to the appropriate element's attributes. The columns can be optionally mapped to subelements. The nesting of the elements or hierarchy in the result set is based on the order of tables identified by the columns specified in the SELECT clause. The leftmost table will be the top element. The second leftmost table (identified by columns in the SELECT statement) will be nested within the top element, and so on.
In EXPLICIT mode, you can explicitly define the shape of the resulting XML tree. Using this mode requires that the queries be written in a specific way, so that additional information about the desired nesting is specified explicitly as part of the query.
The following illustration (Figure1) shows the output using a virtual path and an SQL string in the URL in AUTO mode.
Figure 1: Example output using a virtual path and an SQL string in the URL in AUTO mode
Options Available with the XML Modes
XML modes support the following options:
This option returns the schema of the document along with the XML document.
This option is specified in the query to return the binary data in base64-encoded format. In RAW and EXPLICIT mode, retrieving binary data without specifying this option results in an error.
If this option is specified, the columns are returned as subelements, as shown in the following illustration (Figure 2). Otherwise, they are mapped to XML attributes. This option is supported only in AUTO Mode.
Figure 2: Example output using elements option
Microsoft® SQL Server™ 2000 introduces a number of annotations to the XML-Data schema language to provide an XML View on the database. These annotations can be used within the XML-Data schema to specify a two-way XML-to-relational mapping. This includes mapping between elements and attributes in the XML-Data schema to tables and columns in the databases, and vice-versa. By default, an element name in an annotated schema maps to a table (view) with the same name in the specified database and the attribute maps to a column with the same name. These annotations can also be used to specify the hierarchical relationships in XML.
SQL Server 2000 supports a subset of the XPath language in which can be employed in conjunction with the XML View technology described above to obtain XML results from the database.
OPENXML is a Transact-SQL keyword that provides a relational/rowset view over an in-memory XML document. OPENXML is a rowset provider similar to a table or a view. OPENXML provides a way to access XML data within the Transact-SQL context by transferring data from an XML document into the relational tables. Thus, OPENXML allows you to manage an XML document and its interaction with the relational environment.
OPENXML can be used in statements such as SELECT, INSERT, UPDATE, and DELETE. Instead of specifying a source table or a view, OPENXML can be specified.
An XML document can be passed to a stored procedure as text parameter (char, nchar, varchar, nvarchar, text, ntext). The OPENXML statement can be used along with other Transact-SQL statements to provide a rowset view of the XML data so that relational operations such as insert, update, delete, and joins can be performed with an XML document as the source. This can be used to perform multirow, multitable updates to the database using a single stored procedure invocation.
OLE DB and ADO Access
SQLOLEDB has been extended for XML and XPath. A new ICommandStream interface has been added to pass templates to OLE DB for processing. Additionally, the property set for the OLE-DB provider has been expanded to allow the specification of an object supporting IStream for returning the XML results and support XPath queries using XML Views defined with mapping schemas. The stream object was introduced in ADO 2.5 and can now be used together with the newly introduced dialects to send templates and XPath queries to, and receive XML results from, the database.
Oracle Competitive Analysis
XML functionality in SQL Server 2000 and Oracle is similar. The primary difference is that Oracle has built a generic XML framework, and Microsoft has built specific XML support into SQL Server 2000. SQL Server 2000 supports a simple "plug-and-play" declarative mechanism (XML Views) for providing mappings between loosely coupled business systems. For example, support is directly included for retrieving (FOR XML) and storing (OPENXML) XML data. Oracle's XML technology requires complex programming to perform even simple XML tasks. Unlike Oracle's solution, SQL Server 2000 provides native constructs to generate XML documents in different formats without having to do an XSL transformation
SQL Server 2000
Declarative mapping of loosely coupled business systems
No specific features. Must write custom programs.
Native XML extensions to SQL
Transact-SQL FOR XML extension
No specific features. Must write custom programs using the XSQL Utility.
Diverse XML access methods
Secure Web access
Database user-level security
Database user-level security
Options for advanced searching of complex XML documents
Intermedia text search with sectioning features
MSXML COM-enabled parser supporting multiple languages
Oracle parsers supporting multiple languages
Comparison of SQL Server 2000 and Oracle XML Features
Although Oracle provides a generic XML framework, you must write custom programs to use most of its XML functionality. Oracle provides XML text searching capabilities that can be extended to recognize XML tags.
The following SQL Server 2000 and Oracle XML features are compared.
SQL Server 2000 includes technology that provides XML Views of relational data. Currently, custom programs must be written in Oracle to provide mapping.
Native Extensions to the Database
Integrating features into the database can provide higher performance when compared to features implemented externally. The decision to build native SQL Server 2000 XML features was made with these performance considerations in mind. Oracle has no native XML support.
Retrieving Dynamic XML Formats
Oracle queries produce a fixed XML format that is similar to an element-centric RAW format. To get more useful formats, programmers must write XSL transformations or Java code. SQL Server 2000 not only provides a similar model with RAW, but also allows generation of more "valuable" XML directly (using XML Views and FOR XML AUTO and EXPLICIT modes). In addition, SQL Server 2000 provides a stream interface for returning a stream of XML as the result of a query.
Both Oracle and SQL Server 2000 (with Web release) support bulk load. SQL Server has OPENXML for providing rowsets over XML on the server and updategrams (with Web release) for providing updates using XML syntax. Oracle has a Java package to perform similar functionality.
Oracle XSQL Templates vs SQL Server 2000 XML Templates
Oracle uses XML SQL Utility for Java (for formatting data into XML) and XSQL Servlets (for URL access) to execute template files. SQL Server 2000 provides a more integrated solution for the execution of template files by providing native support (FOR XML) for retrieving XML data. Oracle's XSQL template files can be executed only through XSQL Servlets. To execute Oracle templates in any other way requires custom Java programming. SQL Server 2000 templates can be executed either from a URL or directly from an ADO program.
Oracle has an alpha-level demonstration of how to map a subset of XPath into SQL. SQL Server 2000 has a shipping version of an XPath subset.
Full-Text Retrieval and Indexing
SQL Server 2000 provides overflow column semantics for OPENXML that can be used to extract index information from the documents. Oracle offers a similar way to extract data from documents to build indexes. Oracle has a text retrieval component that works with hierarchies (the WITHIN clause). Although Oracle cannot perform correlation, it supports queries over multilevel nestings. The full-text retrieval system in SQL Server 2000 does not currently provide hierarchical indexes and searches the XML in a pure text form. No specific configuration is included for identifying tags and attribute names.
SQL Server 2000 provides rowset semantics as well as XML stream interface from ADO. Oracle supports only rowset semantics.
Scalability and Performance
SQL Server 2000 can push some workload from the middle tier to the server by using OPENXML and FOR XML queries.
Although it is difficult to compare the performance and scalability of these products, SQL Server has an advantage in that it can generate XML directly with the query without having to transform it with XSLT or a Java program.
Oracle has object relational support and uses this feature to provide XML object view of the data stored in the database. SQL Server 2000 has a different model with object support with ADO.
XML Parser, Schema Processor, and XML Development Components
Oracle has Transviewer beans, Schema Processor, and XML parsers for different languages. These features are not related to the database. MSXML and third-party utilities provide the same functionality.
XML Class Generator
Like Oracle Transviewer beans and XML parsers, this feature is not related to database technology.
Microsoft SQL Server 2000 offers XML features specifically designed for Web and database developers. Web developers do not have to learn database programming to make use of SQL Server 2000 XML features because SQL Server 2000 supports standard XML constructs, such as XPath, to allow them to interact with the database. Similarly, database developers do not have to learn XML programming and can use the FOR XML clause to obtain XML query results and OPENXML to manipulate the XML data. Lastly, SQL Server 2000 provides a simple declarative mechanism, XML Views, to define schema mapping for business-to-business transactions.
The information contained in this document represents the current view of Microsoft Corporation on the issues discussed as of the date of publication. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information presented after the date of publication.
This document is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, AS TO THE INFORMATION IN THIS DOCUMENT.
Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in, or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation.
Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, copyrights, or other intellectual property.
© 2000 Microsoft Corporation. All rights reserved.