Validating the XML data against the mapping schema in SQLXML Bulkload

 

Summary:

Learn how to validate the XML data against the annotated XSD schema when using SQLXML Bulkload, with the help of System.Xml classes.

Introduction :

SQLXML Bulkload functionality available in Microsoft SQL Server 2005 allows the user to shred large amounts of xml data into the database using an annotated xsd schema. There are some known limitations regarding the XML data consumptions and this article will try to address one of them and will offer a solution to overcome it.

This article covers the following:

· Validating XML data with an XSD schema

· Using Bulkload in a .NET application

The problem:

In general, all element and attribute data which is contained in the body of an XML document is consumed and mapped to corresponding tables and columns and stored in the database.

However, one of the limitations of Bulkload is that the data validation is not performed during the bulk loading process. Here are some facts about Bulkload that need to be understood:

1. Bulkload doesn’t ensure that the XML data is valid with respect to any DTD that is defined or referenced within the xml data. It will only check if it is well-formed. Any error of this kind will be reported and no data will be inserted.

2. Bulkload does not make any attempt to validate the data with respect to the supplied annotated xsd schema.

Considering the above statements, if there is any mismatch between the data file and the schema, the bulkload operation will result in no data being inserted in the database and no error will be reported. Thus, users found it very hard to identify what the problem is.

Usually this problem happens due to one of the following:

1. The schema file contains the correct mappings but the data file has a small difference (like different casing for element/attribute name).

Example:

Schema has the following:

<xsd:attribute name=”CustomerID” type=”xsd:string”/>

Data has the follwing:

<Customers Customerid=”ALFKI”/>

Notice the difference marked in red.

2. There are elements/attributes in the data that are not defined in the schema.

To overcome this problem, the user can validate the xml data against the annotated xsd schema before doing the bulk load operation. This can be done using the XmlReader and XmlReaderSettings classes from System.Xml namespace.

Here is an example on how to do this, using bulkload functionality in a .NET application:

using System;

using System.Collections.Generic;

using System.Text;

using System.Runtime.InteropServices;

using System.Xml;

using System.Xml.Schema;

using System.IO;

using SQLXMLBULKLOADLib;

namespace Test

{

    class Program

    {

       public static bool bValid = true;

        [STAThread]

        static void Main(string[] args)

        {

            SQLXMLBulkLoad4Class objBL = new SQLXMLBULKLOADLib.SQLXMLBulkLoad4Class();

            objBL.ConnectionString = "provider=sqloledb;server=server;database=database;integrated security=SSPI";

            objBL.ErrorLogFile = "error.xml";

            objBL.SchemaGen = true;

           if (IsValid("data.xml", "schema.xml"))

            {

                try

                {

                  objBL.Execute("schema.xml", "data.xml");

                }

                catch (Exception e)

                {

                    Console.WriteLine(e.ToString());

                }

            }

            else

            {

                Console.WriteLine("XML data is not valid according to the schema definition!");

            }

        }

        public static void MyValidationCallBack(object sender, ValidationEventArgs args)

        {

            bValid = false;

            Console.WriteLine("The following error occured : " + args.Message);

        }

        public static bool IsValid(string strXML, string strXSD)

        {

            //Validate the xml data against the xsd schema

            XmlReader reader = null;

            XmlReaderSettings settings = null;

            try

            {

                settings = new XmlReaderSettings();

                settings.Schemas.Add("", "schema.xml");

                settings.ValidationEventHandler += new ValidationEventHandler(MyValidationCallBack);

                settings.ValidationType = ValidationType.Schema;

                settings.ValidationFlags = XmlSchemaValidationFlags.None;

                settings.ConformanceLevel = ConformanceLevel.Document;

                reader = XmlReader.Create("data.xml", settings);

                while (reader.Read()) { }

            }

            finally

            {

                reader.Close();

            }

            return bValid;

        }

    }

}

Conclusion:

This article and the included code sample present a solution to a very common problem customers encounter when using SQLXML Bulkload feature. For more details on Bulkload functionality, please visit the following links:

 

SQL Server Bulk Load Object Model:

https://msdn.microsoft.com/library/en-us/sqlxml3/htm/bulkload_3g30.asp

 

Guidelines and Limitations of XML Bulk Load:

https://msdn.microsoft.com/library/default.asp?url=/library/en-us/sqlxml3/htm/bulkload_5l44.asp

 

Examples of Bulk Loading XML Documents:

https://msdn.microsoft.com/library/en-us/sqlxml3/htm/bulkload_6bos.asp

 

Applies to:
Microsoft® SQL Server™ 2005
Microsoft Visual Studio® .NET