Data Factory XML validation (Copy Activity) throwing User Configuration error

Partha Das 276 Reputation points
2023-02-28T13:48:21.4066667+00:00

Hi,

I'm using the Azure Data Factory Copy activity to validate an XML file against an XSD file.

I mainly followed this post: "https://techcommunity.microsoft.com/t5/azure-data-factory-blog/azure-data-factory-adds-support-for-xml-format/bc-p/1533021"

I created a sample XML file and generated the XSD using Visual Studio. I have attached both files (I changed the extension of the XSD to .txt for the sake of uploading).

books.xml books.txt

Here is my source configuration:

SourceConfig

The XML and XSD files are stored in the same blob storage:

BlobContainer

I get the error below when running the pipeline:

Error

I can't figure out where I'm making the mistake.

Please help.

Also, please let me know what the configuration for the sink should be. My requirement is to validate the XML and write the errors (if any) to a CSV file.

Regards,

Partha

Azure Data Factory

Accepted answer
  1. KranthiPakala-MSFT 46,422 Reputation points Microsoft Employee
    2023-03-03T01:22:25.17+00:00

    Hi @Partha Das ,

    Welcome to Microsoft Q&A forum and thanks for reaching out here.

    As I understand it, you are trying to validate your XML file against an XSD and then copy the data to your desired sink location. Please correct me if I have misunderstood.

    Looking at the error message, I noticed that schema validation is failing. I then opened the XML file attached to your question and found that it has no reference to your XSD, which is why the validation fails. Please see below (the left side is your file; the right side is my updated version that references the XSD):

    User's image

    Resolution: Please add the XSD reference to your XML file so that the schema can be validated. For testing, I placed the XSD in the same location as the XML file. After adding the XSD reference, your XML file will look like this:

    <?xml version="1.0"?>
    <books xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
           xsi:noNamespaceSchemaLocation="books.xsd">
       <book id="bk001">
          <author>Writer</author>
          <title>The First Book</title>
          <genre>Fiction</genre>
          <price>44.95</price>
          <pub_date>2000-10-01</pub_date>
          <review>An amazing story of nothing.</review>
       </book>
       <book id="bk002">
          <author>Poet</author>
          <title>The Poet's First Poem</title>
          <genre>Poem</genre>
          <price>24.95</price>
          <review>Least poetic poems.</review>
       </book>
    </books>
    

    This change should resolve the error.
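As a quick local pre-check before running the pipeline, the following is a minimal Python sketch that only tests whether the root element of an XML document carries an XSD reference. It does not perform XSD validation itself and is not part of Data Factory; the function name `find_schema_reference` is a hypothetical helper introduced for illustration, and it uses only the Python standard library.

```python
import xml.etree.ElementTree as ET

# Namespace that defines the xsi:noNamespaceSchemaLocation attribute.
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"


def find_schema_reference(xml_text):
    """Return the XSD reference declared on the root element, or None.

    Hypothetical helper: it only inspects the root element's attributes
    and does not validate the document against the schema.
    """
    root = ET.fromstring(xml_text)
    return (
        root.get(f"{{{XSI_NS}}}noNamespaceSchemaLocation")
        or root.get(f"{{{XSI_NS}}}schemaLocation")
    )


# Root element as in the corrected file above (book contents omitted).
with_reference = (
    '<books xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    'xsi:noNamespaceSchemaLocation="books.xsd">'
    '<book id="bk001"/></books>'
)
# Root element without any schema reference, as in the original file.
without_reference = '<books><book id="bk001"/></books>'

print(find_schema_reference(with_reference))
print(find_schema_reference(without_reference))
```

If the function returns `None` for a file, that file has no schema reference on its root element, which matches the situation described above.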

    For your second question: in the Mapping section of the Copy activity, first set the collection reference to $['books']['book'], then click Import schemas. This imports the columns as shown below. Map them to your desired destination columns and run the pipeline.

    User's image

    For testing, I copied the data to a blob as a .txt file. A sample of the output looks like this:

    User's image
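To illustrate the idea behind the collection reference, here is a rough Python sketch, assuming the books.xml structure shown earlier. It treats each `book` element under `books` as one row and writes the rows as CSV text. This is not how Data Factory implements the mapping, and the exact output of the Copy activity may differ; `flatten_books`, `rows_to_csv`, and the `COLUMNS` list are hypothetical names chosen for this example.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Assumed column list, taken from the elements in the sample books.xml.
COLUMNS = ["id", "author", "title", "genre", "price", "pub_date", "review"]


def flatten_books(xml_text):
    """Turn each <book> under the root <books> element into one row (dict)."""
    root = ET.fromstring(xml_text)
    rows = []
    for book in root.findall("book"):
        row = {"id": book.get("id", "")}
        for name in COLUMNS[1:]:
            # A missing child element (e.g. pub_date) becomes an empty string.
            row[name] = book.findtext(name, default="")
        rows.append(row)
    return rows


def rows_to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


sample = (
    "<books>"
    '<book id="bk001"><author>Writer</author><title>The First Book</title>'
    "<genre>Fiction</genre><price>44.95</price>"
    "<pub_date>2000-10-01</pub_date>"
    "<review>An amazing story of nothing.</review></book>"
    '<book id="bk002"><author>Poet</author><title>The Poet\'s First Poem</title>'
    "<genre>Poem</genre><price>24.95</price>"
    "<review>Least poetic poems.</review></book>"
    "</books>"
)

rows = flatten_books(sample)
print(rows_to_csv(rows))
```

The second book has no `pub_date` element, so that column is left empty in its row rather than shifting the other values.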

    Hope this helps.


    Please don’t forget to click Accept Answer and Yes for "was this answer helpful" wherever the information provided helps you. This can be beneficial to other community members.

