How to exclude an XML file from processing in a data flow based on an attribute value in the XML data

Madugundu Somashekara, Roopa 46 Reputation points
2023-09-20T09:13:37.4433333+00:00

Hi Team,

I have a requirement where we receive multiple XML files in a day. We would like to process only those XML files that contain data; some files may contain no data.
We check a field named 'TotalNbrOfEntries'. If its value is '0', the mapping data flow should not process the file. If the value is anything other than '0', the file should be processed and continue to the next step defined in our flows.
Could you help with how this can be achieved in Azure Data Factory pipelines or Azure Data Factory mapping data flows?

Thanks in advance!

Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

Accepted answer
  1. Subashri Vasudevan 11,226 Reputation points
    2023-09-20T12:13:22.78+00:00

    Madugundu Somashekara, Roopa

    Hi Roopa,

    1. Use a Get Metadata activity to get the list of files.
    2. Use a ForEach loop to iterate over the files.
      1. Use a Lookup activity on each file to get the file content.
        1. Use an If Condition activity to check whether the value of TotalNbrOfEntries is greater than 0.
          1. If TotalNbrOfEntries > 0, run your data flow in the True branch. Leave the False branch empty.

    I tried a similar setup.

    My file:

    <?xml version="1.0"?>
    <books TotalNumberofEntries="2">
    	<book id="bk101">
    		<author>Gambardella, Matthew</author>
    		<title>XML Developer's Guide</title>
    		<genre>Computer</genre>
    	</book>
    	<book id="bk102">
    		<author>Ralls, Kim</author>
    		<title>Midnight Rain</title>
    		<genre>Fantasy</genre>
    	</book>
    </books>
    

    So, under books, I had the TotalNumberofEntries attribute. Based on its value, I decide whether to process the file.

    My pipeline design (you can replace the Set Variable activity with an If Condition activity):

    [Screenshot: pipeline design]

    Get Metadata: gets all files and folders

    Filter: keeps only the XML files

    ForEach: loops over the files to process

    Lookup: reads the file content

    Set Variable: captures the value of the attribute as follows:

    @activity('Lookup1').output.value[0]['books']['@TotalNumberofEntries']
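
The path in the expression above follows from how the XML is turned into a hierarchy: the root element becomes a field, and each attribute becomes a subfield whose name is prefixed with `@`. Below is a minimal Python sketch of that shape, for illustration only. The helper `root_as_lookup_row` is hypothetical and copies just the root attributes, and it keeps the value as a string, which is an assumption (the actual type returned by the Lookup may depend on the dataset's read settings).

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample file from this answer.
SAMPLE = """<?xml version="1.0"?>
<books TotalNumberofEntries="2">
    <book id="bk101"><author>Gambardella, Matthew</author></book>
    <book id="bk102"><author>Ralls, Kim</author></book>
</books>"""


def root_as_lookup_row(xml_text: str) -> dict:
    """Hypothetical helper: mimic one Lookup row, root attributes only."""
    root = ET.fromstring(xml_text)
    # Attributes become subfields named '@<attribute name>'.
    attrs = {"@" + name: value for name, value in root.attrib.items()}
    return {root.tag: attrs}


# Same navigation as output.value[0]['books']['@TotalNumberofEntries'].
lookup_output = {"value": [root_as_lookup_row(SAMPLE)]}
entries = lookup_output["value"][0]["books"]["@TotalNumberofEntries"]
print(entries)
```

If your root element or attribute has a different name (for example TotalNbrOfEntries), the two keys in the expression change accordingly.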
    

    If you are using an If Condition activity, the expression would be as follows:

    @greater(activity('Lookup1').output.value[0]['books']['@TotalNumberofEntries'],0)
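
To sanity-check the gating logic outside Data Factory, the whole flow (list, filter, loop, look up, test) can be simulated locally. This is only a sketch under stated assumptions: the folder is an in-memory dict with made-up file names, `has_entries` and `files_to_process` are hypothetical helpers, and a missing attribute is treated as 0.

```python
import xml.etree.ElementTree as ET


def has_entries(xml_text: str, attribute: str = "TotalNumberofEntries") -> bool:
    """Stand-in for Lookup + If Condition: True when the root attribute is > 0."""
    root = ET.fromstring(xml_text)
    # Assumption: a file without the attribute counts as empty.
    return int(root.get(attribute, "0")) > 0


def files_to_process(folder: dict) -> list:
    """Stand-in for Get Metadata + Filter + ForEach over a folder listing."""
    # Filter step: keep only the XML files.
    xml_names = [name for name in folder if name.lower().endswith(".xml")]
    # ForEach step: keep the files whose entry count is greater than 0.
    return [name for name in xml_names if has_entries(folder[name])]


# Made-up folder content: file name -> file text.
folder = {
    "full.xml": '<books TotalNumberofEntries="2"><book id="bk101"/></books>',
    "empty.xml": '<books TotalNumberofEntries="0"/>',
    "notes.txt": "not an XML file",
}

print(files_to_process(folder))  # ['full.xml']
```

Only the files returned here would reach the data flow in the True branch; the others fall into the empty False branch and are skipped.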
    
    
    

    Hope it helps. Please let us know if you have questions.

    Thanks

