Hello @Ridham and welcome to Microsoft Q&A.
As I understand it, there are two asks here:
- How to flatten the data and still capture optional fields that may not appear in the result of "import schema"
- How to use the XSD to inform the schema mapping
I haven't worked with XSD before. From what I am learning, it basically IS the schema definition. Both the Copy activity and Data Flow have options to validate the XML data against an XSD or DTD of your choice, so the service clearly does read the XSD in some way. It would make sense to have a feature that imports the schema from the XSD used for validation, but since you are asking this question, I'm guessing "import schema" does not use it. It is also possible that "Allow Schema Drift" causes "import schema" to ignore the XSD and look at the data instead.
Anyway, on to the useful information. Using rule-based mapping in the flatten transformation lets you catch optional fields and fields that are not in the defined schema.
Here is my working example:
Input Data:
<?xml version='1.0'?>
<root>
  <items>
    <id>1</id>
    <name>first</name>
  </items>
  <items>
    <id>2</id>
    <name>second</name>
  </items>
  <items>
    <id>3</id>
    <name>optional</name>
    <additional>true</additional>
  </items>
  <items>
    <id>4</id>
    <name>fourth</name>
  </items>
</root>
Data Flow script. Note that in the second line, where the source schema is defined, the field "additional" is missing.
source(output(
        root as (items as (id as short, name as string)[])
    ),
    allowSchemaDrift: true,
    validateSchema: false,
    ignoreNoFilesFound: false,
    format: 'xml',
    fileSystem: 'data',
    folderPath: 'input',
    fileName: 'additional.xml',
    validationMode: 'none',
    namespaces: false) ~> source1
source1 foldDown(unroll(root.items),
    mapColumn(
        every(match(true()))
    ),
    skipDuplicateMapInputs: false,
    skipDuplicateMapOutputs: false) ~> flatten1
flatten1 sink(allowSchemaDrift: true,
    validateSchema: false,
    format: 'delimited',
    fileSystem: 'data',
    folderPath: 'out',
    columnDelimiter: ',',
    escapeChar: '\\',
    quoteChar: '\"',
    columnNamesAsHeader: true,
    umask: 0022,
    preCommands: [],
    postCommands: [],
    skipDuplicateMapInputs: true,
    skipDuplicateMapOutputs: true) ~> sink1
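If you want to sanity-check which columns the flattened output should contain, here is a rough Python sketch of the same idea outside of Data Factory: it collects every child element found under each items node instead of relying on a fixed schema. This is only an illustration and not what Data Flow runs internally. The function name flatten_items and the row_tag parameter are my own, and it assumes each items element has only simple, non-nested children.

```python
import xml.etree.ElementTree as ET

# Same sample as the input data above.
SAMPLE_XML = """<?xml version='1.0'?>
<root>
  <items><id>1</id><name>first</name></items>
  <items><id>2</id><name>second</name></items>
  <items><id>3</id><name>optional</name><additional>true</additional></items>
  <items><id>4</id><name>fourth</name></items>
</root>"""


def flatten_items(xml_text, row_tag="items"):
    """Return (columns, rows) for every <row_tag> element in the document.

    Hypothetical helper: columns are discovered from the data itself,
    so optional fields are picked up even if no schema mentions them.
    """
    root = ET.fromstring(xml_text)

    # One dict per repeated element, keyed by child tag name.
    records = [
        {child.tag: child.text for child in item}
        for item in root.iter(row_tag)
    ]

    # Union of all field names, in order of first appearance.
    columns = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)

    # Rows missing an optional field get None in that position.
    rows = [[record.get(col) for col in columns] for record in records]
    return columns, rows


columns, rows = flatten_items(SAMPLE_XML)
```

With the sample input, columns ends up holding id, name and additional, and only the third row has a value for additional; the other rows carry an empty cell there.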