Hi @CzarR ,
*.csv/txt flat files always have host of problems where column delimiters, column separators, invisible characters like null terminators \0, special characters based on encoding, and line breaks are in the middle of the actual data.
The most reliable format for data feeds is XML enforced by XSD. An XSD plays a role of a data contract between sender and receiver. It will guarantee proper data format/shape, and enforce data quality.
XML supports very well optional vs. mandatory data elements via cardinality in XSDs.
- SSIS has XML Source adapter to ingest XML files. XML Source is using XSD to validate XML files.
- SSIS has XML Task, operation Validation.
You can use XML Task, operation Validation to validate XML before loading.
It will produce a detailed report in the following format:
<root xmlns:ns="https://schemas.microsoft.com/xmltools/2002/xmlvalidation">
<metadata>
<result>false</result>
<errors>2</errors>
<warnings>0</warnings>
<startTime>2015-05-28T10:45:09.538</startTime>
<endTime>2015-05-28T10:45:09.558</endTime>
<xmlFile>C:\Temp\TestData.xml</xmlFile>
<xsdFile>C:\Temp\TestSchema.xsd</xsdFile>
</metadata>
<messages>
<error line="5" position="26">The 'ApplicantRole' element is invalid - The value 'wer3' is invalid
according to its datatype 'ApplicantRoleType' - The Enumeration constraint failed.</error>
<error line="16" position="28">The 'Phone' element is invalid - The value 'we3056666666' is invalid
according to its datatype 'phone' - The Pattern constraint failed.</error>
</messages>
</root>
Useful links:
validate-xml-with-the-xml-task
importing-xml-documents-using-sql-server-integration-services