SSIS CSV Loading issue

Question

I am looking for a C# solution that uses a Script Task in SSIS to load multiple CSV files into SQL tables.
If I use a Data Flow Task, I have to create one DFT per file, because each file has different metadata (unless there is some other way, such as a dynamic DFT?).

So I checked the following solutions
http://www.techbrothersit.com/2016/03/how-to-create-tables-dynamically-from.html
http://www.techbrothersit.com/2016/04/how-to-load-flat-files-to-sql-server.html
One of these solutions expects the target table to already exist, while the other creates the table on the fly (taking the columns from the first line of the CSV), which is fine as well.

Issue:
My CSV files use a comma as the delimiter, and some column values contain commas themselves.

For example, some columns hold numbers with thousands separators, such as 100,000.
Other columns hold a person's name, which can contain an apostrophe, such as D'Souza.

The scripts above fail because they treat every comma as a field separator, so an error is thrown saying that the count of values doesn't match the number of columns.
Otherwise the apostrophe character causes an invalid-data error.

eg here's the structure of one csv file ( check lines 2,3)

Thanks in advance.

Accepted Answer

The borrowed solutions use simple text-reading operations that are not suitable for CSV files that contain commas or quotation marks within values, or multiline values.
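
To illustrate why a plain split on commas breaks, here is a minimal sketch. The class and method names (NaiveSplitDemo, CountNaiveFields) are made up for this illustration and do not come from the linked scripts.

```csharp
using System;

public static class NaiveSplitDemo
{
    // Counts the pieces produced by splitting on every comma,
    // ignoring quotation marks entirely (hypothetical helper).
    public static int CountNaiveFields(string line)
    {
        return line.Split(',').Length;
    }
}
```

For a line such as `1,"100,000",D'Souza`, which has three columns, CountNaiveFields reports four pieces, because the comma inside the quoted value is treated as a separator.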

If you want to process CSV files yourself, consider a class designed for this purpose: TextFieldParser [https://learn.microsoft.com/en-us/dotnet/api/microsoft.visualbasic.fileio.textfieldparser?view=netframework-4.8]. It can be used from C#, but requires a reference to the Microsoft.VisualBasic assembly.

You can find many samples, for example:

using Microsoft.VisualBasic.FileIO;
// . . .
using (var p = new TextFieldParser(@"C:\myfile.csv")
{
    TextFieldType = FieldType.Delimited,
    Delimiters = new[] { "," },
    HasFieldsEnclosedInQuotes = true
})
{
    while (!p.EndOfData)
    {
        // read and split one line of fields
        string[] fields = p.ReadFields();

        // each fields[i] is a value of a column
        // . . .
    }
}

It is not difficult to identify the header row: the first call to ReadFields returns it, and its values are the column names.
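
As a rough sketch of separating the header from the data rows, something like the following could work. The names CsvReaderSketch and ReadRows are hypothetical, and the code assumes that values containing commas are enclosed in double quotes in the file (an unquoted 100,000 cannot be distinguished from two fields).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO; // requires a reference to Microsoft.VisualBasic

public static class CsvReaderSketch
{
    // Reads all rows from a CSV source. The first row is returned
    // through 'header'; the remaining rows are returned as the result.
    public static List<string[]> ReadRows(TextReader reader, out string[] header)
    {
        var rows = new List<string[]>();
        header = null;

        using (var parser = new TextFieldParser(reader))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true;

            // The first row holds the column names.
            if (!parser.EndOfData)
            {
                header = parser.ReadFields();
            }

            while (!parser.EndOfData)
            {
                string[] fields = parser.ReadFields();

                // Fail early if a row does not match the header.
                if (fields.Length != header.Length)
                {
                    throw new InvalidDataException(
                        "Data row " + (rows.Count + 1) + " has " + fields.Length +
                        " fields, expected " + header.Length + ".");
                }

                rows.Add(fields);
            }
        }

        return rows;
    }
}
```

Passing a StringReader (or a StreamReader over a file) keeps the sketch easy to try in a console application. For the two-line input `Name,Amount` followed by `"D'Souza, Brian","100,000"`, the header has two columns and the single data row keeps both embedded commas and the apostrophe intact.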

If you decide to build dynamic SQL, which is not always recommended, replace each apostrophe (as in “Brian D'Souza”) with two apostrophes using the string.Replace method.
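
The apostrophe replacement could look roughly like this. The helper names (SqlLiteral, Escape, BuildInsert) are invented for illustration, and the sketch assumes the table name is trusted and that every column accepts a string value; parameterized commands are generally a safer choice where they are practical.

```csharp
using System;
using System.Linq;

public static class SqlLiteral
{
    // Doubles every apostrophe so the value can sit inside
    // a '...' string literal in T-SQL.
    public static string Escape(string value)
    {
        return value.Replace("'", "''");
    }

    // Builds an INSERT statement from already-split CSV fields.
    // Assumption: tableName is trusted and is not taken from the file.
    public static string BuildInsert(string tableName, string[] fields)
    {
        var literals = fields.Select(f => "N'" + Escape(f) + "'");
        return "INSERT INTO " + tableName +
               " VALUES (" + string.Join(", ", literals) + ");";
    }
}
```

For example, BuildInsert("dbo.People", new[] { "Brian D'Souza", "100,000" }) produces `INSERT INTO dbo.People VALUES (N'Brian D''Souza', N'100,000');`.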

Try and debug it in a separate console application, in Visual Studio, before moving to SSIS.

But also check if SSIS contains other appropriate tasks.
