How to convert an excel file into CSV which has commas(,) in some columns

sujith kumar matharasi 351 Reputation points
2021-04-20T21:11:27.717+00:00

Hi All,

I have an SSIS Script task which basically converts all xlsx files into csv however when a column has a comma value in it such as "bars,cafes" it takes cafes into the next column.

I am using a Comma(,) delimiter in my code which i guess is causing the issue, what would be the correct delimiter to get the entire value in same column?

Below is my script task code :

public void Main()
{
string SourceFolderPath = Dts.Variables["User::SourceFolderPath"].Value.ToString();
string DestinationFolderPath = Dts.Variables["User::DestinationFolderPath"].Value.ToString();
string FileDelimited = Dts.Variables["User::FileDelimited"].Value.ToString();
var directory = new DirectoryInfo(SourceFolderPath);
FileInfo[] files = directory.GetFiles("LC_-_State");
DateTime lastModified = DateTime.MinValue;

        //Declare and initilize variables
        string fileFullPath = "";

        //Get one Book(Excel file at a time)
        foreach (FileInfo file in files)
        {
            if (file.LastWriteTime > lastModified)
            {
                lastModified = file.LastWriteTime;
                string filename = "";
                fileFullPath = SourceFolderPath + "\\" + file.Name;
                filename = file.Name.Replace(".xlsx", "");
                //MessageBox.Show(fileFullPath);

                //Create Excel Connection
                string ConStr;
                string HDR;
                HDR = "YES";
                ConStr = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + fileFullPath + ";Extended Properties=\"Excel 12.0;HDR=" + HDR + ";IMEX=0\"";
                OleDbConnection cnn = new OleDbConnection(ConStr);


                //Get Sheet Name
                cnn.Open();
                DataTable dtSheet = cnn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
                string sheetname;
                sheetname = "";
                foreach (DataRow drSheet in dtSheet.Rows)
                {
                    if (drSheet["TABLE_NAME"].ToString().Contains("$"))
                    {
                        sheetname = drSheet["TABLE_NAME"].ToString();
                        //Display Sheet Name , you can comment it out
                        // MessageBox.Show(sheetname);

                        //Load the DataTable with Sheet Data
                        OleDbCommand oconn = new OleDbCommand("select * from [" + sheetname + "]", cnn);
                        //cnn.Open();
                        OleDbDataAdapter adp = new OleDbDataAdapter(oconn);
                        DataTable dt = new DataTable();
                        adp.Fill(dt);

                        //drop $from sheet name
                        sheetname = sheetname.Replace("$", "");

                        //Create CSV File and load data to it from Sheet
                        StreamWriter sw = new StreamWriter(DestinationFolderPath + "\\" + filename + ".csv", false);
                        int ColumnCount = dt.Columns.Count;

                        // Write the Header Row to File
                        for (int i = 0; i < ColumnCount; i++)
                        {
                            sw.Write(dt.Columns[i]);
                            if (i < ColumnCount - 1)
                            {
                                sw.Write(FileDelimited);
                            }
                        }
                        sw.Write(sw.NewLine);

                        // Write All Rows to the File
                        foreach (DataRow dr in dt.Rows)
                        {
                            for (int i = 0; i < ColumnCount; i++)
                            {
                                if (!Convert.IsDBNull(dr[i]))
                                {
                                    sw.Write(dr[i].ToString());
                                }
                                if (i < ColumnCount - 1)
                                {
                                    sw.Write(FileDelimited);
                                }
                            }
                            sw.Write(sw.NewLine);
                        }
                        sw.Close();
                    }
                }
            }
        }
        Dts.TaskResult = (int)ScriptResults.Success;

}

Thanks,
Sujith

SQL Server Integration Services
SQL Server Integration Services
A Microsoft platform for building enterprise-level data integration and data transformations solutions.
1,850 questions
No comments
{count} votes

2 answers

Sort by: Most helpful
  1. Olaf Helper 25,726 Reputation points
    2021-04-21T06:33:11.947+00:00

    "bars,cafes" it takes cafes into the next column.

    Enclose the text in quotation marks, that's the common way how to handle comma in text in CSV files. Every ETL tool is aware of this.

    what would be the correct delimiter to get the entire value in same column

    Use a character which never appears in a common text field, for example the tabulator character = TSV file.

    No comments

  2. Monalv-MSFT 5,686 Reputation points
    2021-04-21T07:24:28.59+00:00

    Hi @sujith kumar matharasi ,

    Please choose the delimiter as which in the following picture besides Comma(,) .

    89773-delimiter.png

    We can refer to Flat File Connection Manager.

    Best regards,
    Mona


    If the answer is helpful, please click "Accept Answer" and upvote it.

    Note: Please follow the steps in our documentation to enable e-mail notifications if you want to receive the related email notification for this thread.