CSV how can you remove all data that does not contain artist info or song name

asked 2021-04-15T22:58:22.21+00:00
mion shion 221 Reputation points

Hello all i got a list of 172,500 lines of code from a csv that contains the artist name and song name,

but among all this, there is a lot of data i don't want i just want artist name and song name,

all the rest of the stuff i want gone but have no idea how to do this trying to use notepadd++ and used regex to try and do this but to no avail,

the file is here,
5Ub1mTxg

the issue i am having is i have no idea how to get just the name and artist there that is only a snippet as file is got a lot of data in any help would be much appriated
elfenliedtopfan5

C#
C#
An object-oriented and type-safe programming language that has its roots in the C family of languages and includes support for component-oriented programming.
6,927 questions
No comments
{count} votes

3 answers

Sort by: Most helpful
  1. answered 2021-04-16T00:13:24.427+00:00
    Ken Tucker 5,796 Reputation points

    You could use the open source Nuget package CSVHelper to get the data from the file.

      class Program
       {
           static void Main(string[] args)
           {
                var config = new CsvConfiguration(CultureInfo.InvariantCulture)
                {
                      HasHeaderRecord = false,
                };
                List<SongData> records = null;
                using (var reader = new StreamReader("Csv File.csv"))
                {
                       using (var csv = new CsvReader(reader, config))
                       {
                              records = csv.GetRecords<SongData>().ToList();
                        }
                 }
                 foreach(var song in records)
                 {
                        Console.WriteLine($"{song.Artist} - {song.SongName}");
                 }
             }
       }
    
    
    public class SongData
    {
    
            [Index(1)]
            public string Artist { get; set; }
    
            [Index(12)]
            public string SongName { get; set; }
    
    }
    

    Then you could write the data to a new file. There are examples on writing a new file in the CSVHelper link above.

    No comments

  2. answered 2021-04-16T00:34:38.883+00:00
    Duane Arnold 3,211 Reputation points

    @mion shion

    You could use Linq to read the CVS file.

    http://shortfastcode.blogspot.com/2011/12/using-linq-to-read-csv-file.html

    You can change the code to project an anonymous type list using Linq.

    https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/classes-and-structs/anonymous-types

    Projecting Anonymous Types is section in below article

    http://csharp-station.com/Tutorial/Linq/Lesson02

    Then you can go into a foreach loop on the anonymous type list of objects and pick out what elements from each item/object's properties in the list you want to write to a new CVS file or whatever else you need to do.

    No comments

  3. answered 2021-04-16T00:36:59.69+00:00
    Karen Payne MVP 29,226 Reputation points Microsoft MVP

    If you have SQL-Server Express edition you can perform a bulk insert, use a SQL SELECT WHERE to get at the data you want them empty the table.

    Using a connection and command object example for the SQL

    BEGIN TRANSACTION
    BEGIN TRY
        BULK INSERT dbo.BulkDataTable
        FROM 'C:\SomePath\SomeFile.csv'
        WITH
        (
            FIELDTERMINATOR = ',',
            ROWTERMINATOR = '\n',
            ROWS_PER_BATCH = 10000, 
            TABLOCK
        )
    
        COMMIT TRANSACTION
    
    END TRY
    BEGIN CATCH
        ROLLBACK TRANSACTION
    END CATCH
    

    Sample export code

    public static void ToCommaDelimited()
    {
        var selectStatement = "TODO";
    
        using var cn = new SqlConnection("TODO");
        using var cmd = new SqlCommand { Connection = cn, CommandText = selectStatement };
    
        cn.Open();
    
        var reader = cmd.ExecuteReader();
    
        var fileName = "C:\\SomePath\\test.csv";
        var streamWriter = new StreamWriter(fileName);
    
        var output = new object[reader.FieldCount];
    
        for (int index = 0; index < reader.FieldCount; index++)
        {
            output[index] = reader.GetName(index);
        }
    
        streamWriter.WriteLine(string.Join(",", output));
    
        while (reader.Read())
        {
            reader.GetValues(output);
            streamWriter.WriteLine(string.Join(",", output));
        }
    }
    
    No comments