The data is a little inconsistent. For example, look at records 10 to 13. If we want to sum the values of column 12, what should happen for the values of columns 8, 9 which are different?
Usually the values of column 8,9 in records 11,12 should be empty (like column 7) or should be exactly the same as values of the record 10.
Anyways, in this answer, to make is simple to understand the rules, I use a smaller table of data:
- We have 5 columns, and we only want 1, 4, and 5
- Column 1 is the key of grouping. So if value of column 1 is empty, in a row, it will use the value of first row in group.
- Column 2, we ignore it's data, and will not be included in result.
- Column 3 is our filter column, with GOOD and BAD as value, and we only are interested in GOOD values. We also will not include it in result.
- Column 4, is wanted for result, and if it belongs to a row in a group, it will use the value of the column from first row in group(because column 1 of this row is of course empty)
- Column 5, is our summary column, so after reducing the results, we calculate sum of the values of this column, in each group
To keep the answer focused on the main issue, like the previous answer, I only focus on processing, filtering, and reducing data, so I start with a DataTable, assuming you have read it from your file, and I'll produce a DataTable assuming you will later write it to a file.
That said, here is the code that does the trick for us:
using System;
using System.Data;
using System.Linq;
namespace SampleConsoleApp
{
internal class Program
{
static void Main(string[] args)
{
var input = new DataTable();
input.Columns.Add("C1", typeof(string)); //Our key column, in output
input.Columns.Add("C2", typeof(string)); //not in output
input.Columns.Add("C3", typeof(string)); //Our Filter column, not in output
input.Columns.Add("C4", typeof(string)); //in output, so follows value of first row in group
input.Columns.Add("C5", typeof(double)); //in output, The summary column
input.Rows.Add("A", "AA", "GOOD", "AAAA", 100);
input.Rows.Add("", "xyz", "GOOD", "xyz", 100);
input.Rows.Add("", "xyz", "BAD", "xyz", 100);
input.Rows.Add("", "xyz", "GOOD", "xyz", 100);
input.Rows.Add("B", "BB", "BAD", "BBBB", 100);
input.Rows.Add("", "xyz", "GOOD", "xyz", 100);
input.Rows.Add("", "xyz", "GOOD", "xyz", 100);
input.Rows.Add("C", "xyz", "GOOD", "CCCC", 100);
//Step 1 - Fix the data,
//For all the columns which are not part of filter
//Or are not in sum, copy value of the key row to next rows of group
//Here, column 1, 2, 4
for (int i = 0; i < input.Rows.Count; i++)
{
//1st column: Copy value of prevuous row, if the value is empty
if (string.IsNullOrEmpty(input.Rows[i].Field<string>("C1")))
{
input.Rows[i]["C1"] = input.Rows[i - 1]["C1"];
//2nd column, don't touch we do not need
//3rd column, don't touch, we will use it in filter later
//4th column, treat it like 1st column
input.Rows[i]["C4"] = input.Rows[i - 1]["C4"];
//5th column, don't touch, we will sum it later
}
}
//Step 2 - Create data structure of output
var output = new DataTable();
output.Columns.Add("C1", typeof(string));
output.Columns.Add("C4", typeof(string));
output.Columns.Add("C5", typeof(double));
//Step 2 - Filter, Group by and Sum
input.Rows.Cast<DataRow>()
.Where(row => row.Field<string>("C3") == "GOOD") /
.GroupBy(
x => new { C1 = x["C1"], C4 = x["C4"] },
x => new { C5 = x.Field<double>("C5") })
.Select(g => new
{
C1 = g.Key.C1,
C4 = g.Key.C4,
C5 = g.Sum(x => x.C5)
})
.ToList()
.ForEach(x =>
{
output.Rows.Add(x.C1, x.C4, x.C5);
});
foreach (DataRow item in output.Rows)
{
Console.WriteLine($"C1:{item["C1"]}, C4:{item["C4"]}, C5:{item["C5"]}");
}
Console.ReadLine();
}
}
}
And here is the result:
C1:A, C4:AAAA, C5:300
C1:B, C4:BBBB, C5:200
C1:C, C4:CCCC, C5:100