Grouping Data (C#)
Grouping refers to the operation of putting data into groups so that the elements in each group share a common attribute. The following illustration shows the results of grouping a sequence of characters. The key for each group is the character.
Important
These samples use an System.Collections.Generic.IEnumerable<T> data source. Data sources based on System.Linq.IQueryProvider use System.Linq.IQueryable<T> data sources and expression trees. Expression trees have limitations on the allowed C# syntax. Furthermore, each IQueryProvider
data source, such as EF Core may impose more restrictions. Check the documentation for your data source.
The standard query operator methods that group data elements are listed in the following table.
Method Name | Description | C# Query Expression Syntax | More Information |
---|---|---|---|
GroupBy | Groups elements that share a common attribute. An IGrouping<TKey,TElement> object represents each group. | group … by -or- group … by … into … |
Enumerable.GroupBy Queryable.GroupBy |
ToLookup | Inserts elements into a Lookup<TKey,TElement> (a one-to-many dictionary) based on a key selector function. | Not applicable. | Enumerable.ToLookup |
The following code example uses the group by
clause to group integers in a list according to whether they're even or odd.
List<int> numbers = [35, 44, 200, 84, 3987, 4, 199, 329, 446, 208];
IEnumerable<IGrouping<int, int>> query = from number in numbers
group number by number % 2;
foreach (var group in query)
{
Console.WriteLine(group.Key == 0 ? "\nEven numbers:" : "\nOdd numbers:");
foreach (int i in group)
{
Console.WriteLine(i);
}
}
The equivalent query using method syntax is shown in the following code:
List<int> numbers = [35, 44, 200, 84, 3987, 4, 199, 329, 446, 208];
IEnumerable<IGrouping<int, int>> query = numbers
.GroupBy(number => number % 2);
foreach (var group in query)
{
Console.WriteLine(group.Key == 0 ? "\nEven numbers:" : "\nOdd numbers:");
foreach (int i in group)
{
Console.WriteLine(i);
}
}
Note
The following examples in this article use the common data sources for this area.
Each Student
has a grade level, a primary department, and a series of scores. A Teacher
also has a City
property that identifies the campus where the teacher holds classes. A Department
has a name, and a reference to a Teacher
who serves as the department head.
You can find the example data set in the source repo.
public enum GradeLevel
{
FirstYear = 1,
SecondYear,
ThirdYear,
FourthYear
};
public class Student
{
public required string FirstName { get; init; }
public required string LastName { get; init; }
public required int ID { get; init; }
public required GradeLevel Year { get; init; }
public required List<int> Scores { get; init; }
public required int DepartmentID { get; init; }
}
public class Teacher
{
public required string First { get; init; }
public required string Last { get; init; }
public required int ID { get; init; }
public required string City { get; init; }
}
public class Department
{
public required string Name { get; init; }
public int ID { get; init; }
public required int TeacherID { get; init; }
}
Group query results
Grouping is one of the most powerful capabilities of LINQ. The following examples show how to group data in various ways:
- By a single property.
- By the first letter of a string property.
- By a computed numeric range.
- By Boolean predicate or other expression.
- By a compound key.
In addition, the last two queries project their results into a new anonymous type that contains only the student's first and family name. For more information, see the group clause.
Group by single property example
The following example shows how to group source elements by using a single property of the element as the group key. The key is an enum
, the student's year in school. The grouping operation uses the default equality comparer for the type.
var groupByYearQuery =
from student in students
group student by student.Year into newGroup
orderby newGroup.Key
select newGroup;
foreach (var yearGroup in groupByYearQuery)
{
Console.WriteLine($"Key: {yearGroup.Key}");
foreach (var student in yearGroup)
{
Console.WriteLine($"\t{student.LastName}, {student.FirstName}");
}
}
The equivalent code using method syntax is shown in the following example:
// Variable groupByLastNamesQuery is an IEnumerable<IGrouping<string,
// DataClass.Student>>.
var groupByYearQuery = students
.GroupBy(student => student.Year)
.OrderBy(newGroup => newGroup.Key);
foreach (var yearGroup in groupByYearQuery)
{
Console.WriteLine($"Key: {yearGroup.Key}");
foreach (var student in yearGroup)
{
Console.WriteLine($"\t{student.LastName}, {student.FirstName}");
}
}
Group by value example
The following example shows how to group source elements by using something other than a property of the object for the group key. In this example, the key is the first letter of the student's family name.
var groupByFirstLetterQuery =
from student in students
let firstLetter = student.LastName[0]
group student by firstLetter;
foreach (var studentGroup in groupByFirstLetterQuery)
{
Console.WriteLine($"Key: {studentGroup.Key}");
foreach (var student in studentGroup)
{
Console.WriteLine($"\t{student.LastName}, {student.FirstName}");
}
}
Nested foreach is required to access group items.
The equivalent code using method syntax is shown in the following example:
var groupByFirstLetterQuery = students
.GroupBy(student => student.LastName[0]);
foreach (var studentGroup in groupByFirstLetterQuery)
{
Console.WriteLine($"Key: {studentGroup.Key}");
foreach (var student in studentGroup)
{
Console.WriteLine($"\t{student.LastName}, {student.FirstName}");
}
}
Group by a range example
The following example shows how to group source elements by using a numeric range as a group key. The query then projects the results into an anonymous type that contains only the first and family name and the percentile range to which the student belongs. An anonymous type is used because it isn't necessary to use the complete Student
object to display the results. GetPercentile
is a helper function that calculates a percentile based on the student's average score. The method returns an integer between 0 and 10.
static int GetPercentile(Student s)
{
double avg = s.Scores.Average();
return avg > 0 ? (int)avg / 10 : 0;
}
var groupByPercentileQuery =
from student in students
let percentile = GetPercentile(student)
group new
{
student.FirstName,
student.LastName
} by percentile into percentGroup
orderby percentGroup.Key
select percentGroup;
foreach (var studentGroup in groupByPercentileQuery)
{
Console.WriteLine($"Key: {studentGroup.Key * 10}");
foreach (var item in studentGroup)
{
Console.WriteLine($"\t{item.LastName}, {item.FirstName}");
}
}
Nested foreach required to iterate over groups and group items. The equivalent code using method syntax is shown in the following example:
static int GetPercentile(Student s)
{
double avg = s.Scores.Average();
return avg > 0 ? (int)avg / 10 : 0;
}
var groupByPercentileQuery = students
.Select(student => new { student, percentile = GetPercentile(student) })
.GroupBy(student => student.percentile)
.Select(percentGroup => new
{
percentGroup.Key,
Students = percentGroup.Select(s => new { s.student.FirstName, s.student.LastName })
})
.OrderBy(percentGroup => percentGroup.Key);
foreach (var studentGroup in groupByPercentileQuery)
{
Console.WriteLine($"Key: {studentGroup.Key * 10}");
foreach (var item in studentGroup.Students)
{
Console.WriteLine($"\t{item.LastName}, {item.FirstName}");
}
}
Group by comparison example
The following example shows how to group source elements by using a Boolean comparison expression. In this example, the Boolean expression tests whether a student's average exam score is greater than 75. As in previous examples, the results are projected into an anonymous type because the complete source element isn't needed. The properties in the anonymous type become properties on the Key
member.
var groupByHighAverageQuery =
from student in students
group new
{
student.FirstName,
student.LastName
} by student.Scores.Average() > 75 into studentGroup
select studentGroup;
foreach (var studentGroup in groupByHighAverageQuery)
{
Console.WriteLine($"Key: {studentGroup.Key}");
foreach (var student in studentGroup)
{
Console.WriteLine($"\t{student.FirstName} {student.LastName}");
}
}
The equivalent query using method syntax is shown in the following code:
var groupByHighAverageQuery = students
.GroupBy(student => student.Scores.Average() > 75)
.Select(group => new
{
group.Key,
Students = group.AsEnumerable().Select(s => new { s.FirstName, s.LastName })
});
foreach (var studentGroup in groupByHighAverageQuery)
{
Console.WriteLine($"Key: {studentGroup.Key}");
foreach (var student in studentGroup.Students)
{
Console.WriteLine($"\t{student.FirstName} {student.LastName}");
}
}
Group by anonymous type
The following example shows how to use an anonymous type to encapsulate a key that contains multiple values. In this example, the first key value is the first letter of the student's family name. The second key value is a Boolean that specifies whether the student scored over 85 on the first exam. You can order the groups by any property in the key.
var groupByCompoundKey =
from student in students
group student by new
{
FirstLetterOfLastName = student.LastName[0],
IsScoreOver85 = student.Scores[0] > 85
} into studentGroup
orderby studentGroup.Key.FirstLetterOfLastName
select studentGroup;
foreach (var scoreGroup in groupByCompoundKey)
{
var s = scoreGroup.Key.IsScoreOver85 ? "more than 85" : "less than 85";
Console.WriteLine($"Name starts with {scoreGroup.Key.FirstLetterOfLastName} who scored {s}");
foreach (var item in scoreGroup)
{
Console.WriteLine($"\t{item.FirstName} {item.LastName}");
}
}
The equivalent query using method syntax is shown in the following code:
var groupByCompoundKey = students
.GroupBy(student => new
{
FirstLetterOfLastName = student.LastName[0],
IsScoreOver85 = student.Scores[0] > 85
})
.OrderBy(studentGroup => studentGroup.Key.FirstLetterOfLastName);
foreach (var scoreGroup in groupByCompoundKey)
{
var s = scoreGroup.Key.IsScoreOver85 ? "more than 85" : "less than 85";
Console.WriteLine($"Name starts with {scoreGroup.Key.FirstLetterOfLastName} who scored {s}");
foreach (var item in scoreGroup)
{
Console.WriteLine($"\t{item.FirstName} {item.LastName}");
}
}
Create a nested group
The following example shows how to create nested groups in a LINQ query expression. Each group that is created according to student year or grade level is then further subdivided into groups based on the individuals' names.
var nestedGroupsQuery =
from student in students
group student by student.Year into newGroup1
from newGroup2 in
from student in newGroup1
group student by student.LastName
group newGroup2 by newGroup1.Key;
foreach (var outerGroup in nestedGroupsQuery)
{
Console.WriteLine($"DataClass.Student Level = {outerGroup.Key}");
foreach (var innerGroup in outerGroup)
{
Console.WriteLine($"\tNames that begin with: {innerGroup.Key}");
foreach (var innerGroupElement in innerGroup)
{
Console.WriteLine($"\t\t{innerGroupElement.LastName} {innerGroupElement.FirstName}");
}
}
}
Three nested foreach
loops are required to iterate over the inner elements of a nested group.
(Hover the mouse cursor over the iteration variables, outerGroup
, innerGroup
, and innerGroupElement
to see their actual type.)
The equivalent query using method syntax is shown in the following code:
var nestedGroupsQuery =
students
.GroupBy(student => student.Year)
.Select(newGroup1 => new
{
newGroup1.Key,
NestedGroup = newGroup1
.GroupBy(student => student.LastName)
});
foreach (var outerGroup in nestedGroupsQuery)
{
Console.WriteLine($"DataClass.Student Level = {outerGroup.Key}");
foreach (var innerGroup in outerGroup.NestedGroup)
{
Console.WriteLine($"\tNames that begin with: {innerGroup.Key}");
foreach (var innerGroupElement in innerGroup)
{
Console.WriteLine($"\t\t{innerGroupElement.LastName} {innerGroupElement.FirstName}");
}
}
}
Perform a subquery on a grouping operation
This article shows two different ways to create a query that orders the source data into groups, and then performs a subquery over each group individually. The basic technique in each example is to group the source elements by using a continuation named newGroup
, and then generating a new subquery against newGroup
. This subquery is run against each new group created by the outer query. In this particular example the final output isn't a group, but a flat sequence of anonymous types.
For more information about how to group, see group clause. For more information about continuations, see into. The following example uses an in-memory data structure as the data source, but the same principles apply for any kind of LINQ data source.
var queryGroupMax =
from student in students
group student by student.Year into studentGroup
select new
{
Level = studentGroup.Key,
HighestScore = (
from student2 in studentGroup
select student2.Scores.Average()
).Max()
};
var count = queryGroupMax.Count();
Console.WriteLine($"Number of groups = {count}");
foreach (var item in queryGroupMax)
{
Console.WriteLine($" {item.Level} Highest Score={item.HighestScore}");
}
The query in the preceding snippet can also be written using method syntax. The following code snippet has a semantically equivalent query written using method syntax.
var queryGroupMax =
students
.GroupBy(student => student.Year)
.Select(studentGroup => new
{
Level = studentGroup.Key,
HighestScore = studentGroup.Max(student2 => student2.Scores.Average())
});
var count = queryGroupMax.Count();
Console.WriteLine($"Number of groups = {count}");
foreach (var item in queryGroupMax)
{
Console.WriteLine($" {item.Level} Highest Score={item.HighestScore}");
}