How to: Query for the Total Number of Bytes in a Set of Folders (LINQ)
This example shows how to retrieve the total number of bytes used by all the files in a specified folder and all its subfolders.
Example
The Sum method adds the values of all the items selected in the select clause. You can easily modify this query to retrieve the biggest or smallest file in the specified directory tree by calling the Min or Max method instead of Sum.
Module QueryTotalBytes
Sub Main()
' Change the drive\path if necessary.
Dim root As String = "C:\Program Files\Microsoft Visual Studio 9.0\VB"
'Take a snapshot of the folder contents.
' This method assumes that the application has discovery permissions
' for all folders under the specified path.
Dim fileList = My.Computer.FileSystem.GetFiles _
(root, FileIO.SearchOption.SearchAllSubDirectories, "*.*")
Dim fileQuery = From file In fileList _
Select GetFileLength(file)
' Force execution and cache the results to avoid multiple trips to the file system.
Dim fileLengths = fileQuery.ToArray()
' Find the largest file
Dim maxSize = Aggregate aFile In fileLengths Into Max()
' Find the total number of bytes
Dim totalBytes = Aggregate aFile In fileLengths Into Sum()
Console.WriteLine("The largest file is " & maxSize & " bytes")
Console.WriteLine("There are " & totalBytes & " total bytes in " & _
fileList.Count & " files under " & root)
' Keep the console window open in debug mode
Console.WriteLine("Press any key to exit.")
Console.ReadKey()
End Sub
' This method is used to catch the possible exception
' that can be raised when accessing the FileInfo.Length property.
Function GetFileLength(ByVal filename As String) As Long
Dim retval As Long
Try
Dim fi As New System.IO.FileInfo(filename)
retval = fi.Length
Catch ex As System.IO.FileNotFoundException
' If a file is no longer present,
' just return zero bytes.
retval = 0
End Try
Return retval
End Function
End Module
class QuerySize
{
public static void Main()
{
string startFolder = @"c:\program files\Microsoft Visual Studio 9.0\VC#";
// Take a snapshot of the file system.
// This method assumes that the application has discovery permissions
// for all folders under the specified path.
IEnumerable<string> fileList = System.IO.Directory.GetFiles(startFolder, "*.*", System.IO.SearchOption.AllDirectories);
var fileQuery = from file in fileList
select GetFileLength(file);
// Cache the results to avoid multiple trips to the file system.
long[] fileLengths = fileQuery.ToArray();
// Return the size of the largest file
long largestFile = fileLengths.Max();
// Return the total number of bytes in all the files under the specified folder.
long totalBytes = fileLengths.Sum();
Console.WriteLine("There are {0} bytes in {1} files under {2}",
totalBytes, fileList.Count(), startFolder);
Console.WriteLine("The largest files is {0} bytes.", largestFile);
// Keep the console window open in debug mode.
Console.WriteLine("Press any key to exit.");
Console.ReadKey();
}
// This method is used to swallow the possible exception
// that can be raised when accessing the System.IO.FileInfo.Length property.
static long GetFileLength(string filename)
{
long retval;
try
{
System.IO.FileInfo fi = new System.IO.FileInfo(filename);
retval = fi.Length;
}
catch (System.IO.FileNotFoundException)
{
// If a file is no longer present,
// just add zero bytes to the total.
retval = 0;
}
return retval;
}
}
If you only have to count the number of bytes in a specified directory tree, you can do this more efficiently without creating a LINQ query, which incurs the overhead of creating the list collection as a data source. The usefulness of the LINQ approach increases as the query becomes more complex, or when you have to run multiple queries against the same data source.
The query calls out to a separate method to obtain the file length. It does this in order to consume the possible exception that will be raised if the file was deleted on another thread after the FileInfo object was created in the call to GetFiles. Even though the FileInfo object has already been created, the exception can occur because a FileInfo object will try to refresh its Length property with the most current length the first time the property is accessed. By putting this operation in a try-catch block outside the query, the code follows the rule of avoiding operations in queries that can cause side-effects. In general, great care must be taken when you consume exceptions to make sure that an application is not left in an unknown state.
Compiling the Code
Create a Visual Studio project that targets the .NET Framework version 3.5. The project has a reference to System.Core.dll and a using directive (C#) or Imports statement (Visual Basic) for the System.Linq namespace by default. In C# projects, add a using directive for the System.IO namespace.
Copy this code into your project.
Press F5 to compile and run the program.
Press any key to exit the console window.
Robust Programming
For intensive query operations over the contents of multiple types of documents and files, consider using the Windows Desktop Search engine.