.NET Matters: File Copy Progress, Custom Thread Pools

10/18/2019

Stephen Toub

Code download available at: NETMatters0502.exe (214 KB)

Q I'm copying a lot of large files in my application, and I'd like to give the user the option to cancel these actions. Can I use the same dialog that the shell provides when dragging and dropping files from one place to another? Alternatively, is there any way my application can receive status notifications during a file copy so that I can create my own progress dialog with an option to cancel?

A If you're using Visual Studio® 2005 Beta 1, this is a piece of cake. While not exposed as part of the base class libraries, the Visual Basic team has wrapped the shell's file copy functionality (the SHFileOperation function exposed from shell32.dll) in the My namespace (for more information on My, see Duncan Mackenzie's article in the May 2004 issue of MSDN® Magazine). When using Visual Basic®, the CopyFile and CopyDirectory methods on My.Computer.FileSystem provide a showUI Boolean parameter. When set to true, the copy file dialog will be displayed to show the file transfer's progress. This dialog also allows the user to cancel the operation. If you're using Visual C#® 2005, you can reference the Microsoft.VisualBasic.dll library from your app. Instead of this code in Visual Basic

My.Computer.FileSystem.CopyFile(...)

you can use the functionally equivalent C#:

new Microsoft.VisualBasic.MyServices.MyServerComputer().FileSystem.CopyFile(...);

Of course, this is only valid with the .NET Framework 2.0. If you're using the .NET Framework 1.x, you'll have to use P/Invoke to call SHFileOperation manually, a sample for which is available at C# does Shell, Part 2.

That addresses your first question about whether it's possible to use the shell's copy operation. Your second question about receiving copy status notifications is answered with one function, CopyFileEx, exposed from Kernel32.dll. Under the covers, the System.IO.File class's copy operations use the related Win32® CopyFile function, also exposed from Kernel32.dll, but unfortunately there is no wrapper provided for CopyFileEx.

CopyFileEx can call a specified callback function each time a portion of the copy operation is completed, exactly the functionality you need. So, I've created the wrapper class shown in Figure 1 to expose this functionality to your application through the FileRoutines.CopyFile method. The primary overload of this method accepts five parameters: a FileInfo for the source file to be copied, a FileInfo for the destination location, a CopyFileOptions enumeration that further configures the copy operation, a CopyFileCallback delegate that is to be called to inform your application of the copy's progress, and an object state parameter that will be passed to this callback delegate.

Figure 1 FileRoutines.CopyFile Using the Win32 CopyFileEx

using System;
using System.ComponentModel;
using System.IO;
using System.Runtime.InteropServices;
using System.Security;
using System.Security.Permissions;

public sealed class FileRoutines
{
    public static void CopyFile(FileInfo source, FileInfo destination)
    {
        CopyFile(source, destination, CopyFileOptions.None);
    }

    public static void CopyFile(FileInfo source, FileInfo destination,
        CopyFileOptions options)
    {
        CopyFile(source, destination, options, null);
    }

    public static void CopyFile(FileInfo source, FileInfo destination,
        CopyFileOptions options, CopyFileCallback callback)
    {
        CopyFile(source, destination, options, callback, null);
    }

    public static void CopyFile(FileInfo source, FileInfo destination,
        CopyFileOptions options, CopyFileCallback callback, object state)
    {
        // Validate the arguments.
        if (source == null) throw new ArgumentNullException("source");
        if (destination == null) throw new ArgumentNullException("destination");
        if ((options & ~CopyFileOptions.All) != 0)
            throw new ArgumentOutOfRangeException("options");

        // Demand read access to the source and write access to the destination.
        new FileIOPermission(
            FileIOPermissionAccess.Read, source.FullName).Demand();
        new FileIOPermission(
            FileIOPermissionAccess.Write, destination.FullName).Demand();

        // Wrap the user's callback, if any, in the native callback signature.
        CopyProgressRoutine cpr = callback == null ? null :
            new CopyProgressRoutine(new CopyProgressData(
                source, destination, callback, state).CallbackHandler);

        bool cancel = false;
        if (!CopyFileEx(source.FullName, destination.FullName, cpr,
            IntPtr.Zero, ref cancel, (int)options))
        {
            throw new IOException(new Win32Exception().Message);
        }
    }

    // Carries the user-supplied state to the native progress callback.
    private class CopyProgressData
    {
        private FileInfo _source = null;
        private FileInfo _destination = null;
        private CopyFileCallback _callback = null;
        private object _state = null;

        public CopyProgressData(FileInfo source, FileInfo destination,
            CopyFileCallback callback, object state)
        {
            _source = source;
            _destination = destination;
            _callback = callback;
            _state = state;
        }

        public int CallbackHandler(
            long totalFileSize, long totalBytesTransferred,
            long streamSize, long streamBytesTransferred,
            int streamNumber, int callbackReason,
            IntPtr sourceFile, IntPtr destinationFile, IntPtr data)
        {
            return (int)_callback(_source, _destination, _state,
                totalFileSize, totalBytesTransferred);
        }
    }

    private delegate int CopyProgressRoutine(
        long totalFileSize, long totalBytesTransferred,
        long streamSize, long streamBytesTransferred,
        int streamNumber, int callbackReason,
        IntPtr sourceFile, IntPtr destinationFile, IntPtr data);

    [SuppressUnmanagedCodeSecurity]
    [DllImport("Kernel32.dll", CharSet=CharSet.Auto, SetLastError=true)]
    private static extern bool CopyFileEx(
        string lpExistingFileName, string lpNewFileName,
        CopyProgressRoutine lpProgressRoutine, IntPtr lpData,
        ref bool pbCancel, int dwCopyFlags);
}

public delegate CopyFileCallbackAction CopyFileCallback(
    FileInfo source, FileInfo destination, object state,
    long totalFileSize, long totalBytesTransferred);

public enum CopyFileCallbackAction
{
    Continue = 0,
    Cancel = 1,
    Stop = 2,
    Quiet = 3
}

[Flags]
public enum CopyFileOptions
{
    None = 0x0,
    FailIfDestinationExists = 0x1,
    Restartable = 0x2,
    AllowDecryptedDestination = 0x8,
    All = FailIfDestinationExists | Restartable | AllowDecryptedDestination
}

To start, the parameters are validated, ensuring that the source and destination FileInfo parameters are not null and that the specified CopyFileOptions enumeration value is legitimate. There are a few ways to check whether an enumeration value is valid based on that enumeration's definition. The simplest way involves the Enum.IsDefined method; it is easy to use, but it recognizes only individually named values rather than combinations of flags, and it is slow when compared to other mechanisms, which is important to take into consideration if the API using this method will be called frequently. When I have a small Flags-attributed enumeration, especially one that has an all-value (a value in the enumeration that is the bitwise OR of all of the other values), the simplest way to check whether a value falls within that enumeration is to take the bitwise complement of the all-value and AND it with the value being checked. By complementing the all-value, I'm creating a number with one-bits in exactly the places where one-bits are invalid. Thus, if the bitwise AND of that with the value in question yields a non-zero value, the number being checked has bits in invalid positions and is thus an invalid value (you should note, however, that this solution might not work as intended if the various values in the enumeration have more than one bit set).
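The check described above can be sketched in isolation. The enumeration and class names here (SampleOptions, FlagsValidation) are hypothetical stand-ins that mirror the shape of CopyFileOptions; the comparison with Enum.IsDefined belongs in the usage, not in the helper itself.

```csharp
using System;

// Hypothetical flags enumeration with an all-value, shaped like CopyFileOptions.
[Flags]
public enum SampleOptions
{
    None = 0x0,
    First = 0x1,
    Second = 0x2,
    Fourth = 0x8,
    All = First | Second | Fourth
}

public static class FlagsValidation
{
    // Returns true when value contains only bits that appear in SampleOptions.All.
    public static bool IsValid(SampleOptions value)
    {
        // ~All has one-bits exactly where a valid value must have zero-bits.
        return (value & ~SampleOptions.All) == 0;
    }
}
```

With these definitions, FlagsValidation.IsValid(SampleOptions.First | SampleOptions.Second) accepts the combination, whereas Enum.IsDefined(typeof(SampleOptions), SampleOptions.First | SampleOptions.Second) rejects it because no single member has that value.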

You'll note that I don't call File.Exists to ensure that the specified source file does in fact exist; this would be an exercise in futility as there's nothing to prevent the file from being deleted between the time I call File.Exists and the time the actual copy operation commences. Thus, I rely on Win32 to provide the appropriate error response if the specified source file doesn't exist.
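The same "attempt it and handle the failure" idea can be illustrated with the managed File.Copy method, which throws FileNotFoundException when the source is missing. This is only a sketch of the pattern, not part of the wrapper in Figure 1; the class and method names (CopyWithoutPrecheck, TryCopy) are hypothetical.

```csharp
using System;
using System.IO;

public static class CopyWithoutPrecheck
{
    // Attempts the copy directly and reports a missing source by catching
    // the exception, rather than calling File.Exists beforehand (a check
    // that could be stale by the time the copy starts).
    public static bool TryCopy(string source, string destination)
    {
        try
        {
            File.Copy(source, destination, true); // true: overwrite an existing destination
            return true;
        }
        catch (FileNotFoundException)
        {
            return false;
        }
    }
}
```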

After validating the parameters, I ensure that the caller has the necessary permissions to read the source file and to write the destination file by demanding the relevant FileIOPermission. I've also marked the declaration of CopyFileEx with the SuppressUnmanagedCodeSecurityAttribute so that calls to it do not cause a stack walk searching for the UnmanagedCode security permission.

With validation and security checks out of the way, I can get to the meat of the operation, which is to invoke the CopyFileEx function. Besides the source and destination path parameters, the most important parameter for your purposes is lpProgressRoutine. In the Win32 API, this parameter is a pointer to a function that will be called when the CopyFileEx routine has a progress update to report. In the managed world, I can pass a delegate with the corresponding signature such that a managed method will be called as the callback from CopyFileEx. In order to get additional state passed to the callback (such as the original FileInfo objects), the method I wrap with the CopyProgressRoutine delegate is an instance member of a state class which I populate with all of the relevant information about the copy operation. Now, when CopyFileEx calls back to the lpProgressRoutine, I have all of the user-supplied state in addition to that provided to me by CopyFileEx, such as the total number of bytes being copied and the number of bytes copied thus far.

The wrapper could end here, but I don't want to provide too much unnecessary information to the user of the wrapper. Thus, instead of exposing to the user the CopyProgressRoutine callback delegate, I've exposed the simpler CopyFileCallback. The user-supplied CopyFileCallback passed to the FileRoutines.CopyFile method is invoked when the CopyProgressData.CallbackHandler method (the actual callback for CopyFileEx) is invoked, receiving the information I feel is most important to the user (the source and destination FileInfo, the original user-supplied state, the total number of bytes in the file to copy, and the total number of bytes successfully copied thus far). In addition, the CopyFileCallback delegate returns a CopyFileCallbackAction value which is, in turn, cast to an integer and returned directly to CopyFileEx. This value controls how CopyFileEx proceeds: whether it should continue the copy, cancel the copy, stop the copy, or continue the copy but stop notification callbacks for updates.

The difference between cancel and stop is very important. CopyFileCallbackAction.Cancel and CopyFileCallbackAction.Stop both prevent the copy operation from continuing; however, whereas Cancel deletes the portion of the target file already copied, Stop leaves it. This is helpful when using the CopyFileOptions.Restartable option. Restartable causes the progress of the copy operation to be tracked in the target file in case the copy is stopped or fails. The operation can then be restarted at a later time by specifying source and destination FileInfo objects that represent the same paths. More information on the CopyFileEx Win32 function is available in the MSDN library.
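To make the cancel/stop distinction concrete, here is a minimal sketch that swaps CopyFileEx for a plain managed stream loop, so it does not exercise the Win32 function or the wrapper in Figure 1. All names (ProgressAction, ProgressCallback, ManagedCopySketch) are hypothetical, and the sketch models neither the Quiet action nor the Restartable option; it only shows that Cancel discards the partial destination file while Stop leaves it in place.

```csharp
using System;
using System.IO;

public enum ProgressAction { Continue, Cancel, Stop }

public delegate ProgressAction ProgressCallback(long totalBytes, long bytesCopied);

public static class ManagedCopySketch
{
    // Copies source to destination in chunks of chunkSize bytes (must be
    // positive), consulting the callback after each chunk. Returns true only
    // if the callback never asked to cancel or stop.
    public static bool Copy(string source, string destination,
        int chunkSize, ProgressCallback callback)
    {
        ProgressAction action = ProgressAction.Continue;
        using (FileStream input = File.OpenRead(source))
        using (FileStream output = File.Create(destination))
        {
            byte[] buffer = new byte[chunkSize];
            long copied = 0;
            int read;
            while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
            {
                output.Write(buffer, 0, read);
                copied += read;
                action = callback(input.Length, copied);
                if (action != ProgressAction.Continue) break;
            }
        }

        // Cancel deletes whatever was written; Stop keeps the partial file.
        if (action == ProgressAction.Cancel) File.Delete(destination);
        return action == ProgressAction.Continue;
    }
}
```

A callback that returns ProgressAction.Stop after the first chunk leaves a destination file containing just that chunk, whereas returning ProgressAction.Cancel at the same point leaves no destination file at all.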

Q I have a situation where I'd like to have multiple thread pools, each of which has a small number of threads dedicated to processing a certain task. Unfortunately, it looks like the ThreadPool in the System.Threading namespace is only usable in a static context, meaning that there's only one of them and I can't create more. Any suggestions? How can I create multiple thread pools?

A Before embarking down this path, make sure you really need this for your solution. The .NET Framework team spent a lot of time optimizing the ThreadPool for many different scenarios, and it would be a shame for you to reinvent the wheel. And make sure you have the performance scenario you think you have. Assuming that a bottleneck is being caused by a particular piece of code can be a frustrating situation when you spend hours "fixing" it, only to realize that the area in question wasn't the problem.

If, in fact, you determine by profiling that you really will benefit from multiple, small thread pools, you have a couple of options. The first would be to use the .NET ThreadPool, but throttle requests to it in order to simulate multiple smaller pools (I implemented a simple ThreadPool throttling mechanism in my December 2004 column). This would allow you to limit the number of threads from the pool that could be allotted to each task. A second option is, of course, to create your own custom pool.
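One way the first option might look is sketched below. This is not the mechanism from the December 2004 column; it is an assumed design in which a hypothetical ThrottledPool class forwards work to the shared ThreadPool but never lets more than a fixed number of its own items run at once. Exception handling is omitted, so a callback that throws would leave the count of running pumps too high.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical throttle over the shared ThreadPool: at most maxConcurrent
// work items queued through one instance execute at the same time.
public sealed class ThrottledPool
{
    private readonly object _sync = new object();
    private readonly Queue<KeyValuePair<WaitCallback, object>> _pending =
        new Queue<KeyValuePair<WaitCallback, object>>();
    private readonly int _maxConcurrent;
    private int _running; // number of pumps currently draining _pending

    public ThrottledPool(int maxConcurrent)
    {
        if (maxConcurrent <= 0)
            throw new ArgumentOutOfRangeException("maxConcurrent");
        _maxConcurrent = maxConcurrent;
    }

    public void QueueUserWorkItem(WaitCallback callback, object state)
    {
        if (callback == null) throw new ArgumentNullException("callback");
        lock (_sync)
        {
            _pending.Enqueue(
                new KeyValuePair<WaitCallback, object>(callback, state));
            if (_running >= _maxConcurrent) return; // an active pump will take it
            _running++;
        }
        ThreadPool.QueueUserWorkItem(Pump);
    }

    // Runs on a ThreadPool thread and executes pending items until none remain.
    private void Pump(object ignored)
    {
        while (true)
        {
            KeyValuePair<WaitCallback, object> item;
            lock (_sync)
            {
                if (_pending.Count == 0) { _running--; return; }
                item = _pending.Dequeue();
            }
            item.Key(item.Value);
        }
    }
}
```

Creating one ThrottledPool per kind of task, each with its own limit, approximates several small pools while still borrowing threads from the single CLR pool.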

Thread pool implementations run the gamut from very simple to very complex and feature rich. The common language runtime (CLR) thread pool falls into the latter category, providing functionality such as delayed thread creation, thread teardown after a period of nonuse, completion callbacks, and special support for I/O completion ports. There's little reason to attempt to duplicate its complexity in your own custom thread pool, especially given that the CLR team has spent a lot of time and resources getting its sophisticated implementation right—not an easy task.

So, here I'll show you how to create a very simple thread pool that creates the requested number of threads on startup and keeps them around until you're done with the pool. Each thread waits for work items to be supplied, and when work is available, it executes the work. When a thread is finished executing a work item, it waits for more work. A side benefit of this approach is that it allows you to create a "pool" of only one thread, which means you can easily implement serialized processing: queue up a bunch of delegates, and only one will be executed at a time.

My implementation is shown in Figure 2. An instance of the CustomThreadPool class maintains three private member variables: a semaphore used to block threads in the pool until work is available, a queue of all work waiting to be processed, and a list of the threads in the pool (the list is only necessary for pool management operations, such as shutting it down). For a CustomThreadPool to be instantiated, the user must specify the number of threads she wants to exist in the pool. The constructor then spins up that number of new threads, each of which starts running the private Run method and waits for work to be queued.

Figure 2 Custom Thread Pool

using System;
using System.Collections.Generic;
using System.Threading;

public sealed class CustomThreadPool : IDisposable
{
    private Semaphore _workWaiting;
    private Queue<WaitQueueItem> _queue;
    private List<Thread> _threads;

    public CustomThreadPool(int numThreads)
    {
        if (numThreads <= 0)
            throw new ArgumentOutOfRangeException("numThreads");

        _threads = new List<Thread>(numThreads);
        _queue = new Queue<WaitQueueItem>();
        _workWaiting = new Semaphore(0, int.MaxValue);

        for (int i = 0; i < numThreads; i++)
        {
            Thread t = new Thread(Run);
            t.IsBackground = true;
            _threads.Add(t);
            t.Start();
        }
    }

    public void Dispose()
    {
        if (_threads != null)
        {
            // Interrupt each worker so that it leaves its wait and exits Run.
            _threads.ForEach(delegate(Thread t) { t.Interrupt(); });
            _threads = null;
        }
    }

    public void QueueUserWorkItem(WaitCallback callback, object state)
    {
        if (_threads == null)
            throw new ObjectDisposedException(GetType().Name);
        if (callback == null) throw new ArgumentNullException("callback");

        WaitQueueItem item = new WaitQueueItem();
        item.Callback = callback;
        item.State = state;
        item.Context = ExecutionContext.Capture();

        lock(_queue) _queue.Enqueue(item);
        _workWaiting.Release(); // signal that one more item is available
    }

    private void Run()
    {
        try
        {
            while (true)
            {
                _workWaiting.WaitOne();
                WaitQueueItem item;
                lock(_queue) item = _queue.Dequeue();
                ExecutionContext.Run(item.Context,
                    new ContextCallback(item.Callback), item.State);
            }
        }
        catch(ThreadInterruptedException){}
    }

    private class WaitQueueItem
    {
        public WaitCallback Callback;
        public object State;
        public ExecutionContext Context;
    }
}

The Run method is very straightforward. The worker thread blocks waiting for work items to be added to the instance using QueueUserWorkItem. When a work item shows up, one of the waiting threads wakes up, retrieves the work item from the work queue (under a lock on the queue so that access to the queue is synchronized), and executes it. QueueUserWorkItem is just as straightforward. The user-supplied WaitCallback and object state, along with the current ExecutionContext, are stored in a data structure that is appended to the work queue (again, synchronized around the queue). The semaphore is then signaled to indicate that another work item is available.

I discussed ExecutionContext in my November 2004 column, but suffice it to say that ExecutionContext provides a single container for all information relevant to a logical thread of execution. This includes security context, call context, synchronization context, localization context, and transaction context. The class provides the ability to transfer this context from one thread to another, and as such I use it to transfer the execution context from the thread that queues a work item to the worker thread that actually invokes the work item. If the thread that queues the work item is the same thread that created the pool, this adds little functionality, as the threads created in the pool's constructor will have already had the original execution context flowed to them. If, however, the thread queuing the work item has a different context than the thread that created the pool had at the time it created it, this simple addition can greatly increase the security of the pool. Given that it only took three additional lines of code to support the transfer of the ExecutionContext from QueueUserWorkItem to Run, it's well worth the effort!
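The capture-then-run pattern can be observed in isolation with a small sketch. It relies on AsyncLocal<T>, which arrived in later versions of the framework than the one this column targets, purely as a convenient ambient value that travels with the ExecutionContext; the class and method names (ContextFlowDemo, ObserveOnWorker) are hypothetical.

```csharp
using System;
using System.Threading;

public static class ContextFlowDemo
{
    // Ambient value that is stored in, and flows with, the ExecutionContext.
    private static readonly AsyncLocal<string> Caller = new AsyncLocal<string>();

    // Captures the context on the calling thread, then runs a callback under
    // that captured context on a different thread and returns what it saw.
    public static string ObserveOnWorker()
    {
        Caller.Value = "queuing thread";
        ExecutionContext captured = ExecutionContext.Capture();

        // Changes made after the capture are not part of the snapshot.
        Caller.Value = "changed after capture";

        string seen = null;
        Thread worker = new Thread(delegate()
        {
            ExecutionContext.Run(captured,
                delegate(object state) { seen = Caller.Value; }, null);
        });
        worker.Start();
        worker.Join();
        return seen;
    }
}
```

The callback observes the value that was current when Capture was called, which is the same guarantee QueueUserWorkItem and Run rely on in Figure 2.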

As I'm a big fan of C# and the improvements that have been made to it for version 2.0, I'd like to highlight some here (I'm using Beta 1). First and foremost, the work queue I've implemented uses a generic Queue of WaitQueueItem objects, and the list of Thread instances is a generic List of Thread objects. This improves both the efficiency and readability of the code (in addition to improving IntelliSense® in Visual Studio 2005 while I wrote the code). Of course, generics are applicable to more than just C#, as languages like MSIL, Visual Basic, and C++ also support them (in fact, C++ supports both generics and templates, and it supports them in combination). I've also taken advantage of delegate inference, as you can see in the CustomThreadPool's constructor where I construct the Thread instances, and of anonymous methods, which you can see in the call to _threads.ForEach (which executes the specified delegate for each item in the List).

With this custom thread pool in place, there are lots of enhancements you can add to it to better suit your particular requirements. For example, currently all threads are created during the construction of the pool, but the pool could be modified to create them on demand up to the maximum number allowed, and to tear down extra threads after some period of inactivity. Work item priorities and work item wait functionality could be added (similar to how I demonstrated in my October 2004 and November 2004 columns, but more easily given that you don't have to write it as an extension to the pool). You could add the ability to cancel a queued work item so that it wouldn't be executed if it hadn't already started, you could add properties to expose how many threads are currently in use and how many are available for work, and you could add events to signal when work has been started and when work has been completed. These are just a few ways a pool can be extended, and many possibilities exist for you to tailor the class for your exact needs.

Send your questions and comments to netqa@microsoft.com.

Stephen Toub is the Technical Editor for MSDN Magazine.
