My last three Basic Instincts columns have examined techniques for using asynchronous delegates and creating secondary threads. Those columns demonstrated how to introduce multithreaded behavior into your applications. In this month's column, I am going to discuss the need for thread synchronization and introduce the fundamentals of writing thread-safe code. After all, I've already shown how you can get into trouble by getting your code to run on multiple threads at once. Now, I feel obligated to discuss how to avoid the dangers and pitfalls of writing code for a multithreaded environment.
As a caveat, I must point out that there's no way I could possibly cover all the important topics related to thread synchronization in just a few columns. After all, there are entire university courses dedicated to the subject. Instead, my goal is to frame the problems created by multithreading and concurrency and to discuss how these problems are solved through thread synchronization. Here I'll introduce monitors as the most basic synchronization feature provided by the Microsoft® .NET Framework. In a follow-up column, I will examine more complex issues and techniques related to thread synchronization.
The Dangers of Concurrency
Some programming problems do not surface until you begin to execute the code associated with a class or object on multiple threads at once. For example, when multiple threads execute methods concurrently on a single object, one thread may see data that has been left in an inconsistent or corrupt state by another thread. This is possible because of the way in which threads are scheduled by the underlying operating system.
The way the operating system's thread scheduler preempts one thread and begins to execute another thread is nondeterministic, meaning that a thread may be preempted while in the middle of a set of related update operations. This makes it possible for a second thread to gain control of the processor and to see the first thread's partially completed work.
Let's start by looking at an example of a very simple class definition. The Point class shown in Figure 1 can be used to create Point objects. Each Point object has a field for the X coordinate and a field for the Y coordinate, which are used together to track a point's position. The Point class also provides a public constructor and two public methods: SetPointPosition and GetPointPosition. These methods can be used to update the state of a Point object and to read it, respectively.
Figure 1 Not Thread Safe
    '*** this class is not thread safe
    Class Point
        Private x As Integer
        Private y As Integer

        Sub New(ByVal x As Integer, ByVal y As Integer)
            Me.x = x
            Me.y = y
        End Sub

        '*** method to update point position
        Sub SetPointPosition(ByVal x As Integer, ByVal y As Integer)
            Me.x = x
            Me.y = y
        End Sub

        '*** method to read point position
        Sub GetPointPosition(ByRef x As Integer, ByRef y As Integer)
            x = Me.x
            y = Me.y
        End Sub
    End Class
By default, classes written in Visual Basic® .NET are not thread safe. This is true of the Point class shown in Figure 1. The lack of thread safety isn't a problem in many design scenarios if you can assume that a Point object will never be accessed concurrently by two or more threads.
However, in some scenarios that involve multithreaded programming techniques it would be possible for code running on two or more threads to access a Point object and attempt to update and read its data at the same time. If you have a design in which Point objects are going to be accessed concurrently, you must then write the Point class to be thread safe. This means you have to understand how to properly use thread synchronization techniques.
Let's investigate this example a little further to shed some light on the nature of the problem. Examine the diagram in Figure 2. This figure shows the timing of a thread's update to the state of a Point object when executing the SetPointPosition method. As you can see, there is a moment in time when the Point object's data exists in an inconsistent state. What would happen if a thread calling the SetPointPosition method were to be preempted by the thread scheduler at this particular moment in time? Another thread could take control of the processor and begin to execute the GetPointPosition method. This thread would see the Point object at the position (20, 10). This is an example of an inconsistent read and represents a bug in your application code.
Figure 2 Data in Inconsistent State
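The scenario just described can be reproduced, at least some of the time, with a small harness. The following is a minimal sketch rather than code from the original column: the RaceDemo class, its member names, and the starting position (10, 10) and target position (20, 20) are all assumptions chosen to match the (20, 10) reading mentioned above. One thread moves an unsynchronized Point back and forth between two consistent positions while the calling thread counts the reads in which X and Y disagree.

```vb
Imports System
Imports System.Threading

'*** unsynchronized Point class, as in Figure 1
Class Point
    Private x As Integer
    Private y As Integer

    Sub New(ByVal x As Integer, ByVal y As Integer)
        Me.x = x
        Me.y = y
    End Sub

    Sub SetPointPosition(ByVal x As Integer, ByVal y As Integer)
        Me.x = x
        Me.y = y
    End Sub

    Sub GetPointPosition(ByRef x As Integer, ByRef y As Integer)
        x = Me.x
        y = Me.y
    End Sub
End Class

'*** hypothetical harness, not part of the original column
Class RaceDemo
    Private target As New Point(10, 10)
    Private updates As Integer

    Sub New(ByVal updates As Integer)
        Me.updates = updates
    End Sub

    '*** writer thread: moves the point between two consistent positions
    Private Sub Writer()
        For i As Integer = 1 To updates
            target.SetPointPosition(20, 20)
            target.SetPointPosition(10, 10)
        Next
    End Sub

    '*** reader: counts the reads in which X and Y disagree
    Function CountInconsistentReads() As Integer
        Dim inconsistent As Integer = 0
        Dim x As Integer
        Dim y As Integer
        Dim writerThread As New Thread(AddressOf Writer)
        writerThread.Start()
        Do While writerThread.IsAlive
            target.GetPointPosition(x, y)
            If x <> y Then inconsistent += 1
        Loop
        writerThread.Join()
        Return inconsistent
    End Function
End Class

Module Program
    Sub Main()
        Dim demo As New RaceDemo(5000000)
        Console.WriteLine("Inconsistent reads: " & demo.CountInconsistentReads())
    End Sub
End Module
```

The count varies from run to run and from machine to machine, and it may well be zero on a given run. That unpredictability is exactly what makes this kind of bug hard to find.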
Once again, it is important to make and document decisions about whether Point objects will ever be put in a situation where they will be accessed by multiple threads concurrently. If you don't plan to use Point objects in multithreaded scenarios, then you don't have to worry about inconsistent reads and writes. However, if you do plan to use Point objects in a multithreaded environment in which they will be accessed concurrently, then you must make sure that your code is thread safe.
Testing and debugging the thread safety of your code can be a difficult and painful undertaking because the way that two threads advance with respect to one another is nondeterministic. There is no guarantee that things will happen the same way twice.
For example, you might run a multithreaded program 10,000 times and observe that it executes without any problems. Run the multithreaded program one more time and it might crash because one of its threads is preempted at a point in time that is different from any of your previous test runs. This is a symptom of what is known as a race condition.
The important observation is that any number of successful test runs does not guarantee that your code is thread safe. However, you can increase your chances of finding bugs by testing your code on machines with the fastest possible processor. Better yet, test your code on a machine with several really fast processors. This will give you the best chance of finding the race conditions and flushing out the bugs.
When is Thread Safety Important?
Writing thread-safe code requires additional analysis and more coding. It also requires very thorough testing. Fortunately, a relatively small percentage of the code you write for a typical application needs to be thread safe. In fact, most of the code that makes up the .NET Framework Class Library is not thread safe. That's because most objects don't run in scenarios that involve the potential for concurrent access.
As an application designer, it is your responsibility to determine which classes and objects are vulnerable. If you have written a program that is using asynchronous delegates or creating secondary threads, you must think about whether multiple threads might access the same class or object concurrently.
Remember that each thread has its own call stack with its own local variables. This means that you don't have to worry about concurrent access to data on the call stack. However, objects are always created on the heap and there are numerous ways to share object references across threads. If this is the case, it is possible for two or more threads to access an object concurrently. Also keep in mind that classes that expose shared members are particularly problematic because they make it easy for multiple threads to access the same set of shared fields. Whenever a class or object is going to be accessed concurrently, you must determine whether or not you need to write thread-safe code.
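The difference between stack data and heap data can be illustrated with a short sketch. The Tally class and the TallyDemo module below are hypothetical names invented for this illustration. Each thread that runs AddRange gets its own copy of the local variable subtotal, so the loop needs no protection. The field total lives in a single heap object that all the threads share, so the one statement that touches it is synchronized, here with the SyncLock construct covered later in this column.

```vb
Imports System
Imports System.Threading

'*** hypothetical example class, not part of the original column
Class Tally
    '*** instance field: stored in the heap object shared by all threads
    Private total As Integer = 0
    Private ReadOnly totalLock As New Object()

    Sub AddRange()
        '*** local variable: every calling thread has its own copy
        Dim subtotal As Integer = 0
        For i As Integer = 1 To 1000
            subtotal += i
        Next
        '*** only the update to shared state is synchronized
        SyncLock totalLock
            total += subtotal
        End SyncLock
    End Sub

    Function GetTotal() As Integer
        SyncLock totalLock
            Return total
        End SyncLock
    End Function
End Class

Module TallyDemo
    '*** runs AddRange on several threads that share one Tally object
    Function RunOnThreads(ByVal threadCount As Integer) As Integer
        Dim sharedTally As New Tally()
        Dim threads(threadCount - 1) As Thread
        For i As Integer = 0 To threadCount - 1
            threads(i) = New Thread(AddressOf sharedTally.AddRange)
            threads(i).Start()
        Next
        For Each t As Thread In threads
            t.Join()
        Next
        Return sharedTally.GetTotal()
    End Function

    Sub Main()
        '*** each thread contributes 500500 (the sum of 1 through 1000)
        Console.WriteLine(RunOnThreads(4))
    End Sub
End Module
```

Doing the bulk of the work in local variables and touching shared state only briefly keeps the synchronized region small.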
Several aspects of the .NET Framework were designed explicitly to reduce your need to write thread-safe code. The ASP.NET architecture provides an excellent example. It was designed to be a multithreaded environment to provide the highest levels of application throughput and scalability. The processing engine of ASP.NET provides a built-in thread pool to service incoming requests. When there's lots of traffic, the ASP.NET worker process might be executing 25 different requests concurrently. Does this mean that all the code you provide for an ASP.NET application must be written to be thread safe? Fortunately, no.
ASP.NET runs each incoming request on its own separate worker thread. The ASP.NET programming model is based on a scheme in which each worker thread is routed through a private set of ASP.NET objects including applications, modules, pages, and custom handlers. The ASP.NET programming model was designed to ensure that only one worker thread will ever touch any one of these objects at any one time. This enhances productivity because you can write most of your ASP.NET code as single-threaded. There's no need to worry about concurrency or thread safety.
In ASP.NET programming, there are some less common scenarios in which you do have to worry about thread safety. For example, if you create a utility class with public shared methods that update and read shared fields, you might be required to make the class thread safe by using synchronization techniques since the methods could be used simultaneously by multiple requests. You also need to be particularly wary of the objects you store in either the ASP.NET Cache object or Application dictionary. These types of objects are accessible to every incoming request and, therefore, have a high likelihood of being accessed concurrently by multiple worker threads. These are the types of objects that must be designed and written with thread safety in mind.
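As an illustration of the utility class scenario, here is a minimal sketch of a class with public shared methods that update and read shared fields. The RequestStats class and its members are hypothetical and exist only for this example. Because the two fields must stay consistent with each other, both methods synchronize on the same private lock object using the SyncLock construct covered later in this column.

```vb
Imports System

'*** hypothetical utility class; all names are illustrative only
Public Class RequestStats
    '*** shared fields: one copy, reachable from every request thread
    Private Shared requestCount As Integer = 0
    Private Shared totalMilliseconds As Long = 0
    Private Shared ReadOnly statsLock As New Object()

    '*** update both fields as a single unit of work
    Public Shared Sub RecordRequest(ByVal elapsedMilliseconds As Long)
        SyncLock statsLock
            requestCount += 1
            totalMilliseconds += elapsedMilliseconds
        End SyncLock
    End Sub

    '*** read both fields under the same lock so they always match
    Public Shared Function GetAverageMilliseconds() As Double
        SyncLock statsLock
            If requestCount = 0 Then Return 0.0
            Return totalMilliseconds / requestCount
        End SyncLock
    End Function
End Class
```

Without the lock, a request calling GetAverageMilliseconds could run between the two updates in RecordRequest and compute an average from a count and a total that do not belong together.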
Objects and Monitors
Monitors provide the most fundamental synchronization technique in the .NET Framework. A monitor is an exclusive locking mechanism available to every managed object. When you are writing instance methods within a class to update and read instance fields, you can utilize the monitor of the current object to synchronize access. A thread can use a monitor to block other threads for the period of time during which it is making a series of updates to an object. That means a monitor can be used in order to block other threads from seeing any single update until all of the updates have been completed.
Let's begin our discussion of programming with monitors by examining the shared methods named Enter and Exit supplied by the Monitor class. Note that the Monitor class is defined within the System.Threading namespace. You call Monitor.Enter to acquire an exclusive lock on an object. Each call to Monitor.Enter should be complemented with a call to Monitor.Exit to release the lock. The idea is that only one thread can be "inside" the monitor at any one time. Examine the rewritten implementation of the SetPointPosition method shown here:
    Sub SetPointPosition(ByVal x As Integer, ByVal y As Integer)
        '*** acquire exclusive lock using Monitor
        Monitor.Enter(Me)
        '*** perform thread-safe operations
        Me.x = x
        Me.y = y
        '*** release exclusive lock using Monitor
        Monitor.Exit(Me)
    End Sub
As you can see, the SetPointPosition method allows the calling thread to perform updates to both the X and Y coordinates while executing inside the scope of a monitor. The key point is that no other thread will be able to enter the monitor until the thread executing inside the scope of the monitor calls Monitor.Exit. The monitor allows a thread to perform a series of updates as an atomic unit of work, at least as seen by other threads that use the same monitor. In that respect it resembles the isolation a DBMS provides for a transaction, although a monitor does not roll back partial updates if the code fails midway. Figure 3 shows a graphic representation of the timing involved when the calling thread acquires and releases the monitor.
Figure 3 Monitor Timing
When you call the Monitor.Enter method, the calling thread will wait for as long as it takes to acquire the target monitor. If another thread has acquired the target monitor and never releases it, the call to Monitor.Enter will block indefinitely. In some designs, this can lead to a deadlock situation.
In addition to the Enter method, the Monitor class provides the TryEnter method, which accepts a timeout parameter and returns a Boolean value. TryEnter helps you avoid deadlocks because it doesn't block indefinitely when the target monitor is held by another thread. It returns True if it acquires the target monitor before the timeout expires, and it returns False as soon as the timeout expires without the monitor having been acquired.
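Here is a minimal sketch of how TryEnter might be used in the Point class. The TrySetPointPosition method and its timeoutMilliseconds parameter are hypothetical additions for this illustration. The method gives up and returns False if it cannot acquire the monitor in time, which lets the caller decide whether to retry or report a failure. The Try...Finally block is also an addition, included so that the monitor is released on every path once it has been acquired.

```vb
Imports System
Imports System.Threading

Class Point
    Private x As Integer
    Private y As Integer

    Sub New(ByVal x As Integer, ByVal y As Integer)
        Me.x = x
        Me.y = y
    End Sub

    '*** hypothetical method: returns False instead of waiting forever
    Function TrySetPointPosition(ByVal x As Integer, ByVal y As Integer, _
                                 ByVal timeoutMilliseconds As Integer) As Boolean
        If Not Monitor.TryEnter(Me, timeoutMilliseconds) Then
            '*** the monitor is still held by another thread
            Return False
        End If
        Try
            Me.x = x
            Me.y = y
        Finally
            '*** release only what was actually acquired
            Monitor.Exit(Me)
        End Try
        Return True
    End Function

    Sub GetPointPosition(ByRef x As Integer, ByRef y As Integer)
        Monitor.Enter(Me)
        Try
            x = Me.x
            y = Me.y
        Finally
            Monitor.Exit(Me)
        End Try
    End Sub
End Class

Module Program
    Sub Main()
        Dim p As New Point(10, 10)
        If p.TrySetPointPosition(20, 20, 500) Then
            Console.WriteLine("Position updated")
        Else
            Console.WriteLine("Timed out waiting for the monitor")
        End If
    End Sub
End Module
```

Note that Monitor.Exit must be called only when TryEnter has returned True. Calling it without holding the monitor raises an exception.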
Writing a Thread-Safe Class
Now that you have seen how calls to Monitor.Enter and Monitor.Exit make it possible to synchronize access, it's time to rewrite the definition of the Point class that was shown earlier. Examine the new class definition in Figure 4. As you can see, the execution paths through the methods SetPointPosition and GetPointPosition are synchronized using the current Point object's monitor. These modifications, the added calls to Monitor.Enter and Monitor.Exit, have made the Point class thread safe. No thread will ever be able to read a Point object's position while it is in an inconsistent state.
Figure 4 Making a Class Thread Safe
    Imports System.Threading

    '*** class rewritten to be thread safe
    Class Point
        '*** private fields
        Private x As Integer
        Private y As Integer

        '*** constructor does not require synchronization
        Sub New(ByVal x As Integer, ByVal y As Integer)
            Me.x = x
            Me.y = y
        End Sub

        '*** update point position with synchronization
        Sub SetPointPosition(ByVal x As Integer, ByVal y As Integer)
            Monitor.Enter(Me)
            Me.x = x
            Me.y = y
            Monitor.Exit(Me)
        End Sub

        '*** read point position with synchronization
        Sub GetPointPosition(ByRef x As Integer, ByRef y As Integer)
            Monitor.Enter(Me)
            x = Me.x
            y = Me.y
            Monitor.Exit(Me)
        End Sub
    End Class
When you are designing and writing thread-safe code, it's critical to understand what monitors do and what they do not do. A monitor provides you with a synchronization mechanism so you can ensure that a thread can perform a series of updates as an atomic unit of work. However, a monitor does not place a magic force field around an object and make it automatically thread safe. For example, how would it affect the Point class shown in Figure 4 if you added the following method?
    Sub TranslatePointPosition(x As Integer, y As Integer)
        Me.x += x
        Me.y += y
    End Sub
The answer is that the Point class would no longer be thread safe. The TranslatePointPosition method contains no synchronization code, meaning that it could be preempted right after changing the X coordinate. This would allow a call to GetPointPosition to see a Point object in an inconsistent state. It doesn't really matter whether the GetPointPosition method synchronizes its access with the object's monitor if the TranslatePointPosition method doesn't do so as well. All the entry points into an object must be synchronized in order to ensure thread safety.
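For comparison, here is a sketch of how TranslatePointPosition could be written so that the class stays thread safe. It uses the same monitor as the other two methods. The Try...Finally blocks are an addition to the Figure 4 style: Visual Basic .NET checks integer arithmetic for overflow by default, so the += statements can throw an exception, and without the Finally block the monitor would then never be released.

```vb
Imports System
Imports System.Threading

Class Point
    Private x As Integer
    Private y As Integer

    Sub New(ByVal x As Integer, ByVal y As Integer)
        Me.x = x
        Me.y = y
    End Sub

    Sub SetPointPosition(ByVal x As Integer, ByVal y As Integer)
        Monitor.Enter(Me)
        Try
            Me.x = x
            Me.y = y
        Finally
            Monitor.Exit(Me)
        End Try
    End Sub

    Sub GetPointPosition(ByRef x As Integer, ByRef y As Integer)
        Monitor.Enter(Me)
        Try
            x = Me.x
            y = Me.y
        Finally
            Monitor.Exit(Me)
        End Try
    End Sub

    '*** every entry point uses the same monitor
    Sub TranslatePointPosition(ByVal x As Integer, ByVal y As Integer)
        Monitor.Enter(Me)
        Try
            Me.x += x
            Me.y += y
        Finally
            Monitor.Exit(Me)
        End Try
    End Sub
End Class
```

With this version, a thread calling GetPointPosition waits until both coordinates have been translated before it reads them.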
Let's look at another potential problem. What would happen if you modified the X and Y fields of the Point class to be public members instead of private members? Once again, this would defeat thread safety because any thread holding a reference to a Point object could update or read either field without any form of synchronization. It's up to you to ensure that every entry point into your object contains the proper synchronization code.
As a final point in this section, I'd like you to think about whether the implementation of a constructor needs to be synchronized. In general, the answer to this question is no. A constructor executes on the single thread that has called the New operator. Moreover, a call to the New operator doesn't return a reference to the object being created until the constructor has completed its execution. This means a reference to an object cannot be shared across threads until after the constructor has completed (unless the constructor does something unorthodox such as storing its Me reference to a shared field). Because of this timing sequence, constructors do not require synchronized access.
Using the SyncLock Construct
Visual Basic .NET provides a language-specific feature for using a monitor to synchronize access to an object. For example, the SetPointPosition method from Figure 4 can be written like this:
    Sub SetPointPosition(x As Integer, y As Integer)
        SyncLock Me
            Me.x = x
            Me.y = y
        End SyncLock
    End Sub
As you can see, the SyncLock construct accepts an object reference. In this example, the SetPointPosition method passes the Me reference to the current Point object. (I've used Me to make the samples simpler; however, it is considered a better practice to lock on a private object internal to your class rather than on the instance itself, which is externally exposed.) When you use the SyncLock construct in this fashion, the Visual Basic .NET compiler expands your code to use a monitor to synchronize access to the instructions in the body of the SyncLock construct. The Visual Basic .NET compiler also adds extra code as a precaution to release the monitor in a Finally block. The previous implementation of SetPointPosition using a SyncLock construct is roughly equivalent to the following code:
    Sub SetPointPosition(x As Integer, y As Integer)
        Monitor.Enter(Me)
        Try
            Me.x = x
            Me.y = y
        Finally
            Monitor.Exit(Me)
        End Try
    End Sub
Whether you like to use the SyncLock construct or make explicit calls to Monitor.Enter and Monitor.Exit is really a matter of style and preference. The SyncLock construct is convenient because it automatically adds the code to ensure that the monitor is released. The final definition of the Point class is shown in Figure 5, with SyncLock blocks in place of the explicit Monitor calls.
Figure 5 The SyncLock Construct
    '*** class rewritten to be thread safe
    Class Point
        '*** private fields
        Private x As Integer
        Private y As Integer

        '*** constructor does not require synchronization
        Sub New(ByVal x As Integer, ByVal y As Integer)
            Me.x = x
            Me.y = y
        End Sub

        '*** update point position with synchronization
        Sub SetPointPosition(ByVal x As Integer, ByVal y As Integer)
            SyncLock Me
                Me.x = x
                Me.y = y
            End SyncLock
        End Sub

        '*** read point position with synchronization
        Sub GetPointPosition(ByRef x As Integer, ByRef y As Integer)
            SyncLock Me
                x = Me.x
                y = Me.y
            End SyncLock
        End Sub
    End Class
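The practice mentioned earlier, locking on an object internal to the class instead of on Me, might look like the following variation of Figure 5. This is a sketch only, and the positionLock field name is an assumption made for this illustration. Because the lock object is private, code outside the class cannot acquire the same monitor and interfere with the class's own synchronization.

```vb
Imports System

'*** variation of Figure 5 that locks on a private object
Class Point
    Private x As Integer
    Private y As Integer

    '*** hypothetical field: used only as a lock target
    Private ReadOnly positionLock As New Object()

    Sub New(ByVal x As Integer, ByVal y As Integer)
        Me.x = x
        Me.y = y
    End Sub

    Sub SetPointPosition(ByVal x As Integer, ByVal y As Integer)
        SyncLock positionLock
            Me.x = x
            Me.y = y
        End SyncLock
    End Sub

    Sub GetPointPosition(ByRef x As Integer, ByRef y As Integer)
        SyncLock positionLock
            x = Me.x
            y = Me.y
        End SyncLock
    End Sub
End Class
```

If client code were to write SyncLock on a Point reference for its own purposes, it would acquire a different monitor, so it could neither block nor be blocked by the methods shown here.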
Wrapping It Up
This month I focused on the fundamental principles of thread synchronization. I began by describing the dangers associated with running multiple threads concurrently within a single process. Objects and classes that are not thread safe are vulnerable to problems when they are accessed by two or more threads concurrently. This can lead to data being read in an inconsistent state which, in turn, can lead to applications crashing or producing incorrect results. To create robust apps that take advantage of multithreading, it's up to you to know when and how to write thread-safe code.
This month's column also introduced you to synchronization techniques using monitors. The .NET Framework makes a monitor available for each object to provide a simple and easy-to-use locking mechanism. You have also seen that the Visual Basic .NET SyncLock construct is a language-specific feature for synchronizing access using a monitor.
I must admit that there's still more you need to know to take advantage of the synchronization support that's built into the .NET Framework. In the next installment of Basic Instincts, I will examine more advanced synchronization techniques involving fine-grained locking, using the ReaderWriterLock class, and performing thread-safe operations on integer values. See you then.
Send your questions and comments for Ted to email@example.com.
Ted Pattison is a cofounder of Barracuda .NET, an education company that assists companies building collaborative applications using Microsoft technologies. Ted is the author of several books including Building Applications and Components with Visual Basic .NET (Addison-Wesley, 2003).