Advanced Basics
Being Generic Ain't So Bad
Ken Getz
Contents
A Sample Demo
Internal ArrayList
Generics in Action
I speak at a lot of user groups and conferences where I field tech support questions. Recently, a conference attendee (I'll call him Adam) came up to me with a sheaf of printouts, along with the following question. He had created a middle-tier component that exposes information about customers. His Customer class maps to data in his database and to a class that exposes an ArrayList instance that contains Customer objects. It works fine, but he can't keep inexperienced programmers who are working in the presentation tier from adding invalid items to the Customers collection. In addition, even when they do get it right, their code has to perform all sorts of type casts because the ArrayList can only hand them back Object instances (instead of strongly typed Customer objects). Isn't there a better way to handle all of this?
I'm guessing that many of you experienced developers can propose at least a few solutions to this problem, but I'll pass along what I suggested to Adam. I started with the technique I would use in Visual Studio® .NET 2003, and then progressed to the technique we'll all likely be using once Visual Studio 2005 (and Visual Basic® 2005) becomes available. If you haven't done so already, you should download Visual Basic 2005 Express Beta 1, available for free at https://msdn.microsoft.com/vbasic. For more information on Visual Basic 2005 Express, see Brian Randell's article in this issue of MSDN® Magazine. Note that some of the information in this column is based on a prerelease version of Visual Studio 2005. Some features may change in the final product.
In order to explain Adam's situation, I'll craft up a little sample demo that you can try yourself, and then we'll walk through two different solutions.
A Sample Demo
To test out the current situation, create a new Class Library project, name the project MiddleTier, and add two class files: Customer.vb and DataLayer.vb. Select Project | MiddleTier Properties to display the MiddleTier Property Pages dialog box, and on the Build tab, set the Option Strict option to On.
Add the code shown in Figure 1 to the Customer class. Place the following code in the DataLayer class, emulating Adam's public ArrayList instance containing Customer objects:
Public Class DataLayer
    Public Customers As New ArrayList
End Class
That's all it takes to show off the bad behavior. To prove how easy it is to cause trouble in this scenario, add a second project (a Windows®-based application) named CustomerDemo to your solution, and set this new project as the startup project. Again, select Project | CustomerDemo Properties to display the CustomerDemo Property Pages dialog box and on the Compile tab, set the Option Strict option to On. In the new project, add a reference to the MiddleTier project. Open Form1 in the form designer and add a button to the form. Double-click the button to open the code designer and add the following class-level definitions:
Private data As New MiddleTier.DataLayer
Private cust As MiddleTier.Customer
In the button's Click event handler, add the following code:
' You can create a Customer object and keep it around...
cust = New MiddleTier.Customer("Nancy", "Davolio")
data.Customers.Add(cust)

' ...or you can create Customer objects on the fly.
data.Customers.Add(New MiddleTier.Customer("Andrew", "Fuller"))
data.Customers.Add(New MiddleTier.Customer("Janet", "Leverling"))
As you're typing, notice that the Add method of the Customers object accepts an Object as its parameter, as shown in Figure 2.
Figure 1 Creating a Simple Customer Class
Public Class Customer
    Public FirstName As String
    Public LastName As String

    Public Sub New( _
        ByVal FirstName As String, ByVal LastName As String)
        Me.FirstName = FirstName
        Me.LastName = LastName
    End Sub

    Public Overrides Function ToString() As String
        Return Me.FirstName & " " & Me.LastName
    End Function
End Class
Figure 2 Adding an Object
Add a second button to the form and add the following code to its Click event handler:
For Each cust As MiddleTier.Customer In data.Customers
    Debug.WriteLine(cust.ToString)
Next
Run the project, click each of the buttons, and verify that you can indeed add new Customer objects to the Customers ArrayList and that you can iterate through the items and call the ToString method of each. Back in design mode, try modifying the second button's Click event handler to interact with a single Customer object, as shown in the following line:
cust = data.Customers(0)
Unfortunately, this code won't work—because you've set the Option Strict option to On (as you should), the code can't compile. You're attempting to assign an Object into a MiddleTier.Customer reference, and that isn't legal. To make the code work, you'll need to cast the reference as the correct type:
cust = CType(data.Customers(0), MiddleTier.Customer)
Not only is this code slightly slower than it would be if data.Customers(0) already returned a MiddleTier.Customer reference, it also requires every developer who consumes the DataLayer.Customers ArrayList to remember to make the conversion every time. There has to be a better solution!
Finally, try one more devious trick that an inexperienced developer might exploit. Add the following code to the existing code in the first button's Click event handler:
data.Customers.Add("John Smith")
No one would dispute that adding a String to the Customers ArrayList is different from adding a Customer object, yet the code compiles just fine. Indeed, when you run the project and click the first button, nothing seems to be wrong. When you click the second button, however, the code fails with an InvalidCastException: you can't cast a String to a MiddleTier.Customer, but that's exactly what the For Each loop is attempting to do here.
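As a stopgap while the ArrayList is still public, consuming code can at least protect itself by iterating as Object and checking each item's type before casting. The following is only a minimal sketch of that idea: the MixedListDemo module and its DescribeItem helper are hypothetical names, and the Customer class repeats Figure 1 so the sample compiles on its own.

```vb
Imports System
Imports System.Collections

' Same shape as the Customer class in Figure 1, repeated so the sketch compiles on its own.
Public Class Customer
    Public FirstName As String
    Public LastName As String

    Public Sub New(ByVal FirstName As String, ByVal LastName As String)
        Me.FirstName = FirstName
        Me.LastName = LastName
    End Sub

    Public Overrides Function ToString() As String
        Return Me.FirstName & " " & Me.LastName
    End Function
End Class

Public Module MixedListDemo
    ' Hypothetical helper: describes an item without risking an InvalidCastException.
    Public Function DescribeItem(ByVal item As Object) As String
        If TypeOf item Is Customer Then
            Return CType(item, Customer).ToString()
        End If
        Return "Skipping a " & item.GetType().Name
    End Function

    Public Sub Main()
        Dim items As New ArrayList
        items.Add(New Customer("Nancy", "Davolio"))
        items.Add("John Smith")   ' Compiles, because Add accepts any Object.

        ' Iterate as Object so the loop itself never performs a failing cast.
        For Each item As Object In items
            Console.WriteLine(DescribeItem(item))
        Next
    End Sub
End Module
```

This only moves the check into every consumer, which is exactly the burden Adam wanted to remove, so treat it as a defensive measure rather than a fix.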
Internal ArrayList
In Visual Studio .NET 2003, the solution to all of these problems is to create a wrapper class, that is, a class that encapsulates the ArrayList rather than exposing it directly. In Visual Basic 6.0, many developers did just this, hiding a Collection instance inside a container class and handling all the delegation for the encapsulated class. In Visual Basic .NET, things are simpler. You can inherit from the System.Collections.CollectionBase class instead. This class maintains its own internal ArrayList instance and allows you to provide your own type-safe Add, Item, and other methods, so that developers can work with members of the ArrayList without needing to worry about unsafe additions and type conversions. The CollectionBase class provides two properties that allow you to interact with its internal ArrayList: the InnerList property returns an ArrayList reference and the List property returns an IList.
The difference between the two is that the List property acts as a wrapper around the InnerList property. When you call the List.Add method, for example, the CollectionBase class first calls its OnValidate and OnInsert overrides, and then calls the InnerList.Add method. After the call to InnerList.Add, the class calls the OnInsertComplete method override. These methods allow you to provide code that runs both before and after the data is inserted into the data structure, and let you validate the data as well. For Adam's example, working with the InnerList property directly is simplest because there's no need to override the other procedures.
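If you do want the validation hooks, a collection can route its writes through the List property and override OnValidate. The sketch below is one possible way to do that, not the approach used in the rest of this column: ValidatedCustomers is a hypothetical class name, and the Customer class repeats Figure 1 so the sample compiles on its own.

```vb
Imports System
Imports System.Collections

' Same shape as the Customer class in Figure 1, repeated so the sketch compiles on its own.
Public Class Customer
    Public FirstName As String
    Public LastName As String

    Public Sub New(ByVal FirstName As String, ByVal LastName As String)
        Me.FirstName = FirstName
        Me.LastName = LastName
    End Sub
End Class

' Hypothetical collection that writes through List, so CollectionBase
' calls OnValidate before every insert.
Public Class ValidatedCustomers
    Inherits CollectionBase

    Public Function Add(ByVal value As Customer) As Integer
        ' List.Add runs OnValidate and OnInsert, adds the item,
        ' and then runs OnInsertComplete.
        Return List.Add(value)
    End Function

    Default Public ReadOnly Property Item(ByVal index As Integer) As Customer
        Get
            Return CType(List(index), Customer)
        End Get
    End Property

    Protected Overrides Sub OnValidate(ByVal value As Object)
        MyBase.OnValidate(value)   ' The base implementation rejects Nothing.
        If Not TypeOf value Is Customer Then
            Throw New ArgumentException("Only Customer objects are allowed.", "value")
        End If
    End Sub
End Class

Public Module ValidationDemo
    Public Sub Main()
        Dim custs As New ValidatedCustomers
        custs.Add(New Customer("Nancy", "Davolio"))

        ' CollectionBase implements IList, so a caller can still reach an
        ' untyped Add through the interface. OnValidate catches that case.
        Dim untyped As IList = custs
        Try
            untyped.Add("John Smith")
        Catch ex As ArgumentException
            Console.WriteLine("Rejected: " & ex.Message)
        End Try

        Console.WriteLine(custs.Count)
    End Sub
End Module
```

The trade-off is a small amount of extra work on every insert in exchange for a check that also covers callers who go through the IList interface.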
To create a type-safe encapsulated ArrayList, add a new class to your MiddleTier project, named Customers.vb. Add the code in Figure 3 to your class. The Customers class inherits from the CollectionBase class and encapsulates all the methods of the ArrayList class that normally accept Object types as parameters or that return Object types. Because the inner ArrayList is hidden from the outside world, it's not possible to add anything to this data structure that isn't the correct type.
Figure 3 Inheriting from the CollectionBase Class
Public Class Customers
    Inherits CollectionBase

    Default Public Property Item(ByVal index As Integer) As Customer
        Get
            Return CType(InnerList(index), Customer)
        End Get
        Set(ByVal Value As Customer)
            InnerList(index) = Value
        End Set
    End Property

    Public Function Add(ByVal value As Customer) As Integer
        Return InnerList.Add(value)
    End Function

    Public Function IndexOf(ByVal value As Customer) As Integer
        Return InnerList.IndexOf(value)
    End Function

    Public Sub Insert( _
        ByVal index As Integer, ByVal value As Customer)
        InnerList.Insert(index, value)
    End Sub

    Public Sub Remove(ByVal value As Customer)
        InnerList.Remove(value)
    End Sub

    Public Function Contains(ByVal value As Customer) As Boolean
        Return InnerList.Contains(value)
    End Function
End Class
To test out the new data structure, add a new button to the sample form, double-click the button to switch to the code editor, and add code like the following to the button's Click event handler:
Dim custs As New MiddleTier.Customers
cust = New MiddleTier.Customer("Robert", "King")
custs.Add(cust)
If custs.Contains(cust) Then
    Debug.WriteLine("Robert is in there")
End If
As you type, note that you can't add anything to the Customers class other than a Customer object, as shown in Figure 4.
Figure 4 Type-Safe Access
If you run the code, you'll verify that the Customers class contains a simple ArrayList containing Customer objects and that you cannot add anything to this class that isn't a Customer object. (It's important to note that if someone inherits from your Customers class, they'll have direct access to the InnerList property as well and could "corrupt" the ArrayList contents by modifying it directly. If this is an issue for you, consider making the class noninheritable using the NotInheritable keyword in the class definition.)
It may seem as though I had, at this point, completely solved Adam's problem. But not quite. As soon as I got this far, Adam responded with another set of questions. Specifically, he wanted to know what happens when he needs to provide an Employees collection, or an Invoices collection. Would that require him to completely rewrite the code in the Customers class each time, except for a different type of contained class? Isn't it possible to just write the collection class once, but have Visual Basic .NET keep track of the type of object you want to allow in the collection?
Yep, that's all true. It may seem an insurmountable task to write and maintain all of this code in Visual Studio .NET 2003, but if you want to take advantage of this technique now, consider investing in tools that provide code generation. For example, CodeSmith reduces the task to a 30-second process. Use a template and out pops the collection class you need. Worth every penny, I'd say (and it's free)!
As the Visual Basic language progresses, wouldn't it be great if you could create a collection class once and then, at design time, indicate the type of object your collection could contain? And, of course, you'd expect that attempting to add anything besides the correct type would trigger a compile-time error, right? Obviously, I wouldn't be pushing so hard if this weren't exactly what you'll find in Visual Basic 2005.
By this point in Adam's questioning, I was getting awfully hungry and tired, but I couldn't give up without digging into one of the most important new features to be included in Visual Basic 2005—the ability to create and consume type-safe classes that can determine the type of objects they work with at compile time, rather than at run time. These classes are called generics, and you've probably heard that generics are an important addition to the language. You're likely wondering what they are and how they can help you.
Generics in Action
In a nutshell, generics allow you to let the compiler control the type of object passed to a method or contained within a collection at the time the class is consumed, as opposed to checking the type at run time or creating a separate class for each different type. Controlling types at the time you create the collection, as in the previous example, requires you to create a separate chunk of code for each type you'd like to support. Controlling types at run time (as in Adam's original situation) requires you to pay a price both in terms of performance and convenience. Generics allow you to have the best of both worlds—that is, you can write code that is type agnostic, but at compile time the compiler determines what types to use and won't allow any other types. Generics make it possible for developers to create type-agnostic classes and methods that can determine the types they'll work with when they're consumed, not when they're created in the first place. (For more background on generics, see the .NET columns in the September and October 2003 issues of MSDN Magazine.)
In order to make it easy for developers to take advantage of generics, the Microsoft® .NET Framework 2.0 contains a new namespace, System.Collections.Generic, which provides a number of generic collection classes. For example, the System.Collections.Generic.Collection class allows you to specify the type of object an instance of the class can accept, and it handles all the encapsulation details for you. Rather than requiring you to create a class that inherits from CollectionBase and writing all the type-specific code, you simply create an instance of the class, and the .NET Framework does the rest.
To see generics in action, fire up Visual Basic 2005, create a new Windows-based application, and name it GenericsDemo. Add a Class Library project to the solution, and name it MiddleTier. In the MiddleTier project, add a Customer class using the code that was shown in Figure 1. Add a second class to the project named DataLayer, and insert the following code in the new class:
Imports System.Collections.Generic

Public Class DataLayer
    Public Customers As New Collection(Of Customer)
End Class
The new syntax ("Of Customer") is the clear giveaway that something serious has changed here. This new syntax indicates to the compiler that it should create an instance of the System.Collections.Generic.Collection class and set up the collection so that it can only contain Customer objects. To create a second collection that can contain only String values, for example, you could add another declaration like this:
Public StringValues As New Collection(Of String)
In the GenericsDemo project, add a reference to the MiddleTier project. Add a button to Form1 and double-click it to load the Click event handler. Then add the following code:
Dim data As New MiddleTier.DataLayer
Dim cust As MiddleTier.Customer

cust = New MiddleTier.Customer("Nancy", "Davolio")
data.Customers.Add(cust)
data.Customers.Add(New MiddleTier.Customer("Andrew", "Fuller"))
data.Customers.Add(New MiddleTier.Customer("Janet", "Leverling"))
Scroll to the top of the class and add the following Imports statement (you'll use this in later code snippets):
Imports System.Collections.Generic
Run the project, and although it doesn't appear to do much, it does prove that you can easily create a generic collection class that contains a particular type. Back in the code view, try adding anything besides a MiddleTier.Customer instance to the Customers collection—the code will simply not compile if you do.
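The Beta 1 layout described here did change before release. In the released .NET Framework 2.0, the generic Collection class lives in the System.Collections.ObjectModel namespace, while System.Collections.Generic supplies List(Of T) as the closest generic counterpart to ArrayList. The following sketch shows the same DataLayer idea against the released framework. The CustomerList field and the ReleasedApiDemo module are hypothetical names added for illustration, and the Customer class repeats Figure 1 so the sample compiles on its own.

```vb
Imports System
Imports System.Collections.Generic

' Same shape as the Customer class in Figure 1, repeated so the sketch compiles on its own.
Public Class Customer
    Public FirstName As String
    Public LastName As String

    Public Sub New(ByVal FirstName As String, ByVal LastName As String)
        Me.FirstName = FirstName
        Me.LastName = LastName
    End Sub

    Public Overrides Function ToString() As String
        Return Me.FirstName & " " & Me.LastName
    End Function
End Class

Public Class DataLayer
    ' Fully qualified to make the released namespace explicit.
    Public Customers As New System.Collections.ObjectModel.Collection(Of Customer)

    ' Hypothetical second field: List(Of T) behaves much like a typed ArrayList.
    Public CustomerList As New List(Of Customer)
End Class

Public Module ReleasedApiDemo
    Public Sub Main()
        Dim data As New DataLayer
        data.Customers.Add(New Customer("Nancy", "Davolio"))
        data.CustomerList.Add(New Customer("Andrew", "Fuller"))

        ' No CType needed: the indexer already returns a Customer.
        Dim first As Customer = data.Customers(0)
        Console.WriteLine(first.ToString())

        ' The next line would not compile, because a String is not a Customer:
        ' data.Customers.Add("John Smith")
    End Sub
End Module
```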
The System.Collections.Generic namespace includes a number of different collection types, each having a specific purpose. The Collection class you've seen allows you to create a strongly typed structure similar to an ArrayList. The Dictionary class allows you to create a strongly typed hashtable, and this generic declaration allows you to specify the type of both the key and the value as you write your code. You can also retrieve items given their key, as in the following code fragment (you can add this code to the Click event handler in your sample project, if you like):
Dim list2 As New Dictionary(Of String, MiddleTier.Customer)
list2.Add("DAVOLION", New MiddleTier.Customer("Nancy", "Davolio"))
list2.Add("FULLERA", New MiddleTier.Customer("Andrew", "Fuller"))
list2.Add("LEVERLINGJ", New MiddleTier.Customer("Janet", "Leverling"))
Debug.WriteLine(list2.Item("FULLERA").FirstName)
Note the syntax that's required when a class definition requires two generic types:
Dim list2 As New Dictionary(Of String, MiddleTier.Customer)
The generic LinkedList class allows you to create a classic linked list, strongly typed so that it can contain only values of a particular type. (It's odd that the previous versions of the .NET Framework didn't include support for linked lists; if you wanted one, you had to create it yourself.) Linked lists allow you to create a linear series of nodes, inserting and removing nodes at any point in the list. You can traverse the list by reading the Next property of each node, and you can work with the value of each node using the Value property. The code in Figure 5, which you can add to the button's Click event handler, demonstrates how you might use this class. Although not every application requires the use of a linked list, when you need one, it's nice to have a generic class built into the Framework.
Figure 5 Using LinkedList
Dim list3 As New LinkedList(Of String)
Dim node As LinkedListNode(Of String)

' Add some nodes at various places in the linked list.
node = list3.AddHead("Item 1")
list3.AddAfter(node, "Item 2")
list3.AddBefore(node, "Item 3")
list3.AddTail("Item 4")
list3.AddHead("Item 5")
list3.AddAfter(list3.Head, "Item 6")

' Remove a few nodes.
list3.Remove("Item 3")
list3.Remove(node)

' Iterate through all the nodes in the list.
node = list3.Head
While node IsNot Nothing
    Debug.WriteLine(node.Value)
    node = node.Next
End While
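Figure 5 uses the Beta 1 member names. In the released .NET Framework 2.0, LinkedList(Of T) exposes AddFirst, AddLast, and First in place of AddHead, AddTail, and Head. The sketch below repeats the same sequence of operations with the released names. The LinkedListDemo module and its BuildList function are hypothetical names used only to keep the sample self-contained.

```vb
Imports System
Imports System.Collections.Generic

Public Module LinkedListDemo
    ' Hypothetical helper: performs the same steps as Figure 5
    ' using the released member names.
    Public Function BuildList() As LinkedList(Of String)
        Dim items As New LinkedList(Of String)
        Dim node As LinkedListNode(Of String)

        ' Add some nodes at various places in the linked list.
        node = items.AddFirst("Item 1")        ' Beta 1: AddHead
        items.AddAfter(node, "Item 2")
        items.AddBefore(node, "Item 3")
        items.AddLast("Item 4")                ' Beta 1: AddTail
        items.AddFirst("Item 5")
        items.AddAfter(items.First, "Item 6")  ' Beta 1: Head

        ' Remove one node by value and one by node reference.
        items.Remove("Item 3")
        items.Remove(node)

        Return items
    End Function

    Public Sub Main()
        ' Walk the list from the first node until Next returns Nothing.
        Dim node As LinkedListNode(Of String) = BuildList().First
        While node IsNot Nothing
            Console.WriteLine(node.Value)
            node = node.Next
        End While
    End Sub
End Module
```

After the two removals, the list holds Item 5, Item 6, Item 2, and Item 4, in that order.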
As you can see, using generic collections solves both of Adam's problems: he can expose a simple public variable that represents his collection of Customer objects, and he needn't worry about other developers purposefully or mistakenly adding objects to his collection that aren't of the correct type. In addition, if he wants to create other collections, he doesn't need to write any extra code. He can simply create a new public instance of a generic collection and add it to his middle tier.
You can also create your own generic collection classes using Visual Basic 2005, although it's unlikely that you will ever need to. The generic collections provided by the .NET Framework take care of most situations. If you determine that you need your own generic collection, it's not difficult to do: simply modify the code in Figure 3 to meet your own needs. A simple generic collection class might look like Figure 6, a generic collection class wrapping up the CollectionBase class. Of course, there's no reason to write code as shown in Figure 6 because of the rich set of collection classes in the System.Collections.Generic namespace.
Figure 6 Creating a Generic Collection Class
Public Class MyCollection(Of MyType)
    Inherits CollectionBase

    Default Public Property Item(ByVal index As Integer) As MyType
        Get
            Return CType(InnerList(index), MyType)
        End Get
        Set(ByVal Value As MyType)
            InnerList(index) = Value
        End Set
    End Property

    Public Function Add(ByVal value As MyType) As Integer
        Return InnerList.Add(value)
    End Function

    Public Function IndexOf(ByVal value As MyType) As Integer
        Return InnerList.IndexOf(value)
    End Function

    Public Sub Insert(ByVal index As Integer, ByVal value As MyType)
        InnerList.Insert(index, value)
    End Sub

    Public Sub Remove(ByVal value As MyType)
        InnerList.Remove(value)
    End Sub

    Public Function Contains(ByVal value As MyType) As Boolean
        Return InnerList.Contains(value)
    End Function
End Class
To use the MyCollection class, you could add a second public reference in the DataLayer class, like this:
Public MyCustomers As New MyCollection(Of Customer)
The MyCollection class doesn't add any useful functionality, but does show how you can create your own generic collections.
There's a lot more to generics, of course. You can use generics when creating procedures, allowing a procedure to accept more than one type of parameter. The canonical example is the Swap procedure, which is useful when writing sorting routines:
Public Sub Swap(Of MyType)(ByRef item1 As MyType, ByRef item2 As MyType)
    Dim temp As MyType
    temp = item1
    item1 = item2
    item2 = temp
End Sub
Once you've defined a swapping procedure that accepts two parameters of the same generic type, you can write code like the following which uses the Swap procedure:
Dim i As Integer = 0
Dim j As Integer = 1
Swap(i, j)

Dim x As String = "Hello"
Dim y As String = "World"
Swap(x, y)
Try passing in two different types to the Swap procedure, however, and your code won't compile. This technique adds a huge amount of flexibility and type safety to your applications (not to mention performance gains because you're no longer passing parameters as Object types and boxing value types).
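To make the sorting connection concrete, here is a rough sketch of a bubble sort built on the Swap procedure. It goes one step beyond what this column covers by adding a type constraint (Of T As IComparable(Of T)), so that the routine can compare items. The SwapSortDemo module and the BubbleSort procedure are hypothetical names, and a real application would normally call Array.Sort instead.

```vb
Imports System

Public Module SwapSortDemo
    Public Sub Swap(Of MyType)(ByRef item1 As MyType, ByRef item2 As MyType)
        Dim temp As MyType = item1
        item1 = item2
        item2 = temp
    End Sub

    ' Hypothetical helper: sorts an array in place. The constraint limits T
    ' to types that know how to compare themselves to another T.
    Public Sub BubbleSort(Of T As IComparable(Of T))(ByVal items() As T)
        For pass As Integer = items.Length - 1 To 1 Step -1
            For i As Integer = 0 To pass - 1
                If items(i).CompareTo(items(i + 1)) > 0 Then
                    ' Array elements can be passed ByRef directly.
                    Swap(items(i), items(i + 1))
                End If
            Next
        Next
    End Sub

    Public Sub Main()
        Dim numbers() As Integer = {5, 3, 9, 1}
        BubbleSort(numbers)
        For Each n As Integer In numbers
            Console.WriteLine(n)
        Next

        Dim names() As String = {"Leverling", "Davolio", "Fuller"}
        BubbleSort(names)
        Console.WriteLine(String.Join(", ", names))
    End Sub
End Module
```

The same BubbleSort and Swap code serves both the Integer array and the String array, and the compiler infers the type argument from each call.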
At this point, Adam had had enough and both of us were starving. I left the scene content that one more developer was convinced that generics are the coolest thing since, well, 2003. I'm looking forward to finding new ways to use generics and to digging deeper into all their subtleties that I didn't have room to cover here. Generics are just one of the many exciting new features in Visual Basic 2005, and I'm hoping to delve into them all before long.
Send your questions and comments to basics@microsoft.com.
Ken Getz is a senior consultant with MCW Technologies. He is coauthor of ASP .NET Developers Jumpstart (Addison-Wesley, 2002), Access Developer's Handbook (Sybex, 2001), and VBA Developer's Handbook, 2nd Edition (Sybex, 2001). Reach him at keng@mcwtech.com.