August 2009

Volume 24 Number 08

Entity Framework - N-Tier Application Patterns

By Daniel Simmons | August 2009

This article discusses:

  • N-tier design patterns
  • Entity Framework
  • Microsoft .NET Framework 4

This article uses the following technologies:
Entity Framework, Windows Communication Foundation

Contents

Change Set
DTOs
Simple Entities
Self-Tracking Entities
Implementing N-Tier with the Entity Framework
Concurrency Tokens
Serialization
Working with the ObjectStateManager
Patterns Other Than Simple Entities in .NET 3.5 SP1
API Improvements in .NET 4

In my previous article, I described a foundation on which you can build successful n-tier applications, focusing mainly on anti-patterns to avoid. There are many issues to consider before making decisions about the design of an n-tier application. In this article, I examine n-tier patterns for success and some of the key APIs and issues specific to the Entity Framework. I also provide a sneak peek at features coming in the Microsoft .NET Framework 4 that should make n-tier development significantly easier.

Change Set

The idea behind the change set pattern is to create a serializable container that can keep the data needed for a unit of work together and, ideally, perform change tracking automatically on the client. The container glues together the parts of the unit of work in a custom format, so this approach also tends to be quite full-featured and is easy to use on the mid-tier and on the client.

DataSet is the most common example of this pattern, but other examples exist, such as the EntityBag sample I wrote some time ago as an exploration of this technique with the Entity Framework. Both examples exhibit some of the downsides of this pattern. First, the change set pattern places significant constraints on the client because the wire format tends to be very specific to the change set and hard to make interoperable. In practice, the client must use .NET with the same change set implementation used on the mid-tier. Second, the wire format is usually quite inefficient. Among other things, change sets are designed to handle arbitrary schemas, so overhead is required to track the instance schema. Another issue with change set implementations such as DataSet, but not necessarily endemic to the pattern, is the ease with which you can end up tightly coupling two or more of the tiers, which causes problems if you have different rates of change. Finally, and probably of most concern, is how easy it is to abuse the change set.

In some ways, this pattern automates and submerges critical concerns that should be at the forefront of your mind when designing your solution. Precisely because it is so easy to put data into the change set, send it to the mid-tier, and then persist, you can do so without verifying on the mid-tier that the changes you are persisting are only of the type that you expect. Imagine that you have a service intended to add an expense report to your accounting system that ends up also modifying someone's salary.

The change set pattern is best used in cases where you have full control over client deployment so that you can address the coupling and technology requirement issues. It is also the right choice if you want to optimize for developer efficiency rather than runtime efficiency. If you do adopt this pattern, be sure to exercise the discipline to validate any changes on the mid-tier rather than blindly persisting whatever changes arrive.
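As a minimal sketch of that discipline, assuming the change set is a DataSet and that the service method is only supposed to insert rows into one table, the mid-tier could reject anything else before persisting. The ChangeSetValidator class and the table name passed to it are hypothetical names used for illustration only.

```csharp
using System;
using System.Data;

public static class ChangeSetValidator
{
    // Returns true only if every pending change is a newly added row
    // in the single table this service method is allowed to touch.
    public static bool ContainsOnlyInserts(DataSet changes, string allowedTable)
    {
        foreach (DataTable table in changes.Tables)
        {
            foreach (DataRow row in table.Rows)
            {
                if (row.RowState == DataRowState.Unchanged)
                {
                    continue; // untouched rows carry no change to persist
                }

                bool isExpectedInsert =
                    row.RowState == DataRowState.Added &&
                    table.TableName == allowedTable;

                if (!isExpectedInsert)
                {
                    return false; // a modify, a delete, or an insert elsewhere
                }
            }
        }
        return true;
    }
}
```

An expense-report service could call ContainsOnlyInserts(incoming, "ExpenseReports") and refuse to persist when it returns false, which would catch a change set that also carries a modified salary row.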

DTOs

At the opposite end of the spectrum from change sets are Data Transfer Objects, or DTOs. The intent of this pattern is to separate the client and the mid-tier by using different types to hold the data on the mid-tier and the data on the client and in the messages sent between them.

The DTO approach requires the most effort to implement, but when implemented correctly, it can achieve the most architectural benefits. You can develop and evolve your mid-tier and your client on completely separate schedules because you can keep the data that travels between the two tiers in a stable format regardless of changes made on either end. Naturally, at times you'll need to add some functionality to both ends, but you can manage the rollout of that functionality by building versioning plus backward and forward compatibility into the code that maps the data to and from the transfer objects. Because you explicitly design the format of the data for when it transfers between the tiers, you can use an approach that interoperates nicely with clients that use technologies other than .NET. If necessary, you can use a format that is very efficient to send across the wire, or you can choose, for instance, to exchange only a subset of an entity's data for security reasons.

The downside to implementing DTOs is the extra effort required to design three different sets of types for essentially the same data and to map the information between the types. You can consider a variety of shortcuts, however, like using DTOs as the types on the client so that you have to design only two types instead of three; using LINQ to Objects to reduce the code that must be written to move data between the types; or using an automatic mapping library, which can further reduce the code for copying data by detecting patterns such as properties with the same name on more than one type. But there is no way around the fact that this pattern involves more effort than any of the other options—at least for initial implementation.

This is the pattern to consider when your solution becomes very large with very sophisticated requirements for interoperability, long-term maintenance, and the like. The longer the life of a project, the more likely that DTOs will pay off. For many projects, however, you might be able to achieve your goals with a pattern that requires less effort.
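The LINQ to Objects shortcut mentioned above can be sketched as follows. CustomerEntity, its CreditLimit property, and CustomerMapper are hypothetical names chosen for illustration; the point is only that the wire type carries a deliberate subset of the mid-tier type.

```csharp
using System.Collections.Generic;
using System.Linq;

// Mid-tier type (hypothetical; stands in for a persisted entity).
public class CustomerEntity
{
    public string CustomerID { get; set; }
    public string ContactName { get; set; }
    public string Phone { get; set; }
    public decimal CreditLimit { get; set; } // internal detail, never sent to clients
}

// Wire type: only the fields the client is allowed to see.
public class CustomerDTO
{
    public string Name { get; set; }
    public string Phone { get; set; }
}

public static class CustomerMapper
{
    // Copies each entity into a DTO, renaming and dropping fields on the way.
    public static List<CustomerDTO> ToDTOs(IEnumerable<CustomerEntity> customers)
    {
        return customers
            .Select(c => new CustomerDTO { Name = c.ContactName, Phone = c.Phone })
            .ToList();
    }
}
```

Because the mapping lives in one place, a later change to CustomerEntity only requires touching CustomerMapper, and the wire format can stay stable.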

Simple Entities

Like the change set pattern, the simple entities pattern reuses the mid-tier entity types on the client, but unlike change sets, which wrap those entities in a complex data structure for communication between tiers, simple entities strives to keep the complexity of the data structure to a minimum and passes entity instances directly to service methods. The simple entities pattern allows only simple property modification to entity instances on the client. If more complex operations are required, such as changing the relationships between entities or accomplishing a combination of inserts, updates, and deletes, those operations should be represented in the structure of the service methods.

The beauty of the simple entities approach is that no extra types are required and no effort has to be put into mapping data from one type to another. If you can control deployment of the client, you can reuse the same entity structures (either the same assemblies or proxies), and even if you have to work with a client technology other than .NET, the data structures are simple and therefore easy to make interoperable. The client implementation is typically straightforward because minimal tracking is required. When properties must be modified, the client can change them directly on an entity instance. When operations involving multiple entities or relationships are required, special service methods do the work.

The primary disadvantage of this pattern is that more methods are usually required on the service if you need to accomplish complex scenarios that touch multiple entities. This leads to either chatty network traffic, where the client has to make many service calls to accomplish a scenario, or special-purpose service methods with many arguments.

The simple entities approach is especially effective when you have relatively simple clients or when the scenarios are such that operations are homogeneous. Consider, for example, the implementation of an e-commerce system in which the vast majority of operations involve creating new orders. You can design your application-interaction patterns so that modifications to information like customer data are performed in separate operations from creating new orders. Then the service methods you need are generally either queries for read-only data, modifications to one entity at a time without changing much in the way of relationships, or inserting a set of related entities all at once for a new order. The simple entities pattern works fairly well with this kind of scenario. When the overall complexity of a solution goes up, when your client becomes more sophisticated, or when network performance is so critical that you need to carefully tune your wire format, other patterns are more appropriate.
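To make the shape of such a service concrete, here is a sketch of a WCF contract for the e-commerce example. The interface name, the method names other than GetCustomerByID and UpdateCustomer, and the hand-written entity classes are all assumptions made for illustration; in a real project the entity types would come from your model.

```csharp
using System.Collections.Generic;
using System.Runtime.Serialization;
using System.ServiceModel;

[DataContract]
public class Customer
{
    [DataMember] public string CustomerID { get; set; }
    [DataMember] public string ContactName { get; set; }
}

[DataContract]
public class OrderLine
{
    [DataMember] public int ProductID { get; set; }
    [DataMember] public int Quantity { get; set; }
}

[DataContract]
public class Order
{
    [DataMember] public string CustomerID { get; set; }
    [DataMember] public List<OrderLine> Lines { get; set; }
}

[ServiceContract]
public interface IOrderService
{
    // Read-only query.
    [OperationContract]
    Customer GetCustomerByID(string id);

    // Simple property changes to a single entity.
    [OperationContract]
    void UpdateCustomer(Customer modified);

    // Inserts a set of related entities all at once.
    [OperationContract]
    void SubmitOrder(Order newOrder);

    // A relationship change is expressed as its own operation
    // rather than as an edit to an entity graph on the client.
    [OperationContract]
    void MoveOrderToCustomer(int orderId, string newCustomerId);
}
```

Each operation states its intent in its signature, which is what keeps the client simple, at the cost of one method per scenario.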

Self-Tracking Entities

The self-tracking entities pattern is designed to build on the simple entities pattern and achieve a good balance between the various concerns to create a single pattern that works in many scenarios. The idea is to create smart entity objects that keep track of their own changes and changes to related entities. To reduce constraints on the client, these entities are plain-old CLR objects (POCO) that are not tied to any particular persistence technology—they just represent the entities and some information about whether they are unchanged, modified, new, or marked for deletion.

Because the entities are self-tracking, they have many of the ease-of-use characteristics of a change set, but because the tracking information is built into the entities themselves and is specific to their schema, the wire format can be more efficient than with a change set. In addition, because they are POCO, they make few demands on the client and interoperate well. Finally, because validation logic can be built into the entities themselves, you can more easily remain disciplined about enforcing the intended operations for a particular service method.

There are two primary disadvantages for self-tracking entities compared to change sets. First, a change set can be implemented in a way that allows multiple change sets to be merged if the client needs to call more than one service method to retrieve the data it needs. While such an implementation can be accomplished with self-tracking entities, it is harder than with a change set. Second, the entity definitions themselves are complicated somewhat because they include the tracking information directly instead of keeping that information in a separate structure outside the entities. Often this information can be kept to a minimum, however, so it usually does not have much effect on the usability or maintainability of the entities.

Naturally, self-tracking entities are not as thoroughly decoupled as DTOs, and there are times when more efficient wire formats can be created with DTOs than with self-tracking entities. Nothing prevents you from using a mix of DTOs and self-tracking entities, and, in fact, as long as the structure of the tracking information is kept as simple as possible, it is not difficult to evolve self-tracking entities into DTOs at some later date if that becomes necessary.
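The following is a minimal sketch of what a self-tracking POCO might look like, not the classes that any EF template generates. TrackedCustomer, TrackingState, and the method names are hypothetical, and serialization attributes are omitted to keep the example short.

```csharp
using System.Collections.Generic;

public enum TrackingState { Unchanged, Added, Modified, Deleted }

public class TrackedCustomer
{
    private string _contactName;
    private bool _tracking;
    private readonly HashSet<string> _modifiedProperties = new HashSet<string>();

    public TrackedCustomer()
    {
        State = TrackingState.Unchanged;
    }

    public string CustomerID { get; set; }
    public TrackingState State { get; private set; }

    public IEnumerable<string> ModifiedProperties
    {
        get { return _modifiedProperties; }
    }

    public string ContactName
    {
        get { return _contactName; }
        set
        {
            if (_contactName == value) return;
            _contactName = value;
            RecordChange("ContactName");
        }
    }

    // Called once the entity has been loaded, so that initial
    // population of properties is not recorded as a modification.
    public void StartTracking() { _tracking = true; }

    public void MarkAsAdded() { State = TrackingState.Added; }
    public void MarkAsDeleted() { State = TrackingState.Deleted; }

    private void RecordChange(string propertyName)
    {
        if (!_tracking) return;
        _modifiedProperties.Add(propertyName);
        // Added and Deleted entities keep their state; only Unchanged moves to Modified.
        if (State == TrackingState.Unchanged) State = TrackingState.Modified;
    }
}
```

The tracking data here is a state value plus a set of property names, which is the kind of small, schema-specific information that keeps the wire format compact and makes a later move to DTOs straightforward.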

Implementing N-Tier with the Entity Framework

Having reviewed your options and decided that you need an n-tier application, you can select a pattern and a client technology knowing what pitfalls to avoid. Now you're ready to get rolling. But where does the Entity Framework (EF) fit into all this?

The EF provides a foundation for addressing persistence concerns. This foundation includes a declarative mapping between the database and your conceptual entities, which decouples your mid-tier from the database structure; automatic concurrency checks on updates as long as appropriate change-tracking information is supplied; and transparent change tracking on the mid-tier. In addition, the EF is a LINQ provider, which means that it is relatively easy to create sophisticated queries that can help with mapping entities to DTOs.

The EF can be used to implement any of the four patterns described earlier, but various limitations in the first release of the framework (shipped as part of Visual Studio 2008 SP1/.NET 3.5 SP1) make patterns other than the simple entities pattern very difficult to implement. In the upcoming release of the EF in Visual Studio 2010/.NET 4, a number of features have been added to make implementing the other patterns easier. Before we look at the future release, though, let's look at what you can do with the EF now by using the simple entities pattern.

Concurrency Tokens

The first step you need to take before looking at any aspects of n-tier development is to create your model and make sure that you have concurrency tokens. You can read about the basics of building a model elsewhere. There are some great tutorials, for instance, available in the Entity Framework section of the MSDN Data Platform Developer Center.

The most important point for this discussion, however, is to make sure that you have specified concurrency tokens for each entity. The best option is to use a row version number or an equivalent concept. A row's version automatically changes whenever any part of the row changes in the database. If you cannot use a row version, the next best option is to use something like a time stamp and add a trigger to the database so that the time stamp is updated whenever a row is modified. You can also perform this sort of operation on the client, but that is prone to causing subtle data corruption problems because multiple clients could inadvertently come up with the same new value for the concurrency token. Once you have an appropriate property configured in the database, open the Entity Designer with your model, select the property, and set its Concurrency Mode in the Properties pane to Fixed instead of the default value None. This setting tells the EF to perform concurrency checks using this property. Remember that you can have more than one property in the same entity with Concurrency Mode set to Fixed, but this is usually not necessary.
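To illustrate what a concurrency check does, the sketch below simulates a table with a row version in memory. It does not use the EF at all; CustomerRow and InMemoryCustomerTable are hypothetical types, and TryUpdate mirrors the effect of an UPDATE statement whose WHERE clause includes the original version value.

```csharp
using System.Collections.Generic;

public class CustomerRow
{
    public string CustomerID { get; set; }
    public string Phone { get; set; }
    public int RowVersion { get; set; }
}

public class InMemoryCustomerTable
{
    private readonly Dictionary<string, CustomerRow> _rows =
        new Dictionary<string, CustomerRow>();

    public void Insert(string id, string phone)
    {
        _rows[id] = new CustomerRow { CustomerID = id, Phone = phone, RowVersion = 1 };
    }

    // Returns a copy, the way a client receives a snapshot of the row.
    public CustomerRow Read(string id)
    {
        CustomerRow row = _rows[id];
        return new CustomerRow
        {
            CustomerID = row.CustomerID,
            Phone = row.Phone,
            RowVersion = row.RowVersion
        };
    }

    // Equivalent in spirit to:
    //   UPDATE ... WHERE CustomerID = @id AND RowVersion = @originalVersion
    // Returns false (no rows affected) if the row changed since it was read.
    public bool TryUpdate(CustomerRow modified)
    {
        CustomerRow current;
        if (!_rows.TryGetValue(modified.CustomerID, out current)) return false;
        if (current.RowVersion != modified.RowVersion) return false;

        current.Phone = modified.Phone;
        current.RowVersion++; // the version changes on every successful write
        return true;
    }
}
```

If two clients read the same row and both call TryUpdate, the first call succeeds and bumps the version, so the second call fails instead of silently overwriting the first client's change. With the EF, a failed check of this kind surfaces from SaveChanges as an OptimisticConcurrencyException.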

Serialization

After you have the prerequisites out of the way, the next topic is serialization. You need a way to move your entities between tiers. If you are using the default entity code generated by the EF and you are building a Windows Communication Foundation (WCF) service, your work is done because the EF automatically generates DataContract attributes on the types and DataMember attributes on the persistable properties of your entities. This includes navigation properties, which means that if you retrieve a graph of related entities into memory, the whole graph is serialized automatically. The generated code also supports binary serialization and XML serialization out of the box, but XML serialization applies only to single entities, not to graphs.
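The graph behavior can be sketched with hand-written stand-ins for the generated classes. The Customer and Order types below, and the IsReference setting used to cope with the cycle between them, are assumptions for illustration rather than a copy of the generated code.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;

[DataContract(IsReference = true)]
public class Customer
{
    public Customer() { Orders = new List<Order>(); }

    [DataMember] public string CustomerID { get; set; }
    [DataMember] public List<Order> Orders { get; set; }
}

[DataContract(IsReference = true)]
public class Order
{
    [DataMember] public int OrderID { get; set; }
    [DataMember] public Customer Customer { get; set; } // points back to the parent
}

public static class GraphSerializer
{
    // Serializes the customer and everything reachable from it,
    // then reads the whole graph back as new object instances.
    public static Customer RoundTrip(Customer customer)
    {
        var serializer = new DataContractSerializer(typeof(Customer));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, customer);
            stream.Position = 0;
            return (Customer)serializer.ReadObject(stream);
        }
    }
}
```

Passing in a customer with several orders returns a copy of the customer with its orders attached, which is the same effect a WCF call has when a service method returns an entity with related entities loaded.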

Another important concept to understand is that while the default-generated entities support serialization, their change-tracking information is stored in the ObjectStateManager (a part of the ObjectContext), which does not support serialization. In the simple entities pattern, you typically retrieve unmodified entities from the database on the mid-tier and serialize them to the client, which does not need the change-tracking information. That code might look something like this:

public Customer GetCustomerByID(string id)
{
    using (var ctx = new NorthwindEntities())
    {
        // Query on the mid-tier; the returned entity is serialized to the client.
        return ctx.Customers.Where(c => c.CustomerID == id).First();
    }
}

When it comes time to perform an update, however, the change-tracking information must be managed somehow, and that leads to the next important part of the EF you need to understand.

Working with the ObjectStateManager

For two-tier persistence operations, the ObjectStateManager does its job automatically for the most part. You don't have to think about it at all. The state manager keeps track of the existence of each entity under its control; its key value; an EntityState value, which can be unchanged, modified, added, or deleted; a list of modified properties; and the original value of each modified property. When you retrieve an entity from the database, it is added to the list of entities tracked by the state manager, and the entity and the state manager work together to maintain the tracking information. If you set a property on the entity, the state of the entity automatically changes to Modified, the property is added to the list of modified properties, and the original value is saved. Similar information is tracked if you add or delete an entity. When you call SaveChanges on the ObjectContext, this tracking information is used to compute the update statements for the database. If the update completes successfully, deleted entities are removed from the context, and all other entities transition to the unchanged state so that the process can start over again.
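The tracking described above can be observed directly. This sketch assumes the NorthwindEntities context and Customer entity used elsewhere in this article (they come from the generated model and are not defined here); StateManagerDemo is a hypothetical class name.

```csharp
using System;
using System.Data;          // EntityState
using System.Data.Objects;  // ObjectStateEntry
using System.Linq;

public static class StateManagerDemo
{
    public static void ShowTracking()
    {
        using (var ctx = new NorthwindEntities())
        {
            // Entities returned by a query are tracked automatically.
            Customer customer = ctx.Customers.First();
            ObjectStateEntry entry =
                ctx.ObjectStateManager.GetObjectStateEntry(customer);
            Console.WriteLine("After query: " + entry.State);

            // Setting a property updates the tracking information.
            customer.ContactName = "Changed on the mid-tier";
            Console.WriteLine("After change: " + entry.State);

            foreach (string name in entry.GetModifiedProperties())
            {
                Console.WriteLine(
                    name + " was originally " + entry.OriginalValues[name]);
            }
        }
    }
}
```

The state should read as unchanged after the query and as modified after the property is set, with ContactName listed along with its original value.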

When you send entities to another tier, however, this automatic tracking process is interrupted. To implement a service method on the mid-tier that performs an update by using information from the client, you need two special methods that exist on the ObjectContext for just this purpose: Attach and ApplyPropertyChanges.

The Attach method tells the state manager to start tracking an entity. Normally, queries automatically attach entities, but if you have an entity that you retrieved some other way (serialized from the client, for example), then you call Attach to start the tracking process. There are two critical things about Attach to keep in mind.

First, at the end of a successful call to Attach, the entity will always be in the unchanged state. If you want to eventually get the entity into some other state, such as modified or deleted, you need to take additional steps to transition the entity to that state. In effect, Attach tells the EF, "Trust me. At least at some point in the past, this is how this entity looked in the database." The value an entity's property has when you attach it will be considered the original value for that property. So, if you retrieve an entity with a query, serialize it to the client, and then serialize it back to the mid-tier, you can use Attach on it rather than querying again. The value of the concurrency token when you attach the entity will be used for concurrency checks. (For more information about the danger of querying again, see my description of the Mishandled Concurrency anti-pattern in "Anti-Patterns To Avoid In N-Tier Applications" in the June issue of MSDN Magazine.)

The second thing to know about Attach is that if you attach an entity that is part of a graph of related entities, the Attach method will walk the graph and attach each of the entities it finds. This occurs because the EF never allows a graph to be in a mixed state, where it is partially attached and partially not attached. So if the EF attaches one entity in a graph, it needs to make sure that the rest of the graph becomes attached as well.

The ApplyPropertyChanges method implements the other half of a disconnected entity modification scenario. It looks in the ObjectStateManager for another entity with the same key as its argument and compares each regular property of the two entities. When it finds a property that is different, it sets the property value on the entity in the state manager to match the value from the entity passed as an argument to the method. The effect is the same as if you had performed changes directly on the entity in the state manager when it was being tracked. It is important to note that this method operates only on "regular" properties and not on navigation properties, so it affects only a single entity, not an entire graph. It was designed especially for the simple entities pattern, where a new copy of the entity contains all the information you need in its property values—no extra tracking information is required for it to function.

If you put the Attach and ApplyPropertyChanges methods together to create a simple service method for updating an entity, the method might look something like this:

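public void UpdateCustomer(Customer original, Customer modified)
{
    using (var ctx = new NorthwindEntities())
    {
        // The original copy supplies the original values and concurrency token.
        ctx.Attach(original);
        // Copy any differing property values from the modified copy.
        ctx.ApplyPropertyChanges(modified.EntityKey.EntitySetName, modified);
        ctx.SaveChanges();
    }
}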

While these methods make implementation of the service easy, this kind of service contract adds some complication to the client, which now needs to copy the entity before modifying it. Many times, this level of complexity is more than you want or need on the client. So, instead of using ApplyPropertyChanges, you can attach the modified entity and use some lower-level APIs on the ObjectStateManager to tell it that the entity should be in the modified state and that every property is modified. This approach has the advantage of reducing the data that must travel from the client to the mid-tier (only one copy of the entity) at the expense of increasing the data that is updated in the database in some scenarios (every property will be updated even if the client modified only some because there is no way to tell which properties were modified and which were not). Figure 1 shows what the code for this approach would look like.

Figure 1 Update Service Method

public void UpdateCustomer(Customer modified)
{
    using (var ctx = new NorthwindEntities())
    {
        // Attach leaves the entity in the unchanged state.
        ctx.Attach(modified);
        var stateEntry = ctx.ObjectStateManager.GetObjectStateEntry(modified);

        // Mark every property as modified so that SaveChanges updates them all.
        foreach (var propertyName in stateEntry.CurrentValues
                                               .DataRecordInfo.FieldMetadata
                                               .Select(fm => fm.FieldType.Name))
        {
            stateEntry.SetModifiedProperty(propertyName);
        }

        // SaveChanges must run before the context is disposed.
        ctx.SaveChanges();
    }
}

Expanding the service to include methods for adding new customers and deleting customers is also straightforward. Figure 2 shows an example of this code.

Figure 2 Add and Delete Service Methods

public void AddCustomer(Customer customer)
{
    using (var ctx = new NorthwindEntities())
    {
        // AddObject puts the entity in the added state.
        ctx.AddObject("Customers", customer);
        ctx.SaveChanges();
    }
}

public void DeleteCustomer(Customer customer)
{
    using (var ctx = new NorthwindEntities())
    {
        // Attach first so the state manager knows the entity, then mark it deleted.
        ctx.Attach(customer);
        ctx.DeleteObject(customer);
        ctx.SaveChanges();
    }
}

This approach can be extended to methods that change relationships between entities or perform other operations. The key concept to remember is that you need to first get the state manager into something like the state it would have been in originally if you had queried the database, then make changes to the entities for the effect you want, and then call SaveChanges.

Patterns Other Than Simple Entities in .NET 3.5 SP1

If you decide to use the first release of the EF to implement one of the other patterns, my first suggestion is to read the next section, which explains how .NET 4 will make things much easier. If your project needs one of the other patterns before .NET 4 is released, however, here are a few things to think about.

The change set pattern can certainly be implemented. You can see a sample of this pattern that was written to work with one of the prerelease betas of the EF at code.msdn.com/entitybag/. This sample has not been updated to work with the 3.5 SP1 version of the EF, but the work required to do that is fairly easy. One key step you might want to adopt even if you choose to build a change set implementation from scratch is to create an ObjectContext on the client with only the conceptual model metadata (no mapping, storage model, or real connection to the database is needed) and use that as a client-side change tracker.

DTOs are also possible. In fact, implementing DTOs is not that much more difficult with the first release of the EF than it will be in later releases. In either case, you have to write your own code or use an automatic mapper to move data between your entities and the DTOs. One idea to consider is to use LINQ projections to copy data from queries directly into your DTOs. For example, if I created a CustomerDTO class that has just name and phone properties, I could then create a service method that returns a set of CustomerDTOs like this:

public List<CustomerDTO> GetCustomerDTOs()
{
    using (var ctx = new NorthwindEntities())
    {
        // Project straight into the DTO so only the needed columns are read.
        var query = from c in ctx.Customers
                    select new CustomerDTO()
                    {
                        Name = c.ContactName,
                        Phone = c.Phone
                    };
        return query.ToList();
    }
}

Unfortunately, self-tracking entities is the hardest pattern to implement in the SP1 release for two reasons. First, the EF in .NET 3.5 SP1 does not support POCO, so any self-tracking entities that you implement have a dependency on the 3.5 SP1 version of .NET, and the serialization format will not be as suitable for interoperability. You can address this by hand-writing proxies for the client, but they will be tricky to implement correctly. Second, one of the nice features of self-tracking entities is that you can create a single graph of related entities with a mix of operations (some entities can be modified, others new, and still others marked for deletion), but implementing a method on the mid-tier to handle such a mixed graph is quite difficult. If you call the Attach method, it will walk the whole graph, attaching everything it can reach. Similarly, if you call the AddObject method, it will walk the whole graph and add everything it can reach. After either of those operations occurs, you will encounter cases in which you cannot easily transition some entities to their intended final state because the state manager allows only certain state transitions. You can move an entity from unchanged to modified, for instance, but you cannot move it from unchanged to added. To attach a mixed graph to the context, you need to shred your graph into individual entities, add or attach each one separately, and then reconnect the relationships. This code is very difficult to write correctly.

API Improvements in .NET 4

In the upcoming release of the EF, which will ship with Visual Studio 2010 and .NET 4, we have made a number of improvements to ease the pain of implementing n-tier patterns—especially self-tracking entities. I'll touch on some of the most important features in the following paragraphs.

POCO. The EF will support complete persistence ignorance for entity classes. This means that you can create entities that have no dependencies on the EF or other persistence-related DLLs. A single entity class used for persisting data with the EF will also work on Silverlight or earlier versions of .NET. Also, POCO helps isolate the business logic in your entities from persistence concerns and makes it possible to create classes with a very clean, interoperable serialization format.

Improved N-Tier Support APIs. Working with the ObjectStateManager will be easier because we have relaxed the state transition constraints. It will be possible to first add or attach an entire graph and then walk over that graph changing entities to the right state. You will be able to set the original values of entities, change the state of an entity to any value, and change the state of a relationship.

Foreign Key Property Support. The first release of the EF supports modeling relationships only as completely separate from entities, which means that the only way to change relationships is through the navigation properties or the RelationshipManager. In the upcoming release, you'll be able to build a model in which an entity exposes a foreign key property that can be manipulated directly.

T4-Based Code Generation. The final important change to the EF will be the use of the T4 template engine to allow easy, complete control over the code that is generated for entities. This is important because it means Microsoft can create and release templates that generate code for a variety of scenarios and usage patterns, and you can customize those templates or even write your own. One of the templates we will release will produce classes that implement the self-tracking entities pattern with no custom coding required on your part. The resulting classes allow the creation of very simple clients and services.
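As an illustration of the relaxed state transitions, the single-argument update method could shrink to something like the sketch below. It assumes the ChangeObjectState method on ObjectStateManager as it appears in the .NET 4 API, along with the NorthwindEntities context and Customer entity from the generated model used throughout this article; CustomerService is a hypothetical class name.

```csharp
using System.Data; // EntityState

public class CustomerService
{
    public void UpdateCustomer(Customer modified)
    {
        using (var ctx = new NorthwindEntities())
        {
            // Attach puts the entity in the unchanged state.
            ctx.Attach(modified);

            // One call replaces the loop over field metadata:
            // the entity and all of its properties are marked modified.
            ctx.ObjectStateManager.ChangeObjectState(modified, EntityState.Modified);

            ctx.SaveChanges();
        }
    }
}
```

The same call with EntityState.Added or EntityState.Deleted is what makes it practical to attach a whole graph first and fix up each entity's state afterward.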

More to Learn

I hope this article has given you a good survey of the design issues involved in creating n-tier applications and some specific hints for implementing those designs with the Entity Framework. There is certainly a lot more to learn, so I encourage you to take a look at the Application Architecture Guide from the patterns & practices group and the Entity Framework FAQ.

Danny Simmons is dev manager for the Entity Framework team at Microsoft. You can read more of his thoughts on the Entity Framework and other subjects at blogs.msdn.com/dsimmons.