August 2009

Volume 24 Number 08

EF Data Access - EF v2 and Data Access Architecture Best Practices

By Tim Mallalieu | August 2009

This article discusses:

  • Application development styles
  • Design patterns
This article uses the following technologies:
Microsoft Visual Studio 2010, ADO.NET Entity Framework Feature CTP 1

Contents

Application Development Styles
Building a Forms-Centric Application
Thoughts on Forms-Centric Applications
Building a Model-Centric Application
Thoughts on Model-Centric Application Development
Code-Centric Application Development
Thoughts on Code-Centric Development
Final Thoughts on the Application Development Styles

David Hill, in his preface to the latest patterns & practices Architecture Guidance, jokes that the key to being a good architect is learning to answer "It depends" to most questions. In this article, I'll take that joke to heart. How can you use the Entity Framework with your application architecture? Well, it depends.

Developers follow a wide variety of development philosophies and architecture styles. This article explores three common perspectives on application development and describes how the Entity Framework can be employed in each. Specifically, I'll look at the forms-centric, model-centric, and code-centric development styles and their relationship to the Entity Framework.

Application Development Styles

I'll start with a discussion of the various development styles. This discussion does not make strong assumptions about particular methodologies that can be applied within these development styles, and I should note that I've used stereotypes for the purpose of this article. Most development styles blend elements of the models I describe. Figure 1 shows the relative characteristics of the models I'll discuss.

Figure 1 Development Styles and Their Associated Tradeoffs

Forms-Centric. In the forms-centric (or "forms-over-data") style of development, the focus is largely on the construction of the top-level user interface (UI) elements that bind to data. The Microsoft development experience for this style is often a drag-and-drop experience in which you define a data source and then systematically construct a series of forms that can perform create, read, update, and delete (CRUD) operations on the underlying data source. This experience tends to be highly productive and intuitive for a developer. The cost is often that the developer accepts a fairly high degree of prescription from the tools and frameworks being used.

Model-Centric. Model-centric development is a step beyond the forms-centric approach. In model-centric development, a developer defines a model in a visual tool or some textual domain-specific language (DSL) and uses this model as the source for generating classes to program against and a database for persistence. This experience is often handy for tools developers who want to build on existing infrastructure to deliver added value. It is also often useful for organizations that want to prescribe their own standards for their application architectures and databases. The cost of this path has historically been in the investment required to enable a complete experience. As with the forms-centric experience, a developer leveraging a model-centric experience tends to give up some flexibility as a consequence of operating in a more prescribed world.

Code-Centric. In the code-centric application development style, the truth is the code. Developers define persistent classes on their own. They elect to write their own data access layer to support these classes, or they use some available persistence offering to do this for them. The main benefit of the code-centric option is that developers get the utmost flexibility. The cost consideration tends to fall on the approach chosen for persistence. If a developer selects a solution that allows her to focus on the business domain instead of the persistence infrastructure, the overall benefit of this approach can be very high.

Building a Forms-Centric Application

In this section, I'll walk through how to build a very simple application using the forms-centric approach with the Entity Framework. The first step is to create a Visual Studio project. For this example, I created a Dynamic Data application. In Visual Studio 2010, you select the Dynamic Data Entities Web Application, as shown in Figure 2.

Figure 2 Visual Studio 2010 New Project Dialog Box with Dynamic Data Entities Web Application Project Template Selected

The next step is to specify the Entity Framework as a data source for the application. You do this by adding a new ADO.NET Entity Data Model project item to the project, as you can see in Figure 3.

Figure 3 Add New Item Dialog Box with ADO.NET Entity Data Model Project Item Selected

After selecting this project item, you perform the following three steps:

  1. Choose to start from a database.
  2. Choose the database to target.
  3. Select the tables to import.

At this point, you click Finish and see the model that is generated from the database, as shown in Figure 4.

Figure 4 Default Entity Data Model Created from the Northwind Database

Now that the model has been generated, using it in the Dynamic Data application is as simple as configuring the form to register the object context that was created in the steps performed earlier. In Global.asax.cs, you can modify the following code to point to the context:

DefaultModel.RegisterContext(
    typeof(NorthwindEntities),
    new ContextConfiguration() { ScaffoldAllTables = true });

You should now be able to run your application and have a functional set of forms over your persistent data, as shown in Figure 5.

Figure 5 Default Dynamic Data Site Using the Entity Framework

This exercise illustrates the most straightforward forms-driven experience. You can now start working on how the presentation should look and what behaviors you need. The ASP.NET Dynamic Data framework uses CLR attributes in the System.ComponentModel.DataAnnotations namespace to provide guidance on how data can be rendered. For example, you can change how the form is rendered by adding an annotation that hides a particular column. The attribute is as follows:

[ScaffoldColumn(false)]

The ScaffoldColumn attribute indicates whether the Dynamic Data framework renders the column. In a case where a table is to be rendered, you can use the ScaffoldColumn attribute to opt out of rendering a specific column. The interesting challenge in the current scenario is where and when you attribute a column. In this example, the CLR classes, which are used by Dynamic Data, were generated from the Entity Data Model. You can attribute the generated classes, but then any change to the model regenerates the classes and the attributes are lost. Dynamic Data also allows you to apply attributes by using partial classes associated with your entity classes, but then you lose some readability and discoverability because of the loss of encapsulation.
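The partial-class option is usually expressed with a metadata ("buddy") class that is linked to the entity through the MetadataType attribute. The following is a minimal sketch rather than generated code: the Product entity, its ReorderLevel property, the ProductMetadata class, and the ScaffoldInspector helper are all illustrative names chosen for this example, and the first Product declaration merely stands in for the class the designer would generate.

```csharp
using System;
using System.ComponentModel.DataAnnotations;
using System.Reflection;

// Stand-in for the class generated from the Entity Data Model (illustrative).
public partial class Product
{
    public int ProductID { get; set; }
    public string ProductName { get; set; }
    public int ReorderLevel { get; set; }
}

// Hand-written half of the partial class. It survives regeneration because
// it lives in a separate file from the generated code.
[MetadataType(typeof(ProductMetadata))]
public partial class Product
{
}

// Metadata class: carries the annotations for properties of Product.
public class ProductMetadata
{
    [ScaffoldColumn(false)]
    public object ReorderLevel { get; set; }
}

// Hypothetical helper that reads the annotation back through reflection,
// only to show where the attribute ends up. It is not part of Dynamic Data.
public static class ScaffoldInspector
{
    public static bool IsScaffolded(Type entityType, string propertyName)
    {
        var link = (MetadataTypeAttribute)Attribute.GetCustomAttribute(
            entityType, typeof(MetadataTypeAttribute));
        Type source = link != null ? link.MetadataClassType : entityType;

        PropertyInfo property = source.GetProperty(propertyName);
        if (property == null)
        {
            return true; // no annotation declared for this property
        }

        var scaffold = (ScaffoldColumnAttribute)Attribute.GetCustomAttribute(
            property, typeof(ScaffoldColumnAttribute));
        return scaffold == null || scaffold.Scaffold;
    }
}
```

The annotation now lives apart from the property it describes, which is the readability cost the paragraph above refers to.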

Entity Framework 4.0 will provide an extensibility model that allows developers to extend the Entity Framework Designer's tooling surface and add metadata that can then be used in code or database generation; however, this functionality is not available in Visual Studio 2010 beta 1.

The Entity Framework developer who wants to work with Dynamic Data can have a very productive experience. He can start with a database and annotate a model with the appropriate metadata to drive much of that experience. After the model is in good shape, the developer can focus on the UI. For more information on using Dynamic Data with the Entity Framework, please take a look at the official Dynamic Data Web site.

Thoughts on Forms-Centric Applications

Using ASP.NET Dynamic Data and the Entity Framework provides a highly productive experience for developing data-centric applications. However, forms-centric applications are not unique to Dynamic Data. Many UI-first development experiences that allow developers to build an application by creating a set of screens over a data source tend to share the same characteristics. The developer experience generally relies on some combination of design-time and run-time experiences that prescribe a given architectural style. The data model often reflects the shape of the persistent store (the underlying tables), and there is often a fair bit of UI metadata (such as DataAnnotations in the case of Dynamic Data) that helps to define the UI.

The role of the Entity Framework within a forms-centric experience is primarily as the abstraction over the underlying data source. The extensibility capabilities give a developer one true place to define all the model metadata that they need to express. The mapping capabilities allow a developer to reshape the mid-tier domain classes declaratively without having to dive down into infrastructure code.

Building a Model-Centric Application

The promise of model-driven development is that developers can declaratively express a model that is closer to the conceptual business domain than the run-time concerns of a given application architecture. For the purpose of this article, I've focused on the experience of having a single design surface on which you define the domain and related metadata and from which you provision classes and storage.

In the Microsoft .NET Framework 4, there are a number of innovations in Entity Framework tooling that enable a model-centric experience. Entity Framework tooling provides a basic experience plus the capabilities for framework developers, ISVs, and IT organizations to extend those capabilities. To illustrate the experience, I'll walk through a simple application.

I'll start with a new Dynamic Data project again and add an ADO.NET Entity Data Model project item. This time, however, I'll start with a blank model rather than create the model from a database. By starting with a blank surface, you can build out the model you want. I'll build a very simple Fitness application with just two entity types, Workout and WorkoutType. The data models for the types are shown in Figure 6.

Figure 6 Simple Entity Data Model

When you define a model like this in the Entity Framework Designer, there is no mapping or store definition created. However, the Entity Framework Designer now allows developers to create a database script from this model. By right-clicking the designer surface, you can choose Generate Database Script From Model, as shown in Figure 7, and the Entity Framework Designer generates a default database from the entity model. For this simple model, two tables are defined. The names of the tables match the EntitySets that are defined in the designer. In the default generation, the database created will build join tables for many-to-many relationships and employ a Table Per Type (TPT) scheme for building tables that must support an inheritance hierarchy.

Figure 7 Generating a Database Script from the Model

When you invoke Generate Database Script from Model, a new T-SQL file is added to the project and the Entity Data Model you've created provides the Entity Framework metadata with valid mapping and store descriptions. You can see these in Figure 8.

Figure 8 The T-SQL File Generated from the Model

If a developer is using Visual Studio Team Architect or Team Suite, she can deploy and execute the T-SQL script within Visual Studio merely by clicking in the T-SQL file to give it focus and then pressing F5. You are prompted to select the target database, and then the script executes.

At the same time, the Entity Framework Designer runs the default code generation to create classes based on the model, the Entity Framework artifacts required to describe the mapping between the model and the database, and a description of the data store that was created. As a result, you now have a strongly typed data access layer that can be used in the context of your application.

At this point, you've seen only the default experience. The Entity Framework Designer's extensibility allows you to customize many aspects of the model-driven experience. The database-generation and code-generation steps use T4 templates that can be customized to tailor the database schema and the code that is produced. The overall generation process is a Windows Workflow Foundation (WF) workflow that can also be customized, and, as noted earlier, you can add extensions to the tools surface by using Managed Extensibility Framework–based Visual Studio extensibility. As an example of this extensibility, let's look at how you can change the code-generation step in the project.

By right-clicking the design surface, you can choose Add New Artifact Generation Item. Choosing this command opens a dialog box in which you can select any of the installed templates to add to the project. In the example shown in Figure 9, I selected the Entity Framework POCO Code Generator template. (Note: The POCO template does not work with Dynamic Data in Visual Studio 2010 beta 1, but it will work in upcoming releases.) POCO (Plain Old CLR Objects) classes allow developers to define only the items they care about in their classes and avoid polluting them with implementation details from the persistence framework. With .NET 4.0, we have introduced POCO support within the Entity Framework, and one way of creating POCO classes when you are using a model-centric or data-centric development style is to use the POCO template. The POCO template is currently available in the ADO.NET Entity Framework Feature CTP 1, which can be downloaded from Data Platform Development and used with Visual Studio 2010 beta 1.

Figure 9 Add New Item Dialog Box

By selecting the ADO.NET EF POCO Code Generator template, you get a different set of generated classes. Specifically, you get a set of POCO classes generated as a single file per class, a helper class to use for changes to related items, and a separate context class. Note that you did not do anything to the model. You merely changed the code-generation template.

One interesting capability added in .NET 4.0 is the capability to define functions in terms of the Entity Data Model. These functions are expressed in the model and can be referenced in queries. Think about trying to provide a method to determine how many calories are burned in a given workout. There is no property defined on the type that captures the calories burned. You could query the existing types and then enumerate the results, calculating the calories burned in memory; however, by using model-defined functions, you can fold this query into the database query that is sent to the store, thus yielding a more efficient operation. You can define the function in the EDMX (XML) as follows:

<Function Name="CaloriesBurned" ReturnType="Edm.Int32">
  <Parameter Name="workout" Type="Fitness.Workout" />
  <DefiningExpression>
    workout.Duration * workout.WorkoutType.CaloriesPerHour / 60
  </DefiningExpression>
</Function>

To allow this function to be used in a LINQ query, you need to provide a method in code that the query can reference. You annotate this method to indicate which model function you intend to use. If you want the method to work when it is invoked directly, you should implement the body. For the purpose of this exercise, we will throw a NotSupportedException because we expect to use this function only in LINQ queries that are pushed to the store:

[EdmFunction("Fitness", "CaloriesBurned")]
public int CaloriesBurned(Workout workout)
{
    throw new NotSupportedException();
}

If you want to then build a query to retrieve all high-calorie workouts, where a high-calorie workout is greater than 1,000 calories, you can write the following query:

var highCalWorkouts = from w in context.MyWorkouts
                      where context.CaloriesBurned(w) > 1000
                      select w;

This LINQ query is a valid query that can now leverage the CaloriesBurned function and be translated to native T-SQL that will be executed in the database.
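For contrast, the in-memory alternative mentioned earlier (query the workouts, then calculate the calories while enumerating the results) could be sketched as follows. This is only an illustration using LINQ to Objects, with no Entity Framework involved. The trimmed-down Workout and WorkoutType classes and the InMemoryCalories helper are stand-ins written for this sketch, and treating Duration as minutes is an inference from the division by 60 in the defining expression.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down stand-ins for the model's entity types (illustrative).
public class WorkoutType
{
    public string Name { get; set; }
    public int CaloriesPerHour { get; set; }
}

public class Workout
{
    public int Duration { get; set; } // assumed to be minutes
    public WorkoutType WorkoutType { get; set; }
}

// Hypothetical helper: the same arithmetic as the model-defined function,
// but evaluated on the client after the rows have been loaded.
public static class InMemoryCalories
{
    public static int CaloriesBurned(Workout workout)
    {
        // Integer arithmetic: the product is computed first and the
        // division by 60 then truncates any fractional calories.
        return workout.Duration * workout.WorkoutType.CaloriesPerHour / 60;
    }

    public static List<Workout> HighCalorieWorkouts(IEnumerable<Workout> loaded)
    {
        // Every workout must already be in memory before the filter runs,
        // which is the cost the model-defined function avoids.
        return loaded.Where(w => CaloriesBurned(w) > 1000).ToList();
    }
}
```

With made-up figures, a 180-minute workout at 700 calories per hour comes to 180 * 700 / 60 = 2,100 calories and passes the filter, while a 60-minute workout at 600 calories per hour comes to 600 and does not.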

Thoughts on Model-Centric Application Development

In the degenerate case, where a developer uses the model-first experience and does not customize any of the steps, the model-centric experience is very much like the forms-centric experience. The model the developer is working with is a higher-level model than the logical data model, but it is still a fairly data-centric view of the application.

Developers who extend their Entity Data Model to express more metadata about their domain and who customize the code and/or database generation can come to a place where the experience approaches one in which you define all the metadata for your runtime. This is great for IT organizations that want to prescribe a strict architecture and set of coding standards. It is also very useful for ISVs or framework developers who want to use the Entity Framework Designer as a starting point for describing the model and then generate a broader end-to-end experience from it.

Code-Centric Application Development

The best way to describe code-centric application development is to cite the cliché "the code is the truth." In the forms-centric approach, the focus is on building a data source and UI model for the application. In the model-centric approach, the model is the truth: you define a model, and then generation takes place on both sides (storage and the application). In the code-centric approach, all your intent is captured in code.

One of the challenges of code-centric approaches is the tradeoff between domain logic and infrastructure logic. Object Relational Mapping (ORM) solutions tend to help with code-centric approaches because developers can focus on expressing their domain model in classes and let the ORM take care of the persistence.

As we saw in the model-centric approach, POCO classes can be used with an existing EDM (in either the model-first or database-first approaches). In the code-centric approach, we use something called Code Only, where we start with just POCO classes and no other artifacts. Code Only is currently available in the ADO.NET Entity Framework Feature CTP 1, which can be downloaded from Data Platform Development and used with Visual Studio 2010 beta 1.

Consider replicating the Fitness application using only code. Ideally, you would define the domain classes in code such as shown in Figure 10.

Figure 10 Workout and WorkoutType Domain Classes

public class Workout
{
    public int Id { get; set; }
    public DateTime DateTime { get; set; }
    public string Notes { get; set; }
    public int Duration { get; set; }
    public virtual WorkoutType WorkoutType { get; set; }
}

public class WorkoutType
{
    public int Id { get; set; }
    public string Name { get; set; }
    public int CaloriesPerHour { get; set; }
}

To make the domain classes work with the Entity Framework, you need to define a specialized ObjectContext that represents the entry point into the Entity Framework (much like a session or connection abstraction for your interaction with the underlying database). The ObjectContext class must define the EntitySets that you can create LINQ queries on top of. Here's an example of the code:

public class FitnessContext : ObjectContext
{
    public FitnessContext(EntityConnection connection)
        : base(connection, "FitnessContext")
    {
    }

    public IObjectSet<Workout> Workouts
    {
        get { return this.CreateObjectSet<Workout>(); }
    }

    public IObjectSet<WorkoutType> WorkoutTypes
    {
        get { return this.CreateObjectSet<WorkoutType>(); }
    }
}

In the code-only experience, a factory class is used to retrieve an instance of the context. This factory class reflects over the context type and builds up the requisite metadata for the run-time execution. The factory signature is as follows:

ContextBuilder.Create<T>(SqlConnection conn)
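To give a feel for what "reflecting over the context" can mean, here is a small sketch of convention-based discovery. It is not the ContextBuilder implementation, whose internals the article does not describe. IEntitySet<T>, SketchContext, and EntitySetDiscovery are hypothetical names invented for this example so that it compiles without the Feature CTP, and the entity classes are reduced to a single property.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Hypothetical stand-in for the Entity Framework's IObjectSet<T>.
public interface IEntitySet<T>
{
}

// Reduced stand-ins for the domain classes.
public class Workout { public int Id { get; set; } }
public class WorkoutType { public int Id { get; set; } }

// A context-shaped class: two entity-set properties and one unrelated property.
public class SketchContext
{
    public IEntitySet<Workout> Workouts { get { return null; } }
    public IEntitySet<WorkoutType> WorkoutTypes { get { return null; } }
    public string Name { get; set; }
}

public static class EntitySetDiscovery
{
    // Maps each entity-set property name to the CLR type of its entities.
    // A real builder would go on to derive keys, columns, and relationships
    // from those types; that part is out of scope for this sketch.
    public static Dictionary<string, Type> Discover(Type contextType)
    {
        return contextType
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Where(p => p.PropertyType.IsGenericType
                     && p.PropertyType.GetGenericTypeDefinition() == typeof(IEntitySet<>))
            .ToDictionary(p => p.Name, p => p.PropertyType.GetGenericArguments()[0]);
    }
}
```

Calling EntitySetDiscovery.Discover(typeof(SketchContext)) yields two entries, Workouts and WorkoutTypes, and skips the Name property because its type is not an entity set.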

For convenience, you can add a factory method to the context class. You provide a static field for the connection string and a static factory method to return instances of a FitnessContext. First the connection string:

static readonly string connString = new SqlConnectionStringBuilder
{
    IntegratedSecurity = true,
    DataSource = ".\\sqlexpress",
    InitialCatalog = "FitnessExpress",
}.ConnectionString;

And here is the factory method:

public static FitnessContext CreateContext()
{
    return ContextBuilder.Create<FitnessContext>(
        new SqlConnection(connString));
}

With this, you have enough to be able to use the context. For example, you could write a method such as the following to query all workout types:

public List<WorkoutType> AllWorkoutTypes()
{
    FitnessContext context = FitnessContext.CreateContext();
    return (from w in context.WorkoutTypes
            select w).ToList();
}

As with the model-first experience, it is handy to be able to deploy a database from the code-only experience. The ContextBuilder provides some helper methods that can check whether a database exists, drop it if you want to, and create it.

You can write code like the following to bootstrap a simple set of demo functionality using the code-only approach:

public void CreateDatabase()
{
    using (FitnessContext context = FitnessContext.CreateContext())
    {
        if (context.DatabaseExists())
        {
            context.DropDatabase();
        }
        context.CreateDatabase();
    }
}

At this point, you can use the Repository pattern from domain-driven design (DDD) to elaborate a bit on what we have seen so far. The use of DDD principles is a common trend in application development today, but I won't attempt to define or evangelize domain-driven design here. (For more information, read content from experts such as Eric Evans (Domain-Driven Design: Tackling Complexity in the Heart of Software, Addison-Wesley, 2003) and Jimmy Nilsson (Applying Domain-Driven Design and Patterns: With Examples in C# and .NET, Addison-Wesley, 2006).)

Currently, we have a handwritten set of domain classes and a specialized ObjectContext. When we used Dynamic Data, we just pointed the framework at the ObjectContext. But if we want to consider a stronger abstraction of our underlying persistence layer, and if we want to truly constrain the contract of operations to just the meaningful domain operations that one should do, we can leverage the Repository pattern.

For this example, I'll define two repositories: one for WorkoutTypes and one for Workouts. When you follow DDD principles, you should think hard about the aggregate root(s) and then think about modeling the repositories appropriately. In this very simple example, I've used two repositories to illustrate high-level concepts. Figure 11 shows the WorkoutType repository, and Figure 12 shows the Workout repository.

Figure 11 The WorkoutType Repository

public class WorkoutTypeRepository
{
    public WorkoutTypeRepository()
    {
        _context = FitnessContext.CreateContext();
    }

    public List<WorkoutType> AllWorkoutTypes()
    {
        return _context.WorkoutTypes.ToList();
    }

    public WorkoutType WorkoutTypeForName(string name)
    {
        return (from w in _context.WorkoutTypes
                where w.Name == name
                select w).FirstOrDefault();
    }

    public void AddWorkoutType(WorkoutType workoutType)
    {
        _context.WorkoutTypes.AddObject(workoutType);
    }

    public void Save()
    {
        this._context.SaveChanges();
    }

    private FitnessContext _context;
}

Figure 12 The Workout Repository

public class WorkoutRepository
{
    public WorkoutRepository()
    {
        _context = FitnessContext.CreateContext();
    }

    public Workout WorkoutForId(int Id)
    {
        return (from w in _context.Workouts
                where w.Id == Id
                select w).FirstOrDefault();
    }

    public List<Workout> WorkoutsForDate(DateTime date)
    {
        return (from w in _context.Workouts
                where w.DateTime == date
                select w).ToList();
    }

    public Workout CreateWorkout(int id, DateTime dateTime, int duration,
        string notes, WorkoutType workoutType)
    {
        _context.WorkoutTypes.Attach(workoutType);
        Workout workout = new Workout()
        {
            Id = id,
            DateTime = dateTime,
            Duration = duration,
            Notes = notes,
            WorkoutType = workoutType
        };
        _context.Workouts.AddObject(workout);
        return workout;
    }

    public void Save()
    {
        _context.SaveChanges();
    }

    private FitnessContext _context;
}

One interesting thing to note is that the return types are not IQueryable<T>; they are List<T>. There are debates about whether you should expose IQueryable past the boundaries of the persistence layer. My opinion is that exposing IQueryable breaks the encapsulation of the persistence layer and compromises the boundary between explicit operations that happen in memory and operations that happen in the database. If you expose an IQueryable<T> from the repository, you have no idea who will end up composing a database query in LINQ higher up the stack.
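The difference between the two return types can be made visible with plain LINQ, without a database. In the sketch below, LeakyRepository, SealedRepository, and BoundaryDemo are hypothetical names, the Workout class is a reduced stand-in, and an in-memory list wrapped with AsQueryable plays the role of the store. The point is only to show that operators applied to an IQueryable<T> become part of the query's expression tree, which a database-backed provider would then try to translate, whereas operators applied to a List<T> run in memory.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Reduced stand-in for the domain class.
public class Workout
{
    public int Id { get; set; }
    public int Duration { get; set; }
}

// Exposes IQueryable<T>: callers can keep composing the query.
public class LeakyRepository
{
    private readonly IQueryable<Workout> _workouts;

    public LeakyRepository(IEnumerable<Workout> seed)
    {
        _workouts = seed.ToList().AsQueryable();
    }

    public IQueryable<Workout> AllWorkouts()
    {
        return _workouts;
    }
}

// Exposes List<T>: the query is executed before it leaves the repository.
public class SealedRepository
{
    private readonly IQueryable<Workout> _workouts;

    public SealedRepository(IEnumerable<Workout> seed)
    {
        _workouts = seed.ToList().AsQueryable();
    }

    public List<Workout> AllWorkouts()
    {
        return _workouts.ToList();
    }
}

public static class BoundaryDemo
{
    // True when the outermost node of the query's expression tree is a call
    // to the named LINQ operator, meaning the caller's operator became part
    // of the query itself.
    public static bool EndsWithOperator(IQueryable query, string operatorName)
    {
        var call = query.Expression as MethodCallExpression;
        return call != null && call.Method.Name == operatorName;
    }
}
```

A caller who writes new LeakyRepository(data).AllWorkouts().Where(w => w.Duration > 60) ends up with a query whose outermost node is the Where call, so the filter has been composed outside the repository. The same Where applied to the result of SealedRepository.AllWorkouts() is an ordinary in-memory filter over rows that were already fetched.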

You can now use these repositories to add some data in the store. Figure 13 shows two methods that could be used to create some sample data.

Figure 13 Methods for Building Sample Data

public void AddWorkouts()
{
    Console.WriteLine("--- adding workouts ---");
    WorkoutRepository repository = new WorkoutRepository();
    WorkoutTypeRepository typeRepository = new WorkoutTypeRepository();
    WorkoutType squash = typeRepository.WorkoutTypeForName("Squash");
    WorkoutType running = typeRepository.WorkoutTypeForName("Running");
    repository.CreateWorkout(0, new DateTime(2009, 4, 20, 7, 0, 0),
        60, "nice squash workout", squash);
    repository.CreateWorkout(1, new DateTime(2009, 4, 21, 7, 0, 0),
        180, "long run", running);
    repository.CreateWorkout(2, new DateTime(2009, 4, 22, 7, 0, 0),
        45, "short squash match", squash);
    repository.CreateWorkout(3, new DateTime(2009, 4, 23, 7, 0, 0),
        120, "really long squash", squash);
    repository.Save();
}

In the model-first scenario, we used model-defined functions to provide a method to determine how many calories are burned in a given workout, even though there is no property defined on the type that captures the calories burned. With the code-only approach, you do not have the option to define model-defined functions here. You can, however, compose on top of the existing Workout EntitySet to define a query that already encapsulates the high-calorie filter, as shown here:

public IQueryable<Workout> HighCalorieWorkouts()
{
    return (from w in Workouts
            where (w.Duration * w.WorkoutType.CaloriesPerHour / 60) > 1000
            select w);
}

If we define this method on the FitnessContext, we can then leverage it in the Workout Repository as follows:

public List<Workout> HighCalorieWorkouts()
{
    return _context.HighCalorieWorkouts().ToList();
}

Because the method on the context returned an IQueryable, you could have further composed on top of it, but I chose, for symmetry, to just return the results as a List.

Thoughts on Code-Centric Development

The code-centric experience is highly compelling for developers who want to express their domain logic in code. It lends itself well to providing the level of flexibility and clarity needed to work with other frameworks. With abstractions like the Repository pattern, this approach lets developers isolate the persistence layer so that the rest of the application remains ignorant of it.
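One way to picture that isolation is to put an interface in front of the repository and let application code depend only on the interface. The sketch below goes a step beyond the article's listings and is an assumption-laden illustration: IWorkoutTypeRepository, InMemoryWorkoutTypeRepository, and WorkoutTypeCatalog are hypothetical names, although the four interface members mirror the public methods of the WorkoutType repository in Figure 11. The in-memory implementation stands in for the Entity Framework-backed one, for example in unit tests.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class WorkoutType
{
    public int Id { get; set; }
    public string Name { get; set; }
    public int CaloriesPerHour { get; set; }
}

// Hypothetical contract with the same operations as the Figure 11 repository.
public interface IWorkoutTypeRepository
{
    List<WorkoutType> AllWorkoutTypes();
    WorkoutType WorkoutTypeForName(string name);
    void AddWorkoutType(WorkoutType workoutType);
    void Save();
}

// In-memory implementation: added items become visible only after Save,
// loosely imitating a unit of work.
public class InMemoryWorkoutTypeRepository : IWorkoutTypeRepository
{
    private readonly List<WorkoutType> _saved = new List<WorkoutType>();
    private readonly List<WorkoutType> _pending = new List<WorkoutType>();

    public List<WorkoutType> AllWorkoutTypes()
    {
        return _saved.ToList();
    }

    public WorkoutType WorkoutTypeForName(string name)
    {
        return _saved.FirstOrDefault(w => w.Name == name);
    }

    public void AddWorkoutType(WorkoutType workoutType)
    {
        _pending.Add(workoutType);
    }

    public void Save()
    {
        _saved.AddRange(_pending);
        _pending.Clear();
    }
}

// Application code that knows the contract but not the persistence technology.
public class WorkoutTypeCatalog
{
    private readonly IWorkoutTypeRepository _repository;

    public WorkoutTypeCatalog(IWorkoutTypeRepository repository)
    {
        _repository = repository;
    }

    // Adds the workout type unless one with the same name already exists.
    public bool EnsureWorkoutType(string name, int caloriesPerHour)
    {
        if (_repository.WorkoutTypeForName(name) != null)
        {
            return false;
        }

        _repository.AddWorkoutType(
            new WorkoutType { Name = name, CaloriesPerHour = caloriesPerHour });
        _repository.Save();
        return true;
    }
}
```

WorkoutTypeCatalog compiles and runs without any reference to the Entity Framework. Swapping in a repository built on FitnessContext would not require changing it.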

Final Thoughts on the Application Development Styles

These are the three application development styles that we often see. As mentioned earlier, there is no single, true classification of these development styles. They lie more on a continuum from highly prescriptive, very data-centric and CRUD-centric experiences that focus on productivity, to highly expressive code-centric experiences.

For all of these, the Entity Framework can be leveraged to provide the persistence layer. As you move toward the forms-centric and model-centric side of the spectrum, the explicit model and the ability to extend the model and tool chain can help the Entity Framework improve overall developer productivity. On the code-centric side, the improvements in the Entity Framework allow the runtime to get out of the way and be merely an implementation detail for persistence services.

Tim Mallalieu is the product unit manager for the Entity Framework and LINQ to SQL. He can be reached at blogs.msdn.com/adonet.