ADO.NET

Achieve Flexible Data Modeling With The Entity Framework

Elisa Flasko

This article discusses:

  • The philosophy behind the Entity Framework
  • The Entity Data Model
  • Querying, mapping, and n-tier development
This article uses the following technologies:
ADO.NET, LINQ, Entity Framework

Code download available at: EntityFramework2008_07.exe (6,602 KB)

Contents

Why Another Data Model?
Why Describe the EDM with XML?
Who Needs Another New Query Language?
Implementing the EDM
EntityClient
Mapping
Object Services
N-Tier Development with the Entity Framework
Just Another ORM?

The ADO.NET Entity Framework is almost here! First introduced as ADO.NET vNext in 2006, the framework is now ready for prime time with the upcoming release of Visual Studio® 2008 SP1. After a couple of unsuccessful attempts at similar products over the years, Microsoft released two technologies with Visual Studio 2008 that fit, in part, into the Object Relational Mapping (ORM) space: LINQ to SQL and the ADO.NET Entity Framework. With the adoption of these technologies in the marketplace, developers now want to know how we got here and where Microsoft is going next. They also want to know what was behind the development of these technologies, what makes the Entity Framework different from other ORM technologies in the market, and how Microsoft is investing in these technologies going forward.

There were numerous articles following the initial release of Visual Studio 2008 focusing on LINQ to SQL, as well as articles addressing which technology to use (see msdn.microsoft.com/data). Here I'll focus on the Entity Framework and provide a deeper understanding of how and why choices were made during development.

The Microsoft® Entity Data Model (EDM), based on Dr. Peter Chen's Entity Relationship (ER) model, is really the driving force behind the ADO.NET Entity Framework. The EDM is also the feature that most significantly differentiates the Entity Framework from other ORM-style technologies in the marketplace. The EDM builds on the ER model to raise the abstraction level for models an order higher than logical models while still preserving the concepts of entities and relationships.

Why Another Data Model?

So why was another model needed? Well, as the amount of data companies were working with increased, it became very difficult to understand and develop applications on top of that data. Database schemas are designed with storage concerns, such as data integrity, performance, and management, in mind and are not always easy to understand. Often these schemas also do not align with the structure of your application, making development and maintenance even more complex.

Custom solutions that separated the structure of data from the application being built were common. Unfortunately, with the number of custom solutions, various approaches, and steps required to model data all being different for each application, the problem continued to grow. There was a consistent need across the industry for a way to define and develop against an application-level domain model that can be clearly separated from the logical model of the store. Enter the Entity Framework.

The EDM (see the sample illustrated in Figure 1) allows the definition of a domain model that is consistent with the way an organization thinks about and uses its data, rather than the way that data is stored. The EDM was also developed with the primary goal of becoming the core data model across a suite of developer and server technologies within Microsoft.

Figure 1 Sample Entity Data Model for a Blogging Database

With one core data model, application maintenance is simplified. When this goal is realized, the EDM can be used to define a model not only for custom applications built on the ADO.NET Entity Framework but also as the input to reporting and visualization applications, intranet portal applications, or workflow applications.

Similar to the ER model, the EDM uses two primary concepts: entities (things) and relationships (or associations) among those entities. When reasoning about the storage of entity and association instances, or the closure of set operations over those instances, the concept of sets is also required. Therefore, to complete the picture, entities live in EntitySets and associations live in AssociationSets.

The final structural concept defined in the EDM is that of an EntityContainer, which defines a closure around the sets of instances and relationships previously described. These simple concepts, used together, allow developers to define a domain model that can be mapped back to the persistence layer and to classes used in the application itself. (Just note that the persistence layer of the EDM doesn't need to be relational even though today it is.)
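To make the container-and-sets structure concrete, here is a minimal sketch that lists the EntitySets and AssociationSets of each EntityContainer in a loaded model. It assumes a named connection string, travelEntitiesGeneral, exists in the application configuration (the name is borrowed from the samples later in this article) and that the database is reachable; the class name is purely illustrative.

```csharp
using System;
using System.Data;
using System.Data.EntityClient;
using System.Data.Metadata.Edm;

class ListEntitySets
{
    static void Main()
    {
        // Assumption: "travelEntitiesGeneral" is a named connection string
        // in app.config that points at the EDM metadata and the store.
        using (EntityConnection conn =
            new EntityConnection("name=travelEntitiesGeneral"))
        {
            conn.Open();
            MetadataWorkspace workspace = conn.GetMetadataWorkspace();

            // CSpace is the conceptual (EDM) side of the metadata.
            foreach (EntityContainer container in
                workspace.GetItems<EntityContainer>(DataSpace.CSpace))
            {
                Console.WriteLine("Container: {0}", container.Name);

                // BaseEntitySets holds both EntitySets and AssociationSets.
                foreach (EntitySetBase set in container.BaseEntitySets)
                {
                    string kind = set is EntitySet
                        ? "EntitySet" : "AssociationSet";
                    Console.WriteLine("  {0} {1} (element type {2})",
                        kind, set.Name, set.ElementType.Name);
                }
            }
        }
    }
}
```

For the blogging model, you would expect to see one line per set, such as the BlogPosts EntitySet, although the exact names depend on how the model was generated.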

Each Entity Type defined in an EDM can contain two different types of members—properties defining the entity (analogous to columns in the database) and navigation properties, which enable navigation of relationships in which the Entity Type participates (often represented as foreign keys in the database). In addition to these, each Entity Type must have a distinct identity or key. The XML fragment in Figure 2 describes a "Blog Post" Entity Type.

Figure 2 BlogPost Entity Type Definition

<EntityType Name="BlogPost">
   <Key>
      <PropertyRef Name="BlogPostID" />
   </Key>
   <Property Name="BlogPostID" Type="Int32" Nullable="false" />
   <Property Name="BlogEntry" Type="String" Nullable="false" MaxLength="Max" 
             Unicode="true" FixedLength="false" />
   <Property Name="BlogDate" Type="DateTime" Nullable="false" />
   <Property Name="BlogTitle" Type="String" Nullable="false" MaxLength="500" 
             Unicode="true" FixedLength="false" />
   <Property Name="BlogType" Type="Int32" Nullable="false" />
   <Property Name="CityVisited" Type="String" MaxLength="200"
             Unicode="true" FixedLength="false" />
   <Property Name="CountryVisited" Type="String" MaxLength="200" Unicode="true" 
             FixedLength="false" />
   <Property Name="DateVisited" Type="DateTime" />

   <NavigationProperty Name="Blog" Relationship=
                       "MyTravelPostModel.FK_BlogPosts_Blogs"
                       FromRole="BlogPosts" ToRole="Blogs" />
   <NavigationProperty Name="Pictures" Relationship=
                       "MyTravelPostModel.FK_Pictures_BlogPosts"
                       FromRole="BlogPosts" ToRole="Pictures" />
   <NavigationProperty Name="Comments" Relationship=
                       "MyTravelPostModel.BlogComments"
                       FromRole="BlogPosts" ToRole="Comments" />
</EntityType>

Properties on an Entity Type can be primitive types or complex types (see Figure 3), but in the Entity Framework 1.0 they cannot be other Entity Types or Collections of primitive or complex types. The Entity Type's key is then composed of a subset of these properties. Navigation properties enable navigating a relationship from one entity to another.

Figure 3 Address Complex Type Definition

<ComplexType Name="Address">
    <Property Name="StreetAddress"
              Type="String" MaxLength="50" />
    <Property Name="Address2"
              Type="String" MaxLength="50" />
    <Property Name="City"
              Type="String" MaxLength="50" />
    <Property Name="Region"
              Type="String" MaxLength="50" />
    <Property Name="PostalCode"
              Type="String" MaxLength="50" />
    <Property Name="Country"
              Type="String" MaxLength="50" />
</ComplexType>

As discussed previously, relationships may be surfaced as navigation properties on each Entity Type that is a party to the relationship. Relationships themselves are first-class citizens within the EDM and are explicitly defined as Associations within the EDM:

<Association Name="FK_Friends_People">
   <End Role="People" Type="MyTravelPostModel.Person" Multiplicity="1" />
   <End Role="Friends" Type="MyTravelPostModel.Friend" Multiplicity="*" />
   <ReferentialConstraint>
      <Principal Role="People">
         <PropertyRef Name="PersonID" />
      </Principal>
      <Dependent Role="Friends">
         <PropertyRef Name="PrimaryPersonID" />
      </Dependent>
   </ReferentialConstraint>
</Association>

So in short, why did we create a new data modeling technology in the first place? Although a number of data modeling technologies or languages existed prior to the EDM, none were able to satisfy the primary goals Microsoft was trying to accomplish, and none served to make the much-used Entity Relationship model executable. The team investigated numerous existing data modeling technologies but, realizing that each was fairly specific to, or focused on, certain problem areas, began developing the EDM to create a model that met these goals and could truly be used as a core data model across a suite of developer and server technologies.

Why Describe the EDM with XML?

After much consideration, XML was chosen as the first serial representation for the EDM. Having a well-defined XML format enables developers and third parties to do transformations into this format and to load into the Entity Framework's metadata runtime, either through generating XML files (or resources) or by loading from dynamically generated XML representations. It is conceivable, however, that other representations of the EDM could be created, and it is likely that alternate representations will be seen as the product moves forward with future releases.

The current EDM grammar is defined in an XML Schema Definition (XSD) file that ships with the product. It is not expected, however, that most people will write the XML by hand; rather, they will use the tools provided in Visual Studio. That said, the team has heard of interest in Domain-Specific Languages (DSLs) and in alternative persistence mechanisms (databases being the common one) for EDM models, and it is evaluating the options for expansion in upcoming releases.

Who Needs Another New Query Language?

The last question concerning the development of the EDM is why create a new query language? Why not use an existing one? The answer will become clearer as I dig slightly deeper into the EDM.

So far I've talked about why the EDM was created and the various constructs used in the EDM, as well as the fact that the model descends from the Entity Relationship model. To create a model that not only maps cleanly to the underlying data store but also represents the application-level domain model developers would like to program against, the EDM needed to be capable of modeling concepts such as inheritance and polymorphism. Since current relational query languages do not support querying based on inheritance, relationship navigation, or the return of polymorphic results, a new query language was needed to satisfy this requirement.

Thus was born Entity SQL (ESQL), a new SQL dialect that adds the ability to query based on concepts that are not supported in previous SQL dialects. ESQL extends the existing SQL language in much the same way as the EDM extends the relational model used in databases. ESQL is also not tied to the syntax of any specific back-end database, allowing the queries (and the application) to be written once, regardless of the back-end database being targeted. The following simple ESQL query retrieves all blogs with at least one blog post, along with the associated Person (in my model, the blog owner):

select c, c.Person 
  from travelEntitiesGeneral.Blogs as c 
  where c.BlogPosts.Count > 0

Implementing the EDM

The ADO.NET Entity Framework is an evolution of ADO.NET and the first concrete implementation of the EDM, providing a higher level of abstraction when developing against a relational database. In version 1.0, the team has been focused on building up the foundation of a platform, more than just a simple ORM, which will allow developers to work against a conceptual or object model with a very flexible mapping and the ability to accommodate a high degree of divergence from the underlying store.

This high degree of flexibility and divergence from the underlying store is the key to allowing the database and applications to evolve separately. When a change is made in the database schema, the application is insulated from the change by the Entity Framework, and you are often not required to rewrite portions of the application, but rather to simply update the mapping files if necessary to accommodate the change.

To begin evolving the ADO.NET platform, the Entity Framework is built on top of the existing ADO.NET 2.0 provider model, with existing providers being updated slightly to support the new Entity Framework and ADO.NET 3.5 functionality. We chose to implement on top of the existing ADO.NET provider model to ensure a provider model that is familiar to the development community.

The architecture is illustrated in Figure 4. You'll note that acceptable schemas include Conceptual Schema Definition Language (CSDL), Mapping Specification Language (MSL), and Storage Schema Definition Language (SSDL). Also note that the Entity Framework includes an updated SqlClient Data Provider that supports canonical command trees (CCT).

Figure 4 ADO.NET Entity Framework Architecture

EntityClient

The Entity Framework then introduces a new ADO.NET provider, EntityClient, on top of these ADO.NET 3.5 providers. EntityClient looks very much like the ADO.NET providers you are used to, and it provides the first abstraction, allowing developers to execute queries in terms of the EDM using the standard Connection, Command, and DataReader objects. It also adds the Client View Engine required to map the domain model, defined in terms of the EDM, to the underlying relational database schema. When necessary, EntityClient lets developers work against entities in the form of rows and columns using ESQL query strings, without the need to generate classes to represent the conceptual schema.

If you look at the use of EntityClient in Figure 5, you can see that I have created an EntityCommand taking in an ESQL query string, and that command is then executed against my EDM. The query text supplied as part of the EntityCommand is parsed, and a CCT is created.

Figure 5 Use of ESQL to Query against EntityClient

using (EntityConnection conn = 
    new EntityConnection("name=travelEntitiesGeneral"))
{
    conn.Open();
    EntityCommand cmd = conn.CreateCommand();
    cmd.CommandText = @"select c.BlogID 
        from travelEntitiesGeneral.Blogs as c 
        where c.BlogPosts.Count > 0";
    using (EntityDataReader reader = 
        cmd.ExecuteReader(CommandBehavior.SequentialAccess))
    {
        while (reader.Read())
        {
            Console.WriteLine("BlogID = {0}", reader["BlogID"]);
        }
    }
    // the using blocks close the reader and the connection
}

At this first stage, the command tree is still represented in terms of the EDM. The Client View Engine borrows from the theory of materialized views in database systems but applies it to the data access layer. It applies a mapping transformation to the tree, producing a tree that represents the same operation in terms of the underlying logical storage model, with non-relational concepts such as relationships, inheritance, and polymorphism removed. This newly transformed tree is handed to the ADO.NET 3.5 provider services, which return a DbCommand that encapsulates native SQL for the underlying store. That command is then executed, and the results are propagated back up through the stack.
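One way to observe the result of this translation is to ask an EntityCommand for the store command it would run. The sketch below does that with EntityCommand.ToTraceString and a parameterized ESQL query. It assumes the same travelEntitiesGeneral connection string used in Figure 5 and an Int32 BlogID key; the parameter name blogId and the class name are illustrative.

```csharp
using System;
using System.Data;
using System.Data.EntityClient;

class ShowStoreCommand
{
    static void Main()
    {
        // Assumption: same named connection string as in Figure 5.
        using (EntityConnection conn =
            new EntityConnection("name=travelEntitiesGeneral"))
        {
            conn.Open();
            using (EntityCommand cmd = conn.CreateCommand())
            {
                // ESQL references parameters with an @ prefix.
                cmd.CommandText = @"select c.BlogID
                    from travelEntitiesGeneral.Blogs as c
                    where c.BlogID = @blogId";

                // The parameter name is given without the @ prefix.
                // Assumption: BlogID is an Int32 in the model.
                cmd.Parameters.AddWithValue("blogId", 1);

                // Compiles the entity-level command and returns the
                // native command text generated for the store.
                Console.WriteLine(cmd.ToTraceString());
            }
        }
    }
}
```

The printed text is whatever the underlying provider generates (T-SQL in the case of SqlClient), so it will differ from one store to another even though the ESQL stays the same.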

When defining the mapping that is used in the Client View Engine to transform between the EDM and the logical database schema, there are a couple of different options. The mapping can be specified using a declarative XML grammar, the Mapping Specification Language (MSL), which can be created and edited either by hand coding the XML or using the Entity Mapping Tools included in Visual Studio (see Figure 6).

Figure 6 MSL—EntitySetMapping Example

<EntitySetMapping Name="BlogPosts">
   <EntityTypeMapping TypeName="IsTypeOf(MyTravelPostModel.BlogPost)">
      <MappingFragment StoreEntitySet="BlogPosts">
         <ScalarProperty Name="BlogPostID" ColumnName="BlogPostID" />
         <ScalarProperty Name="BlogEntry" ColumnName="BlogEntry" />
         <ScalarProperty Name="BlogDate" ColumnName="BlogDate" />
         <ScalarProperty Name="BlogTitle" ColumnName="BlogTitle" />
         <ScalarProperty Name="BlogType" ColumnName="BlogType" />
         <ScalarProperty Name="CityVisited" ColumnName="CityVisited" />
         <ScalarProperty Name="CountryVisited" 
                         ColumnName="CountryVisited" />
         <ScalarProperty Name="DateVisited" ColumnName="DateVisited" />
      </MappingFragment>
   </EntityTypeMapping>
</EntitySetMapping>

When compiled, the MSL allows the Entity Framework to generate the necessary query and update views that are then used within the Client View Engine to accomplish the transformation of the query defined in terms of the EDM to the logical storage schema.

An alternative option for expressing the mapping, or a portion of the mapping, is through the use of an ESQL query. In this case, when the developer expresses the Query View using ESQL, the infrastructure requires that they also define the accompanying Create, Update, and Delete mappings in the mapping specification. This is required because the full power of ESQL is available in the Query View, which makes it possible to define views for which there is no single valid update view; the mapping infrastructure therefore cannot generate a corresponding update view on its own.

Object Services

On top of the EntityClient provider, the Entity Framework adds another set of abstractions in order to allow development against objects, rather than untyped Data Records returned by EntityClient. This is the layer that is most often thought of as an ORM, producing CLR instances of the types defined in one's data model and allowing developers to query against those objects using either LINQ or ESQL. This also happens to be the layer of the Entity Framework that is responsible for initially attracting many developers to the technology when looking at the available ORM technologies in the marketplace.

As you saw in Figure 4, the high-level function of the Object Services layer is to take in either ESQL or LINQ queries from the application, pass a query expression to the EntityClient below, and return an IEnumerable<T>. To look a bit deeper, however, you see that at the heart of the Object Services layer is the ObjectContext, representing the session of interaction between the application and the underlying data store.

The ObjectContext is the primary construct that the developer will work with to query, add, and delete entity instances and to save new state back to the database. Figure 7 shows the creation and use of an ObjectContext to query an entity, modify it, and call SaveChanges. This example uses ESQL as the query language.

Figure 7 Using ObjectContext

using (ObjectContext context = new ObjectContext("name=travelEntities"))
{
    //--- create a query for a person
    ObjectQuery<Person> personQuery = context.CreateQuery<Person>(
        @"select value c from travelEntitiesGeneral.People 
          as c where c.PersonID == 1");
    //--- enumerating the query implicitly executes it
    //--- against the store, and you can then work with an
    //--- IEnumerable<Person>
    foreach (Person c in personQuery)
    {
        //--- dereference anything you like from Person
        Console.WriteLine(c.PersonID + ": " + c.Name);
        c.Name = "New Name";
    }
    try
    {
        context.SaveChanges();
    }
    catch (OptimisticConcurrencyException opt)
    {
        // catching this exception allows you to refresh the
        // entities that failed with either store wins or client wins.
        // project the failed state entries into their entity objects.
        var failedEntities = from e3 in opt.StateEntries
                             select e3.Entity;

        // Note: in future you should be able to just pass 
        // opt.StateEntries in to Refresh.
        context.Refresh(RefreshMode.ClientWins, failedEntities.ToList());
        context.SaveChanges();
    }
}

The process of tracking changes as they are made to the objects in memory and the process of saving those changes back to the database is simplified for the developer through the use of Object Services. Object Services makes use of the ObjectStateManager to track not only the current state of the instances in memory but also the initial state of each instance as it was retrieved from the store, allowing the Entity Framework to apply optimistic concurrency as data is pushed back to the database. Tracked changes are easily saved and pushed back to the data store with invocation of the SaveChanges method on the ObjectContext.
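The tracked state can also be inspected directly, which helps when reasoning about what SaveChanges will send to the store. The following is a minimal sketch, assuming a generated, strongly typed context named MyTravelPostEntities with a People entity set property and a Person entity that has a Name property (these names follow the article's samples, but the generated property names and namespace in your project may differ).

```csharp
using System;
using System.Data.Objects;
using System.Linq;
// Assumption: also add a using directive for the namespace that
// contains the generated MyTravelPostEntities and Person classes.

class InspectChangeTracking
{
    static void Main()
    {
        using (MyTravelPostEntities travelEntities =
            new MyTravelPostEntities())
        {
            // Assumption: the generated context exposes People.
            Person person = travelEntities.People.First();
            person.Name = "New Name";

            // Look up the state entry that tracks this instance.
            ObjectStateEntry entry = travelEntities
                .ObjectStateManager.GetObjectStateEntry(person);
            Console.WriteLine("State: {0}", entry.State);

            // Compare the value read from the store with the
            // current in-memory value for each changed property.
            foreach (string name in entry.GetModifiedProperties())
            {
                Console.WriteLine("{0}: '{1}' -> '{2}'", name,
                    entry.OriginalValues[name],
                    entry.CurrentValues[name]);
            }
        }
    }
}
```

Nothing is written to the database here because SaveChanges is never called; the sketch only reads what the ObjectStateManager has recorded.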

Up to this point, I have been speaking about the ObjectContext in general, and my examples have shown the use of a base ObjectContext, which is often used in the scenario where you have a dynamic tool or application that consumes EDM models. When using Visual Studio as the development environment, however, developers see the added benefit of a strongly typed ObjectContext, adding properties and methods to surface capabilities that may be specific to the EDM being targeted.

Figure 8 shows a query built using a strongly typed ObjectContext. This example demonstrates the use of LINQ as the query language. Using a strongly typed ObjectContext exposes properties for each EntitySet, making them more discoverable; for example, travelEntities.BlogPosts instead of travelEntities.CreateQuery<BlogPost>("travelEntitiesGeneral.BlogPosts").

Figure 8 Query Built Using a Strongly Typed ObjectContext

using (MyTravelPostEntities travelEntities = new MyTravelPostEntities())
{
    // get the latest blog post, with the comments and the people
    // I'm querying for all the blog posts that are related to this blog.
    // I want to include the comments and the people who wrote the
    // comments.
    // I also want only the most recent posting.
    // Note: Since we use the EntityKey that is put on the EntityReference
    // we can either do a tracking query or use span.
    BlogPost post = (from bp in 
        travelEntities.BlogPosts
                              .Include("Comments.Person")
                              .Include("Blog")
                     where bp.Blog.BlogID == requestedBlog.BlogID
                     orderby bp.BlogDate descending
                     select bp).First();
    return post;
} 

LINQ to Entities can be seen as a fairly thin layer over Object Services, providing query facilities directly within the programming language (see Figure 8) as opposed to a string-based query. In this case, the ObjectQuery class implements IQueryable, which allows it to take a LINQ expression tree and push the query through the Entity Framework as a CCT query expression, in the same manner as Object Services would pass an ESQL query to the EntityClient provider.
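Because a LINQ to Entities query is backed by an ObjectQuery, you can cast it and inspect the store command it translates to. The sketch below assumes the generated MyTravelPostEntities context and BlogPost type from Figure 8; the class name is illustrative, and the connection is opened explicitly on the assumption that trace output needs an open connection.

```csharp
using System;
using System.Data.Objects;
using System.Linq;
// Assumption: also add a using directive for the namespace that
// contains the generated MyTravelPostEntities and BlogPost classes.

class ShowLinqTranslation
{
    static void Main()
    {
        using (MyTravelPostEntities travelEntities =
            new MyTravelPostEntities())
        {
            // Opened explicitly for the trace call below.
            travelEntities.Connection.Open();

            // Composing the query does not execute it.
            IQueryable<BlogPost> recent =
                from bp in travelEntities.BlogPosts
                orderby bp.BlogDate descending
                select bp;

            // The IQueryable is expected to be an ObjectQuery underneath.
            ObjectQuery objectQuery = recent as ObjectQuery;
            if (objectQuery != null)
            {
                // Prints the store command generated for this query.
                Console.WriteLine(objectQuery.ToTraceString());
            }
        }
    }
}
```

The same ToTraceString call works for an ObjectQuery created from an ESQL string, which makes it a convenient way to compare how the two query styles reach the store.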

N-Tier Development with the Entity Framework

While it is not my primary goal to fully address n-tier development here, it is one of the more interesting scenarios for development with the Entity Framework and should be touched on. In version 1.0, the Entity Framework supports n-tier development in a couple of primary scenarios: the use of ADO.NET Data Services, or the use of Windows® Communication Foundation (WCF) with the ability to serialize entities and attach and detach entities from an ObjectContext. These are obviously not the only approaches to n-tier development; however, these are the solutions the team chose to concentrate on in V1, with additional scenarios, such as a more dataset-like experience, to be added in V2 and beyond. Figure 9 shows a client deleting a comment through ADO.NET Data Services.

Figure 9 ADO.NET Data Services in N-Tier Apps

static Uri baseService = new 
   Uri("https://localhost:17338/MyTravelPostService.svc");
MyPeople2Entities context = new MyPeople2Entities(baseService); 

// get the comment that is being marked for deletion
// and get the view state blog post.
BlogPost post = (BlogPost)ViewState["BlogPost"];

// move the comment to the deleted comment selection. 
Comment deletedComment = post.Comments[e.RowIndex];

// call the DeleteComment service
context.AttachTo("Comments", deletedComment);
context.DeleteObject(deletedComment);
DataServiceResponse r = context.SaveChanges();

// reload page so that F5, refresh doesn't update all this data.
ReloadPage();

ADO.NET Data Services is a concrete realization of the Representational State Transfer (REST) architectural style, in which each resource represents a "noun" in the system (a thing that can be uniquely addressed via a Uniform Resource Identifier, or URI). It enables n-tier application development over any arbitrary IQueryable implementation. With ADO.NET Data Services, you can do more than just query for instances across the wire: it supports the various HTTP verbs for Create, Read, Update, and Delete and provides the client-side abstractions to help developers implement their solutions.

The second option for n-tier scenarios is the use of WCF with the Entity Framework, taking advantage of the ability to serialize entities and attach and detach entities from an ObjectContext. Figure 10 shows how to attach to an ObjectContext in this scenario.

Figure 10 Attaching to ObjectContext

// creating MyTravelPostEntities sets up the connection 
// and all the metadata information automatically for you.
using (MyTravelPostEntities travelEntities = new MyTravelPostEntities())
{
    // attach the comment and delete.
    travelEntities.Attach(deleteComment);

    // call delete on the object
    travelEntities.DeleteObject(deleteComment);

    try
    {
        travelEntities.SaveChanges();
    }
    catch (OptimisticConcurrencyException opt)
    {
        // catching this exception allows you to refresh the
        // entities that failed with either store wins or client wins.
        // project the failed state entries into their entity objects.
        var failedEntities = from e3 in opt.StateEntries
                             select e3.Entity;

        travelEntities.Refresh(RefreshMode.ClientWins, failedEntities.ToList());
        travelEntities.SaveChanges();
    }
}

By default, any CLR classes that are generated from an EDM in Visual Studio or by using edmgen.exe (the command-line tool that ships with the Entity Framework) are XML serializable, binary serializable, and Data Contracts, with the Navigation Properties attributed as DataMembers. This makes it possible to create ASMX Web services and to use Entity instances in view state or WCF services.
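As a rough illustration of the Data Contract support, the sketch below round-trips one entity through DataContractSerializer in memory, which is essentially what happens when an entity crosses a WCF boundary. It assumes the generated MyTravelPostEntities context exposes a Comments entity set property and that the generated Comment class is a Data Contract as described above; the class name is illustrative.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Runtime.Serialization;
// Assumption: also add a using directive for the namespace that
// contains the generated MyTravelPostEntities and Comment classes.

class SerializeEntity
{
    static void Main()
    {
        using (MyTravelPostEntities travelEntities =
            new MyTravelPostEntities())
        {
            // Assumption: the generated context exposes Comments.
            Comment original = travelEntities.Comments.First();

            DataContractSerializer serializer =
                new DataContractSerializer(typeof(Comment));
            using (MemoryStream stream = new MemoryStream())
            {
                // Write the entity out as Data Contract XML.
                serializer.WriteObject(stream, original);
                Console.WriteLine("Serialized {0} bytes", stream.Length);

                // Read it back into a new, detached instance.
                stream.Position = 0;
                Comment copy = (Comment)serializer.ReadObject(stream);
                Console.WriteLine("Separate instance: {0}",
                    !ReferenceEquals(original, copy));
            }
        }
    }
}
```

The deserialized copy is not tracked by any ObjectContext, so on the receiving tier it would need to be attached, as in Figure 10, before changes to it can be saved.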

Like most ORMs, the Entity Framework currently does not support Data Manipulation Language (DML) operations for create, update, or delete. Changes must be applied to objects in memory, and building the entire graph to be persisted could require a number of round-trips to the database.

One way that this can be avoided is through the use of the attach functionality offered by the ObjectContext. Using Attach allows you to instruct the infrastructure that the Entity already exists and that a set of operations should be performed in memory and then the changes pushed down. For additional information on n-tier development with the Entity Framework, search the MSDN® Library, as more content will be added soon.

Just Another ORM?

So far, the Entity Framework has been considered by many to be just another ORM in the marketplace, which is understandable when looking simply at the first version of the product. Much of what has been included in the product to this point does enable the core set of scenarios that ORMs tackle. Much of the analysis to this point, however, has pointed out that the Entity Framework does not in all cases cover what you might expect from some of the other ORMs in the marketplace, and this is a valid observation.

The investment that Microsoft is making in this space is meant to extend well beyond that of a traditional ORM product, and the Entity Framework, as I will discuss shortly, is the first step in a much broader strategy around the EDM. The EDM, as I discussed at the beginning of this article, creates a higher-level domain model that will be applicable beyond just the Entity Framework and the world of traditional ORMs. The expectation is that over the next few releases of the Microsoft .NET Framework, Visual Studio, SQL Server®, and other Microsoft technologies, you will begin to see increased adoption of the EDM.

This expectation, and the overall vision of where the EDM is headed, has been the primary influence on many of the product decisions discussed throughout this article. Many decisions have been made with the explicit intent of enabling adoption by technologies such as Reporting Services and Analysis Services. This will bring a strong benefit to customers, as services can be offered across a common and consistent domain model.

The first realization of this vision will ship alongside the Entity Framework in Visual Studio 2008 SP1 as ADO.NET Data Services. ADO.NET Data Services, which delivers a compelling developer experience for REST-based applications, will be the first released product (outside of the Entity Framework) to be built with the EDM as its metadata exchange format.

In coordination with this release, Microsoft demonstrated a number of different Windows Live™ properties at MIX 2008, which expose their data using the ADO.NET Data Services protocol and the EDM. Similarly, as we begin now to plan for the next release of SQL Server and Visual Studio, the team is working hard on better end-to-end development experiences with EDM and Entity Framework at the core.

Elisa Flasko is a Program Manager in the Data Programmability team at Microsoft, focusing on ADO.NET, XML, and SQL Server Connectivity technologies. She can be reached via her blog at blogs.msdn.com/elisaj.