October 2013

Volume 28 Number 10

Data Points - Coding for Domain-Driven Design: Tips for Data-Focused Devs, Part 3

By Julie Lerman

Read the entire Domain-Driven Design series:
Coding for Domain-Driven Design: Tips for Data-Focused Devs, Part 1
Coding for Domain-Driven Design: Tips for Data-Focused Devs, Part 2
Coding for Domain-Driven Design: Tips for Data-Focused Devs, Part 3

This is the final installment of my series on helping data-focused developers wrap their heads around some of the more challenging coding concepts used with Domain-Driven Design (DDD). As a Microsoft .NET Framework developer using Entity Framework (EF), and with a long history of data-first (and even database-first) development, I’ve struggled and argued and whined my way to understanding how to merge my skills with some of the implementation techniques of DDD. Even if I’m not using a full DDD implementation (from client interaction all the way to the code) in a project, I still benefit greatly from many of the tools of DDD.

In this last installment, I’ll discuss two important technical patterns of DDD coding and how they apply to the object-relational mapping (ORM) tool I use, EF. In an earlier installment I talked about one-to-one relationships. Here, I’ll explore unidirectional relationships—preferred with DDD—and how they affect your application. This choice leads to a difficult decision: recognizing when you might be better off without some of the nice relationship “magic” that EF performs. I’ll also talk a bit about the importance of balancing tasks between an aggregate root and a repository.

Build Unidirectional Relationships from the Root

From the time I started building models with EF, two-way relationships have been the norm, and I did this without thinking too hard about it. It makes sense to be able to navigate in two directions. If you have orders and customers, it’s nice to be able to see the orders for a customer and, given an order, it’s convenient to access the customer data. Without thinking, I also built a two-way relationship between orders and their line items. A relationship from order to line items makes sense. But if you stop to consider this for a moment, the scenarios in which you have a line item and need to get back to its order are few and far between. One that I can think of is reporting on products, where you want to analyze which products are commonly ordered together, or run an analysis that involves customer or shipping data. In such cases, you might need to navigate from a product to the line items in which it’s contained and then back to the order. However, I can see this coming up only in a reporting scenario, where I’m unlikely to need the DDD-focused objects.

If I only need to navigate from order to line items, what is the most efficient way to describe such a relationship in my model?

As I noted, DDD prefers unidirectional relationships. Eric Evans advises that “it’s important to constrain relationships as much as possible,” and that “understanding the domain may reveal natural directional bias.” Managing the complexities of relationships—especially when you’re depending on the Entity Framework to maintain associations—is definitely an area that can cause a lot of confusion. I’ve already penned a number of Data Points columns devoted to associations in Entity Framework. Any level of complexity that can be removed is probably beneficial.

Contemplating the simple sales model I’ve used for this series on DDD, it does present a bias in the direction of an order to its line items. I can’t imagine creating, deleting or editing a line item without starting from the order.

If you look back at the Order aggregate I built earlier in the series, the order does control the line items. For example, you need to use the CreateLineItem method of the Order class to add a new line item:

public void CreateLineItem(Product product, int quantity)
{
  var item = new LineItem
  {
    OrderQty = quantity,
    ProductId = product.ProductId,
    UnitPrice = product.ListPrice,
    UnitPriceDiscount = CustomerDiscount + PromoDiscount
  };
  LineItems.Add(item);
}

The LineItem type has an OrderId property, but no Order property. That means it’s possible to set the value of OrderId, but you can’t navigate from a LineItem to an actual Order instance.

In this case, I have, in Evans’ words, “imposed a traversal direction.” I have, in effect, ensured I can traverse from Order to LineItem but not in the other direction.
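To make the imposed direction concrete, here is a minimal sketch of what the two types might look like. The property names come from the listings in this series, but the property types and the public setters are assumptions made to keep the sketch short; this is not the full Order aggregate.

```csharp
using System.Collections.Generic;

// Aggregate root: owns the collection and is the only way in to line items.
public class Order
{
    public Order()
    {
        LineItems = new List<LineItem>();
    }

    public int OrderId { get; set; }
    public ICollection<LineItem> LineItems { get; private set; }
}

// Child entity: carries the foreign key value, but no reference to Order.
public class LineItem
{
    public int LineItemId { get; set; }
    public int OrderId { get; set; }            // key value only
    public int ProductId { get; set; }
    public int OrderQty { get; set; }
    public decimal UnitPrice { get; set; }      // decimal is an assumption
    public decimal UnitPriceDiscount { get; set; }

    // Deliberately no "public Order Order { get; set; }" property here,
    // so code holding a LineItem cannot walk back to the Order instance.
}
```

With this shape, order.LineItems compiles, while lineItem.Order does not exist at all, so the traversal direction is enforced by the compiler rather than by convention.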

There are implications to this approach not only in the model but also in the data layer. I use Entity Framework as my ORM tool and it comprehends this relationship well enough simply from the LineItems property of the Order class. And because I happen to follow the conventions of EF, it understands that LineItem.OrderId is my foreign key property back to the Order class. If I used a different name for OrderId, things would be more complicated for Entity Framework.

But in this scenario, I can add a new LineItem to an existing order like this:

order.CreateLineItem(aProductInstance, 2);
var repo = new SimpleOrderRepository();
repo.AddAndUpdateLineItemsForExistingOrder(order);
repo.Save();

The order variable now represents a graph with a preexisting order and a single new LineItem. That preexisting order has come from the database and already has a value in OrderId, but the new LineItem has only the default value for its OrderId property, and that’s 0.

My repository method takes that order graph, adds it to my EF context and then applies the proper state, as shown in Figure 1.

Figure 1 Applying State to an Order Graph

public void AddAndUpdateLineItemsForExistingOrder(Order order)
{
  _context.Orders.Add(order);
  _context.Entry(order).State = EntityState.Unchanged;
  foreach (var item in order.LineItems)
  {
    // Existing items from database have an Id & are being modified, not added
    if (item.LineItemId > 0)
    {
      _context.Entry(item).State = EntityState.Modified;
    }
  }
}

In case you aren’t familiar with EF behavior, the Add method causes the context to begin tracking everything in the graph (the order and the single line item). At the same time, each object in the graph is flagged with the Added state. But because this method is focused on using a preexisting order, I know that Order is not new and, therefore, the method fixes the state of the Order instance by setting it to Unchanged. It also checks for any preexisting LineItems and sets their state to Modified so they’ll be updated in the database rather than inserted as new. In a more fleshed-out application, I’d use a pattern for more definitively knowing the state of each object, but I don’t want this sample to get bogged down with additional details. (You can see an early version of this pattern on Rowan Miller’s blog at bit.ly/1cLoo14, and an updated example in our coauthored book “Programming Entity Framework: DbContext” [O’Reilly Media, 2012].)

Because all of these actions are being done while the context is tracking the objects, Entity Framework also “magically” fixes the value of the OrderId in my new LineItem instance. Therefore, by the time I call Save, the LineItem’s OrderId is no longer the default of 0; it matches the key of its order (1, in this example).

Letting Go of the EF Relationship-Management Magic—for Updates

This good fortune occurs because my LineItem type happens to follow EF convention with the foreign key name. If you named it something other than OrderId, such as OrderFK, you’d have to make some changes to your type (for example, introducing the unwanted Order navigation property) and then specify EF mappings. This isn’t desirable, as you’d be adding complexity simply to satisfy the ORM. Sometimes that may be necessary, but when it’s not I prefer to avoid it.

It would be simpler to just let go of any dependency on the EF relationship magic and control the setting of the foreign key in your code.

The first step is to tell EF to ignore this relationship; otherwise, it will continue to look for a foreign key.

Here’s code I’ll use in the DbContext.OnModelCreating method override so that EF won’t pay attention to that relationship:

modelBuilder.Entity<Order>().Ignore(o => o.LineItems);

Now, I’ll take control of the relationship myself. This means refactoring so I add a constructor to LineItem that requires OrderId and other values, and it makes LineItem much more like a DDD entity so I’m happy. I also have to modify the CreateLineItem method in Order to use that constructor rather than an object initializer.
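The refactoring described above might look something like the following sketch. The constructor's parameter list, the private parameterless constructor (kept on the assumption that the ORM needs one to materialize query results and accepts it as private), the decimal types, and the simplified Order and Product classes are all assumptions for illustration. SetOrderIdentity is the method the repository calls in Figure 2; its body here is a guess at the simplest possible implementation.

```csharp
using System.Collections.Generic;

public class Product
{
    public int ProductId { get; set; }
    public decimal ListPrice { get; set; }
}

public class LineItem
{
    // Assumption: kept only so the ORM can materialize query results.
    private LineItem() { }

    // Domain code must supply the order's key and the other required values.
    public LineItem(int orderId, int productId, int quantity,
                    decimal unitPrice, decimal unitPriceDiscount)
    {
        OrderId = orderId;
        ProductId = productId;
        OrderQty = quantity;
        UnitPrice = unitPrice;
        UnitPriceDiscount = unitPriceDiscount;
    }

    public int LineItemId { get; private set; }
    public int OrderId { get; private set; }
    public int ProductId { get; private set; }
    public int OrderQty { get; private set; }
    public decimal UnitPrice { get; private set; }
    public decimal UnitPriceDiscount { get; private set; }

    // Lets the repository apply the order's key once it is known.
    public void SetOrderIdentity(int orderId)
    {
        OrderId = orderId;
    }
}

// Simplified stand-in for the Order aggregate root built earlier in the series.
public class Order
{
    public Order()
    {
        LineItems = new List<LineItem>();
    }

    public int OrderId { get; set; }
    public decimal CustomerDiscount { get; set; }
    public decimal PromoDiscount { get; set; }
    public ICollection<LineItem> LineItems { get; private set; }

    // Uses the constructor instead of an object initializer.
    public void CreateLineItem(Product product, int quantity)
    {
        var item = new LineItem(OrderId, product.ProductId, quantity,
                                product.ListPrice,
                                CustomerDiscount + PromoDiscount);
        LineItems.Add(item);
    }
}
```

For an order that already exists, the new line item picks up the real OrderId at creation. For a brand-new order whose key is still 0, the repository would have to call SetOrderIdentity after the order has been saved.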

Figure 2 shows an updated version of the repository method.

Figure 2 The Repository Method

public void UpdateLineItemsForExistingOrder(Order order)
{
  foreach (var item in order.LineItems)
  {
    if (item.LineItemId > 0)
    {
      _context.Entry(item).State = EntityState.Modified;
    }
    else
    {
      _context.Entry(item).State = EntityState.Added;
      item.SetOrderIdentity(order.OrderId);
    }
  }
}

Notice I’m no longer adding the order graph and then fixing the order’s state to Unchanged. In fact, because EF is unaware of the relationship, if I called _context.Orders.Add(order), it would add the order instance but wouldn’t add the related line items as it did before.

Instead, I’m iterating through the graph’s line items and not only setting the state of existing line items to Modified but setting the state of new ones to Added. The DbContext.Entry syntax I’m using does two things. Before it sets the state, it checks to see if the context is already aware of (or “tracking”) that particular entity. If it’s not, then internally it attaches the entity. Now it’s able to respond to the fact that the code is setting the state property. So in that single line of code, I’m attaching and setting the state of the LineItem.

My code is now in accord with another healthy prescription for using EF with DDD, which is: don’t rely on EF to manage relationships. EF performs a lot of magic, a huge bonus in many scenarios. I’ve happily benefited from this for years. But for DDD aggregates, you really want to manage those relationships in your model and not rely on the data layer to perform necessary actions for you.

Because I’m stuck for the time being using integers for my keys (Order.OrderId, for example) and depending on my database to provide the values of those keys, I need to do some extra work in the repository for new aggregates such as a new order with line items. I’ll need tight control of the persistence so I can use the old-fashioned pattern of inserting graphs: insert order, get new database-generated OrderId value, apply that to the new line items, and save them to the database. This is necessary because I’ve broken the relationship that EF would normally use to perform this magic. You can see in the sample download how I’ve implemented this in the repository.

I’m ready, after many years, to stop depending on the database to create my identifier and begin to use GUIDs for my key values, which I can generate and assign in my app. This allows me to further separate my domain from the database.
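As a rough illustration of why app-generated keys help, here is a sketch in which the order's identity exists from the moment the object is constructed, so new line items can receive it immediately and no round trip to the database is needed to learn the key. The Guid-typed keys and the trimmed-down constructors are assumptions; the sample in this series still uses integer keys.

```csharp
using System;
using System.Collections.Generic;

public class Order
{
    public Order()
    {
        // Identity is assigned in the app, not by the database.
        OrderId = Guid.NewGuid();
        LineItems = new List<LineItem>();
    }

    public Guid OrderId { get; private set; }
    public ICollection<LineItem> LineItems { get; private set; }

    public void CreateLineItem(int productId, int quantity)
    {
        // The key is already known, even though nothing has been saved yet.
        LineItems.Add(new LineItem(OrderId, productId, quantity));
    }
}

public class LineItem
{
    public LineItem(Guid orderId, int productId, int quantity)
    {
        LineItemId = Guid.NewGuid();
        OrderId = orderId;
        ProductId = productId;
        OrderQty = quantity;
    }

    public Guid LineItemId { get; private set; }
    public Guid OrderId { get; private set; }
    public int ProductId { get; private set; }
    public int OrderQty { get; private set; }
}
```

One consequence is that a check such as item.LineItemId > 0 can no longer distinguish new items from existing ones, so the repository would need another way to know each object's state.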

Keeping the EF Relationship-Management Magic—for Queries

Divesting my model of EF relationships really helped in the previous scenario for performing updates. But I don’t want to lose all of the relationship features of EF. Loading related data when querying from the database is one feature I don’t want to give up. Whether I’m eager loading, lazy loading or explicitly loading, I love benefiting from the ability of EF to bring related data along without having to express and execute additional queries.

This is where an extended view of the separation of concerns concept comes into play. When following DDD precepts for design, it’s not unusual to have different representations of similar classes. For example, you might do this with a Customer class designed to be used in the context of customer management, as opposed to a Customer class for simply populating a pick list that needs only the customer’s name and identifier.

It also makes sense to have different DbContext definitions. In scenarios where you’re retrieving data, you might want a context that’s aware of the relationship between Order and LineItems so you can eagerly load an order along with its line items from the database. But then, when you’re performing updates as I did earlier, you may want a context that explicitly ignores that relationship so you can have more granular control of your domain.
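A sketch of what such a pair of contexts could look like follows. It assumes the EntityFramework package is referenced and that both contexts point at the same existing database. The context names, the "SalesDb" connection string name and the OrderQueries helper are hypothetical, and the entity classes are trimmed to the minimum.

```csharp
using System.Collections.Generic;
using System.Data.Entity;
using System.Linq;

public class Order
{
    public Order() { LineItems = new List<LineItem>(); }
    public int OrderId { get; set; }
    public ICollection<LineItem> LineItems { get; set; }
}

public class LineItem
{
    public int LineItemId { get; set; }
    public int OrderId { get; set; }
}

// Read side: default conventions discover Order.LineItems and LineItem.OrderId.
public class OrderReadContext : DbContext
{
    static OrderReadContext()
    {
        // Assumption: the database already exists, so no initializer is wanted.
        Database.SetInitializer<OrderReadContext>(null);
    }

    public OrderReadContext() : base("name=SalesDb") { }   // hypothetical name

    public DbSet<Order> Orders { get; set; }
}

// Write side: the relationship is ignored so the domain code controls the keys.
public class OrderWriteContext : DbContext
{
    static OrderWriteContext()
    {
        Database.SetInitializer<OrderWriteContext>(null);
    }

    public OrderWriteContext() : base("name=SalesDb") { }  // hypothetical name

    public DbSet<Order> Orders { get; set; }
    public DbSet<LineItem> LineItems { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        modelBuilder.Entity<Order>().Ignore(o => o.LineItems);
    }
}

public static class OrderQueries
{
    // Eagerly loads an order together with its line items using the read context.
    public static Order GetOrderWithLineItems(int orderId)
    {
        using (var context = new OrderReadContext())
        {
            return context.Orders
                .Include(o => o.LineItems)
                .SingleOrDefault(o => o.OrderId == orderId);
        }
    }
}
```

The same Order and LineItem classes serve both contexts here. Only the mapping differs, which keeps the split between reading and writing in the data layer rather than in the domain types.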

An extreme view of this for a certain subset of complex problems you may be solving with software is a pattern called Command Query Responsibility Segregation (CQRS). CQRS guides you to think of data retrieval (reads) and data storage (writes) as separate systems that may require distinct models and architectures. My small example, which highlights the benefit of having the data-retrieval operations embrace a different understanding of relationships than data-storage operations, gives you an idea of what CQRS can help you achieve. You can learn more about CQRS from the excellent resource, CQRS Journey, available at msdn.microsoft.com/library/jj554200.

Data Access Happens in the Repository, Not the Aggregate Root

I want to back up a bit now and tackle one last question that gnawed at me when I started focusing on unidirectional relationships. (This is not to say that I have no more questions about DDD, but this is the final topic I’ll address in this series.) This question about unidirectional relationships is a common one for us “database-first” thinkers: Where, exactly (with DDD), does data access take place?

When EF was first released, the only way it could work with a database was to reverse-engineer an existing database. So, as I noted earlier, I got used to every relationship being two-way. If the Customers and Orders tables in the database had a primary key/foreign key constraint describing a one-to-many relationship, I saw that one-to-many relationship in the model. Customer had a navigation property to a collection of orders. Order had a navigation property to an instance of Customer.

As things evolved to Model- and Code-First, where you can describe the model and generate a database, I continued to follow that pattern, defining navigation properties on both ends of a relationship. EF was happy, mappings were simpler and coding was more natural.

So, with DDD, when I found myself with an Order aggregate root that was aware of CustomerId or maybe even a full Customer type, but I couldn’t navigate from Order back to Customer, I got upset. The first question I asked was, “what if I want to find all of the orders for a customer?” I always assumed I’d need to be able to do that, and I was used to relying on having access to navigation in both directions.

If logic begins with my order aggregate root, how would I ever answer that question? I also initially had the misconception that you do everything through the aggregate root, which didn’t help.

The solution made me hit my head and feel a bit foolish. I share my foolishness here in case someone else gets stuck in the same way. It’s not the job of the aggregate root, nor the job of the Order, to help me answer that question. However, in an Order-focused repository, which is what I’d use to perform my queries and persistence, there’s no reason I can’t have a method to answer my question:

public List<Order> GetOrdersForCustomer(Customer customer)
{
  return _context.Orders
    .Where(o => o.CustomerId == customer.Id)
    .ToList();
}

The method returns a list of Order aggregate roots. Of course, if I’m creating this in the scope of doing DDD, I’d only bother putting that method in my repository if I know it’s going to be needed in the particular context, not “just in case.” Chances are, I’d need it in a reporting app or something similar, but not necessarily in a context designed for building sales orders.

Only the Beginning of My Quest

As I’ve learned about DDD over the past few years, the topics I covered in this series are the ones that I had the most difficulty either comprehending or figuring out how to implement when Entity Framework would be part of my data layer. Some of the frustration I encountered was due to years of thinking about my software from the perspective of how things would work in my database. Letting go of this perspective has been freeing because it lets me focus on the problem at hand—the domain problem for which I’m designing software. At the same time, I do need to find a healthy balance because there may be data-layer issues I encounter when it’s time to add that into my solution.

While I’ve focused on how things might work when I’m mapping my classes directly back to the database with Entity Framework, it’s important to consider that there could be another layer (or more) between the domain logic and the database. For example, you might have a service with which your domain logic interacts. At that point, the data layer is of little (or no) consequence to mapping from your domain logic; that problem now belongs to the service.

There are many ways to approach your software solutions. Even when I’m not implementing a full end-to-end DDD approach (something that takes quite a bit of mastery), my entire process continues to benefit from the lessons and techniques I’m learning from DDD.


Julie Lerman is a Microsoft MVP, .NET mentor and consultant who lives in the hills of Vermont. You can find her presenting on data access and other Microsoft .NET Framework topics at user groups and conferences around the world. She blogs at thedatafarm.com/blog and is the author of “Programming Entity Framework” (2010) as well as a Code First edition (2011) and a DbContext edition (2012), all from O’Reilly Media. Follow her on Twitter at twitter.com/julielerman and see her Pluralsight courses at juliel.me/PS-Videos.

Thanks to the following technical expert for reviewing this article: Stephen Bohlen (Microsoft)