March 2016

Volume 31 Number 3

[Cutting Edge]

The Query Stack of a CQRS Architecture

By Dino Esposito | March 2016

Today, Command and Query Responsibility Segregation (CQRS) is one of the most talked-about architectural topics. At its core, CQRS is nothing more than common sense: all it recommends is that you code the queries and commands your system requires through distinct, ad hoc stacks. You might want to have a domain layer in the command stack, organize business tasks in domain services and pack business rules into domain objects. The persistence layer can also be coded in total freedom, picking the technology and approach that best fit your needs, whether plain relational tables, NoSQL or event stores.

What about the read stack, instead? The good news is that once you start looking at a CQRS architecture, you often don’t feel the need for a distinct section of the domain layer just for reading data. Plain queries out of ready-made tables are all you need. The query stack performs read-only operations against the back end that don’t alter the state of the system. Because of this, you might not need any form of business rules or logical intermediation between presentation and storage, or at least nothing beyond the basic capabilities of SQL query operators such as GROUP BY, JOIN and ORDER BY. In the implementation of a modern query stack, the LINQ language built into the Microsoft .NET Framework is immensely helpful. In the rest of this column, I’ll go through a possible implementation of a read stack where the storage is designed to be close to the organization of data required by the presentation.

Having Ready-Made Data

In a CQRS project, you typically plan separate projects for the query and command stacks and you can even have different teams taking care of each stack. A read stack essentially consists of two components: a collection of Data Transfer Objects (DTOs) to deliver data up to the presentation layer and a database context to perform physical reads and populate DTOs.

To keep the query stack as lean and mean as possible, you should aim for a close match between the data stored and the data to present. Especially these days, this might not be the case: A more common scenario is that data is ideally stored in one format, but must be arranged in a significantly different format to be consumed effectively. As an example, think of a software system to book meeting rooms in a business environment. Figure 1 gives a graphical perspective of this scenario.

Figure 1 Booking Schema in a CQRS Architecture

The user is offered a few forms to create a booking request or update an existing booking. Each action is logged in the command data store as is. Scrolling through the list of logged booking events, one can easily track when each booking was entered into the system, when it was modified, how many times it was modified and if and when it was deleted. This is definitely the most effective way to store information about a dynamic system where the state of stored items might change over time.

Saving the state of the system in the form of events has many benefits as summarized in my August 2015 Cutting Edge column (msdn.com/magazine/mt185569), but it can’t offer an immediate view of the current state of the system. In other words, you might be able to dig out the full history of a single booking, but you’re not immediately able to return the list of pending bookings. Users of the system, instead, are also typically interested in getting the list of bookings. This creates the need for two distinct stores kept in sync. Every time a new event is logged, the state of the involved entities should be updated (synchronously or asynchronously) for easy queries.

During the synchronization step, you make sure that useful data is extracted from the log of events and massaged into a format that’s easy to consume from the UI through the query stack. At this point, all the query stack needs to do is shape up data for the particular data model of the current view.
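The synchronization step can be pictured as a small projection that folds each logged event into a flat, presentation-ready record. What follows is only a minimal sketch under stated assumptions: the event classes (BookingCreated, BookingUpdated, BookingDeleted), the BookingRow read model and the BookingProjection class are hypothetical names introduced for illustration, and a dictionary stands in for the read-side table.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical events, as the command stack might log them.
public abstract class BookingEvent
{
    public Guid BookingId { get; set; }
    public DateTime When { get; set; }
}

public class BookingCreated : BookingEvent
{
    public string Room { get; set; }
    public DateTime Start { get; set; }
    public int Hours { get; set; }
}

public class BookingUpdated : BookingEvent
{
    public DateTime Start { get; set; }
    public int Hours { get; set; }
}

public class BookingDeleted : BookingEvent { }

// Hypothetical read-model row, already shaped for the UI.
public class BookingRow
{
    public Guid BookingId { get; set; }
    public string Room { get; set; }
    public DateTime Start { get; set; }
    public int Hours { get; set; }
    public int TimesModified { get; set; }
}

// Keeps the read store in sync with the event log, one event at a time.
public class BookingProjection
{
    // Stands in for the read-side table.
    private readonly Dictionary<Guid, BookingRow> _rows =
        new Dictionary<Guid, BookingRow>();

    public IEnumerable<BookingRow> CurrentBookings
    {
        get { return _rows.Values.OrderBy(r => r.Start); }
    }

    public void Apply(BookingEvent e)
    {
        var created = e as BookingCreated;
        if (created != null)
        {
            _rows[e.BookingId] = new BookingRow
            {
                BookingId = e.BookingId,
                Room = created.Room,
                Start = created.Start,
                Hours = created.Hours
            };
            return;
        }

        var updated = e as BookingUpdated;
        if (updated != null)
        {
            BookingRow row;
            if (_rows.TryGetValue(e.BookingId, out row))
            {
                row.Start = updated.Start;
                row.Hours = updated.Hours;
                row.TimesModified++;
            }
            return;
        }

        if (e is BookingDeleted)
        {
            _rows.Remove(e.BookingId);
        }
    }
}
```

Whether Apply is called synchronously, right after the event is logged, or asynchronously from a background process is a separate decision; the projection logic stays the same either way.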

More on the Role of Events

A common objection is, “Why should I save data in the form of events? Why not simply save data in the form in which it will be used?”

The answer is the classic, “It depends,” and the interesting thing is that it doesn’t typically depend on you, the architect. It depends on the business needs. If it’s key for the customer to track the history of a business item (that is, the booking) or to see what the list of bookings was on a given day and if the availability of rooms may change over time, the simplest way to solve the problem is by logging events and building any required data projection on top of that.
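To make the "list of bookings on a given day" case concrete, here's a rough sketch of how such a projection could be rebuilt from the log by replaying only the events recorded up to a given date. The LoggedBookingEvent class, the BookingEventKind enum and the BookingsAsOf method are assumptions made for this example, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum BookingEventKind { Created, Updated, Deleted }

// Hypothetical shape of one entry in the event log.
public class LoggedBookingEvent
{
    public int BookingId { get; set; }
    public BookingEventKind Kind { get; set; }
    public DateTime When { get; set; }
}

public static class BookingHistory
{
    // Replays the events logged up to 'asOf', in chronological order, and
    // returns the IDs of the bookings that existed at that point in time.
    public static List<int> BookingsAsOf(
        IEnumerable<LoggedBookingEvent> log, DateTime asOf)
    {
        var alive = new HashSet<int>();
        foreach (var e in log.Where(x => x.When <= asOf).OrderBy(x => x.When))
        {
            if (e.Kind == BookingEventKind.Created)
                alive.Add(e.BookingId);
            else if (e.Kind == BookingEventKind.Deleted)
                alive.Remove(e.BookingId);
            // Updated events change details, not existence, so they are skipped here.
        }
        return alive.OrderBy(id => id).ToList();
    }
}
```

A store that only keeps the current state can't answer this question at all, because the deleted booking would simply be gone.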

By the way, this is also the internal mechanics used by business intelligence (BI) services and applications. By using events, you can even lay the ground for some in-house BI.

Read-Only Entity Framework Context

Entity Framework is a common way to access stored data, at least from within .NET and ASP.NET applications. Entity Framework funnels database calls through an instance of the DbContext class. The DbContext class provides read/write access to the underlying database. If you make the DbContext instance visible from the uppermost layers, you expose your architecture to the risk of also receiving state updates from within the query stack. It’s not a bug, per se; but it’s a serious violation of architectural rules that you want to avoid. To avoid write access to the database, you might want to wrap the DbContext instance in a container and disposable class, as shown in Figure 2.

Figure 2 Wrapping the DbContext Instance in a Container and Disposable Class

public class Database : IDisposable
{
  private readonly SomeDbContext _context = new SomeDbContext();
  public IQueryable<YourEntity> YourEntities
  {
    get
    {
      return _context.YourEntities;
    }
  }
  public void Dispose()
  {
    _context.Dispose();
  }
}

You can use Database wherever you’d like to use a DbContext only to query data. Two aspects make the Database class different from a plain DbContext. First and foremost, the Database class encapsulates a DbContext instance as a private member. Second, the Database class exposes all or some of the DbContext collections as IQueryable<T> collections rather than as DbSet<T> collections. This is the trick to enjoy the query power of LINQ while being unable to add, delete or just save changes back.
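The same idea can be tried without a database at hand. In the sketch below, a private list plays the role of the DbContext; the Booking class, the InMemoryDatabase class and the BookingQueries helper are hypothetical names used only to show what callers can and can't do with an IQueryable<T> property.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Booking
{
    public int Id { get; set; }
    public string Room { get; set; }
}

// Stand-in for the Database class: the private list plays the DbContext role.
public class InMemoryDatabase : IDisposable
{
    private readonly List<Booking> _bookings = new List<Booking>
    {
        new Booking { Id = 1, Room = "Red" },
        new Booking { Id = 2, Room = "Blue" }
    };

    // Callers see a queryable sequence, not the underlying collection.
    public IQueryable<Booking> Bookings
    {
        get { return _bookings.AsQueryable(); }
    }

    public void Dispose()
    {
        // Nothing to release here; a real wrapper would dispose its DbContext.
    }
}

public static class BookingQueries
{
    public static List<string> RoomNames()
    {
        using (var db = new InMemoryDatabase())
        {
            // db.Bookings.Add(...) wouldn't compile: IQueryable<T> has no Add method.
            return db.Bookings
                     .OrderBy(b => b.Room)
                     .Select(b => b.Room)
                     .ToList();
        }
    }
}
```

With Entity Framework 6, the getter of a read-only wrapper could also return the set through AsNoTracking so that query results skip change tracking, which is a reasonable option when nothing is ever saved back.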

Shaping Up Data for the View

In a CQRS architecture, there’s no canonical rule as far as the organization of the application layer is concerned. It can be distinct for the command and query stacks or it can be a single layer. Most of the time, the decision is left to the vision of the architect. If you managed to keep dual storage and therefore have data made to measure for presentation, then you have almost no need for complex data retrieval logic. Consequently, the implementation of the query stack can be minimal. In the query portion of the application layer code, you can have direct data access code and directly call the Entity Framework DbContext entry point. However, to really keep things limited to query operations, you use the Database class instead of the native DbContext. Anything you get back from the Database class is then only queryable via LINQ. Figure 3 provides an example.

Figure 3 A Query from the Database Class Via LINQ

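using (var db = new Database())
{
  // Build the query; nothing is sent to the database yet.
  var queryable = from i in db.Invoices
                              .Include("Customers")
                  where i.InvoiceId == invoiceId
                  select new InvoiceFoundViewModel
                  {
                    Id = i.InvoiceId,
                    State = i.State.ToString(),
                    Total = i.Total,
                    Date = i.IssueDate,
                    // A custom method such as GetExpiryDate must be translatable
                    // by the LINQ provider, or be applied after data is loaded.
                    ExpiryDate = i.GetExpiryDate()
                  };
  // More code here
}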

As you can see, the actual projection of data you would get from the query is specified at the last minute, right in the application layer. This means you can create the IQueryable object somewhere in the infrastructure layer and move it around across the layers. Each layer has the chance to further modify the object, refining the query without actually running it. By doing so, you don’t need to create tons of transfer objects to move data across layers.
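Here's a small, self-contained sketch of an IQueryable object traveling from an infrastructure class to an application service, with the projection decided only at the very end. It runs against an in-memory list rather than Entity Framework, and the InvoiceSource, InvoiceService and InvoiceSummary classes are hypothetical names chosen for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Invoice
{
    public int InvoiceId { get; set; }
    public int BusinessUnitId { get; set; }
    public decimal Total { get; set; }
}

// Hypothetical view model, defined close to the presentation layer.
public class InvoiceSummary
{
    public int Id { get; set; }
    public decimal Total { get; set; }
}

// Infrastructure layer: hands out the raw queryable and nothing else.
public class InvoiceSource
{
    private readonly List<Invoice> _invoices;

    public InvoiceSource(List<Invoice> invoices)
    {
        _invoices = invoices;
    }

    public IQueryable<Invoice> Invoices
    {
        get { return _invoices.AsQueryable(); }
    }
}

// Application layer: refines the query and picks the projection last.
public class InvoiceService
{
    private readonly InvoiceSource _source;

    public InvoiceService(InvoiceSource source)
    {
        _source = source;
    }

    public List<InvoiceSummary> SummariesFor(int businessUnitId)
    {
        // Still only a query definition at this point.
        IQueryable<Invoice> query = _source.Invoices
            .Where(i => i.BusinessUnitId == businessUnitId);

        // The shape of the result is chosen here; ToList runs the query.
        return query
            .OrderBy(i => i.InvoiceId)
            .Select(i => new InvoiceSummary { Id = i.InvoiceId, Total = i.Total })
            .ToList();
    }
}
```

No intermediate DTO is needed between InvoiceSource and InvoiceService: the only object that crosses the boundary is the query itself.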

Let’s consider a fairly complex query. Say, for example, that for one of the use cases in the application you need to retrieve all invoices of a business unit that are still unpaid 30 days after the payment due date. If the query expressed a stable and consolidated business need, not subject to further adaptation along the way, it wouldn’t be such a tough query to write in plain T-SQL. The query is made of three main parts: get all invoices, select those specific to a given business unit and select those that still haven’t been paid after 30 days. From an implementation perspective, it makes no sense at all to split the original query into three subqueries and do most of the work in memory. Yet, from a conceptual perspective, splitting the query into three parts makes it far easier to understand, especially for domain newbies.

LINQ is the magic tool that lets you express the query conceptually and have it executed in a single step, with the underlying LINQ provider taking care of translating it into the proper query language. Here’s a possible way to express the LINQ query:

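var queryable = from i in db.Invoices
                            .Include("Customers")
                where i.BusinessUnitId == buId &&
                      // DbFunctions.DiffDays (System.Data.Entity, Entity Framework 6)
                      // returns whole days and can be translated to SQL.
                      DbFunctions.DiffDays(i.PaymentDueBy, DateTime.Now) > 30
                select i;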

The expression doesn’t return data, however. The expression just returns a query that hasn’t been executed yet. To execute the query and get related data, you must invoke a LINQ executor such as ToList or FirstOrDefault. Only at that point is the query built, putting together all the pieces and returning data.
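Deferred execution is easy to observe with an in-memory sequence. In the sketch below (the DeferredExecutionDemo class is just a name picked for this example), the source list changes after the query has been defined, and each executor sees whatever the list contains at the moment it runs.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    public static int[] Run()
    {
        var totals = new List<int> { 10, 200 };

        // Only defines the query; the list isn't read here.
        IQueryable<int> large = totals.AsQueryable().Where(t => t > 100);

        totals.Add(300);

        // Count is an executor: it runs the query now and sees 200 and 300.
        int countAtFirstRun = large.Count();

        // ToList is another executor: it takes a snapshot of the results.
        List<int> snapshot = large.ToList();

        totals.Add(400);

        // The snapshot stays at 2 items; running the query again finds 3.
        return new[] { countAtFirstRun, snapshot.Count, large.Count() };
    }
}
```

Run returns 2, 2 and 3. Against a database the same principle applies, except that each executor results in a round trip to the server.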

LINQ and Ubiquitous Language

An IQueryable object can be modified along the way. You can do that by calling Where and Select methods programmatically. In this way, as the IQueryable object traverses the layers of your system, the final query emerges through the composition of filters. And, more important, each filter is applied only where the logic on which it’s based is available.

Another relevant point to make about LINQ and IQueryable objects in the query stack of a CQRS architecture is readability and ubiquitous language. In domain-driven design (DDD), the ubiquitous language is the pattern that suggests you develop and maintain a vocabulary of unequivocal business terms (nouns and verbs) and, more important, reflect those terms in the actual code. A software system that implements the ubiquitous language, for example, will not delete the order but “cancel” the order and won’t submit an order but “check out.”

The biggest challenge of modern software isn’t in technical solutions, but in deep understanding of business needs and in finding a perfect match between business needs and code. To improve the readability of queries, you can mix together IQueryable objects and C# extension methods. Here’s how to rewrite the previous query to keep it a lot more readable:

var queryable = from i in db.Invoices
                            .Include("Customers")
                            .ForBusinessUnit(buId)
                            .Unpaid(30)
                select i;

ForBusinessUnit and Unpaid are two user-defined extension methods that extend the IQueryable<Invoice> type. All they do is add a WHERE clause to the definition of the ongoing query:

public static IQueryable<Invoice> ForBusinessUnit(
  this IQueryable<Invoice> query, int businessUnitId)
{
  // Appends a filter to the incoming query; nothing is executed here.
  var invoices =
    from i in query
    where i.BusinessUnit.OrganizationID == businessUnitId
    select i;
  return invoices;
}

Analogously, the Unpaid method will consist of nothing more than another WHERE clause to further restrict the data being returned. The final query is the same, but its expression is a lot clearer and harder to misunderstand. In other words, through the use of extension methods you nearly get a domain-specific language, which, by the way, is one of the goals of the DDD ubiquitous language.
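For completeness, here's one way the Unpaid method might look. Treat it as a sketch under assumptions: the IsPaid flag is a hypothetical property (the column doesn't show how payment is recorded), and the overload taking an explicit reference date is added only to make the filter easy to test.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Invoice
{
    public int InvoiceId { get; set; }
    public DateTime PaymentDueBy { get; set; }
    public bool IsPaid { get; set; }   // hypothetical flag, assumed for this sketch
}

public static class InvoiceQueryExtensions
{
    // Unpaid invoices whose due date passed more than 'days' days ago.
    public static IQueryable<Invoice> Unpaid(
        this IQueryable<Invoice> query, int days)
    {
        return Unpaid(query, days, DateTime.Now);
    }

    // Same filter with an explicit reference date.
    public static IQueryable<Invoice> Unpaid(
        this IQueryable<Invoice> query, int days, DateTime asOf)
    {
        // The cutoff is computed outside the expression, so the query
        // only contains a plain comparison against a captured value.
        DateTime cutoff = asOf.AddDays(-days);
        return query.Where(i => !i.IsPaid && i.PaymentDueBy < cutoff);
    }
}
```

Computing the cutoff date up front keeps date arithmetic out of the WHERE clause, which should make the filter easier for a LINQ provider to translate.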

Wrapping Up

If you compare CQRS to plain DDD, you see that in most cases you can reduce complexity of domain analysis and design by simply focusing on use cases and related models that back up actions that alter the state of the system. Everything else, namely actions that just read the current state, can be expressed via a far simpler code infrastructure and be no more complex than plain database queries.

It should also be noted that in the query stack of a CQRS architecture, sometimes even a full-fledged O/RM might be overkill. At the end of the day, all you need to do is query data, mostly from ready-made relational tables. There’s no need for sophisticated things such as lazy loading, grouping, joins—all you need is already there, as you need it. A nontrivial O/RM like Entity Framework 6 might even be too much for your needs, and micro O/RMs like PetaPoco or NPoco might even be able to do the job. Interestingly, this trend is also partly reflected in the design of the new Entity Framework 7 and the new ASP.NET 5 stack.


Dino Esposito is the author of “Microsoft .NET: Architecting Applications for the Enterprise” (Microsoft Press, 2014) and “Modern Web Applications with ASP.NET” (Microsoft Press, 2016). A technical evangelist for the .NET and Android platforms at JetBrains, and frequent speaker at industry events worldwide, Esposito shares his vision of software at software2cents.wordpress.com and on Twitter: @despos.