Using LINQ Expressions to Generate Dynamic Methods II

A beta of Visual Studio 2008 SP1 was released on Monday and the ADO.NET Entity Framework (EF) is now in the box! You can download and install the Beta here. The EF Extensions library has been updated to work with the beta and includes several public and internal changes. Source code is available at https://code.msdn.com/EFExtensions. The latest release introduces some performance improvements in the materializer (you can read about the library here). These improvements illustrate another powerful expression pattern.

To improve code clarity, the EF Extensions API encourages you to write:

var products = command.Materialize<Product>(r => new Product {
    ProductID = r.Field<int>("ProductID"),
    Name = r.Field<string>("Name"),
    …
}).ToList();

instead of:

List<Product> products = new List<Product>();
using (SqlDataReader reader = command.ExecuteReader()) {
    int idOrdinal = reader.GetOrdinal("ProductID");
    int nameOrdinal = reader.GetOrdinal("Name");
    …
    while (reader.Read()) {
        Product product = new Product {
            ProductID = (int)reader.GetValue(idOrdinal),
            Name = reader.IsDBNull(nameOrdinal) ? (string)null : (string)reader.GetValue(nameOrdinal),
            …
        };
        products.Add(product);
    }
}

There’s usually a tradeoff, and this is no exception. While the code in the first example is easier to write, read and maintain, looking up column ordinals on each call to Field<T> is expensive: for every column in every row, I’m incurring the cost of the lookup. Field<T> also verifies its arguments on every call and checks for DBNull whether or not the requested type accepts nulls. Most of this work is redundant or unnecessary for materialization.
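To make the per-call cost concrete, here is a minimal sketch of what a straightforward Field<T> might look like. This is an assumption for illustration, not the library's actual source; the class name NaiveRecordExtensions is hypothetical.

```csharp
using System;
using System.Data;

// Hypothetical, simplified stand-in for a Field<T> helper (not the EFExtensions source).
public static class NaiveRecordExtensions
{
    public static T Field<T>(this IDataRecord record, string name)
    {
        // Argument checks run on every call, i.e. once per column per row.
        if (record == null) throw new ArgumentNullException(nameof(record));
        if (name == null) throw new ArgumentNullException(nameof(name));

        // The name-to-ordinal lookup is repeated even though the schema never changes.
        int ordinal = record.GetOrdinal(name);

        // DBNull is checked even when T (for example, int) cannot hold null.
        object value = record.IsDBNull(ordinal) ? null : record.GetValue(ordinal);
        return (T)value;
    }
}
```

Each of the three commented steps is work that the hand-written loop either hoists out of the loop or skips entirely.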

Fortunately, there’s a simple solution to these problems. If we represent the “shaper” delegate as a LINQ expression, we can rewrite it (using the technique described here) before compiling it. Basically, we can replace calls to Field<T> with direct calls on the underlying reader, using column ordinals that are looked up once and cached. In the above example, the expression:

r => new Product() {ProductID = r.Field("ProductID"), Name = r.Field("Name")}

now becomes

r => new Product() {ProductID = Convert(r.GetValue(0)), Name = Convert(IIF(r.IsDBNull(1), null, r.GetValue(1)))}

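The rewritten version is equivalent to the more performant version we wrote by hand: ordinals are resolved once, and the DBNull check appears only where the target type accepts nulls.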
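A minimal sketch of such a rewrite follows. It assumes the public System.Linq.Expressions.ExpressionVisitor that ships with .NET 4 and later, and it hard-codes a single target method instead of using the library's attribute-based dispatch. RecordExtensions, FieldCallRewriter and the ordinals dictionary are hypothetical names for illustration, not EFExtensions types.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq.Expressions;

// Hypothetical stand-in for the Field<T> helper that the rewriter targets.
public static class RecordExtensions
{
    public static T Field<T>(this IDataRecord record, string name)
    {
        int ordinal = record.GetOrdinal(name);
        return record.IsDBNull(ordinal) ? default(T) : (T)record.GetValue(ordinal);
    }
}

// Replaces r.Field<T>("name") with ordinal-based calls on the record itself.
public sealed class FieldCallRewriter : ExpressionVisitor
{
    private readonly IDictionary<string, int> ordinals;

    private FieldCallRewriter(IDictionary<string, int> ordinals)
    {
        this.ordinals = ordinals;
    }

    public static Expression<Func<IDataRecord, TResult>> Rewrite<TResult>(
        Expression<Func<IDataRecord, TResult>> shaper, IDictionary<string, int> ordinals)
    {
        return (Expression<Func<IDataRecord, TResult>>)new FieldCallRewriter(ordinals).Visit(shaper);
    }

    protected override Expression VisitMethodCall(MethodCallExpression node)
    {
        // Only rewrite Field<T>(record, "constant name") where the column is known.
        if (node.Method.DeclaringType == typeof(RecordExtensions)
            && node.Method.Name == "Field"
            && node.Arguments[1] is ConstantExpression nameArg
            && nameArg.Value is string name
            && ordinals.TryGetValue(name, out int index))
        {
            Expression record = Visit(node.Arguments[0]);
            Expression ordinal = Expression.Constant(index);
            Expression value = Expression.Call(
                record, typeof(IDataRecord).GetMethod("GetValue"), ordinal);

            // Emit the DBNull check only when the requested type can hold null.
            Type type = node.Type;
            bool acceptsNull = !type.IsValueType || Nullable.GetUnderlyingType(type) != null;
            if (acceptsNull)
            {
                value = Expression.Condition(
                    Expression.Call(record, typeof(IDataRecord).GetMethod("IsDBNull"), ordinal),
                    Expression.Constant(null, typeof(object)),
                    value);
            }
            return Expression.Convert(value, type);
        }
        return base.VisitMethodCall(node);
    }
}
```

Calling FieldCallRewriter.Rewrite(shaper, ordinals).Compile() then yields a delegate that never calls GetOrdinal per row. The sketch leaves calls with non-constant column names untouched, which is a simplifying choice rather than something the original post specifies.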

The EFExtensions library uses an extensible pattern to perform these optimizations. Methods that can be optimized or rewritten are flagged with an attribute indicating a handler, in this example FieldMethodOptimizer:

[MaterializerOptimizedMethod(typeof(FieldMethodOptimizer))]
public static T Field<T>(this IDataRecord record, string name);

When materialization begins, field names from the reader are immediately retrieved. Whenever a method with this attribute is encountered in the shaper expression, the corresponding optimizer is called to rewrite the expression:

protected override Expression VisitMethodCall(MethodCallExpression m) {
    Expression result = base.VisitMethodCall(m);
    if (result.NodeType == ExpressionType.Call) {
        m = (MethodCallExpression)result;
        MaterializerOptimizedMethodAttribute attribute = m.Method.GetCustomAttributes(typeof(MaterializerOptimizedMethodAttribute), false)
            .Cast<MaterializerOptimizedMethodAttribute>()
            .SingleOrDefault(); // multiple attributes not permitted; not inherited
        if (null != attribute) {
            return attribute.Optimizer.OptimizeMethodCall(this.fieldNames, this.recordParameter, m);
        }
    }
    return result;
}

As in my previous post, I’m leveraging the ExpressionVisitor to do the rewrite. In this case, I’m intercepting and replacing MethodCallExpressions only.

End result: we can now use a more concise coding pattern without sacrificing performance. Unfortunately, we still need to pay the cost of compiling the materializer delegate, but this can be offset by reusing the delegate. To facilitate reuse, the Materializer class in EFExtensions is thread-safe and stores the optimized delegate on first use.
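One way to get the "compile once, reuse safely" behavior is to wrap the compilation in Lazy<T>. The sketch below is an assumption about how such caching could be done, not a description of the Materializer class itself; CachedShaper and IsCompiled are hypothetical names, and the expression-rewriting step is omitted for brevity.

```csharp
using System;
using System.Data;
using System.Linq.Expressions;
using System.Threading;

// Hypothetical wrapper: compiles the shaper on first use and reuses the delegate.
public sealed class CachedShaper<T>
{
    private readonly Lazy<Func<IDataRecord, T>> compiled;

    public CachedShaper(Expression<Func<IDataRecord, T>> shaper)
    {
        // ExecutionAndPublication ensures only one thread ever runs Compile().
        compiled = new Lazy<Func<IDataRecord, T>>(
            () => shaper.Compile(),
            LazyThreadSafetyMode.ExecutionAndPublication);
    }

    // True once the delegate has been compiled (exposed only for illustration).
    public bool IsCompiled
    {
        get { return compiled.IsValueCreated; }
    }

    public T Shape(IDataRecord record)
    {
        return compiled.Value(record);
    }
}
```

Holding one CachedShaper<T> per query shape (for example, in a static field) means the compilation cost is paid once per process rather than once per command.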