Category Archives: LINQ

Getting started with Linq Expressions

The Expression class is used to represent expression trees and is seen in use within LINQ. If you’ve been creating your own LINQ provider you’ll also have come across Expressions. For example see my post Creating a custom Linq Provider on this subject.

Getting started with the Expression class

Expression objects can be used in various situations…

Let’s start by looking at using Expressions to represent lambda expressions.

Expression<Func<bool>> e = () => a < b;

In the above we declare an Expression which takes a Func which takes no arguments and returns a Boolean. On the right hand side of the assignment operator we can see an equivalent lambda expression, i.e. one which takes no arguments and returns a Boolean.

From this Expression we can then get at the function within the Expression by calling the Compile method thus

Func<bool> f = e.Compile();

We could also create the same lambda expression using the Expression’s methods. For example

ConstantExpression lParam = Expression.Constant(a, typeof(int));
ConstantExpression rParam = Expression.Constant(b, typeof(int));
BinaryExpression lessThan = Expression.LessThan(lParam, rParam);
Expression<Func<bool>> e = Expression.Lambda<Func<bool>>(lessThan);

This probably doesn’t seem very exciting in itself, but if we can create an Expression from a lambda then we can also deconstruct an lambda into an Expression tree. So in the previous lambda example we could look at the left and right side of the a < b expression and find the types and other such things, we could evaluate the parts or simply traverse the expression and create database for it, but that’s a subject beyond this post.

An alternate use

An interesting use of Expressions can be found in many MVVM base classes (or the likes). I therefore take absolutely no credit for the idea.

The scenario is this. We want to create a base class for handling the INotifyPropertyChanged interface, it will look like this

public class PropertyChangedObject : INotifyPropertyChanged
{
   public event PropertyChangedEventHandler PropertyChanged;

   public void OnPropertyChanged(string propertyName)
   {
      if (PropertyChanged != null)
      {
         PropertyChanged(this, new PropertyChangedEventArgs(propertyName));
      }
   }
}

Next let’s write a simple class to use this, such as

public class MyObject : PropertyChangedObject
{
   private string name;

   public string Name
   {
      get { return name; }
      set
      {
         if (name != value)
         {
            name = value;
            OnPropertyChanged("Name");
         }
      }
   }
}

As you can see, within the setter, we need to check whether the value stored in the Name property is different to the new value passed to it and if so, update the backing field and then raise a property changed event passing a string to represent the property name.

An obvious problem with this approach is that “magic strings” can sometimes be incorrect (i.e. spelling mistakes or the likes). So it would be nicer if we could somehow pass the property name in a more typesafe and compile time checking way. It would also be nice to wrap the whole if block in an extension method which we can reuse in all the setters on our object.

Note: before we go much further with this, in .NET 4.5 there’s a better way to implement this code. See my post on the CallerMemberNameAttribute attribute.

So one way we could pass the property name, which at least ensure the property exists at compile time, is to use an Expression object which will then include all the information we need (and more).

Here’s what we want the setter code to look like

public string Name
{
   get { return name; }
   set { this.RaiseIfPropertyChanged(p => p.Name, ref name, value); }
}

The second and third arguments are self-explanatory, but for the sake of completeness let’s review them – the second argument takes a reference to the backing field. this will be set to the value contained within the third field only if the two differ. At which point we expect an OnPropertyChanged call to be made and the PropertyChanged event to occur.

The first argument is the bit relevant to the topic of this post, i.e. the Expression class.

Let’s look at the extension method that implements this and then walk through it

public static void RaiseIfPropertyChanged<TModel, TValue>(this TModel po, 
         Expression<Func<TModel, TValue>> e,  
         ref TValue backingField, 
         TValue value) where 
            TModel : PropertyChangedObject
{
   if (!EqualityComparer<TValue>.Default.Equals(backingField, value))
   {
      var m = e.Body as MemberExpression;
      if(m != null)
      {
         backingField = value;
         po.OnPropertyChanged(m.Member.Name);
      }
   }
}

The method can be used on any type which inherits from PropertyChangedObject, this is obviously so we get the method call OnPropertyChanged.

We check the equality of the backingField and value and obviously, only if they’re different do we bother doing anything. Assuming the values are different we then get the Body of the expression as a MemberExpression, on this the Member.Name property will be a string representing the name of the property supplied in the calling property, i.e. in this example the property name “Name”.

So now when we use the RaiseIfPropertyChanged extension method we have a little more type safety, i.e. the property passed to the expression must be the same type as the backing field and value and ofcourse a mis-spelled/none existent property will fail to compile as well, so lessens the chances of “magic string” typos. Obviously if we passed another property of the same type into the Expression then this will compile and seemingly work but the OnPropertyChanged event would be passed an incorrect property string and this is where the CallerMemberNameAttribute would help us further.

Some Linq Patterns

What follows is a list of a few patterns for specific tasks using LINQ

Combining two lists

We can combine two lists, where duplicates are removed using

var a = new List {0, 2, 3, 4, 5, 6};
var b = new List { 0, 1, 3, 4, 6, 6 };

var union = a.Union(b);
[/code]

Find the common items in both lists

var a = new List<int> {0, 2, 3, 4, 5, 6};
var b = new List<int> { 0, 1, 3, 4, 6, 6 };

var intersection = a.Intersect(b);

we can also use the lambda expression join

var l1 = new List<int> {1, 2, 4, 5};
var l2 = new List<int> { 0, 2, 4, 6 };

var r = from a in l1 join b in l2 on a equals b select a;

Find the differences between two lists

This will find the items in the first list which are not in the second list

var listA = new List<int> {0, 2, 3, 4, 5, 6};
var listB = new List<int> { 0, 1, 3, 4, 6, 6 };

var differences = from a in listA where listB.All(i => i != a) select a;

Combining every item in one list with every item in another list

Here we’re going to do a cross join, where the output enumerable is a made up of each element from the first list, combined with each element of the second list

var l1 = new List<int> {1, 2};
var l2 = new List<int> { 0, 2 };

var crossjoin = from a in l1
       from b in l2
       select new {a, b};

The output would be

1 0
1 2
2 0
2 2

I’ll add more as and when I remember to add them.

Creating a custom Linq Provider

To create a custom LINQ provider we need to implement the interfaces, IQueryable or IOrderedQueryable and IQueryProvider.

This is a summary of several posts/blogs I’ve read on the subject and source code available on github etc. See References below.

IQueryable<T>

To implement a minimal LINQ provider we need to implement IQueryable<>. This interface inherits from IEnumerable. The actual IQueryable methods are minimal

  1. ElementType
  2. Expression
  3. Provider

IOrderedQueryable<T>

If we want to support the sorting query operators, then we need to implement IOrderedQueryable<T>.

IOrderedQueryable inherits from IEnumerable, IEnumerable<T>, IOrderedQueryable, IQueryable, and IQueryable<T>.

We can implement an IOrderedQueryable that’s reusable for most situations, as follows

public class Queryable<T> : IOrderedQueryable<T>
{
   public Queryable(IQueryContext queryContext)
   {
      Initialize(new QueryProvider(queryContext), null);
   }

   public Queryable(IQueryProvider provider)
   {
     Initialize(provider, null);
   }

   internal Queryable(IQueryProvider provider, Expression expression)
   {
      Initialize(provider, expression);
   }

   private void Initialize(IQueryProvider provider, Expression expression)
   {
      if (provider == null)
         throw new ArgumentNullException("provider");
      if (expression != null && !typeof(IQueryable<T>).
             IsAssignableFrom(expression.Type))
         throw new ArgumentException(
              String.Format("Not assignable from {0}", expression.Type), "expression");

      Provider = provider;
      Expression = expression ?? Expression.Constant(this);
   }

   public IEnumerator<T> GetEnumerator()
   {
      return (Provider.Execute<IEnumerable<T>>(Expression)).GetEnumerator();
   }

   System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
   {
      return (Provider.Execute<System.Collections.IEnumerable>(Expression)).GetEnumerator();
   }

   public Type ElementType
   {
      get { return typeof(T); }
   }

   public Expression Expression { get; private set; }
   public IQueryProvider Provider { get; private set; }
}

Note: The IQueryContext is not part of the LINQ interfaces but one I’ve created to allow me to abstract the actual “custom” part of the code and allow maximum reuse of default functionality.

IQueryProvider

The IQueryProvider basically requires two methods (a generic and non-generic version of both) to be implemented. Obviously the generic versions are for strongly typed and the non-generic are loosely typed.

CreateQuery

The CreateQuery method is used to create a new instance of an IQueryable based upon the supplied expression tree.

Execute

The Execute method is used to actually execute a query expression, i.e. in a custom provider we will get the data from the datasource, for example a webservice or database etc. and it’s in the Execute method that we both get the data and do any conversion to, say SQL or the likes (if the LINQ query ultimately queries some other system.

An example of an implementation for an IQueryProvider might look like this…

public class QueryProvider : IQueryProvider
{
   private readonly IQueryContext queryContext;

   public QueryProvider(IQueryContext queryContext)
   {
      this.queryContext = queryContext;
   }

   public virtual IQueryable CreateQuery(Expression expression)
   {
      Type elementType = TypeSystem.GetElementType(expression.Type);
      try
      {
         return                
            (IQueryable)Activator.CreateInstance(typeof(Queryable<>).
                   MakeGenericType(elementType), new object[] { this, expression });
      }
      catch (TargetInvocationException e)
      {
         throw e.InnerException;
      }
   }

   public virtual IQueryable<T> CreateQuery<T>(Expression expression)
   {
      return new Queryable<T>(this, expression);
   }

   object IQueryProvider.Execute(Expression expression)
   {
      return queryContext.Execute(expression, false);
   }

   T IQueryProvider.Execute<T>(Expression expression)
   {
      return (T)queryContext.Execute(expression, 
                 (typeof(T).Name == "IEnumerable`1"));
   }
}

Note: Again, the IQueryContext is not part of the LINQ interfaces but one I’ve created to allow me to abstract the actual “custom” part of the code and allow maximum reuse of default functionality.

The TypeSystem class is taken from the post Walkthrough: Creating an IQueryable LINQ Provider

IQueryContext

The above shows some LINQ interfaces and some sample implementations. I’ve pointed out that IQueryContext is not part of LINQ but is instead something I’ve created (based upon reading various other implementations) to allow me to abstract the actual LINQ provider code specific to my provider’s implementation. Ofcourse we could have derived from QueryProvider, but for now we “plug-in” the data context instead. To change the implementation to derive from QueryProvider simple remove the IQueryContext (also from the Queryable implementation) and override the Execute methods.

For now I’ll continue this post using the IQueryContext, so here’s the interface

public interface IQueryContext
{
   object Execute(Expression expression, bool isEnumerable);
}

Implementing the IQueryContext

Whether the implementation of the actual Execute code (on the IQueryProvider) is abstracted into the IQueryContext or implemented within an alternate implementation of IQueryProvider. This is where the fun of actually “running” the LINQ query against your custom provider takes place.

When writing something like this…

var ctx = new Queryable<string>(new CustomContext())

var query = from s in ctx where s.StartsWith("T") select s;

what we are really doing is creating the query. Hence the CreateQuery methods on the IQueryProvider are called, but the actual data source or whatever supplies the data for your custom LINQ provider should not be called until we reach the execution phase, this is known as deferred execution. The execution phase takes place when we enumerate over the query or call methods, such as Count() etc. against a query.

foreach (var q in query)
{
   Console.WriteLine(q);
}

So at the point that we GetEnumerator implicitly via the foreach loop, that’s when the Execute method is called.

Executing the query

So, the Execute method is called and we will have an expression tree defined by LINQ supplied as an argument. We now need to translate that query into something useful, i.e. turn it into an SQL query for a custom database LINQ provider and then get the data for this query, or get data from a webservice and allow the query to be executed against the returned data etc.

As the actual decoding of the Expression is a fairly large subject in itself, I’ll leave that for another post, but suffice to say, there’s a lot we need to implement to duplicate some if not all of the “standard” LINQ query operators etc.

References

Walkthrough: Creating an IQueryable LINQ Provider
LINQ: Building an IQueryable provider series
LINQ and Deferred Execution