Category Archives: Programming

Debugging a release build of a .NET application

What’s a Release Build compared to a Debug build

Release builds of a .NET application (by default) add optimizations and strip debug-only code: anything inside #if DEBUG is removed, as are calls to the Debug class (whose methods are marked [Conditional("DEBUG")]). Note that, by default, Release builds still define the TRACE symbol, so Trace calls remain unless you turn that setting off. You also have reduced debug information. However you will still have .PDB files…
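
As a quick illustration (a minimal sketch, not from the original post), here is how conditional compilation and the Conditional attribute interact with the build configuration:

using System;
using System.Diagnostics;

public static class BuildExample
{
   public static void Run()
   {
#if DEBUG
      // compiled only when the DEBUG symbol is defined (Debug builds by default)
      Console.WriteLine("Debug-only code path");
#endif
      // Debug.WriteLine is marked [Conditional("DEBUG")], so the call site
      // itself is removed when DEBUG is not defined
      Debug.WriteLine("Removed from a default Release build");

      // TRACE is defined in both configurations by default, so this call
      // survives a default Release build
      Trace.WriteLine("Still present in a default Release build");
   }
}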

PDB files are generated by the compiler if a project’s properties allow for .PDB file to be generated. Simply check the project properties, select the Build tab and the Advanced… button. You’ll see Debug Info, which can be set to full, pdb-only or none. Obviously none will not produce any .PDB files.

Out of the box, Release builds use pdb-only whilst Debug builds use full. Broadly speaking, full also marks the assembly itself as debuggable (via the DebuggableAttribute), giving a full debugging experience even when attaching to an already running process, whereas pdb-only produces the same .PDB file but a process started outside the debugger may only be debuggable at the disassembly level.

So what are .PDB files?

Simply put – PDB files contain symbol information which allows us to map debug information to source files when we attach a debugger and step through the code.

Debugging a Release Build

It’s often the case that we’ll create a deployment of a Release build without the PDB files, perhaps to reduce the deployment footprint or for some other reason. If you cannot, or do not wish to, deploy the PDBs with an application then we should store them against each specific release version.

Before attaching our debugger (Visual Studio) we need to add the PDB file locations to Visual Studio. Select the Debug menu, then Options and Settings, and from there select Debugging | Symbols in the tree view on the left of the Options dialog. Click the add folder button and type in (or paste) the folder name containing the symbols for the specific Release build.

Now attach Visual Studio using Debug | Attach to Process and the symbols will be loaded for the build, allowing you to step through the source code.

Let’s look at a real example

An application I work on deploys over the network and we do not include PDB files with it, to reduce the size of the deployment. If we find a bug only repeatable in “production” we cannot step through the source code for that build unless we have both the version of the code related to the release and the PDB files for that release.

What we do is, when our continuous integration server runs, it builds a specific version of the application as a release build. We embed the source repository revision into the EXE version number. This allows us to easily check out the source related to that build if need be.
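
For example, the CI build might patch the assembly version attributes so that the last part carries the repository revision (the version and revision numbers below are purely illustrative):

// AssemblyInfo.cs, rewritten by the CI build; 4711 stands in for the repository revision
using System.Reflection;

[assembly: AssemblyVersion("2.1.0.4711")]
[assembly: AssemblyFileVersion("2.1.0.4711")]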

During the build process we then copy the release build to a deployment folder, again using the source code revision in the folder name. As already mentioned, we remove the PDB files (Tests and other such files are also removed). However, we don’t just throw the PDBs away; instead we copy them to a folder named similarly to the release build but with the word Symbols in the folder name (and of course with the same version number). The PDBs are all copied to this folder and are now accessible if we need to debug a release build.

Now if the Release (or a production) build is executed and an error occurs or we just need to step through code for some other reason, we can get the specific source for the deployed version, direct Visual Studio to the PDB files for that build and now step through our code.

So don’t just delete your PDBs; store them in case you need them in the future.

Okay, how do we use the symbol/PDB files

So, in Visual Studio (if you’re using that to debug/step through your code), open your project with the correct source for your release build.

In the Tools | Options dialog, select the Debugging parent node and then select Symbols, or of course just type Symbols into the search text box in Visual Studio 2013.

Now press the folder button and type in the location of your PDB files folder. Note that this option doesn’t have a folder browse option so you’ll need to type (or copy and paste) the folder name yourself.

Ensure the folder is checked so that Visual Studio will load the symbols.

Now attach the debugger to your release build and Visual Studio will (as mentioned) locate the correct symbols, load them and then allow you to step through your source.

See Specify Symbol (.pdb) and Source Files in the Visual Studio Debugger for more information and some screen shots of the process just described.

Downloading a file from URL using basic authentication

I had some code in an application I work on which uses Excel to open a .csv file from a URL. The problem is that users have moved to Excel 2010 (yes, we’re a little behind the latest versions) and basic authentication is no longer supported without registry changes (see Office file types fail to open from server).

So, to re-implement this I needed to write some code to handle the file download myself (as we’re not able to change users’ registry settings).

The code is simple enough, but I thought it’d be useful to document it here anyway.

// requires using System.Net;
using (WebClient client = new WebClient())
{
   // go via the default proxy and authenticate with basic credentials
   client.Proxy = WebRequest.DefaultWebProxy;
   client.Credentials = new NetworkCredential(userName, password);
   client.DownloadFile(url, filename);
}

This code assumes that the url is supplied to this code along with a filename for where to save the downloaded file.

We use a proxy, hence the proxy is supplied, and then we supply the NetworkCredential which will handle basic authentication. Here we need to supply the userName and password; of course, with basic authentication these will be passed as plain text over the wire.

SQL select case

As mentioned in other posts on SQL, I’m currently working on a simple database to maintain information on the plants I’m planting (or looking to plant) this year. I’m going to use this application to write some “reminders” on using SQL.

Note: Currently I’m using SQL Server for this database, so I’ve only tried this on SQL Server.

Select

A simple example which classifies plants by their maximum height.

select Name, MaxHeight,
   case when MaxHeight > 100 then 'Tall'
        when MaxHeight > 50 and MaxHeight <= 100 then 'Medium'
        when MaxHeight > 0 and MaxHeight <= 50 then 'Small'
        else 'No Information'
        end as Height
from Plant



Results

Each row of the result set shows the plant’s Name and MaxHeight along with the computed Height classification (‘Tall’, ‘Medium’, ‘Small’ or ‘No Information’).

SQL Basics (Part 4)

In the previous SQL Basics posts, we’ve looked at querying our data, CRUD operations and transactions.

Let’s now look at creating databases, tables etc. These statements come under the acronym DDL (Data Definition Language).

Creating a database

We can create a database easily in SQL Server using

CREATE DATABASE Employee;

Now, this is not accepted by every SQL database; for example, file-based databases would probably not support it, as the existence of the file itself is the database.

Creating Tables

To create a table on a database, we might use the following command in SQL Server

USE Employee;

This will ensure that we do not need to prefix our table definitions etc. with the database name. So we can create a table as

CREATE TABLE Person(
   Id int,
   FirstName nchar (20),
   LastName nchar (20),
   Age int
)

Note: the types supported by different databases may differ, but generally you’ll have numeric, string and date/time types (at least).

In the example Person table, we created a table with an Id, FirstName, LastName and Age. We’ve not created any primary key for this, so let’s now delete (or DROP) the table and try again with the following (see below for how to DROP the table)

CREATE TABLE Person(
   Id int IDENTITY(1, 1) not null,
   FirstName nchar (20) not null,
   LastName nchar (20) not null,
   Age int null,
   CONSTRAINT PK_Person PRIMARY KEY CLUSTERED
   (
      Id ASC
   )
)

Now we’ve also been more explicit in stating whether columns may be NULL (Note: NULL is the default for a column and some databases may not support explicit setting to NULL as it’s assumed NULL already). We’re using IDENTITY to create an auto incrementing ID. This may not be available in all SQL databases. Then finally we create a primary key constraint on the table named PK_Person.

DROP TABLE

To delete a table we call

DROP TABLE Person;

Obviously all the data will be lost when we drop a table. We may also need to remove relationships to the Person table first, if any exist.

ALTER TABLE

As its name suggests, ALTER TABLE allows us to amend a TABLE. It may also be used as part of the table creation process when foreign keys reference tables which have not yet been created, i.e. we might create all the tables and then use ALTER TABLE to create the foreign keys etc.

Let’s just make a simple change to add a Nationality field to the table

ALTER TABLE Person 
ADD Nationality nchar(2) null

Whoops, Nationality is a little on the small side, so we need to alter this column, let’s use

ALTER TABLE Person 
ALTER COLUMN Nationality nchar(20)

Or we could remove the column thus

ALTER TABLE Person 
DROP COLUMN Nationality

SQL Basics (Part 3)

CRUD

In previous SQL Basics posts we’ve looked at querying the database, but we actually need to get some data into the database. So let’s go over the Create, Retrieve, Update, Delete commands.

Create

To create data in SQL we use the INSERT keyword, for example

INSERT INTO Plants (CommonName, Type_Id) VALUES ('Strawberry', 2)

This creates a new plant with the CommonName Strawberry and the Type_Id 2.

Retrieve

We’re not going to go through this again as we’ve dealt with the SELECT query already (this is the way we retrieve data using SQL).

Update

The UPDATE keyword is used to alter data with SQL.

Beware, if no WHERE clause is used then ALL rows will be updated

UPDATE Plants SET CommonName = 'Strawberry' WHERE Id = 123

Delete

DELETE is used to remove items from the database using SQL. We can delete one or more rows at a time but only from one table at a time, for example

DELETE FROM Plants

Beware you can easily delete ALL rows from a table if you do not specify a WHERE clause

Transactions

We’ve finished looking at the CRUD operations, but you’ve probably noted some of the pitfalls of not correctly forming your DELETE or UPDATE queries. So let’s look at transactions, which allow us to make changes which are not permanent until the transaction is committed.

First off, let’s look at the concept of ACID…

ACID stands for Atomic, Consistent, Isolated and finally Durable.

A transaction is said to be atomic in that it either happens or doesn’t happen, i.e. we cannot have a partially altered set of data. It’s consistent if the transaction leaves the database in a consistent state. It’s isolated in that it occurs in a serial way and finally it’s durable if it’s “permanently” stored, i.e. it’s not kept in memory but stored on disc so it will still be available after a reboot (for example).

Transaction syntax requires we tell the database we’re beginning a transaction, then we run our SQL command before either committing the transaction or rolling it back. For example

BEGIN TRANSACTION
DELETE FROM Plants
--ROLLBACK TRANSACTION
--COMMIT TRANSACTION

Now, in the sample above I have started a transaction and deleted all Plants. If I now try to get all rows from the Plants table I’ll find they’re all gone. But uncommenting the ROLLBACK will allow us to cancel the transaction and return all the rows we appeared to have deleted.

Obviously, had deleting them been our intention, then we could instead uncomment the COMMIT line and commit our changes.

SQL Basics (Part 2)

JOINS

Joins allow us to combine multiple tables into a single result set. So, for example, we might want names of Plants from our plant database along with information from other tables that relates to each plant.

As a more concrete example, we might have a foreign key from our Plant table which relates to the Plant Type (i.e. whether a plant is a Tree, Vegetable, Flower etc.)

CROSS JOIN

Before we look at an example of a JOIN as outlined above, let’s look at a CROSS JOIN. A CROSS JOIN is basically a JOIN of two tables without any WHERE clause, for example

select p.CommonName, pt.Name 
from Plants p, PlantTypes pt

This type of join is expensive in that it simply takes two tables and merges the data, and in many ways it’s probably of little use in terms of the result set produced. For example, in my plant data at the moment I have 29 plants listed (not many I admit, but it’s early days). I also have 4 plant types. The result set of the CROSS JOIN above is 4 * 29 = 116 rows. Basically, the result set lists each plant CommonName against each plant type Name.

INNER JOIN

An INNER JOIN is generally the most used JOIN, whereby we look for all items from one table matching a column to data from another table. So, again using our Plants and PlantTypes tables, we might want to see the plant type associated with each plant in our database.

Note: We can create such a join without the INNER JOIN keywords, by default in SQL Server such joins are inner joins anyway

select p.CommonName, pt.Name 
from Plants p inner join PlantTypes pt
on p.Type_Id = pt.Id

So this assumes that Type_Id is a foreign key into the PlantTypes table and relates to its primary key.

FULL OUTER JOIN

A FULL OUTER JOIN returns all rows from both tables, with NULLs where there is no match on either side. So, for example, if we allowed NULLs for our Type_Id, or there are no matches for a PlantTypes Id, then we’ll see NULL values in the output columns. The query would look like

select p.CommonName, pt.Name 
from Plants p full outer join PlantTypes pt
on p.Type_Id = pt.Id

Our result set may now display a NULL CommonName (for example) if a PlantTypes Id does not have a matching Plant, i.e. we’ve not added any Trees yet to our Plants table.

Note: The FULL OUTER JOIN syntax is not supported by MySQL

LEFT OUTER JOIN

As we’ve seen with a FULL OUTER JOIN, we might have a PlantType which is not yet used in the Plants table and therefore we’ll see a NULL for the CommonName column. However we might also have a plant which doesn’t yet have a Type_Id and hence would have a NULL in that column.

If we only really care about the plants, not the types, i.e. we want all plants and their plant types regardless of whether the type is NULL, but no unused plant types, we can use a LEFT OUTER JOIN

select p.CommonName, pt.Name 
from Plants p left outer join PlantTypes pt
on p.Type_Id = pt.Id

In this case we get all rows from the left table whether or not they have a matching value in the right-hand table.

RIGHT OUTER JOIN

As you’ve guessed, a RIGHT OUTER JOIN returns all rows from the right-hand table (here PlantTypes), whether or not they have matching plants; plant types with no plants associated will appear with a NULL CommonName.

select p.CommonName, pt.Name 
from Plants p right outer join PlantTypes pt
on p.Type_Id = pt.Id

SELF JOIN

We’ve seen how to create joins with other tables, but we can in fact join a table with itself. There are no SELF JOIN keywords; it’s simply the concept of joining one table to itself. The most obvious use of such a join is with hierarchical data. For example, if a PlantType had a parent PlantType, we could join the child plant type to its parent on the same table, as sketched below.
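
A minimal sketch of such a self join, assuming a hypothetical ParentType_Id column on the PlantTypes table (not part of the schema used above):

select child.Name as PlantType, parent.Name as ParentType
from PlantTypes child
left outer join PlantTypes parent
on child.ParentType_Id = parent.Id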

SQL basics (Part 1)

Let’s take a look at some SQL basics.

Note: Unless otherwise stated, the queries are tested in SQL Server only

Select

So to retrieve data from a database using SQL we write

select <columns> from <table>

and optionally we can add a where clause to reduce the result set based upon Boolean logic.

select <columns> from <table> where <boolean logic>

We can view all columns within a table using the wildcard *. However, in a production environment (at least where an application is retrieving data from the database) we are better off specifying the columns we want, to improve performance and to ensure that if new columns are added (or the like) our query still produces the same “expected” result set, column-wise anyway.

So using the wild card we would have something like

select * from Plants

or specifying the required columns we use something like

select CommonName from Plants

or retrieving multiple columns

select CommonName, Genus from Plants

When specifying columns we might be querying from multiple tables so it’s best to alias the table name in case of column name duplication across different tables, hence we get something like

select p.CommonName from Plants p

Aliasing columns

Using the alias table query (from above) we will get a result set (in SQL Server Management Studio) with the column name CommonName. We can assign an alias to the column during a query and therefore change the name output for the column, for example

select p.CommonName as 'Common Name' from Plants p

This will now output a column named Common Name.

Aliasing can also be used on the output from functions etc. So, for example

select count(*) as Count from Plants p

will output a column named Count.

Count

Count returns a scalar value. We can get the number of rows in a table using the wildcard

select count(*) from Plants

This will return the total number of rows in the Plants table; however, using a column name within count returns the number of rows where that column is non-NULL, hence

select count(p.CommonName) from Plants p

Min/Max

Both MIN and MAX return a single value indicating the minimum or maximum non-NULL value in a column; both can be used on numeric or non-numeric columns.

Here’s an example of the usage

select max(p.Height) from Plants p

The above returns the maximum height found in the Plants table and

select min(p.Height) from Plants p

returns the minimum height.

AVG

AVG returns a single value indicating the average value within a selected column. AVG only works on numeric columns.

select avg(p.Height) from Plants p

SUM

SUM can be used on a numeric column and returns the sum of all its values

select sum(p.Height) from Plants p

DISTINCT

The distinct keyword allows us to get only distinct (non-duplicating) values, i.e.

select distinct p.Genus from Plants p

The above will basically remove duplicates.

GROUP BY

We can create sub groups of our data using the GROUP BY clause. For example, say we want to replicate the DISTINCT functionality by returning plant common names without duplicates; we can do this with GROUP BY as

select p.CommonName from Plants p group by p.CommonName

Basically we create groups based upon the CommonName and then output each group’s CommonName. But we can think of the result set as groups (or arrays) of data, so we can also do things like list a count of the number of rows sharing each name

select p.CommonName, count(p.CommonName) from Plants p group by p.CommonName

This will now list each distinct group name and list a count alongside it to show how many items have that CommonName.

HAVING

The HAVING clause is used in a similar way to the WHERE clause but for GROUP BY result sets. An example might be where we’ve grouped by the CommonName of a plant in our database but are only interested in those names with more than a certain number of occurrences, thus

select p.CommonName, count(p.CommonName) 
from Plants p 
group by p.CommonName having count(p.CommonName) > 3

Now we basically get a result set with the CommonName and the count for those CommonNames occurring more than 3 times.

Note: In SQL Server we cannot alias the count(p.CommonName) as c, for example and then use c in the having clause, whereas MySQL does allow this syntax
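
To illustrate the note, the following alias form is accepted by MySQL but rejected by SQL Server:

select p.CommonName, count(p.CommonName) as c
from Plants p 
group by p.CommonName having c > 3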

Type conversions in C#

Converting one type to another

All of the base types, such as Int32, Boolean, String etc., implement the IConvertible interface. This means we can easily change one type to another by using

float f = (float)Convert.ChangeType("100", typeof(float));

The thing to note regarding IConvertible is that it’s one way, i.e. it converts from your type (which implements IConvertible) to another type, but not back (this is where the TypeConverter class, which we’ll discuss next, comes into play).

So let’s look at a simple example which converts a Point to a string, and yes before I show the code for implementing IConvertible, we could have simply overridden the ToString method (which I shall also show in the sample code).

First off let’s create a couple of tests to prove our code works. The first takes a Point and using IConvertible, will generate a string representation of the type. As it uses ToString there’s no surprise that the second test which uses the ToString method will produce the same output.

[Fact]
public void ChangeTypePointToString()
{
   Point p = new Point { X = 100, Y = 200 };
   string s = (string)Convert.ChangeType(p, typeof(string));

   Assert.Equal("(100,200)", s);
}

[Fact]
public void PointToString()
{
   Point p = new Point { X = 100, Y = 200 };

   Assert.Equal("(100,200)", p.ToString());
}

Now let’s look at our Point type, with an overridden ToString method

public struct Point : IConvertible
{
   public int X { get; set; }
   public int Y { get; set; }

   public override string ToString()
   {
      return String.Format("({0},{1})", X, Y);
   }

   // ... IConvertible methods
}

and now let’s look at a possible implementation of the IConvertible

TypeCode IConvertible.GetTypeCode()
{
   throw new InvalidCastException("The method or operation is not implemented.");
}

bool IConvertible.ToBoolean(IFormatProvider provider)
{
   throw new InvalidCastException("The method or operation is not implemented.");
}

byte IConvertible.ToByte(IFormatProvider provider)
{
   throw new InvalidCastException("The method or operation is not implemented.");
}

char IConvertible.ToChar(IFormatProvider provider)
{
   throw new InvalidCastException("The method or operation is not implemented.");
}

DateTime IConvertible.ToDateTime(IFormatProvider provider)
{
   throw new InvalidCastException("The method or operation is not implemented.");
}

decimal IConvertible.ToDecimal(IFormatProvider provider)
{
   throw new InvalidCastException("The method or operation is not implemented.");
}

double IConvertible.ToDouble(IFormatProvider provider)
{
   throw new InvalidCastException("The method or operation is not implemented.");
}

short IConvertible.ToInt16(IFormatProvider provider)
{
   throw new InvalidCastException("The method or operation is not implemented.");
}

int IConvertible.ToInt32(IFormatProvider provider)
{
   throw new InvalidCastException("The method or operation is not implemented.");
}

long IConvertible.ToInt64(IFormatProvider provider)
{
   throw new InvalidCastException("The method or operation is not implemented.");
}

sbyte IConvertible.ToSByte(IFormatProvider provider)
{
   throw new InvalidCastException("The method or operation is not implemented.");
}

float IConvertible.ToSingle(IFormatProvider provider)
{
   throw new InvalidCastException("The method or operation is not implemented.");
}

string IConvertible.ToString(IFormatProvider provider)
{
   return ToString();
}

object IConvertible.ToType(Type conversionType, IFormatProvider provider)
{
   if(conversionType == typeof(string))
      return ToString();

   throw new InvalidCastException("The method or operation is not implemented.");
}

ushort IConvertible.ToUInt16(IFormatProvider provider)
{
   throw new InvalidCastException("The method or operation is not implemented.");
}

uint IConvertible.ToUInt32(IFormatProvider provider)
{
   throw new InvalidCastException("The method or operation is not implemented.");
}

ulong IConvertible.ToUInt64(IFormatProvider provider)
{
   throw new InvalidCastException("The method or operation is not implemented.");
}

TypeConverters

As mentioned previously, IConvertible allows us to convert a type to one of the base types, but what if we want more complex capabilities, converting both to and from various types? This is where the TypeConverter class comes in.

Here we develop our type as normal and then we adorn it with the TypeConverterAttribute at the struct/class level. The attribute takes a type derived from the TypeConverter class. This TypeConverter derived class does the actual type conversion to and from our adorned type.

Let’s again create a Point struct to demonstrate this on

[TypeConverter(typeof(PointTypeConverter))]
public struct Point
{
   public int X { get; set; }
   public int Y { get; set; }
}

Note: We can also declare the TypeConverter type using a string in the standard “Type, Assembly” format, i.e. [TypeConverter("MyTypeConverters.PointTypeConverter, MyTypeConverters")], if we wanted to reference a type in an external assembly.

Before we create the TypeConverter code, let’s take a look at some tests which hopefully demonstrate how we use the TypeConverter and what we expect from our conversion code.

[Fact]
public void CanConvertPointToString()
{
   TypeConverter tc = TypeDescriptor.GetConverter(typeof(Point));

   Assert.True(tc.CanConvertTo(typeof(string)));
}

[Fact]
public void ConvertPointToString()
{
   Point p = new Point { X = 100, Y = 200 };

   TypeConverter tc = TypeDescriptor.GetConverter(typeof(Point));

   Assert.Equal("(100,200)", tc.ConvertTo(p, typeof(string)));
}

[Fact]
public void CanConvertStringToPoint()
{
   TypeConverter tc = TypeDescriptor.GetConverter(typeof(Point));

   Assert.True(tc.CanConvertFrom(typeof(string)));
}

[Fact]
public void ConvertStringToPoint()
{
   TypeConverter tc = TypeDescriptor.GetConverter(typeof(Point));

   Point p = (Point)tc.ConvertFrom("(100,200)");
   Assert.Equal(100, p.X);
   Assert.Equal(200, p.Y);
}

So, as you can see, to get the TypeConverter for our class we call the static method GetConverter on the TypeDescriptor class. This returns an instance of our TypeConverter (in this case our PointTypeConverter). From this we can check whether the type converter can convert to or from a type, and then use the ConvertTo or ConvertFrom methods on the TypeConverter to convert the value.

The tests above show that we expect to be able to convert a Point to a string where the string takes the format “(X,Y)”. So let’s look at an implementation for this

Note: this is an example of how we might implement this code and does not have full error handling, but hopefully it gives a basic idea of what you might implement.

public class PointTypeConverter : TypeConverter
{
   public override bool CanConvertTo(ITypeDescriptorContext context, 
            Type destinationType)
   {
      return (destinationType == typeof(string)) || 
         base.CanConvertTo(context, destinationType);
   }

   public override object ConvertTo(ITypeDescriptorContext context, 
            CultureInfo culture, 
            object value, 
            Type destinationType)
   {
      if (destinationType == typeof(string))
      {
         Point pt = (Point)value;
         return String.Format("({0},{1})", pt.X, pt.Y);
       }
       return base.ConvertTo(context, culture, value, destinationType);
   }

   public override bool CanConvertFrom(ITypeDescriptorContext context, 
            Type sourceType)
   {
      return (sourceType == typeof(string)) ||
         base.CanConvertFrom(context, sourceType);
   }

   public override object ConvertFrom(ITypeDescriptorContext context, 
            CultureInfo culture, 
            object value)
   {
      string s = value as string;
      if (s != null)
      {
         s = s.Trim();

         if(s.StartsWith("(") && s.EndsWith(")"))
         {
            s = s.Substring(1, s.Length - 2);

            string[] parts = s.Split(',');
            if (parts != null && parts.Length == 2)
            {
               Point pt = new Point();
               pt.X = Convert.ToInt32(parts[0]);
               pt.Y = Convert.ToInt32(parts[1]);
               return pt;
            }
         }
      }
      return base.ConvertFrom(context, culture, value);
   }
}

How to, conditionally, stop XML serializing properties

Let’s assume we have this simple C# class which represents some XML data (i.e. it’s serialized to XML eventually)

[XmlType(AnonymousType = true)]
public partial class Employee
{
   [XmlAttribute(AttributeName = "id")]
   public string Id { get; set; }

   [XmlAttribute(AttributeName = "name")]
   public string Name { get; set; }

   [XmlAttribute(AttributeName = "age")]
   public int Age { get; set; }
}

Under certain circumstances we may prefer to not include elements in the XML if the values are not suitable.

We could handle this in a simplistic manner by setting a DefaultValueAttribute on a property, whereby the data will not be serialized unless the value differs from the default. But this is not so useful if we want more complex logic to decide whether a value should be serialized, for example if we don’t want to serialize Age when it’s less than 1 or greater than 100, or not serialize Name if it’s null, empty or less than 3 characters long, and so on.
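
For completeness, the simplistic DefaultValueAttribute approach might look something like this (a sketch; the class name is illustrative):

using System.ComponentModel;
using System.Xml.Serialization;

[XmlType(AnonymousType = true)]
public class EmployeeWithDefault
{
   // with XmlSerializer, an Age equal to the default (0) is simply
   // omitted from the serialized output
   [XmlAttribute(AttributeName = "age")]
   [DefaultValue(0)]
   public int Age { get; set; }
}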

ShouldSerializeXXX

Note: You should not use a ShouldSerializeXXX method and the DefaultValueAttribute on the same property

So, we can achieve this more complex logic using a ShouldSerializeXXX method. If we create a partial class (shown below) and add the ShouldSerializeName method, we can tell the serializer not to bother serializing the Name property under these more complex circumstances

public partial class Employee
{
   public bool ShouldSerializeName()
   {
      // serialize Name only when it's non-null/non-empty and at least 3 characters long
      return !String.IsNullOrEmpty(Name) && Name.Length >= 3;
   }
}

When serializing this data the methods are called by the serializer to determine whether a property should be serialized and obviously if it should not be, then the element/attribute will not get added to the XML.
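
As a quick (hypothetical) check of the behaviour, serializing an Employee with a two-character name should produce XML without the name attribute:

using System;
using System.IO;
using System.Xml.Serialization;

class Program
{
   static void Main()
   {
      var employee = new Employee { Id = "1", Name = "Al", Age = 30 };

      var serializer = new XmlSerializer(typeof(Employee));
      using (var writer = new StringWriter())
      {
         serializer.Serialize(writer, employee);
         // "Al" is under 3 characters, so ShouldSerializeName returns false
         // and the name attribute is omitted from the output
         Console.WriteLine(writer.ToString());
      }
   }
}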

Entity Framework – lazy & eager loading

By default Entity Framework will lazy load any related entities. If you’ve not come across lazy loading before, it basically means the item is not retrieved and/or created until you actually want to use it. For example, the code below shows that the AlternateNames list is not instantiated until the property is first accessed.

public class Plant
{
   private IList<AlternateName> alternateNames;

   public virtual IList<AlternateName> AlternateNames
   {
      get
      {
         // create (and keep) the list on first access
         return alternateNames ?? (alternateNames = new List<AlternateName>());
      }
   }
}

So, as you can see from the example above, we only create an instance of the IList the first time the AlternateNames property is accessed.

As stated at the start of this post, by default Entity Framework defaults to lazy loading which is perfect in most scenarios, but let’s take one where it’s not…

If you are returning an instance of an object (like Plant above), AlternateNames is not loaded until it’s referenced. However, if you were to pass the Plant object over the wire using something like WCF, AlternateNames will not have been instantiated; the caller/client will try to access the AlternateNames property and of course it can no longer be loaded, since the original data context is not available on the client. What we need to do is ensure the object is fully loaded before passing it over the wire. To do this we need to eager load the data.

Eager Loading is the process of ensuring a lazy loaded object is fully loaded. In Entity Framework we achieve this using the Include method, thus

return context.Plants.Include("AlternateNames");
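
From Entity Framework 4.1 onwards there is also a strongly typed Include extension method (in the System.Data.Entity namespace) which avoids the magic string:

return context.Plants.Include(p => p.AlternateNames);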