Author Archives: purpleblob

Python classes

In my previous post I looked at some of the basics of Python. In this post I want to dig further into the syntax etc. for classes within Python.

Class naming conventions

We define a class using the class keyword. Python naming convention suggests the class name should be Pascal case (CapWords), i.e. ClassName.

The method naming convention, by default, should be all lowercase with underscores (snake case) to separate words, i.e. method_name.

Private/protected methods and instance variables should have an underscore prefix, i.e. _private_name.

Defining a class

The special method name __init__ can be thought of a little like a constructor; we cannot overload it, but we can pass arguments into it

class Animal:
    def __init__(self, name):
        self.name = name

Note: Whilst it might appear possible to have multiple __init__ methods, it's the last one declared that is available to calling code (later definitions simply replace earlier ones).

We terminate the line the class keyword is on with a colon; methods and variables then start on new lines, indented consistently (PEP 8 recommends four spaces).

The self parameter passed to __init__ or any other instance method is supplied implicitly, i.e. we do not pass self ourselves; Python does that for us when the method is called on an instance.
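To make the implicit self concrete, here's a small sketch (the Animal class mirrors the one above, with a hypothetical speak method added); calling the method on the instance and calling it on the class with the instance passed explicitly are equivalent:

```python
class Animal:
    def __init__(self, name):
        self.name = name

    def speak(self):
        return self.name + " makes a sound"

a = Animal("Rex")
# these two calls are equivalent; in the first, Python supplies self for us
print(a.speak())         # Rex makes a sound
print(Animal.speak(a))   # Rex makes a sound
```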

Inheritance

Python classes support inheritance (even multiple inheritance) by declaring the base classes in a comma separated list within parentheses, i.e.

class Animal:
    def __init__(self, name):
        self.name = name


class Mammal:
    pass


class Dog(Animal, Mammal):
    def __init__(self):
        super().__init__("Dog")

In this example you can see how we can call the base class' __init__ method. The Dog class derives from both the Animal and Mammal classes (in this instance the Mammal class has no methods etc., hence uses the pass statement to create an empty class).

Calling base methods in Multiple Inheritance

Multiple inheritance always raises the question of which base method is called in situations where there are multiple methods of the same name, for example if we have the following

class A:
    def name(self):
        print("A")

class B:
    def name(self):
        print("B")


class C(A, B):
    pass

c = C()
c.name()

then which base class method is actually called via c.name()?

A quick note: as class C does not define its own name() method, the above can be seen as equivalent to the following

class C(A, B):
    def name(self):
        super(C, self).name()

The order in which methods are resolved is determined by the Method Resolution Order (MRO).

For our simple example, the first name() method located depends upon the order of inheritance, i.e. in the example above A's name() is called. Switching the code for C to

class C(B, A):
    pass

results in B’s name method being called.
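The MRO isn't magic and can be inspected directly via the class's __mro__ attribute; a quick sketch using the same A, B and C classes from above:

```python
class A:
    def name(self):
        print("A")

class B:
    def name(self):
        print("B")

class C(A, B):
    pass

# the Method Resolution Order is available on the class itself
print([cls.__name__ for cls in C.__mro__])  # ['C', 'A', 'B', 'object']
```

Here A comes before B, which is why A's name() wins when C(A, B) is used.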

To ensure we are explicit about which code is called, it’s best to override the name method in the subclass like this

class C(A, B):
    def name(self):
        B.name(self)

and obviously this ensures B’s name is called.

Methods

We declare methods within a class using the keyword def, and all methods are virtual, i.e. they can be overridden in a subclass. Here's a basic example of method overriding

class Animal:
    def name(self):
        pass


class Dog(Animal):
    def name(self):
        return "Dog"

In this example we’re using the Animal a little like an abstract class and then implementing the name method in derived classes.
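If we want Animal to be a genuinely abstract class, the standard library's abc module can enforce this (an addition to the post's original code, sketched here for completeness); instantiating the abstract base then raises a TypeError:

```python
from abc import ABC, abstractmethod

class Animal(ABC):
    @abstractmethod
    def name(self):
        ...

class Dog(Animal):
    def name(self):
        return "Dog"

print(Dog().name())  # Dog
# Animal() would raise TypeError because name is abstract
```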

Member variables/fields

Let’s rewrite the previous code to now use a member variable in the base class to store the name of the Animal.

class Animal:
    _name = ""

    def name(self):
        return self._name


class Dog(Animal):
    def __init__(self):
        self._name = "Dog"

In this case the _ prefix denotes a protected/private variable. In most OO languages this encapsulation would ensure the member is not accessible outside of the base and derived classes; in Python, however, the underscore is only a convention and the variable remains accessible. In PyCharm we get a hint that we're accessing a protected member, but ultimately it's still accessible, as per the example below

a = Dog()
print(a._name)

Empty classes

In some cases we might want to create an empty class type, such as

class Duck:
    pass

If we then need to create the equivalent of a struct at runtime, we can simply "add" variables dynamically like this

d = Duck()
d.name = "Duck"

this leads us onto duck typing…

Duck typing

Duck typing allows us to pass a Python type into other types that simply expect certain methods/variables to exist, for example

class Car:
    def __init__(self, engine):
        self.engine = engine

    def engine_size(self):
        return self.engine.size


class EconomicalEngine:
    size = 1.0


class SportEngine:
    size = 3.5


car = Car(SportEngine())
print(car.engine_size())

In this example our Car has an engine but we do not define an Engine type; instead we simply state that the Car expects a type with a size variable. This does of course mean we could pass in anything with a size variable (which might not always make much sense).
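To illustrate that caveat: anything with a size attribute is accepted, however nonsensical. A sketch using the same Car class with a hypothetical Bucket class standing in for an engine:

```python
class Car:
    def __init__(self, engine):
        self.engine = engine

    def engine_size(self):
        return self.engine.size

class Bucket:
    size = 10  # litres; not an engine at all, but it quacks like one

# duck typing happily accepts the Bucket because it has a size attribute
car = Car(Bucket())
print(car.engine_size())  # 10
```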

Static/class methods

Up until now we've seen instance methods on classes. Python also includes two decorators that can be used in conjunction with our methods to make them act in a similar way to static methods.

The first of these is the decorator @staticmethod, for example

class Dog:
    @staticmethod
    def animal_type():
        return "Dog"

# and we call the method like this
print(Dog.animal_type())
# or
d = Dog()
print(d.animal_type())

Python also includes another static-like method type which is decorated with @classmethod. A class method differs from a static method in that it includes an implicit cls variable (by convention cls is used for the first argument). For example

class Dog:
    @classmethod
    def name(cls):
        return "English Springer Spaniel"

# and we call the method like this
print(Dog.name())
# or
d = Dog()
print(d.name())

So what's the difference? Both allow us to work in a "static" method way, but a @classmethod also receives the class itself via the implicit cls argument, almost as though the class were passed into the method. Let's assume we have this

class Dog:
    _name = "Dog"

    @classmethod
    def name(cls):
        print(cls)
        return cls._name

d = Dog()
d._name = "English Springer Spaniel"

print(d.name())
# same as
print(Dog().name())

In the above we set _name on the instance d, but both calls still output "Dog": the classmethod receives the Dog class as cls, so cls._name reads the class attribute, not the instance attribute we changed.
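Where @classmethod does differ usefully from a hard-coded class reference is under inheritance: cls is whichever class the method was called on, so subclasses are picked up automatically. A sketch (the classes here are illustrative, not from the post above):

```python
class Animal:
    _name = "Animal"

    @classmethod
    def name(cls):
        # cls is the class the method was called on, not necessarily Animal
        return cls._name

class Dog(Animal):
    _name = "Dog"

print(Animal.name())  # Animal
print(Dog.name())     # Dog
```

A @staticmethod, by contrast, receives no class at all, so it cannot adapt to the subclass in this way.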

Python basics

I've used Python (or more specifically IronPython) in the past as a scripting engine for my app, but never really bothered with Python beyond that. However, I decided I needed to spend a little more time with Python, so am going to post a few "getting started" posts, starting with this one…

Note: This post and any others on Python are aimed at developers coming from another language, such as C#, Java or C++. So I will not be going over every capability of the language, but will aim to cover the main areas of syntax etc. to simply get things working in Python.

Python basics

Python files are usually saved with the .py extension and are interpreted, although there are tools for compiling the scripts to executables.

Python is a dynamic language, hence we do not need to declare the type of a variable.

Python uses whitespace and indentation to denote blocks of code, i.e. in C#, Java etc. we'd use curly braces { } to denote a block of code, whereas Python expects the code blocks to be indented; for example in C# we might have

public double Pi()
{
   return 3.14;
}

the same function in Python would be

def pi():
    return 3.14

This also demonstrates creating a function in Python (see Functions, below for more information on function definitions).

Also notice that Python does not use line termination like the semi-colon in C#, Java etc. This means if we need to extend our code onto another line we can use the backslash, i.e.

def pi():
    return \
    3.14

Types

Python is a dynamically typed language, which has good and bad points. This basically means that when we assign to a variable name it's automatically created, and its type is determined by the value assigned. The type can change as we assign different types to the variable, for example

value = "Hello World"
# value is of type str
value = 123
# value is now of type int

Note: # is used for a single line comment; we can use """ to start and end a block of text to use as a multi-line comment, although technically speaking these denote multi-line strings.

Python types include the usual suspects, such as str, int, float and bool, along with complex numbers, tuples and more (in Python 3, int handles arbitrarily large values, so there is no separate long type).

If we need to find the type at runtime then we can use the type method, for example

value = (1, "Hello")
print(type(value))

In this code, the first line creates a tuple and the next line outputs (to the console) the type of the value variable, in this case the output will be

<class 'tuple'>

Enumerations

Enumerations are declared by subclassing the Enum type; we need to import the enum module and then we can declare our Enum subclass as follows

import enum

class Gender(enum.Enum):
    Male = 1
    Female = 2

# in use
g = Gender.Female
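Enum members carry both a name and a value, and support lookup in either direction; a quick sketch extending the example above:

```python
import enum

class Gender(enum.Enum):
    Male = 1
    Female = 2

g = Gender.Female
print(g.name)          # Female
print(g.value)         # 2
print(Gender(1))       # Gender.Male  (lookup by value)
print(Gender["Male"])  # Gender.Male  (lookup by name)
```

One pitfall worth knowing: a stray trailing comma after a member (Male = 1,) silently turns its value into the tuple (1,).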

Functions

Python allows us to define functions using the def keyword; here's the example we introduced earlier

def pi():
    return 3.14

def is used to declare or define our function; arguments may be declared within the parentheses (in this instance there are none), and we end the function declaration with a colon. The statements executed within the function then sit below the declaration, indented.

Executing functions uses pretty standard syntax, i.e.

value = pi()

Because Python resolves types dynamically, when we have code like the following

def add(a, b):
    return a + b

we have a situation where both of the following bits of code work

strValue = add("2", "3")
intValue = add(2, 3)

but in the first instance we get the string “23” and in the second we get the int 5.

Obviously if our intent is that the add function is for numerical types, then we will need to give type hints, for example

def add(a: float, b: float) -> float:
    return a + b

The strange thing here (if you're used to similar techniques in F#, for example) is that running the previous example of the add function with strings and ints will still work, i.e. it still returns "23" and 5. So it's debatable how useful hints are at runtime. Certainly PyCharm will display a "hint" over the string arguments and tell us the expected type, but ultimately we can still pass different types into the add function's arguments.
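The hints are recorded on the function itself, via its __annotations__ attribute, which is how tools like PyCharm and mypy see them even though the interpreter doesn't enforce them; a sketch:

```python
def add(a: float, b: float) -> float:
    return a + b

# the hints are stored, but not enforced at runtime
print(add.__annotations__)
# {'a': <class 'float'>, 'b': <class 'float'>, 'return': <class 'float'>}

print(add("2", "3"))  # still prints 23 (string concatenation)
print(add(2, 3))      # 5
```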

If you want to include type hints without showing them in your source, or you want to hint on existing code that you do not (or cannot) edit, then you can include a .pyi file alongside the .py file, separating the hints from the actual implementation. So for example our .py file might have

def add(a, b):
    return a + b

and the file with the same name but .pyi extension would have

def add(a: float, b: float) -> float: ...

Object Orientated

Python allows us to define classes using the class keyword, for example

import http.client

class WebClient:
    def __init__(self):
        self.url = "www.google.co.uk"
        self.method = "GET"

    def submit(self):
        connection = http.client.HTTPConnection(self.url)
        connection.request(self.method, "/")
        response = connection.getresponse()
        data = response.read()
        print(response.status, response.reason)
        print(data)

client = WebClient()

The __init__ function is the equivalent of a constructor and self is analogous to this in C#. Within the __init__ we have declared new variables along with some default values, i.e. the url and method variables.

In the case where we do not have anything to set up/initialize in the __init__ function, we can use the pass keyword. Here's an example of this, along with how we can derive/subclass a type

class Animal:
    def __init__(self):
        pass

class Dog(Animal):
    def __init__(self):
        self.legs = 4

A good old, OO example whereby Animal is the base class and Dog subclasses this and adds functionality/fields etc.

As you've probably already noticed, Python doesn't include a new keyword; to create an instance of a class we simply use

animal = Dog()

Importing modules & packages

Like most languages, we can create reusable source code modules (and packages) that can then be imported into our source code, for example let’s import a Python standard library for connecting to an HTTP endpoint

import http.client

connection = http.client.HTTPConnection("www.google.co.uk")
connection.request("GET", "/")
response = connection.getresponse()
data = response.read()

In the code above, we import the http.client library and then create an HTTPConnection instance which we then use to connect to google.co.uk.

We can import multiple modules using comma separated values, i.e.

import http.client, sys
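Python also supports a couple of other import forms worth knowing: importing individual names from a module, and aliasing a module. A quick sketch (note that constructing an HTTPConnection doesn't open a connection; no network traffic happens until request() is called):

```python
# import a single name from a module
from http.client import HTTPConnection

# import a module under an alias
import http.client as client

# both refer to the same class
print(HTTPConnection is client.HTTPConnection)  # True

connection = HTTPConnection("www.google.co.uk")
```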

Creating modules & packages

Modules are simply ways to define reusable code within separate files. Packages are in essence namespaces, which can contain multiple packages and modules.

To create a module, we simply create a file, for example web.py and include our functions and types etc. within that file, so web.py looks like this

import http.client

class WebClient:
    def __init__(self):
        self.url = "www.google.co.uk"
        self.method = "GET"

    def submit(self):
        connection = http.client.HTTPConnection(self.url)
        connection.request(self.method, "/")
        response = connection.getresponse()
        data = response.read()
        print(response.status, response.reason)
        print(data)

and in our file that uses this module, we have the following

import web

client = web.WebClient()
client.submit()

Packages take this a little further, still using the file-based approach. Packages are simply directories with one or more module files, but every (regular) package must include a file named __init__.py, which can be empty. This __init__.py file may contain an __all__ variable defining which modules are exported, i.e. non-exported modules can be internal or private to the package, while exported ones are visible to those using the package. For example our __init__.py file might look like this

__all__ = ['WebClient']

# example with multiple exports
__all__ = ['WebClient', 'SomethingElse']

However, __all__ only comes into play when using the wildcard import syntax, for example

from web import *

See the answers to Can someone explain __all__ in Python? for much more information on the use of __all__.

Documenting your API’s using XML documentation

I’ve been updating one of my GitHub repositories (my Presentation.Core repo.) with some more documentation, using XML documentation comments, and thought I’d write a refresher post for taking your code from no documentation all the way through to being fully documented.

<summary>

So, /// is used for XML documentation comments in C#, for example

/// <summary>
/// A Task/async aware command object. 
/// </summary>
public class AsyncCommand : CommandCommon
{
}

In the above, the <summary> element (and its corresponding end element) within the /// acts as a summary of our class. This class level summary is displayed in Visual Studio IntelliSense when declaring a variable of type AsyncCommand; likewise, when we write code such as new AsyncCommand(), IntelliSense will display the summary for the default constructor (if such documentation exists).

The summary tag is the most important tag for documenting our code, as it acts as the primary source of information for IntelliSense and within generated help files.

<remarks>

The <remarks> element is optional and can be used to expand upon the summary or add supplementary documentation to the summary element. For example

/// <summary>
/// A Task/async aware command object. 
/// </summary>
/// <remarks>
/// Automatically handles changes to the built-in, IsBusy flag, so when
/// the command is executing the IsBusy is true and when completed or not 
/// executing IsBusy is false.
/// </remarks>
public class AsyncCommand : CommandCommon
{
}

Summary text is displayed by IntelliSense whereas Remarks are shown in the help file output via tools such as Sandcastle.

<returns>

As you might guess, the <returns> element is used on method return values, for example

/// <summary>
/// Implementation of CanExecute which returns a boolean
/// to allow the calling method to determine whether the
/// Execute method can be called
/// </summary>
/// <param name="parameter">An object is ignored in this implementation</param>
/// <returns>True if the command can be executed, otherwise False</returns>
public override bool CanExecute(object parameter)
{
}

<param>

Continuing with the elements available for a method, the param element is used for method arguments, to allow the documentation to list what each parameter is used for.

/// <summary>
/// Implementation of CanExecute which returns a boolean
/// to allow the calling method to determine whether the
/// Execute method can be called
/// </summary>
/// <param name="parameter">An object is ignored in this implementation</param>
/// <returns>True if the command can be executed, otherwise False</returns>
public override bool CanExecute(object parameter)
{
}

<value>

Like <returns> but used on a property, the <value> element is used along with a summary to describe what the property represents.

<para>

The <para> element is used within summary, remarks or return elements to add paragraph formatting/structure to your documentation.

<exception>

The <exception> element is used (as the name suggests) to list exceptions a method may throw during its execution. For example

/// <summary>
/// Constructor take a comparison function which expects two types of T and
/// returns an integer where a value less than 0 indicates the first item is 
/// before the second, a value of 0 indicates they're equivalent 
/// and a value greater than 0 indicates the first item is after
/// the second item. This constructor also takes a function for the object 
/// hash code method.
/// </summary>
/// <param name="objectComparer">
/// A function to compare two items of type T
/// </param>
/// <exception cref="System.NullReferenceException">
/// Thrown when the objectComparer is null
/// </exception>
public ComparerImpl(
   Func<T, T, int> objectComparer)
{
   _objectComparer = 
      objectComparer ?? 
      throw new NullReferenceException("Comparer cannot be null");
}

IntelliSense will display the list of exceptions

<see>

The <see> element allows us to reference documentation elsewhere within our code; it accepts a cref argument within which we reference the classname.methodname etc. For example

/// <summary>
/// Sets the property value against the property and raises
/// OnPropertyChanging, OnPropertyChanged etc. as required.
/// <see cref="GetProperty{T}"/>
/// </summary>
protected bool SetProperty<T>(
   Func<T> getter, 
   Func<T, T> setter, 
   T value, 
   [CallerMemberName] string propertyName = null)
{
}

Note: When referencing another method within the same class we need not declare the class name within the cref, but when referencing another class we should use the classname.methodname syntax.

Within IntelliSense and documentation the full method name will be displayed for the see element’s cref.

<seealso>

Usage for <seealso> is as per the see element, but generated documentation will create a "See Also" section and place such references within it.

<typeparam>

The <typeparam> element is used to document the generic parameters passed to a class and/or methods, for example

/// <summary>
/// Implements IComparer&lt;T&gt;, IEqualityComparer&lt;T&gt; 
/// and IComparer
/// </summary>
/// <typeparam name="T">The type being compared</typeparam>
public class ComparerImpl<T> : 
   IComparer<T>, IEqualityComparer<T>, IComparer
{
}

<paramref>

Used within a summary or remarks, the <paramref> is used to refer to a parameter on the method being documented. For example

/// <summary>
/// Gets the current value via a Func via the <paramref name="generateFunc"/>
/// </summary>
/// <typeparam name="T">The property type</typeparam>
/// <param name="generateFunc">The function to create the return value</param>
/// <param name="propertyName">The name of the property</param>
/// <returns>The value of type T</returns>
protected T ReadOnlyProperty<T>(
   Func<T> generateFunc, 
   [CallerMemberName] string propertyName = null)
{
}

In the above, generateFunc is displayed within the IntelliSense summary and highlighted (via an italic font) in generated help files.

<typeparamref>

Like paramref, the <typeparamref> element can be used within summary, remarks etc. to link to a specific generic type, for example

/// <summary>
/// Gets the current property value as type <typeparamref name="T"/>
/// </summary>
protected T GetProperty<T>(
   Func<T> getter, 
   Func<T, T> setter, 
   [CallerMemberName] string propertyName = null)
{
}

As per paramref, generated documentation may highlight the name within a help file, and it's also displayed via IntelliSense within the summary.

<list>

The <list> element allows us to define formatted text in a bullet, number or table format within our summary block, for example

/// <summary>
/// Acts as a base class for view models, can be used
/// with backing fields or delegating getter/setter 
/// functionality to another class - useful for situations
/// where the underlying model is used directly
/// <list type="bullet">
/// <item><description>GetProperty</description></item>
/// <item><description>SetProperty</description></item>
/// <item><description>ReadOnlyProperty</description></item>
/// </list>
/// </summary>
public class ViewModelWithModel : ViewModelCommon
{
}

This results in generated documentation showing a bullet list of items; within IntelliSense this isn't quite so pretty, the descriptions simply being listed with space delimitation.

<example>

The <example> element allows us to add example code to our documentation, for example

/// <summary>
/// Sets the property value against the property and raises
/// OnPropertyChanging, OnPropertyChanged etc. as required
/// </summary>
/// <example>
/// <code>
/// public string Name
/// {
///    set => SetProperty(value);
///    get => GetProperty();
/// }
/// </code>
/// </example>
protected bool SetProperty<T>(
   T value, 
   [CallerMemberName] string propertyName = null)
{
}

Note: Without the <code> element the example is not formatted as we'd expect. This results in a generated help file section named Examples.

<code>

The <code> element is used within the example element to format code samples. See above.

<c>

The <c> element may be used within a summary or other element and formats the text within it as code.

<permission>

As you probably guessed, we can highlight the permissions expected or required for a method etc. using the <permission> element. For example

/// <summary>
/// Gets the current property value
/// </summary>
/// <permission cref="System.Security.PermissionSet">
/// Unlimited access to this method.
/// </permission>
protected T GetProperty<T>([CallerMemberName] string propertyName = null)
{
}

The above will result in generated documentation with a security section which includes a table of the cref value and the description for the permissions required for the method.

<include>

The <include> element allows us to use documentation from an external XML file.

How do we generate the XML documents?

Once we’ve documented our code using the XML documentation we will need to get Visual Studio to generate the XML files for the documents, basically extracting the comment blocks into these external files.

For each project you wish to generate documentation for, select (within Visual Studio) the properties for the project and then the Build tab. In the Output section, check the XML documentation file check box and either accept the default file name or create your own.

References

Documenting your code with XML comments

Extending Cake – creating a Cake extension method

There may come a time when cake's existing commands do not do everything you want, so, given that the cake DSL is basically C#, we can easily create our own code.

In this post I'm going to create a DLL, which is pretty standard, but we'll make it fit into the cake ecosystem by making it an alias; this will make it simple to use and also give us access to the ICakeContext for logging etc.

First let’s see how it’ll be used inside cake

#r "./tools/SvnVersioning.dll" 

Task("Version")
    .Description("Generates a version based upon current SVN checkin")
    .Does(() =>
{
    // svnPath can be null if svn.exe is accessible via path
    var svnPath = @"C:\svn\svn-win32-1.6.0\bin\";
    var current = MakeAbsolute(Directory("."));
    var version = GetCurrentVersion(svnPath, current.FullPath);
    Information("Version: {0}", version);
});

Now we create a class library (named SvnVersioning, as used in the above script).

We need to add a reference to Cake.Core to allow us access to the ICakeContext and cake attributes. Here's the code for the GetCurrentVersion method (again, as seen in use in the script above)

public static class SvnVersionAliases
{
   [CakeMethodAlias]
   public static string GetCurrentVersion(this ICakeContext context, string svnPath, string path)
   {
      var REVISION_HEADING = "Last Changed Rev:";
      var revision = "10101";
      try
      {
         context.Log.Information("Solution path {0}", path);
         context.Log.Information("Starting svn {0}", svnPath);

         var svn = "svn.exe";
         if (!String.IsNullOrEmpty(svnPath))
         {
            svn = svnPath + @"\" + svn;
         }

         var proc = new Process();
         var procInfo = new ProcessStartInfo(svn)
         {
            CreateNoWindow = true,
            UseShellExecute = false,
            RedirectStandardOutput = true,
            Arguments = "info " + path
         };

         proc.StartInfo = procInfo;
         proc.Start();
         var output = proc.StandardOutput.ReadLine();

         while (!String.IsNullOrEmpty(output))
         {
            if (output.StartsWith(REVISION_HEADING))
            {
               revision = output.Replace(REVISION_HEADING, String.Empty).Trim();
            }
            output = proc.StandardOutput.ReadLine();
         }

         //revision is restricted to 65535 - and svn is already past this!
         //so mod by 10000!
         revision = (Convert.ToInt32(revision) % 10000).ToString();
         return revision;
      }
      catch (Exception ex)
      {
         context.Log.Information(ex.Message);
         return revision;
      }
   }
}

The CakeMethodAlias attribute simply marks the code as a method alias; we can also add the CakeAliasCategoryAttribute on a method or class, which is used for documentation of methods/properties. The CakeNamespaceImportAttribute is used to "hint" at additional namespaces that are required by the alias.

Getting started with Cake

Before I start getting into cake usage, I want to just list some problems I had during the Getting Started process. I’m not blaming cake for these, but just wanting to put these issues out there in case others hit the same.

  • It does state in the Getting Started page, “NOTE: If downloading the latest zip file, ensure that it is unblocked otherwise you will run into issues when attempting to execute Cake.” and I forgot this step initially. If you do not unblock before unzipping, you’ll end up with all sorts of Powershell security problems.
  • NuGet.exe needed to be located on your machine (not a big deal) and copied to the Tools folder – I couldn't find this mentioned in the Getting Started guide, so I'm not quite sure whether this was supposed to be in the example zip, downloaded manually, or whatever
  • When I eventually got cake to run I kept hitting this error “Error: Failed to install tool ‘NUnit.ConsoleRunner’.”. The .cake script specifies that NUnit.ConsoleRunner version 3.4.0 is required, but this continually failed to work. I ran NuGet list NUnit.ConsoleRunner and found that the package didn’t exist. When I ran NuGet sources I noticed nuget api v2 was disabled, enabling this using NuGet.exe source Enable -Name “https://www.nuget.org/api/v2/” solved this. I’m not sure why cake was unable to use nuget.org (v3) which was enabled, but this solved the problem for me.

Make, Rake, Fake, Cake…

So first there was make, the good old (and complicated to understand) C/C++ build tool, then there were others ant/nant etc. Then along comes fake, the F# build tool and now cake the C# build tool. When I say F# or C#, I simply mean they use the syntax of F# or C# as opposed to nant’s (for example) XML, for creating the build etc. process.

The Basics

I'm going to solely discuss running the cake executable against a .cake script here, not a PowerShell ps1 file, hence you don't need PowerShell to try these commands out.

By default, if you run cake without supplying a specific .cake file (the extension can be anything you prefer if specifying the cake file name), cake will try to run build.cake.

A .cake file will have one or more Tasks (these are analogous to Targets in nant, and strangely cake has the command RunTarget to run the tasks). A task can of course have some meaningful name, such as Build, Clean or Deploy (or whatever your preference). Let's look at a simple task from Cake's example

Task("Clean")
    .Does(() =>
{
    CleanDirectory(buildDir);
});

In the above we have a task named Clean and this runs the code CleanDirectory, which is a DSL method to, as I'm sure you guessed, clean the supplied folder. The buildDir is declared in the example cake file as

var buildDir = Directory("./src/Example/bin") + Directory(configuration);

We can also compose multiple tasks using the DSL’s IsDependentOn method, such as

Task("Restore-NuGet-Packages")
    .IsDependentOn("Clean")
    .Does(() =>
{
    NuGetRestore("./src/Example.sln");
});

So now, when the Restore-NuGet-Packages task is run cake will first run the Clean task.

You’ll need to specify some form of command line argument to allow you to run one of these tasks, i.e. to pass a string from the command line into your cake script. This is done by declaring a line like this

var target = Argument("target", "Default");

In this case we've declared a target command line switch with the default value of Default (for which you'll obviously need to create a matching task name). To run this command line using cake, you use a single hyphen, such as

.\tools\Cake\Cake.exe --verbosity=Verbose -target=Clean

Note: the Cake exe command line switch uses a double hyphen (--), whereas an argument to be passed into the cake script uses a single hyphen (-).

I’m not going to cover the DSL here, for that, go check out the Cake Reference.

Implementing a Fluent interface

I’m still undecided as to whether I like fluent interface API designs or not – certainly, when written well, they add a certain readability to code, but they can also easily become overly verbose. However this post is not about whether they’re good or bad, but more about how we might implement such a pattern.

The API pattern “Fluent Interface” was coined in 2005 (according to Wikipedia) but the basic premise of the pattern has been around for far longer.

The general idea is to design an API which is more readable and properties etc. are assigned using cascading methods (or method chaining).

Let’s see an example

Let’s look at a simple example using object initializer syntax.

I’m not going to bother including the Result class in this post as the full source can be found on my GitHub repos

var r = new Result
{ 
   Value = 0,
   StatusCode = 102,
   Message = "Failed to load"
};

a possible API using a Fluent interface might be

var r = ResultBuilder
   .Create()
   .WithValue(0)
   .WithStatusCode(102)
   .WithMessage("Failed to load")
   .Build();

We’re actually using the builder class to create the Result object and set the properties/fields on it. In this example we’re not actually creating the Result object until the Build method is called. In some implementations we might create the Result object upfront, i.e. when we first start supplying properties.

Implementation

The basics of method chaining are simple enough: each builder method returns the builder instance itself, i.e. return this (or a copy). In the builder class we’re going to implement, the Create method returns a new instance of the builder, each WithXXX method stores an expression that will later be used to build the Result object, and finally the Build method creates the Result and applies the stored expressions to it.

Here’s the source for the ResultBuilder

public class ResultBuilder
{
   private readonly List<Action<Result>> _actions = 
      new List<Action<Result>>();

   public static ResultBuilder Create()
   {
      return new ResultBuilder();
   }
  
   public ResultBuilder WithValue(int value)
   {   
      _actions.Add(a => a.Value = value);
      return this;
   }

   public ResultBuilder WithStatusCode(int statusCode)
   {   
      _actions.Add(a => a.StatusCode = statusCode);
      return this;
   }

   public ResultBuilder WithMessage(string message)
   {   
      _actions.Add(a => a.Message = message);
      return this;
   }

   public Result Build()
   {
      var result = new Result();
      foreach(var action in _actions)
         action(result);

      return result;
   }
}

As you can see, we have a way (in this case, a static method) to create the builder, then we use the fluent API of the builder to create the expressions that will be applied to the object we intend to build. This means we could create a builder that acts as a factory for pre-defined Result objects.

If we don’t need the lazy creation portion of the code, i.e. we’re always going to create a result, then we could of course simply use something like

public class ResultBuilder
{
   private readonly Result result = new Result();

   public static ResultBuilder Create()
   {
      return new ResultBuilder();
   }
  
   public ResultBuilder WithValue(int value)
   {   
      result.Value = value;
      return this;
   }

   public ResultBuilder WithStatusCode(int statusCode)
   {   
      result.StatusCode = statusCode;
      return this;
   }

   public ResultBuilder WithMessage(string message)
   {   
      result.Message = message;
      return this;
   }

   public Result Build()
   {
      return result;
   }
}

Reading the BOM/preamble

Sometimes we get a file with BOM (byte order mark, or preamble) bytes at the start, which denote the encoding of a Unicode file. We don’t always care about these bytes and simply want to remove the BOM (if one exists).

Here’s some fairly simple code which reads a stream or file, with the code to “skip the BOM” at the bottom

using (var stream = 
   File.Open(currentLogFile, 
      FileMode.Open, 
      FileAccess.Read, 
      FileShare.ReadWrite))
{
   var length = stream.Length;
   var bytes = new byte[length];
   var numBytesToRead = (int)length;
   var numBytesRead = 0;
   do
   {
      // read the file in chunks of 1024
      var n = stream.Read(
         bytes, 
         numBytesRead, 
         Math.Min(1024, numBytesToRead));

      numBytesRead += n;
      numBytesToRead -= n;

   } while (numBytesToRead > 0);

   // skip the UTF-8 BOM if the file starts with one
   var bom = new UTF8Encoding(true).GetPreamble();

   return bytes.Length >= bom.Length &&
      bom.SequenceEqual(bytes.Take(bom.Length)) ?
      bytes.Skip(bom.Length).ToArray() :
      bytes;
}
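For comparison, the same check is much shorter in a language with a startswith-style helper. Here’s a minimal, language-agnostic sketch in Python (the function name is my own):

```python
# The UTF-8 BOM is the three bytes EF BB BF at the start of the data.
UTF8_BOM = b"\xef\xbb\xbf"

def skip_bom(data: bytes) -> bytes:
    # drop the BOM if the data starts with it, otherwise return it unchanged
    if data.startswith(UTF8_BOM):
        return data[len(UTF8_BOM):]
    return data

skip_bom(b"\xef\xbb\xbfhello")  # b"hello"
skip_bom(b"hello")              # unchanged
```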

Record & Playback HTTP requests using Fiddler

I’ve written about the Scotch library previously, but I have scenarios where I cannot intercept HTTP requests directly within my code, e.g. where SOAP-generated client code handles the requests for us.

So what we want to do is use Fiddler initially to intercept responses for our service calls, save these and then replay them offline.

Fiddler comes with a great feature, called AutoResponder, which allows us to intercept calls and automatically respond with previously saved data.

Just select the AutoResponder tab in Fiddler.

All we need to do now is add a rule. Here we can create rules for specific URL patterns, i.e. intercept just some calls or all URLs. We then assign a file (likely one of our previously saved response files) and now we have Fiddler acting as the service, responding to requests with our locally saved files.

Don’t forget to “Enable rules” when you’ve saved all your rules and you probably want “Unmatched requests passthrough” also checked to allow other HTTP requests to pass through the rules.

If you’re using a .NET client application, remember you’ll need the following within your App.config,

<system.net>
   <defaultProxy>
      <proxy bypassonlocal="false" usesystemdefault="true" />
   </defaultProxy>
</system.net>

Here’s an example of a simple bit of .NET code that can be used to check everything is working; just add an AutoResponder rule for the URL myurl/test

var client = new HttpClient();
var r = client.GetStringAsync(new Uri(
   "http://myurl/test")).Result;

References

AutoResponder Reference

Trying to get my head around CQRS

Let’s start by decoding the acronym CQRS – this stands for Command and Query Responsibility Segregation. From this we can see the intent is to segregate our code into command and query responsibilities.

So what’s a command in this context and what’s a query?

  • A command is “any method that mutates state” (analogous to a setter type operation)
  • A query is “any method that returns a value” (analogous to a getter type operation)

It’s probably pretty obvious, but we can liken commands to the CRUD operations CREATE, UPDATE and DELETE, whereas RETRIEVE maps to a query.

There’s nothing too complicated about this except in terms of how we’d code such things. The idea would be to split our code into two classes whereby we might have something like this

public class EmployeeCommands
{
   public void CreateEmployee(int employeeId /* other params */) {}
   public void UpdateEmployee(int employeeId /* other params */) {}
   public void DeleteEmployee(int employeeId) {}
}

public class EmployeeQueries
{
   public Employee GetEmployee(int employeeId) { /* implementation */ }
}

If adhering strictly to the CQRS pattern then we can also think of queries as returning state but never changing it, and commands as changing state and always returning void (i.e. not returning anything). This can become too restrictive in some situations, but one should have a valid reason for deviating from this pattern, otherwise you’re not really following the CQRS way.

This is basically the core design principle behind creating a CQRS architecture. One of the key concepts here is that the separation of commands and queries means we can also do away with a common domain or data model shared by both. Conceptually this separation gives us more freedom, not only in designing our data models, but also in scenarios whereby our reads might come from a different location/store to our writes.

Another interesting side effect of separating commands and queries is that, generally, a command will warrant a different set of security requirements/permissions to a query. This is not to say we do not require security on queries, especially when sensitive data exists, but obviously we might be far more restrictive on who can amend a data object than on who can view one. With this separation we can declare the permissions on our read and write data sources differently if we want.

Separating reads and writes!?

I previously stated that we might have our reads coming from one location and our writes from another, this may sound (initially at least) slightly strange. However, several documents on the concepts behind CQRS mention this scenario.

Notice, I said might; obviously this isn’t a requirement for us to be using a CQRS architecture.

It’s not unusual for a data store, such as a database, to exist alongside one or more data caches. The idea being that writes are sent to the DB, the DB updates, and the caches are then updated to keep in sync with the DB; but when we query data it’s (depending upon requirements) far faster to get the data from a cache. Hence we have a decoupling of the write and the read, so to speak, which means we might maintain a single DB but multiple caches at different locations to ensure the fastest access/reads.
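As a rough, language-agnostic illustration of this read/write split, here’s a minimal sketch in Python (the class and method names are hypothetical): the command writes to the “database” and refreshes the cache, while the query reads only from the cache.

```python
# Sketch of CQRS-style separation with a write-side store and a read-side
# cache. In a real system the cache update might be asynchronous, which is
# where "eventual consistency" comes in.
class EmployeeStore:
    def __init__(self):
        self._db = {}     # stands in for the write-side database
        self._cache = {}  # stands in for a read-side cache

    # command: mutates state, returns nothing
    def update_employee(self, employee_id, data):
        self._db[employee_id] = data
        self._cache[employee_id] = data  # keep the cache in sync

    # query: returns state from the cache, mutates nothing
    def get_employee(self, employee_id):
        return self._cache.get(employee_id)
```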

You may have thought – “if we have a situation whereby the DB is written to but the cache then gets updated, isn’t there a possible scenario where the cache’s data is not yet ‘consistent’ with the DB’s data?”

This is indeed a consequence of this separation; we’re really (sort of) doing something that a lot of NoSQL data stores do, which is to have “eventual consistency”. This is eventual consistency not at the DB level (as in MongoDB) but at an architectural level. In other words, our DB will be immediately consistent but the caches will be eventually consistent.

From my understanding, the above is basically what we, as developers, need to do or how we need to think in terms of implementing CQRS.

Most CQRS documents also go on to talk about architectural design; for example, we see CQRS documents talk about Event Stores as part of the overall design. An event store records each transaction/change to the data as a single entity, and can be used to regenerate the whole database from these entities. However, I do not want to go into the architectural choices/options of CQRS in this post.

References

CQRS Journey
CQRS Journey Source
CQRS, Task Based UIs, Event Sourcing agh!
Building an Event Storage

C# 7 Tuples

C#/.NET has had support for the Tuple class (a reference type) since .NET 4, but with C# 7 tuples now gets some “syntactic sugar” (and a new ValueTuple struct) to make them part of the language.

Previous to C# 7 we’d write code like this

var tuple = new Tuple<string, int>("Four", 4);

var firstItem = tuple.Item1;

Now, in a style more like F#, we can write the following

var tuple = ("Four", 4);

var firstItem = tuple.Item1;

This code makes the creation of tuples slightly simpler but we can also make the use of the “items” within a tuple more readable by using names. So we could rewrite the above in a number of ways, examples of each supplied below

(string name, int number) tuple1 = ("Four", 4);
var tuple2 = (name: "Four", number: 4);
var (name, number) =  ("Four", 4);

var firstItem1 = tuple1.name;
var firstItem2 = tuple2.name;
var firstItem3 = name;

Whilst not quite as nice as F#, we also now have something similar to pattern matching on tuples. For example, we might be interested in a switch statement based upon the numeric (second) item of the named tuple2 above; we can write

switch (tuple2)
{
   case var t when t.number == 4:
      Console.WriteLine("Found 4");
      break;
}

Again, similar to F#, we can also discard items using the underscore _, for example

var (_, number) = CreateTuple();

In this example we assume the CreateTuple returns a tuple type with two items. The first is simply ignored (or discarded).