Index and Range in C#

C# 8.0 added two types, Index and Range, along with some syntactic sugar in the form of .. to represent a Range. Both are structs. An Index is essentially an int that also records whether it counts from the end, with some static and “helper” methods for working with indices. As you might expect, a Range is made up of a start Index and an end Index. Let’s have a look at both types in a little more depth…

Index

An Index is a very simple type which is best understood by using it on a collection. Let’s therefore assume we have some collection type which, of course, allows you to get the item at a specific index.

This is hardly an exciting addition in itself, but what it does is give us some “standard” code for handling things like getting the nth item from the end of the collection, using the ^ operator. Let’s see this in action by creating a simple collection

public class MyCollection
{
  private readonly string[] _collection;

  public MyCollection()
  {
    _collection = new[]
    {
      "One",
      "Two",
      "Three",
      "Four",
      "Five"
    };
  }

  public string this[Index idx] => _collection[idx.GetOffset(_collection.Length)];
}

The only real difference from how we’d have written this without an Index is that we call the Index GetOffset method, passing the collection length, to get the actual index number into the collection. What this really does is allow us to write code such as

var c = new MyCollection();
Console.WriteLine(c[^2]);

The ^2 simply means give me the 2nd item from the end, which results in “Four” being written to the console. ^1 is the equivalent of _collection[_collection.Length - 1], hence you can see that ^2 is the equivalent of _collection[_collection.Length - 2]. The GetOffset helper method handles this conversion for us. Of course we can still write the following

var c = new MyCollection();
Console.WriteLine(c[2]);

Which, as you’d expect, returns “Three”, the item at index 2.
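To make the Index helpers a little more concrete, here is a minimal sketch using a plain string array (arrays understand Index directly). The array contents mirror MyCollection and the variable names are just for illustration; it assumes C# 9.0 or later for the top-level statements.

```csharp
using System;

string[] items = { "One", "Two", "Three", "Four", "Five" };

Index fromStart = 2;   // an int converts implicitly to an Index
Index fromEnd = ^2;    // the same as Index.FromEnd(2)

Console.WriteLine(fromEnd.IsFromEnd);                 // True
Console.WriteLine(fromEnd.Value);                     // 2
Console.WriteLine(fromEnd.GetOffset(items.Length));   // 3
Console.WriteLine(items[fromStart]);                  // Three
Console.WriteLine(items[fromEnd]);                    // Four
Console.WriteLine(fromEnd.Equals(Index.FromEnd(2)));  // True
```

Note that Value is the number as written (2), whereas GetOffset turns it into a real position once it knows the length.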

Range

A Range is built from two Index values, for example

var range = new Range(1, 10);

This creates a Range with a start of 1 and an end of 10 (when a Range is used to take a slice, the end is exclusive). Where the Range becomes more useful is in the syntactic sugar; the range shown in the snippet above may be written as

var range = 1..10;

This creates a new Range just as the first snippet did, but once we start to use this in collection types or our own types we get some neat syntax for taking slices of collections. It should be noted that, in the current implementation of collections in .NET, arrays support Range and hence can return slices of arrays, whereas other collections do not explicitly support Index and Range but can instead be used with them through the compiler’s implicit, pattern-based support.

Of course we can create our own types to do something similar. Before we look at implementing such things, let’s quickly review the basic syntax possibilities/expectations

// short cut for getting a slice [0..10]
var firstToN = array[..10]; 
// short cut for getting the item 2 from the end
var nthItemFromEnd = array[^2]; 
// short cut for getting all items into a copy of the array
var all = array[..]; 
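Here is a small runnable sketch of those shortcuts on a concrete array, so the results are visible. The array and variable names are made up for the example; GetOffsetAndLength is the Range helper that reports which part of a collection of a given length the range covers.

```csharp
using System;

int[] array = { 0, 1, 2, 3, 4, 5 };

int[] firstThree = array[..3];   // 0, 1, 2 (the end index is excluded)
int[] middle = array[1..^1];     // 1, 2, 3, 4
int[] lastTwo = array[^2..];     // 4, 5

Console.WriteLine(string.Join(", ", firstThree));
Console.WriteLine(string.Join(", ", middle));
Console.WriteLine(string.Join(", ", lastTwo));

// Ask the Range where it starts and how many items it spans
var (offset, length) = (1..^1).GetOffsetAndLength(array.Length);
Console.WriteLine($"offset={offset}, length={length}");   // offset=1, length=4
```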

Support for Index

We can write our own types to explicitly understand Index and Range, however existing types can also support Index and Range simply by exposing certain standard members. Let’s demonstrate this by changing our MyCollection type, replacing the indexer with this one (i.e. removing the explicit Index support).

public string this[int idx] => _collection[idx];

If you try to compile c[^2] you’ll now get a compiler error, cannot convert from ‘System.Index’ to ‘int’. So how do existing types work with Index? This is due to “implicit Index support”, which basically means the compiler looks for certain named members with certain expected parameters.

The first requirement is that the type is Countable, which basically means it has a property named Length or Count with an accessible getter which returns an int. This covers arrays and collections such as the List type. Let’s therefore add this to our type

public int Count => _collection.Length;

Now the compiler error will go away. If you think about it, it’s quite obvious what’s happening. In our original code for the indexer

public string this[Index idx] => 
   _collection[idx.GetOffset(_collection.Length)];

We of course need the length of the collection as well as the Index. The compiler, using the “implicit Index support”, needs the same. So once we add the Count or Length property, it can get that information and generate the code we would otherwise have to write ourselves, and thus the indexer [int idx] now receives the offset computed from the Index.

So to wrap up “implicit Index support”, your type needs the following

  • The type is Countable i.e. has Length or Count property accessible and returning an int
  • The type has an accessible indexer which takes an int
  • The type does not have an accessible indexer which takes an Index as the first parameter
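As a rough illustration of those three rules, the sketch below uses a hypothetical CountedCollection (a cut-down stand-in for MyCollection) that has only a Count property and an int indexer. The second call shows, approximately, what the compiler turns c[^2] into.

```csharp
using System;

var c = new CountedCollection();

// Count plus an int indexer is enough for the compiler to accept an Index
Console.WriteLine(c[^2]);            // Four
// which it treats roughly as
Console.WriteLine(c[c.Count - 2]);   // Four

// Hypothetical type for illustration only
public class CountedCollection
{
  private readonly string[] _collection = { "One", "Two", "Three", "Four", "Five" };

  public int Count => _collection.Length;

  public string this[int idx] => _collection[idx];
}
```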

Support for Range

Let’s change our calling code to try to get a slice of our collection, so now we have

var c = new MyCollection();
Console.WriteLine(c[1..^2]);

This will immediately show the compile time error “cannot convert from ‘System.Range’ to ‘int'”. Let’s therefore add implicit support for Range to our collection.

Again the type needs to be Countable and must not have an accessible indexer which takes a Range (otherwise the compiler would assume the type is explicitly handling Range). Our type currently meets both conditions.

As you’d imagine, to handle ranges we need a method that creates a slice or sub-collection from our collection, and that’s exactly what’s required. Let’s add a Slice method which returns a collection (in our case we’re pretty much just copying the code from Ranges, which is also the best place to look for information on Index and Range).

Add the following to MyCollection

public string[] Slice(int start, int length)
{
  var slice = new string[length];
  Array.Copy(_collection, start, slice, 0, length);
  return slice;
}

The compiler should now show our previously broken c[1..^2] code as valid and if you make a small change to output the items from the returned slice, like this

foreach (var item in c[1..^2])
{
  Console.WriteLine(item);
}

You’ll see the items from index 1 up to, but not including, the item 2 from the end, i.e. “Two” and “Three”.

So to wrap up “implicit Range support”, your type needs the following

  • The type is Countable i.e. has Length or Count property accessible and returning an int
  • The type contains an accessible method named Slice which takes two int params, the first being the start index the second being the length
  • The type does not have an accessible indexer which takes a Range as the first parameter

With both the implicit Index and Range code you could of course write explicit code instead (i.e. use Index and Range in your own indexers). Both types offer helper methods to make the experience of using them consistent. Of course neither is restricted to collection types, so let’s look at extending Range…
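For comparison, an explicit version might look something like the sketch below. ExplicitCollection is a hypothetical name; the point is simply that GetOffset and GetOffsetAndLength do the from-end arithmetic for us, so the indexers stay small.

```csharp
using System;

var c = new ExplicitCollection();
Console.WriteLine(c[^1]);                         // Five
Console.WriteLine(string.Join(", ", c[1..^2]));   // Two, Three

// Hypothetical type for illustration only
public class ExplicitCollection
{
  private readonly string[] _collection = { "One", "Two", "Three", "Four", "Five" };

  // Explicit Index support
  public string this[Index idx] => _collection[idx.GetOffset(_collection.Length)];

  // Explicit Range support: the Range works out the offset and length for us
  public string[] this[Range range]
  {
    get
    {
      var (offset, length) = range.GetOffsetAndLength(_collection.Length);
      var slice = new string[length];
      Array.Copy(_collection, offset, slice, 0, length);
      return slice;
    }
  }
}
```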

Extending Range

As already mentioned, Index and Range are fairly limited in functionality, but that’s fine, they do exactly what they’re intended to do. If you’ve ever wanted a range of numbers in C# you’ve probably looked at Enumerable.Range and written something like this

foreach (var i in Enumerable.Range(1, 10))
{
  Console.WriteLine(i);
}

For fun, now that we have the syntactic sugar of .. to represent a Range, it’d be cool to write something like this instead

foreach (var i in 1..10)
{
  Console.WriteLine(i);
}

Okay so we know that this is really a Range instance like this

foreach (var i in new Range(1, 10))
{
  Console.WriteLine(i);
}

but of course this will not work, as Range does not provide a GetEnumerator method. No worries, we’ll create one as an extension method (foreach recognises extension GetEnumerator methods from C# 9.0 onwards), something like this

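public static class RangeExtensions
{
  // Enumerable.Range takes (start, count) rather than (start, end), so convert.
  // End is treated as inclusive here so that 1..10 yields 1 to 10, matching
  // Enumerable.Range(1, 10) above (note that slicing treats End as exclusive).
  public static IEnumerator<int> GetEnumerator(this Range range) =>
    Enumerable.Range(range.Start.Value, range.End.Value - range.Start.Value + 1).GetEnumerator();
}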

This now allows us to use a Range within a foreach and better still we can now use the .. operator syntax

foreach (var i in 1..10)
{
  Console.WriteLine(i);
}

Note: Whilst this offers syntax along the lines of Rust, I’m not necessarily saying that using Range in this way within a foreach is a great idea. Range, via Index, supports the ^ (from end) syntax and we’re definitely not attempting to support that in our extension method. The terse nature of its usage is nevertheless interesting (or, for those who prefer more explicit methods, probably horrible).

We could extend our fun and our extension method to add support for IEnumerable for LINQ queries. Whilst not as nice as the IEnumerator in its implicit nature, we might add the following to RangeExtensions
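If the lack of ^ support is a worry, one option is to reject such ranges up front rather than silently misreading them. The following is only a sketch of that idea: CheckedRangeExtensions is a hypothetical alternative to the extension class above (you would use one or the other, not both), and it assumes the end of the range is treated as inclusive.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

foreach (var i in 3..6)
{
  Console.WriteLine(i);   // 3, 4, 5, 6
}

try
{
  foreach (var i in 1..^2) { }
}
catch (NotSupportedException e)
{
  Console.WriteLine(e.Message);
}

// Hypothetical alternative to RangeExtensions, for illustration only
public static class CheckedRangeExtensions
{
  public static IEnumerator<int> GetEnumerator(this Range range)
  {
    // Without a collection there is no length to resolve ^ against
    if (range.Start.IsFromEnd || range.End.IsFromEnd)
    {
      throw new NotSupportedException("From-end (^) indexes are not supported here.");
    }

    var start = range.Start.Value;
    var count = range.End.Value - start + 1;   // assumption: End is inclusive
    return Enumerable.Range(start, count).GetEnumerator();
  }
}
```

Because the method is not an iterator (there is no yield), the check runs as soon as the foreach asks for the enumerator.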

public static IEnumerable<int> ToEnumerable(this Range range)
{
  foreach (var item in range)
  {
    yield return item;
  }
}

Now we can use this with LINQ, maybe like this

foreach (var i in (1..10).ToEnumerable().Where(v => v % 2 == 0))
{
  Console.WriteLine(i);
}

Which will output the even numbers between 1 and 10.

References

Ranges

Named arguments in C#

I’ve actually never had or felt a need to use named arguments in C#. Am I missing anything?

Well, first off, I have actually used named arguments in other languages and my immediate feeling was that they made my code verbose for no real benefit. Certainly they can (sort of) document your code a little better, but most development tools include IntelliSense or similar features which tell the developer which parameters they are editing/adding, and in some cases even show the name of each argument in the editor, hence the benefit of “documenting” seemed to be of little real use.

This all said, let’s look at what we can do with named arguments.

What are named arguments?

When we write methods/functions we pass arguments/parameters using “positional arguments”, which simply means the order of arguments must match the method signature’s argument order. For example let’s look at a simple method to add a person to some application

void AddPerson(string name, string address, int age)
{
   // do something
}

So when we use this method with positional arguments we would write

AddPerson("Scooby Doo", "Mystery Machine", 11);

In C# we also get the ability to use named arguments instead (without any changes to the method signature) by including the argument name as part of the call, so for example

AddPerson(name: "Scooby Doo", address: "Mystery Machine", age: 11);

With tools like Visual Studio 2019, this doesn’t really add anything useful (if we’re mirroring the argument positions) because Visual Studio already tells us the name of each argument in the editor. Obviously outside of Visual Studio, for example in source control, maybe this is more useful.

Surely there’s more to it than that?

Positional arguments are just that: the calling code must supply each argument in the correct position. Whilst we can do the same with named arguments, we can also rearrange them and hence no longer need to call using the same positions. For example, let’s switch around the named arguments from earlier to give us this

AddPerson(name: "Scooby Doo", age: 11, address: "Mystery Machine");

The C# compiler will simply rearrange the arguments into their positions, producing the same IL as is generated for a call using positional arguments. Here’s an example of such code viewed via dotPeek; it’s exactly the same code as for the positional arguments, as one would expect.

IL_0013: ldstr        "Scooby Doo"
IL_0018: ldstr        "Mystery Machine"
IL_001d: ldc.i4.s     11 // 0x0b
IL_001f: call         void NameParamsTests.Program::AddPerson(string, string, int32)
IL_0024: nop
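One subtlety worth a quick sketch: with constants the rearrangement is invisible, but when the arguments have side effects they are still evaluated in the order written at the call site and only then matched to their parameters. The Show and Next functions below are hypothetical helpers for the demonstration.

```csharp
using System;

int counter = 0;
int Next() => ++counter;   // local function with a visible side effect

static string Show(int first, int second) => $"first={first}, second={second}";

// second: is written first, so it receives 1; first: then receives 2
Console.WriteLine(Show(second: Next(), first: Next()));   // first=2, second=1
```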

One area where named arguments offer some extra goodness is when we’re using optional arguments, so let’s assume our AddPerson signature changes to

static void AddPerson(string name, string address = null, int age = Int32.MinValue)
{
   // do something
}

If we’re using positional arguments and we don’t have an address then we must still supply a value for the address, for example

AddPerson("Scooby Doo", null, 11);

But as we’ve seen, with named arguments the order is no longer a limiting factor, therefore we can use named arguments instead and not even bother with the address. The compiler will figure it out for us, hence we can write

AddPerson(name: "Scooby", age: 11);

Note: We can of course mix positional and named arguments in a method call if we wish/need to. Positional arguments normally come first; from C# 7.2 a named argument may be followed by a positional one, but only when the named argument is in its correct position, which limits the usefulness of naming it.
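The mixing rules are easier to see in code. This sketch uses a hypothetical Describe function with the same shape as AddPerson, but returning a string so the effect of each call is visible.

```csharp
using System;

static string Describe(string name, string address = null, int age = -1)
{
  var shownAddress = address ?? "(no address)";
  return $"{name} | {shownAddress} | {age}";
}

// Positional first, then named: the optional address is skipped
Console.WriteLine(Describe("Scooby Doo", age: 11));
// Scooby Doo | (no address) | 11

// All named, in any order
Console.WriteLine(Describe(age: 11, name: "Scooby Doo"));
// Scooby Doo | (no address) | 11

// C# 7.2 and later: a named argument may come before positional ones
// as long as it sits in its declared position
Console.WriteLine(Describe(name: "Scooby Doo", "Mystery Machine", 11));
// Scooby Doo | Mystery Machine | 11

// Does not compile: age is out of position and followed by a positional argument
// Describe(age: 11, "Scooby Doo");
```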

Named arguments – when we have lots of arguments

The simple AddPerson method probably isn’t a great example for using named arguments, so let’s instead look at a method that takes more arguments with lots of optional arguments. If we instead have a method which looks like this

void AddPerson(
   string firstName, string lastName, int age = Int32.MinValue,
   string addressLine1 = null, string addressLine2 = null, string addressLine3 = null,
   string city = null, string county = null, string postalCode = null)
{
   // do something
}

Now we can see that if we have partial details for the person we can call this method in a more succinct manner, for example

AddPerson(age: 11, firstName: "Scooby", lastName: "Doo", postalCode: "MM1");

// or with mixed positional and named arguments

AddPerson("Scooby", "Doo", 11, postalCode: "MM1");

As you’d imagine, the compiler simply handles the setting of the optional arguments etc. as before giving us IL such as

IL_0001: ldstr        "Scooby"
IL_0006: ldstr        "Doo"
IL_000b: ldc.i4.s     11 // 0x0b
IL_000d: ldnull
IL_000e: ldnull
IL_000f: ldnull
IL_0010: ldnull
IL_0011: ldnull
IL_0012: ldstr        "MM1"
IL_0017: call         void NameParamsTests.Program::AddPerson(string, string, int32, string, string, string, string, string, string)
IL_001c: nop

Once our methods start getting more arguments, and especially if lots are defaulted, named arguments start to make sense. With a larger number of arguments, though, one might question whether the method itself is in need of refactoring. With our example here we could of course create separate objects for different parts of the data, and with C#’s object initializer syntax we sort of get a similar way to create “named” arguments, for example

public struct Person
{
   public string FirstName { get; set; }
   public string LastName { get; set; }
   public int Age { get; set; }
   public string Line1 { get; set; }
   public string Line2 { get; set; }
   public string Line3 { get; set; }
   public string City { get; set; }
   public string County { get; set; }
   public string PostalCode { get; set; }
}

void AddPerson(Person person)
{
   // do something
}

Now using object initializer syntax we could call this method like this

AddPerson(new Person
   {
      FirstName = "Scooby",
      LastName = "Doo",
      Age = 11,
      PostalCode = "MM1"
   });

Eureka Discovery with Steeltoe outside of ASP.NET

Whilst the previous post showed how we use Steeltoe in an ASP.NET application, sometimes we might not want (or be able) to use all the nice methods for automatic service discovery etc.

In this post we’re going to create a simple .NET Core console application and interact with the discovery service at a slightly lower level.

So…

  • Create yourself a .NET Core console application
  • Add the NuGet package Steeltoe.Discovery.Eureka

We need to configure the discovery client. You can use your own configuration method but for this example we’ll hard code parameters into a EurekaClientConfig object.

var config = new EurekaClientConfig
{
   EurekaServerServiceUrls = "http://localhost:8761/eureka/",
   ShouldFetchRegistry = true,
   ShouldRegisterWithEureka = false
};

We’re just writing a client, hence we don’t want to register the application with Eureka, but we do want to fetch registry data.

Next we will create the discovery client, access the applications and get our weatherapi application…

var discoveryClient = new DiscoveryClient(config);
var applications = discoveryClient.Applications;
var service = applications?.GetRegisteredApplication("weatherapi");

Finally we want to find instances of the services associated with the application which are UP and get the URL associated with one of the instances.

var instance = service?.Instances.FirstOrDefault(info => info.Status == InstanceStatus.UP);
if (instance != null)
{
   var client = new HttpClient
   {
      BaseAddress = new Uri(instance.HomePageUrl)
   };
   var response = await client.GetAsync("weatherforecast");
   Console.WriteLine(await response.Content.ReadAsStringAsync());
}

If an instance is found we just simply use HttpClient to invoke a GET method on the service and output the response to the console, just to prove everything worked.

This code doesn’t use any load balancing strategies. I’ll leave that to the reader to look into, as in a real world scenario we wouldn’t want all clients to use the first instance only.
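As a starting point, a hand-rolled round robin over the UP instances could look something like the sketch below. To keep it self-contained it does not use Steeltoe’s types or its built-in load balancers: ServiceInstance and RoundRobinSelector are hypothetical stand-ins, and the URLs are made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

var instances = new List<ServiceInstance>
{
  new("http://host-a:8080/", true),
  new("http://host-b:8080/", false),   // DOWN, so never selected
  new("http://host-c:8080/", true)
};

var selector = new RoundRobinSelector();
for (var i = 0; i < 4; i++)
{
  Console.WriteLine(selector.Next(instances)?.Url);
}
// host-a, host-c, host-a, host-c

// Hypothetical stand-in for the instance information returned by the registry
public record ServiceInstance(string Url, bool IsUp);

public class RoundRobinSelector
{
  private int _counter = -1;

  // Returns the next UP instance in turn, or null if none are UP
  public ServiceInstance Next(IReadOnlyList<ServiceInstance> instances)
  {
    var up = instances.Where(instance => instance.IsUp).ToList();
    if (up.Count == 0)
    {
      return null;
    }

    // Interlocked keeps the counter consistent if several threads select at once
    var n = (uint)Interlocked.Increment(ref _counter);
    return up[(int)(n % (uint)up.Count)];
  }
}
```

In the console example above, the selected instance’s URL would take the place of instance.HomePageUrl when creating the HttpClient.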

Creating an ASP.NET client using Eureka and Steeltoe

In the previous post Eureka Server (using Steeltoe) revisited I went through the process of running a Eureka instance and creating a default template based ASP.NET Web API which registers itself with the Eureka server as the application weatherapi.

Let’s now create a basic ASP.NET MVC project to interact with the Eureka server, get an instance of the API service and use it.

Note: This post is almost 100% based on the Channel 9 video Service Discovery with Steeltoe, so credit should go to Tim Hess for the sample code.

  • Create an ASP.NET Core application and then select Web Application (Model View Controller)
  • Add the following NuGet packages, Steeltoe.Discovery.ClientCore, Steeltoe.Discovery.Eureka and System.Net.Http.Json
  • Update the appsettings.json with the following
      "eureka": {
        "client": {
          "serviceUrl": "http://localhost:8761/eureka/",
          "shouldFetchRegistry": "true",
          "shouldRegisterWithEureka": false,
          "validateCertificates": false
        }
      }
    

    Notice we set shouldFetchRegistry to true as we want to get the latest registry information, and we set shouldRegisterWithEureka to false as, in this case, we don’t want to register this client. Of course change this if your client also exposes services.

  • Within Startup.cs, ConfigureServices add the following
    services.AddDiscoveryClient(Configuration);
    services.AddHttpClient("weather", 
       client => client.BaseAddress = new Uri("http://weatherapi/"))
          .AddServiceDiscovery();
    

    The interesting thing here is that we associate the name “weather” with an address which looks like a URL but really is the name of the service within Eureka that we want to access. Then, by using the AddServiceDiscovery this will be converted to an instance URL representing the instance of the service associated with the app name.

    Note: we can also use load balancing strategies, such as round robin or random.

  • Finally within the Startup.cs, method Configure, add the following
    app.UseDiscoveryClient();
    
  • We’re going to simply copy the WeatherForecast class from the service and add to the client, here it is

    public class WeatherForecast
    {
       public DateTime Date { get; set; }
       public int TemperatureC { get; set; }
       public int TemperatureF => 32 + (int)(TemperatureC / 0.5556);
       public string Summary { get; set; }
    }
    
  • Within HomeController.cs we need to add a variable of type IHttpClientFactory, which will be injected into the constructor and then used to create the HTTP client we defined. This will then use service discovery to resolve the URL of an instance (as already discussed), and we’ll use that to call the weather API to get a list of values. Here are the changes required to HomeController.cs
    private readonly IHttpClientFactory _httpClientFactory;
    
    public HomeController(IHttpClientFactory httpClientFactory, ILogger<HomeController> logger)
    {
       _httpClientFactory = httpClientFactory;
       _logger = logger;
    }
    
    public async Task<IActionResult> Index()
    {
       var client = _httpClientFactory.CreateClient("weather");
       var weather = await client.GetFromJsonAsync<IList<WeatherForecast>>("weatherforecast");
       return View(weather);
    }
    
  • Finally, let’s change the Index.cshtml to display the weather forecast data returned in the Index method. Firstly we declare the @model and then simply create a table to output the items from that model in columns and rows, so here’s the change to be added to the top of the file
    @model IList<EurekaWebClient.Controllers.WeatherForecast>
    

    and within the main div, just add the following

    <table class="table">
       <tr><th>Day of Week</th><th>Summary</th><th>Temp</th></tr>
       @foreach (var weather in Model)
       {
          <tr><td>@weather.Date.DayOfWeek</td><td>@weather.Summary</td><td>@weather.TemperatureF</td></tr>
       }
    </table>
    

That should be it. Ensure Eureka is running, your service is up and you should now see the weather service data in a nice little table.

Eureka Server (using Steeltoe) revisited

The Eureka server is used as a registry for maintaining lists of services and their endpoints. It’s used a lot in the microservice world by way of a microservice registering its existence somewhere (in this case in Eureka). When a client (whether it’s another service or anything else for that matter) wants to access a service it asks the registry for an instance. In this case we connect to the Eureka server and find the service instance(s) that are available for a specific application name and are, of course, UP.

Installing and running Eureka Server

In a previous post Spring boot Eureka server we wrote a Java application to run a Eureka server. In this post we’re going to use Docker to host the server and then use Steeltoe with .NET to interact with the instance.

To get an image of Eureka server, let’s use the Steeltoe docker image (Spring and others exist, the Steeltoe image is not intended for production, but is fine for what we want to do)

docker pull steeltoeoss/eureka-server

Now run up the docker image using

docker run --publish 8761:8761 steeltoeoss/eureka-server

If all goes well, connect to the Spring Eureka dashboard using

http://localhost:8761

Change localhost to the ip/host you’re running the Eureka server from, and you should now see the Spring Eureka web page.

Registering a .NET Core client with Eureka

I’ve a post on this topic A .NET service registered with Eureka, but let’s go through the process again with the current version of Steeltoe (as the NuGet packages have changed somewhat).

  • Create an ASP.NET Core Web Application, as this will represent a REST service that we want to interact with. My project is called RegisterExample and is an API project.
  • Add the NuGet packages Steeltoe.Discovery.ClientCore and Steeltoe.Discovery.Eureka
  • In Startup.cs within ConfigureServices add the following
    services.AddDiscoveryClient(Configuration);
    
  • Within the Configure method add the following
    app.UseDiscoveryClient();
    
  • Finally add the following to the appsettings.json file
    // Eureka info
      "eureka": {
        "client": {
          "serviceUrl": "http://localhost:8761/eureka/",
          "shouldFetchRegistry": "false",
          "shouldRegisterWithEureka": true,
          "validateCertificates": false
        },
        "instance": {
          "appName": "weatherapi",
          "port": "8080",
          "ipAddress": "localhost",
          "preferIpAddress": true
        }
      }
    

We’re going to leave the API with the default weatherapi, hence the appName in the appsettings.

Notice we set the value for “shouldFetchRegistry” to false as this service will not be acting as a client to any other services. Obviously change this if you also need to discover other services. “shouldRegisterWithEureka” is set to true as we want this service to automatically register itself with Eureka.

Now navigate to the URL of your Eureka server again (or refresh) and you should see a new Application. In my case I have an application with the name weatherapi. This name comes from the appName setting in the instance section of our appsettings.json configuration.

Info and Health

If you click on the instance link within the Eureka server dashboard, you will navigate to

https://localhost:5001/info

(or whatever ip/hostname and port your service is running on) and get a 404, so let’s fix that in our project.

Info, by default, will display some basic information about the application, product version etc. However you can also add further custom information if you want, for example the git build SHA1 hash or just some general info.

  • Add NuGet package Steeltoe.Management.EndpointCore
  • In Startup.cs ConfigureServices, add the following
    services.AddSingleton<IInfoContributor, MyInfoContributor>();
    services.AddInfoActuator(Configuration);
    services.AddHealthActuator(Configuration);
    

    The first of these lines will add our implementation of an IInfoContributor to allow for custom info.

  • Still in Startup.cs, but now the method Configure, add the following to the UseEndpoints endpoint routes

    endpoints.Map<InfoEndpoint>();
    endpoints.Map<HealthEndpoint>();
    
  • Now we’ll create a simple implementation of an IInfoContributor which allows us to add our own info, so add the following class

    public class MyInfoContributor : IInfoContributor
    {
       public void Contribute(IInfoBuilder builder)
       {
          builder.WithInfo("MyInfo", new { SomeName = "Scooby Doo" });
       }
    }
    

Now when we run our service we hope to see our info data, however by default Steeltoe seems to set the info and health endpoint to /actuator/info and /actuator/health respectively. Eureka seems to expect /info. So go to the appsettings.json and add the following to the Instance section

"StatusPageUrlPath": "/actuator/info",
"HealthCheckUrlPath": "/actuator/health" 

Note: I’m not sure what I’m missing here and why the defaults don’t match up, but these configuration changes will tell the Eureka server (when we register our service with it) that it should use these paths for info and health.
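For clarity, combining those two settings with the instance section shown earlier would give something along these lines (the other values are simply the ones used earlier in this post):

```json
"instance": {
  "appName": "weatherapi",
  "port": "8080",
  "ipAddress": "localhost",
  "preferIpAddress": true,
  "StatusPageUrlPath": "/actuator/info",
  "HealthCheckUrlPath": "/actuator/health"
}
```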

Now, if you run the service again for /actuator/info you should see something like this

{"MyInfo":{"someName":"Scooby Doo"},
"applicationVersionInfo":{"ProductName":"RegisterExample",
"FileVersion":"1.0.0.0","ProductVersion":"1.0.0"},
"steeltoeVersionInfo":{"ProductName":"Steeltoe.Management.Endpoint",
"FileVersion":"3.0.2.0","ProductVersion":
"3.0.2\u002B4089779c66d127f40325a3be9b613149b3b090f2"}}

and health something like

{"status":"UP","details":{"liveness":{},"diskSpace":{"total":4000769372160,
"free":3889734168576,"threshold":10485760,"status":"UP"},"eurekaServer":
{"remoteInstStatus":"UNKNOWN","fetchStatus":"Not fetching","heartbeat":
"Successful","heartbeatStatus":"UP","heartbeatTime":"2021-01-23T20:40:44",
"status":"UP","applications":"NONE"},"readiness":{}}}

with and record types in C# 9.0

In the previous post we looked at the new C# record type. Whilst we can make record types mutable, they’re particularly well suited to immutability. When used as immutable types we need a way to make new immutable types based upon previous ones.

Let’s use our Person record type from the previous post but sprinkled with C# 9 goodness in the form of init

public record Person
{
   public string FirstName { get; init; }
   public string LastName { get; init; }
}

The init syntax gives us the ability to create instances of a Person, assigning values during the construction phase of the record type using standard initialiser syntax.

For example

var p = new Person 
{ 
   FirstName = "Scooby", 
   LastName = "Doo" 
};

However, this post is meant to be about the with keyword, which allows us to take a record and create a new instance based upon an existing record with changes, i.e. we want to take the Person p and create a new record with the FirstName = “Scrappy”, like this

var scrappy = p with { FirstName = "Scrappy" };

the result of outputting this to the console would be

Person { FirstName = Scrappy, LastName = Doo }
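Putting the pieces of this post together, a small sketch shows that with leaves the original record untouched and produces a separate instance. The empty with { } is included just to show that a straight copy is equal by value but is not the same object.

```csharp
using System;

var p = new Person { FirstName = "Scooby", LastName = "Doo" };
var scrappy = p with { FirstName = "Scrappy" };

Console.WriteLine(p);         // Person { FirstName = Scooby, LastName = Doo }
Console.WriteLine(scrappy);   // Person { FirstName = Scrappy, LastName = Doo }
Console.WriteLine(p == scrappy);   // False

var copy = p with { };
Console.WriteLine(p == copy);                        // True, same values
Console.WriteLine(object.ReferenceEquals(p, copy));  // False, different instance

public record Person
{
  public string FirstName { get; init; }
  public string LastName { get; init; }
}
```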

C# 9.0 record type

As of yesterday I got the updated version of Visual Studio 2019 (16.8.1) which includes .NET 5 and C# 9.0 – I don’t intend to post about every feature of C# 9.0 – for those interested simply go and take a look at What’s new in C# 9.0.

One feature that’s quite interesting is C# Records.

Record types “are a reference type that provides synthesized methods to provide value semantics for equality”. So basically we can create multiple instances of a record and compare them for equality. Here’s an example of a record type

public record Person
{
   public Person(string firstName, string lastName)
   {
      FirstName = firstName;
      LastName = lastName;
   }

   public string LastName { get; }
   public string FirstName { get; }
}

It’s that easy, we simply declare our type as a record instead of a class.

Now, if we create a couple of instances of a Person that look like this

Person person1 = new("Scooby", "Doo");
Person person2 = new("Scooby", "Doo");

and we compare them using ==/Equals, the two instances will be considered equal (unlike with classes, which by default compare equality by reference).

Another feature of records over classes is the ToString() output. If you use ToString() on an instance of a class you’ll see something like

TestApp.Person

For a record type you’ll instead get

Person { LastName = Doo, FirstName = Scooby }

this is facilitated by the compiler adding a PrintMembers protected method which generates this output.
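Here is a short sketch that exercises the equality and ToString behaviour described above in one place. The third person is an extra instance added for the example, to show a mismatch.

```csharp
using System;

Person person1 = new("Scooby", "Doo");
Person person2 = new("Scooby", "Doo");
Person person3 = new("Scrappy", "Doo");

Console.WriteLine(person1 == person2);                        // True
Console.WriteLine(person1.Equals(person2));                   // True
Console.WriteLine(object.ReferenceEquals(person1, person2));  // False
Console.WriteLine(person1 == person3);                        // False
Console.WriteLine(person1.GetHashCode() == person2.GetHashCode());   // True
Console.WriteLine(person1);   // Person { LastName = Doo, FirstName = Scooby }

public record Person
{
  public Person(string firstName, string lastName)
  {
    FirstName = firstName;
    LastName = lastName;
  }

  public string LastName { get; }
  public string FirstName { get; }
}
```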

ANTLR in C#

In the previous post Starting out with ANTLR we looked at the basics of creating a grammar and generating code from it, now let’s take that very simple grammar and integrate it into a C# application.

Here’s the grammar again (from our grammar file queryLanguage.g4)

Note: We’re going to capitalize the grammar name as this will then be more in the style of C# class names.

grammar QueryLanguage;

query
    : expression
    ;

expression
    : STRING
    | NUMBER
    | expression 'AND' expression
    | expression 'OR' expression
    ;

WS  : (' '|'\t'|'\r'|'\n')+ -> skip;

STRING : '"' .*? '"';
SIGN
   : ('+' | '-')
   ;
NUMBER  
    : SIGN? ( [0-9]* '.' )? [0-9]+;

The ANTLR4 JAR is not compatible with the ANTLR4 NuGet package, so instead for our Example application we’ll use an alternative, the Antlr4 CodeGenerator. Follow these steps to create an application

  • Create a .NET Core Console application
  • Edit the SDK project file and change netcoreapp3.1 to net472
  • Add the ANTLR4.Runtime and Antlr4.CodeGenerator NuGet packages
  • Add your QueryLanguage.g4 grammar to the project

If you select the .g4 file you can now view the properties for that file within Visual Studio 2019 and (if you wish to) change what’s generated by ANTLR. Let’s just ensure Generate Visitor is Yes.

For some reason a .NET framework 4.7.2 project does not include the properties and whilst we can edit the .csproj file to get things working, I’ve found the above steps the simplest to get ANTLR running in a .NET application at the time of writing.

I’ve found I do still need to edit the .csproj file to add the following

<ItemGroup>
  <Antlr4 Update="QueryLanguage.g4">
    <Listener>false</Listener>
    <CustomToolNamespace>Example.Generated</CustomToolNamespace>
  </Antlr4>
</ItemGroup>

<PropertyGroup>
  <Antlr4UseCSharpGenerator>True</Antlr4UseCSharpGenerator>
</PropertyGroup>

Change Example.Generated to the preferred namespace for the generated files.

Now build the project and if all goes well there should be no errors and the ANTLR code should be generated in obj/Debug/net472 (or whatever configuration you’re using).

Let’s now make some changes to our grammar to make writing Visitor code simpler by adding labels to our expression code, the changes are listed below

expression
    : STRING #String
    | NUMBER #Number
    | expression 'AND' expression #And
    | expression 'OR' expression  #Or
    ;

We use # to create a label and each label turns into a Visit method named after it, i.e. VisitAnd, VisitOr etc.

All we’re going to do with this grammar is use the Visitor pattern/class to generate code where strings are all lowercase, AND becomes & and OR becomes |. Of course, you could produce byte code or do all sorts of other things with your input.

Create a new file named QueryLanguageVisitor.cs and it should look like this

using Example.Generated;

namespace Example
{
  public class QueryLanguageVisitor : QueryLanguageBaseVisitor<string>
  {
    public override string VisitString(QueryLanguageParser.StringContext context)
    {
      return context.GetText().ToLower();
    }

    public override string VisitNumber(QueryLanguageParser.NumberContext context)
    {
      return context.GetText();
    }

    public override string VisitAnd(QueryLanguageParser.AndContext context)
    {
      return Visit(context.expression(0)) + "&" + Visit(context.expression(1));
    }

    public override string VisitOr(QueryLanguageParser.OrContext context)
    {
      return Visit(context.expression(0)) + "|" + Visit(context.expression(1));
    }
  }
}

As you can see from the above code, we subclass QueryLanguageBaseVisitor (a generated class) and set the generic parameter to string, as the result of running through the QueryLanguageVisitor will simply be another string.

AND and OR are, of course, binary expressions, i.e. they require an operand on either side of the AND or OR. These operands may themselves be expressions, hence we call Visit on each of them.
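
If the recursion is not obvious, here’s a minimal sketch of the same idea without ANTLR at all. The Node classes, INodeVisitor and LoweringVisitor are hypothetical hand-written stand-ins for the generated context and base visitor classes (they are not ANTLR types), but the shape is the same: each Visit method handles one kind of node, and the AND case visits its two child expressions before joining the results.

```csharp
// Hypothetical stand-ins for the generated context classes (not ANTLR types).
public abstract class Node
{
    public abstract T Accept<T>(INodeVisitor<T> visitor);
}

public sealed class StringNode : Node
{
    public StringNode(string text) { Text = text; }
    public string Text { get; }
    public override T Accept<T>(INodeVisitor<T> visitor) { return visitor.VisitString(this); }
}

public sealed class NumberNode : Node
{
    public NumberNode(string text) { Text = text; }
    public string Text { get; }
    public override T Accept<T>(INodeVisitor<T> visitor) { return visitor.VisitNumber(this); }
}

public sealed class AndNode : Node
{
    public AndNode(Node left, Node right) { Left = left; Right = right; }
    public Node Left { get; }
    public Node Right { get; }
    public override T Accept<T>(INodeVisitor<T> visitor) { return visitor.VisitAnd(this); }
}

public interface INodeVisitor<T>
{
    T VisitString(StringNode node);
    T VisitNumber(NumberNode node);
    T VisitAnd(AndNode node);
}

public sealed class LoweringVisitor : INodeVisitor<string>
{
    public string VisitString(StringNode node) { return node.Text.ToLower(); }

    public string VisitNumber(NumberNode node) { return node.Text; }

    public string VisitAnd(AndNode node)
    {
        // Visit both operands first, as they may themselves be AND nodes.
        return node.Left.Accept(this) + "&" + node.Right.Accept(this);
    }
}

public static class MiniVisitorDemo
{
    public static string Run()
    {
        // Equivalent of the tree a parser would build for: "HELLO" AND 123
        Node tree = new AndNode(new StringNode("\"HELLO\""), new NumberNode("123"));
        return tree.Accept(new LoweringVisitor());
    }
}
```

Calling MiniVisitorDemo.Run() walks the tree from the AND node down to its two leaves and returns the combined string.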

At this point, we have nothing to actually run the QueryLanguageVisitor so in the Main method place the following code

// add these using clauses
// using Antlr4.Runtime;
// using Example.Generated;

// example expression
var expression = "\"HELLO\" AND 123";

var inputStream = new AntlrInputStream(expression);
var lexer = new QueryLanguageLexer(inputStream);
var tokenStream = new CommonTokenStream(lexer);
var parser = new QueryLanguageParser(tokenStream);

var visitor = new QueryLanguageVisitor();
var query = parser.query();
var result = visitor.Visit(query);

In the code above, we create an ANTLR input stream (you can of course use an AntlrFileStream if you’re taking input from a file). Next we create our generated lexer, which is passed into the CommonTokenStream, and this in turn is passed into our generated QueryLanguageParser.

Finally we create our newly added QueryLanguageVisitor. The generated parser has a method for each rule in our grammar; in our case the start rule is query, hence we call that method and pass the result into the Visit method of our QueryLanguageVisitor. The result (assuming no errors) will be

"hello"&123

Note that there are no spaces around the &, as the grammar skips whitespace and the visitor concatenates the operands directly.

A more fully featured (i.e. includes error handling) implementation would be as follows (concepts and code snippets taken from various existing samples)

public class ParserResult
{
  public bool IsValid { get; internal set; }
  public int ErrorPosition { get; internal set; } = -1;
  public string ErrorMessage { get; internal set; }
  public string Result { get; internal set; }
}

public static class Query
{
  public static ParserResult Parse(string expression, bool secondRun = false)
  {
    if (String.IsNullOrWhiteSpace(expression))
    {
      return new ParserResult
      {
        IsValid = true,
        Result = null
      };
    }

    var inputStream = new AntlrInputStream(expression);
    var lexer = new QueryLanguageLexer(inputStream);
    var tokenStream = new CommonTokenStream(lexer);
    var parser = new QueryLanguageParser(tokenStream);

    lexer.RemoveErrorListeners();
    parser.RemoveErrorListeners();
    var customErrorListener = new QueryLanguageErrorListener();
    parser.AddErrorListener(customErrorListener);
    var visitor = new QueryLanguageVisitor();

    var queryExpression = parser.query();
    var result = visitor.Visit(queryExpression);
    var isValid = customErrorListener.IsValid;
    var errorLocation = customErrorListener.ErrorLocation;
    var errorMessage = customErrorListener.ErrorMessage;
    // a null result means the visitor produced nothing, so treat the parse as invalid
    if (result == null)
    {
      isValid = false;
    }

    if (!isValid && !secondRun)
    {
      var cleanedFormula = string.Empty;
      var tokenList = tokenStream.GetTokens().ToList();
      for (var i = 0; i < tokenList.Count - 1; i++)
      {
        cleanedFormula += tokenList[i].Text;
      }
      var originalErrorLocation = errorLocation;
      var retriedResult = Parse(cleanedFormula, true);
      if (!retriedResult.IsValid)
      {
        retriedResult.ErrorPosition = originalErrorLocation;
        retriedResult.ErrorMessage = errorMessage;
      }
      return retriedResult;
    }
    return new ParserResult
    {
      IsValid = isValid,
      Result = isValid || result != null 
        ? result
        : null,
      ErrorPosition = errorLocation,
      ErrorMessage = isValid ? null : errorMessage
    };
  }
}

public class QueryLanguageErrorListener : BaseErrorListener
{
  public bool IsValid { get; private set; } = true;
  public int ErrorLocation { get; private set; } = -1;
  public string ErrorMessage { get; private set; }

  public override void ReportAmbiguity(
    Parser recognizer, DFA dfa, 
    int startIndex, int stopIndex, 
    bool exact, BitSet ambigAlts, 
    ATNConfigSet configs)
  {
    IsValid = false;
  }

  public override void ReportAttemptingFullContext(
    Parser recognizer, DFA dfa, 
    int startIndex, int stopIndex, 
    BitSet conflictingAlts, SimulatorState conflictState)
  {
    IsValid = false;
  }

  public override void ReportContextSensitivity(
    Parser recognizer, DFA dfa, 
    int startIndex, int stopIndex, 
    int prediction, SimulatorState acceptState)
  {
    IsValid = false;
  }

  public override void SyntaxError(
    IRecognizer recognizer, IToken offendingSymbol, 
    int line, int charPositionInLine, 
    string msg, RecognitionException e)
  {
    IsValid = false;
    ErrorLocation = ErrorLocation == -1 ? charPositionInLine : ErrorLocation;
    ErrorMessage = msg;
  }
}

Now the code that uses our parser simply looks like this (and includes error handling)

var expression = "\"HELLO\" AND 123";
var result = Query.Parse(expression);
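
Once we have the error position and message, we’ll probably want to show them to the user. Here’s a small sketch of one way to do that, which prints a caret under the offending character. The ErrorReport class is a hypothetical helper (it is not part of ANTLR or of the code above) and it takes the values we’d read from ParserResult as plain arguments so that it stands on its own.

```csharp
using System;

public static class ErrorReport
{
    // Formats an expression with a caret under the reported error position.
    // A negative position (the default of -1) means there is no error to show.
    public static string Format(string expression, int errorPosition, string errorMessage)
    {
        if (errorPosition < 0)
        {
            return expression;
        }

        // Clamp so a position past the end still points just after the last character.
        var column = Math.Min(errorPosition, expression.Length);
        return expression + "\n" + new string(' ', column) + "^ " + errorMessage;
    }
}
```

We’d call it along the lines of ErrorReport.Format(expression, result.ErrorPosition, result.ErrorMessage) whenever result.IsValid is false.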

Generating IL using C#

Note: This is an old post I had sitting around for a couple of years, I’m not sure how complete or useful it is, but better being published than hidden away and it might be of use at some point.

There are different ways to dynamically generate code for .NET, using tools such as T4, custom code generators run via target builds etc. Then there’s creating your assembly, modules, types etc. via IL. I don’t mean literally write IL files but instead generating your IL via C# using the ILGenerator class and Emit methods.

I wanted to write a factory class that worked a little like Refit in that you define the interface and Refit “magically” creates an implementation to the interface and calls boilerplate code to inject and/or do the work required to make the code work.

Refit actually uses build targets and code generation via GenerateStubsTask and InterfaceStubGenerator not IL generation.

IL is not really a simple way to achieve these things (source generators, currently in preview, would be far preferable) but maybe in some situations IL generation suits your requirements and I thought it’d be an interesting thing to try anyway.

Use case

What I want to do is allow the developer to create an interface which contains methods (we’re only going to support “standard” methods at this point). The methods may take multiple arguments/parameters and must return Task (for void) or Task of T (for return values). Just like Refit, the idea would be that the developer marks methods in the interface with attributes which then tell the factory class what code to generate for the implementation.

All very much along the lines of Refit.

Starting things off by creating our Assembly

We’re going to need to create an Assembly, at runtime, to host our new types. The first thing we do is take the domain of the current thread and call its DefineDynamicAssembly method, passing in both an AssemblyName and an AssemblyBuilderAccess parameter, which creates an AssemblyBuilder. This becomes the starting point for the rest of our builders and eventually our IL code.

Note: If you want to save the assembly to disk, which is very useful for debugging by inspecting the generated code using ILSpy or the likes, then you should set the AssemblyBuilderAccess to AssemblyBuilderAccess.RunAndSave and supply the file path (i.e. the directory, not the filename) as the third argument to DefineDynamicAssembly.

Before we get into this code further, let’s look at a simple interface which will be our starting point.

public interface IService
{
   Task<string> GetName();
}

Whilst the aim, eventually, is to include attributes on our interface and return different generic types, for this post we’ll not get into this, but instead simply generate an implementation which ignores the arguments passed and expects either a return of Task or Task<string>.

Let’s create our assembly – here’s the code for the TypeGenerator class.

public class TypeGenerator
{
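   private AssemblyBuilder _assemblyBuilder;
   // used by the WithModule and EmitTypeFromInterface methods shown later
   private ModuleBuilder _moduleBuilder;
   private TypeBuilder _typeBuilder;
   private bool _save;
   private string _assemblyName;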

   public TypeGenerator WithAssembly(string assemblyName, string filePath = null)
   {
      var currentDomain = Thread.GetDomain();
      _assemblyName = assemblyName;
      _save = !String.IsNullOrEmpty(filePath);

      if (_save)
      {
         _assemblyBuilder = currentDomain.DefineDynamicAssembly(
            new AssemblyName(_assemblyName),
               AssemblyBuilderAccess.RunAndSave,
                  filePath);
      }
      else
      {
         _assemblyBuilder = currentDomain.DefineDynamicAssembly(
            new AssemblyName(_assemblyName),
               AssemblyBuilderAccess.Run);
      }
      return this;
   }

   public static TypeGenerator Create()
   {
      return new TypeGenerator();
   }
}

The code above will not actually save the assembly but is part of the process we need to go through to actually save it. Let’s add a save method which will actually save the assembly to disk.

public TypeGenerator Save()
{
   if (!String.IsNullOrEmpty(_assemblyName))
   {
      _assemblyBuilder.Save(_assemblyName);
   }
   return this;
}

Note: we’ll also need to assign the assembly name to the Module which we’re about to create.

Now we need a Module

Creating the module is simply a case of calling DefineDynamicModule on the AssemblyBuilder that we created. This will give us a ModuleBuilder, which is where we’ll start generating our type code.

As noted, if we are saving the module then we also need to assign it the assembly name, so here’s the code for creating the ModuleBuilder

public TypeGenerator WithModule(string moduleName)
{
   if (_save)
   {
      _moduleBuilder = _assemblyBuilder.DefineDynamicModule(
         moduleName, _assemblyName);
   }
   else
   {
      _moduleBuilder = _assemblyBuilder.DefineDynamicModule(
         moduleName);
   }
   return this;
}
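
As far as I’m aware, Thread.GetDomain().DefineDynamicAssembly and AssemblyBuilderAccess.RunAndSave are .NET Framework features, so here’s a minimal sketch of the run-only path using the static AssemblyBuilder.DefineDynamicAssembly method instead, which should also work on .NET Core. The DynamicModuleSketch class and the names passed to it are just for illustration and are not part of the TypeGenerator above.

```csharp
using System.Reflection;
using System.Reflection.Emit;

public static class DynamicModuleSketch
{
    // Creates an in-memory (run-only) dynamic assembly and returns a module within it.
    public static ModuleBuilder CreateModule(string assemblyName, string moduleName)
    {
        var assemblyBuilder = AssemblyBuilder.DefineDynamicAssembly(
            new AssemblyName(assemblyName),
            AssemblyBuilderAccess.Run);

        // Nothing is saved to disk, so no file name is supplied for the module.
        return assemblyBuilder.DefineDynamicModule(moduleName);
    }
}
```

The returned ModuleBuilder can then be used in the same way as the _moduleBuilder field, for example DynamicModuleSketch.CreateModule("SketchAssembly", "SketchModule").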

Creating our types

Considering this post is about IL code generation, it’s taken a while to get to it, but we’re finally here. We’ve created the assembly and within that a module. Our current implementation for generating a type will take the interface as a generic parameter (only interfaces will be handled), here’s the method

public TypeGenerator WithType<T>()
{
   var type = typeof(T);

   if (type.IsInterface)
   {
      EmitTypeFromInterface(type);
   }

   return this;
}

The EmitTypeFromInterface method will start by defining a new type using the ModuleBuilder. We’ll create a name based upon the interface type’s name. Obviously the name needs to be unique. To make things simple we’ll just prefix the text “AutoGenerated”, hence type IService will become implementation AutoGeneratedIService. We’ll also need to set up the TypeAttributes to define our new type as a public class and in our case ensure the new type implements the interface. Here’s the code to generate a TypeBuilder (and also create the constructor for the class)

private void EmitTypeFromInterface(Type type)
{
   _typeBuilder = _moduleBuilder.DefineType($"AutoGenerated{type.Name}",
      TypeAttributes.Public |
      TypeAttributes.Class |
      TypeAttributes.AutoClass |
      TypeAttributes.AnsiClass |
      TypeAttributes.BeforeFieldInit |
      TypeAttributes.AutoLayout,
      null, new[] { type });


   var constructorBuilder =
      _typeBuilder.DefineDefaultConstructor(
         MethodAttributes.Public |
         MethodAttributes.SpecialName |
         MethodAttributes.RTSpecialName);

   // insert the following code snippets here
}

Implementing our methods

Obviously an interface requires implementations of its methods. Yes, you can actually save the assembly without supplying the methods, but you will get a TypeLoadException stating that the new type does not have an implementation for the method.

In the code below we’ll loop through the methods on the interface type and, using the TypeBuilder, create a MethodBuilder per method. Each MethodBuilder will have the same name, return type and parameters as the interface method and will be marked as public and virtual. From this we’ll finally get to emit some IL using the ILGenerator. Here’s the code

foreach (var method in type.GetMethods())
{
   var methodBuilder = _typeBuilder.DefineMethod(
      method.Name,
      MethodAttributes.Public |
      MethodAttributes.Virtual,
      method.ReturnType,
      method.GetParameters().Select(p => p.ParameterType).ToArray());

   var ilGenerator = methodBuilder.GetILGenerator();

   // IL Emit code goes here
}
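
To show where this is heading, here’s a minimal end-to-end sketch which emits an implementation of GetName that simply returns a completed Task<string>. It’s a cut-down version of the steps above rather than the finished factory: it uses the run-only static AssemblyBuilder.DefineDynamicAssembly, the EmitSketch class name and the "Generated" return value are just assumptions for the example, and the call to DefineMethodOverride is an extra step I’ve added to make the link to the interface method explicit.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;
using System.Threading.Tasks;

public interface IService
{
    Task<string> GetName();
}

public static class EmitSketch
{
    public static IService CreateService()
    {
        var assemblyBuilder = AssemblyBuilder.DefineDynamicAssembly(
            new AssemblyName("SketchAssembly"), AssemblyBuilderAccess.Run);
        var moduleBuilder = assemblyBuilder.DefineDynamicModule("SketchModule");

        // A public class with no base type (defaults to object) implementing IService.
        var typeBuilder = moduleBuilder.DefineType(
            "AutoGeneratedIService",
            TypeAttributes.Public | TypeAttributes.Class,
            null,
            new[] { typeof(IService) });
        typeBuilder.DefineDefaultConstructor(MethodAttributes.Public);

        var interfaceMethod = typeof(IService).GetMethod("GetName");
        var methodBuilder = typeBuilder.DefineMethod(
            interfaceMethod.Name,
            MethodAttributes.Public | MethodAttributes.Virtual,
            interfaceMethod.ReturnType,
            Type.EmptyTypes);

        // Task.FromResult<string> wraps the string in a completed task.
        var fromResult = typeof(Task)
            .GetMethod("FromResult")
            .MakeGenericMethod(typeof(string));

        var ilGenerator = methodBuilder.GetILGenerator();
        ilGenerator.Emit(OpCodes.Ldstr, "Generated"); // push the string argument
        ilGenerator.Emit(OpCodes.Call, fromResult);   // call Task.FromResult<string>
        ilGenerator.Emit(OpCodes.Ret);                // return the Task<string>

        // Explicitly tie our method to the interface method it implements.
        typeBuilder.DefineMethodOverride(methodBuilder, interfaceMethod);

        var generatedType = typeBuilder.CreateType();
        return (IService)Activator.CreateInstance(generatedType);
    }
}
```

Awaiting EmitSketch.CreateService().GetName() should then give us the string that was loaded with ldstr. Note that the interface needs to be public for the dynamic assembly to be able to implement it.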

A very basic overview of writing IL code

We can generate IL code using an ILGenerator and Emit methods from a C# application (for example). We can also write IL directly as source code files. For example, create a file test.il

Now add the following code

.assembly MyAssembly
{
}

.method void Test()
{
.entrypoint
ret
}

The text preceded by the . are directives for the IL compiler (ILASM which comes with Visual Studio). Within the file we’ve firstly declared an assembly named MyAssembly. Whilst this file would compile without the .assembly, it will not run and will fail with a BadImageFormatException.

Next we define a method (using the .method directive) named Test. The .entrypoint declares this is the entry point to our application (as this will compile to an EXE). Hence unlike C# where we use Main as the entry point, any method may be the entry point but only one method may be marked as the entry point.

To create a correctly formed method we also need the last line of code to be a ret.

If you now compile this file using

ilasm test.il

You might notice that ilasm outputs the warning Non-static global method ‘Test’, made static. Obviously in C# our entry method would normally be a static method. Simply add the keyword static as below

.method static void Test()
{
.entrypoint
ret
}

Let’s now turn this little IL application into the classic Hello World by calling the Console.WriteLine method.

If you’ve ever written any assembly code you’ll know we pass arguments to subroutines by placing the arguments on the stack and then the callee will pop the expected number of arguments. So to output a string, we’ll need to push it onto the stack – in this case we use ldstr which specifically handles strings.

Console.WriteLine is available in the System namespace within mscorlib, and to invoke a method we’ll need to call it specifying the overload (if any) to use along with a fully qualified name, hence our Test method becomes

.method static void Test() 
{
.entrypoint

ldstr "Hello World"
call void [mscorlib]System.Console::WriteLine(string)
ret
}
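
The same three instructions can be emitted from C# using the ILGenerator. Here’s a minimal sketch using a DynamicMethod, which saves us creating an assembly, module and type just to try out a few instructions. The HelloIl class name is just for illustration.

```csharp
using System;
using System.Reflection.Emit;

public static class HelloIl
{
    // Builds a delegate whose body is the IL: ldstr, call, ret.
    public static Action Build()
    {
        var method = new DynamicMethod("Test", typeof(void), Type.EmptyTypes);
        var ilGenerator = method.GetILGenerator();

        // Push the string onto the stack.
        ilGenerator.Emit(OpCodes.Ldstr, "Hello World");

        // Call the Console.WriteLine(string) overload, which pops the string.
        ilGenerator.Emit(
            OpCodes.Call,
            typeof(Console).GetMethod("WriteLine", new[] { typeof(string) }));

        // Every method body must end by returning.
        ilGenerator.Emit(OpCodes.Ret);

        return (Action)method.CreateDelegate(typeof(Action));
    }
}
```

Invoking the returned delegate, i.e. HelloIl.Build()(), runs the emitted IL and writes Hello World to the console.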

The easiest way to learn IL is to look at decompilation from tools such as ildasm, ILSpy, Reflector or dotPeek: write the code you wish to generate IL for, compile it, then decompile with one of these tools to see what’s going on.