Author Archives: purpleblob

Vert.x futures

In previous examples of AbstractVerticle implementations I’ve used start and stop methods which take no arguments. There are actually asynchronous versions of these methods which support the Vert.x Future class.

For example

public class FutureVerticle extends AbstractVerticle {
   @Override
   public void start(Future<Void> future) {
   }

   @Override
   public void stop(Future<Void> future) {
   }
}

Let’s take a look at how our start method might change to use futures.

@Override
public void start(Future<Void> future) {

   // routing and/or initialization code

   vertx.createHttpServer()
      .requestHandler(router::accept)
      .listen(port, l ->
      {
         if(l.succeeded()) {
            future.complete();
         }
         else {
            future.fail(l.cause());
         }
      });
}

In this example we simply set the state of the future to success or failure, and in the case of failure supply a Throwable as the argument to the fail method.

Using the Future in our own code

Obviously the Future class may be used outside of the start and stop methods, so let’s take a look at creating and using a Future.

To create a future simply use

Future<Record> f = Future.future();

in this case we’re creating a Future which takes a Record. We can now supply our own AsyncResult handler to handle the future on completion, i.e.

Future<Record> f = Future.future();

f.setHandler(ar ->
{
   if(ar.succeeded()) {
      // do something with result
   }
});

Many of the Vert.x methods (like listen in the earlier code) supply overloads with an AsyncResult callback. We can pass a future as such a callback using its completer method and supply a handler via the future. For example

// 'future' is the Future<Void> passed into start
Future<HttpServer> f = Future.future();
f.setHandler(l ->
{
   if(l.succeeded()) {
      future.complete();
   }
   else {
      future.fail(l.cause());
   }
});

vertx.createHttpServer()
   .requestHandler(router::accept)
   .listen(port, f.completer());

Shell scripting (Linux)

The shell script file

By default we name a script file with the .sh extension and the first line is usually (although not strictly required) one of the following

#!/bin/sh

OR

#!/bin/bash

The use of sh tells Linux we want to run the script with the default shell. Note that this might be dash, bash, bourne or any other shell available, hence when using sh the developer of the script needs to be aware that they cannot expect bash (for example) capabilities to exist, and therefore if bash-specific code exists within the script the #!/bin/bash line should be used.

Note: #! is known as the shebang (or sha-bang)

What shell am I running?

You can use $SHELL in your scripts or from the command line, for example

echo $SHELL

Making the script executable

chmod +x myscript.sh

Note: you may also see chmod 777 used; this works but grants read/write/execute permissions to everyone, whereas +x simply adds execute permission.
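As a quick sanity check we can create a trivial script, mark it executable and run it. This is just a sketch; the path /tmp/myscript.sh is hypothetical.

```shell
# create a trivial script (hypothetical path), make it executable, then run it
printf '#!/bin/sh\necho "hello from script"\n' > /tmp/myscript.sh
chmod +x /tmp/myscript.sh
/tmp/myscript.sh
# prints: hello from script
```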

Comments

Comments are denoted by #. Anything after the # until a new line will be treated as a comment, i.e.

echo "Some text" # write to stdout "Some text"

Variables

We can create variables, which are case-sensitive, like this

VARIABLE1="Hello"
variable2="World"

echo $VARIABLE1 $variable2

Note: You should not have spaces around the = operator or the command will not be found. So for example this will fail VARIABLE1 = “Hello”

Whilst we can use variables that are not strings, underneath they’re stored as strings and only converted to numerical values when used with numerical functions etc.

So for example

i=0
$i=$i+1

will fail with 0=0+1: command not found. To increment a variable (for example) we need to use the following syntax

let "i=i+1"
#OR
i=$((i+1))

we can also use operators such as ++ or +=, for example

let "i++"
#OR
((i++))
#OR
let "i+=1"
#OR
((i+=1))

We can also create arrays using the following syntax

array=("a" "b" "c")

and an example of indexing into this array is as follows

echo "Array element at index 1 is ${array[1]}"
# outputs Array element at index 1 is b

We can also remove, or unset, a variable like this

unset i

Logic operations

IF, THEN, ELSE, ELIF…

As our scripts become more capable/complex it’s likely we’ll want to use some logic and branching code, i.e. IF, THEN, ELSE, ELIF code. Let’s look at an example of IF, THEN syntax

if [ $i = 6 ] 
then 
   echo "i is correctly set to 6" 
fi

Note the space after the [ and before the ]; without these the script will error with command not found.

The [ ] syntax is the way you’ll often see this type of operation; it is actually an alternate syntax for the test command, so for example

test "$i" = 6 && echo "i is correctly set to 6"

Note: the example above shows the test on a single line; the && ensures the echo only runs when the test succeeds.

We can use = or -eq along with less than, greater than etc., however we need to use the right operators for the type of comparison: =, != and the (escaped) < and > perform string comparisons, whereas -eq, -ne, -lt, -gt etc. perform numerical comparisons.
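A small sketch of the difference (assuming bash, since the escaped < string comparison inside [ ] is a bash extension of test):

```shell
a=10
b=9

# numeric comparison: -gt compares the values 10 and 9
if [ "$a" -gt "$b" ]; then echo "numeric: a > b"; fi

# string comparison: \< compares lexically, and "10" sorts before "9"
if [ "$a" \< "$b" ]; then echo "string: a < b"; fi
```

Both lines print: the same pair of variables compares differently depending on the operator used.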

We can also use ELSE for example

if [ ! -d "BAK" ]
then
   echo "BAK does not exist"
else   
   echo "BAK exists"
fi

and finally ELIF

if [ -d "BAK" ]
then
   echo "BAK exists"
elif [ -d "BACK" ]
then
   echo "BACK exists"
fi

Check out If Statements!, which lists the operators in more depth.

[[ ]] vs [ ]

The [ ] is actually just an alias for test as mentioned above. BASH and some other shells also support [[ ]] syntax which is more powerful. See What is the difference between test, [ and [[ ? for more information.

Case (switch)

Extending the IF, THEN, ELSE, ELIF we also have a switch style comparison capability, for example

case $response in
   y|Y) echo "Executing script" ;;
   *) exit ;;
esac

The syntax for y|Y) is pattern matching and ) terminates the pattern. This is followed by one or more statements to be executed followed by the ;; terminator. The *) means match against anything else (or default condition). We then terminate the case block with esac. So in this example we’ll output “Executing script” if the response variable is either y or Y.

Loops

We can run commands in loops such as for, while and until.

While and until are very similar except that while keeps looping whilst a condition is true, whereas until loops until a condition becomes true. Here are a couple of simple examples

i=0
while [ $i -lt 10 ]
do
   echo $i
   ((i++))
done

until [ $i -lt 0 ]
do
   echo $i
   ((i--))
done

for loops use a similar syntax again, except they use the in keyword, for example

array=("a" "b" "c")
for item in ${array[@]}
do
  echo $item
done

This example demonstrates looping through an array, but we can also loop through items returned by shell commands, for example

for item in $(ls -a)
do
   echo $item
done

In this example we’re looping through the results of the command ls -a. Although a better solution to this might be

for item in ${PWD}/*
do
   echo $item
done

The ls version splits a file name containing spaces into multiple items, so it’s not much use if we want each file name intact, including spaces.
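A quick sketch of why the glob version is better (the /tmp/loopdemo directory here is hypothetical): glob expansion yields one item per file even when a name contains spaces, provided we also quote the variable when using it.

```shell
# set up a hypothetical directory containing a file name with spaces
mkdir -p /tmp/loopdemo
touch "/tmp/loopdemo/file with spaces.txt"

# glob expansion yields one item per file, spaces included;
# quoting "$item" preserves the name when echoing it
for item in /tmp/loopdemo/*
do
   echo "$item"
done
# prints: /tmp/loopdemo/file with spaces.txt
```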

Here’s a final example using the back tick (`) which can be used to enclose commands; in this instance we execute the command seq 1 10

for item in `seq 1 10`;
do
   echo $item
done   

Passing arguments to your shell script

Arguments are passed into your script via the command line as you’d normally do, i.e. in this example my shell script (myscript.sh) takes two arguments, Hello and World

./myscript.sh Hello World 

To reference the arguments in the script we simply use $1 and $2. i.e.

echo $1 # Should be Hello
echo $2 # Should be World

There’s also $@, which denotes all arguments, i.e.

echo "$@" 

Will output all the arguments passed into the script or function.
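A minimal way to see this without creating a script file: invoke bash -c with positional parameters (the underscore just fills $0, so Hello and World become $1 and $2).

```shell
# simulate passing command-line arguments; '_' fills $0
bash -c 'echo "$@"' _ Hello World
# prints: Hello World
```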

Functions

We can create functions inside our shell scripts and/or include other script files which have functions etc. within.

You need to declare the function before usage and writing a function is pretty simple, i.e.

say_hello()
{
    echo "say_hello called"
}

say_hello

To include arguments/parameters we use the same system as passing arguments via the command line, so for example

say_something()
{
    echo "say_something says $1 $2"
}

say_something Hello World
# outputs say_something says Hello World

here we see the arguments are turned into the $1 and $2 variables, but of course local to our function.

STDIN/STDOUT/STDERR

We’ve already seen that echo is the equivalent of output to STDOUT in its default usage, although it can be used to output to STDERR; see Illustrated Redirection Tutorial.

We can use read to read input from the user/command line via STDIN.

In its most basic use we can write the following

read input 

Where input is a variable name.

We can also use it in slightly more powerful ways, such as

read -n1 -p "Are you sure you wish to continue (y/n)?" input

In this case we read a single character (-n1) with the prompt (-p) “Are you sure you wish to continue (y/n)?” into the variable named input.
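The read and case examples combine naturally into a confirmation prompt. A sketch, assuming bash (read -n1 is a bash option):

```shell
# prompt for a single character and branch on the answer
read -n1 -p "Are you sure you wish to continue (y/n)? " response
echo    # move to a new line after the single-character read
case $response in
   y|Y) echo "Executing script" ;;
   *) exit ;;
esac
```

With an input of y (or Y) this prints “Executing script”; anything else exits.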

The read function can also be used to read data from a file by using a file descriptor and the argument -u.
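A sketch of reading from a file via a file descriptor (assuming bash; the /tmp/read_demo.txt file is made up for the example):

```shell
# create a sample file for the sketch (hypothetical path)
printf 'first line\nsecond line\n' > /tmp/read_demo.txt

# open file descriptor 3 for reading, read one line from it, then close it
exec 3< /tmp/read_demo.txt
read -u 3 line
echo "$line"
exec 3<&-
# prints: first line
```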

References

Loops for, while and until

ZooKeeper

Why do we need ZooKeeper

The first question I had to ask myself was, why do we need ZooKeeper? After all I could store/publish the host/port etc. to a Redis server, a database or just a plain centrally located web service (which for all I care could store the information to a simple file).

ZooKeeper has been designed specifically for handling configuration and name registry functionality in a distributed and clustered environment, hence comes with more advanced features to ensure consistency of data along with capabilities to handle cluster management of the servers.

ZooKeeper in Docker

I’m using ZooKeeper within docker.

To run a ZooKeeper instance within a Docker container, simply use

docker run --name zookeeper1 --restart always -d zookeeper

My first instance is named zookeeper1. This command will run a server instance of ZooKeeper.

We may also need to attach to the service with a client, we can run the following command

docker run -it --rm --link zookeeper1:zookeeper zookeeper zkCli.sh -server zookeeper

Ensure the name of the Docker instance matches the name you assigned to the server.

Client commands

  • create We can create a path, in /root/sub format, to our data, for example create /services “hello”. Note: it seems that to create child nodes we cannot just type create /services/hello-service “hello”; we need to first create the root node, then the child node.
  • ls We can list nodes by typing ls / or using root and/or child nodes, for example ls /services/hello-service. If nodes exist below the listed node the output will be [node-name], so for example ls /services will result in [hello-service]. When no child nodes exist we’ll get [].
  • get We can get any data stored at a node, so for example get /services/hello-service will display the data stored in the node along with metadata such as the date/time the data was stored, its size etc.
  • rmr We can recursively remove nodes (i.e. remove a node and all its children) using rmr. For example rmr /services.
  • delete We can delete an end node using delete, but this will not work on a node which has children (i.e. it works only on empty nodes).
  • connect We can connect the client to a running instance of a ZooKeeper server using connect, i.e. connect 172.17.0.2:2181
  • quit Exits the client.

There are other client commands, but these are probably the main ones; from the client, run help to see a full list of commands.

Some commands have a [watch] option, which can be enabled by supplying either 1 or true as the last argument to the relevant commands.

Useful References

ZooKeeper Usage 1 – 5

Service Discovery using ZooKeeper with Vert.x

Following on from my post Service discovery with Vert.x, let’s switch from using the Vert.x built-in Service Discovery to using ZooKeeper.

You can fire up a docker container running ZooKeeper using

docker run --name zookeeper --restart always -d zookeeper

Now we’ll need to add the following to our pom.xml

<dependency>
   <groupId>io.vertx</groupId>
   <artifactId>vertx-service-discovery-bridge-zookeeper</artifactId>
   <version>${vertx.version}</version>
</dependency>

Note: vertx.version is set to 3.5.1

The code to create the Service Discovery to ZooKeeper is, as follows

ServiceDiscovery discovery = ServiceDiscovery.create(vertx)
   .registerServiceImporter(new ZookeeperServiceImporter(),
      new JsonObject()
        .put("connection", "172.17.0.2:2181")
        .put("basePath", "/services/hello-service"));

Replace the IP address with the one created by Docker, or use localhost if you’re running locally.

We register the ZookeeperServiceImporter and supply at least the “connection”; the “basePath” is not required, as a default will be used if none is explicitly supplied.

Don’t forget you’ll need the import

import io.vertx.servicediscovery.zookeeper.ZookeeperServiceImporter;

References

Docker ZooKeeper
Vert.x Service Discovery

Code

Source code relating to these posts on Vert.x can be found at VertxSamples. The code differs from the posts in that in some cases it’s been refactored to reduce some code duplication etc.

Configuration with Vert.x

Obviously you can use whatever configuration code you like to configure your Verticles and/or Vert.x applications, you might use standard configuration files or a Redis store or whatever.

Vert.x has a single library which allows us to abstract such things and offers different storage formats and code to access different storage mechanisms.

Add the following dependency to your pom.xml

<dependency>
   <groupId>io.vertx</groupId>
   <artifactId>vertx-config</artifactId>
   <version>${vertx.version}</version>
</dependency>

vertx.version in my case is 3.5.0

We can actually define multiple storage mechanisms for our configuration data and chain them. For example, imagine we have some configuration on the localhost and other configuration on a remote HTTP server. We can set things up to get properties that are a combination of both locations.

Note: If location A has property port set to 8080 but location B (which appears after location A in the chain) has port set to 80 then the last property located in the chain is the one we’ll see when querying the configuration code for this property. For other properties the result is a combination of those from A and those from B.

We can also mark configuration sources as optional so that, if they do not exist or are down, the chaining will not fail or throw an exception.

Let’s start with a simple config.properties file stored in /src/main/resources. Here’s the file (nothing exciting here)

port=8080

We’ll now create the ConfigStoreOptions for a “properties” file, then add it to the store (we’ll just use a single store for this example) and finally we’ll retrieve the port property from the configuration retriever…

// create the options for a properties file store
ConfigStoreOptions propertyFile = new ConfigStoreOptions()
   .setType("file")
   .setFormat("properties")
   .setConfig(new JsonObject().put("path", "config.properties"));

// add the options to the chain
ConfigRetrieverOptions options = new ConfigRetrieverOptions()
   .addStore(propertyFile);
    // .addStore for other stores here

ConfigRetriever retriever = ConfigRetriever.create(vertx, options);
retriever.getConfig(ar ->
{
   if(ar.succeeded()) {
      JsonObject o  = ar.result();
      int port = o.getInteger("port");
      // use port config
   }
});

Note: Don’t forget to set the configuration store options “path” to the location and name of your file.

Service discovery with Vert.x

We may get to a point whereby we have multiple Vert.x applications running and we want one Verticle to communicate with another – this is easy enough if the IP address and port are fixed but not so easy in more scalable/real-world scenarios where we cannot guarantee these are fixed.

In such situations we can use service discovery to locate other services.

Before we get started with the code, we need to add the following to the pom.xml

<dependency>
   <groupId>io.vertx</groupId>
   <artifactId>vertx-service-discovery</artifactId>
   <version>${vertx.version}</version>
</dependency>

I’m using vertx.version 3.5.0 in my examples.

Publishing/Registering our Verticle with ServiceDiscovery

To register our Verticle with ServiceDiscovery we create a Record object which tells the ServiceDiscovery how to access a Verticle; this includes its host/IP, port and service root, along with a name for other code to use to locate the service. For example

Record record = HttpEndpoint.createRecord(
   "hello-service",
   "localhost",
   8080,
   "/hello");

So this basically says: create a Record named “hello-service” (the key or name of the service) whose IP/host is localhost (obviously this is just for my testing). Next we supply the exposed port and finally the root of the service.

We then publish this record to the ServiceDiscovery object like this

discovery.publish(record, ar ->
{
   if (ar.succeeded()) {
      // publication succeeded
      publishedRecord = ar.result();
   } else {
      // publication failed
   }
});

Upon success we store the Record (in this case we only do this if the call succeeded) so that we can unpublish the service when it’s shut down.

Let’s look at the full code for a simplified HelloVerticle

public class HelloVerticle extends AbstractVerticle {

    private ServiceDiscovery discovery;
    private Record publishedRecord;

    @Override
    public void start() {
        discovery = new DiscoveryImpl(vertx, 
           new ServiceDiscoveryOptions());

        Router router = Router.router(vertx);
        router.get("/hello").handler(ctx -> {
            ctx.response()
                .putHeader("content-type", "text/plain")
                .end("hello");
        });

        Record record = HttpEndpoint.createRecord(
                "hello-service",
                "localhost",
                8080,
                "/hello");

        discovery.publish(record, ar ->
        {
            if (ar.succeeded()) {
                // publication success
                publishedRecord = ar.result();
            } else {
                // publication failure
            }
        });

        vertx
           .createHttpServer()
           .requestHandler(router::accept)
           .listen(8080, ar -> {
              // handle success/failure 
           });
    }

    @Override
    public void stop() {
        if(discovery != null) {
            discovery.unpublish(publishedRecord.getRegistration(), ar ->
            {
                if (ar.succeeded()) {
                    // Success
                } else {
                    // cannot unpublish the service, 
                    // may have already been removed, 
                    // or the record is not published
                }
            });

            discovery.close();
        }
    }
}

Locating a service via ServiceDiscovery

Let’s take a look at some “consumer” code which will use service discovery to locate our “HelloVerticle”. As expected we need to create access to the ServiceDiscovery object and then try to locate a previously published Record.

In the example below we search for the “name” “hello-service”; this is wrapped into a JsonObject, and the result (if successful) will contain a Record which matches the search criteria. Using HttpClient we can now simply get a reference to this service and interact with it without ever knowing its IP address or port.

ServiceDiscovery discovery = ServiceDiscovery.create(v);
discovery.getRecord(
   new JsonObject().put("name", "hello-service"), found -> {
   if(found.succeeded()) {
      Record match = found.result();
      ServiceReference reference = discovery.getReference(match);
      HttpClient client = reference.get();

      client.getNow("/hello", response ->
         response.bodyHandler(
            body -> 
               System.out.println(body.toString())));
   }
});

HttpClient in Vert.x

Vert.x includes an HttpClient and associated code for interacting with HTTP protocols, obviously this can be used to write client applications or in situations where we might use a reference from service discovery.

This is a very short post which is just meant to demonstrate the client capability, which will be used in the next post (Service discovery with Vert.x).

HttpClient client = vertx.createHttpClient();
client.getNow(8080, "localhost", "/hello", response ->
{
   response.bodyHandler(
      body -> 
         System.out.println(body.toString()));
});

The HttpClient gives us get, post, head etc. The Now-postfixed methods tend to offer simpler syntax, including the supply of a callback, and are composable using a fluent style.

See also Creating an HTTP client for more information.

Benchmarking my Java code using JUnitBenchmarks

As part of building some integration tests for my Java service code, I wanted to get some (micro-)benchmarks run against the tests. In C# we have the likes of NBench (see my post Using NBench for performance testing), so it comes as no surprise to find libraries such as JUnitBenchmarks in Java.

Note: the JUnitBenchmarks site states it’s now deprecated in favour of using JMH, but I will cover it here anyway as it’s very simple to use and get started with and fits nicely in with existing JUnit code.

JUnitBenchmarks

First off we need to add the required dependency to our pom.xml, so add the following

<dependency>
   <groupId>com.carrotsearch</groupId>
   <artifactId>junit-benchmarks</artifactId>
   <version>0.7.2</version>
   <scope>test</scope>
</dependency>

JUnitBenchmarks, as the name suggests, integrates with JUnit. To enable our tests within the test runner we simply add a rule to the unit test, like this

public class SampleVerticleIntegrationTests {
    @Rule
    public TestRule benchmarkRule = new BenchmarkRule();

   // tests
}

This will report information on the test, like this

[measured 10 out of 15 rounds, threads: 1 (sequential)]
round: 1.26 [+- 1.04], round.block: 0.00 [+- 0.00], 
round.gc: 0.00 [+- 0.00], 
GC.calls: 5, GC.time: 0.19, 
time.total: 25.10, time.warmup: 0.00, 
time.bench: 25.10

The first line tells us that the test was actually executed 15 times (or rounds), but only 10 of those runs were “measured”; the other 5 were warm-ups, all on a single thread. This is the default for benchmarking, however what if we want to change these parameters…

If we want to be more specific about the benchmarking of various test methods we add the annotation @BenchmarkOptions, for example

@BenchmarkOptions(benchmarkRounds = 20, warmupRounds = 0)
@Test
public void testSave() {
   // our code
}

As can be seen, this is a standard test, but the annotation tells JUnitBenchmarks to run the test 20 times (with no warm-up runs) and then report the benchmark information, for example

[measured 20 out of 20 rounds, threads: 1 (sequential)]
round: 1.21 [+- 0.97], round.block: 0.00 [+- 0.00], 
round.gc: 0.00 [+- 0.00], 
GC.calls: 4, GC.time: 0.22, 
time.total: 24.27, time.warmup: 0.00, 
time.bench: 24.27

As you can see the first line tells us the code was measured 20 times on a single thread with no warm-ups (as we specified).

I’m not going to cover build integration here, but checkout JUnitBenchmarks: Build Integration for such information.

What do the results actually mean?

I’ll pretty much recreate what’s on Class Result here.

Let’s look at these results…

[measured 20 out of 20 rounds, threads: 1 (sequential)]
round: 1.21 [+- 0.97], round.block: 0.00 [+- 0.00], 
round.gc: 0.00 [+- 0.00], 
GC.calls: 4, GC.time: 0.22, 
time.total: 24.27, time.warmup: 0.00, 
time.bench: 24.27

We’ve already seen that the first line tells us how many times the test was run, and how many of those runs were warm-ups. It also tells us how many threads were used in this benchmark.

round tells us the average round time in seconds (hence the example took 1.21 seconds with a stddev of +/- 0.97 seconds).
round.block tells us the average (and stddev) time threads were blocked; in this example there’s no concurrency, hence 0.00.
round.gc tells us the average and stddev of the round’s GC time.
GC.calls tells us the number of times GC was invoked (in this example 4 times).
GC.time tells us the accumulated time taken invoking the GC (0.22 seconds in this example).
time.total tells us the total benchmark time which includes benchmarking and GC overhead.
time.warmup tells us the total warmup time which includes benchmarking and GC overhead.

Caveats

Apart from the obvious caveat that this library has been marked as deprecated (but I feel it’s still useful), when benchmarking you have to be aware that the results may depend upon outside factors, such as available memory, hard disk/SSD speed (if tests include any file I/O), network latency etc. So such figures are best seen as approximations of performance.

Also, there seems to be no way to “fail” a test, for example if the test exceeds a specified time or more GCs than some threshold are seen, so treat these results as informational.

Starting out with Ansible

In my previous post How to log into a Linux server without a password I generated ssh keys to allow me to log into remote Linux servers without a password. The main intention was to allow me to use ansible on a controller machine to interact with my servers.

Now let’s get started writing/running Ansible commands and playbooks.

If you want to test things on localhost we first need to edit the file /etc/ansible/hosts and add the following

localhost ansible_connection=local

We also edit the same file to add any remote hosts that we intend to connect to. We can group remote servers into named groups, so for example interacting with London based servers separately to New York servers.

To group servers within the hosts file we use syntax similar to Windows ini files, i.e.

[london]
ldn_server1
ldn_server2

[new_york]
ny_server1

Playbooks and Ad-hoc commands

A playbook is simply a YAML Ansible script that we can execute against one or more servers but in some cases we just want to execute ad-hoc commands (i.e. not bother creating a playbook).

Here’s an example of an ad-hoc command which will run df (the disk free command) against all our servers.

ansible all -m command -a "df"

As you can see, after the ansible command we list the servers (or group(s)) we want to run the command against; in this case we’re running the command against all servers. The -m switch tells Ansible which module to use, in this example the command module. As this is the default module we can omit it from the entered command and instead use

ansible all -a "df"

The -a switch denotes the arguments that we need to send to the command module. In this instance we’re sending the Linux command df.

Once run, this will display (for each server in our hosts file) the free disk space.

Creating a playbook

So ad-hoc commands are very useful, but we can take this to another level by creating scripts (known as playbooks in Ansible) to run our commands.

Let’s create the equivalent of the ad-hoc command we just ran.

Create a directory for your playbooks and then create the file df.yml, place the following code into it

---
- hosts: all
  tasks:
  - name: Run df across all servers
    command: df
    register: out
  - debug: msg={{out.stdout}}

Now from the folder containing the yml file run ansible-playbook df.yml

The --- can optionally be used to denote the start of a YAML file (and ... can optionally end one). For some reason most examples I’ve seen have the --- but not the ..., so I’ve included just the --- in this script, although it’s not strictly needed.

Playbooks use - hosts: to denote which servers we want the playbook to interact with, followed by the list of tasks. We can optionally name the tasks, then list the commands for each task.

In the above we create a single task to run the command df. By default Ansible will simply tell us whether the command ran, but we’ll probably want to see the output from the server calls, hence the - debug section and the register: out.

How to log into a Linux server without a password

Let’s assume we have two Linux systems, client and server.

On the client, carry out the following steps

  • ssh-keygen
  • By default saves to /home/username/.ssh/id_rsa
  • Enter a passphrase (if you want to add one to the key)
  • ssh-copy-id root@server (replace root@server with your user name and server IP or host name)
  • You should get prompted for the server’s password, enter this and the key will be copied

To test everything worked, execute the following

ssh root@server

(obviously, replacing root@server with your user name and server IP or host name) and if all worked you should log into the server without having to enter a password.

Now for each server that you need ssh access to without a password, simply run ssh-copy-id root@server with the user name and server that you want to connect to.