UI Automation Testing with Selenium

We’re going on a short journey into the realms of User Interface Automation Testing. This post is dedicated to Selenium, which is a web-based automation framework. It’s important to note that Selenium is aimed at desktop web and mobile web automation testing, not native desktop (i.e. Windows with WinForms, WPF, UWP or MAUI, Linux, or Mac with SwiftUI or UIKit) or native mobile such as Android and iOS.

An alternative web automation testing tool that I will probably look at in another post is Playwright, but of course there are other frameworks for this.

So why am I starting with Selenium? Well, I’m looking at UI automation testing for Windows and mobile apps, and one of the options for testing these is Appium, which has its roots in Selenium (at the very least in that it uses an extension to the Selenium JSON wire protocol). So I figured it’d be good to have a look at Selenium first, to get an idea of the similarities etc. Okay, this is getting too long-winded, let’s get into the code.

Note: Spoiler alert, whilst there are, of course, similarities between Appium and Selenium, the API differs in some cases. It’s all similar enough to make it an easy transition, but beware: desktop apps are not designed around a DOM like HTML, and some things, such as searching for elements using XPath (which we’ll see later), have a performance overhead on the desktop that is much less obvious on the web.

Anyway let’s continue with Selenium and the web for now…

Sample App

My sample app is going to be a React app (for no good reason other than I like to keep refreshing my React knowledge). The app is going to look awful, but I don’t care as all I want to do is automate it and test it. So create a React app using yarn create react-app my-app --template typescript and simply change App.tsx to look like this

import { useState } from 'react';
import './App.css';

function App() {
  const [state, setState] = useState({
    input: "",
    output: ""
  });

  return (
    <div className="App">
      <input type="text" value={state.input} onChange={ev => setState({
        ...state,
        input: ev.target.value
      })}/>
      <button onClick={() => setState({
        ...state,
        output: state.input
      })}>Copy</button>
      <div>{state.output}</div>
    </div>
  );
}

export default App;

Note: this code is a little horrible, but as stated, I’m really not interested in the sample app beyond automating it, so feel free to run your own web app using whatever tech. and code you prefer.

Also in index.html, change the title to Test App. We’ll use this to check our page has loaded before trying to interact with it.

So the UI above is a simple textbox which takes some input and a button that, when pressed, will copy the input to the output div. Like I said, it’s not pretty and it’s very basic, but it’s a good enough starting point as we’ll need to find elements, check them, send keystrokes to them, click a button and check the output text is as expected. We’ll add more as we go, but again I’m not wasting time making the UI or code look good.

Creating your test app.

Now I’m going to be using Selenium from C#. WHAT!?? I hear you cry. Well, Selenium has support for Java, Python, C# and more, so why not? (And of course I intend to move on to desktop and mobile testing, and that will be using C#, so we can compare the code etc. as we go.)

I’m going to be using SpecFlow to write my tests. If you prefer, ignore the SpecFlow parts of this post and just look at any SpecFlow generated methods as unit tests in your preferred unit testing framework. For this and my next post I will be using SpecFlow, NUnit and the FluentAssertions library. As I have the SpecFlow templates installed in Visual Studio, I’ve taken the following steps

  • From Create New Project, create a SpecFlow project
  • Ensure Add FluentAssertions Library is checked
  • Create the project
  • Add Nuget Package Selenium.WebDriver.ChromeDriver
  • Add Nuget Package Selenium.WebDriver

Let’s start with a very simple test, so changing the Calculator.feature you get from the project template to the following

Feature: Test Application

@Default
Scenario: Check initial state
* Load the page
* Check the default values are correct

Note: I’m using the shorthand syntax for Given, i.e. *, to simply define the steps of my test. Again, I’m just wanting to write the bare minimum definition and support code so we can concentrate on the Selenium bits, but if the above offends, please switch to Given, When, Then, And etc.

Once you generate your step definitions you end up with the following (but with a couple of code changes to allow the class to implement IDisposable)

[Binding]
public class TestApplicationStepDefinitions : IDisposable
{
   public void Dispose()
   {
   }

   [Given(@"Load the page")]
   public void GivenLoadThePage()
   {
   }

   [Given(@"Check the default values are correct")]
   public void GivenCheckTheDefaultValuesAreCorrect()
   {
   }
}

It’s probably obvious that we’re going to need to load the page from our sample app in the GivenLoadThePage method, then we’ll get the page elements and check them in GivenCheckTheDefaultValuesAreCorrect by asserting they are as expected.

  • Let’s run our React sample app using yarn start, so we have that available to us.
  • Let’s add a field to the class; this will contain the Chrome web driver (of course you can add packages for other drivers, such as Edge, and use those instead). The field, the constructor and the Dispose method now look like this
    private readonly ChromeDriver _chromeDriver;
    
    public TestApplicationStepDefinitions()
    {
       _chromeDriver = new ChromeDriver();
    }
    
    public void Dispose()
    {
       _chromeDriver.Dispose();
    }
    
  • Next let’s add the following code to GivenLoadThePage
    _chromeDriver.Navigate().GoToUrl("http://localhost:3000");
    

    Of course this assumes your web page is running on localhost:3000, so change this to suit. This code will basically start up an instance of Chrome and navigate to the URL, but as there’s no further test/interaction code at this point, it’ll then be disposed of and closed.

  • Let’s actually check we really did load Chrome and our page was displayed, so following the Navigate line add the following
    _chromeDriver.Title.Equals("Test App", StringComparison.OrdinalIgnoreCase)
       .Should().BeTrue();
    

    Basically we want to check the page has the expected title before we run further test scenarios.

So the first thing you might find is that the navigation to the page and the tests are so quick you have no idea what was being “seen” by Selenium and therefore whether things were even working. Of course the test we just wrote gives us some confidence, but if you have a typo in the name of your application (for example) you’ll have no idea what went wrong unless you log or write information to the console as well. So I’d recommend that when carrying out your tests you log (to the console at least) some information to help tell you things, such as which element couldn’t be found etc. However, for brevity we’ll ignore such logging for now.
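If you do want that kind of console output, a minimal sketch might look like the following. FindElementLogged is a hypothetical extension method of my own naming (it is not part of Selenium); it simply wraps FindElement and writes to the console whether the lookup succeeded before letting any exception carry on to the test runner.

```csharp
using System;
using OpenQA.Selenium;

public static class LoggingSearchExtensions
{
    // Hypothetical helper, not part of Selenium: wraps FindElement so that
    // a failed lookup is written to the console before the exception propagates.
    public static IWebElement FindElementLogged(this ISearchContext context, By by)
    {
        try
        {
            var element = context.FindElement(by);
            Console.WriteLine($"Found element using {by}");
            return element;
        }
        catch (NoSuchElementException)
        {
            Console.WriteLine($"Could NOT find element using {by}");
            throw; // rethrow so the test still fails
        }
    }
}
```

As the driver is itself a search context, this could be called as _chromeDriver.FindElementLogged(By.Name("input")) in place of FindElement.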

Let’s now fill in the second step we created – this checks the elements are in a default state, i.e. input field is empty, output field is empty and button exists – this will help give us confidence that further tests are not tainted by any incorrect default or starting state.

Selenium uses the method FindElement to locate elements via name, class, id etc., with the By class defining what we’re using to locate an element. So let’s add some code to the GivenCheckTheDefaultValuesAreCorrect method; it should look like this

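// For an <input>, .Text is always empty; what the user typed is in the value attribute
_chromeDriver.FindElement(By.Name("input")).GetAttribute("value")
   .Should().BeEmpty();
_chromeDriver.FindElement(By.Name("copy"))
   .Should().NotBeNull();
_chromeDriver.FindElement(By.Name("output")).Text
   .Should().BeEmpty();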

So as you can see, we’re looking for each element and, in the input and output case, checking that the text is empty. In the case of the copy element (the button) we’re basically just checking it exists. Of course we could simply ignore the button until we try to use it, that’s down to the person writing the tests, but in this case I’ll check everything exists as expected.
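Name is only one of the locators on the By class. As a rough sketch of some of the others, here is how the same sample app could be searched by class, tag and CSS selector. The LocatorExamples class and its Explore method are just names I’ve made up for illustration, and the selectors assume the markup from App.tsx shown earlier.

```csharp
using System;
using OpenQA.Selenium;

public static class LocatorExamples
{
    // Illustrative only: assumes the sample app's markup
    // (<div className="App"> containing an input, a button and a div).
    public static void Explore(IWebDriver driver)
    {
        // By.ClassName: the outer App div
        IWebElement app = driver.FindElement(By.ClassName("App"));

        // By.TagName: the first <input> on the page
        IWebElement input = driver.FindElement(By.TagName("input"));

        // By.CssSelector: a button that is a direct child of the App div
        IWebElement button = driver.FindElement(By.CssSelector("div.App > button"));

        // Elements are search contexts too, so a lookup can be scoped to a parent
        IWebElement scopedButton = app.FindElement(By.TagName("button"));

        // FindElements returns an empty collection when nothing matches,
        // whereas FindElement throws NoSuchElementException
        int missing = driver.FindElements(By.Name("does-not-exist")).Count;

        Console.WriteLine($"{input.TagName}, {button.Text}, {scopedButton.Text}, {missing}");
    }
}
```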

If we now run this scenario via the Visual Studio Test Runner, the first test for the title should pass and then the second will fail. It’ll fail because we never assigned names to our elements within our React test app. Now if we add name="input" to the input element in our test app then that check will pass, but the button fails. Of course we could add a name to this element, but it’s got the text “Copy”, so why not simply look for a button with the text “Copy”? We can do that using the By.XPath version of FindElement, so change the “copy” line to the following

_chromeDriver.FindElement(By.XPath("//button[contains(text(), 'Copy')]"))
   .Should().NotBeNull();

Running the test through the Test Runner now shows the first two assertions were met, but the output element cannot be found. Divs don’t have a name attribute, but they do have an id, so change the test app div to include id="output". Obviously we’ll also need to change our FindElement from By.Name to By.Id, i.e.

_chromeDriver.FindElement(By.Id("output")).Text
   .Should().BeEmpty();

Okay, so we actually learned a fair amount there about how to get our app up and running and interact with (at least by finding) elements on the screen (or within the DOM). Now let’s interact with the page…

Interacting with our application

Once we find an element we can call methods on it such as SendKeys and Click. For our next test we want to input some string into the input element, click the button and expect to see the same text in the output element, so I’ve created a new scenario

@Copy
Scenario: Check input is copied to output
* Load the page
* Copy input to output

Notice that when each scenario ends, at this time we are disposing of the Chrome driver, and so the next scenario then needs to start the driver and navigate to the page again. Of course there are ways to change our code to stop this, but for now I’m not worried, the app is small and fast to load. So let’s create the step definition for “Copy input to output”

[Given(@"Copy input to output")]
public void GivenCopyInputToOutput()
{
   _chromeDriver.FindElement(By.Name("input")).SendKeys("Hello World");
   _chromeDriver.FindElement(By.XPath("//button[contains(text(), 'Copy')]")).Click();
   _chromeDriver.FindElement(By.Id("output")).Text
      .Should().Be("Hello World");
}

In the above we find the input element, send the string “Hello World” to it, then press the Copy button and then assert that the output has the expected string (i.e. “Hello World”). As we know that each scenario gets a clean/default webpage, we do not need to check the output element is empty first (i.e. to make sure it’s not somehow already got “Hello World” in it). Of course we could pass different strings into this step via Cucumber’s/SpecFlow’s Examples keyword; I’ll leave that to the reader to play with if they wish.

If all went according to plan, both scenarios will complete successfully.
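On the point about each scenario starting and disposing of its own Chrome driver: one possible way around that (a sketch only, and not something I’ve used in this post) is to create the driver once per test run using SpecFlow’s BeforeTestRun and AfterTestRun hooks. SharedChromeDriver is a made-up class name, and this assumes scenarios are not run in parallel.

```csharp
using OpenQA.Selenium.Chrome;
using TechTalk.SpecFlow;

[Binding]
public sealed class SharedChromeDriver
{
    // One driver for the whole test run; step definition classes would read
    // this instead of creating (and disposing of) their own.
    public static ChromeDriver Driver { get; private set; }

    // BeforeTestRun/AfterTestRun hooks must be static methods
    [BeforeTestRun]
    public static void StartDriver()
    {
        Driver = new ChromeDriver();
    }

    [AfterTestRun]
    public static void StopDriver()
    {
        Driver?.Dispose();
    }
}
```

The step definitions would then use SharedChromeDriver.Driver and drop their own constructor and Dispose code. The trade-off is that browser state now carries over between scenarios, so every scenario should still begin by loading the page to get back to a known starting state.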

What next?

In this post I’m not going to dig into all the features of Selenium, as what I’ve listed here are the key starting points. However, there are things to be aware of and gotchas awaiting.

For example, one of the biggest issues is race conditions (and this relates to ALL versions of UI automation testing). The first of these might be down to whether elements are currently on the screen, i.e. maybe they’re only written to the DOM when required. Selenium includes implicit waits using

_chromeDriver.Manage().Timeouts().ImplicitWait = TimeSpan.FromMilliseconds(5000);

As the Selenium documentation states, “An implicit wait is rarely the best solution”. It certainly doesn’t help with the next issue. What if your element IS in the DOM and visible, but the process of copying the input to the output takes a lot longer than it takes to find the output element? We will end up with the element in its default state and therefore the test will fail, as the assertion runs before the output is updated from the button click.

Let’s prove that point. I’m going to change our test app by first adding the following function

function wait(callback: () => void) {
   setTimeout(callback, 3000);
}

and now the code for the button looks like this

<button onClick={() => {
   wait(() => {
      setState({
         ...state,
         output: state.input
      });
   });
}}>Copy</button>

So what happens here is the button click runs the code after a 3s delay. The code all still works if you, as the user, watch the screen (although obviously it’s slower than expected), but for our test scenario… well, just run it and see.

I ran mine and got Expected _chromeDriver.FindElement(By.Id("output")).Text to be "Hello World" with a length of 11, but "" has a length of 0, differs near "" (index 0)., so there’s a problem. If we introduce changes to our application which slow things down enough, then our tests that once passed will now fail.

As stated earlier, implicit waits are not the best solution even for elements not yet being visible, but for scenarios such as this an implicit wait is useless, because we got the element just fine. What we do need to do is wait until the Text changes.

Sadly there’s no such thing as an event to alert us when elements change, so we have to use explicit waits with conditional tests and timeouts. In other words, we’ll basically poll the element for its text and set a timeout after which we decide it’s never coming, something like this

// potential race condition
//_chromeDriver.FindElement(By.Id("output")).Text
//    .Should().Be("Hello World");

// explicit wait (still potential race condition, but hopefully less likely)
var wait = new WebDriverWait(_chromeDriver, TimeSpan.FromSeconds(10));
wait.Until(e => e.FindElement(By.Id("output")).Text == "Hello World")
   .Should().BeTrue();

So now we’re using WebDriverWait to wait until a condition is met, with a timeout of 10s. In other words, this will essentially poll the test application (the default polling interval is 500ms) and get the element (if the element doesn’t exist, the exception is handled for us). It will keep trying until the timeout, then will either return (in this case) true or throw a WebDriverTimeoutException because the condition was not met within the timeout.
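To make the polling idea a little more concrete, here’s a rough, hand-rolled sketch of that kind of loop in plain C#. This is not Selenium’s implementation; Poll.Until is a hypothetical helper of my own, and as a simplification it swallows every exception thrown by the condition and retries, whereas WebDriverWait is more selective about what it ignores.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class Poll
{
    // Hypothetical helper, not part of Selenium: evaluates the condition
    // repeatedly until it returns true or the timeout elapses.
    public static bool Until(Func<bool> condition, TimeSpan timeout, TimeSpan interval)
    {
        var stopwatch = Stopwatch.StartNew();

        while (true)
        {
            try
            {
                if (condition())
                {
                    return true;
                }
            }
            catch (Exception)
            {
                // Simplification: treat any failure (e.g. element not found yet)
                // as "not ready" and try again on the next pass.
            }

            if (stopwatch.Elapsed >= timeout)
            {
                throw new TimeoutException(
                    $"Condition not met within {timeout.TotalSeconds} seconds");
            }

            Thread.Sleep(interval);
        }
    }
}
```

With something like this, the check on the output element would be the condition passed in, with a 10 second timeout and a 500ms interval. In real tests I’d stick with WebDriverWait; the sketch is only there to show there’s no magic involved.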

Similarly, for testing whether an element exists, i.e. one that might get created dynamically, we would just use

var wait = new WebDriverWait(_chromeDriver, TimeSpan.FromSeconds(10));
// Until returns the element once FindElement succeeds, then we assert on its text
wait.Until(e => e.FindElement(By.Id("output"))).Text
    .Should().Be("Hello World");

Actions API

There are more APIs than just those listed thus far, but to go too much further would mean this post turns into an API tutorial, a little outside the realms of a simple blog post. Let’s end on one more API feature that we need to know about, and that’s the Actions API.

We’ve used the WebElement to Click and SendKeys, these are seen as “high-level interactions”. Sometimes we want to go a little more “low-level”, this is where the Actions API comes in.

Actions are low level in the sense that you can generate individual key down and key up actions, for situations where you might need to send CTRL+SHIFT+A, for example. In this case you’ll need a CTRL key down, then a SHIFT key down, then a key press for A and finally, in reverse order, a SHIFT key up and a CTRL key up. With SendKeys on an element you don’t get that explicit control over exactly when each key is pressed and released.
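As a sketch of that CTRL+SHIFT+A sequence using the Actions class: KeyChords and SendCtrlShiftA are names I’ve invented for the example, and clicking the target element first is just my assumption about how you’d want to give it focus.

```csharp
using OpenQA.Selenium;
using OpenQA.Selenium.Interactions;

public static class KeyChords
{
    // Illustrative helper: sends CTRL+SHIFT+A to the given element,
    // spelling out every key down and key up.
    public static void SendCtrlShiftA(IWebDriver driver, IWebElement target)
    {
        new Actions(driver)
            .Click(target)           // give the element focus first
            .KeyDown(Keys.Control)   // CTRL down
            .KeyDown(Keys.Shift)     // SHIFT down
            .SendKeys("a")           // press and release A while both are held
            .KeyUp(Keys.Shift)       // release in reverse order
            .KeyUp(Keys.Control)
            .Perform();              // nothing happens until Perform is called
    }
}
```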

Actions also allow you to essentially put together a whole bunch of actions in a single command. For example, the following creates Actions by moving to the Copy button, then double clicking on it. The Perform method invokes the sequence of actions

var button = _chromeDriver.FindElement(By.XPath("//button[contains(text(), 'Copy')]"));
var actions = new Actions(_chromeDriver);
actions
   .MoveToElement(button)
   .DoubleClick()
   .Perform();