Category Archives: UI Automation Testing

The “Custom” control type and WinAppDriver/Appium

So you’ve an application that you want to UI automation test using WinAppDriver/Appium. You’ve got a property grid with the left hand side being the text/label and the right hand side being the editor. You’ve decided that a cool way to change values on the edit controls is to inspect what the ControlType is, then customise the code to SendKeys or Click or whatever on those controls.

Sound fair?

Well all this is great if your controls are not (as the title of this post suggests) “Custom” controls. So for WPF this means a UserControl or Control. This is fine if we have a single custom control but not so good if we have multiple custom control types.

This issue raised its head due to a HorizontalToggle control which we’re importing into our application via a NuGet package. The control derives from Control and is pretty much invisible to the UI Automation code apart from one automation id, “SwitchThumb”. So to fix this I wrapped the control in a UserControl and added an AutomationProperties.AutomationId attached property. Of course, we could get the source if it’s available and change the code ourselves, but then we’d have to handle upgrades etc. which may or may not be an issue in the future.

That’s great, now I can see the control but I have some generic code that wants to know the control type, so what can we do on this front?

The truth is we’re still quite limited in what we can do, if we’re getting all elements and trying to decide what to do based upon the ControlType. TextBoxes are Edit control types, Buttons are Button control types, but UserControls are still Custom control types.

Whilst this is NOT a perfect solution, we can derive a class from UserControl (which will still be used to wrap the original control). Let’s call ours HorizontalToggleControl; it looks like this

public class HorizontalToggleControl : UserControl
{
   protected override AutomationPeer OnCreateAutomationPeer() => 
      new HorizontalToggleControlAutomationPeer(this);
}

What we’re doing here is taking over the OnCreateAutomationPeer and supplying our own automation peer, which will itself allow us to override some of the automation properties, specifically in our case the GetAutomationControlTypeCore.

My HorizontalToggleControlAutomationPeer class looks like this

internal class HorizontalToggleControlAutomationPeer : 
   UserControlAutomationPeer
{
   public HorizontalToggleControlAutomationPeer(UserControl owner) :
      base(owner)
   {
   }

   protected override AutomationControlType GetAutomationControlTypeCore() => 
      AutomationControlType.Thumb;

   protected override string GetLocalizedControlTypeCore() =>
      nameof(HorizontalToggleControl);

}

Now what’s happening in the above code is that we’re creating a localized control type name, “HorizontalToggleControl”. Of course this could literally be localised and read from the resources, but in our case we’re sticking with the actual control name. On its own this is unfortunately still no use to us, as the ControlType of the element will still read as Custom. Changing the GetAutomationControlTypeCore return value fixes this, but at the expense of only being able to set the control type to one of the AutomationControlType enum values. So it’s of limited use, but as mentioned previously, we only really see the SwitchThumb automation id on the original control and so Thumb seemed like a possible control type. In reality we might prefer CheckBox, but of course the downside here is that if we also have real check box elements, we’d need to look at the automation name or another property to determine what type of check box this is: a real Windows one or one that merely acts like a check box. Either way of doing this is fine.
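To make the idea of "decide what to do based upon the ControlType" a little more concrete, here’s a minimal sketch of the kind of generic dispatch code I have in mind. It deliberately has no WinAppDriver dependency: ElementInfo and EditorDispatcher are hypothetical names of my own, and the control type strings are illustrative assumptions about what your driver reports, so check them with an inspection tool before relying on them.

```csharp
// Hypothetical snapshot of the automation properties our generic code reads.
public record ElementInfo(string ControlType, string LocalizedControlType, string AutomationId);

public static class EditorDispatcher
{
    // Returns the name of the interaction we'd perform for a given element.
    public static string ChooseInteraction(ElementInfo element) =>
        element switch
        {
            // Our wrapped toggle reports Thumb, but the localized control
            // type (set in the automation peer) identifies it precisely.
            { LocalizedControlType: "HorizontalToggleControl" } => "Click",
            { ControlType: "ControlType.Edit" } => "SendKeys",
            { ControlType: "ControlType.Button" } => "Click",
            { ControlType: "ControlType.CheckBox" } => "Click",
            // Anything else (including Custom) needs a closer look.
            _ => "Unsupported"
        };
}
```

Checking the localized control type first is one way of telling a control that merely acts like a check box from a real one; matching on the automation id would work just as well.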

Appium, WinAppDriver and UI testing

I’ve used TestStack.White (and other) UI automation/testing tools in the past for WPF/WinForms testing. It looks like this project has been deprecated along with the Microsoft offering as part of Visual Studio Enterprise, i.e. CodedUI.

The blurb on the Microsoft site regarding CodedUI deprecation suggests the alternative for desktop application testing is Appium and WinAppDriver.

Appium is a framework for native, hybrid and mobile web apps. It has multi-language support (just like Selenium) and is based on the mobile JSON wire protocol, which is an extension to the Selenium JSON wire protocol. It also shares similarities in its API with Selenium, but Selenium is a web automation framework, whereas Appium allows us to run UI automation tests against native mobile and desktop apps.

So where does the WinAppDriver (in the post’s title) come into this?

WinAppDriver is a Microsoft driver that Appium calls into via its API. WinAppDriver runs as a server on your local machine and provides automation services for UWP, WPF and WinForms.

Note: Whilst there are aspects of WinAppDriver that are on GitHub, the actual source for the server is not.

The idea is WinAppDriver (WAD) is run, we use an Appium NuGet library to interact with it via the Appium API.

As stated, this post is all about the Windows desktop, but Appium also works with mobile, so we’ll look at mobile in another post.

Setting things up

  • You’ll need WinAppDriver, look at the releases and download from there (I’m using v1.2.1)
  • Install WinAppDriver, note where it’s installed, it’ll be something like C:\Program Files (x86)\Windows Application Driver\
  • Now for WinAppDriver to work on Win10, you’ll need to go to Developer Settings and set Developer Mode to ON

We’re going to use Specflow to write our tests with. Although this is immaterial to the actual UI automation testing, it seems that it’s a tech. that’s often used alongside UI Automation testing.

Note: If you do not have the Specflow extension installed in Visual Studio then go and add that, it’ll add some nice syntax highlighting as well as templates etc.

Testing our app.

Let’s create a simple WPF application like the one in my previous Selenium post, it’ll contain a TextBox for input, a Button which when clicked will copy the text from input to a Label which is our output – simple…

Here’s some XAML you can slot into a MainWindow.xaml

<Grid>
   <Grid.RowDefinitions>
      <RowDefinition Height="Auto" />
      <RowDefinition Height="Auto" />
      <RowDefinition Height="Auto" />
   </Grid.RowDefinitions>
   <TextBox Grid.Row="0" Margin="10" Text="{Binding Input}" />
   <Button Grid.Row="1" Margin="10" Command="{Binding ProcessCommand}">Copy</Button>
   <Label Grid.Row="2" Margin="10" Content="{Binding Output}"/>
</Grid>

and here’s the view model (which uses the MVVM Community toolkit source generators)

public partial class MainWindowViewModel : ObservableObject
{
    [ObservableProperty] private string input;
    [ObservableProperty] private string output;

    [RelayCommand]
    private void Process()
    {
        Output = Input;
    }
}

Creating the tests

Now create a new class library (we could simply use the Specflow template but let’s do things by hand to see all the nitty gritty)

  • Create a NUnit test project (or whatever test framework you prefer)
  • As we’re going to use Specflow.NUnit, add the nuget package
  • Also add Specflow.Actions.WindowsAppDriver nuget package
  • Add specflow.actions.json to the root of the test project folder, it should look something like this
    {
      "windowsAppDriver": {
        "capabilities": {
          "app": "path and exe of your application"
        },
        "WindowsAppDriverPath": "path on WinAppDriver including WinAppDriver.exe"
      }
    }
    
  • Create a folder named Features and another named Steps within our project
  • Add a new item, choose a Specflow feature and I’ve named mine TestApplication.feature

Let’s get into the code

We’re going to follow the basic steps we took for the Selenium testing, so in our TestApplication.feature we’ll start with the first scenario

Feature: Test Application

@Default
Scenario: Check initial state
* Check the default values are correct

Now Specflow.Actions.WindowsAppDriver actually gives us an AppDriver that we can constructor inject into our step definitions, so we don’t have to set anything up if all the defaults are as expected. We’re also not loading the application as part of the scenario; this will happen automatically when the AppDriver is created by the Specflow library. So we’ll just check the defaults of the page exist and are correct on startup.

In our TestApplicationStepDefinitions.cs generated from the Specflow feature file we have

[Binding]
public class TestApplicationStepDefinitions : IDisposable
{
    private readonly AppDriver _windowsDriver;

    public TestApplicationStepDefinitions(AppDriver appDriver)
    {
        _windowsDriver = appDriver;
        _windowsDriver.Current.Manage().Timeouts().ImplicitWait = TimeSpan.FromMilliseconds(5000);
    }

    public void Dispose()
    {    
        _windowsDriver.Current.Quit();
        _windowsDriver.Dispose();
    }

    [Given(@"Check the default values are correct")]
    public void GivenCheckTheDefaultValuesAreCorrect()
    {
    }
}

If we run the test scenario the test application will display and then, on dispose, it quits and closes. So far, so good. Let’s now implement the GivenCheckTheDefaultValuesAreCorrect method.

We need to add names or ids to our WPF controls so that we can locate our elements. WAD (and Windows in general) uses Name on things like WinForms, whereas in WPF we should use AutomationIds like this

AutomationProperties.AutomationId="input"

So for example we would change our XAML to add AutomationIds like this

<TextBox Grid.Row="0" AutomationProperties.AutomationId="input" Margin="10" Text="{Binding Input}" />
<Button Grid.Row="1" AutomationProperties.AutomationId="copy" Margin="10" Command="{Binding ProcessCommand}">Copy</Button>
<Label Grid.Row="2" AutomationProperties.AutomationId="output" Margin="10" Content="{Binding Output}"/>

Now we can write the following into the GivenCheckTheDefaultValuesAreCorrect method

_windowsDriver.Current.FindElementByAccessibilityId("input").Text
   .Should().BeEmpty();
_windowsDriver.Current.FindElementByAccessibilityId("copy")
   .Should().NotBeNull();
_windowsDriver.Current.FindElementByAccessibilityId("output").Text
   .Should().BeEmpty();

Notice we’re using FindElementByAccessibilityId as there’s no FindElementByAutomationId. The AccessibilityId method of Appium maps to AutomationId within WAD and of course some of the Appium methods are meaningless to WAD, such as FindElementByCssSelector.

  • AccessibilityId maps to AutomationId
  • ClassName maps to ClassName
  • Name maps to Name

Tools

Before we move on with our second test scenario, a word on tools. Locating your elements is fairly easy when the app is liberally coated in AutomationIds, OR it’s an app you control and hence can add these ids to. In situations where you cannot edit the source you’ll need to look for one of the other potential keys to locate elements….

If you’ve got the Windows 10 SDK installed then check Program Files (for example C:\Program Files (x86)\Windows Kits\10\bin\10.0.19041.0\x64) for inspect.exe. This utility will show you the properties of different windows, and you can drill down to find the name, classname and/or automation id for different controls within your application.

Similar to inspect is UIAVerify which is an old tool but pretty good (it does crash occasionally though). Microsoft states this is a legacy tool and recommends Accessibility Insights which I’ve not used much yet, but looks nicer than inspect and with some of the capabilities of UIAVerify (and so far, crashes less).

Another alternative tool is the WinAppDriver UIRecorder. At the time of writing this seems a pretty basic tool, but one useful aspect is that it can grab the XPath for your elements.

Onto the second scenario

Okay so we’ve got our tools and now want to write a second scenario, as listed below (just add to your current .feature file)

@Copy
Scenario: Check input is copied to output
* Copy input to output

If you’ve come from reading the previous Selenium post, you’ll not be learning anything new here; this is for anyone that’s come straight to this post. We’ll now create the GivenCopyInputToOutput method that maps to the step in this new scenario

[Given(@"Copy input to output")]
public void GivenCopyInputToOutput()
{
   _windowsDriver.Current.FindElementByAccessibilityId("input").SendKeys("Hello World");
   _windowsDriver.Current.FindElementByAccessibilityId("copy").Click();
   _windowsDriver.Current.FindElementByAccessibilityId("output").Text
      .Should().Be("Hello World");
}

If you run this scenario it should send the keys/string “Hello World” to the input field, it’ll then click the copy button and the output field should update to show the same text.
Now, this worked well because we are using the AutomationId. However, our button also has the Name property correctly set so we could use FindElementByName. Alternatively, let’s change things so we find the “copy” button using XPath and the Name property

_windowsDriver.Current.FindElementByXPath("//Button[@Name = 'Copy']").Click();

Note: When running against an application with a lot of elements, XPath can easily be 10-20s slower, so where possible try to stick to using name and automation id (and class name if that helps).

One thing the XPath option does offer though, that’s really useful, is the power of XPath itself, so we can search for elements which maybe change dynamically in a prescribed way. In the example XPath above we’re looking for a button with a name of “Copy”, but maybe we have a button whose name changes to “Copy First Name” depending on usage. Now with XPath we can write something like

_windowsDriver.Current.FindElementByXPath("//Button[contains(@Name, 'Copy')]").Click();

Hence we locate a button with a Name that contains the string “Copy”. This could of course be problematic if we have many buttons with the string “Copy”, in which case we can use FindElementsByXPath to get a collection of elements, then try to figure out what we want from there.
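As for "figuring out what we want" from a collection of matches, here’s one possible rule, sketched without any driver code. It assumes you’ve already pulled the Name of each matching element into a list; ButtonPicker and Pick are hypothetical names of my own, not part of Appium or WAD.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ButtonPicker
{
    // Given the Name of every element a "contains" query returned, prefer an
    // exact match; otherwise insist on exactly one partial match.
    public static string Pick(IEnumerable<string> names, string wanted)
    {
        var candidates = names.ToList();

        var exact = candidates.FirstOrDefault(n => n == wanted);
        if (exact != null)
        {
            return exact;
        }

        var partial = candidates.Where(n => n.Contains(wanted)).ToList();
        if (partial.Count == 1)
        {
            return partial[0];
        }

        // Zero or several candidates: fail loudly rather than guess.
        throw new InvalidOperationException(
            $"Expected one button matching '{wanted}' but found {partial.Count}");
    }
}
```

Failing when the match is ambiguous is a design choice; a test that silently clicks the first of several “Copy” buttons is much harder to debug than one that stops and tells you.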

What next?

My Selenium post, in the section headed What next?, covers the topic of race conditions and using the Actions API. The code for Appium is pretty much the same as that for Selenium but of course using the Windows driver and syntax/API. We’ll repeat some of that post here for completeness, but if you’ve read the Selenium post you’ll probably already be aware of the things mentioned here.

One of the biggest issues with UI Automation testing is the problem around race conditions, i.e. our automation test tries to locate an element that’s either not yet been displayed, or worse still, was displayed then hidden.

With regards to waiting for a UI element to appear, we can of course add some form of polling with a timeout (the Selenium API includes this). With regards to something that was displayed and then hidden before we were able to locate it, we’d obviously need to look at handling this in some fashion (most likely a combination of a timeout and looking for some other element that might tell us that the former has gone – for example a progress indicator may have disappeared, but if the controls are enabled we don’t care that we couldn’t locate the progress indicator).

Okay, so let’s look at what we can do to make things a little better…

Selenium (and Appium) has an implicit wait timer which can be used, but as the Selenium documentation states, “An implicit wait is rarely the best solution”. However it is an option, so let’s check out what it looks like

_windowsDriver.Current.Manage().Timeouts().ImplicitWait = TimeSpan.FromMilliseconds(5000);

It may be (and probably is) better that we use explicit waits within our code.

Note: I read somewhere that explicit and implicit in the same code can cause problems, so it sounds like it’s probably best to stick to one or the other for the most part or at least be aware of potential issues.

Let’s look at how we can wait for an element to appear. Let’s assume that our code’s Process method has changed to this rather crude (delayed update) code.

[RelayCommand]
private void Process()
{
   Task.Run(() =>
   {
      Thread.Sleep(3000);
   }).ContinueWith(tsk =>
   {
      Output = Input;
   });
}

In this code we very crudely simulate a 3 second operation taking place before Input is copied to Output. Assume it’s a web service call or whatever you like, but it will mean that our UI Automation test code will NOT see the change to Output immediately and thus the test will fail (an implicit wait does not help here, because the output element itself is found straight away). As you’ll have realised, we now need an explicit wait in this code. Basically we want to poll the UI every n milliseconds for the Output to change. We will do this for a given timeout, so if things have not updated in, say, 10 seconds then there’s an issue. Here we drop down to Selenium based code

var wait = new WebDriverWait(_windowsDriver.Current, TimeSpan.FromSeconds(10));
wait.Until(e => _windowsDriver.Current.FindElementByAccessibilityId("output").Text == "Hello World")
   .Should().BeTrue();

In the above code, we use the Selenium WebDriverWait (don’t worry about the prefix Web it works for non-Web as well). We tell it the driver to use and the timeout. I’ve put 10 seconds here, but the code will actually poll the UI every (by default) 500ms and when the condition is true it will complete, hence will stop as soon as the change is located or after the timeout period, whichever is first.

One caveat is, instead of using the e variable which will be an IWebDriver I use the _windowsDriver.Current as this includes the higher level methods such as FindElementByAccessibilityId.

Actions API

There are more APIs than just those listed thus far, but to go too much further would mean this post turns into an API tutorial, a little outside the realms of a simple blog post. Let’s end on one more API feature that we need to know about and that’s the Actions API.

We’ve used the WindowsElement (and therefore AppiumElement) to Click and SendKeys, these are seen as “high-level interactions”. Sometimes we want to go a little more “low-level”, this is where the Actions API comes in.

Actions are low level in the sense that you can generate keydown and keyup actions for situations where you might need to send keys such as CTRL+SHIFT+A. In this case you’ll need a CTRL keydown along with a SHIFT keydown, then a keypress for A and finally, in reverse order, a SHIFT keyup and a CTRL keyup. Obviously had these been sent as keys via SendKeys it would end up as a CTRL down and up followed by a SHIFT down and up and so on (i.e. no keys held down for the duration of the interaction).
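The ordering described above is easy to get wrong, so here’s a small driver-free sketch that just spells out the sequence of events for a chord. KeyChord and Expand are hypothetical helpers of my own for illustration; with the real Actions class you’d express the same order by chaining its KeyDown, SendKeys and KeyUp calls before Perform.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class KeyChord
{
    // Builds the ordered events for a chord such as CTRL+SHIFT+A:
    // modifiers go down in order, the key is pressed, then the
    // modifiers come back up in reverse order.
    public static List<string> Expand(IReadOnlyList<string> modifiers, string key)
    {
        var events = new List<string>();
        events.AddRange(modifiers.Select(m => $"{m} down"));
        events.Add($"{key} press");
        events.AddRange(modifiers.Reverse().Select(m => $"{m} up"));
        return events;
    }
}
```

Calling KeyChord.Expand(new[] { "CTRL", "SHIFT" }, "A") gives CTRL down, SHIFT down, A press, SHIFT up, CTRL up, i.e. both modifiers are held for the duration of the key press.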

Actions also allow you to essentially put together a whole bunch of actions in a single command. For example the following creates Actions by moving to the Copy button, then double clicking on it – the Perform method invokes the sequence of actions.

Here’s a simple example of the Actions API being used to double click a button

var button = _windowsDriver.Current.FindElementByAccessibilityId("copy");
var actions = new Actions(_windowsDriver.Current);
actions
   .MoveToElement(button)
   .DoubleClick()
   .Perform();

UI Automation Testing with Selenium

We’re going on a short journey into the realms of User Interface Automation Testing. This post is dedicated to Selenium which is a Web based automation framework. Now it’s important to note that Selenium is aimed at desktop web and mobile web automation testing – not native desktop, i.e. Windows, WinForms, WPF, UWP, Maui OR Linux, Mac (SwiftUI or UIKit) or native mobile such as Android and iOS.

An alternative web automation testing tool that I will probably look at in another post is Playwright, but of course there are other frameworks for this.

So why am I starting with Selenium? Well I’m looking at UI Automation testing Windows and mobile apps, and one of the options for testing these is Appium which has its roots in Selenium (at the very least in that it uses an extension to the Selenium JSON wire protocol), so I figured it’d be good to have a look at Selenium first, to get an idea of the similarities etc. Okay, this is getting too long winded, let’s get into the code.

Note: Spoiler alert, whilst there are, of course, similarities between Appium and Selenium, the API differs in some cases, but it’s all similar enough to make it an easy transition. Beware though that desktop apps are not designed around a DOM like HTML, and some things, such as searching for elements using XPath (which we’ll see later), have a performance overhead on the desktop that’s much less obvious on the web.

Anyway let’s continue with Selenium and the web for now…

Sample App

My sample app is going to be a React app (for no good reason other than I like to keep refreshing my React knowledge). The app. is going to look awful, but I don’t care as all I want to do is automate it and test it. So create a React app using yarn create react-app my-app --template typescript and simply change App.tsx to look like this

import { useState } from 'react';
import './App.css';

function App() {
  const [state, setState] = useState({
    input: "",
    output: ""
  });

  return (
    <div className="App">
      <input type="text" value={state.input} onChange={ev => setState({
        ...state,
        input: ev.target.value
      })}/>
      <button onClick={() => setState({
        ...state,
        output: state.input
      })}>Copy</button>
      <div>{state.output}</div>
    </div>
  );
}

export default App;

Note: this code is a little horrible, but as stated, I’m really not interested in the sample app beyond automating it, so feel free to run your own web app using whatever tech. and code you prefer.

Also in index.html, change the title to Test App; we’ll use this to check our page loaded before trying to interact with it.

So the UI above is a simple textbox which takes some input and a button that, when pressed, will copy the input to the output div. Like I said it’s not pretty and it’s very basic, but it’s a good enough starting point as we’ll need to find elements, check them, send keystrokes to them, click a button and check the output text is as expected – we’ll add more as we go – but again I’m not wasting time making the UI or code look good.

Creating your test app.

Now I’m going to be using Selenium from C# – WHAT !?? I hear you cry. Well Selenium has support for Java, Python, C# and more, so why not (and of course I intend to move to doing desktop and mobile testing and that will be using C#, so we can compare the code etc. as we go).

I’m going to be using Specflow to write my tests. If you prefer, ignore the Specflow parts of this post and just look at any Specflow generated methods as unit tests in your preferred unit testing framework. For this and my next post I will be using Specflow, NUnit and the FluentAssertions library. As I have the Specflow templates installed in Visual Studio, I’ve taken the following steps

  • From Create New Project, create a Specflow project
  • Ensure Add FluentAssertions Library is checked
  • Create the project
  • Add Nuget Package Selenium.WebDriver.ChromeDriver
  • Add Nuget Package Selenium.WebDriver

Let’s start with a very simple test, so changing the Calculator.feature you get from the project template to the following

Feature: Test Application

@Default
Scenario: Check initial state
* Load the page
* Check the default values are correct

Note: I’m using the short hand syntax for Given, i.e. * to simply define the steps of my test – again, I’m just wanting to write the bare minimum definition and support code so we can concentrate on the Selenium bits, but if the above offends, please switch for Given, And, When Then etc.

Once you generate your step definitions we end up with the following (but with a couple of code changes to allow the class to adhere to IDisposable)

[Binding]
public class TestApplicationStepDefinitions : IDisposable
{
   public void Dispose()
   {
   }

   [Given(@"Load the page")]
   public void GivenLoadThePage()
   {
   }

   [Given(@"Check the default values are correct")]
   public void GivenCheckTheDefaultValuesAreCorrect()
   {
   }
}

It’s probably obvious that we’re going to need to load the page from our sample app. in the GivenLoadThePage method then we’ll get the page elements and check them in GivenCheckTheDefaultValuesAreCorrect by asserting they are as expected.

  • Let’s run our React sample app using yarn start, so we have that available to us.
  • Let’s add a field to the class, this will contain the Chrome web driver (of course you can add packages for other drivers, such as Edge, and use those instead). The field, the constructor and the Dispose method now look like this
    private readonly ChromeDriver _chromeDriver;
    
    public TestApplicationStepDefinitions()
    {
       _chromeDriver = new ChromeDriver();
    }
    
    public void Dispose()
    {
       _chromeDriver.Dispose();
    }
    
  • Next let’s add the following code to GivenLoadThePage
    _chromeDriver.Navigate().GoToUrl("http://localhost:3000");
    

    Of course this assumes your web page is running on localhost:3000, so change this to suit. This code will basically start up an instance of Chrome and navigate to the URL, but as there’s no further test/interaction code at this point, it’ll then be disposed of and closed.

  • Let’s actually check we really did load Chrome and our page was displayed, so following the Navigate line add the following
    _chromeDriver.Title.Equals("Test App", StringComparison.OrdinalIgnoreCase)
       .Should().BeTrue();
    

    Basically we want to check the page has the expected title before we run further test scenarios.

So the first thing you might find is that the navigation to the page and the tests are so quick you have no idea what was being “seen” by Selenium and therefore whether things were even working. Of course the test we just wrote gives us some confidence, but if you have a typo in the name of your application (for example) you’ll have no idea what went wrong unless you log or write information to console as well – so I’d recommend that when carrying out your tests you log to console (at least) some information to help tell you things, such as which element couldn’t be found etc. However for brevity we’ll ignore such logging for now.
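Although we’re skipping logging in the rest of this post, here’s a minimal sketch of what I mean by logging which element couldn’t be found. Lookup and FindOrLog are hypothetical names of my own (not part of Selenium); the helper simply wraps any lookup in a try/catch, writes a message to the console and rethrows so the test still fails.

```csharp
using System;

public static class Lookup
{
    // Runs a lookup and, if it throws, logs which element we were after
    // before rethrowing so the failure still surfaces in the test runner.
    public static T FindOrLog<T>(string description, Func<T> find)
    {
        try
        {
            return find();
        }
        catch (Exception ex)
        {
            Console.WriteLine($"Could not find {description}: {ex.Message}");
            throw;
        }
    }
}
```

In a step definition it might be used as Lookup.FindOrLog("output div", () => _chromeDriver.FindElement(By.Id("output"))), so the console output names the element rather than leaving you to guess from the stack trace.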

Let’s now fill in the second step we created – this checks the elements are in a default state, i.e. input field is empty, output field is empty and button exists – this will help give us confidence that further tests are not tainted by any incorrect default or starting state.

Selenium uses the method FindElement to locate elements via name, class, id etc. using the By class to define what we’re using to locate an element, so let’s add some code to the GivenCheckTheDefaultValuesAreCorrect method, it should look like this

_chromeDriver.FindElement(By.Name("input")).Text
   .Should().BeEmpty();
_chromeDriver.FindElement(By.Name("copy"))
   .Should().NotBeNull();
_chromeDriver.FindElement(By.Name("output")).Text
   .Should().BeEmpty();

So as you can see, we’re looking for each element and in the input and output case, checking that the text is empty and in the copy element, the button’s state, we’re just basically checking it exists. Of course we could simply ignore the button until we try to use it – that’s down to the person writing the tests, but in this case I’ll check everything exists as expected.

If we now run this scenario via the Visual Studio Test Runner, the first test for the title should pass and then the second will fail – it’ll fail because we never assigned names to our elements within our React Test app. Now if we add name="input" to the input element in our Test App then that assertion will pass but the button fails. Of course we could add a name to this element, but it’s got the text “Copy”, so why not simply look for a button with the text “Copy”? We can do that using the By.XPath version of FindElement, so change the “copy” line to the following

_chromeDriver.FindElement(By.XPath("//button[contains(text(), 'Copy')]"))
   .Should().NotBeNull();

Running the test through the Test Runner now shows the first two assertions were met, but the output element cannot be found – divs don’t have a name attribute, but they do have an id, so change the Test App div to include id="output". Obviously we’ll also need to change our FindElement By.Name to By.Id, i.e.

_chromeDriver.FindElement(By.Id("output")).Text
   .Should().BeEmpty();

Okay so we actually learned a fair amount there about how to get our app up and running and interact (at least by finding) elements on the screen (or within the DOM). Now let’s interact with the page…

Interacting with our application

Once we find an element we can call methods on it such as SendKeys and Click, so for our next test we want to input some string into the input element, click the button and expect to see the same text in the output element, so I’ve created a new scenario

@Copy
Scenario: Check input is copied to output
* Load the page
* Copy input to output

Notice that when each scenario ends, at this time we are disposing of the Chrome driver and so the next scenario then needs to start the driver and navigate to the page again – of course there are ways to change our code to stop this, but for now I’m not worried, the app is small and fast to load. So let’s create the step definition for “Copy input to output”

[Given(@"Copy input to output")]
public void GivenCopyInputToOutput()
{
   _chromeDriver.FindElement(By.Name("input")).SendKeys("Hello World");
   _chromeDriver.FindElement(By.XPath("//button[contains(text(), 'Copy')]")).Click();
   _chromeDriver.FindElement(By.Id("output")).Text
      .Should().Be("Hello World");
}

In the above we find the input element, send the string “Hello World” to it, then press the Copy button and then assert that the output has the expected string (i.e. “Hello World”). As we know that each scenario gets a clean/default webpage we do not need to check the output element is empty first (i.e. to make sure it’s not somehow already got “Hello World” in it). Of course we could pass different strings into this step via Cucumber’s/Specflow’s Examples keyword – I’ll leave that to the reader to play with if they wish.

If all went according to plan both scenarios will complete successfully.

What next?

In this post, I’m not going to dig into all the features of Selenium as what I’ve listed here are the key starting points, however there are things to be aware of and gotchas awaiting.

For example one of the biggest issues is race conditions (and this relates to ALL versions of UI Automation Testing). The first of these might be down to whether elements are currently on the screen, i.e. maybe they’re only written to the DOM when required. Selenium includes implicit waits using

_chromeDriver.Manage().Timeouts().ImplicitWait = TimeSpan.FromMilliseconds(5000);

As the Selenium documentation states, “An implicit wait is rarely the best solution”. It certainly doesn’t help with the next issue. What if your element IS in the DOM and visible but the process of copying the input to output takes a lot longer than it takes to find the output element? We will end up with the element in its default state and therefore the test will fail, as the assertion runs before the output is updated from the button click.

Let’s prove that point. I’m going to change our Test app by first adding the following function

function wait(callback: () => void) {
   setTimeout(callback, 3000);
}

and now the code for the button looks like this

<button onClick={() => {
   wait(() => {
      setState({
         ...state,
         output: state.input
      });
   });
}}>Copy</button>

So what happens here is the button click runs the code after a 3s delay, so the code all still works if you, as the user, watch the screen (although obviously it’s slower than expected), but for our test scenario… well just run it and see.

I ran mine and got Expected _chromeDriver.FindElement(By.Id("output")).Text to be "Hello World" with a length of 11, but "" has a length of 0, differs near "" (index 0)., so there’s a problem – if we introduce changes to our application which slow things down enough then our tests that once passed will now fail.

As stated earlier implicit waits are not the best solution even for elements not yet being visible, but for scenarios such as this, an implicit wait is useless because we got the element just fine, what we do need to do is wait until the Text changes.

Sadly there’s no such thing as an event to alert us when elements change – so we have to use explicit waits with conditional tests and timeouts. In other words we’ll basically poll the element for its text and set a timeout at which we decide it’s never coming, something like this

// potential race condition
//_chromeDriver.FindElement(By.Id("output")).Text
//    .Should().Be("Hello World");

// explicit wait (still potential race condition, but hopefully less likely)
var wait = new WebDriverWait(_chromeDriver, TimeSpan.FromSeconds(10));
wait.Until(e => e.FindElement(By.Id("output")).Text == "Hello World")
   .Should().BeTrue();

So now we’re using WebDriverWait to wait until a condition is met with a timeout of 10s. In other words this will essentially poll the test application (the default polling interval is 500ms) and get the element (if an element doesn’t exist the exception is handled for us). It will keep trying until the timeout and will (in this case) either return True or throw a WebDriverTimeoutException as the condition was not met within the timeout.
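If it helps to see the polling idea without Selenium in the way, here’s a rough, driver-free sketch of the same behaviour. This is NOT how WebDriverWait is implemented, just my own illustration: Poll.Until is a hypothetical helper, InvalidOperationException stands in for the “element not found yet” exception and TimeoutException stands in for WebDriverTimeoutException.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class Poll
{
    // Re-evaluates the condition every 'interval' until it returns true,
    // or throws once 'timeout' has elapsed.
    public static bool Until(Func<bool> condition, TimeSpan timeout, TimeSpan interval)
    {
        var clock = Stopwatch.StartNew();
        while (true)
        {
            try
            {
                if (condition())
                {
                    return true;
                }
            }
            catch (InvalidOperationException)
            {
                // Stand-in for "element not found yet": swallow and keep polling.
            }

            if (clock.Elapsed >= timeout)
            {
                throw new TimeoutException(
                    $"Condition not met within {timeout.TotalSeconds}s");
            }

            Thread.Sleep(interval);
        }
    }
}
```

So Poll.Until(() => ReadOutputText() == "Hello World", TimeSpan.FromSeconds(10), TimeSpan.FromMilliseconds(500)) (where ReadOutputText is whatever lookup you like) completes as soon as the condition holds, which is why a generous timeout doesn’t slow down a passing test.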

As you can see, for testing whether an element exists – i.e. one that might dynamically get created – we would just wait for FindElement to succeed and then assert on the element that Until returns

var wait = new WebDriverWait(_chromeDriver, TimeSpan.FromSeconds(10));
wait.Until(e => e.FindElement(By.Id("output"))).Text
    .Should().Be("Hello World");

Actions API

There are more APIs than just those listed thus far, but to go too much further would mean this post turns into an API tutorial, a little outside the realms of a simple blog post. Let’s end on one more API feature that we need to know about and that’s the Actions API.

We’ve used the WebElement to Click and SendKeys, these are seen as “high-level interactions”. Sometimes we want to go a little more “low-level”, this is where the Actions API comes in.

Actions are low level in the sense that you can generate keydown and keyup actions for situations where you might need to send keys such as CTRL+SHIFT+A. In this case you’ll need a CTRL keydown along with a SHIFT keydown, then a keypress for A and finally, in reverse order, a SHIFT keyup and a CTRL keyup. Obviously had these been sent as keys via SendKeys it would end up as a CTRL down and up followed by a SHIFT down and up and so on.

Actions also allow you to essentially put together a whole bunch of actions in a single command. For example the following creates Actions by moving to the Copy button, then double clicking on it – the Perform method invokes the sequence of actions

var button = _chromeDriver.FindElement(By.XPath("//button[contains(text(), 'Copy')]"));
var actions = new Actions(_chromeDriver);
actions
   .MoveToElement(button)
   .DoubleClick()
   .Perform();