Category Archives: Appium

The “Custom” control type and WinAppDriver/Appium

So you’ve got an application that you want to UI automation test using WinAppDriver/Appium. You’ve got a property grid with the left-hand side being the text/label and the right-hand side being the editor. You decide that a cool way to change values on the edit controls is to inspect what the ControlType is, then customise the code to SendKeys or Click or whatever on those controls.

Sound fair?

Well, all this is great if your controls are not (as the title of this post suggests) “Custom” controls. For WPF, that means a UserControl or Control. This is fine if we have a single custom control but not so good if we have multiple custom control types.

This issue raised its head due to a HorizontalToggle control which we’re importing into our application via a NuGet package. The control derives from Control and is pretty much invisible to the UI Automation code apart from one automation id, “SwitchThumb”. So to fix this I wrapped the control in a UserControl and added an AutomationProperties.AutomationId attached property. Of course, we could get the source if it’s available and change the code ourselves, but then we’d have to handle upgrades etc., which may or may not be an issue in the future.

That’s great, now I can see the control but I have some generic code that wants to know the control type, so what can we do on this front?

The truth is we’re still quite limited in what we can do, if we’re getting all elements and trying to decide what to do based upon the ControlType. TextBoxes are Edit control types, Buttons are Button control types, but UserControls are still Custom control types.

Whilst this is NOT a perfect solution, we can derive a class from UserControl (which will still be used to wrap the original control). Let’s call ours HorizontalToggleControl; it looks like this

public class HorizontalToggleControl : UserControl
{
   protected override AutomationPeer OnCreateAutomationPeer() => 
      new HorizontalToggleControlAutomationPeer(this);
}

What we’re doing here is overriding OnCreateAutomationPeer and supplying our own automation peer, which in turn allows us to override some of the automation properties, specifically in our case GetAutomationControlTypeCore.

My HorizontalToggleControlAutomationPeer class looks like this

internal class HorizontalToggleControlAutomationPeer : 
   UserControlAutomationPeer
{
   public HorizontalToggleControlAutomationPeer(UserControl owner) :
      base(owner)
   {
   }

   protected override AutomationControlType GetAutomationControlTypeCore() => 
      AutomationControlType.Thumb;

   protected override string GetLocalizedControlTypeCore() =>
      nameof(HorizontalToggleControl);

}

Now what’s happening in the above code is that we’re creating a localized control name “HorizontalToggleControl”. Of course this could literally be localised and read from resources, but in our case we’re sticking with the actual control name. On its own this is, unfortunately, still no use to us, as the ControlType of an element will still read as Custom. Changing the GetAutomationControlTypeCore return value fixes this, but at the expense of only being able to set the control type to one of the AutomationControlType enum values. So it’s of limited use, but as mentioned previously, we only really see the SwitchThumb automation id on the original control, and so Thumb seemed like a possible control type. In reality we might prefer CheckBox, but of course the downside here is that if we have check box elements, we’d also need to look at the automation name or another property to determine what type of check box this is: a real Windows one or one that merely acts like a check box. Either way of doing this is fine.
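To make the trade-off a little more concrete, here’s a rough sketch of the kind of generic code that decides what to do based on the control type. It’s deliberately free of any Appium types: it assumes you’ve already read the control type name (“Edit”, “Button” and so on) and the automation id from the element as strings. The Interaction enum, the ControlTypeDispatch class and the “ends with Toggle” id convention are all made up for illustration.

```csharp
using System;

// Hypothetical set of things our generic test code knows how to do.
public enum Interaction { SendKeys, Click, Toggle, Unknown }

public static class ControlTypeDispatch
{
    // controlType is assumed to be the control type name as a plain string,
    // automationId is assumed to be the element's automation id (may be null).
    public static Interaction Choose(string controlType, string automationId)
    {
        switch (controlType)
        {
            case "Edit":
                return Interaction.SendKeys;
            case "Button":
                return Interaction.Click;
            case "CheckBox":
                return Interaction.Toggle;
            case "Thumb":
                // A Thumb is normally a drag handle, so the control type alone
                // is not enough. Here we also look at the automation id, using
                // an invented convention that our wrapped toggles end in "Toggle".
                return automationId != null
                       && automationId.EndsWith("Toggle", StringComparison.Ordinal)
                    ? Interaction.Toggle
                    : Interaction.Unknown;
            default:
                // "Custom" and anything else we do not recognise ends up here.
                return Interaction.Unknown;
        }
    }
}
```

The Thumb case shows the cost of borrowing an existing control type: once we do, the control type no longer uniquely identifies our control and we need a second property to tell the two apart.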

Is the checkbox checked on my AppiumWebElement?

From Appium/WinAppDriver you’ll probably want to check if a checkbox or radio button element is checked at some point. For such elements we use the AppiumWebElement Selected property to get the current state. To set it within UI automation we would click on it, so for example we might have code like this

public bool IsChecked
{
   get => Selected;
   set
   {
      if(IsChecked != value)
      {
         Click();
      }
   }
}

Note: In my case I wrap the AppiumWebElement in a CheckBoxElement, but this could be in a subclass or extension method, you get the idea.

Appium, Android and MAUI

Prerequisites

You’ll need Appium server (I’m using Appium Server GUI), which I installed into my Windows c:\tools\Appium folder. Next (whilst not a requirement, it is very useful) grab the latest version of Appium Inspector; this will allow us to inspect elements within our running Android application. I unzipped this to c:\tools\AppiumInspector.

Appium Server GUI has a button to run the inspector but, unless I’m missing something, this simply sends you to the GitHub repo for the inspector, so run the inspector yourself separately.

Creating a simple app.

Create yourself a MAUI application (mine’s called MauiUIAutomationTest, so you’ll see this listed in various places below; if you choose your own name, remember to update it in the code as well).

Like my previous Appium test samples the code is simple (I’ll recreate it here). The MainPage.xaml within the ContentPage looks like this

<Grid>
   <Grid.RowDefinitions>
      <RowDefinition />
      <RowDefinition />
      <RowDefinition />
   </Grid.RowDefinitions>

   <Entry Grid.Row="0" AutomationId="input" Margin="10" Text="{Binding Input}" />
   <Button Grid.Row="1" AutomationId="copy" Margin="10" Command="{Binding ProcessCommand}" Text="Copy" />
   <Label Grid.Row="2" AutomationId="output" Margin="10" Text="{Binding Output}"/>
</Grid>

I’m using MVVM Community Toolkit and so MainPageViewModel.cs is simply this

public partial class MainPageViewModel : ObservableObject
{
    [ObservableProperty] private string input;
    [ObservableProperty] private string output;

    [RelayCommand]
    private void Process()
    {
       Output = Input;
    }
}

Creating our test code

I’ve created an NUnit test class library and added the NuGet package Appium.WebDriver (yes, I know it’s an app, but that’s the NuGet package we need). If you’re creating your unit test class library from scratch you’ll need NUnit and NUnit3TestAdapter installed as dependencies as well.

Now my first unit test is as follows

public class Tests
{
    private AndroidDriver<AndroidElement> _driver;

    [SetUp]
    public void Setup()
    {
        var driverOptions = new AppiumOptions();
        driverOptions.AddAdditionalCapability(MobileCapabilityType.PlatformName, "Android");
        driverOptions.AddAdditionalCapability(MobileCapabilityType.AutomationName, "UiAutomator2"); 
        driverOptions.AddAdditionalCapability(MobileCapabilityType.DeviceName, "Pixel_3_XL_API_29");

        _driver = new AndroidDriver<AndroidElement>(new Uri("http://localhost:4723/wd/hub"), driverOptions); 
        _driver.ActivateApp("com.companyname.mauiuiautomationtest");
    }

    [TearDown]
    public void TearDown()
    {
        _driver.Quit();
    }

    [Test]
    public void ChangeInput_ThenCopy_ExpectMatchingOutput()
    {
        var entry = _driver.FindElementById("com.companyname.mauiuiautomationtest:id/input");
        entry.Clear();
        entry.SendKeys("Hello World");
        var button = _driver.FindElementById("com.companyname.mauiuiautomationtest:id/copy");
        button.Click();
        var label = _driver.FindElementById("com.companyname.mauiuiautomationtest:id/output");
        Assert.That(label.Text, Is.EqualTo("Hello World"));
    }
}

Obviously this is not perfect as we’re running the setup before, and the tear down after, every test, but as I only have one test, I don’t care (it’s for the reader to do this correctly). Now we set up the driver options: the platformName is of course Android; the deviceName relates to the device you’re using (I’m using the Android emulator for the Pixel 3 XL, API 29); finally, automationName specifies the automation engine to use. To be honest you could remove this and it’ll default to the standard Appium engine, but UiAutomator2 is listed as Appium’s flagship engine, so let’s use that.

I had all sorts of issues trying to get Appium to start my MAUI app (as it will try to install it). I think this is likely down to the way Visual Studio builds the apk, i.e. I probably need to publish a signed apk or the like, but for now I’ll let Visual Studio deploy the app and then access it via Appium. Hence we have the following code to activate an already installed app.

 
_driver.ActivateApp("com.companyname.mauiuiautomationtest");

Eager to see something working?

If you’re eager to see something working, then

  • Deploy your application to your Android emulator (I just get Visual Studio to deploy it)
  • Run the Appium Server GUI and start a new session
  • Run your unit test

If all went well the test was green and if not, check your application name etc.

Appium Inspector

The Appium server does not come with an inspector now; you need to download it separately (as stated earlier). Of course we can probably get a lot done without an inspector, but it helps and gives confidence that if it sees our ids etc. then Appium will too. You’ll need to tell the inspector to connect to the Appium server: make sure that Remote Path is set to /wd/hub and that you have the following valid Desired Capabilities

{
  "appium:automationName": "UIAutomator2",
  "appium:deviceName": "Pixel_3_XL_API_29",
  "platformName": "Android"
}

Ensure your Appium server is running. Before you run the inspector, just make sure your app is running (it just makes things simpler), and now start a session from Appium Inspector. It should show you the screen of the currently running application.

As you can see from the code below, we expect a “copy” automation id

<Button Grid.Row="1" AutomationId="copy" Margin="10" Command="{Binding ProcessCommand}" Text="Copy" />

MAUI AutomationIds do not appear as AutomationId, or Accessibility Id for that matter, but instead become ids prefixed with your app name, which is why our unit test has com.companyname.mauiuiautomationtest:id/output. All that really matters is that we have a unique value on an Appium-accessible property to do some UI automation testing. We do have the option of XPath locators, but these should probably be used as a last resort, when we have nothing else to locate against.
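As the prefixed ids get repeated in every locator, a tiny helper can keep the package name in one place. This is just a sketch of the string format described above; the MauiIds class and ResourceId method are names I’ve invented, not part of Appium or MAUI.

```csharp
using System;

// Hypothetical helper: builds the id Appium sees for a MAUI AutomationId
// on Android, i.e. "<package name>:id/<AutomationId>".
public static class MauiIds
{
    public static string ResourceId(string packageName, string automationId)
    {
        if (string.IsNullOrEmpty(packageName))
            throw new ArgumentException("A package name is required", nameof(packageName));
        if (string.IsNullOrEmpty(automationId))
            throw new ArgumentException("An automation id is required", nameof(automationId));

        return packageName + ":id/" + automationId;
    }
}
```

In the test above you could then pass MauiIds.ResourceId("com.companyname.mauiuiautomationtest", "output") to FindElementById rather than spelling out the whole string each time.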

WinAppDriver, connecting to existing instance of an application

In a previous post I showed how we can use specflow.actions.json to configure the WinAppDriver with the application name etc. but what if you want to simply connect to an existing instance of an application?

If we’re sticking with Specflow then we would probably create a file in the Drivers folder of our test project (or of course add the folder if it doesn’t exist)

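using System;
using System.Diagnostics;
using OpenQA.Selenium.Appium;
using OpenQA.Selenium.Appium.Windows;

public class WinAppDriver : IDisposable
{
   private WindowsDriver<WindowsElement> _driver;

   public WindowsDriver<WindowsElement> Current
   {
      get 
      {
         // reuse the session if we've already attached to the application
         if(_driver != null) 
         {
            return _driver;
         }

         // find the main window handle of the already running application
         var appWindowHandle = IntPtr.Zero;
         foreach(var clsProcess in Process.GetProcesses())
         {
            if(clsProcess.ProcessName.Contains("MyApp"))
            {
               appWindowHandle = clsProcess.MainWindowHandle;
               break;
            }
         }

         // the window handle is passed to WinAppDriver as a hex string
         var appWindowHandleHex = appWindowHandle.ToString("x");

         var options = new AppiumOptions
         {
            PlatformName = "Windows"
         };

         options.AddAdditionalCapability("deviceName", "WindowsPC");
         options.AddAdditionalCapability("appTopLevelWindow", appWindowHandleHex);

         return _driver = new WindowsDriver<WindowsElement>(new Uri("http://127.0.0.1:4723"), options);
      }
   }

   public void Dispose()
   {
      // ends the WinAppDriver session if one was created
      _driver?.Dispose();
   }
}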

Testing Windows Package Application

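With a packaged application, i.e. one from the Windows Store, we cannot just supply the path of the .exe. Instead we need the “Package family name” from Package.appxmanifest, which is used as the basis of the “app” value. WinAppDriver expects the full application user model id here, which is typically the package family name followed by !App, for example Microsoft.WindowsCalculator_8wekyb3d8bbwe!App.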

Appium, WinAppDriver and UI testing

I’ve used Teststack.White (and other) UI automation/testing tools in the past for WPF/WinForms testing. It looks like this project has been deprecated along with the Microsoft offering as part of Visual Studio Enterprise, i.e. CodedUI.

The blurb on the Microsoft site regarding CodedUI deprecation suggests the alternative for desktop application testing is Appium and WinAppDriver.

Appium is a framework for native, hybrid and mobile web apps. It has multi-language support (just like Selenium) and is based on the Mobile JSON Wire Protocol, which is an extension to the Selenium JSON Wire Protocol. Its API also shares similarities with Selenium’s, but Selenium is a web automation framework, whereas Appium allows us to run UI automation tests against native mobile and desktop applications.

So where does the WinAppDriver (in the post’s title) come into this?

WinAppDriver is a Microsoft driver that Appium calls into via its API. WinAppDriver runs as a server on your local machine and provides automation services for UWP, WPF and WinForms.

Note: Whilst there are aspects of WinAppDriver that are on GitHub, the actual source for the server is not.

The idea is that WinAppDriver (WAD) is run and we use an Appium NuGet library to interact with it via the Appium API.

As stated, this post is all about the Windows desktop, but Appium also works with mobile, so we’ll look at mobile in another post.

Setting things up

  • You’ll need WinAppDriver, look at the releases and download from there (I’m using v1.2.1)
  • Install WinAppDriver, note where it’s installed, it’ll be something like C:\Program Files (x86)\Windows Application Driver\
  • Now for WinAppDriver to work on Win10, you’ll need to go to Developer Settings and set Developer Mode to ON

We’re going to use Specflow to write our tests. Although this is immaterial to the actual UI automation testing, it’s a technology that’s often used alongside UI automation testing.

Note: If you do not have the Specflow extension installed in Visual Studio then go and add that, it’ll add some nice syntax highlighting as well as templates etc.

Testing our app.

Let’s create a simple WPF application like the one in my previous Selenium post, it’ll contain a TextBox for input, a Button which when clicked will copy the text from input to a Label which is our output – simple…

Here’s some XAML you can slot into a MainWindow.xaml

<Grid>
   <Grid.RowDefinitions>
      <RowDefinition Height="Auto" />
      <RowDefinition Height="Auto" />
      <RowDefinition Height="Auto" />
   </Grid.RowDefinitions>
   <TextBox Grid.Row="0" Margin="10" Text="{Binding Input}" />
   <Button Grid.Row="1" Margin="10" Command="{Binding ProcessCommand}">Copy</Button>
   <Label Grid.Row="2" Margin="10" Content="{Binding Output}"/>
</Grid>

and here’s the view model (which uses the MVVM Community toolkit source generators)

public partial class MainWindowViewModel : ObservableObject
{
    [ObservableProperty] private string input;
    [ObservableProperty] private string output;

    [RelayCommand]
    private void Process()
    {
        Output = Input;
    }
}

Creating the tests

Now create a new class library (we could simply use the Specflow template but let’s do things by hand to see all the nitty gritty)

  • Create a NUnit test project (or whatever test framework you prefer)
  • As we’re going to use Specflow.NUnit, add the nuget package
  • Also add Specflow.Actions.WindowsAppDriver nuget package
  • Add specflow.actions.json to the root of the test project folder, it should look something like this
    {
      "windowsAppDriver": {
        "capabilities": {
          "app": "path and exe of your application"
        },
        "WindowsAppDriverPath": "path on WinAppDriver including WinAppDriver.exe"
      }
    }
    
  • Create a folder named Features and another named Steps within our project
  • Add a new item, choose a Specflow feature and I’ve named mine TestApplication.feature

Let’s get into the code

We’re going to follow the basic steps we took for the Selenium testing, so in our TestApplication.feature we’ll start with the first scenario

Feature: Test Application

@Default
Scenario: Check initial state
* Check the default values are correct

Now Specflow.Actions.WindowsAppDriver actually gives us an AppDriver that we can constructor inject into our step definitions, so we don’t have to set anything up if all the defaults are as expected.

We’re not loading the application as part of the scenario; this will happen automatically when the AppDriver is created by the Specflow library. So we’ll just check that the defaults of the page exist and are correct on startup. In our TestApplicationStepDefinitions.cs, generated from the Specflow feature file, we have

[Binding]
public class TestApplicationStepDefinitions : IDisposable
{
    private readonly AppDriver _windowsDriver;

    public TestApplicationStepDefinitions(AppDriver appDriver)
    {
        _windowsDriver = appDriver;
        _windowsDriver.Current.Manage().Timeouts().ImplicitWait = TimeSpan.FromMilliseconds(5000);
    }

    public void Dispose()
    {    
        _windowsDriver.Current.Quit();
        _windowsDriver.Dispose();
    }

    [Given(@"Check the default values are correct")]
    public void GivenCheckTheDefaultValuesAreCorrect()
    {
    }
}

If we run the test scenario the test application will display and then, on dispose, it quits and closes. So far, so good. Let’s now implement the GivenCheckTheDefaultValuesAreCorrect method.

We need to add names or ids to our WPF controls so that we can locate our elements. WAD, and Windows in general, uses Name on things like WinForms, whereas in WPF we should use AutomationIds like this

AutomationProperties.AutomationId="input"

So for example we would change our XAML to add AutomationId’s like this

<TextBox Grid.Row="0" AutomationProperties.AutomationId="input" Margin="10" Text="{Binding Input}" />
<Button Grid.Row="1" AutomationProperties.AutomationId="copy" Margin="10" Command="{Binding ProcessCommand}">Copy</Button>
<Label Grid.Row="2" AutomationProperties.AutomationId="output" Margin="10" Content="{Binding Output}"/>

Now we can write the following into the GivenCheckTheDefaultValuesAreCorrect method

_windowsDriver.Current.FindElementByAccessibilityId("input").Text
   .Should().BeEmpty();
_windowsDriver.Current.FindElementByAccessibilityId("copy")
   .Should().NotBeNull();
_windowsDriver.Current.FindElementByAccessibilityId("output").Text
   .Should().BeEmpty();

Notice we’re using FindElementByAccessibilityId as there’s no FindElementByAutomationId. The AccessibilityId method of Appium maps to AutomationId within WAD, and of course some of the Appium methods are meaningless to WAD, such as FindElementByCssSelector.

  • AccessibilityId maps to AutomationId
  • ClassName maps to ClassName
  • Name maps to Name

Tools

Before we move on to our second test scenario, a word on tools. Locating your elements is fairly easy when the app is liberally coated in AutomationIds, or is an app you control and can hence add these ids to. In situations where you cannot edit the source, you’ll need to look for one of the other potential keys to locate elements….

If you’ve got the Windows 10 SDK installed then check Program Files (for example C:\Program Files (x86)\Windows Kits\10\bin\10.0.19041.0\x64) for inspect.exe. This utility will show you the properties of different windows, and you can drill down to find the name, class name and/or automation id for different controls within your application.

Similar to inspect is UIAVerify which is an old tool but pretty good (it does crash occasionally though). Microsoft states this is a legacy tool and recommends Accessibility Insights which I’ve not used much yet, but looks nicer than inspect and with some of the capabilities of UIAVerify (and so far, crashes less).

Another alternative tool is the WinAppDriver UIRecorder. At the time of writing this seems a pretty basic tool, but one useful aspect is that it can grab the XPath for your elements.

Onto the second scenario

Okay so we’ve got our tools and now want to write a second scenario, as listed below (just add to your current .feature file)

@Copy
Scenario: Check input is copied to output
* Copy input to output

If you’ve come from reading the previous Selenium post, you’ll not be learning anything new; for anyone that’s come straight to this post, we’ll now create the GivenCopyInputToOutput method that maps to the step in this new scenario

[Given(@"Copy input to output")]
public void GivenCopyInputToOutput()
{
   _windowsDriver.Current.FindElementByAccessibilityId("input").SendKeys("Hello World");
   _windowsDriver.Current.FindElementByAccessibilityId("copy").Click();
   _windowsDriver.Current.FindElementByAccessibilityId("output").Text
      .Should().Be("Hello World");
}

If you run this scenario it should send the keys/string “Hello World” to the input field, it’ll then click the copy button and the output field should update to show the same text.

Now, this worked well because we are using the AutomationId. However, our button also has the Name property correctly set, so we could use FindElementByName. Instead, let’s change things so we find the “Copy” button via XPath using the Name property

_windowsDriver.Current.FindElementByXPath("//Button[@Name = 'Copy']").Click();

Note: When running against an application with a lot of elements, XPath can easily be 10-20s slower, so where possible try to stick to using name and automation id (and class name if that helps).

One thing the XPath option does offer that’s really useful is the power of XPath itself, so we can search for elements which change dynamically in a prescribed way. In the example XPath above we’re looking for a button with a name of “Copy”, but maybe we have a button whose name changes to “Copy First Name” depending on usage. With XPath we can write something like

_windowsDriver.Current.FindElementByXPath("//Button[contains(@Name, 'Copy')]").Click();

Hence we locate a button with a Name that contains the string “Copy”. This could of course be problematic if we have many buttons with the string “Copy”, in which case we can use FindElementsByXPath to get a collection of elements, then try to figure out what we want from there.
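If you end up writing a lot of these “contains” locators, it may be worth building the XPath string in one place. The following is only a sketch of that idea: XPathFor and NameContains are names I’ve made up, and the helper simply refuses fragments containing a single quote rather than trying to escape them.

```csharp
using System;

// Hypothetical helper that builds the XPath used above, e.g.
// //Button[contains(@Name, 'Copy')]
public static class XPathFor
{
    public static string NameContains(string controlType, string fragment)
    {
        if (string.IsNullOrEmpty(controlType))
            throw new ArgumentException("A control type such as Button is required", nameof(controlType));
        if (fragment == null)
            throw new ArgumentNullException(nameof(fragment));
        if (fragment.Contains("'"))
            // quoting inside XPath string literals is awkward, so this sketch just rejects it
            throw new ArgumentException("Single quotes are not supported by this sketch", nameof(fragment));

        return "//" + controlType + "[contains(@Name, '" + fragment + "')]";
    }
}
```

The result of XPathFor.NameContains("Button", "Copy") is the same string we passed to FindElementByXPath above, so it can be handed to FindElementByXPath or FindElementsByXPath as-is.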

What next?

My Selenium post, in the section headed What next?, covers the topic of race conditions and using the Actions API. The code for Appium is pretty much the same as that for Selenium, but of course using the Windows driver and syntax/API. We’ll repeat some of that post here for completeness, but if you’ve read the Selenium post you’ll probably already be aware of the things mentioned here.

One of the biggest issues with UI Automation testing is the problem around race conditions, i.e. our automation test tries to locate an element that’s either not yet been displayed, or worse still, was displayed then hidden.

With regard to waiting for a UI element to appear, we can of course add some form of polling with a timeout (the Selenium API includes this). With regard to something that was displayed and then hidden before we were able to locate it, we’d obviously need to handle this in some fashion, most likely with a combination of a timeout and looking for some other element that might tell us that the former has gone. For example, a progress indicator may have disappeared, but if the controls are enabled we don’t care that we couldn’t locate the progress indicator.
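To show what “polling with a timeout” boils down to, here’s a hand-rolled sketch with no Selenium or Appium types in it. The Poll class and its Until method are my own invention for illustration; in real tests you’d more likely use the WebDriverWait class shown later in this post.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper: repeatedly evaluates a condition until it is true
// or the timeout has elapsed.
public static class Poll
{
    public static bool Until(Func<bool> condition, TimeSpan timeout, TimeSpan interval)
    {
        if (condition == null)
            throw new ArgumentNullException(nameof(condition));

        var stopwatch = Stopwatch.StartNew();
        while (true)
        {
            // check first, so a condition that is already true returns immediately
            if (condition())
            {
                return true;
            }

            // give up once we've used all the allowed time
            if (stopwatch.Elapsed >= timeout)
            {
                return false;
            }

            Thread.Sleep(interval);
        }
    }
}
```

The condition is just a Func<bool>, so it can be “the controls are enabled” OR “the progress indicator has gone”, which is the sort of combined check described above.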

Okay, so let’s look at what we can do to make things a little better…

Selenium (and Appium) has an implicit wait timer which can be used, but as the Selenium documentation states, “An implicit wait is rarely the best solution”. However it is an option, so let’s check out what it looks like

_windowsDriver.Current.Manage().Timeouts().ImplicitWait = TimeSpan.FromMilliseconds(5000);

It may be (and probably is) better that we use explicit timeouts within our code.

Note: I read somewhere that explicit and implicit in the same code can cause problems, so it sounds like it’s probably best to stick to one or the other for the most part or at least be aware of potential issues.

Let’s look at how we can wait for an element to appear. Let’s assume that our code’s Process method has changed to this rather crude (delayed update) code.

[RelayCommand]
private void Process()
{
   Task.Run(() =>
   {
      Thread.Sleep(3000);
   }).ContinueWith(tsk =>
   {
      Output = Input;
   });
}

In this code we very crudely simulate a 3 second operation taking place before Input is copied to Output. Assume it’s a web service call or whatever you like, but it will mean that our UI automation test code (without an implicit wait of sufficient timeout) will NOT see the change to Output immediately and thus the test will fail. As you’ll have realised, we now need an explicit wait in this code. Basically we want to poll the UI every n milliseconds for the Output to change. We will do this for a given timeout, so if things have not updated in, say, 10 seconds then there’s an issue. Here we drop down to Selenium-based code

var wait = new WebDriverWait(_windowsDriver.Current, TimeSpan.FromSeconds(10));
wait.Until(e => _windowsDriver.Current.FindElementByAccessibilityId("output").Text == "Hello World")
   .Should().BeTrue();

In the above code, we use the Selenium WebDriverWait (don’t worry about the prefix Web it works for non-Web as well). We tell it the driver to use and the timeout. I’ve put 10 seconds here, but the code will actually poll the UI every (by default) 500ms and when the condition is true it will complete, hence will stop as soon as the change is located or after the timeout period, whichever is first.

One caveat is, instead of using the e variable which will be an IWebDriver I use the _windowsDriver.Current as this includes the higher level methods such as FindElementByAccessibilityId.

Actions API

There are more APIs than just those listed thus far, but to go too much further would mean this post turns into an API tutorial, a little outside the realms of a simple blog post. Let’s end on one more API feature that we need to know about, and that’s the Actions API.

We’ve used the WindowsElement (and therefore AppiumWebElement) to Click and SendKeys; these are seen as “high-level interactions”. Sometimes we want to go a little more “low-level”, and this is where the Actions API comes in.

Actions are low level in the sense that you can generate keydown, keyup actions for situations where you might need to send keys CTRL+SHIFT+A for example. In this case you’ll need a CTRL keydown along with a SHIFT key down then a keypress for A finally in reverse order, SHIFT keyup and CTRL keyup. Obviously had these been sent as keys via SendKeys it would end up as a CTRL down and up followed by a SHIFT down and up and so on (i.e. no keys held down for the duration of the interaction).
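To make the ordering concrete, here’s a small sketch that only computes the sequence of key events for a chord. It does not call the Actions API at all; the KeyChord class and the “down”/“press”/“up” strings are invented purely to illustrate the order in which the events need to happen.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: lists the low-level events needed for a key chord.
public static class KeyChord
{
    public static IReadOnlyList<string> Events(IReadOnlyList<string> modifiers, string key)
    {
        var events = new List<string>();

        // press each modifier in order and keep it held down
        events.AddRange(modifiers.Select(m => m + " down"));

        // press and release the main key while the modifiers are held
        events.Add(key + " press");

        // release the modifiers in reverse order
        events.AddRange(modifiers.Reverse().Select(m => m + " up"));

        return events;
    }
}
```

For CTRL+SHIFT+A, i.e. KeyChord.Events(new[] { "CTRL", "SHIFT" }, "A"), the list is CTRL down, SHIFT down, A press, SHIFT up, CTRL up, which is the sequence described above.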

Actions also allow you to essentially put together a whole bunch of actions in a single command. For example, the following builds a sequence that moves to the Copy button and then double-clicks it; the Perform method invokes the sequence of actions.

Here’s a simple example of the Actions API being used to double click a button

var button = _windowsDriver.Current.FindElementByAccessibilityId("copy");
var actions = new Actions(_windowsDriver.Current);
actions
   .MoveToElement(button)
   .DoubleClick()
   .Perform();