Test Run

Low-Level UI Test Automation

James McCaffrey

Code download available at: TestRun0509.exe (134 KB)

Contents

The Application Under Test
The Test Automation Code
Finding Windows
Manipulating the App and Checking App State
Message Boxes and Menus
Extending the Test Harness

There are several ways to test a Windows®-based application through its user interface. For example, in the January 2005 issue of MSDN® Magazine (Test Run: Lightweight UI Test Automation with .NET) I described a lightweight technique for testing .NET-based applications using .NET reflection. Since then, many of you have asked how to perform lower-level tasks like clicking away a Message Box. This column answers those questions by showing you how to write lightweight, low-level UI test automation for Windows-based applications. The technique I'll discuss involves calling Windows API functions such as FindWindow (exposed by user32.dll) and sending Windows messages like WM_LBUTTONDOWN to the application under test.

Now, let's take a look at Figure 1. It shows a dummy Windows color mixer application. A user types a color into a textbox control, types or selects a color in a combobox control, and clicks the button control. A message that represents the result of "mixing" the two colors is then displayed in a listbox control. In Figure 1, red and blue produce purple according to the application. The UI test automation is a console application that launches the application under test, obtains window handles to the windows and controls on the app, and simulates a user entering information and clicking the button control. After automatically clicking away an error message window, the test automation checks the resulting state of the application under test, verifies that the listbox control contains the correct message, and prints a pass or fail result. I captured the screen shot in Figure 1 just before the test automation simulates a user clicking on the File | Exit menu item to close the application under test.

Figure 1 Form UI Test Automation


In the sections that follow I will briefly describe the sample application I am testing, then explain how to launch it from the test automation program and how to get references to the application's various windows and controls. I will also simulate user actions and check application state using Windows API functions and messages. Then I'll describe how you can extend and modify the test system to meet your own needs. I think you'll find the ability to quickly write lightweight, low-level UI test automation a useful addition to your skill set.

The Application Under Test

Let's look at the application under test. The color mixer application is a simple Windows-based form. For simplicity I accepted the Visual Studio® .NET default control names of Form1, textBox1, comboBox1, button1, and listBox1. I added a single top-level menu item, File, which has a single submenu item, Exit. The heart of the application under test is the code in Figure 2.

Figure 2 Color Mixer Application Code

private void button1_Click(object sender, System.EventArgs e)
{
    string tb = textBox1.Text;
    string cb = comboBox1.Text;

    if (tb == "<enter color>" || cb == "<pick>")
        MessageBox.Show("You need 2 colors", "Error");
    else if (tb == cb)
        listBox1.Items.Add("Result is " + tb);
    else if ((tb == "red" && cb == "blue") || 
             (tb == "blue" && cb =="red"))
        listBox1.Items.Add("Result is purple");
    else
        listBox1.Items.Add("Result is black");
}

When a user clicks on the button1 control, the application grabs the text values in the textBox1 and comboBox1 controls. If the user has not entered two colors (changing the contents of the controls from their default values), a Message Box is displayed with an error message (see Figure 3). As you'll see later, a Message Box is a top-level window which cannot be manipulated directly through the application that calls it. This means if you want to be able to click the Message Box away you'll have to work at the window level instead of the application level.

Figure 3 Error Message


If the two color strings match, a message with that color is displayed. If the textbox and combobox controls contain "red" and "blue", a result message with "purple" is displayed. If any other color combinations are in the textbox and combobox controls, a result message with "black" is displayed. Obviously this is just a dummy app for demo purposes but it has most of the fundamental characteristics of Windows-based apps that are needed to demonstrate automated UI testing.

Manually testing even this minimal application through its user interface would be monotonous, error prone, time consuming, and inefficient. You'd have to enter some inputs, click the button control, visually verify the result message, and manually record the pass/fail result. A much better approach is to leverage the capabilities of the .NET environment to write lightweight test automation that simulates a user exercising the application and determining if the application has responded correctly. By automating tedious test cases you can free up time for more interesting and useful cases that need to be performed manually.

The Test Automation Code

The overall outline of the test code is shown in Figure 4. I decided to use C# but you can easily modify the code to Visual Basic® .NET or any other .NET-compliant language. I begin by adding and declaring a reference to a library named WindowControllerLib that contains all the low-level code and really does all the work (this library is available for download from the MSDN Magazine Web site). I'll explain that code in the next three sections of this column. I could have placed all the library code directly in the test harness, but creating a separate library DLL provides better organization and allows the library code to be reused more easily.

Figure 4 UI Test Automation Code Structure

using System;
using System.Diagnostics;  // for Process.Start()
using System.Threading;    // for Thread.Sleep() 
using WindowControllerLib; // low-level user-defined library

namespace RunTest {
  class Class1 {
    static void Main(string[] args) {
      try {
        Console.WriteLine("\nStart test scenario\n");
        Console.WriteLine("Launching app under test");
        using(Process p = Process.Start("C:\\LowLevelUIAutomation\\" + 
            "WinApp\\bin\\Debug\\WinApp.exe"))
        {
          // find the app form window
          // find textBox1, comboBox1, button1, listBox1
          // click button, get error message box
          // find message box
          // get message box OK button, click it
          // type 'red' to textBox1 and 'blue' to comboBox1
          // click button
          // check listBox1 state, print 'pass' or 'fail'
          // get File->Exit, and click it
        } 
        Console.WriteLine("\nEnd test scenario");
      }
      catch(Exception ex)
      {
        Console.WriteLine("Fatal error: " + ex.Message);
      }
    } 
  } 
}

After launching the application under test, the first step to automating is to get a reference to it:

Controller f = new Controller(p.MainWindowHandle);

My library has a class named Controller that represents a window. Here I declare a Controller object to represent the main application form. The handle to the main window of the application is available from a property on the Process object returned from the call to Process.Start.

Finding Windows

The key to my lightweight low-level UI automation is a library I named WindowControllerLib. It consists of a single class named Controller with a single data field:

private IntPtr ptrToWindow = IntPtr.Zero; // handle

The System.IntPtr data type is a platform-specific type that is used to represent a native pointer or handle.

I defined four overloaded Controller class constructors. The constructor I used previously simply accepts an IntPtr and stores that to the ptrToWindow private field. The primary constructor accepts a window class name and a window caption/name/title and calls the FindWindow Windows API function:

public Controller(string lpClassName, 
   string lpWindowName)
{
   this.ptrToWindow = FindWindow(lpClassName, 
                         lpWindowName);
}

The .NET environment makes it easy to call Windows API functions using the P/Invoke mechanism. All I have to do is import the System.Runtime.InteropServices namespace and then place a DllImport attribute inside my Controller class definition:

[DllImport("user32.dll", CharSet=CharSet.Auto)]
static extern IntPtr FindWindow(string lpClassName, string lpWindowName);

Here I am essentially saying I want to use the Windows API function FindWindow, located in user32.dll, and let .NET worry about Unicode and ASCII issues. Very nice.

I defined a third Controller constructor which is designed to obtain a reference to a child window of another window:

public Controller(IntPtr hwndParent, IntPtr hwndChildAfter, 
                  string lpszClass, string lpszWindow)
{
  this.ptrToWindow = FindWindowEx(hwndParent, hwndChildAfter,
                                  lpszClass, lpszWindow);
}

This constructor is useful for getting an object that represents a control on a form, because most controls are child windows. The constructor calls the FindWindowEx API function. The first parameter is a handle to the parent window. The second parameter tells the function which child window to start searching after and usually takes a null handle (IntPtr.Zero in C#), which means to search all child windows starting with the first. Just as with the FindWindow function, all I need to do to be able to call FindWindowEx is to insert an appropriate DllImport attribute:

[DllImport("user32.dll", CharSet=CharSet.Auto)]
static extern IntPtr FindWindowEx(IntPtr hwndParent, 
  IntPtr hwndChildAfter, string lpszClass, string lpszWindow);

The fourth constructor uses FindWindowEx to get a reference to a child window that does not have a name/title:

public Controller(IntPtr hwndParent, int index) 
{
  int ct = 0;
  IntPtr result = IntPtr.Zero;
  do
  {
    result = FindWindowEx(hwndParent, result, null, null);
    if (result != IntPtr.Zero) ++ct;
  } while (ct < index && result != IntPtr.Zero);
  this.ptrToWindow = result; // return 1-based indexed child
}

This constructor is useful for test automation because many window controls may not have a title. Examples include empty textbox and listbox controls. But every child window has a predecessor and successor window, so I can calculate a 1-based index value relative to its parent. I use a 1-based index so I can use an index of 0 to represent a self reference. So if you know the index of the child window you can use this constructor to get it whether it has a name and title or not. For instance, in the example application under test, the combobox child control of the main form window has an index of 1, the button is 2, the listbox is 3, and the textbox is 4. Note that these child window index values are not the same as the TabIndex property values for controls. There are a couple of ways you can determine the index value of a child control. One way is to examine the source code of the application under test. The index value of a child control is the position in which it is added with the Controls.Add method. For example, here is how the controls are added to my dummy app:

this.Controls.Add(this.comboBox1);
this.Controls.Add(this.button1);
this.Controls.Add(this.listBox1);
this.Controls.Add(this.textBox1);

But in a testing environment you may not always have access to the application source code. A second way to determine the index value of a child control/window is to use the Spy++ tool that comes with Visual Studio .NET. You can use the tool to get detailed information about any window, including the predecessor and successor windows. Figure 5 shows the information for the button1 control. You can see the button1 control is a child of Form1 and its predecessor window has caption "<pick>" (which is the combobox control) and its successor window has an empty caption (which is the listbox control).

Figure 5 Getting Window Information Using Spy++

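If Spy++ is not at hand, the same child ordering can be dumped programmatically. The following is a minimal sketch and not part of WindowControllerLib: the ChildLister class and its ListChildren method are hypothetical names, and the code assumes a C# 2.0 or later compiler. It steps through a parent's direct children with FindWindowEx, in the same order the index-based constructor uses, and reports each child's 1-based index, window class, and caption. GetWindowText may return an empty caption for some controls owned by another process, so treat the caption as a hint only.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;
using System.Text;

// Hypothetical helper, not part of the article's library.
public static class ChildLister
{
    [DllImport("user32.dll", CharSet = CharSet.Auto)]
    static extern IntPtr FindWindowEx(IntPtr hwndParent,
        IntPtr hwndChildAfter, string lpszClass, string lpszWindow);

    [DllImport("user32.dll", CharSet = CharSet.Auto)]
    static extern int GetClassName(IntPtr hWnd,
        StringBuilder lpClassName, int nMaxCount);

    [DllImport("user32.dll", CharSet = CharSet.Auto)]
    static extern int GetWindowText(IntPtr hWnd,
        StringBuilder lpString, int nMaxCount);

    // Returns one line per direct child: "index: class='...' caption='...'".
    public static List<string> ListChildren(IntPtr parent)
    {
        // A zero parent would enumerate top-level windows instead of
        // the children of a form, so it is rejected here.
        if (parent == IntPtr.Zero)
            throw new ArgumentException("parent handle must not be zero", "parent");

        List<string> lines = new List<string>();
        IntPtr child = IntPtr.Zero;
        int index = 0;

        // Passing the previous child as hwndChildAfter moves to the next sibling.
        while ((child = FindWindowEx(parent, child, null, null)) != IntPtr.Zero)
        {
            ++index;
            StringBuilder cls = new StringBuilder(256);
            StringBuilder caption = new StringBuilder(256);
            GetClassName(child, cls, cls.Capacity);
            GetWindowText(child, caption, caption.Capacity);
            lines.Add(index + ": class='" + cls + "' caption='" + caption + "'");
        }
        return lines;
    }
}
```

In the test harness you could print the result with a loop such as foreach (string line in ChildLister.ListChildren(f.PtrToWindow)) Console.WriteLine(line); and read the index values straight from the output.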

To summarize, I've defined four Controller constructors that are wrappers around the FindWindow and FindWindowEx Windows API functions. The first constructor can be used to get a reference to the main form, like so:

Controller f = new Controller(p.MainWindowHandle);

whereas the second constructor can be used to search for and get a handle to the same window using its window name:

Controller f = new Controller(null, "Form1"); // Form1

The third constructor can be used to get a reference to a child window/control if you know the child's name/title:

Controller butt = new Controller(f.PtrToWindow, IntPtr.Zero,
                                 null, "button1"); // button1

And finally the fourth constructor can be used to get a reference to a child window/control if you know its 1-based index relative to the parent, as shown here:

Controller lb = new Controller(f.PtrToWindow, 3); // listBox1

After calling a constructor I can check to see if I actually got a reference to a window by checking the IsValid property, which simply compares the stored IntPtr to IntPtr.Zero:

public bool IsValid { get { return this.ptrToWindow != IntPtr.Zero; } }

Manipulating the App and Checking App State

After adding code to my library that enables me to get a reference to any window/control on my application, I need to add code that allows me to manipulate the app. Exactly how you'll want to manipulate your application under test will vary according to your test scenario, but in this case I need to send characters to the textbox control. The key to most low-level window manipulation is to use the SendMessage API function to send a Windows message to the window/control. Here is code from the Controller class that will send a single character to a target window:

public void SendChar(char c) {
  uint WM_CHAR = 0x0102;
  SendMessage1(this.ptrToWindow, WM_CHAR, c, 0);
}

If you are new to calling the SendMessage function, the hard part is figuring out exactly which Windows message to send. Here I'm sending a WM_CHAR message. There's no easy way to know which message to use. What I like to do is just browse through a list of all Windows message constants in the WinUser.h file (usually located in the PlatformSDK\Include subfolder of your Visual Studio .NET installation) and then look up information in the Visual Studio .NET integrated help (another useful technique is to use the Spy++ tool mentioned earlier to monitor what messages are being sent to a control you're using).

With a little bit of experience you'll get a good feel for useful messages. The last two of the SendMessage function's four parameters are named wParam and lParam. These two parameters have variable data type and meaning depending on which message argument is passed to the second parameter. In the case of WM_CHAR, the wParam is the character code of a key press. (Notice that I perform an implicit type conversion from char to int.) The lParam is an integer whose bits specify the repeat count, scan code, extended-key flag, context code, previous key-state flag, and transition-state flag of the key press. In this case I just pass 0 (or in other words all 32 bits are 0) which is a normal key press.
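To make the lParam bit layout concrete, here is a small sketch that packs the individual fields into the 32-bit value. The KeyLParam class and its Build method are hypothetical names introduced only for illustration. The bit positions follow the documented keystroke-message layout: bits 0-15 hold the repeat count, bits 16-23 the scan code, bit 24 the extended-key flag, bit 29 the context code, bit 30 the previous key state, and bit 31 the transition state.

```csharp
// Hypothetical helper, not part of the article's library.
public static class KeyLParam
{
    // Packs the keystroke fields into the lParam layout used by WM_CHAR,
    // WM_KEYDOWN, and WM_KEYUP.
    public static int Build(int repeatCount, int scanCode, bool extendedKey,
                            bool altDown, bool wasDown, bool beingReleased)
    {
        uint value = (uint)(repeatCount & 0xFFFF);   // bits 0-15
        value |= (uint)(scanCode & 0xFF) << 16;      // bits 16-23
        if (extendedKey)   value |= 1u << 24;        // bit 24
        if (altDown)       value |= 1u << 29;        // bit 29 (context code)
        if (wasDown)       value |= 1u << 30;        // bit 30 (previous state)
        if (beingReleased) value |= 1u << 31;        // bit 31 (transition)
        return unchecked((int)value);
    }
}
```

Calling KeyLParam.Build(0, 0, false, false, false, false) yields 0, the value passed in SendChar. For a plain textbox, sending 0 is usually enough. A helper like this only matters if the target control inspects the repeat count or key flags.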

To invoke the SendMessage function I added the following P/Invoke declaration:

[DllImport("user32.dll", EntryPoint="SendMessage")]
static extern bool SendMessage1(IntPtr hWnd, uint Msg,
                                int wParam, int lParam);

This attribute will create a "SendMessage1" alias for SendMessage. I do this because SendMessage can take several input parameter and return type signatures.

Once I have the ability to send a single character to a window control, I can easily create a method to send a string of characters:

public void SendChars(string s)
{
  foreach (char c in s) SendChar(c);
}

With these two routines in hand I can send a string to the textBox1 control:

Controller tb = new Controller(f.PtrToWindow, IntPtr.Zero,
                               null, "<enter color>");
Console.WriteLine("\nTyping 'red' to app");
tb.SendChars("red");

Nearly all UI automation needs the ability to simulate a button click. Here's how I implemented that functionality in the Controller class of my WindowControllerLib library:

public void ClickOn()
{
  uint WM_LBUTTONDOWN = 0x0201;
  uint WM_LBUTTONUP   = 0x0202;
  PostMessage(this.ptrToWindow, WM_LBUTTONDOWN, 0, 0); // button down
  PostMessage(this.ptrToWindow, WM_LBUTTONUP, 0, 0); // button up
}

I use the PostMessage API function instead of SendMessage. Usually you'll want to use SendMessage which calls the window procedure for the specified window and does not return until the procedure has processed the message. For a simulated button click I want to return without waiting for the thread to process the message. As before, I create a P/Invoke declaration for PostMessage:

[DllImport("user32.dll")] // used for button-down & button-up
static extern int PostMessage(IntPtr hWnd, uint Msg,
                              int wParam, int lParam);

When processing the WM_LBUTTONDOWN and WM_LBUTTONUP messages, the wParam value is a set of key-state flags, such as MK_SHIFT, that indicate whether keys like Shift and Ctrl are down. The lParam value packs the x and y coordinates of the cursor, relative to the upper-left corner of the window's client area, into its low-order and high-order words. Here I pass 0,0 to WM_LBUTTONDOWN and WM_LBUTTONUP in order to simulate mouse clicks on the upper-left corner of the target window without any keys pressed.
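If a control only reacts to clicks at a particular spot, the coordinates have to be packed into lParam: x goes in the low-order 16 bits and y in the high-order 16 bits, as the Win32 MAKELPARAM macro does. The sketch below shows that packing. MouseLParam and its members are hypothetical names, and the ClickOn method shown above does not use them.

```csharp
// Hypothetical helper, not part of the article's library.
public static class MouseLParam
{
    // Packs client-area coordinates the way mouse messages expect:
    // x in the low-order word, y in the high-order word.
    public static int FromPoint(int x, int y)
    {
        uint packed = ((uint)(y & 0xFFFF) << 16) | (uint)(x & 0xFFFF);
        return unchecked((int)packed);
    }

    // Extracts the signed x coordinate from a packed lParam.
    public static int X(int lParam)
    {
        return unchecked((short)(lParam & 0xFFFF));
    }

    // Extracts the signed y coordinate from a packed lParam.
    public static int Y(int lParam)
    {
        return unchecked((short)((lParam >> 16) & 0xFFFF));
    }
}
```

With this in place, a click at client coordinates (10, 20) would pass MouseLParam.FromPoint(10, 20) as the last argument to PostMessage instead of 0.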

In a test automation scenario you must be able to check the state of windows/controls. In this example I need to check the listBox1 control to see if the expected message is there or not. Here is the method I added to my library to do that:

public int ListBoxFindString(string s)
{
  uint LB_FINDSTRING = 0x018F;
  int result = SendMessage4(this.ptrToWindow, LB_FINDSTRING, -1, s);
  return result; // -1 if not found, 0-based index otherwise
}

I use the SendMessage function with a LB_FINDSTRING Windows message argument. Notice that this use of SendMessage returns an int, and accepts int and string arguments for its wParam and lParam parameters. This signature is different from the one I used with the WM_CHAR message in my SendChar library method. Now because of overloading I could have just used a default SendMessage alias instead of creating numbered aliases such as SendMessage1 and SendMessage4; however, other Windows messages use the same parameter types but with different return types and so they can't be distinguished. I find it easier just to create multiple SendMessageX-type aliases. My method returns -1 if the target string is not found; otherwise it returns the 0-based index of the target's location in the associated listbox control.

With a ListBoxFindString method defined I can call it in my test automation code, like so:

Console.WriteLine("Checking listBox1 for 'purple'");
int result = lb.ListBoxFindString("Result is purple");
Console.WriteLine("Test scenario result = " + 
  (result == -1 ? "*FAIL*" : "Pass"));

Message Boxes and Menus

Dealing with message boxes and menus requires slightly different tricks than normal windows/controls. Recall that my dummy application under test will display an error message box if the user has not supplied two colors before clicking on the button control (see Figure 3). My test scenario exercises this functionality so I need to know how to click the error message box away. This is easy if you know that a message box window is not a child window of its calling window. So in order to click the OK button on a message box you get a reference to the message box just like you would any other top-level window and then get the child OK button control. The code in Figure 6 shows how I did this in my test scenario code.

Figure 6 Dealing with a Message Box

// find msgBox
Console.WriteLine("Looking for Message Box");
Controller mb = null;
bool mbFound = false;
while (!mbFound)
{
  mb = new Controller(null, "Error");
  if (mb.PtrToWindow == IntPtr.Zero)
  {
    Console.WriteLine("Message Box window not found yet . . . ");
    Thread.Sleep(100);
  }
  else
  {
    Console.WriteLine("Message Box window found with ptr = " +
                      mb.PtrToWindow);
    mbFound = true;
  }
}

// get Message Box OK button
Console.WriteLine("Clicking away Message Box");
Controller okButt = new Controller(mb.PtrToWindow, 1);
okButt.ClickOn();
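The loop in Figure 6 runs forever if the message box never appears, which is awkward for unattended runs. One possible variation is a polling helper with a timeout, sketched below. WindowWait and WaitForHandle are hypothetical names, and the use of Func<IntPtr> with a lambda assumes C# 3.0 or later, which is newer than the code in this column.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper, not part of the article's library.
public static class WindowWait
{
    // Calls 'find' repeatedly until it returns a nonzero handle or the
    // timeout elapses. Returns IntPtr.Zero if no window was found in time.
    public static IntPtr WaitForHandle(Func<IntPtr> find, int timeoutMs, int pollMs)
    {
        if (find == null) throw new ArgumentNullException("find");
        if (timeoutMs < 0) throw new ArgumentOutOfRangeException("timeoutMs");
        if (pollMs < 0) throw new ArgumentOutOfRangeException("pollMs");

        Stopwatch elapsed = Stopwatch.StartNew();
        while (true)
        {
            IntPtr handle = find();           // always probe at least once
            if (handle != IntPtr.Zero) return handle;
            if (elapsed.ElapsedMilliseconds >= timeoutMs) return IntPtr.Zero;
            Thread.Sleep(pollMs);
        }
    }
}
```

In the test scenario the probe could be a lambda such as () => new Controller(null, "Error").PtrToWindow, with a zero result reported as a test failure rather than a hang. The same helper could also poll for the main form if the application is slow to start.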

Dealing with menu items requires special handling because Windows menus are neither child controls nor top-level windows but rather separate objects. The trick to working with menus is to use the GetMenu, GetSubMenu, and GetMenuItemID Windows API functions with the WM_COMMAND message:

[DllImport("user32.dll")] 
static extern IntPtr GetMenu(IntPtr hWnd);

[DllImport("user32.dll")] 
static extern IntPtr GetSubMenu(IntPtr hMenu, int nPos);

[DllImport("user32.dll")] 
static extern int GetMenuItemID(IntPtr hMenu, int nPos);

I coded a FormExitApp method in my library that simulates a user clicking on a specified item of a specified submenu:

public void FormExitApp(int subMenu, int exitItem)
{
  IntPtr pMenu = GetMenu(this.ptrToWindow); // get main Menu
  IntPtr pSubMenu = GetSubMenu(pMenu, subMenu);   // get "File" submenu
  int menuID = GetMenuItemID(pSubMenu, exitItem); // get "Exit"
  uint WM_COMMAND = 0x0111;
  SendMessage4(this.ptrToWindow, WM_COMMAND, menuID, null);
}

The code is fairly self-explanatory. GetMenu gets a reference to the main menu associated with a window/form. Then this reference can be used with GetSubMenu to get a reference to a particular submenu (such as "File", "Edit", or "View"). And then that reference can be used with GetMenuItemID to get the ID of a particular menu item (such as "Copy" or "Paste"). The WM_COMMAND message is sent when the user selects a command item from a menu, when a control sends a notification message to its parent window, or when an accelerator keystroke is translated. So sending WM_COMMAND simulates clicking on a menu item.
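Each of the three menu functions can fail: GetMenu and GetSubMenu return a null handle when there is no menu or no submenu at the given position, and GetMenuItemID returns -1 when the position does not hold a plain command item. A guarded lookup might look like the sketch below. MenuLookup and TryGetMenuItemId are hypothetical names and are not part of the library described in this column.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper, not part of the article's library.
public static class MenuLookup
{
    [DllImport("user32.dll")]
    static extern IntPtr GetMenu(IntPtr hWnd);

    [DllImport("user32.dll")]
    static extern IntPtr GetSubMenu(IntPtr hMenu, int nPos);

    [DllImport("user32.dll")]
    static extern int GetMenuItemID(IntPtr hMenu, int nPos);

    // Resolves the command ID of the item at 'itemPos' in the submenu at
    // 'subMenuPos' (both 0-based). Returns false if any step fails.
    public static bool TryGetMenuItemId(IntPtr hWnd, int subMenuPos,
                                        int itemPos, out int menuId)
    {
        menuId = -1;
        if (hWnd == IntPtr.Zero || subMenuPos < 0 || itemPos < 0)
            return false;                       // reject bad input up front

        IntPtr menu = GetMenu(hWnd);
        if (menu == IntPtr.Zero) return false;  // window has no menu

        IntPtr subMenu = GetSubMenu(menu, subMenuPos);
        if (subMenu == IntPtr.Zero) return false; // no submenu at that position

        menuId = GetMenuItemID(subMenu, itemPos);
        return menuId != -1;  // -1: no plain command item at that position
    }
}
```

A method like FormExitApp could call this first and send WM_COMMAND only when the lookup succeeds, logging a test failure otherwise.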

Extending the Test Harness

The code presented in this column gives you a foundation to build your own lightweight low-level UI test automation libraries. As you've seen, the steps are to determine what you want to do, determine which Windows message to use, write a small wrapper method around the SendMessage or PostMessage API functions, and then call the wrapper method in your test code.

Let's walk through an example. Recall that my example test automation supplies a color string to the comboBox1 control by calling a SendChars helper method, simulating typing into the combobox. Suppose you want to supply a color string by simulating a user selecting one of the comboBox1 items instead of simulated typing. By browsing through the WinUser.h file you'll find a CB_SELECTSTRING message. After looking this message up in the Visual Studio .NET integrated help you'd find this is just what you need and also get details of how to call SendMessage. In this case, the return value is CB_ERR = -1 if the selection is unsuccessful or the 0-based index of the selected item if successful. The wParam parameter specifies where to begin searching (-1 means search all), and the lParam parameter is the string to search for. Next you add a new P/Invoke definition if necessary. In this case, none of the library SendMessage signatures I've defined thus far is an exact match, so you'd add a new alias (the typical approach is to create a single P/Invoke definition for a target function and change how you marshal the parameters, but I find this approach easier):

[DllImport("user32.dll", EntryPoint="SendMessage", 
  CharSet=CharSet.Auto)]
static extern bool SendMessage2(IntPtr hWnd, uint Msg, 
                                int wParam, string lParam);

Then you'd code a wrapper method:

public void ComboBoxSelectString(string s)
{
  uint CB_SELECTSTRING = 0x014D;
  SendMessage2(this.ptrToWindow, CB_SELECTSTRING, -1, s);
}

And finally you'd call the new library method in your test code:

Controller cb = new Controller(f.PtrToWindow, 1); // comboBox1
cb.ComboBoxSelectString("blue");

Notice that because this column is primarily instructional I have left out most error checking, but in a production environment you'll want to add plenty. For example, in the ComboBoxSelectString helper method I didn't check my input argument and I ignored the SendMessage return value. Because your test automation will typically be running unattended and you expect to find errors, you'll want to liberally add error-checking code in both your library and your test harness.
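As one illustration of the kind of checking meant here, the sketch below validates its arguments before calling SendMessage and turns a CB_ERR result into an exception. It is a standalone variation rather than the library's method: ComboBoxHelper, SelectString, and the SendMessageSelect alias are hypothetical names, and the int return type mirrors the 32-bit signatures used elsewhere in this column.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper, not part of the article's library.
public static class ComboBoxHelper
{
    const uint CB_SELECTSTRING = 0x014D;
    const int CB_ERR = -1;

    [DllImport("user32.dll", EntryPoint = "SendMessage", CharSet = CharSet.Auto)]
    static extern int SendMessageSelect(IntPtr hWnd, uint Msg,
                                        int wParam, string lParam);

    // Selects the first combobox item that begins with 's' and returns its
    // 0-based index. Throws instead of silently ignoring a failure.
    public static int SelectString(IntPtr comboBox, string s)
    {
        if (comboBox == IntPtr.Zero)
            throw new ArgumentException("combobox handle must not be zero", "comboBox");
        if (s == null)
            throw new ArgumentNullException("s");
        if (s.Length == 0)
            throw new ArgumentException("search string must not be empty", "s");

        // wParam of -1 searches the whole list from the beginning.
        int index = SendMessageSelect(comboBox, CB_SELECTSTRING, -1, s);
        if (index == CB_ERR)
            throw new InvalidOperationException(
                "No combobox item begins with '" + s + "'");
        return index;
    }
}
```

Whether to throw or to return an error code is a design choice. In a harness that already wraps the scenario in try/catch, an exception carries the failure reason to the log without extra plumbing.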

The automated UI test technique I've presented here has been used successfully on many products. Because it is very quick and easy to implement it can be used early in the product cycle when the system under test is highly unstable. The disadvantage of this UI test automation technique is that it is relatively lightweight so it's difficult to handle all possible UI testing situations. Additionally, the code in this column is essentially traditional Win32® code that has been ported to managed code. I have not taken advantage of some of the new features of Windows Forms, such as support for the WM_GETCONTROLNAME message which allows you to easily identify controls that have no caption. For more information, see Brian McMaster's article "Automating Windows Forms".

The next generation of Windows, code-named "Longhorn," will use a new graphics subsystem called "Avalon," which promises to take the UI test automation to a new level by enabling direct and consistent access to all Windows UI elements. Even so, there will still be situations where low-level automation techniques, such as the ones I've presented here, will be useful.

Send your questions and comments for James to testrun@microsoft.com.

James McCaffrey works for Volt Information Sciences Inc., where he manages technical training for software engineers working at Microsoft. He has worked on several Microsoft products including Internet Explorer and MSN Search. James can be reached at jmccaffrey@volt.com or v-jammc@microsoft.com.