Test Run
Web UI Automation with Windows PowerShell
Dr. James McCaffrey
Code download available at: TestRun2008_03.exe (155 KB)
Contents
The Web App
Test Automation Scripting with Windows PowerShell
Taking It a Step Further
Windows PowerShell™, the new Microsoft command shell and scripting language, is a great platform for several kinds of lightweight test automation. In this month's Test Run column, I show you how to use Windows PowerShell to create quick and easy UI test automation for any kind of Web application by automating Internet Explorer®. This column is primarily intended for beginners, but experienced engineers will find some interesting information here, too.
Before you start, make sure you have added whatever sites you are going to test to your "Trusted Sites" list in Internet Explorer—otherwise the scripts may not work. I begin my automation demonstration by issuing the Windows PowerShell command:
PS C:\> $ie = new-object -com "InternetExplorer.Application"
This creates an instance of the classic InternetExplorer COM automation object from the SHDocVw.dll library. The new-object keyword is a Windows PowerShell cmdlet (pronounced command-let). There are approximately 130 cmdlets, and they form the heart of Windows PowerShell functionality. You can explore what the cmdlets are by running get-command and you can always get help on a cmdlet by running get-help Command. I supply a -com switch (which is actually a shortcut for -comObject) to new-object. This specifies that I am instantiating a classic COM object using the object's ProgID rather than instantiating a managed object.
I store the resulting object into the $ie variable (all Windows PowerShell variables are preceded by a $ character, making them easy to distinguish from other token types). Next, I use the Navigate method to load my dummy MiniCalc Web application under test into my browser automation object:
PS C:\> $ie.navigate("https://localhost/MiniCalc/Default.aspx")
PS C:\> $ie.visible = $true
One of the great features of Windows PowerShell is that it can help you explore an object's capabilities. With older scripting technologies, if I didn't know that the InternetExplorer object has a Navigate method, I'd be forced to look that information up using some sort of external reference. But with Windows PowerShell, I have several quick ways to see the available methods and properties for an object. For example, I can use tab completion by typing "$ie." and then pressing the Tab key. After each key press, an available property or method will be displayed. I can also use the get-member cmdlet to get a list of all available properties and methods, and their signatures:
PS C:\> $ie | get-member | more
Another Windows PowerShell discovery capability is command completion. For instance, I can type "$ie.vi" and then press the Tab key, and Windows PowerShell will finish typing my $ie.visible statement for me. These discovery features in Windows PowerShell are huge time savers.
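As a rough illustration of the get-member discovery technique, the sketch below lists members by type. It assumes an ordinary .NET string as a stand-in for $ie so that it can be tried without a browser; the variable names are hypothetical.

```powershell
# Assumption: a plain .NET string stands in for the $ie COM object,
# so the commands can be tried without launching a browser.
$sample = "MiniCalc"

# List only the methods, then only the properties, by name.
$methodNames = $sample | get-member -membertype Method |
  foreach-object { $_.Name }
$propertyNames = $sample | get-member -membertype Property |
  foreach-object { $_.Name }

write-host "Methods found: $($methodNames.Count)"
write-host "Properties: $propertyNames"
```

The same two pipelines should work when $sample is replaced by $ie, although the member list for a COM object will of course be different.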
Next, I obtain references to all input controls on my application under test:
PS C:\> $doc = $ie.document
PS C:\> $tb1 = $doc.getElementByID("TextBox1")
PS C:\> $tb2 = $doc.getElementByID("TextBox2")
PS C:\> $add = $doc.getElementByID("RadioButton1")
PS C:\> $btn = $doc.getElementByID("Button1")
I use the Document property to fetch the active document and then use the getElementById method to obtain references to each control. Notice that for this technique to work, all of my HTML elements must have an ID value. In most situations this is not a problem. If you create your Web application with Visual Studio®, all controls automatically receive IDs. If you are writing test automation for a Web application where the elements do not have IDs, you can also use the getElementsByTagName method to return a collection of elements and then access a specific element by index.
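For elements without IDs, the index-based lookup mentioned above might look something like the following sketch. The getElementByTagIndex helper is hypothetical (it is not part of the column's code download), and it assumes the document object exposes getElementsByTagName and that the returned collection has a length property and an item method.

```powershell
# Hypothetical helper: returns the element at position $index among all
# elements with the given tag name, or $null if the index is out of range.
function getElementByTagIndex($doc, [string] $tagName, [int] $index)
{
  $elements = $doc.getElementsByTagName($tagName)
  if ($null -eq $elements) {
    return $null
  }
  if ($index -lt 0 -or $index -ge $elements.length) {
    return $null
  }
  return $elements.item($index)
}
```

A call such as getElementByTagIndex $ie.document "input" 2 would then return the third input element on the page. Which index maps to which control depends entirely on the page's markup, including any hidden input fields the server generates, so an index-based lookup is more fragile than a lookup by ID.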
Next, my automation simulates some user input:
PS C:\> $tb1.value = 5
PS C:\> $tb2.value = 7
PS C:\> $add.checked = $true
PS C:\> $btn.click()
Notice that although the TextBox1 and TextBox2 values referenced by $tb1 and $tb2 are string types, I can omit the quotes because Windows PowerShell will convert the values to the correct data type for me. I could equally have typed the command as:
PS C:\> $tb1.value = "5"
I finish my interactive automation by examining the resulting state of the MiniCalc Web application:
PS C:\> $tb3 = $doc.getElementById("TextBox3")
PS C:\> $ans = $tb3.value
PS C:\> if ($ans -eq "12.0000") { 'Pass' } else { '*FAIL*' }
I first get a reference to the TextBox3 control. I did not do this earlier because, after calling the $btn click method, an HTTP request is sent to the Web server and a new page with a new TextBox3 value is generated and returned to the client browser. For brevity, I can simply type 'Pass' instead of write-host 'Pass' because the default Windows PowerShell action for a string value is to output the value to the host.
In the following sections of this month's column, I briefly describe the dummy MiniCalc Web application under test so you'll know exactly what is being tested. Then I refactor the interactive commands shown in Figure 1 to a more practical Windows PowerShell script and demonstrate how to deal with tricky timing issues. I conclude by examining some pros and cons of using Windows PowerShell for Web UI automation compared to alternative approaches such as using a commercial test framework, writing a custom C# harness with Visual Studio, and writing custom automation that uses a JavaScript approach. I think you'll find the techniques I present here to be an extremely useful and valuable addition to your software testing toolkit.
Figure 1 Web App UI Automation with Windows PowerShell
The Web App
Let's begin by examining the MiniCalc ASP.NET Web app shown in the background of Figure 1, which is the target of my UI test automation. MiniCalc is deliberately simple, but the techniques I present in this month's column can automate ASP.NET Web applications, classic ASP Web applications, and applications created with technologies such as PHP and Ruby.
To keep my Web application code as short and as simple as possible, I coded the app in Notepad and placed both logic code and display code in a single file. The entire code is listed in Figure 2. (All the code discussed in this column is available on the MSDN® Magazine Web site.) I named my Web application Default.aspx and placed the app at Web site https://localhost/MiniCalc.
Figure 2 Web Application under Test
<%@ Page Language="C#" %>
<script runat="server">
private void Button1_Click(object sender, System.EventArgs e)
{
int alpha = int.Parse(TextBox1.Text.Trim());
int beta = int.Parse(TextBox2.Text.Trim());
System.Threading.Thread.Sleep(3000);
if (RadioButton1.Checked) {
TextBox3.Text = Sum(alpha, beta).ToString("F4");
}
else if (RadioButton2.Checked) {
TextBox3.Text = Product(alpha, beta).ToString("F4");
}
else
TextBox3.Text = "Select method";
}
private static double Sum(int a, int b) {
double ans = a + b;
return ans;
}
private static double Product(int a, int b) {
double ans = a * b;
return ans;
}
</script>
<html>
<head>
<style type="text/css">
fieldset { width: 16em }
body { font-size: 10pt; font-family: Arial }
</style>
<title>Default.aspx</title>
</head>
<body bgColor="#ccffff">
<h3>MiniCalc by ASP.NET</h3>
<form method="post" name="theForm" id="theForm"
runat="server" action="Default.aspx">
<p>
<asp:Label id="Label1" runat="server">
Enter integer:  
</asp:Label>
<asp:TextBox id="TextBox1" width="100" runat="server" />
</p>
<p>
<asp:Label id="Label2" runat="server">
Enter another: 
</asp:Label>
<asp:TextBox id="TextBox2" width="100" runat="server" />
</p>
<p></p>
<fieldset>
<legend>Arithmentic Operation</legend>
<p>
<asp:RadioButton id="RadioButton1"
GroupName="Operation" runat="server"/>
Addition
</p>
<p>
<asp:RadioButton id="RadioButton2"
GroupName="Operation" runat="server"/>
Multiplication
</p>
<p></p>
</fieldset>
<p>
<asp:Button id="Button1" runat="server"
text=" Calculate " onclick="Button1_Click" />
</p>
<p>
<asp:TextBox id="TextBox3" width="120" runat="server" />
</p>
</form>
</body>
</html>
To perform Web application UI automation, you must know the IDs of the user controls you wish to manipulate and examine. If you look over the code in Figure 2, you'll see that I use Visual Studio-style default IDs for my controls. I use the IDs TextBox1 and TextBox2 for the two textbox controls that will hold the integers entered by the user:
<asp:TextBox id="TextBox1" width="100" runat="server" />
I use RadioButton1 and RadioButton2 as the IDs for the two RadioButton controls, which allow the user to select either Addition or Multiplication; Button1 as the ID for the Button control, which causes the Web app to add or multiply the values in TextBox1 and TextBox2; and TextBox3 as the ID for the textbox control, which holds the result. All my logic is contained in the Web app's Button1_Click method, which handles the Button1 click event.
I begin by fetching the user input and converting from string to type int, as shown here:
int alpha = int.Parse(TextBox1.Text.Trim());
int beta = int.Parse(TextBox2.Text.Trim());
System.Threading.Thread.Sleep(3000);
After capturing the user input, I insert a Thread.Sleep statement in order to simulate some processing time, such as that which would occur in a real Web application that accesses a back-end database. As you'll see shortly, dealing with unpredictable HTTP response times is the most difficult part of writing lightweight UI test automation for Web applications.
After I have the two user-supplied integers I need, I check which operation (addition or multiplication) the user wants, compute the result, and place that result into the TextBox3 control:
if (RadioButton1.Checked) {
TextBox3.Text = Sum(alpha, beta).ToString("F4");
}
else if (RadioButton2.Checked) {
TextBox3.Text = Product(alpha, beta).ToString("F4");
}
else
TextBox3.Text = "Select method";
}
I implicitly cast my answer to type double, and when I place the result in the TextBox3 control, I format to four decimal places by using an "F4" argument to the ToString method.
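The "F4" formatting explains why the test automation looks for the string 12.0000 rather than 12. The same .NET formatting call can be tried from Windows PowerShell, as in this small sketch. The variable names are illustrative, and the invariant culture argument is an assumption added here so the result does not depend on regional settings.

```powershell
# Mirror the Web app's Sum logic: int operands, double result.
[int] $alpha = 5
[int] $beta = 7
[double] $sum = $alpha + $beta

# "F4" is fixed-point notation with four decimal places. The invariant
# culture is an assumption added so the separator is always a period.
$culture = [System.Globalization.CultureInfo]::InvariantCulture
$text = $sum.ToString("F4", $culture)
write-host $text   # 12.0000
```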
Test Automation Scripting with Windows PowerShell
Although the interactive test automation with Windows PowerShell described in the first section of this column can be useful in many testing situations, you will often want to create test automation scripts. Figure 3 shows one way to test the MiniCalc Web application. Figure 4 shows the script in action.
Figure 3 UI Test Automation with Windows PowerShell
# file: testScript.ps1
function main()
{
write-host "`nBegin test automation using Windows PowerShell`n"
write-host 'Launching IE'
$ie = new-object -com "InternetExplorer.Application"
$ie.navigate("about:blank")
$ie.visible = $true
[System.Threading.Thread]::Sleep(2000)
write-host "`nResizing IE to 425 x 535"
$ie.height = 535
$ie.width = 425
write-host "`nNavigating to MiniCalc Web application"
navigateToApp $ie "https://localhost/MiniCalc/Default.aspx" `
"TextBox1" 100 80
write-host "`nGetting input controls"
$doc = $ie.document
$tb1 = $doc.getElementByID("TextBox1")
$tb2 = $doc.getElementByID("TextBox2")
$add = $doc.getElementByID("RadioButton1")
$btn = $doc.getElementByID("Button1")
if ($tb1 -eq $null -or $tb2 -eq $null -or $add -eq $null -or $btn`
-eq $null) {
write-host "One or more controls are null" -backgroundcolor "red"`
-foregroundcolor "yellow"
}
else {
write-host "All controls found"
}
write-host "`nSetting TextBox1 to 5"
$tb1.value = 5
write-host "Setting TextBox2 to 7"
$tb2.value = 7
write-host "Selecting 'Addition' operation"
$add.checked = $true
write-host "`nClicking 'Calculate' button"
$tb3 = $doc.getElementByID("TextBox3")
$old = $tb3.value
$btn.click()
$wait = $true
$numWaits = 0
while ($wait -and $numWaits -lt 100) {
$numWaits++
[System.Threading.Thread]::Sleep(50)
$tb3 = $doc.getElementByID("TextBox3")
if ($tb3.value -ne $old) {
$wait = $false
}
else {
write-host "Waiting for app to respond $numWaits . . ."
}
}
if ($numWaits -eq 100) {
throw "Application did not respond after 100 delays"
}
else {
write-host "Application has responded"
}
write-host "`nChecking for 12.0000"
$tb3 = $doc.getElementByID("TextBox3")
$ans = $tb3.value
if ($ans -eq '12.0000') {
write-host "Target value found"
write-host "`nTest scenario: Pass" -foregroundcolor 'green'
}
else {
write-host "Target value not found"
write-host "`nTest scenario: *FAIL*" -foregroundcolor 'red'
}
trap {
write-host "Fatal error: " $_.exception.message -backgroundcolor red`
-foregroundcolor yellow
}
write-host "`nEnd test automation`n"
} # main
function navigateToApp($browser, [string] $url, [string] $controlID,`
[int] $maxDelays, [int] $delayTime)
{
$numDelays = 0
$loaded = $false
$browser.navigate($url)
while ($loaded -eq $false -and $numDelays -lt $maxDelays) {
$numDelays++
[System.Threading.Thread]::Sleep($delayTime)
$doc = $browser.document
if ($doc -eq $null) {
continue
}
$controlRef = $doc.getElementByID($controlID)
if ($controlRef -eq $null) {
write-host "Waiting for Web app to load $numDelays . . ."
}
else {
write-host "Web app loaded after $numDelays pauses"
$loaded = $true
}
}
if ($numDelays -eq $maxDelays) {
throw "Browser not loaded after $maxDelays delays"
}
}
main
# end script
Figure 4 Executing Test Automation with a Script
If you examine Figure 4, you'll see that after I launch a Windows PowerShell shell, I verify that my Windows PowerShell session's execution policy allows script execution and then invoke my script. Notice that under Windows PowerShell I must specify the path to the script even when the script is in the current directory (in which case I prefix the script name with .\).
The overall structure of the test script is:
# file: testScript.ps1
function main
{
# code
}
function navigateToApp($browser, [string] $url,
[string] $controlID, [int] $maxDelays,
[int] $delayTime)
{
# code
}
main
# end script
I call my main function main, but there is no default Windows PowerShell script entry point, so I could have named this function anything. After defining an auxiliary navigateToApp function, I issue the single statement main to launch my script's execution. The first few lines of the main function are:
write-host "`nBegin test automation using Windows PowerShell`n"
write-host 'Launching IE'
$ie = new-object -com "InternetExplorer.Application"
$ie.navigate("about:blank")
$ie.visible = $true
[System.Threading.Thread]::Sleep(2000)
My first two write-host statements show how, in Windows PowerShell, double quotes are intelligent in the sense that certain escape sequences, such as the `n newline character, and variable references beginning with the $ character are evaluated by the script execution engine. Single-quote-delimited strings are interpreted literally.
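The difference between the two quoting styles is easy to check interactively. The following sketch uses a hypothetical $appName variable and compares string lengths to show what was and was not evaluated.

```powershell
$appName = 'MiniCalc'

# Double quotes: the variable and the newline escape are both evaluated.
$expanded = "Testing $appName`n"

# Single quotes: every character is kept exactly as typed.
$literal = 'Testing $appName`n'

write-host $expanded.Length   # 17: 16 characters plus one newline
write-host $literal.Length    # 18: nothing was evaluated
```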
As explained earlier, I use the new-object cmdlet to instantiate an instance of a classic InternetExplorer COM automation object. I then navigate to the about:blank page and make my browser visible. I call directly into the Microsoft® .NET Framework to access the Thread.Sleep method and pause my test automation for two seconds. There are two points here. First, the ability to directly call into the .NET Framework is a key advantage of Windows PowerShell over most other scripting technologies. Second, although sleeping the thread in order to give the Web application under test a chance to respond works, there is a much better approach, which I will describe shortly.
The next few lines of my automation script set the Internet Explorer browser to a known state:
write-host "`nResizing IE to 425 x 535"
$ie.height = 535
$ie.width = 425
In general, when performing most types of Web application UI test automation, it's a good idea to set characteristics of the browser to a known state so that any bugs that are revealed by the automation can be observed more easily. For example, consider a Web app that is a front end to a SQL Server® data store and has a search functionality that displays results in a listbox control. Suppose the app's logic forgets to clear the listbox control after a particular search. Now starting from the standard default state, which will have an empty results listbox, test automation might determine that the Web app works correctly and not reveal the bug. But if you start the automation from a state that already contains some results in the listbox, the new results would be concatenated with the old results and the bug would be revealed. You should also write scripts that allow variations in initial state to aid in uncovering bugs that are sensitive to those variations.
Next, I point my browser to the MiniCalc Web application under test:
write-host "`nNavigating to MiniCalc Web application"
navigateToApp $ie "https://localhost/MiniCalc/Default.aspx" "TextBox1" 100 80
I invoke the navigateToApp helper function. You can interpret this call to mean "navigate Internet Explorer to URL https://localhost/MiniCalc/Default.aspx and then wait until a reference to an HTML element with ID equal to TextBox1 is accessible, pausing 80 milliseconds between attempts to access TextBox1, up to a maximum of 100 attempts." Determining when your application is loaded is not trivial. The crude alternative for loading my Web app is simple but generally not as effective:
$ie.navigate("https://localhost/MiniCalc/Default.aspx")
[System.Threading.Thread]::Sleep(8000)
The two problems with this approach are that there is no good way to predict how long to pause your automation, and there's no clear way to deal with a situation where the application under test does not load within the allotted time. The navigateToApp function solves both these problems and is listed in Figure 5.
Figure 5 NavigateToApp Helper Function
function navigateToApp($browser, [string] $url, [string] $controlID,
[int] $maxDelays, [int] $delayTime)
{
$numDelays = 0
$loaded = $false
$browser.navigate($url)
while ($loaded -eq $false -and $numDelays -lt $maxDelays) {
$numDelays++
[System.Threading.Thread]::Sleep($delayTime)
$doc = $browser.document
if ($doc -eq $null) {
continue
}
$controlRef = $doc.getElementByID($controlID)
if ($controlRef -eq $null) {
write-host "Waiting for Web app to load $numDelays . . ."
}
else {
write-host "Web app loaded after $numDelays pauses"
$loaded = $true
}
}
if ($numDelays -eq $maxDelays) {
throw "Browser not loaded after $maxDelays delays"
}
}
The navigateToApp function essentially goes into a delay loop, checking in each iteration to see whether a specified user control reference is available. To prevent an infinite loop, the loop also exits once a maximum number of iterations has been reached.
Although Windows PowerShell is object-based, it is considered acceptable to refer to simple objects as variables. You can see that the navigateToApp function uses local variables $numDelays and $loaded, but they don't have to be explicitly declared to be local variables. I could have preceded these variables with the $private: qualifier to make their local scope explicit or used the $global: qualifier to make the values of these variables available outside the scope of the function. The control logic in navigateToApp checks if the document object is available; if not, I use the continue statement to short-circuit out of the current loop iteration and then try again after a delay. If I do have a valid reference to the document object, then I attempt to get a reference to a target element.
I also could have used the Windows PowerShell elseif control structure. Additionally, instead of using an explicit $loaded variable, I could have used the Windows PowerShell break statement to exit the delay loop. Windows PowerShell has a rich set of control structures that allow you to program in many different styles, including whatever programming style you are accustomed to, and this shortens your learning curve.
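Two of the points above, implicit local scope and exiting a loop with break, can be seen together in a small sketch. The countUntil function and its parameters are hypothetical and exist only for illustration.

```powershell
$numDelays = 99   # script-level variable that shares the name

# Hypothetical function: counts upward and stops early at $target.
function countUntil([int] $target, [int] $maxDelays)
{
  $numDelays = 0              # new local variable; the outer one is untouched
  while ($numDelays -lt $maxDelays) {
    $numDelays++
    if ($numDelays -eq $target) {
      break                   # exit the loop without a flag variable
    }
  }
  return $numDelays
}

$reached = countUntil 3 10
write-host "Function returned $reached; outer value is still $numDelays"
```

Assigning to $numDelays inside the function creates a separate local variable, which is why the outer value is unchanged after the call.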
After exiting the delay loop, I check to see whether the loop exited because of exceeding the maximum number of delays (which means I never obtained a reference to a target user control, signaling that the Web application never loaded successfully). In this case, I throw an exception that I will catch in the main function by using the Windows PowerShell trap mechanism.
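The delay-loop pattern can also be generalized. The following is a minimal sketch, not the column's code: a hypothetical waitUntil function that accepts any condition as a script block. Because it returns only when the condition succeeds and throws otherwise, a success on the very last permitted attempt is not mistaken for a timeout.

```powershell
# Hypothetical helper: polls $condition until it returns $true.
# Returns the number of delays used; throws if the limit is reached.
function waitUntil([scriptblock] $condition, [int] $maxDelays, [int] $delayTime)
{
  for ($numDelays = 1; $numDelays -le $maxDelays; $numDelays++) {
    [System.Threading.Thread]::Sleep($delayTime)
    if (& $condition) {
      return $numDelays
    }
  }
  throw "Condition not met after $maxDelays delays"
}

# Stand-in condition: becomes true on the third check.
$state = @{ tries = 0 }
$used = waitUntil { $state.tries++; $state.tries -ge 3 } 10 20
write-host "Condition met after $used delays"
```

In a browser test, the condition script block would instead check for a control reference or a changed control value, ideally after confirming that the reference it inspects is not null.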
After the Web application under test is fully loaded, I obtain references to all user input controls:
write-host "`nGetting input controls"
$doc = $ie.document
$tb1 = $doc.getElementByID("TextBox1")
$tb2 = $doc.getElementByID("TextBox2")
$add = $doc.getElementByID("RadioButton1")
$btn = $doc.getElementByID("Button1")
As I mentioned earlier, the getElementById method requires that all controls have an ID attribute, but in situations where you need to access controls without IDs, you can also use the getElementsByTagName method.
Next, I perform a quick check to make sure my HTML element references are valid:
if ($tb1 -eq $null -or $tb2 -eq $null -or
$add -eq $null -or $btn -eq $null) {
write-host "One or more controls are null" `
-backgroundcolor "red" -foregroundcolor "yellow"
}
else {
write-host "All controls found"
}
When writing Windows PowerShell-based UI test automation, it is generally a matter of personal coding style whether to throw an exception or to simply display a message using the write-host cmdlet when you error check. Conceptually, throwing an exception is the more logical approach, but using write-host allows you to specify easily visible text using the -backgroundcolor and -foregroundcolor arguments.
After I have verified that all user input controls are available, I can easily manipulate them, like so:
write-host "`nSetting TextBox1 to 5"
$tb1.value = 5
write-host "Setting TextBox2 to 7"
$tb2.value = 7
write-host "Selecting 'Addition' operation"
$add.checked = $true
Now I am ready to simulate the user action that will trigger a post to the Web server—a button-click in this case—and then wait for the response from the server. This is not so easy. Similar to loading the application under test, a simplistic approach such as the following just isn't effective because there is no reliable way to know in advance how long to pause your test automation:
write-host "`nClicking 'Calculate' button"
$btn.click()
[System.Threading.Thread]::Sleep(5000)
One of several possible solutions is to first get some pre-request control value on the Web application, then trigger the HTTP request, and then use a delay loop until the pre-request control value has changed. For example, first I can fetch the value in the TextBox3 control and save it:
$tb3 = $doc.getElementByID("TextBox3")
$old = $tb3.value
Now I can simulate a user clicking on the Calculate button:
$btn.click()
And then I can go into a delay loop until either the value in TextBox3 has changed or I exceed some maximum number of delays:
$wait = $true
$numWaits = 0
while ($wait -and $numWaits -lt 100) {
$numWaits++
[System.Threading.Thread]::Sleep(50)
$tb3 = $doc.getElementByID("TextBox3")
if ($tb3.value -ne $old) {
$wait = $false
}
else {
write-host "Waiting for app to respond $numWaits . . ."
}
}
After my delay loop terminates, I check to see if the exit occurred because of exceeding the maximum number of attempts to find a change in the target control's value:
if ($numWaits -eq 100) {
throw "Application did not respond after 100 delays"
}
else {
write-host "Application has responded"
}
At this point in my automation, I have successfully loaded the Web app under test, manipulated elements, triggered an HTTP request, and determined that there has been a response from the server. Now I can check the resulting state of the Web application to determine a pass or fail result. First, I get a reference to, and the value of, TextBox3:
write-host "`nChecking for 12.0000"
$tb3 = $doc.getElementByID("TextBox3")
$ans = $tb3.value
I must fetch a fresh reference to TextBox3 at this point because a reference obtained before the HTTP request would be lost after the HTTP response. Now I can check the resulting value and display the test scenario result:
if ($ans -eq '12.0000') {
write-host "Target value found"
write-host "`nTest scenario: Pass" -foregroundcolor 'green'
}
else {
write-host "Target value not found"
write-host "`nTest scenario: *FAIL*" -foregroundcolor 'red'
}
At the end of my main function, I use the trap statement to deal with any exceptions that may have been thrown during the test run:
trap {
write-host "Fatal error: " $_.exception.message `
-backgroundcolor red -foregroundcolor yellow
}
Here I simply display the exception message. In some cases, you may want to use the continue statement to force your test automation to continue running even on a fatal error. Or you may want to exit your automation altogether.
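To make the continue option concrete, here is a small sketch under stated assumptions: the runScenario function and its messages are hypothetical, and a thrown string simulates a fatal error. With continue inside the trap block, execution resumes at the statement following the one that failed.

```powershell
# Hypothetical scenario function with a simulated failure.
function runScenario()
{
  trap {
    "Trapped: " + $_.Exception.Message
    continue    # resume with the statement after the one that failed
  }
  throw "Simulated failure"
  "Resumed after the failing statement"
}

$result = runScenario
$result | foreach-object { write-host $_ }
```

Replacing continue with break would instead stop the function and pass the error on to the caller.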
Taking It a Step Further
I hardcoded many values for clarity, but you will likely want to parameterize your automation scripts in a production environment. For example, I hardcoded the path to the Web application under test. Windows PowerShell has good mechanisms for passing command-line arguments to scripts—you can add parameters to a script by adding param($param1, $param2), and so on, to the top of your script.
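As a sketch of what such parameterization might look like, the following uses a script block as a stand-in for a script file so that the example is self-contained. The parameter names and default values are assumptions chosen to match the hardcoded values in my script, and the second URL is purely hypothetical.

```powershell
# Assumption: a script block stands in for a .ps1 file; a param statement
# at the top of a script file is written the same way.
$testScript = {
  param(
    [string] $url = "https://localhost/MiniCalc/Default.aspx",
    [int] $maxDelays = 100,
    [int] $delayTime = 80
  )
  "Testing $url ($maxDelays delays of $delayTime ms)"
}

# Run once with the defaults, then override two values by name.
$defaultRun = & $testScript
$customRun = & $testScript -url "https://localhost/Other/Default.aspx" -maxDelays 20
write-host $defaultRun
write-host $customRun
```

With a real script file, the equivalent invocation would pass the same named arguments after the script path.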
Additionally, you may want to extend your automation scripts by parameterizing test case input values and corresponding expected values. Again, Windows PowerShell has elegant ways to read test case data from an external flat text file, an external XML file, a SQL database, or other test case data store.
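For example, test case data from a flat text file might be handled along the lines of the following sketch. The colon-delimited layout and the field names are assumptions made for illustration, and the lines are defined inline so the example is self-contained.

```powershell
# Assumption: each line has the hypothetical layout
#   caseID:input1:input2:operation:expected
# With a real file, get-content would return the same kind of line array.
$lines = @(
  "001:5:7:add:12.0000",
  "002:3:4:multiply:12.0000",
  "003:0:9:add:9.0000"
)

# Split each line into fields and store them in a hash table.
$cases = @(foreach ($line in $lines) {
  $fields = $line.Split(':')
  @{ ID = $fields[0]; Input1 = $fields[1]; Input2 = $fields[2];
     Operation = $fields[3]; Expected = $fields[4] }
})

foreach ($case in $cases) {
  write-host "Case $($case.ID): $($case.Input1) $($case.Operation) $($case.Input2) -> expect $($case.Expected)"
}
```

Each hash table could then drive one pass through the automation, supplying the two input values, the operation to select, and the expected result to compare against.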
Similarly, my example test automation script writes its pass/fail result to the shell. Windows PowerShell also allows you to easily save results to any type of data store that you wish. One interesting example of this is to write your test results to a Microsoft Team Foundation Server, which gives you great test management capabilities. For details on integrating with Team Foundation Server, please see my Test Run column in the MSDN Magazine Launch 2008 issue (msdn.microsoft.com/msdnmag/issues/08/LA/TestRun).
In much the same way that test automation complements rather than replaces manual testing, ultralightweight software test automation with Windows PowerShell complements rather than replaces other types of test automation and test frameworks. For example, because the UI testing technique I've presented here uses the Internet Explorer object model, you cannot use it to test Web applications running on other Web browsers or devices. In situations like those, you can employ a JavaScript approach (see the February 2007 Test Run column, "AJAX Test Automation", at msdn.microsoft.com/msdnmag/issues/07/02/TestRun), or use a commercial test framework.
The techniques I've presented here are available to some extent in other scripting languages. But based on my experience, using Windows PowerShell for script-based test automation has five small, but significant, characteristics that give it an advantage over other scripting environments.
First, Windows PowerShell can directly access both COM objects and the .NET Framework (rather than go through a wrapper mechanism).
Second, the interactive mode of Windows PowerShell allows you to quickly experiment while developing your automation scripts, which greatly speeds up the script creation process.
Third, the built-in discovery mechanisms of Windows PowerShell, such as dot-tab completion and the get-member cmdlet, provide you with what is essentially a virtual documentation help feature.
Fourth, the Windows PowerShell built-in collection of cmdlets simplifies many mundane test automation tasks.
Fifth, in my opinion, Windows PowerShell is simply easier and more intuitive to use.
While none of these advantages is huge by itself, taken together they give you a high rate of return on your test automation with Windows PowerShell relative to the cost you pay to write your automation.
Send your questions and comments for James to testrun@microsoft.com.
Dr. James McCaffrey works for Volt Information Sciences, Inc., where he manages technical training for software engineers working at the Microsoft Redmond, Wash., campus. He has worked on several Microsoft products including Internet Explorer and MSN Search. James is the author of .NET Test Automation Recipes and can be reached at jmccaffrey@volt.com or v-jammc@microsoft.com.