Test Run

Using Excel For Test Data

Dr. James McCaffrey

Code download available at:TestRun2006_11.exe(171 KB)


The Library under Test
The Test Case Data File
Creating the Test Harness
Storing Results
Wrapping Up

If you are writing test automation, there are four complementary approaches you can take: buy and use a commercial test framework (including Visual Studio® 2005 Team System, which has some neat new features), use an open source test framework, write custom heavyweight automation (generally more than four pages of code), or write lightweight automation (generally under four pages of code). Most of my Test Run columns focus on techniques for lightweight automation. When writing lightweight test automation, one of your best choices for test case data and test results storage is to use Microsoft® Excel®.

In this month's column I will show you how to use Excel for test automation storage. Even though this is aimed at developers with beginning .NET coding skills, intermediate and advanced programmers should still find the information I present here interesting and rather useful.

Suppose you are writing a cribbage game like the one shown in Figure 1. In cribbage, your hand consists of five cards—four in your hand and one shared by you and your opponent. Your point score is determined by a number of factors, such as the number of pairs in your hand and the number of card combinations you hold that sum to 15. Each pair is worth 2 points and each set of cards that sum to 15 is worth 2 points. Face cards count as 10, aces count as 1, and all other cards count as their point value. For the hand shown in Figure 1, the value of the pairs is 2 because there is a single pair of 7s. The value of the 15s is 6 because there are three combinations of cards that sum to 15: the 7 of diamonds and 8 of clubs; the 7 of clubs and 8 of clubs; the 7 of diamonds, 7 of clubs, and ace of hearts.

Figure 1 App Under Test

Figure 1** App Under Test **

Behind the scenes, the game UI calls into a CribbageLib class library that you've written. This library houses classes to represent a Card object and a Hand object, as well as methods and properties, such as Hand.ValueOfPairs and Hand.ValueOf15s.

In this column, I will present techniques for using Excel to hold test case data and store test results, as illustrated by the program shown in Figure 2. The lightweight test harness begins by checking to see if the Excel spreadsheet that holds test case data exists. (I'll explain shortly how to set up an Excel spreadsheet so that you can programmatically read from it.) Next, the harness probes into the Excel data to determine the number of test cases and then reads all the Excel data into an in-memory DataTable object. (I'll discuss the alternative approach of reading one test case at a time in the last section of this column.) The harness creates a new Excel spreadsheet to hold test case results, and then executes each test case. The harness determines a pass/fail result for each test, prints the result to the console, and saves the result to the Excel spreadsheet. The complete source code for the test harness and the library under test are in the code download that accompanies this column.

In the sections that follow, I will briefly describe the class library under test so you will understand how to set up appropriate Excel spreadsheets for storage. I will present the code for the simple test harness that produced the results shown in Figure 2. I will explain the code in detail so you will be equipped to modify and extend the code to use Excel to meet your own needs. And I will conclude with a very brief discussion of how and when to use Excel for test storage, looking at the pros and cons of this approach when compared to other types of test storage (SQL Server™ databases, text files, and XML files in particular). I believe you'll find the ability to use Excel for lightweight test storage a useful addition to your software testing, development, and management tool kit.

Figure 2 Using Excel for Test Data Storage

Figure 2** Using Excel for Test Data Storage **(Click the image for a larger view)

The Library under Test

In order to test a software component with automation, you must know at a bare minimum the component's interfaces. The code in Figure 3 shows the structure of my CribbageLib class library. Let me emphasize that I am deliberately not adhering to standard .NET coding guidelines so I can keep the size of my library under test small.

Figure 3 CribbageLib Library Structure

using System; namespace CribbageLib { public class Card { public int rank; // 0 = Ace, 1 = Deuce, . . , 12 = King public int suit; // 0 = Clubs, 1 = Diamonds, 2 = Hearts, 3 = Spades public int value; // Ace = 1, Deuce = 2, . . , Jack/Queen/King = 10 public string image; // "Ac", "9h", etc. public Card(string c) { // create Card from string c } public override string ToString() { return image; } } // class Card public class Hand { Card[] cards; public Hand(Card c1, Card c2, Card c3, Card c4, Card c5) { cards = new Card[5]; cards[0] = c1; cards[1] = c2; cards[2] = c3; cards[3] = c4; cards[4] = c5; } public override string ToString() { return "{ " + cards[0].ToString() + " " + cards[1].ToString() + " " + cards[2].ToString() + " " + cards[3].ToString() + " " + cards[4].ToString() + " }"; } public int ValueOf15s { get { /* return point value */ } } public int ValueOfPairs { get { /* return point value */ } } } // class Hand } // ns CribbageLib

The CribbageLib library contains a Card class to represent a single card. I use int fields rank, suit, and value to represent the rank (such as ace, deuce, three), the suit (club, diamond, heart, spade), and value in cribbage (aces are worth 1 point, deuces 2, face cards are 10 points). The Card constructor accepts a string (such as "Ac" for the ace of clubs), parses the two input characters, and stores values for the three corresponding member fields. For convenience when writing a ToString() method, I also store a string field named image, which is just the input argument. The Hand class is essentially an array of five Card objects, representing the four cards a player holds plus the one common card shared by both players. The Hand constructor accepts five Card objects.

The heart of my CribbageLib class library under test consists of the Hand.ValueOf15s and Hand.ValueOfPairs properties. These properties return the point value for the number of 15s in a cribbage hand and the point value for the number of pairs in a hand. Although the purpose of my CribbageLib library is merely to provide a sample to test, the ValueOf15s and the ValueOfPairs properties are quite interesting little problems in their own right. I've actually seen these used as hiring interview questions. However, as far as this column is concerned, if you want to use Excel for lightweight test storage, you don't need to know the implementation details of the properties. You can treat these properties as black boxes that have expected results for a particular input.

The Test Case Data File

You typically create your Excel data files manually when using Excel test case data storage for lightweight test automation, (I will demonstrate how to create Excel files programmatically in just a moment.) Manually creating an Excel spreadsheet for test case data is like creating a spreadsheet for any other purpose, but with a couple of added steps. Because my test harness uses OLE DB technology to access Excel programmatically, I need to supply column header labels to identify each column so I can write queries. I also need to create a named area in the spreadsheet to correspond to a SQL table name.

For this example, I want to store a test case ID number (like "00001"), a test case input (like "Ah7d7c8cJs") to represent a cribbage hand, a value to indicate which class property I am sending my input to (like "ValueOf15s"), and an expected result (like "6"). So, I launch Excel, manually type in appropriate column names ("caseID", "input", "method", and "expected"), and manually enter my test case data. To assign a virtual table name to my spreadsheet, I select the column header cells (cells A1 through D1 in this case) and then type a table name into the Excel Name Box field in the upper left-hand corner of the spreadsheet. In this case, I named my table tblTestCases, as you can see in Figure 4.

Figure 4 Test Case Data in Excel

Figure 4** Test Case Data in Excel **(Click the image for a larger view)

Although it is not strictly necessary to do so, I also rename the worksheet to the same name as the virtual table name. When programmatically accessing Excel data, the name in the Name Box field (rather than the worksheet name) is used to determine which data to reference. But as you'll see in a moment, when you programmatically create an Excel spreadsheet, the worksheet name receives the same name as that assigned to the Name Box. So renaming the worksheet of Excel test data that has been manually created is more consistent than leaving the worksheet named Sheet1.

If you look closely at the Excel spreadsheet in Figure 4, you will notice that I entered all the test case data as text. For instance, the expected result for the first test case is 6 formatted as text rather than 6 formatted as a numeric type. This is indicated by the small green triangle in cell D2. I recommend you store all test case data as text and then programmatically convert non-string data to the appropriate data type in the test harness. The Excel data model is not the same as the .NET data model. It is generally easier to perform all data type conversions in the harness rather than trying to get data types as close as possible in the spreadsheet.

Creating the Test Harness

For clarity, I've combined the code used to perform several different tasks with Excel test case data into a single test harness program. Figure 5 shows the structure of the test harness that generated the output shown in Figure 2.

Figure 5 Test Harness Structure

using System; using System.IO; using System.Data; using System.Data.OleDb; using CribbageLib; namespace CribbageLibTest { class Program { static void Main(string[] args) { try { Console.WriteLine("\nBegin lightweight test harness with " + "Excel storage demo\n"); Console.WriteLine("Verifying Excel test case data exists"); if (!File.Exists("..\\..\\..\\TestCases.xls")) throw new Exception("Test case data not found"); else Console.WriteLine("Found test case data"); Console.WriteLine("Determining number of test cases"); ... // probe Excel test case data code here Console.WriteLine("Number of test cases = " + count); Console.WriteLine("Reading all test case data from Excel " + "into memory"); ... // read all test case data code here Console.WriteLine("All test case data now in memory"); Console.WriteLine("\nCreating Excel test case results file"); ... // create Excel code here Console.WriteLine("Results file is " + resultsFile); Console.WriteLine("\nExecuting CribbageLib tests\n"); ... // connect to Excel test results data here string caseid, input, method, result; int actual, expected; Console.WriteLine("caseID input method expected case result"); Console.WriteLine("==========================================="); for (int row = 0; row < dt.Rows.Count; ++row) // main test loop { ... // read test case, execute, print result, // save result to Excel } Console.WriteLine("\nAll test results stored to Excel\n"); Console.WriteLine("End Excel as test storage demo\n"); } catch (Exception ex) { Console.WriteLine("Fatal error: " + ex.Message); } Console.ReadLine(); } // Main() } // class Program } // ns CribbageLibTest

I begin my test harness by adding a project reference to the CribbageLib.dll component that houses my library under test. Then I add a using statement for the namespaces in the library so that I can reference the Card and Hand classes without having to fully qualify their names. In addition, I add a using statement to the System.Data.OleDb namespace, which contains classes that can be used to connect to, access, and manipulate OLE DB data sources, including Excel spreadsheets. I also add a using statement to the System.Data namespace so I can easily instantiate and use a DataTable object to act as an in-memory data store for my test case data from Excel. Because this column is essentially a tutorial, for simplicity I organize my test harness into a single Main method—you, however, may want to consider making your harnesses more modular.

I begin by printing a start message to console, then use the static File.Exists method to verify that the Excel test case data is located where I expect it to be. (To simplify this example, I have removed most of the error-checking code you would need in a production environment.) Once I know my test case data exists, I probe the Excel spreadsheet to find out how many rows of data there are:

int count; string probeConnStr = "Provider=Microsoft.Jet.OLEDB.4.0;" + "Data Source=testCases.xls;" + "Extended Properties=\"Excel 8.0;HDR=YES\""; using(OleDbConnection probeConn = new OleDbConnection(probeConnStr)) { probeConn.Open(); string probe = "SELECT COUNT(*) FROM [tblTestCases$A1:A65536] " + "Where caseID IS NOT NULL"; using(OleDbCommand probeCmd = new OleDbCommand(probe, probeConn)) { count = (int)probeCmd.ExecuteScalar(); } }

Here, I create a connection string that specifies the appropriate OLE DB provider, location, and auxiliary information. You have to be careful with the syntax. Notice that in the Extended Attributes property I use a \" sequence to embed a double quote character into the connection string. The HDR=YES attribute indicates that my Excel worksheet has an initial header row. The "Excel 8.0" part does not directly refer to the version of the Excel program on my computer—it refers to the Jet database ISAM (Indexed Sequential Access Method) format that is installed. You can check the ISAM version on your machine by viewing the system registry setting HKEY_LOCAL_MACHINE\Software\Microsoft\Jet\4.0\ISAM Formats. Next I create an OleDbConnection object, passing the connection string to the constructor. I call the Open method to establish a connection to my Excel spreadsheet. I then craft a select string that will return the number of non-NULL rows in my spreadsheet, which will be the number of test cases since the header row does not contribute to the return value. This, of course, assumes that I do not have any empty rows in my Excel test case data. Notice the somewhat unusual syntax in the SELECT statement, which refers to my Excel data. I surround my virtual table with square bracket characters and I append a $ character to the virtual table name. I specify a range of A1:A65536 because 65536 is the maximum number of rows supported in an Excel worksheet. Finally, I use the ExecuteScalar method of the OleDbCommand class so I can capture the return value of the number of test cases into a variable named count. And I complete the probing code by closing the OleDbConnection.

Now I'm ready to reconnect to my Excel spreadsheet and read all my test case data from Excel into memory like this:

string tcdConnStr = "Provider=Microsoft.Jet.OLEDB.4.0;" + "Data Source=testCases.xls;" + "Extended Properties=\"Excel 8.0;HDR=YES\""; using(OleDbConnection tcdConn = new OleDbConnection(tcdConnStr)) { tcdConn.Open(); int lastRow = count + 1; string select = "SELECT * FROM [tblTestCases$A1:D" + lastRow + "]"; OleDbDataAdapter oda = new OleDbDataAdapter(select, tcdConn); DataTable dt = new DataTable(); oda.FillSchema(dt, SchemaType.Source); oda.Fill(dt); }

In this snippet, I create a new connection string. I could use the same string variable as I used to probe the number of test cases; I use a different string variable only for clarity and to make it easier for you to modularize my code if you wish. Next I create a new OleDbConnection object. Again, I could have reused the connection object I used for the test case count probe. After opening the connection, I compute the last row of test case data by adding 1 to the number of actual test cases in the Excel spreadsheet—the extra 1 accounts for the header row. Then I build a SELECT string by concatenating the first part of the select with the variable holding the number of the last row that contains test case data.

In order to fill my in-memory DataTable object with test case data, I create an OleDbDataAdapter. Then I instantiate a new DataTable object to hold all my test case data. I use the OleDbDataAdapter's FillSchema method to configure my DataTable attributes to conform to my Excel data source. I then call the Fill method, which actually transfers my test case data from Excel into memory. And I finish by closing the OleDbConnection. Very clean and simple.

Storing Results

Now that I've gotten all my test case data into memory, I am ready to create a new Excel spreadsheet to store my test case results. The code is similar to my other Excel tasks except this time I use an OleDbCommand object (see Figure 6).

Figure 6 Creating a Spreadsheet to Store Results

string stamp = DateTime.Now.ToString("s"); stamp = stamp.Replace(":", "-"); string resultsFile = "testResults" + stamp + ".xls"; string tcrConnStr = "Provider=Microsoft.Jet.OLEDB.4.0;" + "Data Source=" + resultsFile + ";" + "Extended Properties=\"Excel 8.0;HDR=YES\""; using(OleDbConnection tcrConn = new OleDbConnection(tcrConnStr)) { tcrConn.Open(); string create = "CREATE TABLE tblResults (caseID char(5), Result char(4), WhenRun DateTime)"; using(OleDbCommand createCmd = new OleDbCommand(create, tcrConn)) { createCmd.ExecuteNonQuery(); } }

I begin by fetching the system date and time so that I can create a time-stamped results file name. I pass an "s" argument to the ToString() method so I will get a date-time string in a format that is sortable (such as 2006-07-23T16:56:44). But because the : character is not valid in a file name, I use the String.Replace method to replace all : characters with hyphens.

My connection string is similar to the two others I've already used, except that I embed the time-stamped file name into it. I then open the connection to a new Excel file. The file does not actually exist yet, so you can think of this as a kind of virtual connection.

Next I craft a SQL-like CREATE string specifying a virtual table name of tblResults. Notice that I can specify the data type for each column. You have to be a bit careful here because many SQL data types do not map exactly to Excel data types. An alternative approach is to simply design your results file so that all the columns are varchar type data. Then, if you need to perform some numeric analysis of your test results in Excel (say, computing an average or a maximum value), you can manually format the columns you are analyzing into the appropriate Excel type (such as Number or Percentage).

In this example, I store the test case ID, the test result as "pass" or "fail", and a DateTime variable that holds the time at which the test case was executed. I finish by instantiating an OleDbCommand object and using the ExecuteNonQuery method. As I mentioned earlier, this process creates an Excel spreadsheet that contains a worksheet named tblResults, which has a Named Area tblResults.

Now that my test case data is in memory and my Excel results file has been created, I am ready to execute my tests and store my results. I begin by creating a connection to my newly created Excel results file, as shown here:

Console.WriteLine("\nExecuting CribbageLib tests\n"); string insertConnStr = "Provider=Microsoft.Jet.OLEDB.4.0;" + "Data Source=" + resultsFile + ";" + "Extended Properties=\"Excel 8.0;HDR=YES\""; using(OleDbConnection insertConn = new OleDbConnection(insertConnStr)) { insertConn.Open(); OleDbCommand insertCmd = new OleDbCommand(); insertCmd.Connection = insertConn;

The next thing I do is set up an INSERT command, which I will use to insert a test result after determining whether a test case has passed or failed.

Next I set up my test harness variables and then print a simple output header. I declare string variables to hold my test case ID, my test case input, the method (or Property in this example) under test, and an indicator of the result ("pass" or "fail"). I declare int variables expected and actual to hold the expected return value from the method under test (which is stored in my in-memory DataTable) and the actual result returned when the method under test is called by the test harness. This is done like so:

string caseid, input, method, result; int actual, expected; Console.WriteLine("caseID input method expected case result"); Console.WriteLine("===========================================");

Now the main test loop iterates through each row of the DataTable object, as shown in the following:

for (int row = 0; row < dt.Rows.Count; ++row)

Inside the main test loop, I first perform a rudimentary check to make sure I have a valid test case. Then I fetch each column of the current row of the DataTable and store into the variables with more meaningful names that I declare just outside the loop:

object o = dt.Rows[row]["caseid"]; if (o == null) break; caseid = (string)dt.Rows[row]["caseid"]; input = (string)dt.Rows[row]["input"]; method = (string)dt.Rows[row]["method"]; expected = int.Parse( (string)(dt.Rows[row]["expected"]) );

Notice that this is the point at which I do my type conversions. Remember that I originally stored all my test case data as type Text in my Excel spreadsheet and read all the data as type String into my DataTable object. Now I can call the method under test and retrieve the actual value:

CribbageLib.Hand h = new Hand(new Card(input.Substring(0, 2)), new Card(input.Substring(2, 2)), new Card(input.Substring(4, 2)), new Card(input.Substring(6, 2)), new Card(input.Substring(8, 2))); if (method == "ValueOf15s") actual = h.ValueOf15s; else if (method == "ValueOfPairs") actual = h.ValueOfPairs; else throw new Exception("Unknown method in test case data");

I use the String.Substring method to parse out each pair of characters, such as 7c, which represent a Card object. The first integer argument to Substring is a zero-based index value of where in the string to begin extracting the substring. The second integer argument is the number of characters to extract (as opposed to the ending index as you might have thought). I branch on the value of variable method and call the method under test, capturing the actual return value. Finally, to finish my main test loop, I determine the date and time of my test case execution, determine the test result, build up an INSERT string, print my result to console, and insert the result into the Excel results worksheet:

DateTime whenRun = DateTime.Now; result = (actual == expected) ? "Pass" : "FAIL"; Console.WriteLine(caseid + " " + h.ToString() + " " + method.PadRight(15, ' ') + expected.ToString().PadRight(8, ' ') + result); string insert = "INSERT INTO tblResults (caseID, Result, WhenRun) values ('" + caseid + "', '" + result + "', '" + whenRun + "')"; insertCmd.CommandText = insert; insertCmd.ExecuteNonQuery();

I use the PadRight method—this is an old trick—to line up my output in columns. The insert statement is a bit ugly because of the embedded single quotes, commas, and paren characters—you just have to be careful here. Because I am invoking an INSERT command, I use the ExecuteNonQuery method to actually perform the insert operation.

Wrapping Up

As I mentioned, lightweight test automation is a complement to, not a replacement for, other test automation approaches. If you do find yourself in a scenario where lightweight automation is appropriate, Excel makes an excellent test data storage choice. The two major alternatives for such storage are plain text files and XML documents. Text files are particularly simple but compared to Excel storage, they have the significant disadvantage that spurious control characters and whitespace characters are difficult to detect. Meanwhile, XML documents can be overkill for lightweight test automation storage. XML's hierarchical structure makes reading and writing the data somewhat more difficult than reading and writing data stored in Excel.

Additionally, Excel has some neat advantages that are useful in this sort of scenario. First, if you use Excel to store test case data, you can employ a strategy of "open test case data entry." By this, I mean you can place the Excel file in a public area and allow everyone on your team to add test cases. A few years ago I was writing a poker library. I found an old clunker machine, set up an Excel test case data spreadsheet on it, placed a deck of cards next to the machine, and encouraged everyone to enter test cases. It was very successful and the process uncovered some critical bugs. Since most people are comfortable with Excel, you'll probably get more participation in this sort of open project than if you use some other data storage. The second notable thing about Excel is that if you save your results to a worksheet, you've got a built-in way to do some lightweight test result analysis using the many math functions.

The new Microsoft Office 2007 system will enhance your ability to use Excel for test data storage. The new Office XML Formats for Excel, Word, and PowerPoint® stand to bring several programmatic interoperability improvements to software development and testing. Because the new Excel .xlsx file format is based on XML, it will provide improved file management capabilities and an improved ability to recover damaged files. And because the Office XML Formats use ZIP compression technology, test data files will be significantly smaller.

For more information on the new Microsoft Office 2007 file formats, see Ted Pattison's Basic Instincts column in this issue of MSDN®Magazine.

Send your questions and comments for James to testrun@microsoft.com.

Dr. James McCaffrey works for Volt Information Sciences Inc., where he manages technical training for software engineers working at Microsoft. He has worked on several Microsoft products, including Internet Explorer and MSN Search. James can be reached at jmccaffrey@volt.com or v-jammc@microsoft.com.