Equivalence class partitioning - Part 1
Wow...where does the time go? I was remiss last week in posting, and it has been a month since I posted about equivalence class partitioning. So, let's get back to it, shall we?
Equivalence class partitioning (ECP) is a functional testing technique useful in either black box or white box test design. A technique is a systematic approach to help solve a complex problem. Techniques are not silver bullets, but they are a logical and analytical approach to problem solving that draws heavily upon the tester's cognitive abilities (having a basis in or reducible to empirical factual knowledge) as opposed to random guessing (or little men running about inside one's head triggering turbid thoughts).
Contrary to popular misconceptions, applying ECP is not a rote, brain-dead activity. The technique requires in-depth knowledge of the data set (data type, encoding method, etc.), the programming language used in the implementation, the algorithm structure, the operating environment, and the protocols involved; even the hardware platform may affect how the data for a particular parameter is decomposed. The effectiveness of the technique lies solely in the tester's ability to adequately decompose the data set for a given parameter into subsets in which any element from a specific subset would produce the same result as any other element from that subset.
Essentially the ECP technique analyzes input or output variable data for each specific parameter and decomposes the data into discrete valid and invalid class subsets. See my article in Software Testing & Performance magazine for a more in-depth understanding of the equivalence class partitioning technique. The article uses a simple next date program as an example because working with integer type data within a limited range is usually a good introduction to this complex technique.
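To make the idea of valid and invalid class subsets concrete, here is a minimal sketch for a single integer parameter, in the spirit of the next date example. The function name `classify_month`, the choice of a month parameter, and the subset labels are all hypothetical and chosen only for illustration; they are not taken from the magazine article.

```python
def classify_month(value):
    """Place a candidate month input into a (hypothetical) equivalence class subset."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(value, int) or isinstance(value, bool):
        return "invalid: not an integer"
    if value < 1:
        return "invalid: below range"
    if value > 12:
        return "invalid: above range"
    return "valid: 1..12"


# Any element of a subset is expected to behave like any other element of
# that subset, so one representative per subset is a reasonable starting point.
representatives = [7, 0, 13, "July"]
for candidate in representatives:
    print(repr(candidate), "->", classify_month(candidate))
```

A real decomposition would likely need more subsets than this (the parameter's data type, its encoding, and the implementation all influence where the boundaries fall), which is exactly why the technique depends on the tester's knowledge of the system.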
A frequent reader of this blog suggested he often sees this technique applied to simple parameters limited to number type inputs, but would like to understand how this is applied to a more complex parameter that takes a string input. Interestingly enough, one of the exercises I designed in our internal training (and also used in the workshop I present regularly at the Software Testing and Performance conferences) deals specifically with string input. Let me outline the exercise.
Your goal is to adequately decompose the set of data (the ANSI Latin 1 character set) into valid and invalid equivalence class subsets to evaluate the behavior of the base filename parameter passed to COMDLG32.DLL's file save functionality from a user's perspective (GUI) on the Windows XP operating system with an NTFS file system on a PC/AT platform.
NOTES:
- In the Windows environment the base filename parameter is separate from, and not interdependent with, the extension parameter. (Some frequent readers may remember I explained this point previously to an individual who mistakenly assumed the two parameters were interdependent in the Windows environment.) So, remember, the purpose of this exercise is to focus initially on the base filename parameter (don't worry about the extension parameter...yet).
- To expose the user perspective of the file save functionality of COMDLG32 we will use Notepad.exe, and select the File -> Save As... menu item to instantiate the Save As... dialog.
- Characters greater than 0x7F can be entered using the numeric input method by holding down the ALT key and pressing the numeric keypad keys equivalent to the character's decimal value. For example, to enter the € character, press and hold the ALT key and press the 0 1 2 8 keys on the numeric keypad. Release the ALT key and the character is displayed. (You can also use my Babel tool to generate specific characters by inputting the Unicode values for the characters in the Custom range groupbox and copying them into the filename control.)
- Not all characters can be entered using the numeric input method. For example, control sequences cannot be directly entered from the keyboard using this method (or by holding down the control key).
- In order to succeed, one must understand:
  - Common programming concepts (especially for the C family of languages)
  - Windows file naming conventions
  - Windows XP file I/O APIs
  - The Windows XP NTFS file system
  - Basic character encoding
  - The history of FAT file systems
- An ECP table similar to the one outlined below may help you organize your subsets:
Input/Output Condition | Valid Class Subsets | Expected Result | Invalid Class Subsets | Expected Result |
Base Filename | | | | |
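The ALT+0128 note above depends on which character encoding maps byte value 128 to a character. As a rough illustration of why basic character encoding knowledge matters here, the sketch below decodes the same bytes with Python's built-in `cp1252` and `latin-1` codecs. These codecs are stand-ins chosen for this illustration; the sketch says nothing about how COMDLG32.DLL or NTFS handles the resulting characters.

```python
# Byte 0x80 is decimal 128, the value typed as ALT+0128 on the numeric keypad.
raw = bytes([0x80])

as_cp1252 = raw.decode("cp1252")    # Windows code page 1252
as_latin1 = raw.decode("latin-1")   # ISO 8859-1

print(hex(ord(as_cp1252)))  # code point assigned by code page 1252
print(hex(ord(as_latin1)))  # code point assigned by ISO 8859-1

# Collect the bytes in 0x80..0x9F that Python's cp1252 codec leaves undefined.
undefined = []
for byte_value in range(0x80, 0xA0):
    try:
        bytes([byte_value]).decode("cp1252")
    except UnicodeDecodeError:
        undefined.append(byte_value)

print([hex(b) for b in undefined])
```

Under these codecs, byte 0x80 decodes to the euro sign (U+20AC) in code page 1252 but to a control character (U+0080) in ISO 8859-1, and a handful of byte values in the 0x80 to 0x9F range have no assignment in code page 1252 at all. Differences like these are worth keeping in mind when deciding where subset boundaries fall.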
Give it a try...next week I will offer a solution that you can compare your own results against.
Comments
Anonymous
February 28, 2008
"the ANSI Latin 1 character set" That is almost well defined. It would be more accurate to say Code Page 1252 since, e.g., you probably want 0x80 to mean what Code Page 1252 says instead of what ISO Latin 1 says. But anyway it's pretty close.
"COMDLG32.DLL's file save functionality" That is poorly defined. In some cases you need to say which language version of Windows is involved. In some cases even in the NT series I've seen misbehaviour but I'm not sure if that's because of COMDLG32.DLL or because of runtime libraries for various Visual Studio languages. Anyway since we are talking about code pages and not Unicode, I think you'd better specify which language version of COMDLG32.DLL.
"NTFS file system" That too. I haven't tested to see if NTFS partitions that are formatted while a Turkish environment is active get their upcasing tables defined appropriately for Turkish, but from what I've read about the design it seems to me that they "should" get correct upcasing tables. The Win32 and NT APIs participate in deciding whether two filenames that differ only in casing are identical or not, but the partition's upcasing table ought to be paramount in order to preserve the partition's integrity.
Anonymous
February 28, 2008
Hi Norman,
Actually, I do mean the ANSI Latin 1 character set from which the Windows 1252 (Latin 1) code page is derived, as referenced here (http://www.alanwood.net/demos/ansi.html) and here (http://orwell.ru/info/ansi.htm) and here (http://www.medcalc.be/manual/ansi_character_set.php), and other places such as the ANSI library. But, nice try!
WRT COMDLG32.DLL, I specified the Windows Xp operating system. Xp is a single worldwide binary and all internal processing is in Unicode. Therefore, your language version argument is outdated and a false assumption that would cause a tester to waste valuable cycles. But, I am sure you knew that. I would however suggest that you upgrade from NT 4.0 to Windows Xp or Vista.
Anonymous
February 28, 2008
"Actually, I do mean the ANSI Latin 1 character set from which the Windows 1252 (Latin 1) code page is derived" The ISO Latin 1 character set assigns code points 0x80 through 0x9F to control characters. You want Windows code page 1252 which assigns many of those code points to printable characters.
"Xp is a single worldwide binary" It is not. Vista comes a lot closer to that goal than XP does.
"and all internal processing is in Unicode." GetProcAddress. gethostname. DbgPrint. You're very close to right though (a lot closer than the assertion that XP is a single worldwide binary).
XP came out before Visual Studio .Net. It was still necessary for VB6 programs to run on XP. One or two service packs were enough to make Japanese VB6 executables display Japanese correctly on Japanese XP. As mentioned, I'm not sure if these problems were because of COMDLG32.DLL or because of runtime libraries for various Visual Studio languages.
Anonymous
February 29, 2008
The comment has been removed
Anonymous
March 02, 2008
"VB was notoriously renowned for its inability to deal with various character sets and charset issues." Correct. I found the files that I had to use to work around problems, and they're OCX files not DLL files (even though the internals are DLLs). Though again notice that this was VB6's use of its own language version, no foreign codings involved. VC++6 had trouble with Unicode but I don't remember now if it had trouble with its own language version (MSDN pages sure did though). Actually VC++2005 SP1 still has some trouble with Unicode. I ought to try VC++ 2008.
"Perhaps you would have been happier if Xp was not released until VS.NET was released" No, XP should have been released when XP was ready for release, and that was when XP SP2 was released. I hold nothing against releasing CTPs as CTPs, and XP prior to SP2 should have been released as CTPs. It is fortunate that Microsoft did the right thing with SP2, not charging customers double to get a bugfix release. Microsoft also did the right thing with some service packs for Visual Studio 6, fixing bugs without charging extra. I've read enough about regressions from SP5 to SP6 to suggest that Microsoft's memory needs refreshing. VS6 SP7, VS2002 SP2, VS2003 SP2, and VS2005 SP2 are sorely needed.
Anyway, in this case the OS issues and language library issues appear to be independent. I hadn't recalled which library files were involved in this particular problem, but later you and I both found that this one involved language libraries. I agree that COMDLG32.DLL isn't a problem here.