Introduction to TestApi – Part 6: Text String Generation API


General Notes

Testing often requires constructing strings that deliberately exercise specific features of an application. TestApi provides a text string generation API that produces strings that are random within a set of fixed parameters, such as the number of characters, the target Unicode range, and whether the string should contain numbers or line breaks.

The API was created by our engineer Dennis Deng in collaboration with other engineers around the company.

Unicode Basics

The TestApi text string generation API targets both general testers and professional text testers, and it can be used with either a novice or an advanced understanding of Unicode. Two Unicode concepts that are used throughout the library are defined below. For a comprehensive introduction to Unicode and a complete glossary, visit http://unicode.org/glossary/.

  • Code Points: In Unicode terminology any particular character set is referred to as an Encoding or Code. A code point is one specific point within a code, i.e. one specific character within a character set. Unicode code points range from U+0000 to U+10FFFF.
  • Unicode Character Code Chart: The Unicode standard defines a number of different character sub-sets, which are called Unicode character code charts (or Unicode charts for short). These charts are available at http://unicode.org/charts. Every chart (represented by the UnicodeChart class) is uniquely identified by three properties: group name, name and sub-name. For example, the “Armenian Ligatures” chart has a group name of “European Scripts”, a name of “Armenian” and a sub-name of “Armenian Ligatures”. Note that the Unicode charts are versioned. TestApi supports the version of the Unicode charts that is built into the referenced version of the .NET Framework.
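
Because .NET strings are sequences of UTF-16 code units, a code point above U+FFFF occupies two char values, so the number of code points in a string can be smaller than its Length. The following is a minimal sketch of that distinction using only standard .NET APIs, not TestApi itself; the CodePointDemo class and its CountCodePoints method are hypothetical names introduced for illustration.

```csharp
using System;

public static class CodePointDemo
{
    // Counts Unicode code points, treating each valid surrogate pair as one.
    // An unpaired surrogate is counted as a single item.
    public static int CountCodePoints(string s)
    {
        int count = 0;
        for (int i = 0; i < s.Length; i++)
        {
            if (char.IsSurrogatePair(s, i))
            {
                i++; // skip the low surrogate of the pair
            }
            count++;
        }
        return count;
    }

    public static void Main()
    {
        // U+0416 lies in the Basic Multilingual Plane; U+233D5 does not.
        string bmp = char.ConvertFromUtf32(0x0416);
        string supplementary = char.ConvertFromUtf32(0x233D5);

        Console.WriteLine(bmp.Length);                            // 1
        Console.WriteLine(supplementary.Length);                  // 2
        Console.WriteLine(CountCodePoints(bmp + supplementary));  // 2
    }
}
```

This is why a length expressed in code points, as in the MinNumberOfCodePoints and MaxNumberOfCodePoints properties used in the examples below, is not necessarily equal to string.Length.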

String Generation Technology

  • StringFactory: A StringFactory is an object used for generating random strings. Rather than constructing purely random strings, the factory will generate strings within a certain set of parameters, determined by a provided StringProperties object.
  • StringProperties: A StringProperties object specifies the properties of a string to be randomly generated by a StringFactory, such as the number of unique characters or the Unicode character set (or subset) to use. A StringProperties object is passed to StringFactory.GenerateRandomString(…).
  • UnicodeRange: A UnicodeRange is a range of Unicode code points. A StringProperties object contains a UnicodeRange. When the UnicodeRange of a particular StringProperties object is set to a particular range of values, any strings generated using that StringProperties object will be confined to the given UnicodeRange.


Example #1

The following example demonstrates how to generate a Cyrillic string that contains between 10 and 30 characters.

//
// Generate a Cyrillic string with a length between 10 and 30 characters.
//

StringProperties properties = new StringProperties();
properties.MinNumberOfCodePoints = 10;
properties.MaxNumberOfCodePoints = 30;
properties.UnicodeRanges.Add(new UnicodeRange(UnicodeChart.Cyrillic));

string s = StringFactory.GenerateRandomString(properties, 1234);

 

The generated string may look as follows:

 
s: Ӥёӱіӱӎ҄ҤяѪӝӱѶҾүҕГ

Notice how the string contains 17 characters that are all part of the Cyrillic character code chart.
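
A test may want to confirm such a claim programmatically rather than by eye. The sketch below checks that every code point of a string falls inside a numeric range. It uses only standard .NET APIs; the RangeCheck class and AllInRange method are hypothetical names, and the use of U+0400 to U+04FF (the Unicode Cyrillic block) as the range covered by UnicodeChart.Cyrillic is an assumption, not something taken from the TestApi documentation.

```csharp
using System;

public static class RangeCheck
{
    // Returns true if every code point of s lies in [start, end].
    // Assumes s is well-formed UTF-16; char.ConvertToUtf32 throws
    // an ArgumentException on an unpaired surrogate.
    public static bool AllInRange(string s, int start, int end)
    {
        for (int i = 0; i < s.Length; i++)
        {
            int codePoint = char.ConvertToUtf32(s, i);
            if (char.IsSurrogatePair(s, i))
            {
                i++; // the pair was consumed as one code point
            }
            if (codePoint < start || codePoint > end)
            {
                return false;
            }
        }
        return true;
    }

    public static void Main()
    {
        // Assumed bounds of the Cyrillic block.
        const int CyrillicStart = 0x0400;
        const int CyrillicEnd = 0x04FF;

        Console.WriteLine(AllInRange("\u0416\u044F\u0484", CyrillicStart, CyrillicEnd)); // True
        Console.WriteLine(AllInRange("\u0416A", CyrillicStart, CyrillicEnd));            // False
    }
}
```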

Example #2

The following example demonstrates how to generate a fixed-length string that contains numbers.

//
// Generate a string of 20 random code points containing numbers.
//

StringProperties properties = new StringProperties();
properties.MinNumberOfCodePoints = 20;
properties.MaxNumberOfCodePoints = 20;
properties.HasNumbers = true;
properties.UnicodeRanges.Add(new UnicodeRange(0, 0xFFFF));

string s1 = StringFactory.GenerateRandomString(properties, 5678);
string s2 = StringFactory.GenerateRandomString(properties, 5678);

The generated strings may look as follows:

 
s1: 鼯粳뷷賵Ɨ烣૝犥0崪𣏕窚氢ཝꥁ姽戁䅽7
s2: 鼯粳뷷賵Ɨ烣૝犥0崪𣏕窚氢ཝꥁ姽戁䅽7

Notice how the strings contain numbers and have exactly 20 code points (characters). Notice also how using the same seed (5678) results in generation of exactly the same strings.
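
The reproducibility shown above is the usual behavior of a seeded pseudo-random generator. The following toy sketch illustrates the idea with System.Random; it is not how TestApi is implemented, and the ToyStringGenerator class and its Generate method are hypothetical names. It picks code points uniformly from a range, skips the surrogate range, and may emit unassigned code points.

```csharp
using System;
using System.Text;

public static class ToyStringGenerator
{
    // Builds a string of codePointCount code points drawn from [start, end].
    // The same seed always yields the same string on a given runtime.
    public static string Generate(int start, int end, int codePointCount, int seed)
    {
        if (start < 0 || end > 0x10FFFF || start > end)
        {
            throw new ArgumentOutOfRangeException(nameof(start));
        }
        if (start >= 0xD800 && end <= 0xDFFF)
        {
            throw new ArgumentException("Range contains only surrogates.");
        }

        var random = new Random(seed);
        var builder = new StringBuilder();
        int produced = 0;
        while (produced < codePointCount)
        {
            int codePoint = random.Next(start, end + 1); // upper bound is exclusive
            if (codePoint >= 0xD800 && codePoint <= 0xDFFF)
            {
                continue; // surrogates are not valid on their own
            }
            builder.Append(char.ConvertFromUtf32(codePoint));
            produced++;
        }
        return builder.ToString();
    }

    public static void Main()
    {
        string a = Generate(0x0400, 0x04FF, 10, 5678);
        string b = Generate(0x0400, 0x04FF, 10, 5678);

        Console.WriteLine(a == b);    // True
        Console.WriteLine(a.Length);  // 10
    }
}
```

A fixed seed is useful in tests because a failure found with a random string can be reproduced by rerunning with the same seed.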

 

Conclusion

Text string generation is tricky. TestApi exposes a simple, straightforward API for generating text strings from a set of desired string properties.