An Overview of Data Generator Extensibility

You can use Visual Studio Premium or Visual Studio Ultimate to generate meaningful data for testing. By using the built-in data generators, you can generate random data, generate data from existing data sources, and control many aspects of data generation. If the functionality of the built-in generators is insufficient, you can create custom data generators. To create custom data generators, you use the classes in the Microsoft.Data.Schema.Tools.DataGenerator namespace.

The Data Generator Extensibility API

The extensibility API provides classes from which developers can inherit. In addition to classes, the API includes attributes that you can apply to your derived classes. By applying these attributes, you reduce the amount of code that is required in custom generators for common cases.
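The attribute-driven approach described above can be modeled in a few lines. The sketch below is written in Python purely for illustration and does not use the Visual Studio API: `Input`, `output`, `describe`, and `IntegerRangeGenerator` are hypothetical names that stand in for the attribute classes, a derived generator, and the kind of discovery a designer could perform on marked members.

```python
import random


class Input:
    """Hypothetical marker: declares a generator input and its default value."""

    def __init__(self, default):
        self.default = default

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # accessed on the class: return the marker itself
        return obj.__dict__.get(self.name, self.default)

    def __set__(self, obj, value):
        obj.__dict__[self.name] = value


def output(func):
    """Hypothetical marker: tags a method as a generator output."""
    func.is_output = True
    return func


class IntegerRangeGenerator:
    # Declaring the inputs is all the generator author writes for them.
    minimum = Input(default=0)
    maximum = Input(default=10)

    def __init__(self, seed=0):
        self._random = random.Random(seed)
        self._value = None

    def generate_next_values(self):
        self._value = self._random.randint(self.minimum, self.maximum)

    @output
    def value(self):
        return self._value


def describe(generator_type):
    """Collect marked inputs (with defaults) and outputs from a generator type."""
    members = vars(generator_type)
    inputs = {n: m.default for n, m in members.items() if isinstance(m, Input)}
    outputs = [n for n, m in members.items() if getattr(m, "is_output", False)]
    return inputs, outputs
```

Calling `describe(IntegerRangeGenerator)` returns the input names with their defaults and the list of output names without the generator containing any discovery code of its own, which is the saving that the declarative style aims for.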

You can use the extensibility API in the following three ways to create custom data generators:

  • Declarative Extensibility. Difficulty: Easy. Example: The built-in integer data generator.

  • Normal Extensibility. Difficulty: Medium. This method is recommended most of the time. Examples: Walkthrough: Creating a Custom Data Generator; Walkthrough: Creating a Custom Data Generator for a Check Constraint.

  • Base Extensibility. Difficulty: Difficult. Example: None. Description: Create a class that implements the IGenerator interface, and implement all methods that are required by your generator. Then create a custom designer for the generator that implements the IDesigner interface, and implement all methods that are required by your designer.

Base Extensibility

The base extensibility API is the mechanism by which the data generation engine and the designers for data generation plans interact. This API was designed to meet the following goals:

  • Robustness — To promote a consistent and robust implementation in both the design-time and run-time engines.

  • Flexibility — To support complex generators such as the data bound generator.

The trade-off of this design is complexity: the base extensibility API is harder to use than the higher-level declarative extensibility API.

Registering Custom Data Generators

Before you can use your custom data generator, you must register it on your computer. If you are giving your custom data generator to other people to use, they must register the generator on their computers.

You can register custom data generators in the following ways:

  • Register the generator in the Extensions folder. Permissions required: Power User or higher.

  • Create a deployment project to register the generator. Permissions required: Administrator. Example: None.

Data Generators, Distributions, and Designers

You can create custom data generators and custom designers for those generators. You can also create custom distributions for numeric data generators and custom designers for those distributions.

  • Custom data generators produce random test data according to a set of rules that you specify. You can use the default designer with those generators, or you can create a custom designer for them by inheriting from DefaultGeneratorDesigner. For example, the regular expression data generator is a built-in generator, but it uses a custom designer so that it can perform custom validation of user inputs at design time.

  • By using a custom generator designer, you can customize how input and output properties are retrieved from the user, set default values, and specify validation behavior.

  • By using a custom distribution, you can control the distribution of numeric values that a data generator generates.

  • Custom distribution designers control the design-time behavior for a custom distribution. This behavior includes getting the names of the input properties for the distribution, setting the default values of the input properties, and validating the values of the input properties for the distribution.
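To make the idea of a custom distribution concrete, the following sketch shows one way a distribution can shape the numeric values that a generator produces. It is a Python model, not the Visual Studio API: `SkewedLowDistribution`, its `exponent` input, and the `validate` and `next_value` methods are hypothetical names chosen for illustration.

```python
import random


class SkewedLowDistribution:
    """Hypothetical distribution that favors values near the minimum."""

    def __init__(self, exponent=2.0):
        self.exponent = exponent  # input property with a default value

    def validate(self):
        # The kind of input check a distribution designer might perform.
        if self.exponent < 1.0:
            raise ValueError("exponent must be at least 1.0")

    def next_value(self, rng, minimum, maximum):
        u = rng.random()              # uniform in [0, 1)
        skewed = u ** self.exponent   # exponent > 1 pushes values toward 0
        span = maximum - minimum + 1
        # Clamp so the result never exceeds the requested maximum.
        return min(maximum, minimum + int(skewed * span))
```

A generator would draw each value through `next_value` instead of calling the random number generator directly, so swapping the distribution changes the shape of the data without changing the generator.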

Data Generators and Localization

The data generators that are included with Visual Studio Premium and Visual Studio Ultimate are localized because Visual Studio ships in multiple language versions. You probably do not have to localize your custom data generators. If you must create a data generator that will be localized, you should create a custom designer. You can also override the GetInputs method to localize the input property names.

Note

If possible, inherit from the DefaultGeneratorDesigner class instead of implementing the IDesigner interface directly. Inheriting avoids extra work.

Data Generator Instancing

Custom data generators can share data. The shared data is scoped to the combination of generator type and database table: every generator type has a unique instance dictionary for each database table. For example, a custom data generator for a table named Customers has access to a shared dictionary. You can put any information into the dictionary and share that information. The dictionary is guaranteed to be the same instance for each generator type and table.

For example, you can create a custom data generator and request the dictionary from GeneratorInit. Then you can check whether the dictionary already contains shared information. If it does, you can use the information to generate data. If it does not, you can create the shared information so that other instances of your generator can use it.

Note

Generator instancing is an advanced technique. You can use generator instancing to create a custom data generator that handles check constraints across columns — for example, a check constraint that requires that one column is greater than another column.
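The shared-dictionary mechanism and the cross-column check constraint mentioned above can be sketched as follows. This is a Python model under stated assumptions, not the Visual Studio API: `SharedDictionaries`, `OrderedPairGenerator`, and their methods are hypothetical, and the row index is passed in explicitly to keep the example small.

```python
import random


class SharedDictionaries:
    """Hands out one dictionary per (generator type, table name) pair."""

    def __init__(self):
        self._store = {}

    def get(self, generator_type, table_name):
        # The same key always yields the same dictionary instance.
        return self._store.setdefault((generator_type, table_name), {})


class OrderedPairGenerator:
    """Hypothetical generator; two instances on one table yield low < high."""

    def __init__(self, role):
        self.role = role  # "low" or "high": which column this instance fills

    def initialize(self, shared, rng):
        self._shared = shared
        self._rng = rng

    def generate_next_value(self, row):
        # The first instance to reach a row creates the pair and publishes it;
        # the other instance finds it in the shared dictionary and reuses it.
        if self._shared.get("row") != row:
            low = self._rng.randint(0, 99)
            high = self._rng.randint(low + 1, 100)
            self._shared.update(row=row, low=low, high=high)
        return self._shared[self.role]


registry = SharedDictionaries()
rng = random.Random(7)
low_gen = OrderedPairGenerator("low")
high_gen = OrderedPairGenerator("high")
for gen in (low_gen, high_gen):
    gen.initialize(registry.get(OrderedPairGenerator, "Orders"), rng)

rows = [(low_gen.generate_next_value(i), high_gen.generate_next_value(i))
        for i in range(5)]
```

Every pair in `rows` satisfies the constraint that the first column is less than the second, because both instances read the same published pair for each row rather than generating values independently.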

The Data Generation Process

Data generation occurs in the following phases:

  • Determine the designer type (design time). This phase requires the type of the data generator as an input. The engine can then query the GeneratorAttribute to retrieve the designer type. Most of the time, GeneratorAttribute is inherited from the base class, which specifies the default designer.

  • Instantiate and initialize the designer (design time). The designer is instantiated. The designer is initialized by calling Initialize and passing the generator type as an argument.

  • Retrieve the input descriptors (design time). The InputDescriptor is retrieved from the designer. The default designer does this by retrieving all properties of the data generator that are marked with the InputAttribute.

  • Set the default values (design time). The default values are set.

  • Get the generator output descriptions (design time). The OutputDescriptor is retrieved from the designer. The default designer uses properties that are marked with OutputAttribute to create the descriptions that appear in the Generator Output column of the Column Details window.

  • Instantiate the generator (run time). The data generator is instantiated by using the default constructor.

  • Set the generator inputs (run time). All input values are set in the data generator from the input descriptors that are retrieved from the designer.

  • Validate the generator (run time). The ValidateInputs method is called. If validation fails, the generator throws an InputValidationException. Any exception other than a data validation exception is treated as an unrecoverable error.

  • Initialize the generator (run time). The Initialize method is called. This step enables the data generator to perform any necessary setup, such as specifying the connection string for the target database or seeding the random number generator. This phase occurs one time, before data generation begins.

  • Run the data generation (run time). During this phase, new results are generated by calling the GenerateNextValues method. Results can be retrieved by using the GetOutputValue method, which returns the scalar value that corresponds to the output key that is passed to it. This phase repeats until all the results that you want have been generated.

  • Clean up (run time). After all data generation is complete, Dispose is called to clean up the generator.
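The order of the phases above can be traced in a small model. The following Python sketch mirrors the sequence only and does not use the Visual Studio API: `DefaultDesigner`, `RangeGenerator`, `InputValidationError`, and `run` are hypothetical stand-ins, and the method names are Python-style approximations of the methods named in each phase.

```python
import random


class InputValidationError(Exception):
    """Stand-in for the validation exception raised on bad inputs."""


class DefaultDesigner:
    """Hypothetical designer that reads descriptors from the generator type."""

    def initialize(self, generator_type):
        self.generator_type = generator_type

    def get_inputs(self):
        return dict(self.generator_type.INPUT_DEFAULTS)

    def get_outputs(self):
        return list(self.generator_type.OUTPUT_KEYS)


class RangeGenerator:
    DESIGNER = DefaultDesigner
    INPUT_DEFAULTS = {"minimum": 0, "maximum": 100, "seed": 5}
    OUTPUT_KEYS = ["value"]

    def validate_inputs(self):
        if self.minimum > self.maximum:
            raise InputValidationError("minimum must not exceed maximum")

    def initialize(self):
        self._rng = random.Random(self.seed)  # one-time setup

    def generate_next_values(self):
        self._outputs = {"value": self._rng.randint(self.minimum, self.maximum)}

    def get_output_value(self, key):
        return self._outputs[key]

    def dispose(self):
        self._rng = None


def run(generator_type, overrides, row_count):
    # Design time: find the designer, initialize it, read the descriptors.
    designer = generator_type.DESIGNER()
    designer.initialize(generator_type)
    inputs = designer.get_inputs()      # default values
    inputs.update(overrides)            # values chosen by the user
    output_keys = designer.get_outputs()

    # Run time: instantiate, set inputs, validate, initialize, generate.
    generator = generator_type()
    for name, value in inputs.items():
        setattr(generator, name, value)
    generator.validate_inputs()
    generator.initialize()
    rows = []
    try:
        for _ in range(row_count):
            generator.generate_next_values()
            rows.append({k: generator.get_output_value(k) for k in output_keys})
    finally:
        generator.dispose()             # clean up after generation
    return rows
```

For example, `run(RangeGenerator, {"minimum": 10, "maximum": 20}, 5)` returns five rows whose `value` lies between 10 and 20, while passing a minimum greater than the maximum stops at the validation phase, before any setup or generation takes place.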

See Also

Tasks

How to: Create Custom Data Generators

Walkthrough: Creating a Custom Data Generator

Reference

Microsoft.Data.Schema.DataGenerator

Concepts

Generate Specialized Test Data with a Custom Data Generator