Scaffolding (Reverse Engineering)

Article
03/27/2023

Reverse engineering is the process of scaffolding entity type classes and a DbContext class based on a database schema. It can be performed using the Scaffold-DbContext command of the EF Core Package Manager Console (PMC) tools or the dotnet ef dbcontext scaffold command of the .NET Command-line Interface (CLI) tools.

Note

The scaffolding of a DbContext and entity types documented here is distinct from the scaffolding of controllers in ASP.NET Core using Visual Studio, which is not documented here.

Tip

If you use Visual Studio, try out the EF Core Power Tools community extension. These tools provide a graphical tool which builds on top of the EF Core command line tools and offers additional workflow and customization options.

Prerequisites

Before scaffolding, you'll need to install either the PMC tools, which work on Visual Studio only, or the .NET CLI tools, which across all platforms supported by .NET.
Install the NuGet package for Microsoft.EntityFrameworkCore.Design in the project you are scaffolding to.
Install the NuGet package for the database provider that targets the database schema you want to scaffold from.

Required arguments

Both the PMC and the .NET CLI commands have two required arguments: the connection string to the database, and the EF Core database provider to use.

Connection string

The first argument to the command is a connection string to the database. The tools will use this connection string to read the database schema.

How you quote and escape the connection string depends on which shell you are using to execute the command; refer to your shell's documentation for more information. For example, PowerShell requires you to escape the $ character, but not \.

The following example scaffolds entity types and a DbContext from the Chinook database located on the machine's SQL Server LocalDB instance, making use of the Microsoft.EntityFrameworkCore.SqlServer database provider.

.NET Core CLI
Visual Studio PMC

dotnet ef dbcontext scaffold "Data Source=(localdb)\MSSQLLocalDB;Initial Catalog=Chinook" Microsoft.EntityFrameworkCore.SqlServer

Scaffold-DbContext 'Data Source=(localdb)\MSSQLLocalDB;Initial Catalog=Chinook' Microsoft.EntityFrameworkCore.SqlServer

User secrets for connection strings

If you have a .NET application that uses the hosting model and configuration system, such as an ASP.NET Core project, then you can use the Name=<connection-string> syntax to read the connection string from configuration.

For example, consider an ASP.NET Core application with the following configuration file:

{
  "ConnectionStrings": {
    "Chinook": "Data Source=(localdb)\\MSSQLLocalDB;Initial Catalog=Chinook"
  }
}

This connection string in the config file can be used to scaffold from a database using:

.NET Core CLI
Visual Studio PMC

dotnet ef dbcontext scaffold "Name=ConnectionStrings:Chinook" Microsoft.EntityFrameworkCore.SqlServer

Scaffold-DbContext 'Name=ConnectionStrings:Chinook' Microsoft.EntityFrameworkCore.SqlServer

However, storing connection strings in configuration files is not a good idea, since it is too easy to accidentally expose them, for example, by pushing to source control. Instead, connection strings should be stored in a secure way, such as using Azure Key Vault or, when working locally, the Secret Manager tool, aka "User Secrets".

For example, to use the User Secrets, first remove the connection string from your ASP.NET Core configuration file. Next, initialize User Secrets by executing the following command in the same directory as the ASP.NET Core project:

dotnet user-secrets init

This command sets up storage on your computer separate from your source code and adds a key for this storage to the project.

Next, store the connection string in user secrets. For example:

dotnet user-secrets set ConnectionStrings:Chinook "Data Source=(localdb)\MSSQLLocalDB;Initial Catalog=Chinook"

Now the same command that previous used the named connection string from the config file will instead use the connection string stored in User Secrets. For example:

.NET Core CLI
Visual Studio PMC

dotnet ef dbcontext scaffold "Name=ConnectionStrings:Chinook" Microsoft.EntityFrameworkCore.SqlServer

Scaffold-DbContext 'Name=ConnectionStrings:Chinook' Microsoft.EntityFrameworkCore.SqlServer

Connection strings in the scaffolded code

By default, the scaffolder will include the connection string in the scaffolded code, but with a warning. For example:

protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
#warning To protect potentially sensitive information in your connection string, you should move it out of source code. You can avoid scaffolding the connection string by using the Name= syntax to read it from configuration - see https://go.microsoft.com/fwlink/?linkid=2131148. For more guidance on storing connection strings, see http://go.microsoft.com/fwlink/?LinkId=723263.
    => optionsBuilder.UseSqlServer("Data Source=(LocalDb)\\MSSQLLocalDB;Database=AllTogetherNow");

This is done so that the generated code does not crash when first used, which would be a very poor learning experience. However, as the warning says, connection strings should not exist in production code. See DbContext Lifetime, Configuration, and Initialization for the various ways that connection strings can be managed.

Tip

The -NoOnConfiguring (Visual Studio PMC) or --no-onconfiguring (.NET CLI) option can be passed to suppress creation of the OnConfiguring method containing the connection string.

Provider name

The second argument is the provider name. The provider name is typically the same as the provider's NuGet package name. For example, for SQL Server or Azure SQL, use Microsoft.EntityFrameworkCore.SqlServer.

Command line options

The scaffolding process can be controlled by various command line options.

Specifying tables and views

By default, all tables and views in the database schema are scaffolded into entity types. You can limit which tables and views are scaffolded by specifying schemas and tables.

The -Schemas (Visual Studio PMC) or --schema (.NET CLI) argument specifies the schemas of tables and views for which entity types will be generated. If this argument is omitted, then all schemas are included. If this option is used, then all tables and views in the schemas will be included in the model, even if they are not explicitly included using -Tables or --table.

The -Tables (Visual Studio PMC) or --table (.NET CLI) argument specified the tables and views for which entity types will be generated. Tables or views in a specific schema can be included using the 'schema.table' or 'schema.view' format. If this option is omitted, then all tables and views are included. |

For example, to scaffold only the Artists and Albums tables:

.NET CLI
Visual Studio PMC

dotnet ef dbcontext scaffold ... --table Artist --table Album

Scaffold-DbContext ... -Tables Artists, Albums

To scaffold all tables and views from the Customer and Contractor schemas:

.NET CLI
Visual Studio PMC

dotnet ef dbcontext scaffold ... --schema Customer --schema Contractor

Scaffold-DbContext ... -Schemas Customer, Contractor

For example, to scaffold the Purchases table from the Customer schema, and the Accounts and Contracts tables from the Contractor schema:

.NET CLI
Visual Studio PMC

dotnet ef dbcontext scaffold ... --table Customer.Purchases --table Contractor.Accounts --table Contractor.Contracts

Scaffold-DbContext ... -Tables Customer.Purchases, Contractor.Accounts, Contractor.Contracts

Preserving database names

Table and column names are fixed up to better match the .NET naming conventions for types and properties by default. Specifying -UseDatabaseNames (Visual Studio PMC) or --use-database-names (.NET CLI) will disable this behavior preserving the original database names as much as possible. Invalid .NET identifiers will still be fixed and synthesized names like navigation properties will still conform to .NET naming conventions.

For example, consider the following tables:

CREATE TABLE [BLOGS] (
    [ID] int NOT NULL IDENTITY,
    [Blog_Name] nvarchar(max) NOT NULL,
    CONSTRAINT [PK_Blogs] PRIMARY KEY ([ID]));

CREATE TABLE [posts] (
    [id] int NOT NULL IDENTITY,
    [postTitle] nvarchar(max) NOT NULL,
    [post content] nvarchar(max) NOT NULL,
    [1 PublishedON] datetime2 NOT NULL,
    [2 DeletedON] datetime2 NULL,
    [BlogID] int NOT NULL,
    CONSTRAINT [PK_Posts] PRIMARY KEY ([id]),
    CONSTRAINT [FK_Posts_Blogs_BlogId] FOREIGN KEY ([BlogID]) REFERENCES [Blogs] ([ID]) ON DELETE CASCADE);

By default, the following entity types will be scaffolded from these tables:

public partial class Blog
{
    public int Id { get; set; }
    public string BlogName { get; set; } = null!;
    public virtual ICollection<Post> Posts { get; set; } = new List<Post>();
}

public partial class Post
{
    public int Id { get; set; }
    public string PostTitle { get; set; } = null!;
    public string PostContent { get; set; } = null!;
    public DateTime _1PublishedOn { get; set; }
    public DateTime? _2DeletedOn { get; set; }
    public int BlogId { get; set; }
    public virtual Blog Blog { get; set; } = null!;
    public virtual ICollection<Tag> Tags { get; set; } = new List<Tag>();
}

However, using -UseDatabaseNames or --use-database-names results in the following entity types:

public partial class BLOG
{
    public int ID { get; set; }
    public string Blog_Name { get; set; } = null!;
    public virtual ICollection<post> posts { get; set; } = new List<post>();
}

public partial class post
{
    public int id { get; set; }
    public string postTitle { get; set; } = null!;
    public string post_content { get; set; } = null!;
    public DateTime _1_PublishedON { get; set; }
    public DateTime? _2_DeletedON { get; set; }
    public int BlogID { get; set; }
    public virtual BLOG Blog { get; set; } = null!;
}

Use mapping attributes (aka Data Annotations)

Entity types are configured using the ModelBuilder API in OnModelCreating by default. Specify -DataAnnotations (PMC) or --data-annotations (.NET Core CLI) to instead use mapping attributes when possible.

For example, using the Fluent API will scaffold this:

entity.Property(e => e.Title)
    .IsRequired()
    .HasMaxLength(160);

While using Data Annotations will scaffold this:

[Required]
[StringLength(160)]
public string Title { get; set; }

Tip

Some aspects of the model cannot be configured using mapping attributes. The scaffolder will still use the model building API to handle these cases.

DbContext name

The scaffolded DbContext class name will be the name of the database suffixed with Context by default. To specify a different one, use -Context in PMC and --context in the .NET Core CLI.

Target directories and namespaces

The entity classes and a DbContext class are scaffolded into the project's root directory and use the project's default namespace.

.NET CLI
Visual Studio PMC

You can specify the directory where classes are scaffolded using --output-dir, and --context-dir can be used to scaffold the DbContext class into a separate directory from the entity type classes:

dotnet ef dbcontext scaffold ... --context-dir Data --output-dir Models

By default, the namespace will be the root namespace plus the names of any subdirectories under the project's root directory. However, you can override the namespace for all output classes by using --namespace. You can also override the namespace for just the DbContext class using --context-namespace:

dotnet ef dbcontext scaffold ... --namespace Your.Namespace --context-namespace Your.DbContext.Namespace

You can specify the directory where classes are scaffolded using -OutputDir, and -ContextDir can be used to scaffold the DbContext class into a separate directory from the entity type classes:

Scaffold-DbContext ... -ContextDir Data -OutputDir Models

By default, the namespace will be the root namespace plus the names of any subdirectories under the project's root directory. However, you can override the namespace for all output classes by using -Namespace. You can also override the namespace for just the DbContext class using -ContextNamespace.

Scaffold-DbContext ... -Namespace Your.Namespace -ContextNamespace Your.DbContext.Namespace

The scaffolded code

The result of scaffolding from an existing database is:

A file containing a class that inherits from DbContext
A file for each entity type

Tip

Starting in EF7, you can also use T4 text templates to customize the generated code. See Custom Reverse Engineering Templates for more details.

C# Nullable reference types

The scaffolder can create EF model and entity types that use C# nullable reference types (NRTs). NRT usage is scaffolded automatically when NRT support is enabled in the C# project into which the code is being scaffolded.

For example, the following Tags table contains both nullable non-nullable string columns:

CREATE TABLE [Tags] (
  [Id] int NOT NULL IDENTITY,
  [Name] nvarchar(max) NOT NULL,
  [Description] nvarchar(max) NULL,
  CONSTRAINT [PK_Tags] PRIMARY KEY ([Id]));

This results in corresponding nullable and non-nullable string properties in the generated class:

public partial class Tag
{
    public Tag()
    {
        Posts = new HashSet<Post>();
    }

    public int Id { get; set; }
    public string Name { get; set; } = null!;
    public string? Description { get; set; }

    public virtual ICollection<Post> Posts { get; set; }
}

Similarly, the following Posts tables contains a required relationship to the Blogs table:

CREATE TABLE [Posts] (
    [Id] int NOT NULL IDENTITY,
    [Title] nvarchar(max) NOT NULL,
    [Contents] nvarchar(max) NOT NULL,
    [PostedOn] datetime2 NOT NULL,
    [UpdatedOn] datetime2 NULL,
    [BlogId] int NOT NULL,
    CONSTRAINT [PK_Posts] PRIMARY KEY ([Id]),
    CONSTRAINT [FK_Posts_Blogs_BlogId] FOREIGN KEY ([BlogId]) REFERENCES [Blogs] ([Id]));

This results in the scaffolding of non-nullable (required) relationship between blogs:

public partial class Blog
{
    public Blog()
    {
        Posts = new HashSet<Post>();
    }

    public int Id { get; set; }
    public string Name { get; set; } = null!;

    public virtual ICollection<Post> Posts { get; set; }
}

And posts:

public partial class Post
{
    public Post()
    {
        Tags = new HashSet<Tag>();
    }

    public int Id { get; set; }
    public string Title { get; set; } = null!;
    public string Contents { get; set; } = null!;
    public DateTime PostedOn { get; set; }
    public DateTime? UpdatedOn { get; set; }
    public int BlogId { get; set; }

    public virtual Blog Blog { get; set; } = null!;

    public virtual ICollection<Tag> Tags { get; set; }
}

Many-to-many relationships

The scaffolding process detects simple join tables and automatically generates a many-to-many mapping for them. For example, consider tables for Posts and Tags, and a join table PostTag connecting them:

CREATE TABLE [Tags] (
  [Id] int NOT NULL IDENTITY,
  [Name] nvarchar(max) NOT NULL,
  [Description] nvarchar(max) NULL,
  CONSTRAINT [PK_Tags] PRIMARY KEY ([Id]));

CREATE TABLE [Posts] (
    [Id] int NOT NULL IDENTITY,
    [Title] nvarchar(max) NOT NULL,
    [Contents] nvarchar(max) NOT NULL,
    [PostedOn] datetime2 NOT NULL,
    [UpdatedOn] datetime2 NULL,
    CONSTRAINT [PK_Posts] PRIMARY KEY ([Id]));

CREATE TABLE [PostTag] (
    [PostsId] int NOT NULL,
    [TagsId] int NOT NULL,
    CONSTRAINT [PK_PostTag] PRIMARY KEY ([PostsId], [TagsId]),
    CONSTRAINT [FK_PostTag_Posts_TagsId] FOREIGN KEY ([TagsId]) REFERENCES [Tags] ([Id]) ON DELETE CASCADE,
    CONSTRAINT [FK_PostTag_Tags_PostsId] FOREIGN KEY ([PostsId]) REFERENCES [Posts] ([Id]) ON DELETE CASCADE);

When scaffolded, this results in a class for Post:

public partial class Post
{
    public Post()
    {
        Tags = new HashSet<Tag>();
    }

    public int Id { get; set; }
    public string Title { get; set; } = null!;
    public string Contents { get; set; } = null!;
    public DateTime PostedOn { get; set; }
    public DateTime? UpdatedOn { get; set; }
    public int BlogId { get; set; }

    public virtual Blog Blog { get; set; } = null!;

    public virtual ICollection<Tag> Tags { get; set; }
}

And a class for Tag:

public partial class Tag
{
    public Tag()
    {
        Posts = new HashSet<Post>();
    }

    public int Id { get; set; }
    public string Name { get; set; } = null!;
    public string? Description { get; set; }

    public virtual ICollection<Post> Posts { get; set; }
}

But no class for the PostTag table. Instead, configuration for a many-to-many relationship is scaffolded:

entity.HasMany(d => d.Tags)
    .WithMany(p => p.Posts)
    .UsingEntity<Dictionary<string, object>>(
        "PostTag",
        l => l.HasOne<Tag>().WithMany().HasForeignKey("PostsId"),
        r => r.HasOne<Post>().WithMany().HasForeignKey("TagsId"),
        j =>
            {
                j.HasKey("PostsId", "TagsId");
                j.ToTable("PostTag");
                j.HasIndex(new[] { "TagsId" }, "IX_PostTag_TagsId");
            });

Other programming languages

The EF Core packages published by Microsoft scaffold C# code. However, the underlying scaffolding system supports a plugin model for scaffolding to other languages. This plugin model is used by various community-run projects, for example:

EntityFrameworkCore.VisualBasic provides support for Visual Basic
EFCore.FSharp provides support for F#

Customizing the code

Starting with EF7, one of the best ways to customize the generated code is by customizing the T4 templates used to generate it.

The code can also be changed after it is generated, but the best way to do this depends on whether you intend to re-run the scaffolding process when the database model changes.

Scaffold once only

With this approach, the scaffolded code provides a starting point for code-based mapping going forward. Any changes to the generated code can be made as desired--it becomes normal code just like any other code in your project.

Keeping the database and the EF model in sync can be done in one of two ways:

Switch to using EF Core database migrations, and use the entity types and EF model configuration as the source of truth, using migrations to drive the schema.
Manually update the entity types and EF configuration when the database changes. For example, if a new column is added to a table, then add a property for the column to the mapped entity type, and add any necessary configuration using mapping attributes and/or code in OnModelCreating. This is relatively easy, with the only real challenge being a process to make sure that database changes are recorded or detected in some way so that the developer(s) responsible for the code can react.

Repeated scaffolding

An alternate approach to scaffolding once is to re-scaffold every time the database changes. This will overwrite any previously scaffolded code, meaning any changes made to entity types or EF configuration in that code will be lost.

[TIP] By default, the EF commands will not overwrite any existing code to protect against accidental code loss. The -Force (Visual Studio PMC) or --force (.NET CLI) argument can be used to force overwriting of existing files.

Since the scaffolded code will be overwritten, it is best not to modify it directly, but instead rely on partial classes and methods, and the mechanisms in EF Core that allow configuration to be overridden. Specifically:

Both the DbContext class and the entity classes are generated as partial. This allows introducing additional members and code in a separate file which will not be overridden when scaffolding is run.
The DbContext class contains a partial method called OnModelCreatingPartial. An implementation of this method can be added to the partial class for the DbContext. It will then be called after OnModelCreating is called.
Model configuration made using the ModelBuilder APIs overrides any configuration done by conventions or mapping attributes, as well earlier configuration done on the model builder. This means that code in OnModelCreatingPartial can be used to override the configuration generated by the scaffolding process, without the need to remove that configuration.

Finally, remember that starting with EF7, the T4 templates used to generate code can be customized. This is often a more effective approach than scaffolding with the defaults and then modifying with partial classes and/or methods.

How it works

Reverse engineering starts by reading the database schema. It reads information about tables, columns, constraints, and indexes.

Next, it uses the schema information to create an EF Core model. Tables are used to create entity types; columns are used to create properties; and foreign keys are used to create relationships.

Finally, the model is used to generate code. The corresponding entity type classes, Fluent API, and data annotations are scaffolded in order to re-create the same model from your app.

Limitations

Not everything about a model can be represented using a database schema. For example, information about inheritance hierarchies, owned types, and table splitting are not present in the database schema. Because of this, these constructs will never be scaffolded.
In addition, some column types may not be supported by the EF Core provider. These columns won't be included in the model.
You can define concurrency tokens in an EF Core model to prevent two users from updating the same entity at the same time. Some databases have a special type to represent this type of column (for example, rowversion in SQL Server) in which case we can reverse engineer this information; however, other concurrency tokens will not be scaffolded.

Share via