The pros and cons of Partial Projects and Database Project References

This blog post assumes you are using Visual Studio 2010. However, most of the information is relevant to Visual Studio Team System 2008 Database Edition GDR.

We get a lot of questions on the forum for which the answer is to use partial projects and/or database project references. Before you choose one or the other method to implement your database design it is helpful to understand the pros and cons of each feature. This blog post attempts to give you some insight into these pros and cons.

Inside Data Dude

Before we discuss Partial Projects and Database Project References it is important to understand how the project system behaves. Specifically, it is important to understand how it represents your design internally. The following figure describes how a database or server project named “Foo” is represented. In other words, the figure describes what Data Dude consumes to create the in-memory model of your database design. We refer to the in-memory model as the “Schema Model.” The Schema Model represents four things

  1. Your Project Properties \ Project Settings; Project version – SQL 2005, SQL 2008 or DAC and Collation for database model – The model is actually persisted in a SQL Compact database. Since there is not a 1 to 1 mapping between SQL Server collations and SQL Compact collations you have to tell Data Dude which one to use.
  2. The database settings you define in the .sqlsettings file
  3. The permissions you define in the .sqlpermissions file
  4. The objects you declare in your .sql files
  5. The XML schema collections you reference in .xsd files

 p4

The reason it is important to understand this is because it profoundly affects what happens when you deploy your project. If an object or setting is not in the Schema Model it will not be compared to the target database and added, modified or deleted when you deploy the project.

p5

Partial Projects

Partial projects allow you to share source code files between projects without creating multiple copies of the actual source code. In other words, the same physical file is incorporated directly into the Schema Model for two (or more) projects. See the following figure. In this diagram some files have been exported from project “Foo” and imported into project “Bar.” It’s important to understand that the same physical .sql file(s) is being consumed by each project. The Schema Model for “Bar” does not understand that the source files are being maintained in another project (“Bar”).

p6

This brings us to the Pros and Cons of partial projects

Pros

  • When you include code by importing a partial project the T-SQL code is included in the project’s Schema Model.
  • When you edit the .sql file the changes are reflected in the Schema Model for the project that owns the .sql file as well as the project(s) importing the file. Deployment is simple because all your objects are defined in one Schema Model.

Cons

  • If you add a new file that you want to export you have to
    • Re-create the partial project export
    • Delete the partial project from the project doing the import
    • Re-import the partial project

This is obviously a big pain and is the number one reason people don’t like partial projects. The work around is to define all the objects that you plan to export from a particular project in one .sql file.

  • The export/import definition is persisted in a .files file. This file contains links to the original .sql files being exported. These links are file paths so if the file paths change all the links are broken. This can easily happen as projects are checked into source control and then the projects are built using a build system like TFS Build.
  • If you perform a refactoring operation on an object that has been imported/exported, the refactoring log is only persisted in the project in which you performed the refactoring operation. The refactoring log is what the deployment component uses to figure out that an object, for example, should be renamed instead of dropped and recreated. For example, if you perform the refactoring operation in the project that does the import, the preservation of intent information is not persisted to the project from which the object is exported. However, the object is renamed in both projects (assuming the Rename refactor operation). The work arounds are a) always perform refactor operations from the project that you will deploy or b) manually copy the specific refactoring element from the refactoring log file into the refactoring log file of the other project.

Database Project References

Database project references allow you to reference T-SQL objects defined in another database or server project. The objects defined in the referenced project are included as [External] objects in the Schema Model of the referencing project. See the following figure.

p7

As the figure indicates, the objects defined in the referenced project are not truly included in the Schema Model for “Foo.” If an object is referenced that is not defined in the Schema Model the validation component tries to find the object’s definition in the [External] set of objects.

Two, Three and Four Part Naming

When you create your database project reference you have to decide if you will be referencing the objects from the referenced project using two part, three part or four part names. You control this behavior by defining Database Reference Variables, as explained in the section below titled Configuring the Database Reference.

Two part names

The Data Dude product team refers to project references that use two part names as “Composite Projects.” The reason being you are implementing a single database that is “composed” of two or more projects. The term “Composite Project” is a euphemism for a project reference that is configured for two part name references.

There are two common ways that developers organize their code

  • By object type – Tables in one project, Views in another and Sprocs in a third project
  • By Schema – Sales in one project, Employees in another, etc.

It is important to understand that each project must be deployed independently and they must be deployed in the correct order. If you use Visual Studio to deploy your project or solution it will automatically deploy everything in the correct order. If you use a command line tool like msbuild or vsdbcmd you have to make sure you do the deployment in the proper order.

Three part names

By defining a database name using a literal value or SQLCMD variable you are configuring the database reference such that references to SQL objects must be defined using three part names. See the configuration information below for details.

If you need to refer to objects that are defined in SQL Server’s master database this is the configuration you will use. You will define a Database Reference to the appropriate master.dbschema file that we ship and you will set the database variable name to master after enabling the Literal checkbox.

Four part names

By defining a server variable you are configuring the database reference such that references to SQL objects must be defined using four part names.

 

Configuring the Database Reference

There are several configuration settings available when defining a database project reference. It’s helpful to understand what each setting actually affects. I’ve included a screenshot of the dialog below.

p8

Database Reference

This is the setting that allows you to point to what you are actually referencing. You have two choices

  1. Another project in the current solution
  2. A .dbschema file

When you build a database or server project the artifact created is a .dbschema file. It’s important to understand that both choices actually end up pointing to a .dbschema file. Choosing the first option simply looks at the current build output folder setting and automatically figures out the path and filename. In other words, you don’t have to browse for the file. This also means that if you don’t actually build the project you are referencing the database reference is effectively broken. You’ll see a warning icon in Solution Explorer.

Database Reference Variables

Use these settings when you need to use three or four part names. In other words, you are defining a database reference to a project that will be deployed to a different physical database.

  1. Define server variable – You are defining a SQLCMD variable and the associated value. You can then use the variable instead of hard coding a server name in your T-SQL code. When you deploy your project you can optionally override the value of the variable.
  2. Define database variable – This setting gives you the option of defining a SQLCMD variable representing the database name or choosing the Literal option to indicate that your T-SQL code will be hard coding the name of the database. The most common use of the Literal setting is when referencing the database “master.”

Updating Schema Objects and Scripts

By selecting this option Data Dude will do a search and replace of any server or database names throughout your code with the variables you defined.

Resolution of External Objects

The option Suppress errors caused by unresolved references in the referenced project is the most relevant to topic of this blog post. This option affects how the validation component behaves. Imagine you are defining a reference to a project which in turn references another project (we’ll call it “Third”). Think about the previous explanation of how the Schema Model is populated. When the validation component is validating the [External] objects it will not be able to find objects that are defined in the project “Third.” If you leave this option off you will get errors for the objects defined in “Third.” By setting this option you cause the validation component to not attempt to resolve objects that are referenced from the project being referenced.

Now let’s look at the pros and cons of Database Project References.

Pros

  • If an object is added, modified or deleted the Schema Model of the referencing project (“Foo”) will be updated as soon as the referenced project (“Bar”) is built. Remember that the .dbschema file is what is actually referenced.
  • Since each project creates its own Schema Model you can improve the overall performance of Visual Studio by breaking your database design up into multiple projects. Note that there is a point of diminishing returns with this technique. All the Schema Models share the same process so memory limitations can become an issue.
  • You can provide .dbschema files without providing the associated source code. Thereby preventing someone from making changes to the code of the referenced project.
  • You can define references to other projects that define objects within the same database (two part names), a different database (three part names) or even a different server (four part names).
  • Code reuse is very easy. Imagine you have one project that implements tables, views, sprocs, etc. for diagnostic or auditing functionality. You can re-use the functionality by referencing it from n other database projects representing various database designs. You have the flexibility to use two, three or four part names.

Cons

  • Each project has to be deployed individually and the projects must be deployed in the proper dependency chain order. This is because the objects are only represented in the model for the owning project.
  • Since each project must be deployed individually you have to make sure the deployment setting Generate DROP statements for objects that are in the target database but that are not in the database project is not enabled. Otherwise each time you deploy a project all the objects defined in the referenced projects will get dropped from the target database. This setting is defined in the .sqldeployment properties file and it is cleared by default. The implication of this behavior is that if you actually want to remove an object definition from the target database you will have to either recreate the database during deployment or explicitly drop the object using a tool like SSMS.
  • Using Schema Compare to compare a database to a project that is part of a composite project (references using two part names) is very difficult. The problem is the model for the database represents the entire database design while the model for the project only contains the objects defined in that particular project. So Schema Compare reports a bunch of update actions for the objects defined in the other projects. If you need to use Schema Compare you have to set the update action for the objects that exist in other projects to Skip so they don’t get dropped from the target database when you do a Write Updates. That can be a lot of work! In order to avoid that I create a Visual Studio configuration (i.e. Release and Debug) named DontDeploy and set the Project Properties \ Deploy \ Deploy action to Create a deployment script (.sql) . Then I can easily switch my Visual Studio configuration, do a build that doesn’t actually deploy to the target database and look at each deployment script. Remember that one is created for each project.

Conclusion

Hopefully this is helpful. Let me know if you still have questions on this subject that I haven’t addressed.