Fun with Schemas

Remember when we wrote the Database Analyzer using DAO a while back? This worked great, but on Access .mdb files only. However, what happens if we are using an OLE DB data provider and we don't know exactly what fields are available? Well, we can accomplish the same thing for any data source as we did using out DAO Table Analyzer, using ADO. Since ADO really talks to the OLE DB layer, we can get any information on the underlying data source from the OLE DB provider. This can easily be done by using the OpenSchema method of our connection object.  

By using the OpenSchema method, we can spy on information about the particular data source we are connected to. We can easily get information about the data source, such as the tables on the server and the columns in those tables.

Take a look at the figure below:

Our application uses ADO to talk to the OLE DB data provider. Our application probably does not have to know how to communicate with all of the various data stores. We just use ADO to talk to the OLE DB provider and it takes care of the nitty gritty of how to communicate with the various and sundry data stores.  When we select a specific OLD DB data provider, we know that our application can just use ADO to talk to that OLE DB provider. And using this approach, we are removed from having to know about the details of each and every data source. We just leave that up to the OLE DB provider.

But, despite all this built-in invisibility, what if our program needs to know something about the data store we are accessing?   For example, what if we need to find out things like field names? Or what if we need to know if certain variables will be supported? Well, this is a snap using ADO.

It is the responsibility of the OLE DB provider to give us this type of information. This way our application can quickly get information on the underlying data store that might range from a relational database such as Access to an e-mail message or text file.

When reading about OLE DB, you will see the terms Consumer and Provider. A consumer is any application that uses - or consumes - OLE DB interfaces. For example, our programs have been using ADO to talk to OLE DB to connect to our Access database. Our ADO code and the data control are both consumers of OLE DB services.

An OLE DB provider uses OLE DB interfaces, such as our ODBC connection that we created. This means that an OLE DB provider (our ODBC connection) allows consumers of their services to access data in a uniform way via the OLE DB interface. Conceptually, an OLE DB provider is similar to an ODBC driver. That driver provides a uniform mechanism for accessing relational data - it understands SQL. But the cool thing about OLE DB providers is that they not only provide a mechanism for relational data, but they can talk to non-relational data sources as well.

OK, what if our program needs to find out information about the underlying data store? What we'll do next is create some code that will allow us to display information about how the data source that we want to access is laid out. We'll display the description of what's in the data source – its schema.

Try It Out – Getting the Schema of the Data Source using ADO

1.   Add a new form to your project. Call the form frmSchema. Add a single command button named cmdSchema - nothing fancy:

We will use this single form for these next few examples - we will just add a few command buttons and print the results to VB's Immediate Window using the Print method of the debug.object. Rather than cloud the examples with a lot of formatting code, I want to focus on the ADO code. So just use a single form and add another button when asked. Thanks!

2.  OK enough talk. Let's do some coding. Add the following code to the Click event procedure of the command button.  This code will establish a connection with a data source. Then we will ask the data source which tables and fields are available. You will quickly notice that the ADO code is much easier to write than the equivalent DAO code.

Private Sub cmdSchema_Click()

Dim adoConnection As ADODB.Connection
Dim adoRsFields As ADODB.Recordset
Dim sConnection As String
Dim sCurrentTable As String
Dim sNewTable As String

Set adoConnection = New ADODB.Connection

sConnection = "Provider=Microsoft.Jet.OLEDB.3.51;Data Source=c:\BegDB\Biblio.mdb"

adoConnection.Open sConnection

Set adoRsFields = adoConnection.OpenSchema(adSchemaColumns)

sCurrentTable = ""
sNewTable = ""

Do Until adoRsFields.EOF
  sCurrentTable = adoRsFields!TABLE_NAME
  If (sCurrentTable <> sNewTable) Then
    sNewTable = adoRsFields!TABLE_NAME
    Debug.Print "Current Table: " & adoRsFields!TABLE_NAME
  End If
  Debug.Print "   Field: " & adoRsFields!COLUMN_NAME
  adoRsFields.MoveNext
Loop

adoRsFields.Close
Set adoRsFields = Nothing
adoConnection.Close
Set adoConnection = Nothing

End Sub

3.  Run your frmSchema form and click on the Schema button. We will step through what the code is doing shortly, but first, take a look at the results that appear in the Immediate window:

Since we are using the debug.print method, the output is just being sent to the Immediate window. Of course, if you wish, you can get fancy and place the output in a TreeView control as we did using DAO a few chapters back. But in this example we can see just how easy it is to interrogate the OLE DB provider to get this type of information.

You will see some tables that start with Msys such as MSysIMEXColumns. These tables are used by Jet to store various meta-information about the tables and database. Meta-information really means information about information. So you get to spy on the various tools that Jet uses to maintain an Access database. Of course, these would not be present if you used ADO to open another  - non-Access - data source.

How It Works

We start out by dim'ing our local variables. We dim an ADODB connection and recordset object as usual:

Dim adoConnection As ADODB.Connection
Dim adoRsFields As ADODB.Recordset
Dim sConnection As String
Dim sCurrentTable As String
Dim sNewTable As String

Next, we set a reference to our (new) connection in preparation for opening it. Of course, that reference is in our object variable adoConnection. In order to open the connection, we must set the .ConnectionString property. So again, to illustrate the point, we just assign the connection string to a string variable, sConnection. Next, we invoke the .Open method of the connection object and pass in the sConnection variable that holds the connection string as a parameter:

Set adoConnection = New ADODB.Connection

sConnection = "Provider=Microsoft.Jet.OLEDB.3.51;Data Source=c:\BegDB\Biblio.mdb"

adoConnection.Open sConnection

We now have an open connection. Why not do something with it? 

Our cunning plan is to retrieve a recordset of information about the data source. By using the .OpenSchema method, we can get returned to us information about the data source, such as information about the tables on the server and the columns in the tables. There are several constants that can be used to retrieve specific information about the underlying data source. Of these, we will use the adSchemaColumns constant – this will return the table name and the column name. This way we can find out about what tables are in the database, and what fields are in the tables (in our next example, we will find out the details about the individual fields in the tables):

Set adoRsFields = adoConnection.OpenSchema(adSchemaColumns)

At this point, we have a valid recordset containing the information about the table and field names. Now we will loop through the recordset and print the results in the debug (immediate) window.  The two string variables, sCurrentTable and sNewTable, are used as placeholders. We will loop through the results and provide the table name as a header and then print the fields inside that table in an indented manner.

The returned recordset, adoRsFields, will have a combination of Table Name and Field Name in each record:

sCurrentTable = ""
sNewTable = ""

Do Until adoRsFields.EOF
  sCurrentTable = adoRsFields!TABLE_NAME
  If (sCurrentTable <> sNewTable) Then
    sNewTable = adoRsFields!TABLE_NAME
    Debug.Print "Current Table: " & adoRsFields!TABLE_NAME
  End If
  Debug.Print "   Field: " & adoRsFields!COLUMN_NAME
  adoRsFields.MoveNext
Loop

The Do loop simply runs the code inside of it until the end of the recordset. The first time though the loop, we assign the value of the current table name to the variable sCurrentTable. Of course, there will be several fields for each table, so we want to only print the table name once.

The next line is used to determine if there is a new table name in the current record. However, we initialized the variable sCurrentTable to "", so the sCurrentTable value and the name of the table are not equal. Therefore, we first assign the name of the current table to sNewTable and then print the name of the table in the Immediate window.

The code then exits the If…End If and prints the name of the field in that same record. Recall that each record in the recordset will have both the current table and a field in that table. Then the current record pointer is incremented by using the .MoveNext method of the recordset.

The next time through the loop, we assign the name of the table in that record to sCurrentTable. If the name of that table is equal to the name of the table name that was just printed out, If…End If structure is bypassed and only the field is printed.

Trawling for Data Types

There will be times when we need to know the data types that are supported by the underlying data source. For example, we would not want to try to write a variable to an underlying field if that field could not support the data, right? For example, it would be embarrassing to write a variant to an integer field, only to be surprised by an error message.

It would be great if there were a simple way for us to find out what is supported by whatever data store we are connected to, right? Well, ADO provides an easy and painless way to find out.

As mentioned above, you can also use the .OpenSchema method to find out this important information. By passing in the constant adSchemaProviderTypes as a parameter, a recordset is returned that shows all of the types provided. Let's have a go at doing that now.

Try It Out – Determining Data Types of the Data Source using ADO

1.   Add another button to your frmSchema form that was used in the proceeding example. Name it cmdDataTypes and caption it as shown:****

2.  Next, add this code to the click event of the cmdDataTypes button. If you are lazy like me, you can cut and paste from the cmdSchema_Click event and just change a few lines:

Private Sub cmdDataTypes_Click()
Dim adoConnection As ADODB.Connection
Dim adoRsFields As ADODB.Recordset
Dim sConnection As String
Set adoConnection = New ADODB.Connection
sConnection = "Provider=Microsoft.Jet.OLEDB.3.51;Data Source=c:\BegDB\Biblio.mdb"
adoConnection.Open sConnection
Set adoRsFields = adoConnection.OpenSchema(adSchemaProviderTypes)
Do Until adoRsFields.EOF
  Debug.Print "Data Type: " & adoRsFields!TYPE_NAME & vbTab _
                   & "Column Size: " & adoRsFields!COLUMN_SIZE
  adoRsFields.MoveNext
Loop
adoRsFields.Close
Set adoRsFields = Nothing
adoConnection.Close
Set adoConnection = Nothing
End Sub

3.  Press F5 to run the program. Press the Data Types button. When you run the program, the following output is sent to the immediate window:

How It Works

Since the code is almost identical to our last program, we don't need to dwell on what is happening. But let's take a look at the output. Of course, the Data Type tells us all of the data types available in this particular data source. The Column Size tells us the length of a column or parameter. The length refers to either the maximum or the defined length for this type by the provider. For character data, this is the maximum or defined length in characters. For date/time data types, this is the length of the string representation (which assumes the maximum allowed precision of the fractional seconds component). If the data type is numeric, the column size is the upper bound on the maximum precision of the data type.  Pretty cool.

© 1998 by Wrox Press. All rights reserved.