SQL Q+AClusters, Snapshots, Log Shipping, and More
Edited by Nancy Michell
Cluster vs. Standby Server
Q I have a SQL Server 2000 failover cluster. My system and user databases are replicated to a single machine in another building because I don’t have enough machines to create a second cluster there. I need to manage disaster recovery with a single machine. What is the best way to configure that machine: use a SQL Server failover cluster with only one node or a standalone SQL Server (standby server) with the original virtual server name?
A There are a few things to keep in mind when making your decision. If in the future you acquire more machines you can add additional nodes relatively easily to a one-node failover cluster, but you’d have to upgrade a standalone instance to a failover cluster before you could add more nodes, which is not as easy (though it is supported in Microsoft® SQL Server™ 2000). See the following SQL Server Books Online articles for information on this topic: "How to: Create a New SQL Server 2005 Failover Cluster" and "How to upgrade from a default instance to a default clustered instance of SQL Server 2000".
A standalone SQL Server 2005 instance cannot be upgraded to a failover cluster without reinstalling the SQL Server instance as a failover clustered instance, so if it were a version 2005 instance and there was a chance you’d want to eventually add more nodes you should start with a single-node cluster.
For your storage-level replication be sure the data and log being replicated on the disaster recovery site actually represent the state of the data and log that appeared at the primary site at some point in time. You don’t want your data files to get too far ahead of your log files, or vice versa. There is a relationship and ordering between the files that must be preserved (via consistency groups or some other mechanism) so that the database will be recoverable.
Also be sure you’re testing the disaster recovery site periodically, and be aware that there is other state information that is outside of the data and log files. For example, changes to the registry will not magically appear on the remote site like they would with a geographically dispersed cluster.
Finally, remember that for SQL Server 2000, the standalone will be easier to rename. See Knowledge Base article 307336 "The SQL Server Network Name resource cannot be renamed". Also note that renaming the SQL Server virtual or clustered servers is not supported in SQL Server 2000.
Database Access During Snapshot
Q If my application tries to access a snapshot during the drop/creation of a new snapshot, I presume it will get an error and I will have to resubmit my query. With log shipping, when the database is unavailable due to the application of a new log, will my application also get an error, or will it just run slower while the log is being applied?
A With regard to your question about what happens to your client application when a log is being restored, there are two possibilities. If the application is connected when you attempt the restore, then the restore will fail since it’s an exclusive-use operation. You have the option to break the connection before the restore, but if you do that then your application will see the disconnect. The second possibility is that the client attempts to connect while the restore is in process. If this happens, then either the connection attempt will time out, or will successfully wait until the restore is complete.
The important concept to understand is that you can’t read the database while a log is being restored; restore is an exclusive-use operation.
You have two options for dealing with this: one is to prevent logs from being applied when users are connected to the database, and the other is to close user connections to the database when logs are being applied.
On a practical level, you will find that there will be times when users can’t access the log shipping standby regardless of which option you choose. Log shipping was never designed for a reporting solution despite the fact that you are technically able to do it if you restore the main database using WITH STANDBY. The main purpose of log shipping is for use in availability and disaster recovery. This question has come up quite frequently and the answer is always the same. If you need a reporting solution, use something else. Log shipping needs exclusive access while loading transaction logs (TLogs) for however long that takes.
Drive Space During Installation
Q I need to upgrade a Systems Management Server (SMS) 2003 SP2 server, which is running SQL Server 2000 SP4. The server has a C drive with about 1.2GB of free space and a D drive with about 10GB free. SQL Server 2000 is installed on the D drive. During the setup for SQL Server 2005, I selected Advanced and gave it a path of D:\Program Files\Microsoft SQL Server, yet I can’t install because I don’t have enough free space. How can I install everything on D as with SQL Server 2000?
A To successfully install SQL Server 2005 you’ll need about 2GB of free space on the system drive even if you’ve elected to install all SQL Server components to a non-system drive. There are SQL Server files that must go to %Program Files%\Microsoft SQL Server\90. In addition, Windows® Installer caches the installation package to the system drive until the installation is complete; the cache is temporary and is deleted when setup finishes.
To install SQL Server components to another drive, choose an installation location for each feature in the Advanced Feature Selection dialog of SQL Server setup. Another option is to use the command-line utility written by Gert Drapers to temporarily divert "Program Files" installs to an alternative drive.
Hardware Recommendations
Q I need some advice on hardware for my SQL Server 2005 migration. The vast majority of my line-of-business applications currently run on SQL Server 2000. What are your recommendations?
A It sounds like you want a prescriptive guide to moving from SQL Server 2000 running on server A to SQL Server 2005 running on server B. Unfortunately, it’s not that simple as servers have changed a lot since the introduction of SQL Server 2000, as has the capabilities of SQL Server itself. You probably don’t want the new server to just run the same old apps at the same old speed. Chances are you want a boost in throughput or performance, or want to buy less hardware to consolidate some servers.
There are a couple of SQL Server 2000 sizing tools from Microsoft, HP, and Unisys (look under the Build section). Another resource to try is the Windows Server™ System Reference Architecture (WSSRA) site which includes hardware recommendations for each piece of hardware in the reference architecture. If you are considering server consolidation, then the following whitepaper is a good place to start: "Planning for Consolidation with Microsoft SQL Server 2000".
SQL Server Security
Q I need to deploy SQL Server Express and a Microsoft .NET Framework 2.0-based smart client application to many users. I need to seal the database (especially the application data) so none of the users can see or alter the data. The systems administrator should not be able to see or change the application data, but he should be able to drop tables or attach and detach databases. I also need to prevent anybody from seeing the data if they detach the database and attach it to another server. How can I accomplish this?
A One solution is to encrypt the data in the application before it is written to the database. Alternatively, you could use column-level encryption, a new feature in SQL Server 2005. The big issue is that once a column is encrypted at the SQL Server level, performance during data modification or lookup suffers. If you are going to encrypt all the columns, a simple lookup will require an on-the-fly decryption of all the data in the table to find what you are looking for. If the encryption is done at application level, the client application will have to decrypt the data and then find the rows it’s looking for. Either way, you may want to consider a partial encryption, leaving the primary key and foreign keys unencrypted.
Log Shipping Shrinks TLog
Q I have configured log shipping between two SQL Server 2000 installations. The problem is that the TLog of the database that is configured for log shipping, after being shipped on the standby SQL Server, remains shrunken on the source server.
The TLog impacts performance by growing quite large on a regular basis, until it shrinks again at the next shipping. Is it possible to avoid the shrinking action on the source SQL Server?
A A log shipping TLog backup will not shrink the physical log file. The used space in the files will go down as logical log files are released and truncated, but the physical space will remain.
Shrinking the database/log file with DBCC SHRINKDATABASE or SHRINKFILE will queue deferred log file shrinking and it will not take effect until the tail of the file has been backed up, which sounds exactly like what you’re experiencing. If you do not want the log file to return to the allocation size you can create the file at the size you want and the shrink action will not shrink the log file to smaller than the size at which the file was created.
To avoid shrinking the log files, turn off autoshrink on the database and use DBCC SHRINKFILE on the data files instead of DBCC SHRINKDATABASE if you are manually shrinking the database.
Log Shipping and Data Loss
Q Let’s say I have a log shipping job that backs up TLogs set up to run every 15 minutes. The log shipping job executes at 10:00 A.M. and the database server goes down at 10:05 A.M. Let’s also assume that I have data that comes in between 10:00 and 10:05 and I lose the .mdf file on the primary server, but still have the .ldf file. Would I be able to restore the secondary server up to 10:05 A.M.? How can I minimize the possibility of data loss when using log shipping?
A To minimize the possibility of data loss, you can increase the frequency of the backup job and implement better logic to do the failover when the primary server goes down. But as you can see, no matter what you do, there will always be a window during which data loss can occur. So this should be part of your deployment plan.
You can also back up the data in the .ldf file without the .mdf file (in SQL Server 2000). The challenge is that you need to run the BACKUP LOG WITH CONTINUE_AFTER_ERROR (SQL Server 2005) or BACKUP LOG WITH NO_TRUNCATE on the primary server.
Some people use hardware mirroring on the log volumes to ensure that the log is available on a remote site in order to cover the case where the primary server is unavailable.
Bulk Insert with XML Formatfile
Q I’m trying to load ST_ENDERECO.txt into the table tb_endereco using the format file fmt_endereco.xml.
BULK INSERT dbo.tb_endereco
FROM ‘C:\BulkInsert\ST_ENDERECO.TXT’
WITH TABLOCK
(formatfile = ‘C:\BulkInsert\fmt_ endereco.xml’,
)
When I do this via a bulk insert, I get an error stating that string or binary data would be truncated. However, when I do it using openrowset (see Figure 1), it works fine. Can you tell me what I am doing wrong in bulk insert process?
Figure 1 Using OPENROWSET
INSERT dbo.tb_endereco
(id_integracao, tp_logradouro, no_logradouro,
nu_endereco, no_complemento, no_bairro, no_cidade, sg_estado, cd_cep)
SELECT t1.*
FROM OPENROWSET(BULK ‘C:\BulkInsert\ST_ENDERECO.TXT’,
FORMATFILE=‘C:\BulkInsert\fmt_endereco.xml’
AS t1
A First of all you need to know that the bulk load statement and the INSERT...SELECT...FROM OPENROWSET statement are not equivalent.
With the format file provided, bulk insert would map the first nine columns (including the identity column) to the nine source fields in the data file. Since you haven’t specified the KEEP_IDENTITY option the data for the identity column would be automatically omitted.
The equivalent INSERT...SELECT...FROM OPENROWSET would look like Figure 2, and it would fail with exactly the same error.
Figure 2 Equivalent Code
INSERT dbo.tb_endereco
(id_integracao, tp_logradouro, no_logradouro,
nu_endereco, no_complemento, no_bairro, no_cidade, sg_estado)
SELECT
tp_logradouro, no_logradouro,
nu_endereco, no_complemento, no_bairro, no_cidade, sg_estado, cd_cep
FROM OPENROWSET(BULK ‘C:\BulkInsert\ST_ENDERECO.TXT’,
FORMATFILE=‘C:\BulkInsert\fmt_endereco.xml’
) AS t1
Unfortunately, this is one case where the new style format file does not allow you to express your intention (you want to skip the first target column and map the rest of the target columns to the source fields). In order to work around the issue you have two options.
You could create a view that would skip the identity column:
CREATE VIEW v_tb_endereco
AS
SELECT
id_integracao, tp_logradouro, no_logradouro,
nu_endereco,
no_complemento, no_bairro, no_cidade,
sg_estado, cd_cep
FROM tb_endereco
And then issue the original bulk insert statement, like so:
BULK INSERT v_tb_endereco
FROM ‘C:\BulkInsert\ST_ENDERECO.TXT’
WITH (formatfile = ‘C:\BulkInsert\fmt_
endereco.xml’,tablock)
Alternatively, you could use the equivalent older style format file with the same bulk insert statement.
The XML format file is geared more towards generating a rowset from datafile. So for example, you can do this:
SELECT *
FROM OPENROWSET (BULK, ....)
You will notice that the XML format file provides you with an explicit type for the column while in the old format file it is derived from the type of the field. Now, you typically load a rowset into a target table, using INSERT and SELECT statements like the ones in this code snippet:
INSERT INTO <target table> (<col-list>)
SELECT *
FROM <a rowset>
In this command, you need to explicitly provide that mapping of columns as generated by the select statement to the columns in the target table.
The old format file was meant for bulk import, so it also had the target table column mapping. But as indicated, you can use view to get around this problem with XML format file. In this case, the view definition is actually doing the column mapping for you. SQL Server Books Online has more information on these topics.
Thanks to the following Microsoft IT pros for their technical expertise: Sunil Agarwal, Yunwen Bai, Batu Balkanli, Byron Boudreaux, Patrick Conlan, Gert Drapers, Michael Epprecht, Fernando Ferreira, Padma Ganapathy, Alessandro Gregnanin, Mucahit Gundebahar, Allan Hirt, Dan Jones (SQL), Tengiz Kharatishvili, Steve Lindell, Saverio Lorenzini, Jeff Mayer, Ivan Penkov, Steve Schmidt, Margi Showman, Chris Skorlinski, Stephen Strong, Derek Tan, Don Vilen, Steven Wort, and Sedat Yogurtcuoglu.
Edited by Nancy Michell
© 2008 Microsoft Corporation and CMP Media, LLC. All rights reserved; reproduction in part or in whole without permission is prohibited.