System Center
Better Backups with Data Protection Manager 2007
Calvin Keaton
At a Glance:
- Moving away from tape backups
- Continuous Data Protection
- Disk-based backups
- More granular control with DPM
The data protection market has recently experienced a renaissance of sorts. After years of tape-based backup solutions and one-size-fits-all approaches to nightly backup, a number of new technologies and methods have been introduced in a relatively short period of time. These new technologies have included disk-based backup, Continuous Data Protection (CDP), and even something as seemingly simple as data encryption. Some of these new approaches have been made possible or even necessary by recent decreases in the price of disk storage, by improvements in network and processor performance, and by new regulatory requirements.
Microsoft has a long history of providing platform- and application-based solutions in this area, such as NTBackup, Exchange Server, and SQL Server backup tools, to name just a few examples.
System Center Data Protection Manager (DPM) 2006 was the first Microsoft standalone software entry in the data protection market; it was one of the first products to offer disk-based data protection, and it remains one of the only such offerings to have been built from the ground up with disk-based backup in mind. Most other offerings were and continue to be tape-based approaches that have been retrofitted for disk.
DPM is a part of the System Center family of products. System Center products are designed to work together in making the day-to-day management of your complex IT infrastructure both easier and more cost effective. System Center solutions are based on automation and best practices derived from the Microsoft® Operations Framework (MOF) and Information Technology Infrastructure Library (ITIL) and can be used at all levels of an organization.
DPM is just one part of the System Center family. System Center Configuration Manager provides configuration management and software delivery. System Center Operations Manager gives you proactive system monitoring and automation. System Center Capacity Planner can be used for capacity planning and what-if analysis of infrastructure deployments. (See microsoft.com/systemcenter for more information.)
System Center Data Protection Manager 2006 provided disk-based backup of File and Print servers as well as WAN-based disk backup of branch offices. It was designed to protect file servers in the datacenter and to get local tape backup hardware out of the branch. But it lacked native support for Microsoft applications and relied on third-party products for long-term archive to tape.
The new version of DPM, System Center Data Protection Manager 2007, builds on this foundation and continues to provide robust disk-based protection of File and Print servers as well as branch offices, but it also adds support for business-critical Microsoft applications such as Exchange, SQL Server, and SharePoint® as well as native support for tape backups. While it's not unusual to see a significant amount of change in a new version, this is especially true in the case of DPM. The second version has added a large number of new features as well as significant enhancements to existing features. In this article, I'll walk you through several of these new and enhanced features and give you a broad look at what you can expect from this exciting new release.
Application and Platform Support
Many backup software vendors take a "peanut butter" approach to application support. They spread their engineering investment across a large number of applications in an attempt to support as many data sources as possible. The result is very basic support for a large number of applications. Rather than attempting to provide limited support for a broad range of applications or platforms, Microsoft has chosen to focus on protecting a few very specific applications on the Windows® platform. This has allowed us to focus our engineering investment on these applications to build an integrated backup and restore experience, which includes features not commonly found in other backup products.
Microsoft Exchange support was the number-one feature request of users of DPM 2006. With this in mind, DPM 2007 introduces support for both Exchange Server 2003 and Exchange Server 2007. Protection occurs at the storage group level with individual storage groups appearing as objects in the DPM UI, which can then be added to protection groups as necessary to enable scheduled, policy-driven backups to occur. Clustered Storage Groups appear as a single object, removing the need to track individual members of a cluster and manually sync up their protection. Restores can occur at the Storage Group, Database, or Mailbox level. Granular recovery is achieved via automation, which uses standard Exchange tools to restore the data to a recovery storage group and then extract more granular objects like mailboxes once again via standard Exchange tools (see Figure 1).
Figure 1 Exchange support in DPM
Unlike some other Exchange backup solutions, DPM does not rely on non-Exchange tools or reverse engineering the Exchange database when it extracts objects that are then restored. So DPM-based mailbox restores are fully supported by Exchange.
DPM 2007 is the first product to fully support Exchange 2007 Cluster Continuous Replication (CCR) and Local Continuous Replication (LCR) clusters. Exchange 2003 clusters are supported as well.
SQL Server 2000 and SQL Server 2005 are supported by DPM 2007, with both backup and restore occurring at the database level. Support for redirected and renamed restores, as well as full support for clustered, mirrored, or log-shipping databases, is also included.
As with Exchange, clustered databases appear as a single object in the UI, removing the need to configure protection on multiple clustered servers separately (see Figure 2). SQL Server support in DPM can be used to protect a number of third-party applications that rely on back-end SQL Server databases, with support for the application occurring via file protection and the SQL Server database being protected natively via DPM support for SQL Server. The flexible scheduling engine in DPM allows both the file backup and the SQL Server backup to be added to the same protection group so that they occur simultaneously and share a common management and scheduling interface.
Figure 2 Clustering backups
Collaboration workloads are protected with SharePoint 2007, with Windows SharePoint Services (WSS) 3.0 and Microsoft Office SharePoint Server (MOSS) 2007 both being supported. Support occurs at the farm level, even if that farm spans more than one server. So, for example, a farm consisting of three Web front-end servers, two site collection servers, and a single database server for a total of six servers would appear as a single object in the DPM UI, would be backed up as a single object, and could be restored as a single object by DPM as well.
While full farm restores are supported, DPM also includes support for site, site collection, and individual document, list, or item restores. Older versions of SharePoint can, of course, be protected as SQL Server databases via DPM SQL Server protection features.
File and Print Workloads continue to be supported in DPM 2007, with Open File support available at no additional charge. Support for file and folder exclusions has been added, as well as System State backup support. It's important to note that File support has been further expanded to include Windows Vista® and Windows XP. But support for these client operating systems is relatively limited, as support is mainly aimed at systems that are used as file servers. DPM users are permitted to protect a small number of clients as needed, but this support will not easily scale to larger numbers of desktops.
Native support for Exchange Servers, SQL Server, SharePoint servers and file servers allows DPM to support a vast majority of the workloads found in Windows environments. But for less-common workloads, DPM 2007 also includes support for pre- and post-scripts, which can be run automatically when a backup occurs. This functionality allows DPM to support a wide range of other applications or workloads. Such support is not as integrated as that offered for Exchange, SQL Server, and SharePoint, but it does offer significant flexibility in data source support. White papers detailing how to support some key non-Microsoft products such as Oracle will be available on the DPM TechCenter at technet.microsoft.com/dpm.
One of the more interesting innovations in DPM 2007 is host-based virtual server support. While most backup products are able to support virtual server guests as individual servers, few, if any, are able to provide such protection at the host level via a single license and agent install. DPM leverages recursive volume shadow copy service (VSS) writers to protect all the guests that are hosted on a given virtual server host via a single agent deployed on the host. These recursive VSS writers are able to call VSS writers that live on guests or even in applications that run on guests regardless of what application or operating system is installed on the guest. This allows DPM to protect any platform or application that runs as a guest, regardless of the vendor. Such host-based virtual server backups are application-consistent and result in a single Virtual Hard Disk (VHD) image that can be restored on an existing or new host as necessary.
These host-based backup images don't offer the same granularity of restore as do backups made via a DPM agent installed on the guest or a standalone server, but they do offer other efficiencies including easy system recovery to virtualized environments.
Speed, Reliability, and Efficiency
In addition to improvements in both depth and breadth of application support as compared to the previous version, DPM 2007 allows increased granularity of full and incremental backups. While DPM 2006 allowed you to synchronize up to every hour, the new version of DPM allows you to take an incremental backup every 15 minutes and a full backup every hour. For customers with less aggressive service level agreements (SLAs), full backups may be required only once per week. Thus those customers can reduce the frequency of backups on less critical servers while still allowing for aggressive protection schedules for critical servers. Even though synchronization with the DPM server occurs up to every 15 minutes, the DPM agent is still actively tracking changes on the protected server during the interval between synchronizations.
This new DPM agent continuously tracks block-level changes as they occur on a protected server. This is made possible by an entirely new volume filter developed for DPM 2007 that consists of a bitmap that lives in paged pool memory and includes one bit for every block on the protected volume.
Each time a block on the volume is written to, a bit is flipped in the bitmap. The processor and memory impact associated with this process is lower than with a typical antivirus filter, and there is no disk space requirement that scales with the change rate. In fact, the load associated with this filter doesn't scale at all with the change rate. This is very important, as most block-level change tracking schemes result in high processor and memory overhead as well as a disk footprint that scales with the change rate. So higher change rates tend to have greater impact, and in some cases very high change rates can result in the system running out of disk space. With DPM 2007, the impact of the volume filter on processor, memory, and disk is the same regardless of the change rate. Whether it's one percent or 1,000 percent, the change rate simply does not impact the low overhead associated with the filter.
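To make the idea concrete, here is a minimal Python sketch of a one-bit-per-block change map. It is an illustration only: the real DPM volume filter is a kernel-mode component, and the class name, method names, and block counts below are assumptions invented for this example. The point it shows is that the map's size depends on the number of blocks, not on how many writes occur.

```python
from __future__ import annotations


class ChangeBitmap:
    """Hypothetical change map: one bit per block, set when the block is written."""

    def __init__(self, block_count: int) -> None:
        self.block_count = block_count
        # Fixed-size buffer; its length depends only on the number of blocks.
        self._bits = bytearray((block_count + 7) // 8)

    def mark_written(self, block: int) -> None:
        """Record a write. Repeated writes to the same block cost nothing extra."""
        if not 0 <= block < self.block_count:
            raise IndexError(block)
        self._bits[block // 8] |= 1 << (block % 8)

    def changed_blocks(self) -> list[int]:
        """Return the indexes of all blocks written since the last reset."""
        return [b for b in range(self.block_count)
                if self._bits[b // 8] & (1 << (b % 8))]

    def reset(self) -> None:
        """Clear every bit, as would happen after a synchronization."""
        self._bits = bytearray(len(self._bits))

    def footprint_bytes(self) -> int:
        """Memory used by the map itself."""
        return len(self._bits)


bitmap = ChangeBitmap(block_count=1_000_000)
for block in (7, 42, 42, 999_999):      # block 42 is written twice
    bitmap.mark_written(block)

print(bitmap.changed_blocks())           # [7, 42, 999999]
print(bitmap.footprint_bytes())          # 125000, no matter how many writes occur
```

Because a write only sets a bit that already exists, a volume that is rewritten many times over uses exactly as much tracking memory as one that is barely touched.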
Tracking changed data is only one part of the story, however. DPM 2007 also leverages VSS on the protected server to create application-consistent backup images that can be recovered reliably. VSS was developed by Microsoft to provide the backup infrastructure for Windows XP and Windows Server® 2003, and to serve as a mechanism for creating consistent point-in-time copies of data (shadow copies). VSS can produce consistent shadow copies by coordinating reads and writes with business applications, file-system services, backup applications, fast-recovery solutions, and storage hardware. As a result, VSS is arguably the most consistent and reliable mechanism for producing application-consistent backups on the market today. Several features in Windows Server 2003 use VSS, including Shadow Copies for Shared Folders.
Most products that use VSS for application backups create a VSS replica and leave it in place on the protected server to track changes and get an application-consistent image for backup. Unfortunately, keeping the replica in place on the protected server at all times has some significant performance implications, as a copy on write occurs to keep the VSS replica up-to-date each time the application writes to disk. This can create as much as 25 percent processor overhead and has a significant impact on write times.
Since DPM tracks changes via its volume filter, it does not need to keep a VSS replica in place 100 percent of the time. Instead, DPM creates a VSS replica when a full backup occurs, then overlays the volume filter bitmap to identify only the changed data in the replica. This changed data is moved over to the DPM server, where it is used to create a recovery point. The VSS replica on the protected server is then deleted and the volume filter is reset. Block-level change tracking then continues with little overhead and no need for a VSS replica or its associated resource use.
This solution guarantees that recovery points will result in a restorable application image, and it reduces the amount of data moved between the protected server and the DPM server.
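The synchronization cycle just described can be sketched in a few lines of Python. This is a simplified model, not DPM code: the volume is a list of blocks, the change map is a list of booleans, the DPM server's copy is a dictionary, and the function name synchronize is hypothetical. A plain list copy stands in for the VSS replica.

```python
def synchronize(volume, changed, server_copy):
    """Run one synchronization and return the blocks that were transferred."""
    # 1. Take a temporary point-in-time copy (stands in for the VSS replica).
    shadow = list(volume)
    # 2. Overlay the change map so only the blocks marked as changed are kept.
    delta = {i: shadow[i] for i, dirty in enumerate(changed) if dirty}
    # 3. Move just those blocks to the backup server's copy.
    server_copy.update(delta)
    # 4. Discard the temporary copy and reset change tracking.
    del shadow
    for i in range(len(changed)):
        changed[i] = False
    return delta


volume = [b"a", b"b", b"c", b"d"]
server_copy = dict(enumerate(volume))    # initial full copy on the server
changed = [False] * len(volume)

volume[2] = b"C"                         # the application writes one block
changed[2] = True                        # the filter records the write

sent = synchronize(volume, changed, server_copy)
print(sent)                              # {2: b'C'}
```

Only one of the four blocks crosses the wire, yet the server's copy ends up identical to the volume, which is the property the article relies on when it describes "full" backups that move only changed data.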
With DPM 2007, a full backup does not move a full copy of the data from the protected server to the DPM server. Instead, it moves only the changed data. So a daily full backup on a 100GB Exchange Storage Group with a 10 percent change rate per day would be only 10GB.
This reduction in the size of full backups translates into very efficient storage on the DPM server when a snapshot is created to capture a recovery point. Each snapshot consists of the changed data that made up the last full backup, and it occurs automatically when a full backup is performed. When dealing with applications, these snapshots also include any intervening incremental backups that have occurred.
Incremental backups are essentially log backups in DPM, so every time you take a full backup of an application, the changed data is moved over to the DPM server and is bundled with any intervening log backups to create a single snapshot that contains multiple recovery points. In the case of the 100GB Exchange Storage Group with a 10 percent daily change rate described earlier, if we assume a daily full backup and incremental backups every 15 minutes, a snapshot would include 10GB of changed data and approximately 5-10GB of application logs. But it's important to realize this 15 to 20GB snapshot would contain up to 97 recovery points!
Full backups must occur at least once per week, and the total number of snapshots is limited to 512 for application servers and 64 for file servers. So, performing weekly full backups and incremental backups every 15 minutes would result in 673 recovery points per snapshot and more than 340,000 recovery points if all 512 snapshots were used. While it's unlikely anyone would ever have a need for so many recovery points on disk, it does demonstrate the flexibility and scalability of DPM with regard to job scheduling and point-in-time recovery.
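The recovery-point figures quoted above can be reproduced with a short calculation. The sketch below assumes that each snapshot holds one full backup plus every incremental backup taken during the interval between full backups; the function name is made up for this example, and the 512-snapshot limit is the application-server figure from the text.

```python
def recovery_points_per_snapshot(hours_between_fulls, minutes_between_incrementals=15):
    """Count the full backup plus every incremental taken in the interval."""
    incrementals = hours_between_fulls * 60 // minutes_between_incrementals
    return incrementals + 1


daily = recovery_points_per_snapshot(24)        # daily full backups
weekly = recovery_points_per_snapshot(24 * 7)   # weekly full backups
max_snapshots = 512                             # limit for application servers

print(daily)                    # 97
print(weekly)                   # 673
print(weekly * max_snapshots)   # 344576
```

With daily full backups the count is 96 incrementals plus the full backup, and with weekly full backups it is 672 plus one, which is where the "more than 340,000" total comes from once all 512 snapshots are in use.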
Including log backups with full backups allows DPM to efficiently store a very large number of recovery points in a very small amount of disk space. But there is some overhead associated with this process, and the process is not technically compression or even single instance storage (SIS), though there are some data de-duplication components to the process, namely the block-level change tracking. Whatever the label happens to be, DPM does a very good job of using a small amount of storage for each recovery point created. More importantly, because DPM does not use traditional compression, single instance storage, or de-duplication to perform this task, there is room for additional efficiencies if a compression, SIS, or de-duplication storage platform is used to provide storage to the DPM server.
Zero Data Loss Recovery and CDP
Perhaps one of the most exciting changes in DPM 2007 is its ability to perform lossless recoveries of Microsoft applications. This is made possible by the block-level change tracking and VSS architecture as well as deep integration with existing application logs on the protected server.
Generally speaking, there are two approaches to data protection and recovery in the market today:
Replication-Based Solutions This is what most customers traditionally think of as Continuous Data Protection (CDP). Replication-based data protection involves moving changes on the protected server over to the backup server as they occur. This is a disk-based solution that purports to allow recovery to any point in time with zero data loss.
While this is a compelling claim, there are a few very real disadvantages associated with this approach. First and foremost, a replication solution has no concept of application state, nor does it have any awareness of the moving parts in an application. So it doesn't flush buffers or account for data that may be in memory or in transit when the backup is occurring. This means that replication-based recovery points are often not application-consistent. You can perform a recovery, but the application doesn't actually come back in a usable state. This issue is so prevalent that some replication-based CDP vendors have notes in their documentation telling you to try another recovery point for an application if the current one doesn't work.
In addition to these application-consistency issues, replication-based CDP solutions are generally quite expensive, have significant network and processor overhead, and require fairly large amounts of storage, which may well be located on proprietary hardware. Replication-based data protection solutions are also not particularly well-suited to archiving or off-site tape backups, both of which are a requirement for many customers.
Snapshot-Based Solutions This is a traditional tape backup approach to data protection in which point-in-time snapshots are created on external media that can then be used for point-in-time recoveries. The recovery point objective (RPO) is determined by the frequency of the snapshots, so a daily backup can result in up to 24 hours of data loss. The advantage of snapshots is that they are cheap, easy to manage, have a more limited impact on the network, and are well-suited to disk or tape archiving. They also tend to result in application-consistent images because you have time to prepare the application prior to the snapshot being taken, so you have less risk that any moving parts of the application will be missed when the backup occurs.
One recent trend in snapshotting is near-CDP solutions. These are snapshot solutions in which the frequency of snapshots is increased so that the RPO becomes very low. Some products now measure their RPO in minutes and take such frequent snapshots that they have dropped the near-CDP moniker entirely and now call themselves CDP products. They still lose some minutes' worth of data in the case of a restore, however.
DPM 2006 was firmly in the near-CDP snapshotting camp, but DPM 2007 is unique in its approach to CDP. The new version of DPM takes snapshots up to every 15 minutes using VSS and its block-level change tracking technology to only move the changed data from the VSS snapshot on the protected server. Thus, individual point-in-time snapshots are very small, as they only consist of block-level changes. This is essentially a very advanced snapshotting approach. The new innovation in DPM is deep application integration for Exchange, SQL, and SharePoint. It's so deep that DPM is aware of the app logs and is able to roll forward those logs in the case of a recovery. As long as your app logs are available (they should be if you follow best practices and have them on a separate drive), DPM will be able to roll them forward from the last point-in-time snapshot for a lossless recovery (see Figure 3).
Figure 3 Recovery options in the Data Protection Manager Administrator Console
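The roll-forward idea can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not a description of how Exchange or SQL Server logs are actually structured: the snapshot is a dictionary of state plus the sequence number of the last change it contains, the log is a list of (sequence, key, value) records sorted by sequence number, and the function name recover is hypothetical.

```python
def recover(snapshot, log, up_to=None):
    """Restore the snapshot, then replay log records written after it."""
    state = dict(snapshot["data"])
    for seq, key, value in log:
        if seq <= snapshot["last_seq"]:
            continue                 # already captured by the snapshot
        if up_to is not None and seq > up_to:
            break                    # stop at the requested point in time
        state[key] = value
    return state


snapshot = {"last_seq": 2, "data": {"inbox": 10, "sent": 4}}
log = [
    (1, "inbox", 10),
    (2, "sent", 4),
    (3, "inbox", 11),    # written after the last snapshot
    (4, "drafts", 1),    # written after the last snapshot
]

print(recover(snapshot, log))            # {'inbox': 11, 'sent': 4, 'drafts': 1}
print(recover(snapshot, log, up_to=3))   # {'inbox': 11, 'sent': 4}
```

Replaying every surviving log record brings the state up to the moment of failure, which is the lossless case; stopping early gives a point-in-time recovery between snapshots.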
In many ways, this hybrid approach to CDP is the best of both worlds. It offers the application consistency and long-term archive support of a snapshotting solution with the zero data loss features of a replication product, while avoiding many of the disadvantages of either solution.
Support for Disk and Tape Backups
No discussion of the new features of DPM 2007 would be complete without mentioning the addition of tape support to the product. DPM 2006 only supported Disk-to-Disk (D2D) backups, with data being moved from disk on the protected server to disk on the DPM server. DPM 2007 includes support for both disk and tape media. Therefore, data can be moved from disk on the protected server to disk attached to the DPM server (D2D), to tape attached to the DPM server (D2T), or protected data can be staged to disk on the DPM server before being moved to tape for long-term retention (D2D2T).
DPM 2007 supports a wide variety of tape libraries and autoloaders, as well as a wide variety of different tape media types.
Tape support in DPM includes many advanced features, such as software-based tape media encryption at no additional cost, as well as basic encryption key management and tape retention and location tracking.
Wrap-Up
I've detailed several of the new features and enhancements found in System Center Data Protection Manager 2007. These features are designed to deliver consistent, reliable, and easily managed data protection for Microsoft platforms and applications, setting a new standard in Windows application and platform data protection.
In addition to the features outlined in this article, there are many more enhancements in the product such as Windows-integrated End User Recovery, flexible bandwidth throttling, and an advanced command-line interface. I encourage you to download the Beta 2 version of Data Protection Manager 2007 to see these and other features at work. For more information and an overview of all the products in the System Center family, please visit microsoft.com/systemcenter. From there you can navigate to any solution in the System Center family, including Data Protection Manager.
Calvin Keaton is currently the product planner for DPM. Formerly he managed the Microsoft IT Data Protection service. Calvin had six years of industry experience in datacenter operations and hardware prior to coming to Microsoft, including managing Data Center Site Operations teams for Compaq and, later, HP Services.
© 2008 Microsoft Corporation and CMP Media, LLC. All rights reserved; reproduction in part or in whole without permission is prohibited.