Windows IT Pro is the authoritative and independent resource for Windows NT, Windows 2000, Windows 2003, and Windows XP. It features a collection of resources and magazines for Windows IT professionals.
  
  


April 03, 2007

Exchange CDP

Is your Exchange organization getting the best protection possible?

CDP Products for Exchange
Asempra Business Continuity Server
Asempra, http://www.asempra.com/solutions
DigiVault
Lucid8, http://www.lucid8.com/product/digivault.asp
Double-Take
Double-Take Software, http://www.doubletake.com/products/double-take/default.aspx
Enterprise Rewinder
CA XOsoft, http://www.xosoft.com/products/f_Rewinder.shtml
FalconStor Message Recovery for Microsoft Exchange
FalconStor Software, http://www.falconstor.com/en/solutions/?pg=Application&sb=EmailApp&sm=MSMRE
HP StorageWorks for Microsoft Exchange Server
HP, http://h18006.www1.hp.com/storage/osmswindowsexchange.html?jumpid=reg_R1002_USEN
InfiniView
Mendocino Software, http://www.mendocinosoft.com/products.htm
LifeKeeper for Exchange
SteelEye Technology, http://www.steeleye.com/products/exchange.html
MailShadow 2.0
Cemaphore Systems, http://www.cemaphore.com/mailshadow.php
Microsoft Data Protection Manager 2006
Microsoft, http://www.microsoft.com/systemcenter/dpm/default.mspx
NearPoint for Microsoft Exchange
Mimosa Systems, http://www.mimosasystems.com/html/exchange_archive.htm
Neverfail for Microsoft Exchange
Neverfail Group, http://www.neverfailgroup.com/products/app-modules/exchange.aspx
RecoverPoint
EMC, http://software.emc.com/products/software_az/recoverpoint.htm?hlnav=T
Symantec Backup Exec 11d for Windows Servers
Symantec, http://www.symantec.com/enterprise/products/overview.jsp?pcid=1018&pvid=57_1
TimeData for Exchange
TimeSpring Software, http://timespring.com/TS_TD_Exchange.php

Continuous data protection (CDP) systems have gotten a lot of attention in the Exchange Server world over the last year or two. After the devastation caused by Hurricanes Rita and Katrina, many organizations that had previously been satisfied with their disaster recovery arrangements started to look for better protection. Conventional backup systems are like the spare tire on your car: When your tire goes flat, you need a properly inflated, usable spare and the related tools on hand, but changing a tire is still a hassle—especially if you have to do it in bad weather, alongside a busy road, or under other less-than-ideal conditions. It would be much nicer to have a dashboard button that, when pushed, would automatically fix your tire for you. That's the basic notion behind CDP: making disaster recovery easier with less data loss by increasing the frequency at which data items such as Exchange databases are backed up.

Do you need CDP? Any time you add complexity to a network or system, you're increasing the risk of its failure. However, your organization might benefit from the degree of protection that CDP offers.

RTO and RPO
Before we discuss how CDP works and how to implement it, we need to define two common terms in the disaster recovery world: recovery time objective (RTO), a measurement of how long you're willing to wait for a restore; and recovery point objective (RPO), the point in time to which you want to recover. Think of the RPO as the maximum amount of data you're willing to lose. For example, if you use a CDP product that copies data every hour, you should be able to restore to within the last hour, losing only up to an hour of data.
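To make the RPO arithmetic concrete, here is a minimal Python sketch. It assumes a hypothetical product that completes a copy exactly on the hour; the function names and timestamps are illustrative and aren't taken from any CDP product.

```python
from datetime import datetime, timedelta


def recovery_point(copy_times, failure_time):
    """Return the most recent copy completed at or before the failure."""
    usable = [t for t in copy_times if t <= failure_time]
    if not usable:
        raise ValueError("no copy exists before the failure")
    return max(usable)


def data_loss(copy_times, failure_time):
    """Amount of work lost: everything written after the last usable copy."""
    return failure_time - recovery_point(copy_times, failure_time)


# Hypothetical schedule: copies complete at 08:00, 09:00, 10:00, and 11:00.
start = datetime(2007, 4, 3, 8, 0)
hourly_copies = [start + timedelta(hours=h) for h in range(4)]

failure = datetime(2007, 4, 3, 11, 45)
loss = data_loss(hourly_copies, failure)
print(loss)  # 0:45:00
```

With hourly copies the loss can never exceed one hour, so this schedule satisfies an RPO of one hour but not an RPO of 15 minutes.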

CDP solutions are generally designed to do two things: minimize the RTO by providing tools for quickly restoring a copy of the data; and provide the finest possible granularity for RPO by using either continuous copying of data or frequent intermediate replication checkpoints.

Host-Based vs. Storage-Based
You'll discover some crucial differences in how CDP products operate. The first difference is that some products protect data by using software that runs on the server you want to protect, whereas other products operate beneath the OS's notice because they run on a SAN controller. The first class of products is called host-based and the second class is called storage-based.

Most host-based systems use what's known as a file system filter driver. The system installs a driver that sits below the Windows I/O management subsystem (itself a part of the kernel), tracking which data items on a given volume are written to and copying those data items to a remote system over the network. Host-based CDP products typically protect a set of files or folders, although some can protect entire volumes. The responsibility generally lies with you to make sure that host-based CDP software is pointed at the correct set of folders to capture your Exchange databases and transaction logs, although some products are more Exchange-aware than others.

Some host-based systems implement transaction-level replication by monitoring changes to the Exchange database with Messaging API (MAPI). These products often have the advantage of not requiring any software on the Exchange server itself; however, they typically require a gateway server that aggregates the transactions and acts as the replication target.

Storage-based CDP has the advantage of taking place on the SAN; you don't install or maintain drivers or other components on the servers. In theory, these systems should have a minimal effect on Exchange because they function without any connection to Exchange or Windows. In practice, storage-based systems have three primary drawbacks. First, they're expensive. Second, you must be using SANs (and generally you have to have identical SAN controllers on either end of the connection—a further expense). Third, they sometimes limit the number of users you can host on a protected Exchange server because the way they copy data to the remote system creates disk latency.

Synchronous vs. Asynchronous
CDP systems copy data in one of two ways. Typically, when the Exchange Information Store (IS) makes a write request to the Windows I/O manager, the IS continues its work without waiting for the write to complete; at the time of the write request, the IS registers a callback function, and the I/O manager calls that function when the write finishes. This method is known as asynchronous I/O because the completion of the write is disconnected from what the requester is doing.

In synchronous I/O systems, the requester issues a write request, then waits for the write to finish. Synchronous I/O systems are simpler to code than asynchronous I/O systems, and it's easier to predict their behavior. However, they tend to be slower than asynchronous I/O systems, which is why Exchange uses asynchronous I/O.
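The difference between the two styles can be sketched in a few lines of Python. This is only an analogy: it uses Python's standard thread pool rather than the Windows I/O manager, and slow_write, synchronous_write, and asynchronous_write are made-up names.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_write(data: bytes) -> int:
    """Stand-in for a disk write; the sleep simulates device latency."""
    time.sleep(0.1)
    return len(data)


def synchronous_write(data: bytes, events: list) -> int:
    """The requester blocks until the write has finished."""
    written = slow_write(data)
    events.append("write finished")
    events.append("requester continues")
    return written


def asynchronous_write(pool: ThreadPoolExecutor, data: bytes, events: list):
    """The requester registers a callback and carries on immediately."""
    future = pool.submit(slow_write, data)
    future.add_done_callback(lambda f: events.append("callback: write finished"))
    events.append("requester continues")
    return future


sync_events, async_events = [], []
synchronous_write(b"log record", sync_events)
with ThreadPoolExecutor(max_workers=1) as pool:
    asynchronous_write(pool, b"log record", async_events)
# Leaving the with-block waits for the worker, so the callback has run by now.
print(sync_events)
print(async_events)
```

In the synchronous case the requester resumes only after the write; in the asynchronous case it resumes first and learns about completion later through the callback.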

These concepts might seem esoteric, but they play a central role when you're deciding which CDP solution to deploy. Microsoft's support policy for CDP and replication products makes the distinction explicit. Think about what happens when data from the source system is copied to a replica, whether it's on the same machine, on a SAN, or across a LAN or WAN. When the source issues a write request, the data has to be written to the local disk, but it also has to be copied to the replica. If the source system's write request doesn't complete until the remote write is finished, that's a synchronous CDP operation. If, as is more common, the source write and remote write take place independently (i.e., they're not coupled in a predictable sequence), that's an asynchronous CDP operation.
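A toy model shows what the coupling means for the replica. The classes below are a simplified sketch, not how any particular product is built: writes are plain strings, and the "remote" side is just a second list.

```python
from collections import deque


class SynchronousReplica:
    """The source write completes only after the remote write has finished."""

    def __init__(self):
        self.local, self.remote = [], []

    def write(self, block: str) -> None:
        self.local.append(block)
        self.remote.append(block)  # write() returns only once this is done


class AsynchronousReplica:
    """The source write completes at once; remote writes happen later."""

    def __init__(self):
        self.local, self.remote = [], []
        self.pending = deque()  # writes not yet sent to the replica

    def write(self, block: str) -> None:
        self.local.append(block)
        self.pending.append(block)

    def ship(self, max_blocks: int) -> None:
        """Send up to max_blocks queued writes to the replica, in order."""
        for _ in range(min(max_blocks, len(self.pending))):
            self.remote.append(self.pending.popleft())


sync_pair, async_pair = SynchronousReplica(), AsynchronousReplica()
for block in ["w1", "w2", "w3"]:
    sync_pair.write(block)
    async_pair.write(block)
async_pair.ship(max_blocks=2)  # the link kept up with only two writes

print(sync_pair.remote)
print(async_pair.remote, list(async_pair.pending))
```

If the source failed at this point, the synchronous replica would hold all three writes, whereas the asynchronous replica would be missing whatever was still queued. That gap is the price of not making the source wait on the link.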

What Microsoft Says
As you might expect, Microsoft has a pretty clear stance on the use of CDP products. The Microsoft article “Multi-site data replication support for Exchange 2003 and Exchange 2000” at http://support.microsoft.com/kb/895847 describes what's supported:

  • If you use an asynchronous solution—whether host- or storage-based—Microsoft expects you to use the CDP vendor as the first line of support for the replicated data. If you encounter problems, Microsoft might ask you to show that the problem isn't caused by the CDP technology, possibly by removing it.
  • If you use storage-based synchronous replication, Microsoft's policy depends on whether you're using a geographically dispersed, or stretched, cluster. The bottom line is that if you're using a stretched cluster, all your hardware must be certified for use in stretched clusters (according to the searchable list at http://www.microsoft.com/whdc/hcl/search.mspx). For solutions that are on the list, Microsoft provides full support, except that the storage or cluster vendor must provide support for the storage and replication components of your deployment. If you're not using a stretched cluster, Microsoft recommends but doesn't require that you use hardware that appears on the certification list, but the hardware and storage vendors are still on the hook for primary support.
  • If you're using host-based synchronous replication, it's essentially the same as an asynchronous solution, unless you happen to use a configuration that appears on the Windows Hardware Compatibility List (WHCL), in which case it's supported like a storage-based synchronous product.
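One way to keep the three cases straight is to write them down as a decision function. The sketch below simply paraphrases the bullets above; it isn't an official Microsoft tool, and the function name, parameters, and wording of the returned strings are invented for illustration.

```python
def support_summary(mode: str, location: str,
                    stretched_cluster: bool = False,
                    on_certified_list: bool = False) -> str:
    """Summarize the support position described in the bullets above.

    mode is "async" or "sync"; location is "host" or "storage".
    on_certified_list means the configuration appears on the relevant
    Microsoft list (the cluster certification list or the WHCL).
    """
    if mode not in ("async", "sync") or location not in ("host", "storage"):
        raise ValueError("mode must be async/sync and location host/storage")

    vendor_first = ("CDP vendor is first line of support; Microsoft might ask "
                    "you to remove the CDP product while troubleshooting")

    if mode == "async":
        return vendor_first
    if location == "host" and not on_certified_list:
        return vendor_first  # treated essentially like an asynchronous solution

    # Storage-based synchronous, or host-based synchronous in a listed config.
    if stretched_cluster:
        if on_certified_list:
            return ("Microsoft provides full support; storage or cluster vendor "
                    "supports the storage and replication components")
        return "hardware must be certified for stretched clusters"
    return ("listed hardware recommended but not required; hardware and "
            "storage vendors provide primary support")


print(support_summary("async", "host"))
print(support_summary("sync", "storage", stretched_cluster=True,
                      on_certified_list=True))
```

Treat the output as a reading aid only; the Microsoft article itself is the authority on what's supported.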

Is this confusing? Well, yes. As a practical matter, what these support statements mean is that Microsoft doesn't guarantee that it can help solve problems if those problems are caused by (or even influenced by) the use of CDP products. Microsoft will try to help, but if the problem can be traced to the CDP solution, or if “less disruptive troubleshooting” methods (a charming phrase!) don't identify the problem, you might have to remove your CDP solution to continue troubleshooting.

Microsoft also provides some deployment guidelines for CDP products at http://www.microsoft.com/technet/prodtechnol/exchange/guides/E2k3DataRepl. The guidelines state three basic criteria for choosing an asynchronous solution, which I'll quote here:

  • It can maintain the write-order consistency of all devices in a storage group, including being continuously consistent with each other;
  • It has been proven to be recoverable, preferably in both a lab and a production environment;
  • It is being provided by a vendor with a support plan in place for the replicated data.
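The first criterion, write-order consistency, is the easiest to get wrong, so a small illustration might help. The sketch below uses a deliberately simplified model, an assumption of this example rather than Microsoft's definition: each write is a (device, sequence) pair, and a replica counts as consistent if it has applied the source's writes in the same order without skipping any.

```python
def is_write_order_consistent(source_writes, replica_writes):
    """True if the replica holds an in-order, gap-free prefix of the source's writes."""
    return replica_writes == source_writes[:len(replica_writes)]


# Writes to the log and database volumes of one storage group, in source order.
source = [("log", 1), ("db", 1), ("log", 2), ("db", 2)]

lagging_but_consistent = [("log", 1), ("db", 1), ("log", 2)]
database_ahead_of_log = [("log", 1), ("db", 1), ("db", 2)]

print(is_write_order_consistent(source, lagging_but_consistent))  # True
print(is_write_order_consistent(source, database_ahead_of_log))   # False
```

The second replica is the dangerous one: the database volume contains a change whose log record never arrived, so the devices in the storage group are no longer consistent with each other.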

CDP for Exchange 2003
How can you get continuous protection for Exchange Server 2003? The answer depends on what you're trying to accomplish and how much you can afford to spend. Several vendors offer CDP solutions for Exchange 2003, including EMC, Double-Take Software (formerly NSI Software), HP, SteelEye Technology, and XOsoft (now part of CA). Some SAN vendors also offer hardware-based solutions that work with Exchange.

When you're choosing a solution, the big things to consider involve what happens after you have a failure. Provided that you have enough bandwidth, and that you carefully monitor the replication solution, most products will do a sufficient job of replicating your data from one location to another. However, Exchange 2003 doesn't provide any native support for failing over operations to a remote site unless you're using clustering. Therefore, a true CDP solution will need to have some kind of failover mechanism, whether the product includes it or you have to do it yourself.

Failover requires several interlocking steps, including updating your mail exchanger (MX) record to point to the recovery server so that inbound mail flows, rehoming mailboxes by adjusting the homeMDB attribute of the affected users' objects in Active Directory (AD), and updating Outlook client profiles to point at the new server. When you're evaluating CDP products, be sure to test each product by failing production operations over to it, then failing them back. If you can't do this easily (or if the product doesn't meet Microsoft's three criteria above), you probably shouldn't use it.
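The failover and failback drill can be rehearsed on paper before you touch production. The following sketch models only the three steps named above as plain data; it makes no DNS, AD, or Outlook calls, and the class, the function, and the server names EXCH-PROD and EXCH-DR are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MailEnvironment:
    mx_target: str          # host that the MX record points to
    home_mdb: dict          # user -> server label (stand-in for the homeMDB DN)
    outlook_profiles: dict  # user -> server named in the Outlook profile


def fail_over(env: MailEnvironment, failed: str, recovery: str) -> list:
    """Repoint everything that references the failed server; return the steps taken."""
    steps = []
    if env.mx_target == failed:
        env.mx_target = recovery
        steps.append(f"MX record now points to {recovery}")
    for user, server in list(env.home_mdb.items()):
        if server == failed:
            env.home_mdb[user] = recovery
            steps.append(f"homeMDB rehomed for {user}")
    for user, server in list(env.outlook_profiles.items()):
        if server == failed:
            env.outlook_profiles[user] = recovery
            steps.append(f"Outlook profile updated for {user}")
    return steps


users = {"alice": "EXCH-PROD", "bob": "EXCH-PROD"}
env = MailEnvironment("EXCH-PROD", dict(users), dict(users))
original = MailEnvironment("EXCH-PROD", dict(users), dict(users))

out_steps = fail_over(env, "EXCH-PROD", "EXCH-DR")   # fail over
back_steps = fail_over(env, "EXCH-DR", "EXCH-PROD")  # fail back
print(len(out_steps), env == original)  # 5 True
```

The useful property to check in a real test is the same one the last line checks here: after failing over and failing back, the environment should be exactly where it started.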

CDP for Exchange 2007
Exchange Server 2007 marks a radical departure from Exchange 2003 in many ways. One of the most important changes is that it includes native support for two different CDP methods: local continuous replication (LCR) and cluster continuous replication (CCR).

LCR copies storage groups to different disks on the same server. This type of replication helps protect against problems with the original storage group's physical storage, and it protects against some types of on-disk corruption, such as failed or corrupted writes. LCR replicas provide fast restores (provided you've fixed the problem that caused the original failure), which is great if you have a short RTO, and they might allow you to take fewer full backups to secondary storage such as tape. You can create backups from the LCR replica instead of from the production database, which can be a significant time-saver. However, LCR failover requires manual action.

CCR is designed to provide full replication of data between nodes in a cluster. The way CCR works is ingenious: You set up a two-node Exchange cluster that uses a special network share called a file share witness to keep track of the cluster state. The witness can be on any server accessible across the network, although Microsoft recommends using a server in the same AD site as the cluster. The two cluster nodes don't have to share any storage. All previous versions of Exchange require the use of shared storage in clusters. To copy data from one node to the other, the CCR feature uses the same basic log-copying mechanism as LCR.
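The log-copying idea can be sketched with ordinary files. This is a rough illustration, not Exchange's implementation: it assumes that only closed transaction logs are copied and that the log still being written is left alone, and the directory layout and file names are placeholders.

```python
import shutil
import tempfile
from pathlib import Path


def ship_closed_logs(active_dir: Path, passive_dir: Path,
                     open_log: str = "E00.log") -> list:
    """Copy closed logs that the passive copy doesn't have yet; return their names."""
    shipped = []
    for log in sorted(active_dir.glob("*.log")):
        if log.name == open_log:
            continue  # still being written, so not yet eligible for copying
        target = passive_dir / log.name
        if not target.exists():
            shutil.copy2(log, target)
            shipped.append(log.name)
    return shipped


with tempfile.TemporaryDirectory() as tmp:
    active, passive = Path(tmp, "active"), Path(tmp, "passive")
    active.mkdir()
    passive.mkdir()
    # Two closed logs plus the currently open log (illustrative names).
    for name in ["E0000000001.log", "E0000000002.log", "E00.log"]:
        (active / name).write_bytes(b"\0" * 1024)
    first_pass = ship_closed_logs(active, passive)
    second_pass = ship_closed_logs(active, passive)

print(first_pass)   # ['E0000000001.log', 'E0000000002.log']
print(second_pass)  # []
```

A real replica also has to verify each copied log and replay it into its own copy of the database; the sketch stops at the copy step.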

Both of these technologies work with a single database per storage group. Therefore, they're best suited for protecting high-value data instead of entire servers. In using LCR and CCR, you're also limited in the ways you can protect public folder databases; this isn't a big problem because public folders already include their own replication mechanism. CCR requires that you use the same location and paths for the storage groups and databases on both nodes, just as conventional clustering does.

Both LCR and CCR benefit from a little-noticed Exchange 2007 change: Transaction logs are now exactly 1MB, down from the 5MB size we've always had before. The smaller size makes replication performance more efficient.
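A little arithmetic suggests why the smaller log helps. The sketch assumes that a log becomes available for copying only once it fills and closes, and it ignores any other trigger that might close a log early; the change rate is a made-up figure.

```python
def minutes_to_fill(log_size_mb: float, change_rate_mb_per_hour: float) -> float:
    """Minutes until a log of the given size fills at a steady change rate."""
    if change_rate_mb_per_hour <= 0:
        raise ValueError("change rate must be positive")
    return 60 * log_size_mb / change_rate_mb_per_hour


quiet_rate = 10  # MB of log traffic per hour on a quiet server (hypothetical)
old_wait = minutes_to_fill(5, quiet_rate)  # 5MB logs
new_wait = minutes_to_fill(1, quiet_rate)  # 1MB logs
print(old_wait, new_wait)  # 30.0 6.0
```

Under these assumptions, a quiet server would wait 30 minutes to close a 5MB log but only 6 minutes to close a 1MB log, so less data sits uncopied at any moment.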

And the Winner Is . . .
The majority of CDP solutions for Exchange use the host-based asynchronous approach. Products that take this approach generally offer the best balance between deployment flexibility, protection capability, and ownership cost. After all, buying software to protect your Exchange servers is almost guaranteed to be less expensive than buying a new SAN to use as a replication target!

With that fact in mind, you have some things to think about when choosing a host-based asynchronous CDP solution. First, you must clearly define your RTO and RPO and decide which is more important. Would you rather have extremely fast restores or lose less data? If restore speed is crucial, you might want to design a system that uses Microsoft Volume Shadow Copy Service (VSS) or a similar point-in-time copy mechanism as the primary means of backing up your data.

You could design a hybrid solution. For example, you could use your favorite backup utility to stream an Exchange backup to a disk file, then use the CDP solution to replicate that file to a remote site. This approach avoids many of the pitfalls of direct Exchange replication, but its RPO granularity is limited by the interval at which you take the original backups. Still, for many companies, such an approach is better than keeping local backups on site.

To further complicate matters, Microsoft recently introduced a beta of System Center Data Protection Manager Version 2.0 (DPMv2). The original version could be used with Exchange, but only if you wanted to use a conventional backup program to stream Extensible Storage Engine (ESE) data to a backup disk file; DPM then could replicate the file to the DPM server. DPMv2 can directly protect Exchange, but I haven't had a chance to test it thoroughly yet.

It's too early to tell which CDP vendors will update their products to work with Exchange 2007, especially given that Exchange 2007 includes its own CDP functionality. Taking the time now to understand how the technology works will help you have the necessary tools on hand when your system's tires go flat.




Reader Comments
CDP is great, but you have to remember that many solutions on the market are simply copying data from one place to another - paying no attention to whether the data is corrupt in the first place. So essentially, you're backing up garbage data. Additionally, we all pretty much need a low RTO (who can wait for a backup to restore?) and at the same time a high-frequency RPO (who can stand lost data?). The best types of CDP work at the application level, incorporating technology that knows when things go wrong, so you can prevent copying of corrupt data and fix problems before they become disasters. Additionally, CDP that incorporates HA (high availability) is the ultimate solution. Using VMs (virtual machines) enables us to dramatically reduce the RTO, as we can simply stand up a new VM that is already waiting and then unwind the backup data to it - then we're ready to go... AppAssure Software (http://www.AppAssure.com) has a great solution that is best of both worlds - RTO and RPO...

dsoschin March 17, 2008


 
 Windows IT Pro is a Division of Penton Media Inc.
 Copyright © 2008 Penton Media, Inc., All rights reserved. Terms and Use | Privacy Statement | Reprints and Licensing