January 17, 2006 12:00 AM

Keep Tabs on Replications

Monitor data-sync problems between replication partners
Windows IT Pro
InstantDoc ID #48746
Downloads
48746.zip

Lately I've noticed more companies using replication to copy data to remote sites. Another common replication scenario involves copying data from a source to a destination location before a company switches over to a new server or storage area. Replication often occurs over VPNs and sometimes over slower network links. If replication doesn't work correctly and synchronizations stall or fail outright, the systems administrator or data owner might be unaware of the problems until users complain about obsolete or missing data. I've written a batch script, ReplicationTest, that makes such difficulties easier to deal with by notifying administrators and data owners of data-variation problems that occur during replication and that warrant further investigation. Let's delve into the workings of the script.

The Problem: Compare Source and Destination Files
Often, data on the master or source location is in a writeable state, whereas the remote (i.e., destination) location is read-only. Therefore, changes to files at the source location aren't reflected at the destination location until replication finishes. If you take a snapshot of the directory size and number of files on the source, these values could differ from those on the destination until replication is triggered, either by changes or by a time-of-day scheduled copy operation. Consequently, comparing the source and destination locations at any point in time might reveal small differences in directory size and file count between the two areas, even if replication is working correctly. However, if the size and file count on the master and destination locations differ significantly, you probably need to investigate further to determine whether you have a replication configuration problem or a network link that can't adequately support replication traffic.

ReplicationTest compares the total directory size and number of files on the source and destination devices within a predetermined threshold of variation. To do this, the script first invokes the Diruse command-line utility (diruse.exe), which displays directory-size information, to capture the point-in-time directory sizes at the source and destination locations. You can find Diruse in the Windows XP and Windows Server 2003 Support Tools and in the Microsoft Windows 2000 Resource Kit. The syntax for the Diruse command is

Diruse.exe /MB \\server 
  \Targetlocation |
  find " TOTAL:" 

(Some commands in this article wrap to several lines because of space limitations; you should type commands on one line.) ReplicationTest uses a default threshold of 10 percent, although this value is configurable. If the destination-location values vary by more than 10 percent compared with the source location, the script notifies the administrator of the discrepancy by sending a page or email message.
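The threshold test described above can be sketched in a few lines. This is a Python illustration, not the ReplicationTest batch code. It assumes that the Diruse TOTAL line starts with the size followed by the file count (for example, `1000.00  500  TOTAL: \\server\share`); check that layout against your own Diruse output. The function names are hypothetical.

```python
def parse_total_line(line):
    """Return (size, file_count) from an assumed Diruse TOTAL line."""
    # Assumed layout: "<size>  <files>  TOTAL: <path>".
    # Thousands separators, if present, are stripped before conversion.
    fields = line.replace(",", "").split()
    return float(fields[0]), int(fields[1])


def percent_variation(source, destination):
    """Percentage by which the destination value differs from the source value."""
    if source == 0:
        return 0.0 if destination == 0 else float("inf")
    return abs(source - destination) * 100.0 / source


def exceeds_threshold(source_line, dest_line, threshold_pct=10.0):
    """True when either size or file count varies by more than the threshold."""
    src_size, src_files = parse_total_line(source_line)
    dst_size, dst_files = parse_total_line(dest_line)
    return (percent_variation(src_size, dst_size) > threshold_pct
            or percent_variation(src_files, dst_files) > threshold_pct)
```

With a source of 1000.00MB and a destination of 850.00MB, the size variation is 15 percent, so the default 10 percent threshold is exceeded and a notification would be warranted.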

In most cases, merely comparing the size and file count on the replication partners provides an adequate comparison of the file differences between them. However, if any of the replication partners has limited changes—for example, the overall size or file count changes infrequently, new files are seldom introduced, or old files are overwritten with new versions—this approach might not work well for you. In the replication testing I've done for clients, I've found that comparing total directory size and file count can be an accurate way to gauge data concurrency. The alternative—performing file-for-file comparisons—is a high-overhead operation compared with the snapshot-based technique that ReplicationTest uses.

ReplicationTest Tasks
ReplicationTest performs three primary tasks: It launches Diruse commands on the replication partners simultaneously, performs math operations to determine whether the variation in the Diruse results exceeds the threshold, and, if the threshold is exceeded, sends a pager or email notification to the administrator.

Launching two Diruse commands simultaneously. If you launched the Diruse command on one share and waited for it to finish before launching it on the second location, the results could be skewed because of timing. If the commands were launched many minutes apart and replication was in progress, a false variation might result. Ideally, the script should launch the Diruse command for the source and destination areas simultaneously, then idle until both Diruse runs return their results and signal that the script can move ahead and compare them.

When you run a command such as Diruse in a script, the command finishes executing before script flow passes to the next command. We want to start Diruse on the source location and simultaneously launch a second instance against the destination location. There are several ways to accomplish this, but the method I've chosen is to have ReplicationTest create two scripts, then use the Start command to launch them. The output from these two spawned scripts is sent to two output files, then returned into the main script to make the directory-size and file-count comparisons.
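The idea of starting both measurements before waiting on either can be illustrated outside of batch. The following Python sketch is not the spawned-script technique that ReplicationTest uses. It starts every process first and only then collects output. The function name `run_concurrently` is hypothetical, and the command in the usage section is a stand-in for a real Diruse invocation.

```python
import subprocess
import sys


def run_concurrently(commands):
    """Start every command before waiting on any, then collect each output."""
    procs = [
        subprocess.Popen(cmd, stdout=subprocess.PIPE,
                         stderr=subprocess.STDOUT, text=True)
        for cmd in commands
    ]
    # All processes are already running here; communicate() waits for each
    # one to exit and returns its captured output in launch order.
    return [proc.communicate()[0] for proc in procs]


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere; in practice each entry
    # would be a Diruse command line for the source or destination share.
    stand_in = [sys.executable, "-c", "print('TOTAL placeholder')"]
    source_out, dest_out = run_concurrently([stand_in, stand_in])
    print(source_out.strip(), dest_out.strip())
```

Because both processes start within milliseconds of each other, the two snapshots describe nearly the same moment, which is the property the timing discussion above is after.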

ReplicationTest creates .bat and .tmp files to run and retain the Diruse run results. Therefore, the script needs a way to ensure that the created files have unique names to avoid accidentally overwriting files. For example, if you launched a second overlapping instance of a script and didn't use unique filenames for .tmp files, you could overwrite another temporary file. The sidebar "Naming Temporary or Output Files" explains several techniques for creating unique filenames. After ReplicationTest compares the information in the .tmp files, it performs a cleanup operation and removes the two spawned scripts and their two temporary output files.
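For comparison, here is how the same unique-name and cleanup requirement might be met in Python. This is a sketch only and is not one of the batch techniques covered in the sidebar. The `reptest_` prefix and the function names are hypothetical.

```python
import os
import tempfile


def make_unique_file(suffix, directory=None):
    """Create an empty file with a unique name and return its path."""
    # mkstemp creates the file atomically, so two overlapping runs
    # cannot be handed the same name.
    fd, path = tempfile.mkstemp(prefix="reptest_", suffix=suffix, dir=directory)
    os.close(fd)
    return path


def cleanup(paths):
    """Remove the temporary files, ignoring any that are already gone."""
    for path in paths:
        try:
            os.remove(path)
        except FileNotFoundError:
            pass
```

A run would call `make_unique_file(".tmp")` once per output file, then pass all of the returned paths to `cleanup` after the comparison finishes.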

A caveat about the Diruse utility is that it might fail when invoked inside a For command because double quotes are used in the command path. (Generally, you'd enclose a command path in double quotes to accommodate spaces.) The sidebar "Handling Spaces in Command Paths" discusses methods for dealing with spaces in command paths.

Perform math operations to analyze the variation. After the Diruse results are returned, the script must perform math operations to determine whether the variation exceeds the specified threshold. You could use the Windows Set command with the /a switch to perform simple calculations. However, Set /a has some constraints in dealing with nonintegers: It can't handle nonintegers as input and outputs only integers. It's likely the Diruse command will produce directory-size output that contains decimals, which Set /a doesn't support. I enabled ReplicationTest to perform the necessary math operations by doing some creative coding and invoking Mathomatic, a short Perl script I wrote to handle simple math operations. The script uses Mathomatic in a couple of ways; one of the more interesting uses is to perform the equivalent of an If x GTR y statement, as the ReplicationTest excerpt in Listing 1 shows. If you're new to Perl, you'll probably find it helpful to review the comments in the Mathomatic script.
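To see why decimal-aware comparison matters, consider the following Python sketch. It is not Mathomatic, which is a Perl script, and the function names are hypothetical. It contrasts an exact decimal greater-than test with the result you would get if the fractional parts were simply dropped so that integer-only math could be used.

```python
from decimal import Decimal


def gtr(x, y):
    """Return True when x is greater than y, compared as exact decimals."""
    return Decimal(x) > Decimal(y)


def gtr_truncated(x, y):
    """Same test after discarding the fractional parts of both values."""
    return int(Decimal(x)) > int(Decimal(y))


if __name__ == "__main__":
    # Values are passed as strings, the way they would arrive from a text file.
    print(gtr("10.6", "10.4"))
    print(gtr_truncated("10.6", "10.4"))
```

Here `gtr("10.6", "10.4")` is true, whereas `gtr_truncated("10.6", "10.4")` is false because both values collapse to 10. A variation that sits close to the threshold could be missed the same way.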

Send a pager or email notification. A useful notification mechanism is to send an email message, which you can direct either to an email address or to a pager or cell phone as a text message via an email address. ReplicationTest uses Blat, a popular command-line SMTP mailer program, to send the notification, as Listing 2 shows. You can download Blat (the current version is 2.5.0) at http://www.sourceforge.net. When you unzip the Blat download file, look for the syntax.html file in the \docs folder. This file contains descriptions of the utility's 80-plus command-line switches. Blat has more switch options than almost any other command-line utility I've used and even has its own Yahoo! discussion group (http://groups.yahoo.com/group/blat) of Blat devotees. If you're completely overwhelmed by Blat's many switch options, see the examples.txt file in the \docs folder for examples of a few simple switch options.
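As a rough point of comparison with the Blat approach, the notification could be composed and sent with Python's standard library. This sketch makes no claim about Blat's switches or about the contents of Listing 2. The addresses, the SMTP host, and the function names are hypothetical placeholders.

```python
import smtplib
from email.message import EmailMessage


def build_alert(sender, recipient, source, destination,
                variation_pct, threshold_pct=10.0):
    """Compose a short plain-text alert about a replication discrepancy."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = (f"Replication variation {variation_pct:.1f}% "
                      f"exceeds {threshold_pct:.1f}%")
    msg.set_content(
        f"Source: {source}\n"
        f"Destination: {destination}\n"
        f"Variation: {variation_pct:.1f}% (threshold {threshold_pct:.1f}%)\n"
    )
    return msg


def send_alert(msg, smtp_host, port=25):
    """Hand the message to an SMTP server; the host is site-specific."""
    with smtplib.SMTP(smtp_host, port) as server:
        server.send_message(msg)
```

Keeping the subject line short and self-explanatory helps when the recipient address is a pager or cell phone gateway that shows only the first few dozen characters.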

Windows is a trademark of the Microsoft group of companies. Windows IT Pro is used by Penton Media Inc. under license from owner.