Replicate Publisher changes to Subscribers and Subscriber changes back to the Publisher
SQL Server's advanced replication features let you synchronize data between databases, whether the databases reside on the same or different servers. But why not just back up the database and restore it on another server or use Data Transformation Services (DTS) to transfer tables and data? If you're working with read-only databases, these techniques work fine. But what if you have a large (100GB or larger) online transaction processing (OLTP) database that handles 100 or more transactions a minute? Or what if you have more than 1000 users running select, insert, update, and delete statements at least 8 hours a day? In these scenarios, running a backup and restore every 10 minuteswhich would lock your tables and generate irate calls from usersis unacceptable. Using DTS or bulk copy program (bcp) to transfer that much data whenever you want to synchronize databases also isn't an option; the transfer would take far too long.
Although replication doesn't speed up data transfer, it does let you transfer data from one server to others all at once, then apply those changes to other databases. In short, replication can be the best choice for data transfer if you're looking for a happy medium between data concurrency and availability.
SQL Server 6.0 introduced replication support, and SQL Server 6.5 added minor enhancements. But SQL Server 7.0 takes replication to a higher level of functionality and ease of use. SQL Server 7.0 goes beyond snapshot and standard transactional replication to support two-way (or immediate updating subscription) replication and merge replication. It also lets you replicate data to and from non-SQL Server platforms. And SQL Server 7.0 automates all replication tasks, letting database administrators set up and administer replication through wizards without writing a single stored procedure. Through the replication wizards, you can even instruct SQL Server to write cleanup tasks and notify the administrator of errors by email or pager. Let's walk through SQL Server 7.0's wizards to set up an example immediate-updating subscription replication solution, examine what SQL Server does behind the scenes to implement this solution, then test the solution.
Publishers, Subscribers, and Distributors
Microsoft uses a Publisher-Subscriber metaphor to describe SQL Server's replication process. Publishers are servers that make data available for replication. Subscribers are servers to which SQL Server replicates the data. Note that each Publisher can simultaneously act as a Subscriber. On the Publisher, SQL Server generates publications, which can contain one or more articles. An article is simply a table or a subset of a table. You then create subscriptions by choosing the publications you want the Publisher to replicate to Subscribers. Each subscription can contain one or multiple articles, and to be replicated, articles must belong to subscriptions. The last piece of the replication puzzle is the Distributorthe server that tracks the data and transactions that you need to replicate. Note that the Distributor can be the Publisher server or a separate server.
SQL Server 7.0 supports three types of replication: snapshot, transactional, and merge. Snapshot replication transfers a "snapshot" of the dataa copy of the data at a specific point in timefrom one database to another. Snapshot replication generates minimal additional workload on the Publisher and Distributor servers, but it doesn't give you a high level of data consistency among replication participants because it doesn't keep Publisher and Subscriber databases in synch. Thus, snapshot replication is best for situations where you don't need up-to-the-minute data synchronization.
In contrast, with transactional replication, several processes continuously monitor changes on the Publisher and periodically replicate those changes to Subscribers. With SQL Server 7.0, you can set up replication to also monitor Subscriber databases and replicate changes back to the Publisher. Microsoft calls this functionality immediate-updating subscription. Transactional replication is best for situations where you need constant synchronization between the Publisher and Subscriber servers.
SQL Server 7.0 also features support for the new merge replication function, which lets you merge data from databases on several servers to one database on the central server. Merge replication is best for environments in which a central server acts as the organization's main repository and must be regularly synchronized with all the Subscribers. Subscribers in merge replication also can modify data. If the Publisher and a Subscriber modify the same data at the same time, a conflict reader algorithm determines the winner.
When considering replication, you have to balance replication latency and data integrity. With minimal latencythat is, if you replicate transactions as soon as SQL Server commits them on the Publisheryou increase the workload for both the Publisher and Distributor. To handle the load, you might want a separate server acting as Distributor. However, if you accept some delay between transaction commitment and replication, the workload is more manageable but you risk not having up-to-the-minute data on the Subscribers. For example, consider a bank headquarters, which uses SQL Server to generate daily management reports, and the bank's branch offices, which use SQL Server to track ATM transactions. You could accept some delay in replicating transactions to the headquarters. However, you need to replicate withdrawal of money at one branch to the rest of the branch servers right away.
In this example, you would invest in a separate Distrib-utor server to forward transactions from each Publisher (each branch server) to the Subscribers (the rest of the branch servers). With SQL Server 7.0, you would set up a central Distributor and configure
all branch offices as Publishers with immediate-updating subscription replication. In this configuration, all bank branches would have up-to-the-minute balances of each account at all times, regardless of the withdrawal location. You would set up headquarters, however, as a Subscriber that pulls subscriptions from all branch servers at the end of each business day. Managers, who don't need the latest account information for each customer, would then receive daily activity reports in the morning.
Creating a Publisher
SQL Server 7.0's replication wizards make setting up most replication tasks an easy matter. (You can also use stored procedures to implement replication processes. Although this article doesn't cover the process, "Custom Replication," page 37, shows how you can use stored procedures to implement custom replication solutions.) To see how to use the replication wizards, let's walk through a simple example that sets up an immediate-updating subscription replication process between databases on two servers.
To set up replication, you must use a login account that is a member of SQL Server's Process Administrators (or higher authority) server role. (SQL Server 7.0 has seven built-in server roles; for a list of these roles and their permissions, see Michael D. Reilly, Certifiably SQL, "Login Security," January 2000.) To assign a login to a particular server role, right-click the login under the Security tab in SQL Server Enterprise Manager, then choose Properties from the resulting pop-up menu. Click the Server Roles tab (as Figure 1 shows), select the appropriate role (I'm using the built-in sa account for this example), and click Apply.
You're now ready to run the Configure Publishing and Distribution Wizard, which you find under the Tools/Replication menu. This wizard lets you select the current or a different server as the Distributor and set up the Publisher, which by default is the current server. You also use this wizard to choose the servers you want as Subscribers to the current server's publications. The Configure Publishing and Distribution Wizard automatically creates snapshot, log reader, merge, and distribution agents as well as replication cleanup jobs.
After the introductory screen, the Configure Publishing and Distribution Wizard asks you to choose a Distributor. Let's select the TANYA server (the Publisher) as the Distributor server. The next wizard screen lets you accept a default name (distribution) and location (E:\MSSQL7\data) for the distribution database and enables the current server as Distributor. However, for demonstration purposes, let's select the option to customize the publishing settings. Figure 2 shows the resulting wizard screen, which lets you provide a name and location for the distribution database. For this example, let's leave these settings at their defaults.
Figure 3 shows the next wizard screen, which lets you enable one or more servers to use TANYA as their Distributor after you configure them as Publishers. Clicking the Properties button (...) next to the distribution database name in the Registered servers window brings up the screen in Figure 4. From here, you can change the snapshot folder location. The snapshot folder is where SQL Server stores the snapshot data before replicating it to the Subscribers. The default location is \\TANYA\E$\MSQL7\ReplData, where E$ is the share on which you installed SQL Server. This screen also lets you specify the security profile for the replication agents, which run on the Publisher. By default, replication agents impersonate the SQL Server Agent account on the Pub-lisher. For our exam-ple, let's enable only REBA to use TANYA as a Publisher, accept the default security profile, and accept the default location for the snapshot folder.
The next step is to enable the databases you want to publish. After you enable the database for publishing, the database owner can create a publication in this database. For this example, let's select SQL Server's Northwind sample database as our publication database, as Figure 5 shows. Note that this wizard screen also lets you enable transactional and merge replication. Let's select transactional replication only.