Figure 1 shows a VRRP configuration in which a virtual router comprising two physical routers has a VRIP address of 192.1.1.1. Router 1 is the master router for VRID 1 (it has a priority of 110), and Router 2 is a backup router (it has a priority of 100). A virtual router uses a unique Media Access Control (MAC) address formed by appending the VRID to the prefix 00005E0001, which the VRRP specification reserves for virtual routers. For example, the MAC address of the virtual router in Figure 1 is 00005E000101 because the reserved prefix is 00005E0001 and the VRID is 01. The computers in the subnet in Figure 1 use VRIP address 192.1.1.1 as their default gateway. Before a computer sends information to the gateway, it sends an Address Resolution Protocol (ARP) request for the gateway's MAC address. The virtual router's master router responds with the virtual MAC address rather than its own physical MAC address. Therefore, the computers can connect to an available router without knowing which physical router they should use.
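The virtual MAC address can be derived mechanically from the VRID. The following Python sketch illustrates the rule; the function name is hypothetical, and the hyphenated output format is a presentation choice rather than anything the article prescribes.

```python
def vrrp_virtual_mac(vrid: int) -> str:
    """Return the VRRP virtual MAC address for a VRID (1-255)."""
    if not 1 <= vrid <= 255:
        raise ValueError("VRID must be between 1 and 255")
    # The VRRP specification reserves the prefix 00-00-5E-00-01 and
    # places the VRID in the sixth octet.
    return "00-00-5E-00-01-{:02X}".format(vrid)


print(vrrp_virtual_mac(1))  # 00-00-5E-00-01-01
```

Because the address depends only on the VRID, whichever physical router is currently the master answers ARP requests with the same MAC address, so hosts never need to refresh their ARP caches after a failover.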
The VRRP configuration in Figure 1 provides fault tolerance but wastes router resources because the backup router is idle. Fortunately, you can set up a VRRP configuration in which both routers are active. A VRRP router can serve more than one VRID and VRIP address on the same interface. For example, as Figure 2 shows, you can define Router 2 as the master router for VRID 02 and VRIP address 192.1.1.2 and Router 1 as the backup router for virtual router VRID 02. You can configure half the computers on the subnet to use VRIP address 192.1.1.1 as their default gateway, and the other half to use VRIP address 192.1.1.2 as their default gateway. This configuration is load balanced as well as fault tolerant.
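To see how two VRIDs share the load and still fail over, consider this simplified Python model of master election. The priorities for VRID 1 come from Figure 1; the priorities for VRID 2 are assumed to mirror them, and the function names are hypothetical. The sketch ignores tie-breaking and advertisement timers.

```python
from typing import Dict, Optional, Set

# {vrid: {router: priority}}. VRID 2 priorities are an assumption.
PRIORITIES: Dict[int, Dict[str, int]] = {
    1: {"Router 1": 110, "Router 2": 100},
    2: {"Router 1": 100, "Router 2": 110},
}


def elect_master(priorities: Dict[str, int], alive: Set[str]) -> Optional[str]:
    """Pick the live router with the highest priority, or None if none is up."""
    candidates = [router for router in priorities if router in alive]
    if not candidates:
        return None
    return max(candidates, key=lambda router: priorities[router])


def masters(alive: Set[str]) -> Dict[int, Optional[str]]:
    """Return the master router for every VRID, given the routers that are up."""
    return {vrid: elect_master(p, alive) for vrid, p in PRIORITIES.items()}


print(masters({"Router 1", "Router 2"}))  # each router masters one VRID
print(masters({"Router 2"}))              # Router 2 takes over both VRIDs
```

With both routers up, each one forwards traffic for half the hosts; when Router 1 fails, Router 2 becomes master for both virtual routers.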
Major vendors have implemented VRRP in their routers and routing switches. Cisco's Hot Standby Router Protocol (HSRP) is a proprietary protocol similar to VRRP. Alteon and Arrowpoint use VRRP to provide redundancy for server load balancers. The vendors call their redundancy configurations active-backup or active-active, which are similar to the configurations in Figure 1 and Figure 2, respectively. (For more information about Web server load balancers, see "Web Server Load Balancers," April 2000.)
BGP
Routers often use a routing protocol to exchange routing information and dynamically update their routing tables when network topology changes (e.g., when a router or link fails). A network under one administrative domain, such as an organization's intranet, is known as an autonomous system (AS). A routing protocol used within an AS is an interior routing protocol. The Routing Information Protocol (RIP) and Open Shortest Path First (OSPF) protocol are two popular interior routing protocols. Different ASs generally use an exterior routing protocol (aka an interdomain routing protocol) to exchange routing information. The Internet exterior routing protocol is BGP, which the IETF defined in RFC 1771. Each AS needs a unique AS number from InterNIC to run BGP on the Internet.
BGP typically runs in routers on an AS's border (e.g., your Internet routers, ISPs' routers to their customers and other ISPs). BGP routers that directly exchange BGP routing information are peers. For example, in Figure 3, page 78, Router 1 in AS1 and Router 4 in AS4 are peers. In addition, Router 2 and Router 3; Router 2 and Router 5; Router 3 and Router 6; and Router 4, Router 5, and Router 6 are peers. Two ASs that use BGP to connect are also peers (e.g., AS2 and AS3).
When two BGP peer routers have established a TCP connection, they use BGP update messages to exchange or advertise routing information. BGP routers send BGP routing information about the ASs that they and their peer routers can reach. This information includes Internet routes the routers have learned from other routers and intranet routes the routers have learned from an interior routing protocol or static routing configuration. BGP uses an aggregated or Classless Inter-Domain Routing (CIDR) IP address (aka a prefix), such as 192.1.0.0/16, to represent the route to an AS. A BGP router also associates an AS-PATH attribute with each route. This attribute denotes the path from the advertising router's AS to the AS associated with the CIDR address. For example, AS3 in Figure 3 has the network address 192.100.0.0/16. AS1, a direct peer of AS3, advertises that one possible route to 192.100.0.0/16 has the AS-PATH attribute 1 3. AS4, a direct peer of AS1, receives this information and can use it as a factor in its calculation of the best route from AS4 to AS3.
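The way an AS-PATH grows as a route crosses AS boundaries can be sketched in a few lines of Python. The `Route` type and the function names are hypothetical, and the model keeps only the prefix and the AS-PATH; a real update message carries more attributes.

```python
from typing import NamedTuple, Tuple


class Route(NamedTuple):
    prefix: str
    as_path: Tuple[int, ...]


def advertise(route: Route, local_as: int) -> Route:
    """An AS puts its own number at the front of the path when it advertises."""
    return Route(route.prefix, (local_as,) + route.as_path)


def accepts(route: Route, local_as: int) -> bool:
    """A router rejects a route whose path already contains its own AS (a loop)."""
    return local_as not in route.as_path


origin = Route("192.100.0.0/16", ())  # the prefix inside AS3
from_as3 = advertise(origin, 3)       # AS-PATH 3
from_as1 = advertise(from_as3, 1)     # AS-PATH 1 3, as in the example
print(from_as1.as_path, accepts(from_as1, 4), accepts(from_as1, 3))
```

AS4 accepts the route because 4 doesn't appear in the path, whereas AS3 would discard the same advertisement as a loop.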
In a BGP router, you can define a policy that filters which routes a router accepts from a peer and which routes the router will advertise. To optimize routing and implement redundancy, you can incorporate attributes, such as preferences and metrics, into received and advertised routes. Peer routers use KeepAlive messages to check each other's availability. If a router doesn't receive a KeepAlive message from a peer within a predefined interval, the router drops the BGP session, removes the unreachable peer's routes from its BGP routing table, and sends an update message about the change to its other peers.
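The KeepAlive behavior amounts to a timer per peer. The Python sketch below models it with explicit timestamps so that the logic is easy to follow; the class and method names are hypothetical, and the 90-second interval is only an illustrative value.

```python
from typing import Set


class PeerSession:
    """Tracks one BGP peer's liveness and the prefixes learned from it."""

    def __init__(self, hold_time: float, now: float = 0.0) -> None:
        self.hold_time = hold_time
        self.last_heard = now
        self.established = True
        self.routes: Set[str] = set()

    def on_keepalive(self, now: float) -> None:
        """Record that the peer was heard from at time `now`."""
        if self.established:
            self.last_heard = now

    def expire_if_silent(self, now: float) -> Set[str]:
        """Drop the session if the peer has been silent too long.

        Returns the prefixes that must be withdrawn from other peers.
        """
        if not self.established or now - self.last_heard <= self.hold_time:
            return set()
        self.established = False
        withdrawn, self.routes = self.routes, set()
        return withdrawn


peer = PeerSession(hold_time=90.0)
peer.routes.add("192.100.0.0/16")
peer.on_keepalive(now=30.0)
print(peer.expire_if_silent(now=100.0))  # set(): only 70 seconds of silence
print(peer.expire_if_silent(now=121.0))  # {'192.100.0.0/16'}: session dropped
```

The set returned by the second call is what the router would announce as unreachable in its update messages to the remaining peers.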
BGP running between two ASs is known as external BGP (EBGP). BGP running between routers within the same AS is known as internal BGP (IBGP). All IBGP routers in an AS must communicate with one another. You use IBGP rather than a conventional interior routing protocol (e.g., OSPF) because IBGP can take advantage of BGP's routing policy feature. BGP can natively re-advertise learned BGP routes and their associated AS-PATH attributes among IBGP routers. Many ISPs and companies that have multiple Internet connections use IBGP in their border routers. One IBGP router doesn't need to physically connect to another IBGP router as long as the routers can reach one another through an interior routing protocol or static routing configuration. For example, in Figure 3, IBGP logically connects Router 4, Router 5, and Router 6 in AS4. Thus, Router 4 in Los Angeles can advertise the routes it has learned from Router 1 of AS1 to Router 5 in Chicago and Router 6 in New York.
Multihoming
The simplest Internet-connection scenario is a company with one Internet connection between its network and an ISP. Unfortunately, this setup doesn't offer redundancy or fault tolerance. For redundancy, you need a multihomed configuration; that is, you must configure multiple Internet connections to one or more ISPs. The two major categories of multihomed configurations are multiple connections to one ISP and multiple connections to multiple ISPs.
If you want to multihome to one ISP, two configurations are popular. You can connect your single Internet router to two or more routers at different Points of Presence (POPs) at an ISP, as Figure 4 shows. Alternatively, you can connect two or more routers at your company to two or more routers at different POPs at an ISP, as Figure 5, page 80, shows. Although the first configuration provides redundant Internet connections, the single router at your location creates a single point of failure. The second configuration offers better redundancy: If your Internet routers are in different sites, a disaster in one location of your company won't prevent the remaining sites from accessing the Internet. If you've implemented global server load balancing for your Web servers, your customers will still be able to reach an available site.
If you want to multihome to multiple ISPs, you connect your single or multiple Internet routers to routers at two or more ISPs, as Figure 6 shows. This configuration adds more reliability to your Internet connections because if one ISP experiences a major network outage, other healthy ISPs will provide Internet access.
Fault-Tolerant Multihomed Configurations
You can set up a fault-tolerant multihomed configuration so that one link is the primary link and the other links are backup links. If the primary link goes down, traffic will fail over to the backup links. For example, in Figure 4, the link from Company A's Router 3 to ISP1's Router 1 in Los Angeles is the primary link and the link from Router 3 to ISP1's Router 2 in New York is the backup link. To make the link to Router 1 the primary link and the link to Router 2 the backup link, Router 3's administrator can configure two static default routes: a shorter route (i.e., one with a lower metric) to Router 1 and a longer route to Router 2. Router 3 will then give preference to the shorter route for its outbound Internet traffic.
Alternatively, Router 3 can accept the advertised default routes from Router 1 and Router 2 and associate a BGP local preference (LOCAL-PREF) attribute value with each route to denote the preferred router. The greater the value, the higher the preference. For example, Router 3's administrator can set Router 1's default route LOCAL-PREF attribute to 200 and Router 2's default route LOCAL-PREF attribute to 100 to make the Los Angeles link the primary link for outbound traffic.
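As a rough illustration, the following Python sketch shows how Router 3 would pick its outbound exit from the LOCAL-PREF values in the example. The dictionary layout and function name are hypothetical, and the model considers LOCAL-PREF only.

```python
from typing import Dict, Optional, Set

# LOCAL-PREF values that Router 3 assigns to the default route learned
# from each ISP1 router (values taken from the example above).
DEFAULT_ROUTE_LOCAL_PREF: Dict[str, int] = {
    "Router 1 (Los Angeles)": 200,
    "Router 2 (New York)": 100,
}


def outbound_exit(local_pref: Dict[str, int], reachable: Set[str]) -> Optional[str]:
    """Choose the reachable peer whose default route has the highest LOCAL-PREF."""
    usable = [peer for peer in local_pref if peer in reachable]
    if not usable:
        return None
    return max(usable, key=lambda peer: local_pref[peer])


both_up = set(DEFAULT_ROUTE_LOCAL_PREF)
print(outbound_exit(DEFAULT_ROUTE_LOCAL_PREF, both_up))
print(outbound_exit(DEFAULT_ROUTE_LOCAL_PREF, {"Router 2 (New York)"}))
```

The first call selects the Los Angeles router; the second shows the failover to New York when the Los Angeles session is lost.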
To use the Los Angeles link as the primary link for inbound traffic, Router 3's administrator can apply BGP's multi-exit discriminator (MED) attribute to Router 3's advertised route (192.1.0.0/16). The MED attribute instructs a peer AS to choose the link with the lowest MED value as the exit to the network if the AS has multiple exits to the network. For example, if Router 3 advertised route 192.1.0.0/16 with a MED value of 100 to Router 1 and a MED value of 200 to Router 2, ISP1 would use the Los Angeles link as the primary link and the New York link as the backup link to Router 3 for inbound traffic. However, ISP1 could assign the route a LOCAL-PREF value that overrides Router 3's MED attribute (BGP evaluates LOCAL-PREF before MED when it makes a routing decision). To avoid problems, ask your ISP to honor your MED values.
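The interplay between MED and LOCAL-PREF can be sketched as a two-level comparison. This Python model is deliberately reduced to those two attributes and assumes both paths come from the same neighboring AS; the `Path` type, the function name, and the default LOCAL-PREF of 100 are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Path:
    link: str
    med: int
    local_pref: int = 100  # assumed default when the ISP sets no policy


def best_path(paths: Iterable[Path]) -> Path:
    """Prefer the highest LOCAL-PREF; break ties with the lowest MED."""
    return min(paths, key=lambda p: (-p.local_pref, p.med))


# ISP1 honors Company A's MED values: Los Angeles wins.
honored = [Path("Los Angeles", med=100), Path("New York", med=200)]
print(best_path(honored).link)

# ISP1 sets its own LOCAL-PREF on the New York path: MED is never consulted.
overridden = [Path("Los Angeles", med=100), Path("New York", med=200, local_pref=300)]
print(best_path(overridden).link)
```

The second case is the reason to confirm with your ISP that no LOCAL-PREF policy on its side will mask the MED values you send.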
Load-Balanced Multihomed Configurations
You can create a load-balanced multihomed configuration by specifying which routers advertise and receive information about certain routes. For example, in Figure 5, Company A has two routes. Route 192.1.0.0/16 is the shortest route between ISP1 and Router 3, and 130.1.0.0/16 is the shortest route between ISP1 and Router 4. Thus, Company A's network administrator might want to configure Router 3 to prefer the Los Angeles link for inbound traffic by assigning a lower MED value to the route that Router 3 advertises to Router 1 in Los Angeles and a higher MED value to the route that Router 3 advertises to Router 2 in New York. The administrator might also assign a lower MED value to the route that Router 4 advertises to Router 2 in New York and a higher MED value to the route that Router 4 advertises to Router 1 in Los Angeles. The result would be that, for inbound traffic, the Los Angeles link is the primary link for 192.1.0.0/16 and the backup link for 130.1.0.0/16, and the New York link is the primary link for 130.1.0.0/16 and the backup link for 192.1.0.0/16.
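The resulting plan can be written down as a small table and checked for both load sharing and failover. In this Python sketch, the MED values 100 and 200 are illustrative stand-ins for "lower" and "higher," and the names are hypothetical.

```python
from typing import Dict, Optional, Set

# MED advertised for each prefix on each link (illustrative values).
MED_PLAN: Dict[str, Dict[str, int]] = {
    "192.1.0.0/16": {"Los Angeles": 100, "New York": 200},
    "130.1.0.0/16": {"Los Angeles": 200, "New York": 100},
}


def inbound_links(links_up: Set[str]) -> Dict[str, Optional[str]]:
    """For each prefix, return the working link with the lowest MED."""
    chosen: Dict[str, Optional[str]] = {}
    for prefix, meds in MED_PLAN.items():
        usable = [link for link in meds if link in links_up]
        chosen[prefix] = min(usable, key=lambda link: meds[link]) if usable else None
    return chosen


print(inbound_links({"Los Angeles", "New York"}))  # one prefix per link
print(inbound_links({"New York"}))                 # New York carries both
```

When both links are up, each prefix enters through its own primary link; when one link fails, the surviving link carries both prefixes.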
If your Internet router accepts specified routes advertised from your ISP, you can load-balance these routes for outbound traffic. For example, in Figure 5, Company A has an e-business partner with a short route (route 193.1.0.0/16) to ISP1's Los Angeles POP and another partner with a short route (route 11.0.0.0/8) to ISP1's New York POP. Company A's administrator can associate a higher LOCAL-PREF value with route 193.1.0.0/16 and a lower LOCAL-PREF value with 11.0.0.0/8 received by Router 3 to make ISP1's Los Angeles link the primary link for 193.1.0.0/16 and the backup link for 11.0.0.0/8. To set the New York link as the primary link for 11.0.0.0/8 and the backup link for 193.1.0.0/16, reverse these settings for the two routes received by Router 4. In addition, Company A's administrator can define the Los Angeles link as the primary link for the default route (i.e., all other Internet routes) and the New York link as the backup link.
To load-balance and add fault tolerance to a multihomed configuration that has multiple connections to multiple ISPs (as Figure 6 shows), you can use the same methods that you use for multihomed configurations that have multiple connections to one ISP. However, remember that the MED attribute works only in situations in which an AS has multiple connections to another AS (i.e., MED is nontransitive). Thus, if you have only one link each to multiple ISPs, you can't use the MED attribute. In Figure 6, Company A has only one connection to each ISP, so Company A's administrator can't use the MED attribute. Instead, the administrator can manipulate the AS-PATH attribute of an advertised route. For example, to make the AS1 link the backup link for 130.1.0.0/16, the administrator can create an artificially long AS-PATH value by prepending an extra 4 to the normal AS-PATH value 4. When Router 3 advertises 130.1.0.0/16 with an AS-PATH value of 4 4 to AS1, AS1 will advertise the route with an AS-PATH value of 1 4 4 to AS3. Router 4 advertises 130.1.0.0/16 with a normal AS-PATH value of 4 to AS2, and AS2 advertises the route with an AS-PATH value of 2 4 to AS3. Therefore, AS3 will choose the AS2 link for traffic to 130.1.0.0/16 because this route is shorter.
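The AS-PATH arithmetic in this example is easy to verify. The Python sketch below reproduces it; the function name and the `prepend` parameter are hypothetical, and the comparison considers path length only.

```python
from typing import Tuple


def advertise(as_path: Tuple[int, ...], local_as: int, prepend: int = 0) -> Tuple[int, ...]:
    """Add the local AS number to the front of the path, plus `prepend` extra copies."""
    return (local_as,) * (1 + prepend) + tuple(as_path)


# Company A (AS4) pads the path it sends to AS1 but not the one it sends to AS2.
via_as1 = advertise(advertise((), 4, prepend=1), 1)  # AS-PATH 1 4 4
via_as2 = advertise(advertise((), 4), 2)             # AS-PATH 2 4

# AS3 prefers the shorter AS-PATH when nothing else distinguishes the routes.
best = min([via_as1, via_as2], key=len)
print(via_as1, via_as2, best)
```

Because the path through AS1 is one hop longer, AS3 sends traffic for 130.1.0.0/16 through AS2 and falls back to AS1 only if the AS2 route disappears.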
When you connect to multiple ISPs, block all ISP-established routes and their learned routes except routes that you specify. Otherwise, ISPs might discover a short path to another destination through your AS, and your network might become a transit AS for traffic between ISPs.
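One way to picture such a filter is as an allowlist applied to everything the router is about to advertise. In the Python sketch below, the prefixes come from the earlier examples, while the set and function names are hypothetical.

```python
from typing import Iterable, List

# Prefixes that belong to Company A and may be advertised to ISPs.
OWN_PREFIXES = {"192.1.0.0/16", "130.1.0.0/16"}


def export_filter(bgp_table: Iterable[str]) -> List[str]:
    """Advertise only the company's own prefixes, never routes learned from an ISP."""
    return sorted(prefix for prefix in bgp_table if prefix in OWN_PREFIXES)


# The BGP table also holds routes learned from the ISPs.
table = ["192.1.0.0/16", "130.1.0.0/16", "193.1.0.0/16", "11.0.0.0/8"]
print(export_filter(table))  # ['130.1.0.0/16', '192.1.0.0/16']
```

Because the ISP-learned routes 193.1.0.0/16 and 11.0.0.0/8 never leave the router, neither ISP can discover a path to the other through your AS.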
Fasten Your Seat Belts
You can use the building blocks I've described to build a redundant IP routing configuration. Multiple default gateways, IRDP, and VRRP provide first-layer routing redundancy. Multihomed Internet connections that use BGP provide second-layer routing redundancy. If you set up additional routers between the first and second layers, such as backbone routers for your network, be sure to use multiple routers and paths to incorporate redundancy. In addition, consider using reliable or redundant switches for your Internet hosts and routers. When you have a highly redundant network in place, you provide a disaster-resistant vehicle to safely carry your e-business onto the Internet.