When I speak about Exchange at seminars and other events, the topic of best
operational practices often comes up. People want to know the steps they must
take to operate an efficient and effective Exchange installation once the
software moves from pilot status into production. Email is now a
mission-critical application for many large companies, and these organizations
want to minimize the company's risk in the investment they make to implement
client/server-based messaging. In "Planning a Large-Scale Exchange
Implementation," May 1997, I discussed how to plan for a successful
implementation; now I'll consider day-to-day operations in an Exchange
environment and explain the five guiding principles that will make your
operations successful.
Microsoft designed Exchange to be scalable, robust, and reliable in
distributed environments. Exchange manages reasonably large user populations on
individual servers (one server at Digital has supported more than 2,750
mailboxes) and will manage far larger populations as Windows NT and hardware
evolve. Exchange is more akin to mainframe or minicomputer messaging systems,
such as IBM PROFS or Digital ALL-IN-1, than to Microsoft Mail or Lotus cc:Mail.
Guiding Principles
Managing very large user communities is impossible if you don't follow
disciplined systems management practices. I have five principles that guide
efficient system management for a production-category Exchange server.

- Plan for success. Assume that users will increase the demand on the servers,
  the volume of mail traffic will increase, and you'll deploy new messaging
  applications (such as workflow). Make sure that system configurations
  incorporate room for growth and accommodate periods of increased demand.
- Use dedicated hardware for Exchange. Configure the hardware to provide a
  resilient and reliable service on a continuous basis for three years with a
  minimum number of interventions (and system downtime) required. After three
  years, replace the hardware.
- Keep downtime to a minimum. Never take an action that interferes with or
  removes the Exchange service from users. For any intervention that requires
  taking servers offline, plan in advance and clearly communicate your
  intentions to users. Also, be prepared for catastrophic hardware failure.
  Outline a recovery plan to handle emergencies.
- Track system statistics. Proactive system monitoring is a prerequisite for
  delivering a production-quality service. While you're monitoring the system,
  gather regular statistics on system use and analyze the data to help identify
  potential problems and protect the quality of service.
- Follow well-defined, regular housekeeping procedures.
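
As a rough illustration of the "track system statistics" principle, the following Python sketch flags any sample that jumps well above the average of the samples just before it. The function name `flag_unusual`, the `window` and `factor` parameters, and the weekly message counts are all hypothetical and chosen only for illustration; they do not come from Exchange or from any real server.

```python
from statistics import mean

def flag_unusual(samples, window=4, factor=1.5):
    """Return indexes of samples that exceed `factor` times the mean
    of the `window` samples immediately preceding them."""
    flagged = []
    for i in range(window, len(samples)):
        baseline = mean(samples[i - window:i])
        if samples[i] > factor * baseline:
            flagged.append(i)
    return flagged

# Hypothetical weekly counts of messages handled by one server.
weekly_messages = [41000, 42500, 43000, 44000, 45500, 71000, 47000]
print(flag_unusual(weekly_messages))
```

A check this simple only points at weeks that deserve a closer look. Deciding whether a spike is a one-time event or the start of sustained growth still takes analysis of the underlying data.
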
Exchange needs disciplined management to achieve its maximum potential. Anyone
can take the Exchange CD-ROM, slap it into a drive, install the software, and
have a server up and running with clients connected in 30 minutes. Such a system
can handle a small user community. This approach is OK if that level of service
is all you need. The strategy I outline here is geared to large, corporate
deployments, but the logic that drives the strategy is valuable no matter what
size shop you run. The five principles are generic, but they have proved
themselves across a large number of Exchange deployments over the past two years.
1. Plan for Success
Any configuration will come under increasing pressure as it ages. You
experience the best performance immediately after you install the system,
when disks are not fragmented, users put little demand on the computer, and
application files are as small as they'll ever be.
As people get to know an application, the user-generated load increases.
Users send more messages, and the messages are larger. Users find more reasons
to use the underlying service: For example, you might install a fax connector
for better communication with external agencies or deploy a full-text retrieval
package to improve manageability of public folder contents. The disks fill up
with user and application data. With Exchange, the information store swells to
occupy as much space as you can devote to it. If you don't configure the system
with success in mind and incorporate room for growth, you'll end up with a
system that runs smoothly at the beginning only to suffer increasingly as time
goes by.
I recommend overconfiguring the service at the start so that you don't
become entangled in a cycle of constant upgrades. Install two CPUs rather than
one, use 128MB of RAM rather than 96MB, have 20GB of disk instead of 16GB, and
so on. Build server configurations that can handle at least some expected
software developments over the next few years. For example, consider RAID
controllers for system clustering. Look at the hardware that existing clustering
solutions use and see whether you can include hardware with the same or superior
capabilities. (For more information on clustering solutions, see Mark Smith, "Clusters
for Everyone," and Joel Sloss, "Clustering Solutions for Windows NT,"
June 1997.) Because the upcoming release of 64-bit NT 5.0 will require a new
version of Exchange before you can use it for messaging, it is probably at the
outer range of consideration. But think about Alpha CPUs if you're interested in
building high-end servers on which you eventually want to run 64-bit NT. Alpha
CPUs are also appropriate for servers that must handle high levels of format
translation work, such as those that host Internet connectors. Configure systems
that will be successful over time rather than just today. Any other approach
might require more hardware upgrades than you want in a production environment.
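
To make the case for headroom concrete, here is a minimal Python sketch that estimates how many whole months a disk can absorb a growing information store. It assumes compound growth at a fixed monthly rate, which is a simplification. The function name `months_until_full`, the 4GB starting size, and the 5 percent monthly growth rate are hypothetical; only the 16GB and 20GB disk sizes come from the example above.

```python
import math

def months_until_full(disk_gb, used_gb, monthly_growth_rate):
    """Return the number of whole months the data still fits on the disk,
    assuming it grows by a fixed percentage every month."""
    if used_gb <= 0 or monthly_growth_rate <= 0:
        raise ValueError("used_gb and monthly_growth_rate must be positive")
    if used_gb >= disk_gb:
        return 0
    # Solve used_gb * (1 + r)**n = disk_gb for n, then round down.
    months = math.log(disk_gb / used_gb) / math.log(1 + monthly_growth_rate)
    return math.floor(months)

# Hypothetical starting point: 4GB in use, growing 5 percent a month.
for disk_gb in (16, 20):
    months = months_until_full(disk_gb, used_gb=4.0, monthly_growth_rate=0.05)
    print(f"{disk_gb}GB disk: about {months} months of headroom")
```

Comparing the result with the planned three-year service life of the hardware shows whether a configuration has enough room, or whether you should budget for more disk from the start.
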
2. Use Dedicated Hardware for Exchange
You can install Exchange on just about any NT server that has the correct
revision level of the operating system (for Exchange 5.0, the correct level is
NT 4.0 with Service Pack 3, or SP3) and a minimum of 32MB of RAM. The same server
can run other BackOffice applications and some personal productivity
applications such as Office 97. For good measure, the server can provide file
and print sharing to a set of workstations; run Domain Name System (DNS),
Windows Internet Name Service (WINS), and Dynamic Host Configuration Protocol
(DHCP) services; and act as a domain controller. The applications will install
and run, but they will run slowly. And, with all those applications, think of
the steps
you'll have to take to get the server back online in case of hardware failure. I
do not recommend this mix on a production system. Having dedicated hardware lets
you tailor and tune the configuration to meet the needs of an application.