How to Buy the Right Windows NT System
If your experience is anything like mine, you migrated to Windows NT from a DOS/Windows environment. For the most part, you probably appreciate the robustness of the operating system with its multitasking and multithreading capabilities. Yet, when you think back to your Windows 3.1 days, you probably miss the seemingly infinite capacity for editing and tweaking SYSTEM.INI and WIN.INI to increase performance, for adding space to the environment by increasing the shell size, etc.
In Windows NT, many of these capabilities are gone, and you're faced with editing an arcane--and not well-understood--Registry that is the very heart of Windows NT. Don't feel bad. Even many experienced NT users find the Registry daunting. To overcome this lack of knowledge, you may have bought the excellent Windows NT Resource Kit (see the book review in the September issue of Windows NT Magazine) and tried to figure out how to optimize NT by adding keys and changing this and that. Your Registry grew and grew. Then, you upgraded to the next version of NT, and things didn't work right. So you had to start all over again from scratch. All that work was simply thrown away.
Well, there's a way to avoid repeating that process. You can optimize the operating system without editing the Registry if you understand how to set up your hardware in an optimized fashion.
Most approaches to optimizing NT assume that you're going to change your system after you purchase it. To the inexperienced computer user, this can be scary. The approach in this article will instead help you design an NT-optimized system--one that you can order directly from a reseller--and tell you what you need to ask for. Thus, your system will already be optimized for NT when you open the box, rather than after you spend a couple of days with the Performance Monitor.
Avoiding Bottlenecks
Bottlenecks are notorious, although it can be difficult to determine their cause. One particularly graphic example is the neck of a catsup bottle at a restaurant: it's the one place that hangs things up. In a computer, the bottleneck could be the CPU, the cache, the memory, the I/O performance of the bus, the peripheral controller, the hard drive, or just about any component where the demand exceeds the supply.
To control the effect of bottlenecks--you can never eliminate them--you must optimize each "link" in the processing chain until the overall performance is acceptable for a given task. In simplest terms, a task is a collection of computer events to be completed. These events can involve access to system resources. (We tend to think of tasks in terms of applications, hence the Task Manager in NT and the Taskbar in Windows 95.)
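The idea of a chain limited by its slowest link can be sketched in a few lines of Python. This is only an illustration: the component names come from the list above, but the throughput figures are made-up placeholders, not measurements of any real system.

```python
# Hypothetical sustained throughput of each link, in MB per second.
# The numbers are illustrative assumptions, not benchmark results.
EXAMPLE_CHAIN = {
    "cpu": 120.0,
    "cache": 200.0,
    "memory": 80.0,
    "bus": 33.0,
    "controller": 10.0,
    "hard_drive": 4.0,
}


def find_bottleneck(links):
    """Return (name, throughput) of the slowest link in the chain."""
    name = min(links, key=links.get)
    return name, links[name]


def upgrade(links, name, new_throughput):
    """Return a copy of the chain with one link replaced."""
    upgraded = dict(links)
    upgraded[name] = new_throughput
    return upgraded


if __name__ == "__main__":
    print(find_bottleneck(EXAMPLE_CHAIN))
    # A faster drive does not remove the bottleneck; it moves it.
    print(find_bottleneck(upgrade(EXAMPLE_CHAIN, "hard_drive", 20.0)))
```

With these placeholder figures, replacing the drive simply hands the bottleneck to the controller, which is why each link has to be considered in turn.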
The Fast Bus Ride
For reasons that baffle me, many people consider the Industry Standard Architecture (ISA) bus their bus of choice. The ISA bus was actually considered out of date in 1988. Since the ISA standard was designed for the Intel 286 chip, only 16MB of memory can be accessed directly on the bus. If you add more memory to the system, all access to addresses above 16MB must be buffered to regions below 16MB. This double buffering slows the system down.
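The 16MB figure follows from the 286's 24 address lines, since 2 to the 24th power bytes is 16MB. A minimal Python sketch of the check a driver has to make is shown here. The function names are hypothetical, and the assumption that the whole transfer gets copied through low memory is a simplification for illustration, not a description of how any particular NT driver works.

```python
ISA_ADDRESS_LINES = 24                     # address lines inherited from the 286
ISA_DMA_LIMIT = 2 ** ISA_ADDRESS_LINES     # 16,777,216 bytes, i.e. 16MB


def needs_double_buffering(start, length, limit=ISA_DMA_LIMIT):
    """True if any byte of the buffer lies at or above the addressing limit."""
    return start + length > limit


def extra_bytes_copied(start, length, limit=ISA_DMA_LIMIT):
    """Bytes the CPU must copy through a buffer below the limit.

    Simplifying assumption: if any part of the buffer is out of reach,
    the whole transfer is staged through low memory.
    """
    return length if needs_double_buffering(start, length, limit) else 0


if __name__ == "__main__":
    print(ISA_DMA_LIMIT // (1024 * 1024), "MB reachable")
    print(needs_double_buffering(0x00F00000, 4096))   # buffer at 15MB
    print(needs_double_buffering(0x01000000, 4096))   # buffer at 16MB
```

Every byte reported by `extra_bytes_copied` is work the CPU does in addition to the transfer itself, which is where the slowdown comes from.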
Following the ISA standard, IBM developed the Micro Channel Architecture (MCA), and an independent group of hardware vendors developed the Extended Industry Standard Architecture (EISA) standard. Both were 32-bit buses, rather than 16-bit like ISA. The number of address lines was increased, and the memory limit for Direct Memory Access (DMA) increased to 4GB. Although not normally mentioned, one major advantage that EISA and MCA offered over ISA was their ability to do things rapidly in DMA. In fact, accomplishing an event via DMA was faster than doing it locally.
ISA refuses to die. Fortunately, the Peripheral Component Interconnect (PCI) bus and, to a lesser degree, the VESA local bus (VL) have diminished the industry's reliance on it. The lack of DMA memory is still a problem since neither PCI nor VL fully populates the bus. Since PCI and VL have local-bus architectures, the 16MB limit does not apply to them. So, why bring it up? You may want to add an ISA card to your system for a specific function, and the card may be a bus master. If so, it will have the 16MB legacy and may slow the system down significantly. I don't recommend an ISA bus for anything involving multitasking.
What constitutes a reasonable bus for purchase? Let's consider the attributes it needs.
Enough high-speed/access slots to enable you to work around ISA limitations. If you mirror your data or add a second bus-master controller, that will automatically take up two slots, leaving you only one or two for other bus-master or local-bus cards. Adding video and networking uses all your slots or takes you over the local-bus limit. You need to have three or four local-bus slots. If you choose ISA for your remaining slots, bear in mind that you have no more room for expansion, and the card you will most likely want to add will be PCI or VL. For this reason, I prefer PCI/EISA motherboards. I have a PCI/ISA system although I consider it far from ideal.
A system using bus-mastering devices that can undergo CPU arbitration. A bus-mastering card is typically a card with a coprocessor. The card can directly access memory, and the coprocessor can finish the task so the CPU can do other things. Arbitration determines who controls the CPU and for how long. In Programmed I/O (PIO), the system allows one device to occupy the CPU until the process or thread is finished. (A process is a single running instance of an application; threads are the executable units of a process.)
In an arbitrated scheme, CPU access can be assigned a priority or serviced within defined clock times. The importance of this issue becomes obvious when you consider that one device can disrupt what another device is doing (e.g., getting data off a hard disk can disrupt any other event until the data is in memory). Don't misunderstand me. PIO cards and devices (IDE hard drives, for example) work well with NT and may actually be faster on a single process/thread than a bus-mastering device. This is not, however, the case for multitasking and thread isolation.
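The trade-off between PIO and arbitrated access can be seen in a toy simulation. The sketch below is a deliberately simplified model, not how any real bus arbiter works: each device needs a fixed number of clock ticks, PIO is modeled as run-to-completion, and arbitration is modeled as a plain round-robin with a fixed quantum. The device names and tick counts are invented for illustration.

```python
from collections import deque


def run_to_completion(jobs):
    """PIO-style model: each device keeps the CPU until its job is done.

    jobs is a list of (device_name, ticks_needed) in arrival order.
    Returns {device_name: tick_at_which_it_finished}.
    """
    clock = 0
    finished = {}
    for name, ticks in jobs:
        clock += ticks
        finished[name] = clock
    return finished


def round_robin(jobs, quantum=1):
    """Arbitrated model: each device gets at most `quantum` ticks per turn."""
    queue = deque(jobs)
    clock = 0
    finished = {}
    while queue:
        name, remaining = queue.popleft()
        used = min(quantum, remaining)
        clock += used
        remaining -= used
        if remaining:
            queue.append((name, remaining))   # back of the line
        else:
            finished[name] = clock
    return finished


if __name__ == "__main__":
    jobs = [("disk", 10), ("network", 2)]     # hypothetical workloads
    print("PIO:       ", run_to_completion(jobs))
    print("Arbitrated:", round_robin(jobs))
```

In this model the disk finishes sooner under PIO (tick 10 rather than tick 12), but the network card waits until tick 12 instead of finishing at tick 4. That is the single-thread advantage and the multitasking penalty in miniature.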
Admit it. The near future--at least--is all PCI. You might as well ignore implementation issues and get a PCI-based--and not a VL-based--motherboard. VL may be faster on a per-card basis--no Plug-and-Play overhead--but the industry is standardizing on PCI.
EISA is still a viable option, but the cards are old and most probably won't be updated soon. Even so, PCI/EISA is the optimum motherboard. However, some of the new PCI/ISA motherboards based upon the latest incarnation of the Triton chipset also show promise for workstation use.
CPU, Cache, and Memory
Performance issues arise because processes and related threads are assigned priorities and related time slices. Those with higher priorities can disrupt the performance of those of lower priority. NT is a preemptive multitasking system; it can preempt one thread to start processing another.
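A toy model makes preemption concrete. The sketch below is not NT's scheduler: it ignores time slices, priority boosts, and multiple processors, and simply runs the highest-priority ready thread on every tick. The thread names and priority numbers are invented for illustration.

```python
def simulate(threads):
    """Run the highest-priority ready thread on each tick.

    threads is a list of (name, priority, arrival_tick, ticks_needed),
    where a larger priority number wins. Returns a list with the name of
    the thread that ran on each tick (None when nothing was ready).
    """
    remaining = {name: need for name, _, _, need in threads}
    timeline = []
    tick = 0
    while any(remaining.values()):
        ready = [t for t in threads
                 if t[2] <= tick and remaining[t[0]] > 0]
        if ready:
            name = max(ready, key=lambda t: t[1])[0]
            remaining[name] -= 1
            timeline.append(name)
        else:
            timeline.append(None)
        tick += 1
    return timeline


if __name__ == "__main__":
    # Hypothetical workload: a low-priority thread is interrupted when a
    # higher-priority thread becomes ready at tick 1.
    workload = [("spreadsheet", 8, 0, 4), ("disk_io", 13, 1, 2)]
    print(simulate(workload))
```

The lower-priority thread runs for one tick, is pushed aside for two, and only then resumes, so its four ticks of work take six ticks of wall-clock time.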
In addition, the inherent client/server nature of NT affects performance. When a process creates a thread, that thread is a client of the Win32 subsystem server. If that client asks the server to do something, a complementary thread is created in the subsystem. The server thread and the client thread don't run at the same time; one is suspended while the other runs.
Therefore, to optimize NT, you need to provide the smoothest flow possible between threads without causing disruptions in processing. Depending upon the nature of the processes, you will run into various issues. The CPU may be too slow; there may not be enough memory available for the proper functions; or the hard drives and data transfer may be too slow. These are the issues this article addresses.
Cache: Any modern motherboard should have a minimum of 256KB of cache memory. You need to match this cache to the CPU and bus speed. For example, the 33-MHz bus Pentium systems need cache speeds below 10 nanoseconds (ns) while the 30-MHz bus can function well with standard 15-ns cache. Newer motherboards use high-speed cache modules, similar to single in-line memory modules (SIMMs), that can be changed to match the speed of the CPU. If the system has a significant amount of memory, it's probably worthwhile to maximize the cache on the motherboard. You'd probably want to insist on this up front.