Using counters to diagnose your system's health
In March, I showed you how to start Windows NT's Performance
Monitor, add counters, and save settings. I also mentioned a few of the critical
counters. In this article, I'll look at those and other counters, and give you
more details about when to use them. The aim is to evaluate the overall health
of your system and network. This article will not give you a comprehensive list
of available counters. Rather, it introduces how to use counters and focuses on
those counters that are most useful to the systems administrator and power user.
Objects
Any discussion of counters must start with a description of the objects that
generate counters. Common objects, such as Processor, are in the Performance
Monitor selection list on all NT installations. Table 1 lists those Performance
Monitor objects. Optional objects appear only if you choose certain options
during setup. For example, you will see the NetBEUI object only if you select
NetBEUI as a protocol to install. The last group, add-on objects, consists of
objects that other software, rather than NT itself, adds to Performance Monitor.
First, let's examine the optional and add-on objects that you can track.
For each object, I'll discuss some specific counters. You can start Performance
Monitor on your system and follow along.
Optional Objects
Some objects appear in Performance Monitor only when the associated service
or process is running. I'll highlight a few of these optional objects.
The Browser object measures the various Browser Service transmissions.
This object is relevant only if the computer is a browser or a potential
browser. The Browser Service gives you the list of resources available on the
network. When you use the Map Network Drive option from the My Computer icon,
the Browser Service is responsible for displaying shared directories.
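The Server Service is the complement to the Redirector, the component that
gives the local computer access to resources elsewhere on the network. The
Server Service makes resources on the local computer available to other users
across the network. Therefore,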
the Server Service does not need to run on a workstation that acts as a client
computer and doesn't share any of its local data or printers with the rest of
the network.
Depending on the protocols and network services you have installed, you'll
see one or more network objects on your Performance Monitor list, such
as AppleTalk; Client Service for NetWare; IP, TCP, and Network Interface;
NetBEUI and NetBEUI Resource; NWLink IPX, NWLink NetBIOS, and NWLink SPX; and
RAS Port and RAS Total. Each network object provides multiple counters; taken
together, they let you monitor network throughput.
Add-On Objects
Add-on objects are associated with software other than NT. For example,
Microsoft SQL Server adds several objects, which you'll see only when SQL Server
is running.
Counters
As I mentioned in March, Performance Monitor has about 350 counters. You'll
probably use only a few of them on a regular basis for ongoing system monitoring
and keep the rest for troubleshooting and tuning. Some counters are useful only
to programmers who are debugging and optimizing applications.
A good place to start is the default counter for each object. When you
select an object, notice that the highlighted counter is not necessarily the
first one in the list. Instead, the highlighted counter is the one that the NT
developers thought would be the most useful. For example, when you select the
Cache object, the default counter is Data Map Hits %, as Screen 1 shows
on page 159. The following paragraphs highlight a selection of counters that you
might find useful.
The Data Map Hits % counter under the Cache object shows how often
requested data was found in the cache. A hit means that the system retrieved the
data rapidly from physical memory instead of having to read it from the disk. A
consistently low value, say below 80 percent when the system is very busy, can
signify that too little memory is available.
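As a rough illustration of that rule of thumb, the following Python sketch
checks a set of logged Data Map Hits % readings. It assumes you have copied the
readings out of Performance Monitor into a plain list of percentages taken while
the system was busy. The function name, the sample values, and the decision to
treat "consistently" as "at least 90 percent of the samples" are assumptions
made for this example, not anything Performance Monitor defines.

```python
def cache_hits_consistently_low(samples, threshold=80.0, fraction=0.9):
    """Return True if most readings fall below the threshold.

    samples   -- Data Map Hits % readings (0-100) taken under load
    threshold -- the 80 percent rule of thumb from the text, not a hard limit
    fraction  -- share of readings that must be low to count as
                 "consistently" low (hypothetical choice for this sketch)
    """
    if not samples:
        raise ValueError("need at least one sample")
    low = sum(1 for s in samples if s < threshold)
    return low / len(samples) >= fraction


# Made-up readings, purely for illustration.
busy_readings = [72.5, 68.0, 75.1, 79.9, 70.2]
print(cache_hits_consistently_low(busy_readings))
```

A True result is only a hint that memory deserves a closer look; it is not a
diagnosis on its own.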
The Avg. Disk Queue Length counter under the LogicalDisk object
measures the average number of read and write requests that were queued for the
selected disk during the sampling interval. A value greater than 1 or 2
indicates a potential bottleneck at the disk, with processes forced to wait
for disk access. Further investigation is in order before you can be sure the
disk is the problem. Your system might have insufficient RAM, resulting in
constant paging from memory to disk and back again. Resolve memory shortages
before deciding that you have a disk problem.
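The order of investigation described above can be sketched in a few lines of
Python. This is only an illustration: the function name is made up, the
memory_is_short flag stands for whatever conclusion you have already reached
about memory by other means, and the default limit of 2 is simply the upper end
of the "1 or 2" guideline.

```python
def triage_disk_queue(avg_queue_length, memory_is_short, queue_limit=2.0):
    """Suggest where to look first when the disk queue is long.

    avg_queue_length -- observed Avg. Disk Queue Length
    memory_is_short  -- True if you already suspect a RAM shortage
                        (supplied by the reader; not measured here)
    queue_limit      -- guideline value from the text, an assumption
    """
    if avg_queue_length <= queue_limit:
        return "disk queue looks normal"
    if memory_is_short:
        # Paging can inflate the queue, so fix memory before blaming the disk.
        return "resolve the memory shortage first, then re-measure the disk"
    return "investigate the disk as a possible bottleneck"


print(triage_disk_queue(3.0, memory_is_short=True))
```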
The LogicalDisk object's Avg. Disk sec/Transfer counter shows how
long, in seconds, the average disk transfer takes. On its own, this counter
can tell you whether you have a fast or slow disk, although the value will vary
depending on the type of data you are processing. For short files, the value
will be in the tens of milliseconds, which will show as 0.0nn seconds.
If all your disk counters are zero, run the diskperf -y command to make
sure that disk monitoring is turned on, as I described in March.
One powerful way to use Performance Monitor is to combine values from
different counters. Suppose you have an Avg. Disk Queue Length of 3 and an Avg.
Disk sec/Transfer of 0.033. With three requests (each taking about 33
milliseconds) in the queue, about 100 milliseconds (one-tenth of a second) will
pass before the system can process a new request. This calculation gives you an
idea of the delay that waiting on your hard disk causes.
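The same arithmetic can be written as a short Python sketch, which is handy if
you want to repeat it for several disks. The function name is hypothetical, and
the estimate assumes that every queued request takes the average transfer time,
which is a simplification.

```python
def estimated_queue_wait(avg_queue_length, avg_sec_per_transfer):
    """Approximate seconds a new request waits behind the queued requests."""
    return avg_queue_length * avg_sec_per_transfer


# Values from the example in the text.
wait = estimated_queue_wait(3, 0.033)
print(f"estimated wait: {wait * 1000:.0f} ms")
```

For the values in the text, the result is roughly 99 milliseconds, in line with
the "about 100 milliseconds" figure.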