In Screen 5, you can see a few red motors, signaling that the operator and
motors stopped reporting. I knew I had a problem when red became the most
prevalent color on the screen. I started losing operators and motors in domino
fashion. The test ran to completion, but the results were not what I expected. I
ran the test a dozen times, with similar results (see the sidebar, "Analyzing
Test Results," for details about Dynameasure's test analysis
features). The results revealed that somewhere between 27 motors and 38 motors,
the average response time per transaction more than doubled, the number of bytes
transmitted per second across the network fell sharply, and the system could not
support more than an average of 56 users. In a commercial business network,
these results would imply that if the business expanded from 30 users to 40
users, file operations would slow considerably. I immediately thought this
situation couldn't be right because I'd used the same network configuration to
generate hundreds of client workloads using Dynameasure for SQL tests. I changed
data set parameters such as scale and Think Time, but the results were much the
same.
After spending more than enough time studying the situation, I called
Bluecurve's technical support team. The representative was extremely helpful and
stated that the Lab's network configuration should be able to handle a much
bigger user workload. The representative suggested an approach right out of the
user manual: Develop a plan first. Because Dynameasure Enterprise was a new
release of familiar software, I had given the user manual a cursory glance,
accepted the defaults for configuration, and run a test I picked at random. I
had not defined what area of the network I wanted to stress (clients, server,
bandwidth) or decided whether I was testing CPU usage, RAM, hard disk
performance, or new hardware or software. Because the test results were not what
I expected, I blamed the software. Instead, I should have read the manual and
developed a strategy based on what kind of baseline I wanted to achieve. I
simply wanted to measure network throughput under a workload of 100 motors.
After I identified what my baseline was going to measure, I defined the
performance testing strategy.
Bluecurve advised me to start with a small-scale data set and a minimal
number of motors, and gradually increase the scale or the number of motors to
find the network bottlenecks. So after reading the user manual, I proceeded to
run the same File Services test: Copy All Bi-directional. I created a 0.01 scale
data set, increased the Think Time to 10 seconds, changed the File Test
Specifications to only 30 motors (one for each workstation), and configured only
one step instead of six.
The test results did not improve: During pretest, operators and motors
again turned red, signaling they were not reporting. However, now I had a better
understanding of the information Dynameasure could give me. Besides monitoring
the progress and results in the Dispatcher, I used other Dynameasure diagnostic
tools. For example, Dynameasure logs every action the motor performs. I
identified a red motor in the Dispatcher and then went to that machine and
viewed Motor Details, as shown in Screen 6. From the Log, I determined that the
motors were not being turned on during the pretest, which caused the Not
Reporting status. I also observed that all the red client machines connected to
the same repeater. The NT Event Log showed redirector errors for the problem
machines.
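
The reasoning in that last step, noticing that every red client hung off the same repeater, can be sketched in a few lines of Python. This is only an illustration of the cross-check, not anything Dynameasure provides: the function names, client names, and repeater labels are all hypothetical, and the status list would have to be transcribed by hand from the Dispatcher.

```python
from collections import defaultdict

def failures_by_repeater(clients):
    """Count (not_reporting, total) clients per repeater.

    `clients` is a list of (client_name, repeater, is_reporting) tuples.
    All names here are hypothetical; nothing is read from Dynameasure.
    """
    counts = defaultdict(lambda: [0, 0])
    for _name, repeater, is_reporting in clients:
        counts[repeater][1] += 1          # total clients on this repeater
        if not is_reporting:
            counts[repeater][0] += 1      # clients shown red (Not Reporting)
    return {rep: tuple(c) for rep, c in counts.items()}

def suspect_repeaters(clients):
    """Return repeaters on which every attached client failed to report."""
    return sorted(
        rep for rep, (bad, total) in failures_by_repeater(clients).items()
        if bad == total
    )

# Invented sample data for illustration only.
sample = [
    ("ws01", "rep-A", True),
    ("ws02", "rep-A", True),
    ("ws03", "rep-B", False),
    ("ws04", "rep-B", False),
    ("ws05", "rep-C", True),
]
print(suspect_repeaters(sample))
```

A repeater whose clients all fail is a stronger lead than one with a single red machine, which is why the sketch flags only the all-failed case.
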
Then the Miracle Occurs
I took the next logical action--I turned off the suspect client machines. I
was amazed by the next series of tests. The remaining 23 clients had no
problem running the test with 23 motors. Next, I gradually increased the number
of motors to the Lab's 100 limit. Screen 7 shows the results the Analyzer (the
third component in the Manager) displayed. The results show that the network
configuration could support 100 users applying the workload in the Copy All
Bi-directional test. I changed the Cogent S-1200 repeater that the problem
machines connected to, and all 30 operators started and displayed a blue Ready
status in the Dispatcher. All subsequent tests ran with 30 operators providing
100 motors.
As you can see, Dynameasure worked properly from the beginning. Had I
followed a performance strategy and properly examined the test results, I would
have saved myself days of anguish. Dynameasure found a hardware bottleneck in my
network and let me correct it. Mastering Dynameasure takes time, but the tools
and data resources make network analysis a breeze.
Screen 8 displays performance results for three client-to-server file
operation tests that I ran on the Lab's network. The Motors per Step graph
shows that the network can support at least 100 users. The Bytes per Second
graph compares the transaction performance of individual file operations and
charts how the network would react to a given number of users executing the same
file operation. For example, the green line represents throughput performance
for the Copy Large Data to Server test. After Step 3, system performance
degrades. The Motors per Step graph shows that 40 motors completed the file
operations in Step 3. Thus, adding more than 40 users to the network will
decrease system performance.
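
Reading the two graphs together amounts to finding the step at which throughput peaks and then looking up how many motors ran in that step. A minimal Python sketch of that lookup follows. The `peak_step` function is hypothetical, and the throughput figures are invented for illustration; only the "Step 3, 40 motors" pairing comes from the test described above. The values would have to be copied by hand from the Analyzer's graphs.

```python
def peak_step(motors_per_step, bytes_per_second):
    """Return (step_number, motors) for the step with the highest throughput.

    Step numbers are 1-based to match the Analyzer's step labels.
    On a tie, the earliest step wins.
    """
    if not motors_per_step or len(motors_per_step) != len(bytes_per_second):
        raise ValueError("need one throughput value per step")
    best = max(range(len(bytes_per_second)), key=lambda i: bytes_per_second[i])
    return best + 1, motors_per_step[best]

# Invented figures: throughput climbs through Step 3, then degrades.
motors = [10, 25, 40, 60, 80, 100]
throughput = [300_000, 520_000, 610_000, 540_000, 450_000, 400_000]

step, users = peak_step(motors, throughput)
print(f"Throughput peaks at Step {step} with {users} motors")
```

Any step beyond the one returned is past the point where added users cost more throughput than they contribute.
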
The more I used Dynameasure 1.5, the more I liked it. Armed with the user
manual's appendix, any systems administrator can select and modify tests to
mimic typical user actions on the network. With customized tests, a systems
administrator can identify where bottlenecks will occur, for which type of
transactions performance will degrade, and the network's user capacity. This
product definitely performs as advertised.