Analyze your sites and servers
Web benchmarks are useful for gathering a lot of information about your sites and servers. For example, you can use Web benchmarks to load test hardware before you deploy it to determine its stability under load, find out your existing hardware's capabilities, find bottlenecks in your applications, or even determine which Web server software or platform will best meet your needs. (Many different Web benchmarks are available on the Internet. See Table 1 for a list of common tests.)
You might never have to test with a specific benchmark for that benchmark to be useful to you. With all the competition in the hardware and software markets today, many vendors are publishing their own results to demonstrate their equipment's capabilities.
For example, many vendors use the SPECweb99 benchmarking software to test their products. SPECweb99, which the Standard Performance Evaluation Corporation (SPEC) wrote, is the most recent release of SPEC's Web server benchmark; the first release, in 1996, was called SPECweb96. SPEC has many other benchmarks, and I recommend that you check out its Web site (http://www.spec.org) for further information. Many top hardware vendors use SPECweb99 results to demonstrate their hardware's power in Web-serving environments that use different Web server software and OSs.
In this article, I show you how to make sense of Web benchmarks. I start with a brief explanation of how to read a benchmark, then compare specific sets of results from different benchmarks. Later in the article, I take a more analytical look at some of these results and show you how to interpret them for useful information.
Reading Benchmarks
Here are a few items to keep in mind when you're looking at benchmarks so that you can interpret them without jumping to conclusions or being misled by the information that's presented. (The information in this section applies to any benchmark, whether it's Web, CPU, or I/O related.) Begin by reading about the benchmark. Information to look for when you're reading includes methods used during testing, the hardware and software tested, and any special changes the testers made to the operating environment during testing. By paying attention to these details, you can better understand how the testers achieved their results.
Start by looking at the benchmark itself. Is it an industry-standard benchmark with no affiliation to a particular hardware or software manufacturer? If not, who came up with the benchmark? For example, it wouldn't be fair for Microsoft to make a benchmark for Windows NT, then test NT against Sun Microsystems' Solaris. In such a case, Microsoft might have designed the test to favor its own product. After all, Microsoft is trying to sell NT, not Solaris. You need to look at the results objectively; benchmarks can be deceiving.
Next, does the benchmark compare similar items (e.g., comparing two servers or two OSs)? If the compared items aren't similar (e.g., comparing an Alpha processor to an Intel processor), the results are more difficult to interpret. Did the testers take special steps to prepare the items they were testing before they executed the tests? For example, if the results compare OSs, were the OSs tuned at all? If so, were they tuned in the same way? I've seen tests that pit two Web servers on different platforms against each other with OSs tuned in different ways.
When looking at any kind of benchmark, you should view it as you would any sort of sporting event. No team or player should have an unfair advantage. As consumers and users, we want to see a good clean match in which each contestant has a fair chance to prove that it deserves our business.
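To make the questions in this section concrete, here is a minimal Python sketch of how you might record them as a checklist when you compare two published results. It isn't part of any benchmark suite: the class, the field names, the vendor names, and the tuning note are all hypothetical, and the checks cover only the points raised above (who wrote the benchmark, whether the items are similar, and whether they were tuned the same way).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkResult:
    """One published result. Every field name here is a hypothetical label."""
    benchmark: str          # name of the benchmark that produced the result
    benchmark_author: str   # who wrote the benchmark
    product_vendor: str     # who sells the item that was tested
    item_kind: str          # what was tested, e.g. "os" or "server"
    tuning: frozenset = frozenset()  # special changes the testers made


def fairness_concerns(a, b):
    """Return a list of reasons to read a comparison of a and b with caution."""
    concerns = []
    if a.benchmark != b.benchmark:
        concerns.append("the results come from different benchmarks")
    # A benchmark written by one of the competing vendors may favor that vendor.
    if a.benchmark_author in {a.product_vendor, b.product_vendor}:
        concerns.append("the benchmark author sells one of the tested items")
    if a.item_kind != b.item_kind:
        concerns.append("the compared items are not the same kind of thing")
    if a.tuning != b.tuning:
        concerns.append("the items were tuned in different ways")
    return concerns


# Made-up example: VendorA wrote the benchmark and tuned only its own OS.
vendor_os = BenchmarkResult(
    benchmark="ExampleBench",
    benchmark_author="VendorA",
    product_vendor="VendorA",
    item_kind="os",
    tuning=frozenset({"custom network settings"}),
)
rival_os = BenchmarkResult(
    benchmark="ExampleBench",
    benchmark_author="VendorA",
    product_vendor="VendorB",
    item_kind="os",
)

for concern in fairness_concerns(vendor_os, rival_os):
    print("Caution:", concern)
```

An empty list doesn't prove that a comparison is fair; it means only that none of these particular warning signs turned up. You still need to read the testers' description of their methods.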