Track Web site use to keep your IIS ship headed in the right direction
Microsoft Internet Information Server (IIS) uses log files to track information about events that occur when users access IIS Web sites and applications. Information such as the number of visitors to your site or application, the number of bytes sent and received, and the referring page is invaluable for Web managers, who can examine the data to discern trends and patterns.
IIS 4.0 has four types of log formats, and you can store log data in files or databases. Maintaining logs increases the overhead on your system, so you want to be sure you're capturing the information you need in the most efficient way. Let's look at some of the decisions you need to make about IIS 4.0 logs and walk through the process of creating them.
Log Formats
Table 1, page 77, describes the four types of log files you can use with IIS 4.0: the World Wide Web Consortium (W3C) Extended Log File format, the National Center for Supercomputing Applications (NCSA) Common Log File format, the Microsoft IIS Log File format, and the ODBC Logging format. The W3C Extended Log File format is the most popular because it's flexible. You can select only the properties you need to monitor for a particular Web site. However, the W3C format might also tempt you to log too many items. Remember that the more items you log, the more work IIS 4.0 must do and the more disk or database space the logs will occupy. Figure 1 shows the W3C format's list of extended properties, with the default properties selected.
The W3C format and the other standard format, NCSA Common Log File, are useful when you need to aggregate logs from many Web servers, some of which aren't running IIS. Your organization can run Web servers from various vendors, and if the servers support logs in a common format, any analysis tools that you purchase or create that work with that format can access the logs. For example, Sane Solutions' NetTracker can read log files and import them into a database for use with its analysis tools. You use the ODBC format when you want to log information directly to a database.
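To make the idea of a tool-readable common format concrete, here is a minimal Python sketch of reading a W3C Extended log. It assumes the file carries a #Fields directive that names the logged properties and that entries are separated by spaces. The function name parse_w3c_log and the sample lines are illustrative, not taken from a real server.

```python
from typing import Dict, Iterable, List


def parse_w3c_log(lines: Iterable[str]) -> List[Dict[str, str]]:
    """Parse W3C Extended Log File lines into one dict per entry."""
    fields: List[str] = []
    entries: List[Dict[str, str]] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Directive lines start with '#'; '#Fields:' names the columns
            # that the following entries contain.
            if line.startswith("#Fields:"):
                fields = line[len("#Fields:"):].split()
            continue
        values = line.split()
        if len(values) != len(fields):
            continue  # skip entries that don't match the declared fields
        entries.append(dict(zip(fields, values)))
    return entries


# Illustrative sample only (hypothetical addresses and pages).
SAMPLE_LOG = """\
#Software: Microsoft Internet Information Server 4.0
#Version: 1.0
#Date: 2000-10-13 00:00:12
#Fields: time c-ip cs-method cs-uri-stem sc-status
00:00:12 192.168.0.10 GET /default.htm 200
00:00:15 192.168.0.11 GET /missing.htm 404
"""

entries = parse_w3c_log(SAMPLE_LOG.splitlines())
for entry in entries:
    print(entry["c-ip"], entry["cs-uri-stem"], entry["sc-status"])
```

Because the parser reads the column names from the #Fields directive rather than hard-coding them, the same code should keep working when you select a different set of extended properties.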
The NCSA Common Log File format, the Microsoft IIS Log File format, and the ODBC Logging format use the local IIS 4.0 server's time, but W3C logs use Greenwich Mean Time (GMT). You need to know which time zone your logs use so that you aren't misled about when logged activities and rollovers actually occur. I verified the time zones in Table 1 with my own tests after discovering that the IIS 4.0 documentation isn't reliable on this topic. For example, the documentation says that the Microsoft IIS Log File format rolls over at midnight GMT, but that isn't the case: it actually rolls over at midnight local time.
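If you analyze W3C logs alongside logs that use local time, you will probably want to convert the GMT timestamps first. The following Python sketch shows one way to do that. The function name w3c_gmt_to_local and the UTC-5 offset are assumptions for illustration, and a fixed offset ignores daylight saving time.

```python
from datetime import datetime, timedelta, timezone


def w3c_gmt_to_local(date_text: str, time_text: str,
                     utc_offset_hours: float) -> datetime:
    """Convert a GMT date and time from a W3C log to a fixed local offset."""
    gmt = datetime.strptime(
        f"{date_text} {time_text}", "%Y-%m-%d %H:%M:%S"
    ).replace(tzinfo=timezone.utc)
    # Assumption: the server's zone is a fixed offset from GMT (no DST).
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    return gmt.astimezone(local_zone)


# Hypothetical entry logged at 02:30 GMT on a server five hours behind GMT.
local = w3c_gmt_to_local("2000-10-13", "02:30:00", -5)
print(local.strftime("%Y-%m-%d %H:%M:%S"))
```

In this example, an entry stamped October 13 in GMT belongs to the evening of October 12 in the server's local time, which is exactly the kind of shift that can mislead you when you compare logs in different formats.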
You can set logs to roll over on a daily, weekly, or monthly basis. Alternatively, you can set logs to roll over when they reach a certain size. To configure these settings, you use the log's General Properties page. To display this page, open Internet Service Manager (ISM) 4.0, click the server that contains the Web site on which you want to set log options, right-click the Web site, and select Properties. Click the Web Site tab, and click Properties beside the log format type.
IIS 4.0 names log files according to a particular format that contains a prefix that identifies the log format followed by the log starting date. Table 1 shows the prefixes for each log format. IIS 4.0 uses the longer prefixes for files that are closed by size and the two-character prefixes for files closed by date. For example, IIS 4.0 would give the name nc001013.log to an NCSA log file created according to a daily schedule and started on October 13, 2000. (For more information about log-file names, see "File Names for Logging" in the IIS 4.0 online documentation.)
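The daily naming pattern in the example above (a two-character prefix followed by the two-digit year, month, and day) can be sketched in Python as follows. The function names are hypothetical, and the sketch covers only daily logs; it makes no assumption about the weekly, monthly, or size-based names.

```python
from datetime import date, datetime


def daily_log_name(prefix: str, start: date) -> str:
    """Build a daily log-file name: prefix + yymmdd + '.log'."""
    return f"{prefix}{start:%y%m%d}.log"


def daily_log_start(name: str, prefix_length: int = 2) -> date:
    """Recover the start date from a daily log-file name."""
    stamp = name[prefix_length:prefix_length + 6]
    return datetime.strptime(stamp, "%y%m%d").date()


print(daily_log_name("nc", date(2000, 10, 13)))
print(daily_log_start("nc001013.log"))
```

A helper such as daily_log_start can be handy in a script that needs to pick out the log files for a given date range.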
File vs. Database
Perhaps the most important decision you need to make when setting up IIS 4.0 logging is whether to log to a disk file or a database. The Microsoft article "Internet Information Server Performance Logging to Disk vs. ODBC" (http://support.microsoft.com/support/kb/articles/q142/5/57.asp) discusses the performance implications of this decision. According to this article and to the IIS 4.0 documentation about log files, when logging to a file, IIS 4.0 caches the log data and writes it to disk in 64KB blocks. IIS 4.0 also flushes the cache when you stop the WWW service. You can increase the log buffer for fewer writes or decrease it for more up-to-date information in the log file. To change the cache size, add the LogFileBatchSize entry (with a value that specifies the desired cache size) to the HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\InetInfo\Parameters registry subkey.
When you use ODBC logging, each time an event occurs on IIS 4.0, the Web server writes log data to the SQL Server or other database you specified. This realtime logging causes IIS 4.0 to perform lots of insert operations to the log-file database and can overload the database server or become a bottleneck for IIS 4.0. The Microsoft Internet Information Server Resource Kit says ODBC logging "might not be a good idea for a busy server ... because it slows down the server and does not scale well under a high level of stress."
Thus, Microsoft reference materials seem to suggest that logging to a file performs better than logging to a database, right? Not so fast. I performed a test and found that IIS 4.0 writes log data to a file after each event. Contrary to what the reference materials say, IIS 4.0 doesn't cache data but rather writes data to the file as each event occurs, at least with the default settings. To perform a similar test, set up logging for a Web site, then access one page in the site two or three times. Open the log file, and look at the entries. You'll see an entry for each of your accesses.
(The aforementioned "Internet Information Server Performance Logging to Disk vs. ODBC" Microsoft article reports that problems can occur if you use active log files for realtime reporting. The "Interpreting Log Files" section of the IIS 4.0 online documentation says you can access the current log file only after stopping the WWW service. However, I was able to use Notepad to access an open log file and use Windows Explorer to copy the log file.)
If IIS 4.0 writes log data each time an access occurs, it seems to be doing the same work in a file log operation as in a database log operation. Some databases might be faster than some file systems, and vice versa. You must test to determine whether a file or database works best in your situation. However, I suspect that in general, file operations are faster than database operations.
Another approach to putting log information in a database is to create a batch job that picks up closed log files after log rollovers. The batch job can import the log information into the database during off-hours, reducing the load on IIS 4.0 and the database during peak usage hours. If you use Microsoft SQL Server as your database, the Import Wizard makes imports easy to set up, and log-file imports run quickly.
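As a rough illustration of such a batch job, the following Python sketch loads the entries of a closed W3C log into a database table in one transaction. It uses Python's built-in sqlite3 module purely as a stand-in for SQL Server. The table name web_log, its columns, and the sample entries are all hypothetical, and the sketch assumes the log contains the five default W3C properties.

```python
import sqlite3
from typing import Iterable, Iterator, List, Tuple

Row = Tuple[str, str, str, str, int]
WANTED = ("time", "c-ip", "cs-method", "cs-uri-stem", "sc-status")


def rows_from_w3c(lines: Iterable[str]) -> Iterator[Row]:
    """Yield one database row per W3C log entry that has the wanted fields."""
    fields: List[str] = []
    for raw in lines:
        line = raw.strip()
        if line.startswith("#Fields:"):
            fields = line[len("#Fields:"):].split()
        elif line and not line.startswith("#"):
            values = line.split()
            if len(values) != len(fields):
                continue  # skip entries that don't match the declared fields
            entry = dict(zip(fields, values))
            if all(name in entry for name in WANTED):
                yield (entry["time"], entry["c-ip"], entry["cs-method"],
                       entry["cs-uri-stem"], int(entry["sc-status"]))


def import_log(conn: sqlite3.Connection, lines: Iterable[str]) -> int:
    """Insert all entries in a single transaction; return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS web_log ("
        "log_time TEXT, client_ip TEXT, method TEXT, "
        "uri_stem TEXT, status INTEGER)"
    )
    rows = list(rows_from_w3c(lines))
    with conn:  # commits on success, rolls back if an insert fails
        conn.executemany("INSERT INTO web_log VALUES (?, ?, ?, ?, ?)", rows)
    return len(rows)


# Hypothetical contents of a closed log file.
closed_log = [
    "#Fields: time c-ip cs-method cs-uri-stem sc-status",
    "00:00:12 192.168.0.10 GET /default.htm 200",
    "00:00:15 192.168.0.11 GET /missing.htm 404",
]

conn = sqlite3.connect(":memory:")  # stand-in for the real log database
imported = import_log(conn, closed_log)
not_found = conn.execute(
    "SELECT COUNT(*) FROM web_log WHERE status = 404"
).fetchone()[0]
print(imported, not_found)
```

Batching the inserts into one transaction during off-hours is the point of the approach: the database sees one bulk load per closed file instead of one insert per Web request.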
In addition to producing logs that you must store somewhere, logging causes overhead for IIS 4.0 and for a disk system, whether you use file logging or database logging. The more data that you capture, the more work that IIS 4.0 and the file system must do. Choose your logging types and methods carefully to make sure you aren't logging more data than you need.
Logging to a File
By default, IIS 4.0 enables W3C logging and tracks the Time, Client IP Address, Method, URI Stem, and Http Status properties. To change logging settings for a Web site, perform the following steps:
- Open ISM.
- Click the server that contains the Web site on which you want to enable logging.
- Right-click the Web site, and select Properties.
- Click the Web Site tab.
- On the Web Site tab, which Figure 2, page 80, shows, ensure that the Enable Logging check box is selected.
- Select a log-file format from the Active log format list.
- Click Properties next to the Active log format list.
- Change the property settings for the log, then return to the Web Site tab.