Is your IIS machine just hanging around?
Last month, I showed you what can happen to IIS if it crashes. This month, I look at a more common and sometimes more difficult problem: IIS hanging. The call you receive might go like this:
CUSTOMER: "I can't get to your Web site. My browser times out."
ADMINISTRATOR: "Do you see any errors?"
CUSTOMER: "No. My browser just waits and waits."
ADMINISTRATOR: "The Web server seems to be running fine, and no errors appear on the server."
You just experienced one symptom of an IIS hang. I classify hangs into two categories: deadlocks and blocking. I show you both types and walk you through the process for solving each type. (The dumps I show you are from real-world scenarios.)
What's a Hang?
A hang is any situation in which IIS or a subset of IIS stops responding to input but doesn't crash or stop running. The two types of hangs are deadlocks and blocking. Because IIS is a multithreaded application, it can suffer when two or more threads are waiting on each other to complete some work. Picture this: Two trains come to a railroad crossing at the same time, but neither can go through until the other has passed. You now have a deadlock. In IIS, the trains are two threads, each holding a resource that the other needs. Other threads need resources that the two deadlocked threads have tied up, so those other threads stop responding and, eventually, all of IIS hangs.
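The railroad-crossing deadlock can be sketched in a few lines of Python. This is only an illustration of the pattern, not IIS code: it uses Python's threading locks rather than NT critical sections, and the names (demonstrate_deadlock, track_a, track_b, train) are made up for the example. A real deadlock waits forever; the timeout here is an assumption added only so that the sketch finishes and can report what happened.

```python
import threading

def demonstrate_deadlock(timeout=0.2):
    """Two threads take two locks in opposite order; report who got the second lock."""
    track_a = threading.Lock()
    track_b = threading.Lock()
    both_holding = threading.Barrier(2)  # both threads now hold their first lock
    both_tried = threading.Barrier(2)    # both threads have finished their attempt
    results = {}

    def train(name, first, second):
        with first:                       # hold the first lock for the whole run
            both_holding.wait()
            # A real deadlock would block here forever; the timeout is only
            # there so the sketch can end and record the failure.
            got_second = second.acquire(timeout=timeout)
            results[name] = got_second
            both_tried.wait()             # do not release until the other has tried
            if got_second:
                second.release()

    t1 = threading.Thread(target=train, args=("train 1", track_a, track_b))
    t2 = threading.Thread(target=train, args=("train 2", track_b, track_a))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return results

if __name__ == "__main__":
    print(demonstrate_deadlock())
```

Each thread holds one lock and waits for the other, so neither attempt on the second lock succeeds. Taking the locks in the same order in both threads is the usual way to avoid this pattern.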
Now consider a steam train coming into a train station. The train needs to get water. Unfortunately, the people responsible for keeping the water tower filled didn't do their job, so no water is available. The locomotive is stuck at the station waiting for water to come. You now have blocking. In this case, an external process or request (the water tower) has held up the internal thread (the train).
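The water-tower case can be sketched the same way. In this hypothetical Python example, the Event stands in for the external resource (the water), and the names wait_for_water and demonstrate_blocking are invented for illustration. As before, the timeout is an assumption that lets the sketch end; a blocked IIS thread would simply keep waiting.

```python
import threading

def wait_for_water(water_available, timeout):
    """Return True if the external resource showed up, False if the wait timed out."""
    return water_available.wait(timeout)

def demonstrate_blocking(fill_tower, timeout=0.2):
    """Run one worker thread that depends on an external resource."""
    water_available = threading.Event()
    if fill_tower:
        water_available.set()   # the external party did its job
    outcome = []
    worker = threading.Thread(
        target=lambda: outcome.append(wait_for_water(water_available, timeout))
    )
    worker.start()
    worker.join()
    return outcome[0]

if __name__ == "__main__":
    print("tower filled:", demonstrate_blocking(fill_tower=True))
    print("tower empty: ", demonstrate_blocking(fill_tower=False))
```

Unlike the deadlock, nothing inside the process is holding the thread up; it proceeds as soon as the outside resource becomes available.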
Deadlocks
In "Diagnosing Problems in IIS," May 2001, I introduced you to Windows NT Synchronization objects. These objects let IIS work efficiently. However, they also let hangs occur. In my first scenario, I show you a deadlock that results from the improper use of Synchronization objects in a piece of code. Deadlocks are caused by locks held on critical sections of code. I use a command in WinDbg called !locks to see what's happening.
First, load the dead_locked.dmp file in WinDbg. (You can download dead_locked.dmp from the Code Library on the IIS Administrator Web site at http://www.iisadministrator.com.) Select Open Crash Dump from WinDbg's File menu to load dead_locked.dmp. Figure 1 shows WinDbg's Command and Disassembly windows with dead_locked.dmp open. Note that I took the dump from a machine running NT 4.0 Service Pack 5 (SP5), so you need to set your symbol path to point to the proper symbols. To download these symbols, go to http://www.microsoft.com/ntserver/nts/downloads/recommended/sp5/debug/default.asp. (The encryption level is unimportant for this example.) Note that the download file is more than 50MB and expands to more than 200MB, so be sure that you have the disk space. For information about symbols and how to install them, see "Understanding and Using Symbols" (June 2001).
After your symbols are in place, type
!locks
at the command prompt (i.e., the bottom pane that's preceded by 0:000>) in WinDbg. Figure 2 shows the output from running this command against dead_locked.dmp. (Note that I've deleted all but the two items you need to look at.) For easier viewing, I recommend that you paste the output of the !locks command into Microsoft Notepad.
When you go through this output, you first want to look at the LockCount value. If this value is 0, you can ignore the lock because no thread is currently using the Critical Section. When you've found all the locks that have a LockCount value of 1 or higher, you must examine the threads that own the locks. To find the thread ID, find the OwningThread value. Then, use the ~* command, which Figure 3 shows, to list all threads and their IDs.
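If you paste the output into a text file, a short script can do the filtering for you. The following Python sketch is one possible approach, not part of WinDbg. It assumes that each critical section starts with a line beginning with CritSec, that LockCount is a plain number, and that the thread list shows IDs as process.thread after "id:". The sample text is invented for illustration and is not the content of Figures 2 or 3, and the function names are hypothetical.

```python
import re

# Invented sample text in the assumed layout; NOT taken from dead_locked.dmp.
SAMPLE_LOCKS = """
CritSec example!g_LockA+0 at 00401000
LockCount          1
RecursionCount     1
OwningThread       1a4

CritSec example!g_LockB+0 at 00401020
LockCount          1
RecursionCount     1
OwningThread       2b0

CritSec example!g_LockC+0 at 00401040
LockCount          0
RecursionCount     0
OwningThread       0
"""

SAMPLE_THREADS = """
.  0  id: 4a8.1a4   Suspend: 1 Teb 7ffde000 Unfrozen
   1  id: 4a8.2b0   Suspend: 1 Teb 7ffdd000 Unfrozen
"""

def parse_locks(text):
    """Split pasted lock output into one dict per critical section."""
    sections = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("CritSec"):
            current = {"name": line}
            sections.append(current)
        elif current is not None:
            m = re.match(r"(LockCount|OwningThread)\s+(\S+)", line)
            if m:
                current[m.group(1)] = m.group(2)
    return sections

def held_locks(sections):
    """Keep only sections whose LockCount is 1 or higher."""
    held = []
    for s in sections:
        count = s.get("LockCount", "0")
        if count.isdigit() and int(count) >= 1:
            held.append(s)
    return held

def thread_numbers(text):
    """Map each thread ID (lowercase hex) to its debugger thread number."""
    mapping = {}
    for line in text.splitlines():
        m = re.search(r"(\d+)\s+[Ii]d:\s*[0-9a-fA-F]+\.([0-9a-fA-F]+)", line)
        if m:
            mapping[m.group(2).lower()] = int(m.group(1))
    return mapping

if __name__ == "__main__":
    numbers = thread_numbers(SAMPLE_THREADS)
    for lock in held_locks(parse_locks(SAMPLE_LOCKS)):
        owner = lock["OwningThread"].lower()
        print(lock["name"], "-> thread ID", owner, "= thread", numbers.get(owner))
```

The script only narrows the list to the locks that are held and the threads that own them; you still need to examine those threads in the debugger.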