May 01, 1999 12:00 AM

Using Microsoft's x86 Kernel Debugger

Windows IT Pro
InstantDoc ID #5131
Troubleshooting your crashed NT systems

Both experienced support professionals and new Windows NT users commonly regard Microsoft's command-line x86 kernel debugger as a black art. When workstations or servers suffer a failure and display a blue screen, they generate a crash dump. Unfortunately, many users ignore or delete these crash dumps. However, with some basic preparation and knowledge, you can use the kernel debugger to extract valuable information about the system's state at the time of failure. You can then correlate this information with the installed hardware, software, and other system parameters to help formulate a strategy for troubleshooting the system.

Although a full treatment of the kernel debugger and debugging tactics might fill several books, setting up the kernel debugger to debug a crash dump is not difficult. This article explains the process step by step and presents several specific command examples that demonstrate how you can use the debugger to extract useful information from crash dumps. I've also included references to existing literature on kernel debugging for further research.

Kernel Debugging Basics
Debuggers let you inspect and troubleshoot program code as it runs. You can examine variables, registers, and stacks, and pinpoint problems by stepping line by line through a program. Some debuggers support source-level debugging by matching the developer's source code (written in C, Basic, or another high-level language) with the corresponding machine instructions. This level of detail shows how the compiler translated each line of source code and the exact effect of that code on the system. Other debuggers support only direct machine instruction or assembly language debugging. You typically use kernel debuggers to debug core OS components and drivers, and user-mode debuggers to debug applications and services.

In a live debug session, a serial cable connects the target machine that you want to debug with a host machine that runs the debugger. Debug code running on both machines exchanges commands and data via the serial ports. In a crash dump debug session, you work offline, after the failure has occurred, and analyze the crash dump file, which represents the complete contents of memory at the time of the crash.

Several debuggers are available for NT from Microsoft and third-party software vendors such as Compuware NuMega. Two well-known debuggers from Microsoft for use on the x86 platform are i386kd.exe and windbg.exe. I386kd.exe (available in the \support\debug\i386 directory of the NT 4.0 CD-ROM) is the command-line kernel debugger for x86 code. Windbg.exe (available separately from Microsoft) is the GUI counterpart of i386kd.exe and can perform both kernel-mode and user-mode debugging. Each debugger executable interprets the register, stack, and instruction information for a particular processor architecture. For Alpha code, alphakd.exe is the equivalent of i386kd.exe. (This article uses the term kernel debugger to mean the x86 i386kd.exe from NT 4.0.)

Microsoft and third-party software vendors sometimes request that customers submit crash dumps as compressed files for diagnostic purposes. In some instances, vendors request permission to dial in to a customer's site and engage in a live debug session. The vendors typically perform these sessions using i386kd.exe because, as a command-line tool, its input and output are easy to pipe from the host machine to a remote user via the remote.exe utility, which is available in the Microsoft Windows NT Server 4.0 Resource Kit. Even if you never debug your own crash dumps, setting up the symbolic information ahead of time will speed up this debugging process.
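As a rough illustration of preparing a dump for submission, the sketch below packs a crash dump into a compressed archive using Python's standard zipfile module. The function name and the memory.zip naming are hypothetical, and a vendor may well ask for a different archive format, so treat this as one possible approach rather than a required procedure.

```python
import zipfile
from pathlib import Path


def compress_dump(dump_path, archive_path=None):
    """Pack a crash dump into a deflate-compressed .zip and return the archive path."""
    dump = Path(dump_path)
    if not dump.is_file():
        raise FileNotFoundError(f"no crash dump at {dump}")
    # Hypothetical naming: memory.dmp becomes memory.zip in the same directory.
    archive = Path(archive_path) if archive_path else dump.with_suffix(".zip")
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED,
                         allowZip64=True) as zf:
        # Store only the file name so the archive does not expose local paths.
        zf.write(dump, arcname=dump.name)
    return archive
```

Crash dumps often contain long runs of identical bytes, so they tend to shrink considerably, but the actual ratio depends on what was in memory at the time of the crash.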

Blue Screens and Crash Dumps
The blue screen of death is something every experienced NT support professional has seen. The sidebar, "Windows NT Kernel Debugging Resources," lists resources that explain why blue screens happen and how to interpret them. As a refresher, the blue screen indicates that the OS encountered an abnormal situation that it couldn't handle through its normal error mechanisms and therefore couldn't guarantee continued safe processing. Rather than risk corrupting data, the OS and device drivers call a special internal function, KeBugCheckEx(), to halt the system. After taking control of the system and placing the display into VGA 80x50 mode, this function generates all the information seen on the blue screen, such as the stop code and its parameters, driver addresses, and stack data.

The system also generates a crash dump, but only if you select the Write debugging information option on the Startup/Shutdown tab of the System applet in the NT 4.0 Control Panel, as Screen 1 shows. Assuming the paging file is large enough to hold the contents of physical memory, the OS writes the contents of memory into the paging file and marks the location with a special code. Upon reboot, the savedump.exe utility copies this part of the paging file to the filename specified, usually %systemroot%\memory.dmp.
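The stop code and its four parameters are worth recording before you reboot, because they are the values passed to KeBugCheckEx(). The following is a minimal sketch of pulling them out of a transcribed blue-screen line. It assumes the line has the form `*** STOP: 0x... (0x...,0x...,0x...,0x...)`; the sample values are made up for illustration, and `parse_stop_line` is a hypothetical helper, not part of any Microsoft tool.

```python
import re

# Assumed layout of the first blue-screen line, for example:
# *** STOP: 0x0000000A (0x00000000,0x00000002,0x00000000,0x8013A2B4)
STOP_LINE = re.compile(
    r"\*\*\*\s*STOP:\s*(0x[0-9A-Fa-f]{1,8})\s*"
    r"\(\s*(0x[0-9A-Fa-f]{1,8})\s*,\s*(0x[0-9A-Fa-f]{1,8})\s*,"
    r"\s*(0x[0-9A-Fa-f]{1,8})\s*,\s*(0x[0-9A-Fa-f]{1,8})\s*\)"
)


def parse_stop_line(text):
    """Return (stop_code, [p1, p2, p3, p4]) as integers, or None if no STOP line is found."""
    match = STOP_LINE.search(text)
    if match is None:
        return None
    values = [int(group, 16) for group in match.groups()]
    return values[0], values[1:]


# Illustrative values only; they are not taken from a real crash.
sample = "*** STOP: 0x0000000A (0x00000000,0x00000002,0x00000000,0x8013A2B4)"
code, params = parse_stop_line(sample)
print(f"stop code {code:#010x}, parameters {[f'{p:#010x}' for p in params]}")
```

Keeping the values as integers rather than strings makes it easier to compare them later with addresses that the debugger reports.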

After the OS finishes writing the contents of memory to disk, it displays a message to that effect, and you can restart the system to restore operation and access the crash dump. I suggest you move the memory.dmp file from the crashed system to removable storage or another location on the network, so that you can analyze it on another machine and a later crash can't replace it.
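One way to script that housekeeping step is sketched below. It assumes that a complete 32-bit dump file begins with the ASCII bytes PAGEDUMP and uses that as a cheap sanity check before copying the file under a host-and-timestamp name. The function names, the naming scheme, and the destination layout are all hypothetical. The check is far weaker than what a dedicated dump-verification utility performs.

```python
import shutil
import time
from pathlib import Path

# Assumption: a complete 32-bit crash dump starts with these 8 ASCII bytes.
DUMP_SIGNATURE = b"PAGEDUMP"


def looks_like_crash_dump(path):
    """Cheap sanity check: compare the first 8 bytes with the assumed signature."""
    with open(path, "rb") as handle:
        return handle.read(len(DUMP_SIGNATURE)) == DUMP_SIGNATURE


def archive_dump(dump_path, dest_dir, host):
    """Copy the dump to dest_dir as <host>-<YYYYMMDD-HHMMSS>.dmp and return the new path."""
    dump = Path(dump_path)
    if not looks_like_crash_dump(dump):
        raise ValueError(f"{dump} does not start with {DUMP_SIGNATURE!r}")
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # Name the copy after the dump's modification time so dumps from
    # different crashes do not collide in the archive directory.
    stamp = time.strftime("%Y%m%d-%H%M%S", time.localtime(dump.stat().st_mtime))
    target = dest / f"{host}-{stamp}.dmp"
    shutil.copy2(dump, target)
    return target
```

The sketch copies rather than moves, so the original stays in place until you have confirmed that the copy is intact. Delete the original afterward if you want the effect of a move.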

Comments
  • Namrata
    8 years ago
    May 12, 2004

    This is a very helpful article but I would like to know more about the kernel debugger. I am trying to learn windows CE.NET4.2 and I am unable to understand what "debugging the OS on the Emulator using the kernel debugger "means? I would really appreciate it if some 1 helps asap. Thank you.

  • Ian Robertson
    8 years ago
    May 10, 2004

    Under Windows XP on an HP Pavilion 722a I have had numerous occurrences of being taken to the blue screen while playing Freecell. Once the dump has finished (where does it go?) it attempts a reboot and reports a boot disk failure. After powering down and up again the machine normally boots up OK.

    However today I got a different message "Unable to load OS" and I ended up having to restore my C drive. No more Freecell for me!!!

  • CyreZ
    8 years ago
    Apr 28, 2004

    I tried this one out, very good article..but when i load all services its says "Unable to verify timestamp for every file" also for ntoskrnl.exe, i cant do anything, the system always write a dump and i cant get anything out of it :\\

  • Ahmed zewail
    8 years ago
    Apr 18, 2004

    I have windows2000 advanced server and I have more problem with him such as dumping memory and event viewer cosist of more error message like this"This Machine is a PDC of the domain at the root of the forest. Configure to sync from External time source using the net command, 'net time /setsntp:

  • Dave White
    8 years ago
    Mar 04, 2004

    Great article! It took me a while to learn how to use the debugger and set up the dump file, but I figured out the exact thing that was causing my terminal server to crash several times a day! (And I got a good laugh out of it, because it turned out that the FREECELL game was incompatible with WinNT 4.0 Terminal Services Service Pack 6, for some reason. Whoever was playing it will be really embarrassed when they open the program I substituted for Freecell, which just pops up a message saying that it has been disabled because it brings down the remote application server for the whooole company...)

    Dave

Windows is a trademark of the Microsoft group of companies. Windows IT Pro is used by Penton Media Inc. under license from owner.