After you gain access to the original installation's Registry hive files, you're ready to begin offline editing. Although you're probably familiar with NT's Registry editors, you might not know that you can use them to open Registry hive files on other NT installations or alternate Registry sets from the same installation. To edit Registry hive files offline, open regedt32.exe (regedit.exe doesn't support loading native Registry hive files offline), highlight the HKEY_LOCAL_MACHINE root key, and select the Load Hive option in the Registry menu to locate the hive file you want to bring into the Registry editor. In this case, you want to change a service or driver's startup type, and NT stores this information in the SYSTEM hive. After you locate and select the file, the system will prompt you to provide a key name for the hive file contents, as Screen 2 shows. This activity doesn't modify the original hive file's name, nor does it permanently affect the Registry of the local installation you're booted under. In addition, the name you choose doesn't matter because the Registry editor will use the name only as a temporary Registry branch that contains the data of the original Registry hive file. After you provide a key name, it will appear in the HKEY_LOCAL_MACHINE window.
At this point, you're editing the SYSTEM hive from your original NT installation, and you can resolve your startup failure. As with any Registry editing session, back up the hive file you're working with before you edit. When you open your new key, SYSTEM2 in my example, the display is slightly different from what you usually see under the SYSTEM key. Most notably, the only ControlSet subkeys available are ControlSetxxx keys, where xxx is a number such as 001. The display doesn't exhibit the CurrentControlSet subkey that you usually see when editing the live Registry of a local machine. The display doesn't show CurrentControlSet because it's an alias for the control set that loaded when NT booted.
To ensure you're editing the correct control set and not the default control set of the parallel NT installation, choose the Select subkey under your newly created key. The right pane of the Registry editor will display several values, as Screen 3 shows. NT uses these values and their data to determine which control set is the default set loaded at startup, which set is the CurrentControlSet, which set represents the Last Known Good configuration, and which set has failed to boot successfully. In Screen 3, the Current value tells you the last control set NT used during startup; this is the control set that NT exposes as CurrentControlSet. In most cases, this value matches the Default value. In my example, the data contained in Current is 0x2, which tells you that ControlSet002 is the set you want to edit. After you locate the correct control set, you can modify your service or driver startup state.
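The mapping from a Select value to a control set name can be sketched in a few lines of Python. This is only an illustration of the naming rule described above, not a tool that reads the Registry: the sample data is hypothetical (only Current = 0x2 comes from my example), and treating a value of 0 as "no control set" is an assumption.

```python
from typing import Dict, Optional


def control_set_name(number: int) -> Optional[str]:
    """Translate a Select value's data (e.g. 0x2) into a subkey name.

    Assumption: a value of 0 means that no control set is recorded for
    that entry (for example, no set has failed to boot).
    """
    if number == 0:
        return None
    if not 0 < number <= 999:
        raise ValueError(f"control set number out of range: {number}")
    # Control sets are named ControlSet plus a three-digit number.
    return f"ControlSet{number:03d}"


def describe_select(select: Dict[str, int]) -> Dict[str, Optional[str]]:
    """Map each value under the Select subkey to a control set name."""
    return {name: control_set_name(data) for name, data in select.items()}


# Hypothetical data as it might appear under SYSTEM2\Select.
# Only Current = 0x2 is taken from the article's example.
sample_select = {
    "Current": 0x2,
    "Default": 0x2,
    "Failed": 0x0,
    "LastKnownGood": 0x1,
}

if __name__ == "__main__":
    for name, control_set in describe_select(sample_select).items():
        print(f"{name}: {control_set}")
```

With the sample data, the Current entry resolves to ControlSet002, which is the set you would edit in the loaded hive.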
The Registry entries related to your original NT installation's services and drivers are under the HKEY_LOCAL_MACHINE\SYSTEM2\ControlSet00x\Services\name of suspect service or device driver Registry key. In this key, SYSTEM2 refers to the subkey in my example, ControlSet00x reflects the control set you previously determined, and name of suspect service or device driver is the name of the service or device driver that you suspect is causing your problem. Each service and driver that the Services subkey lists stores several values under its own subkey, including a Start value of type REG_DWORD. This value's data determines the startup state of that service or device driver. Setting the Start value to 0x4 disables a service or driver and prevents NT from attempting to start it during the boot process. Table 1, page 88, lists the possible Start values for services and device drivers. After you finish editing your Registry offline, you must unload the imported hive file. To do so, highlight the key name you assigned to the hive and select Unload Hive from the Registry menu.
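The key path and Start values described above can be sketched as a small Python helper. Table 1 isn't reproduced in this section, so the mapping below uses the commonly documented Start values (0 through 4) as background, with descriptions in my own wording. The service name ExampleDrv is hypothetical, and the code only builds strings; it does not touch the Registry.

```python
# Commonly documented Start values for services and device drivers.
# Assumption: these match Table 1, which is not shown in this section.
START_TYPES = {
    0x0: "Boot (loaded by the boot loader)",
    0x1: "System (loaded during kernel initialization)",
    0x2: "Automatic (started at system startup)",
    0x3: "Manual (started on demand)",
    0x4: "Disabled (never started)",
}

DISABLED = 0x4


def service_key_path(hive_key: str, control_set: str, service: str) -> str:
    """Build the path to a service's key inside a loaded hive.

    hive_key is the temporary name given in the Load Hive dialog
    (SYSTEM2 in the article's example).
    """
    return "\\".join(
        ["HKEY_LOCAL_MACHINE", hive_key, control_set, "Services", service]
    )


def describe_start(value: int) -> str:
    """Return a readable description of a Start value's data."""
    return START_TYPES.get(value, f"Unknown start type: {value:#x}")


if __name__ == "__main__":
    # ExampleDrv is a hypothetical driver name used for illustration.
    path = service_key_path("SYSTEM2", "ControlSet002", "ExampleDrv")
    print(path)
    print("Start =", hex(DISABLED), "->", describe_start(DISABLED))
```

Keeping the hive key and control set as separate parameters mirrors the two decisions you make by hand: the name you typed when loading the hive, and the control set you identified from the Select subkey.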
Now that you can disable services and device drivers in your original installation, you can successfully disable the offending element that is preventing NT from booting successfully. Determining which service or driver is the culprit might take experimentation, but you can use the events that lead up to the problem and information that the STOP error screen provides to help isolate and disable the problematic component.
Recovery Software
A discussion of NT system recovery isn't complete without mentioning third-party utilities that can assist you in this process. Winternals Software's ERD Commander and Remote Recover, and Systems Internals' NTRecover are excellent products from the premier makers of NT recovery software. Although each of these utilities can help you recover a damaged NT system, they differ in their methodologies and strengths. For example, NTRecover, Systems Internals' original NT recovery utility, lets you access the hard disk of an unstable NT system by connecting a serial cable between the damaged system and a working NT system. After they're connected, you can use NTRecover to copy and delete files, or run Chkdsk or virus scan utilities on the remote disk. In most situations, NTRecover provides all the functionality required to successfully recover an unbootable system.
ERD Commander is a dream come true for NT administrators who long for the days of booting DOS disks to recover wayward DOS and Windows 95 installations. This command-line-based utility boots from a 3.5" disk and can read and write to NTFS volumes. Screen 4 shows ERD Commander's interface. The Professional Edition of this utility includes several enhanced features, such as support for fault-tolerant disk sets (i.e., disk sets using NT's ftdisk.sys driver), the ability to run Chkdsk, password recovery, support for FAT32 volumes, support for the Expand utility, and command-line options that let you selectively control or disable the startup state of services and drivers.
Remote Recover is the newest Winternals recovery utility. It provides a custom boot disk that includes Network Driver Interface Specification (NDIS) 2 driver support, which lets you access an unstartable system's NTFS volumes over the network. With this access, Remote Recover can perform recovery functions similar to those of NTRecover and ERD Commander.
Don't Rule Out Hardware
Making assumptions about server disaster recovery is dangerous. For example, when you're dealing with a blue-screened NT installation, assuming that the problem is software-related is easy. However, defective hardware or hardware-related events (e.g., a failing hard disk or disk controller, bad main memory or cache RAM, overly aggressive BIOS performance settings) might be the culprit. Hardware-related blue screens sometimes masquerade as software-related failures because the STOP codes they display don't indicate hardware as the problem's source.
Hardware-related problems are especially suspect if you have recently changed hardware or a power-related event has occurred (e.g., a full outage or series of voltage sags or spikes). For example, suppose you installed a new fax board in your server last week, and the fax board worked fine during your testing. However, a week later, the server blue screens and the STOP error message doesn't point you to a particular service, driver, or hardware component. The malefactor might be a hardware-related problem with the fax board or the interaction with its driver that occurs only under a heavy traffic load. In a situation like this one, assuming that NT has become damaged is easy. However, if you're fighting a hardware battle with software weapons (e.g., restoring the Registry, reinstalling NT), you might end up chasing your tail for a long time.
Be Proactive
In part 1 and in this article, I've shown you advanced techniques and utilities that you can employ in emergency situations in which an NT system refuses to boot. More important, I've discussed proactive measures that you can take now to increase your chances of performing a successful system recovery as well as reduce the amount of time that a recovery operation will take. Microsoft's documentation covers traditional recovery techniques, such as using NT Setup Repair and restoring the Last Known Good configuration, but these measures often prove insufficient. If you perform proactive disaster preparation measures, you might never have to use Microsoft's recovery techniques.