October 15, 2001 12:00 AM

Heap Corruption, Part 1

Windows IT Pro
InstantDoc ID #22275

Debugging in the Real World
Returning to the example, because Thread 1 is finished, Windows turns control back over to Thread 2. Remember that Thread 2 has set up three character variables and has put values in the first two. Now Thread 2 does its math.

Figure 7, page 14, shows the pseudocode for this math. What do you think the result of this pseudocode will be? If you guessed a divide-by-zero exception, you're correct. Thread 2 reads the number stored in the heap location that the variable a points to, and that number is now 0 because Thread 1 wrote a 0 there. Here's the problem: If you attach a debugger to the system when this code executes, the debugger will break on the exception and produce a stack backtrace that points to the division statement in Thread 2 as the problematic code.
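Figure 7 isn't reproduced here, so the following is only a rough sketch of what Thread 2's math might look like in C. It assumes the division is b divided by the value that a points to; the function name checked_divide and the defensive check are hypothetical additions, not part of the original example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for Thread 2's math: *result = b / *a.
 * Returns false instead of faulting when the divisor has been
 * overwritten with 0 (for example, by another thread's stray write). */
bool checked_divide(const char *a, char b, char *result)
{
    assert(a != NULL && result != NULL);

    if (*a == 0) {
        return false;   /* the symptom shows up here, not at the bad write */
    }
    *result = (char)(b / *a);
    return true;
}
```

Note that a check like this only catches the symptom. The write that zeroed the divisor happened elsewhere, which is exactly why the backtrace points at the victim.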

Now, because you've been following every step of both threads as the program ran, you know that Thread 2 is simply a victim in this case. In the real debugging world, you have no such information. As a matter of fact, if you look at the process you're debugging, Thread 1 isn't even running (remember that it has already finished). The thread that ran the real problem code no longer exists, so it leaves no stack for you to inspect. Furthermore, this type of problem is sporadic because it results from a very specific set of circumstances. Often, when you work with corruption that involves multiple threads such as in this example, you see different symptoms each time (i.e., the corruption affects different routines). In this example, you saw a divide-by-zero exception in the math division routine. However, in another crash, an Access Violation error could occur in a string-copy routine, or you might not see a crash at all. This unpredictability is one of the big problems with trying to debug heap corruption.

Also, a long delay might take place before any noticeable problem occurs. In my example, the problem didn't show itself until long after the problematic code had actually executed. For this reason, you can't always assume that the faulting stack is actually pointing to the incorrect code. In addition, heap corruption doesn't always involve multiple threads. Often, a thread trashes its own data.

You might ask why Windows (more specifically, ntdll.dll, which controls memory) lets this corruption occur. The reason is that the heap isn't policed. Ntdll.dll keeps a record of the memory it has given out so that it knows what memory it has left, but that's all this DLL does. Each program's code is responsible for staying within the memory it requested. The code in the example made one crucial mistake. The code based its memory request on the size of the incoming string, which is a common practice. However, the code forgot to add one extra byte for the null terminator, so writing the terminator overran the allocation by one byte. To help catch this kind of problem, you can use a tool called PageHeap.
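As a minimal sketch of the fix for that mistake, the following C code sizes the request as the string length plus one byte for the null terminator. The helper names bytes_needed and copy_string are hypothetical and used only for illustration; the example's actual code isn't shown in this article.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Bytes a heap copy of s needs: every character plus the '\0'. */
size_t bytes_needed(const char *s)
{
    assert(s != NULL);
    return strlen(s) + 1;   /* requesting only strlen(s) is the off-by-one bug */
}

/* Hypothetical helper: returns a heap copy of src, or NULL if the
 * allocation fails. The caller releases the copy with free(). */
char *copy_string(const char *src)
{
    size_t n = bytes_needed(src);
    char *dst = malloc(n);

    if (dst == NULL) {
        return NULL;
    }
    memcpy(dst, src, n);    /* n includes the terminator, so it is copied too */
    return dst;
}
```

With only strlen(src) bytes requested, the terminator would land one byte past the end of the block, in memory that might belong to another allocation.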

PageHeap
To understand how PageHeap works, let's dive a little more deeply into Windows' memory manager. (I'll discuss PageHeap in more detail next month.) Allocating and committing memory in Windows can take a relatively long time (compared with using memory). Therefore, Windows creates heaps of memory and goes through the allocation and commitment of memory for each heap in one lump sum. Windows then has the memory ready to dole out as requests come in.

You can think of this process like requesting that a tanker truck full of gasoline be brought from the refinery to the gas station. Then, a driver can fill up his or her car from the station without waiting. Otherwise, each time a driver wanted gas, he or she would have to wait for that tank of gas to be brought from the refinery.

Windows increases the size of the heaps as necessary. The default size for the heap in a Windows application is 1MB. When Windows has given out half of that 1MB, the OS then doubles the heap. When Windows has given out half of that doubled amount, the OS will double the heap again. In this way, Windows is always ready to hand out memory (unless you run out of memory, which is a topic I'll cover in a future article).
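The growth rule described above can be modeled in a few lines of C. This is only an illustration of the arithmetic as the article states it (start at 1MB, double whenever half of the current size has been given out), not a description of how ntdll.dll is actually implemented. The names INITIAL_HEAP_BYTES and heap_size_after are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define INITIAL_HEAP_BYTES ((size_t)1024 * 1024)   /* 1MB starting size */

/* Returns the modeled heap size once `allocated` bytes have been handed
 * out, doubling each time at least half of the current size is in use. */
size_t heap_size_after(size_t allocated)
{
    size_t size = INITIAL_HEAP_BYTES;

    while (allocated >= size / 2) {
        assert(size <= (size_t)-1 / 2);   /* guard against overflow */
        size *= 2;
    }
    return size;
}
```

Under this model, handing out 512KB grows the heap to 2MB, and handing out 1MB grows it to 4MB.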

When ntdll.dll doles out memory during a typical session with heaps, it gives a request the next available memory space following the previous request's allocation. However, when you use PageHeap, which is built into ntdll.dll, that behavior changes. Ntdll.dll surrounds each piece of memory that it hands out with unused memory, both before and after the allocation, and marks this unused memory as "no access." Marking the memory in this way tells Windows that applications can't use that memory for any reason. Any read or write operation that strays outside the requested area into this guarded memory causes an Access Violation error, and a debugger can stop the program at the moment the error occurs. (In the example, the Access Violation error would occur as soon as the code wrote the null terminator character.)
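To make the layout concrete, here is a small sketch of the arithmetic behind a guarded allocation. It assumes a 4096-byte page, one "no access" page before and one after the data pages, and a block pushed to the end of its data pages so that a one-byte overrun lands in the trailing guard page. These details, along with the names GUARD_PAGE_BYTES, guarded_layout, and plan_guarded_block, are assumptions for illustration. The sketch doesn't call any Windows API and ignores the alignment rules a real allocator must follow.

```c
#include <assert.h>
#include <stddef.h>

#define GUARD_PAGE_BYTES ((size_t)4096)   /* assumed page size */

typedef struct {
    size_t total_pages;   /* leading guard + data pages + trailing guard */
    size_t user_offset;   /* where the caller's block starts in the region */
} guarded_layout;

/* Plans a region for `request` bytes with a guard page on each side. */
guarded_layout plan_guarded_block(size_t request)
{
    guarded_layout layout;
    size_t data_pages;

    assert(request > 0);

    /* Round the request up to whole pages. */
    data_pages = (request + GUARD_PAGE_BYTES - 1) / GUARD_PAGE_BYTES;
    layout.total_pages = data_pages + 2;

    /* Skip the leading guard page, then place the block so that its
     * last byte sits immediately before the trailing guard page. */
    layout.user_offset =
        GUARD_PAGE_BYTES + data_pages * GUARD_PAGE_BYTES - request;
    return layout;
}
```

For a 5-byte request, this plan uses three pages and starts the block at offset 8187, so the byte at offset 8192 (the first byte past the block) is the first byte of the trailing guard page.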

Next Month
The topic of heap corruption is too broad to cover in one article. Next month, I'll discuss the specifics of using PageHeap and familiarize you with the tool's requirements, pitfalls, and benefits. I'll also present other considerations about troubleshooting heap corruption.

Windows is a trademark of the Microsoft group of companies. Windows IT Pro is used by Penton Media Inc. under license from owner.