Analyzing Logs from Exception Monitor

 

Geoff Gray
Microsoft Internet Server Support Engineer

Updated August 2000

Summary: This article explains how to use the Exception Monitor Log File Analyzer to diagnose problems with IIS. The article also discusses general debugging concepts and shows what to look for when reviewing ReadLogs output. (30 printed pages)

Contents

Introduction
Background
Installing and Running ReadLogs
Understanding Windows NT and Windows NT Debuggers
Processes, Threads, and Other Internal Stuff
Memory
Understanding DLLs
Dissecting the Stack
CDB Commands Used by Exception Monitor
More General Analysis Information
Building Debug Files
Conclusion

Introduction

This article explains how to use the Exception Monitor Log File Analyzer (ReadLogs.exe) to diagnose problems with Microsoft® Internet Information Server (IIS). It also discusses general debugging concepts and shows what to look for when reviewing ReadLogs output.

The Log File Analyzer works in conjunction with Exception Monitor to help identify the cause of an IIS crash or other problem. For complete information on Exception Monitor, see Troubleshooting with Exception Monitor.

Background

When you run Exception Monitor, it generates a log for troubleshooting problems with your IIS server. The Log File Analyzer tool (called ReadLogs.exe) breaks the Exception Monitor log down even further for easy analysis. ReadLogs allows you to view these components quickly to identify potential causes of the underlying IIS problem. With ReadLogs you can:

  • Determine which thread in the process caused the fault.
  • Display the "faulting stack" that lists the running modules and what they were doing when the fault occurred.
  • Gather a list of all loaded modules (regardless of what thread they were running in) and view available version information about these modules.
  • Look for and view any errors or warnings about symbols.
  • Look for any exception errors that may have occurred, but weren't severe enough to trip the analyzer.
  • Look for and display information about other types of exceptions.
  • Quickly display the output from any of the commands that were running during the troubleshooting process.
  • Generate a customizable report about the exception with a wide range of configurable data options.

These and other features are described further in this document.

It is important to note that while ReadLogs makes analysis easier, you may not be able to diagnose your problem with this tool alone. The entire Exception Monitor suite of tools, including ReadLogs, is designed to point you in the right direction. However, additional steps may be necessary to fully diagnose your situation.

Note Although ReadLogs version 7 has been designed to read log files from earlier versions of Exception Monitor, not all log files from earlier versions will parse properly. When trying to open log files generated with Exception Monitor version 6.1, you may notice high CPU utilization where the ReadLogs dialog box will not appear. If this occurs, use Task Manager to end the readlogs process, then use Notepad to read the log file. Microsoft is developing a remedy for this issue that has not yet been released.

Installing and Running ReadLogs

ReadLogs automatically installs when you install Exception Monitor. You may run the tool directly from Exception Monitor, or as a standalone application from the Exception Monitor Program group:

  • On the Start menu, point to Programs, point to Debugging Tools, point to Exception Monitor, and then point to ReadLogs.

    Note ** To further understand how the log generates, or how to interpret the output, see Understanding Windows NT and Windows NT Debuggers.

Walking Through a Log Analysis

When you first start ReadLogs, the system presents the Log File Analyzer main window (Figure 1), displaying different aspects of the log file.

Figure 1. The Log File Analyzer main window

The system populates the Main Information window and Main Output window with information relevant to the log you are analyzing. Also notice the information displayed in the upper left corner of the window:

  • The fault type (in this case, the debugger was manually tripped).
  • The version of Exception Monitor used to generate the log.
  • Whether the log used a custom command set or the standard command set.
  • The start and stop times of the log file.

Using the Main Information Window

When you first parse a log or choose a function of the program, the analyzer inserts a message in the Main Information window. The message may display data (discussed in other sections) or it may describe briefly the action you took or the results of the action. ReadLogs uses dynamic messaging, based on both the action you took and the results of some parsing of the log. Examples of dynamic messaging include:

  • When you first parse the log, ReadLogs looks to see if the log was manually tripped, if you were using an old debugger, if there were unresolved dynamic-link library (DLL) calls, and so on. This information displays automatically in the Main Information window.
  • When you click a line in the Main Output window (after it is populated with stack or module information), the Main Information window displays version information about the selected module.
  • When you click Fault Stack, the Main Information window displays the register dump for the processor and the extended stack information for the faulting stack.

Using the Main Output Window

The style of the Main Output window also changes depending on what information you are viewing. The relevant function help page provides examples of outputs with a description. Some examples of this dynamic output include:

  • Stack list output.
  • A list of the loaded DLLs.
  • A list of locks and the owning thread ID for the locks.
  • Any errors detected in the log.

Viewing Stack List

When you click Stack List in the Log File Analyzer main window, ReadLogs displays the Thread Stacks window (Figure 2).

Figure 2. Thread Stacks window

The information in the Thread Stacks window shows probably the single most important part of an automatic debug, or any debug for that matter. The thread stack indicates which code was executing at the time of the crash. This thread stack information is useful to understand which DLLs and functions within the DLLs were being executed. You should read a stack from the bottom up, as the DLL and function on the last line called the DLL and function on the next line up, and so on. Consider the following points when reading these stacks:

  • The Function column shows the DLL name and the function name inside the DLL that was called. The system separates the function name from the DLL with an exclamation point (!). Sometimes a plus (+) sign and a number follow the function name. This represents an offset, where the actual code being implemented is x number of bytes after the beginning of the function.
  • The information about the DLL and the function name come from the symbol file for the DLL. If the symbol file cannot be read properly, ReadLogs displays a function call like Image@ (above in Figure 2) followed by an address and an offset. The address represents the memory location of the function call within the DLL that was called. ReadLogs attempts to resolve this in the Name Resolution column by searching the list of loaded DLLs to see if the memory location lies within the starting and ending addresses of any of them. In the above example, ReadLogs found the address in Taxcalc.dll.
  • If there is no symbol for a given DLL, then the Function column shows only a memory address (such as 0x0230ffd0) and an offset. Again, ReadLogs attempts to resolve this and display the name in the Name Resolution column. If you come across DLLs like this, consider obtaining symbol files for the DLL for any future debugging.
  • Although the debugger was tripped by a fault caused by the topmost command on the stack, it does not necessarily indicate that the DLL at the top was responsible. In many cases, you will see core Microsoft® Windows NT® files (such as Ntdll.dll) at the top of the stack. This usually indicates that some other function passed in a bad parameter that caused the problem.
  • Just like you can get version information from the DLLs output by clicking on the address in the leftmost column, you can get the ChildEBP address for a DLL in the Thread Stacks window and view version information, if available.

By default, when you parse a new log, the faulting stack displays in the Main Output window. ReadLogs then determines which thread stack caused the default by reading the debug command prompt in the log. The Exception Monitor, when tripped, always sets itself to the thread that caused the trip. It also sets its prompt to reflect the thread causing the trip as follows: x:yyy>. The x indicates the number of the processor that the thread was running on, and yyy indicates the number of the thread.

Moving Through Threads

There are cases where you will want to view other threads besides the faulting thread. ReadLogs gives you a number of ways of moving through the threads. For example, you may use Fault Stack and its associated controls in the Log File Analyzer main window (see Figure 3).

Figure 3. Fault Stack and its associated controls

Use Fault Stack and its associated controls to move through threads as follows:

  1. Following any command in the analyzer, click Fault Stack to return to the faulting thread stack.
  2. Click << to move to the previous stack.
  3. Click >> to move to the next stack.
  4. Click in the edit box showing the number of the stack currently being displayed, and use the up and down arrow keys on your keyboard to navigate through the list.
  5. Select a specific thread stack from the drop-down list to jump to a specific stack.

One of the reasons for examining other threads is to help troubleshoot 100 percent CPU utilization problems, where a processor (or processors) runs at 100 percent and the computer becomes sluggish and/or appears to stop responding. If you know that a specific thread was using all of the CPU cycles at the time of the crash, then choose the thread from the drop-down list to see what was happening at that time. If you are interested in seeing what else the system was doing when it crashed, you can navigate through several threads.

Viewing DLLs

When you click DLLs in the Log File Analyzer main window, the Main Output window shows a list of all currently loaded DLLs. ReadLogs sorts the list numerically, based on the starting address of the DLL. The list shows the starting address (also known as the base address), the ending address, and the name (including the path) of the DLL. If a DLL appears in the list, the file has been called at least once while IIS was running. Exception Monitor uses the !inetdbg.mod command to obtain a list of loaded DLLs.

If you select the Hide Microsoft Files check box, ReadLogs checks DLL version information and only lists DLLs where the value in the Company Name field does not equal "Microsoft," or DLLs with no version information. Use this option when you want to quickly see which third-party DLLs were loaded when the fault occurred.

The log file contains version information for every DLL, when available. If you the starting address for a DLL, the Main Information window displays the available version information. If Exception Monitor could not obtain version information, the Main Information window displays the message "no version information was available." Exception Monitor uses the !inetdbg.ver command to obtain version information for loaded DLLs.

Viewing Errors

When you click Errors in the Log File Analyzer main window, the Main Output window shows a list of all errors detected in the log. This list includes warnings (such as symbol mismatches), errors (such as symbols that could not be loaded), and NTSD errors. For more information, see Understanding Locks.

Note Exception Monitor does not use explicit commands for trapping errors. Rather, these errors display throughout the log.

NTSD Errors

There may be several errors in the log in the form of "NTSD: Exception Number xxxxxxxx". These are Windows NT, IIS, or application-specific error codes. Usually, these errors are not severe enough to trip the session. They indicate that something failed, but did not crash the server. Sometimes these errors point to the actual problem, but most of the time they simply indicate small problems that should be addressed when the server stabilizes again.

When ReadLogs encounters an NTSD error, it tries to resolve it and display corresponding information. ReadLogs uses a built-in Windows NT application programming interface (API) call to resolve these errors. It passes the 8-digit error code to the Microsoft Win32® API FormatMessage() function, which looks up the code in the Windows NT internal error list. If the system finds the code, Windows NT responds with an interpretation of the error. You also may view these errors with a copy of the Windows C header file called Winerror.h. In the event that ReadLogs does not resolve the error, it usually indicates an application-specific error code (possibly even IIS) occurred, and that ReadLogs does not have access to the error codes.

Understanding Locks

First, let's discuss the definition and importance of a lock. A lock occurs whenever a thread (a piece of running code) needs to access code exclusively. The thread requests a lock and, if it can, the system gives the lock to the thread. At this point, nothing else in the system can access the locked code. This is useful whenever a thread is writing information to a file and it wants to be sure that no other thread is writing to it at the same time. When the thread finishes, it releases the lock. This happens all the time and is a normal part of any well-written, multithreaded application. It is important to note that even though a particular lock can be owned by only one thread at a time, a single thread can have multiple locks.

So far, this sounds benign. The problem arises when two or more threads own locks, but each waits on the other in order to finish. When this occurs, you reach a state known as a deadlock.

This scenario represents one of the few situations where you need to manually trip the session. When you reach a deadlock, the server appears to hang. A manual trip dumps the information you need to resolve this problem. If you have a manually tripped log, and you suspect the issue is a deadlock, click Locks to retrieve this information. If the Locks window is blank, then either you do not have proper symbols loaded (the !locks command works only if the symbols for Ntdll.dll are correct) or the situation most likely was not caused by a deadlock.

When you click Locks, the Main Output window shows a list of all locks detected in the log, along with the ID for the owning thread and the lock count for the lock. When you the owning thread ID, the Main Information window displays the stack for the thread owning the lock. Using this information and the sample scenario below, you can determine if there is a deadlock issue. Exception Monitor uses the !locks command to get locks information. The following represents sample output from this scenario:

CritSec ftpsvc2!g_csServiceEntryLock+0 at 6833dd68
LockCount          0
RecursionCount     1
OwningThread       a7
EntryCount         0
ContentionCount    0
*** Locked

CritSec isatq!AtqActiveContextList+a8 at 68629100
LockCount          2
RecursionCount     1
OwningThread       a3
EntryCount         2
ContentionCount    2
*** Locked

CritSec +24e750 at 24e750
LockCount          6
RecursionCount     1
OwningThread       a9
EntryCount         6
ContentionCount    6
*** Locked

Looking at the second lock in the above example, notice that the owning thread is thread a3. You can find this thread in the output by looking at the output of the ~*kb command. In this case it is thread 4:

  4  id: 97.a3   Suspend: 0 Teb 7ffd9000 Unfrozen
ChildEBP RetAddr  Args to Child
014cfe64 77f6cc7b 00000460 00000000 00000000 ntdll!NtWaitForSingleObject+0xb
014cfed8 77f67456 0024e750 6833adb8 0024e750 ntdll!RtlpWaitForCriticalSection+0xaa
014cfee0 6833adb8 0024e750 80000000 01f21cb8 ntdll!RtlEnterCriticalSection+0x46
014cfef4 6833ad8f 01f21cb8 000a41f0 014cff20 ftpsvc2!DereferenceUserDataAndKill+0x24
014cff04 6833324a 01f21cb8 00000000 00000079 ftpsvc2!ProcessUserAsyncIoCompletion+0x2a
014cff20 68627260 01f21e0c 00000000 00000079 ftpsvc2!ProcessAtqCompletion+0x32
014cff40 686249a5 000a41f0 00000001 686290e8 isatq!I_TimeOutContext+0x87
014cff5c 68621ea7 00000000 00000001 0000001e isatq!AtqProcessTimeoutOfRequests_33+0x4f
014cff70 68621e66 68629148 000ad1b8 686230c0 isatq!I_AtqTimeOutWorker+0x30
014cff7c 686230c0 00000000 00000001 000c000a isatq!I_AtqTimeoutCompletion+0x38
014cffb8 77f04f2c 00000000 00000001 000c000a isatq!SchedulerThread_297+0x2f
00000001 000003e6 00000000 00000001 000c000a kernel32!BaseThreadStart+0x51

Once you locate the thread, you see it has a call to the WaitForCriticalSection function*,* which means that not only does it have a lock, but it is also waiting for an object that is locked by something else. You can find out what is locking the object by looking at the first parameter of the WaitForCriticalSection call. In this case, it is waiting on 24e750. Now you return to the !locks command output and look at the third Critical Section:

CritSec +24e750 at 24e750
LockCount          6
RecursionCount     1
OwningThread       a9
EntryCount         6
ContentionCount    6
*** Locked

Now you see that thread 4 (which owns the second lock) is waiting on the third lock. This is still okay because when it is released, you can continue processing. So let's analyze the third lock-its thread is a9. Using the ~*kb output to look it up, you see:

  6  id: 97.a9   Suspend: 0 Teb 7ffd7000 Unfrozen
ChildEBP RetAddr  Args to Child
0155fe38 77f6cc7b 00000414 00000000 00000000 ntdll!NtWaitForSingleObject+0xb
0155feac 77f67456 68629100 6862142e 68629100 ntdll!RtlpWaitForCriticalSection+0xaa
0155feb4 6862142e 68629100 0009f238 686222e1 ntdll!RtlEnterCriticalSection+0x46
0155fec0 686222e1 0009f25c 00000001 0009f238 isatq!ATQ_CONTEXT_LISTHEAD__RemoveFromList+0xb
0155fed0 68621412 0009f238 686213d1 0009f238 isatq!ATQ_CONTEXT__CleanupAndRelease+0x30
0155fed8 686213d1 0009f238 00000001 01f26bcc isatq!AtqpReuseOrFreeContext+0x3f
0155fee8 683331f7 0009f238 00000001 01f26bf0 isatq!AtqFreeContext+0x36
0155fefc 6833984b ffffffff 00000000 00000000 ftpsvc2!ASYNC_IO_CONNECTION__SetNewSocket+0x26
0155ff18 6833adcd 77f05154 01f26a58 00000000 ftpsvc2!USER_DATA__Cleanup+0x47
0155ff28 6833ad8f 01f26a58 000a3410 0155ff54 ftpsvc2!DereferenceUserDataAndKill+0x39
0155ff38 6833324a 01f26a58 00000000 00000040 ftpsvc2!ProcessUserAsyncIoCompletion+0x2a
0155ff54 686211eb 01f26bac 00000000 00000040 ftpsvc2!ProcessAtqCompletion+0x32
0155ff88 68622676 000a3464 00000000 000a3414 isatq!AtqpProcessContext+0xa7
0155ffb8 77f04f2c abcdef01 ffffffff 000ad1b0 isatq!AtqPoolThread+0x32
0155ffec 00000000 68622644 abcdef01 00000000 kernel32!BaseThreadStart+0x51

Notice that this thread also is waiting on a locked item to be freed; this is still okay. Now let's see what lock it is waiting on. The address is listed above as the first argument to the WaitForCriticalSection() function, just as you saw before. Let's look this one up in the locks list:

CritSec isatq!AtqActiveContextList+a8 at 68629100
LockCount          2
RecursionCount     1
OwningThread       a3
EntryCount         2
ContentionCount    2
*** Locked

This shows lock number 2. Now you see the problem. Lock 2 is owned by thread 4, which is waiting on Lock 3, which is owned by thread 6, which in turn is waiting on Lock 2. This is a deadlock. In order to discern what is happening, you need to analyze threads 4 and 6 and determine what they were doing so you can see what originally caused the deadlock.

Using Reporting Features

When you click the Report button, ReadLogs generates a report and places it in the Main Output window. By default, the system also places the report contents in the Clipboard so you can paste it into any other application that you choose. To customize the information in the report, see Configuring Reporting and Warning Parameters.

Viewing Activity Prior to Crash

When you click the P.T.C. (prior to crash) button in the Log File Analyzer main window, ReadLogs display the last 20 lines in the log prior to the crash. Use this feature to see what was happening right before IIS crashed. You can configure the number of lines displayed by setting the following key in the registry:

HKEY_CURRENT_USER\Software\ReadLogs\READLOGS\Settings\LinesForPTC

Viewing System Information

When you click the Sys Info button in the Log File Analyzer main window, the system displays the System Information dialog box (see Figure 4).

Figure 4. System Information dialog box

This dialog box shows a tree view of various system settings and versions. This information may be useful for verifying information about the machine you are debugging.

Note The System Information dialog box is available only in logs generated with Exception Monitor version 6.1 or later. Exception Monitor uses the !emdbg.sysinfo command to get this information. The emdbg debug extension is an add-on for Exception Monitor written by the Exception Monitor team.

Configuring Reporting and Warning Parameters

When you the Config button in the Log File Analyzer main window, the system displays the ReadLogs Configuration dialog box (see Figure 5).

Figure 5. ReadLogs Configuration dialog box

With this dialog box, you may customize the settings used by ReadLogs. The check boxes on the left determine how ReadLogs generates reports, while the check boxes on the right indicate how ReadLogs warns you about certain logs. When you change the settings, the system retains your selections for the next time you open ReadLogs.

Note The Wizard button and its functionality are not addressed in this document.

The report parameter options include:

Report Parameters Description
Copy Report to Clipboard Copies the report contents to the clipboard.
Show Processed Stack Information Adds the report output from the !inetdbg.ds -v esp command to the report. This is essentially a dump of the Extended Stack Pointer (ESP), providing a more complete dump of the faulting stack. However, the information may not always be accurate as the command simply walks the stack and dumps any value that resides inside a DLL. The information may contain a command, or some parameters passed by a previous call.
Show Version Information for the Selected DLLs Displays version information for each DLL in the report. The DLLs on the report depend on other settings in the dialog.
Show Microsoft DLLs in the Report Includes DLLs containing "Microsoft" in the version information in the report.
Show a List of Errors from the Log Includes errors from the log in the report.
Show Non-Microsoft DLLs and Files with No Version Information Includes DLLs where "Microsoft" does not appear in the version information and DLLs without version information in the report.
Show Verbose Instructions and Explanation in the Report Prints additional instructions for reading the report throughout the report.

The warning parameter options include:

Warning Parameters Description
Show Warning Dialog on Manual Trip Displays a warning when ReadLogs detects that the system is tripped manually. The warning message displays, regardless of its state, in the Main Information window.
Show Warning Dialog on "Not in Memory" Displays a warning when ReadLogs detects that a call is made to a location in memory without any DLLs. The warning message displays, regardless of its state, in the Main Information window.
Show Warning Dialog on a Bad Log File Displays a warning when ReadLogs determines the log file is from an old debugger or does not contain any information. The warning message displays, regardless of its state, in the Main Information window.
Show Warning Dialog on a "Shut Down" Displays a warning when ReadLogs detects that the log was generated during a shutdown of the process being analyzed. Currently, ReadLogs detects process shutdowns by looking for a running thread that has a "Suspend: -2" setting. This is not the only indication of a shut down, but it is currently the only way ReadLogs detects a shutdown.
Hide Locks with a Lock Count of 0 Displays only those locks with a count greater than 0 when you click Locks in the Log File Analyzer main window.

Using Other Features

The Log File Analyzer main window includes other buttons with features as follows:

  • Notepad: Opens the log file in Notepad.
  • Help: Links to the Microsoft web site and opens this document.
  • New: Allows you to open a new log file.

Understanding Windows NT and Windows NT Debuggers

Exception Monitor is based on the Windows NT debugger called Cdb.exe. As a user-level debugger, Exception Monitor can debug user-mode processes, but not kernel-mode processes. CDB works by looking for some sort of access violation. When one occurs, Windows NT raises a flag that activates the debugger and halts the process to which the debugger is attached. Because the process halts, the system retains the entire memory space for the process in the exact state it was in at the time of the violation. This feature allows us to run some commands from CDB that dump certain information from this memory to the log.

Figure 6. CDB process

Processes, Threads, and Other Internal Stuff

Since this document includes several references to processes, threads, and stacks, it seems appropriate to include a quick primer on these topics. Additionally, complete information on these topics can be found in the following publications:

  • Programming Applications for Window, Fourth Edition by Jeffrey Richter (Microsoft Press).
  • Inside Windows NT, Second Edition by David Solomon (Microsoft Press).

A process is basically "an instance of a running program." A thread is "a path of execution within a process." These simple terms establish the concept of how Windows NT operates. A process starts up, creating the process listing you see in Task Manager. Inside of this process, there are one or more threads that execute all of the code producing the application. It is these threads that do the work, access the memory, and produce the output of the program. Windows NT is a multitasking, multithreaded operating system. As such, it can run multiple applications at the same time, and each of these applications can handle multiple tasks at the same time. Windows NT accomplishes this by slicing up the CPU's time among all of the active threads and giving each thread a "time slice." If you consider this fact, you now can see that at any given time, only one thread runs on any given processor. The time slices are short enough that the threads appear to be running simultaneously, when in fact they are not. This actually makes debugging easier since you know that if a problem occurs, the currently active thread generated it. As you dissect this further, you will see that this does NOT guarantee that the "faulting" thread was actually the culprit. It may actually have been a victim of another thread (see Heap and Stack Corruption).

This document does not address how Windows NT manages the execution of threads (see the above books for a discussion on this topic). However, consider that threads can voluntarily give up some of their time for any number of reasons, including the fact that they may be waiting for some information from another thread, or the user. In these instances, you will see calls on the stack such as:

  • WaitForSingleObject
  • WaitForMultipleObject
  • WaitForCriticalSection
  • Sleep
  • SleepEx

Any of these calls simply mean that the thread making this call wants to step aside and wait for some event to occur before it continues with its work. Windows NT will "wake up" the thread when it can continue processing.

Memory

This section describes the basic usage of memory in the process. Remember that in Windows NT, each process gets 4 gigabytes (GB) of virtually addressable memory, but 2 GB of that is system memory. The system memory is shared (meaning only one copy exists and is shared by all of the processes). The following table outlines how the memory space is mapped for any user-mode process in Windows NT 4.0:

Note This map does not necessarily apply to Windows NT Enterprise Edition. Windows NT Enterprise can have 3 GB of user memory and 1 GB of system memory.

Range (in hex) Size Function
0 to FFFF 64 kilobytes (KB) No access region
10000 to 7FFEFFFF 2 GB - 192 KB Private process address space
7FFDE000 to 7FFFFFFF 136 KB Reserved for various Windows NT functions
80000000 to FFFFFFFF 2 GB System memory

Virtually-addressable Memory

As previously stated, Windows NT gives each process 4 GB of memory. But how is this possible when some systems running Windows NT have only 64 megabytes (MB) of actual memory (or even less)? Well, the process actually has a 4-GB virtual address space and does not necessarily use the entire 4 GB. To understand why Windows NT does this, let's look at a basic programming concept.

With older DOS programs, the amount of memory space that a program runs in was a concern. Also, the industry built different "versions" of a program for different-sized machines. Other issues were whether the program used extended or expanded memory, or if there was enough extra memory. By allowing Windows NT to set a "virtual address space" of 4 GB, all programs now run in a flat memory model. You no longer need to worry about how to compile programs. You just let Windows NT translate the calls from the virtual address to the physical address.

When you debug a user-mode application, you don't care what translation occurs because the debugger displays all memory as the program sees it-in the virtual address space. With this information in mind, let's look at a very simple example of how Windows NT memory is laid out.

Memory Space Layout

When a process starts on Windows NT, the system builds memory space and then lays out memory inside this space. The way Windows NT lays out memory depends specifically on the application, but the same general principles apply to all processes. There are four parts to the memory of a process:

Figure 7. Memory space layout

  • The data (green) portion of memory contains the data or variables used by the application. Whenever a program executes a command such as malloc, Windows NT gives the application some space within the heap and allows the application to use this space for its own purposes. The application then reads from the memory, writes to it, or returns it to the heap when it's done (this is known as freeing the memory). See Virtual Memory and Heaps for a more detailed discussion.
  • The code (blue) portion of memory contains the actual commands and instructions for the application. Windows NT allocates this memory as the program loads. To protect the integrity of the program, Windows NT marks this memory as Read/Execute-Only. See Understanding DLLsfor a discussion on how the code loads and how it affects debugging.
  • The stack (yellow) portion of memory contains the "roadmap" for the code's path of execution. Programs are comprised of components (or modules) and these modules are comprised of functions. The functions can receive input (called arguments or parameters) and they can return results (called return values). In order for the application to keep track of where it has been (which function called what) and what values were passed in, the process uses a last in, first out (LIFO) list, known as a stack. See Dissecting the Stack for a discussion on the stack and how to read it.
  • The Unused (gray) portion of memory does not fall into any of the above three categories. Since Windows NT is dynamic in its usage of memory, this part of memory grows and shrinks as a process uses and frees memory.

Virtual Memory and Heaps

Windows NT uses two different methods for allowing a program to store and use data-heaps and virtually-allocated memory. (A third method, memory-mapped files, is beyond the scope of this document). The heap refers to memory that Windows NT reserves for an application and allows you to access. The heap usually is best used for large numbers of small pieces of data, whereas virtually-allocated memory is the memory a program specifically reserves for itself. Usually it is used to store large pieces of data. Additional differences between both types of memory allocation include:

  • Virtually-allocated memory can be shared between processes, allowing the memory to act as a means of sharing data. Heap memory cannot be shared.
  • The "private bytes" counter in Performance Monitor does not show shared memory and therefore may not be the actual size of the memory currently used.
  • Windows NT generates a default heap of 1 MB. Windows NT automatically adjusts the size of the heap as it needs more space. The application is responsible for growing virtually-allocated memory.

To view the layout of processes memory, use the Windows NT Resource Kit utility PViewer.exe to obtain a detailed analysis.

Understanding DLLs

DLLs provide a core reason for the power and speed of Windows today. The concept of using DLLs is very simple. A library is a set of routines that a program may use, and libraries usually consist of common routines that several programs may use. Rather than rewrite a routine every time you need to use it, you build a library once and include it with your program. To use a library's routines, it must be "linked" to the program, which usually occurs at development time. A shortcoming of this approach is that every program needs a copy of the library, which can quickly consume memory and disk space. Windows resolves this issue with DLLs by allowing the library to be "linked" to the application at run time (when the application executes). Windows gains even greater efficiency by allowing many different applications to share the same DLL. Only one copy of the DLL loads into memory even though five applications might use it.

DLLs provide an additional advantage since they do not need to remain in memory. Assume you have ten different DLLs, each of which is used only once. You can unload each DLL after it finishes, thus saving memory for other more important tasks. However, DLLs fundamental to a program load once and remain in memory for the duration of the program, making them available as needed.

While looking at debugging output, you may notice instances of DLL relocation due to a collision with another DLL. This is normal in a Windows environment. When a DLL builds, Windows assigns a preferred address to load into when it runs. However, if another DLL occupies the memory when it tries to load, a collision occurs and Windows NT moves it to another location and loads it. Windows NT then translates all calls to the DLL to this new memory location. To view the preferred loading address of a DLL, right-click it and select Quick View (available only in Windows NT 4.0). This displays header information about the DLL where the "Image Base" entry identifies the preferred loading address.

Manipulating Memory

As stated previously, code memory is read-only. When a debugger attaches to a process and attempts to modify the code, such as adding a breakpoint, Windows NT changes the memory from read-only to copy-on-write. This essentially makes a copy of the code and allows the debugger to modify the copy. Windows NT then changes the flow of the process to point to the new code. In this manner, Windows NT allows shared code to be modified without affecting any other process.

Dissecting the Stack

Note This section discusses some general concepts about a stack. The specifics of a stack depend on the type of CPU platform, the compiler used to generate the program, and the way in which the code was written.

The following is reproduced from the book Assembler Inside and Out (McGraw Hill Press):

The stack is designed to hold information temporarily. This is especially important when you are working with procedures.

Most medium-size and large programs are broken into self-contained modules called procedures. (In high-level languages, these modules are often called functions or subroutines.) When a procedure invokes another procedure, you say that the first one calls the second one. When the second procedure finishes, the program returns to the first procedure.

As part of executing a call instruction, the processor must save the location within the first procedure at which execution will resume when the second procedure is finished. This location is referred to as the return address.

The processor saves the return address by pushing it onto the stack. When the second procedure returns, the processor pops the return address off the stack and resumes execution at that location. Because the stack pops items in a last-in, first-out manner, procedures can call other procedures that call other procedures, and they will all return to the correct location.

Another important use for the stack is to save the contents of the registers at the beginning of a procedure. Before the procedure returns, the registers can be restored by popping the old values off the stack. This leaves each procedure free to use the registers as it sees fit. However, from the point of view of each calling procedure, the registers remain ** unchanged.

The stack can also be used to pass information from one procedure to another. A procedure can push data onto the stack and then call a second procedure that accesses the data. If necessary, the second procedure can use the stack to pass data back.

A final use for the stack is to hold temporary results that occur during complex operations. Sometimes you will store temporary values in a general register [on the CPU] and sometimes you will push them onto the stack. Which method you choose will vary depending on the logic of the specific program.

Stacks and Multithreaded Applications

The previous memory diagram (Figure 7) shows only one area of memory related to the stack. However, each thread has its own stack, because each thread needs to be aware of its own history, which is completely independent of the other threads in the process.

The stack always grows down (from high to low memory addresses) and is built by pushing values onto it. The location assigned to the next value is determined by the value of ESP register. When a value gets pushed on, the processor automatically decrements the ESP and points to the location for the next piece of information.

The following sample shows an executing thread. Notice the pseudocode for three different functions, and how the first one calls the next, which eventually calls the third one. Each time you make a call, you add a frame to the stack. The frame stores information, allowing the process to return and pass information back and forth.

Figure 8. Stack diagram

Understanding Stack Frames

Figure 8 shows how the line of execution of the program changes when it encounters a jump, such as a call or jnz statement. Since a jump moves to another section of code to continue execution, this is where you need to save information onto the stack, allowing the process to return when it finishes. The data you save onto the stack may be small or large. Regardless of the size, all the data saved at this point is referred to as a frame. Therefore, in the above example, you actually add two frames to the stack-one when the top function calls the second function, and one when the second function calls the third function. When the third function returns, the second function is restored and continues running by restoring its stack frame.

Stack information essentially provides a "roadmap" of what a program does. When you analyze logs generated by the Exception Monitor, you can view this roadmap and see what your application was doing at the time the fault occurred. Each frame on the stack corresponds to a line in the Main Output window of ReadLogs when you either first start ReadLogs or Fault Stack.

Heap and Stack Corruption

Heap corruption is best demonstrated with an example. He@@@@p corrupTHIS IS NOT PART OF THE SENTENCEn this subject.

Obviously the above sentence is corrupt. If you consider the space above as memory for the program to use, and you consider the sentence as the data (or code) that the program needs, you can see that the program will experience difficulty using this information. This is one form of heap corruption. The heap is a portion of memory that is dedicated to the use of the program to store information it needs to retrieve later. Let's say that thread 1 of an application decided to discuss something (in this case heap corruption), and thread 2 of an application decided it wanted to sing a song. Well, thread 1 was given some memory off the heap to store its monologue, and it wrote the monologue to this memory.

Now let's say that thread 2 also was given some memory, but it was busy doing something else and forgot where its memory was, so it just started at random times. During one of these instances, thread 2 overwrote the monologue for thread 1. Now thread 1 decides it's time to read its monologue back, but now it can't, and it fails. You get an exception. Unfortunately, the operating system has no idea that thread 2 caused the exception because thread 2 is busy doing something else.

That's probably not the most technical or accurate description you will ever see on heap corruption, but it conveys the general idea. Basically, any time a thread writes to memory that it doesn't own, it runs the risk of overwriting data or code owned by another thread. The biggest problem with this is that the memory that gets overwritten may sit around unused for some time. Then when it finally does get called and results in an exception, the offending code that overwrote it may not be running anymore, and you have no idea what actually corrupted the memory causing the crash.

The same idea applies to stack corruption, except now what happens is a pointer on the stack (otherwise known as a memory address) points to the wrong place. So the CPU comes to this pointer and tries to find the next command to execute. But what if the pointer points to empty memory? Then you have another exception. This usually occurs because some code passed a bad pointer to the next function in the stack. Going back to the stack trace, you see several stack frames on the list. This pointer might be passed around several times before it's actually used.

These are two reasons you cannot say the top function on a stack trace is always at fault. It might have gotten a bad pointer from below (which means some other function on the stack caused the problem), or it might have encountered a problem when accessing some memory that had been corrupted at any time (which means the culprit might not even be around anymore). You can sometimes detect heap corruption because you will see calls on the faulting stack similar to RtlCoalesceFreeBlocks. You should also look for string manipulation functions such as strcat, strcpy, memcpy, and so on. These often cause heap corruption if they are used improperly. Stack corruption shows up in many different ways and is beyond the scope of this paper. Therefore, it is not discussed further in this article.

CDB Commands Used by Exception Monitor

Exception Monitor uses several commands that you may find useful when debugging. These commands and their associated output are described below.

Kv Command

The first command Exception Monitor runs is kv. It reveals the function callback stack for the thread that was active at the time of the trip. This stack is read from bottom to top: the topmost function is the last function to be called and the bottommost function is the first function to be called for this thread.

Sample output (x86 platform):

ChildEBP RetAddr Args to Child
00f8fdf8 53d0f7f3 003070ac 00307078 00306d48 0x1000113c [Stdcall: 0]
00f8fe5c 53d0ad45 00306ee4 00f8ff0c 00f8ff4c w3svc!AddFilterResponseHeaders+0x26
00f8ff20 53d01ce6 00f8ff4c 00f8ff68 003154b0 w3svc!HTTP_REQUEST__DoDirList+0x2f7
00f8ff50 53d01f56 00f8ff68 00315478 00315460 w3svc!&127;ole32_NULL_THUNK_DATA+0x96e
00f8ff6c 53d04adb 00000166 00000000 00000001 w3svc!&127;ole32_NULL_THUNK_DATA+0xbde
00f8ff88 539895d0 00315460 00000166 00000000 w3svc!HTTP_REQ_BASE__`vftable'+0x1fd3
00f8ffb8 77f04f2c abcdef01 776a2254 00000028 infocomm!AtqPoolThread+0x1c8
00f8ffec 00000000 53989408 abcdef01 00000000 kernel32!BaseThreadStart+0x51

Sample output (alpha platform):

Callee-SP Return-RA  Call Site

 0815ffc0 77e6f5b8 : ntdll!DbgBreakPoint+0x4
 0815ffc0 77e6cc48 : KERNEL32!DebugBreak+0x8
 0815ffd0 00000000 : KERNEL32!BaseThreadStart+0x60

The function callback stack lists all functions that were active in the given thread. This means that function 1 calls function 2, which eventually calls function 3, and so on. When the last function to be called finishes what it is doing, it "unwinds," or returns to the function that called it; that function then finishes and returns to the function that called it, and so on. The callback stack lists which functions called what, and shows how the code should get back when it runs properly. Since this stack is the faulting stack, you know that the access violation occurred in the function listed at the top of the stack. However, this does not necessarily indicate that this function started the actual problem. The problem could have been caused by another function passing a bad pointer up the stack, or even by a different thread completely. See Heap and Stack Corruptionfor further explanation.

Notice that the command at the top of the stack does not list a calling command, confirming that this file does not have symbols loaded. When this stack loads into ReadLogs, it will attempt to resolve this function call to a friendly name by using the list of loaded modules. It first gets the memory address where the call was being made (0x1000113c), and then it looks for the loaded DLL that owns this memory location.

!inetdbg.mod

One of the extensions that the IIS debugger adds to CDB is the file Inetdbg.dll. If this file is present in the Windows NT directory (or the local debug directory), then the commands are available to CDB. If you precede an extension command with an '!' symbol, then CDB looks for the DLL and passes the command to the extension for processing. In this case, you call the Inetdbg extension and pass in the mod command. This command dumps a list of all currently loaded modules in the Inetinfo memory space and returns the information below (this is an abbreviated list).

Sample output:

Start End Entry Path
01000000 01006000 01001122 C:\WINNT\System32\inetsrv\inetinfo.exe
77f60000 77fbc000 00000000 C:\WINNT\System32\ntdll.dll
78000000 78047000 7800546d C:\WINNT\system32\MSVCRT.dll
77f00000 77f5e000 77f01000 C:\WINNT\system32\KERNEL32.dll
77dc0000 77dfe000 77dc1000 C:\WINNT\system32\ADVAPI32.dll
77e70000 77ec1000 77e7a2a2 C:\WINNT\system32\USER32.dll
77ed0000 77efc000 00000000 C:\WINNT\system32\GDI32.dll
10000000 10024000 10001650 C:\InetPub\scripts\selfdestruct.dll

This information can be helpful in determining what code is actually running in Inetinfo's memory space. In the kv command example above, you had an unresolved function. To resolve it, you take the memory address-0x1000113c-and see that it is within the range of the Start address and End address for Selfdestruct.dll.

!inetdbg.ver

Another extension given to us by Inetdbg is the ver command. This lists any version information it can extract from loaded DLLs. Most DLLs have version information in the header block of the actual file. This extension attempts to get the information from here.

Sample output:

Module @ 0x01000000 = inetinfo.exe
dwFileFlags = 0x00000000 (FREE)
CompanyName = Microsoft Corporation
FileDescription = Internet Information Services Application v 1.0
FileVersion = 4.00
InternalName = INETINFO.EXE
LegalCopyright = Copyright (C) Microsoft Corp. 1981-1996
OriginalFilename = INETINFO.EXE
ProductName = Microsoft® Windows NT™ Operating System
ProductVersion = 4.00

Module @ 0x77f60000 = ntdll.dll
dwFileFlags = 0x00000000 (FREE)
CompanyName = Microsoft Corporation
FileDescription = NT Layer DLL
FileVersion = 4.00
InternalName = ntdll.dll
LegalCopyright = Copyright (C) Microsoft Corp. 1981-1996
OriginalFilename = ntdll.dll
ProductName = Microsoft® Windows NT™ Operating System
ProductVersion = 4.00

In this example, notice that there is no listing for the DLL in question (Selfdestruct.dll). Inetdbg.mod does not show any information for DLLs for which it can't retrieve the version information. ReadLogs compares the file name for each entry from the ver command with the entries here and adds whatever version information it can to each of the DLLs it has in its list. If there isn't any version information available, then ReadLogs simply adds the statement "There is no version information available for this DLL*.*" This command's output is what ReadLogs uses to display the version information. This information can be very helpful in determining if you are running the proper version of a given DLL.

Note On some machines running Windows NT Service Pace (SP) 4, the ver command will fail to execute fully. You can run the command twice in hopes of getting proper version information. ReadLogs will look specifically for a version of output that does not contain any errors and will use this set to display the version information. If both copies fail to run properly, ReadLogs will not be able to display version information.

~*kb

The ~*kb command instructs the debugger to dump a copy of the stack (similar to kv) for every thread in the memory space. (The ~ command says "select a thread"; the * command says "all threads" and then you add whatever command you want to run against all threads, in this case kb.) You can use this command to see what else was going on at the time of the crash. You also use this information if you are troubleshooting 100-percent CPU utilization or deadlocks. Each stack is analyzed the same way as the kv stack.

You can use this information for troubleshooting 100-percent CPU utilization by running the debugger, and also running a Performance Monitor log where you capture the threads counter. With a 100-percent CPU utilization problem, you manually trip the debugger, and then stop the PerfMon log. By analyzing the PerfMon log, you obtain the number of the thread that was utilizing the CPU. You can then use ReadLogs (or manually inspect the debug log file) for the thread listed in the ~*kb command that has the same number. This tells you what the thread was doing and points to the culprit.

Note You must use the logging feature of PerfMon to capture the thread's output, and you should wait until the 100-percent utilization occurs before you start the log. Simply watching the threads in the Chart View might lead to erroneous thread numbers.

u eip-50 eip+20 (x86 Platform Only)

The u command instructs the debugger to display the assembly code residing at the address range provided. In this case, you are asking for the code starting 50 bytes before the address pointed to by the extended instruction pointer (EIP) register through 20 bytes after the EIP. The EIP register always points to the next line of code to be executed. By displaying this, you can look at what code was about to run at the time of the crash.

Sample output (abbreviated):

ntdll!RtlpWaitForCriticalSection+0x45:
77f6cc16 0000             add     [eax],al
77f6cc18 008b0d805afa     add     [ebx+0xfa5a800d],cl
77f6cc1e 7739             ja ntdll!RtlpWaitForCriticalSection+0x88 (77f6cc59)
77f6cc20 48               dec     eax
77f6cc21 2475             and     al,0x75
77f6cc23 2333             and     esi,[ebx]
77f6cc25 c0894604894608   ror     byte ptr [ecx+0x46890446],0x8
77f6cc2c 89460c           mov     [esi+0xc],eax
77f6cc2f 894610           mov     [esi+0x10],eax

!locks

The !locks command displays information about what sections of currently running code are marked as Critical Sections. See Understanding Locks for a discussion on Critical Sections.

Sample output:

CritSec w3svc!MDIDMappingTable+100 at 68c2b1b0
LockCount          0
RecursionCount     1
OwningThread       156
EntryCount         0
ContentionCount    0
*** Locked

!inetdbg.atq -g

The atq command displays information relevant to the IIS Asynchronous Thread Queue. The thread queue is a feature specific to IIS. When a new request comes into IIS, it needs a thread to be processed. Since the creation of threads is an expensive operation in terms of time and CPU usage, IIS maintains a set of worker threads ready to handle incoming requests; IIS manages this pool of threads. The atq command displays information, such as how many threads exist in the pool and how many threads are currently unused (or available). This command does not have any affect when debugging applications other than IIS.

Sample output:

Atq Globals:
   isatq!g_cConcurrency          =        0 (0x       0)
   isatq!g_cThreads              =        9 (0x       9)
   isatq!g_cAvailableThreads     =        9 (0x       9)
   isatq!g_cMaxThreads           =       15 (0x       f)

   isatq!g_fUseAcceptEx          =        1 (0x       1)
   isatq!g_cbXmitBufferSize      =     4096 (0x    1000)
   isatq!g_cbMinKbSec            =     1000 (0x     3e8)
   isatq!g_cCPU                  =        1 (0x       1)
   isatq!g_fShutdown             =        0 (0x       0)

   isatq!g_msThreadTimeout       = 43200000 (0x 2932e00)
   isatq!g_dwTimeoutCookie       =        2 (0x       2)
   isatq!g_cListenBacklog        =       25 (0x      19)
   isatq!AtqGlobalContextCount   =      347 (0x     15b)
   sizeof(ATQ_CONTEXT) = 212

!inetdbg.asp -g

The asp command displays information relevant to IIS Active Server Pages (ASP) processing. This command shows session, application and request information, and does not have any effect when debugging applications other than IIS.

Sample output:

Asp Globals:
        asp!g_nSessions                 =        0 (0x       0)
        asp!g_nApplications             =        2 (0x       2)
        asp!g_nApplicationsRestarting   =        0 (0x       0)
        asp!g_nBrowserRequests          =        0 (0x       0)
        asp!g_nSessionCleanupRequests   =        0 (0x       0)
        asp!g_nApplnCleanupRequests     =        0 (0x       0)
        asp!g_fShutDownInProgress       =        0 (0x       0)

!inetdbg.ds -v esp (x86 Platform Only)

The ds command displays information about the stack for the current thread. By passing -v esp to the parameters, you tell the ds command to display the stack starting at the ESP, and show only the names of the functions being called. You do not display any other information from the stack frames. The output from this command goes further back in the stack and therefore gives us a longer "roadmap" of what the thread was doing.

Sample output:

0a1a2ab0 : 7800fb30 : Image@78000000!wcscmp
0a1a2ab4 : 7800fa59 : Image@78000000!wcslen+0x19
0a1a2ac4 : 65345ed6 : oleaut32!VariantChangeTypeEx+0x586
0a1a2ad8 : 7800fab0 : Image@78000000!wcsncpy
0a1a2ae4 : 7800fab0 : Image@78000000!wcsncpy
0a1a2ae8 : 60502760 : MscsCore+0x2760
0a1a2b04 : 7800fab0 : Image@78000000!wcsncpy
0a1a2b10 : 77f67456 : ntdll!RtlEnterCriticalSection+0x46
0a1a2b14 : 0911b3a0 : Image@09110000+0xb3a0
0a1a2b18 : 091123d5 : Image@09110000+0x23d5
0a1a2b1c : 0911b3a0 : Image@09110000+0xb3a0
0a1a2b24 : 7800fb30 : Image@78000000!wcscmp
0a1a2b28 : 7800fa59 : Image@78000000!wcslen+0x19
0a1a2b2c : 09111256 : Image@09110000+0x1256
0a1a2b34 : 7800fa59 : Image@78000000!wcslen+0x19
0a1a2b3c : 091179d0 : Image@09110000+0x79d0

!inetdbg.ds

The ds command (called with no arguments) displays raw output of the stack based on the current top of the stack. Unlike the kv, kb, and ds -v commands, this is a dump of the exact stack (showing all parameters) and local storage. Although it is difficult to read and is not formatted, it shows the exact contents of the stack.

Sample output:

0a1a2aac : 0a1a2c85
0a1a2ab0 : 7800fb30 : Image@78000000!wcscmp
0a1a2ab4 : 7800fa59 : Image@78000000!wcslen+0x19
0a1a2ab8 : 0a1a2b3c
0a1a2abc : 80020005
0a1a2ac0 : 00000400
0a1a2ac4 : 65345ed6 : oleaut32!VariantChangeTypeEx+0x586
0a1a2ac8 : 0a1a2af8
0a1a2acc : 00000002
0a1a2ad0 : 37dc29f4
0a1a2ad4 : 00000000
0a1a2ad8 : 7800fab0 : Image@78000000!wcsncpy
0a1a2adc : 37dc29f4
0a1a2ae0 : 8002000b
0a1a2ae4 : 7800fab0 : Image@78000000!wcsncpy
0a1a2ae8 : 60502760 : MscsCore+0x2760
0a1a2aec : 00000000
0a1a2af0 : 00000008
0a1a2af4 : 11d045fd
0a1a2af8 : a0000000
0a1a2afc : 00000030
0a1a2b00 : 37dc29f0

dc esp-200 esp+200 and dc poi(eax)-200 poi(eax)+200 (x86 Platform Only)

The dc command displays memory pointed to by the arguments passed to it. The display consists of raw memory (displayed as groups of DWORDs or Double Words), with the first column showing the memory address being displayed and the last column displaying the ASCII character equivalent of the memory values. This allows you to see any character strings that might be stored in the memory. You call this command many times, passing in values based on the registers of the processor. This is a somewhat inaccurate approach which may or may not actually display any relevant information. The first command listed above tells the debugger to display the memory pointed to by the ESP register. The second command tells the debugger to display the memory pointed to by the memory that eax points to (for you programmers, this is a pointer to a pointer).

Sample output:

02a7d97c  00000000 4e4c5153 28205445 294e534f   ....SQLNET (OSN)
02a7d98c  00000000 33393931 00000000 00534e54   ....1993....TNS.
02a7d99c  716e736f 00006769 72746e65 00000a79   osnqig..entry...
02a7d9ac  00000000 00534e54 5754454e 004b524f   ....TNS.NETWORK.

dc ra-40 ra+40 (Alpha Platform Only)

The registers on an alpha system are different, so you change the arguments passed to the dc command to reflect the registers on the alpha system. The reasoning and the analysis are identical to the above paragraph for dc on the Intel platform.

Sample output:

77e6f57c  23de01b0 6bfa8001 00000000 00000000  ...#...k........
77e6f58c  00000000 000000ab a0000030 a4000000  ........0.......
77e6f59c  480050c0 6bfa8001 00000000 00000000  .P.H...k........
77e6f5ac  00000000 23defff0 b75e0000 d3415619  .......#..^..VA.
77e6f5bc  a75e0000 23de0010 6bfa8001 00000000  ..^....#...k....
77e6f5cc  00000000 243f77e6 23deffe0 a0012008  .....w?$...#. ..
77e6f5dc  47f00411 b53e0000 b75e0008 221e0018  ...G..>...^...."
77e6f5ec  6b404000 241f77e6 221e0010 a0002044  .@@k.w.$..."D ..
77e6f5fc  223e0018                             ..>"

More General Analysis Information

This section discusses some different outputs you may see in the log file and how they can help you debug your applications and make them more robust.

When you look at the raw log generated by the IIS debugger, you might notice that there are several lines of information in the log before you get to the access violation. This occurs because the debugger gets information all the time and displays it. The following discusses this information and what it may mean.

-----
NTSD ModLoad: 77720000 77731000 C:\WINNT\system32\mpr.dll
-----
LDR: Automatic DLL Relocation in inetinfo.exe
LDR: Dll scod32.dll base 10000000 relocated due to collision with
C:\Program Files\Common Files\System\ADO\msader15.dll
NTSD ModLoad: 04800000 0489f000 C:\WINNT\SYSTEM32\scod32.dll
-----
NTSD: C++ EH Exception
-----
NTSD: exception number c000008f

The first output in the above example, NTSD ModLoad, simply means that Inetinfo got a request for the code in the given module (or DLL), and is loading the DLL into memory. Some DLLs, when loaded, stay in memory for the duration of the session. Other DLLs unload as soon as the caller finishes with them. If needed again, the DLLs reload. For more information, see Chapter 19 of Jeffrey Richter's book Programming Applications for Microsoft Windows, Fourth Edition (Microsoft Press).

The second set of output above is from LDR and NTSD. LDR is stating that a relocation of a DLL is occurring. The second line gives the reason why. In this case, there was a collision with another DLL already in memory. In the third line, NTSD shows that the DLL loaded successfully at its new address. Initially, this set of statements may look frightening, but it is integral to the way Windows works.

Let's take a closer look at what really is happening in the second set of output above. When a compiler builds a DLL, it assigns a default base address. This is the address where the DLL will load, if possible. By loading here, the information for the locations of all of the exposed functions in the DLL do not need to be translated and can be called faster. Since there is another DLL in the same memory space, a collision occurs. The loader finds another place to load the DLL and then performs any necessary translations needed for calls to get to the DLL's functions.

Next, you see an NTSD C++ Exception. This is an error, but it is one that did not trip the debugger. A C++ exception is known as a "first chance exception." This means (in layman's terms) that when the exception occurred, Windows NT gave the DLL that owns the code a chance to correct its own problem (known in programming as "exception handling"). This is usually done in C by using a try/except wrapper, or in C++ by using a try/catch wrapper. A wrapper is a piece of code that "wraps" around a section of code and allows you to isolate it. In the case of a try/catch wrapper, if the code in the try part causes an exception, the catch/except code is called in hopes that the problem can be handled. If it can, the above error will not result in the crashing of the system.

The last entry you see is similar to the C++ exception, except that you are looking at an exception that was handled by the operating system. The number of the exception can sometimes be translated into meaningful text; ReadLogs attempts to do this automatically with every exception number it gets. See Viewing Errorsfor information about how ReadLogs does this.

In all honesty, the above paragraphs just skim the surface of what really is going on. Jeffrey Richter dedicates an entire section to this topic in his book, Programming Applications for Microsoft Windows, Fourth Edition (Microsoft Press).

Building Debug Files

This section addresses two issues: how to build a debug version of an application, and how to enhance the output. The discussion on both pertains strictly to Microsoft Visual C++®, versions 5 and 6. However, similar features are available with other C compilers, Java, and Visual Basic®.

Let's first build a debug version of the custom code you wish to debug. To do this, load the project into Microsoft Developer Studio®. Then complete the following:

  1. On the Build menu, click Set Active Configuration.

  2. Select the debug configuration from the dialog box and click OK.

  3. On the Project menu, click Settings, and then click the Link tab

  4. From the Category drop-down box, click Debug. In the Debug Info section, select Both Formats.

  5. This generates the PDB debug files with both Common Object File Format (COFF) and Microsoft information. CDB uses COFF symbol files, so this step allows CDB to read your symbols.

  6. Rebuild your project. Then, from the debug directory, copy both the DLL and the corresponding PDB file to the appropriate place on your Web server.

    Note The PDB needs to be in the same directory as the DLL. Now you should have symbols that CDB can use.

You can make it more challenging by adding output strings. Remember all of the information that ends up in the log before the debugger trips? You can add stuff to your code to output information as well. Let's say that your DLL makes a connection to some database and you want to know if it really connects. You can have your DLL write the connection status to the debugger, or you can have it print out the value of some variable. Or, if you really want to, you can have it say, "Hello World!!" All you have to do is use an API called OutputDebugString(). This function is designed to write strings out to the attached debugger. You can use wsprintf() to format a string with any information about the program you want, and pass that string to OutputDebugString. Then, anytime that piece of code executes, the string displays in the debug window. If a debugger is not attached, then Windows NT ignores the commands and the program executes normally.

Let's examine a real world situation where this might be helpful. Assume you have a DLL that seems to always cause some sort of stack corruption, but you're not sure what function in the DLL causes the problem. If you put an OutputDebugString at the beginning and end of every function that says something like "Entering <function name>" and "Exiting <function name>", then you can look at the log and see the last function called. This might not always be the offending function, but gives you a place to start your search. You also can use OutputDebugString to monitor variables or other things in your DLL.

Conclusion

Microsoft Technical Support uses Exception Monitor successfully to solve many issues. Along with ReadLogs, Exception Monitor provides a powerful suite of tools for diagnosing exceptions occurring on your server. Although the tool set cannot address every exception, it provides more information on the status of a failure than the server's logging mechanisms.