Debug Leaky Apps

Identify And Prevent Memory Leaks In Managed Code

James Kovacs

This article discusses:

  • Understanding memory leaks in managed apps
  • Unmanaged memory used in .NET apps
  • Helping the .NET garbage collector do its job
This article uses the following technologies:
.NET Framework

Code download available at: Memory Leaks 2007_01.exe (163 KB)

Contents

Memory in .NET Applications
Checking for Leaks
Leaking Stack Memory
Leaking Unmanaged Heap Memory
"Leaking" Managed Heap Memory
Conclusion

The first reaction many developers have to the idea of memory leaks in managed code is that it's not possible. After all, the garbage collector (GC) takes care of all memory management, right? The garbage collector only handles managed memory, though. There are a number of places where unmanaged memory is used in Microsoft® .NET Framework-based applications, either by the common language runtime (CLR) itself, or explicitly by the programmer when interoperating with unmanaged code. There are also occasions where the GC seems to be shirking its duties and not efficiently handling managed memory. Usually this is caused by subtle (or not so subtle) programming errors that hinder the GC from performing its job. As good memory citizens, we still have to profile our applications to ensure they are leak-free and make efficient use of the memory they require.

Memory in .NET Applications

As you probably know, .NET applications make use of several types of memory: the stack, the unmanaged heap, and the managed heap. Here's a little refresher.

The Stack The stack is where local variables, method parameters, return values, and other temporary values are stored during the execution of an application. A stack is allocated on a per-thread basis and serves as a scratch area for the thread to perform its work. The GC is not responsible for cleaning up the stack because the space reserved for a method call is automatically reclaimed when the method returns. Note, however, that the GC is aware of references to objects stored on the stack. When an object is instantiated in a method, its reference (a 32-bit or 64-bit value, depending on the platform) is kept on the stack, but the object itself is stored on the managed heap and is collected by the garbage collector once it is no longer reachable.

The Unmanaged Heap The unmanaged heap is used for runtime data structures, method tables, Microsoft intermediate language (MSIL), JITed code, and so forth. Unmanaged code will allocate objects on the unmanaged heap or stack depending on how the object is instantiated. Managed code can allocate unmanaged heap memory directly by calling into unmanaged Win32® APIs or by instantiating COM objects. The CLR itself uses the unmanaged heap extensively for its data structures and code.

The Managed Heap The managed heap is where managed objects are allocated and it is the domain of the garbage collector. The CLR uses a generational, compacting GC. The GC is generational in that it ages objects as they survive garbage collections; this is a performance enhancement. All versions of the .NET Framework have used three generations, Gen0, Gen1, and Gen2 (from youngest to oldest). The GC is compacting in that it relocates objects on the managed heap to eliminate holes and keep free memory contiguous. Moving large objects is expensive, so the GC allocates them on a separate large object heap, which is not compacted. For more information on the managed heap and GC, see Jeffrey Richter's two-part series, "Garbage Collection: Automatic Memory Management in the Microsoft .NET Framework" and "Garbage Collection-Part 2: Automatic Memory Management in the Microsoft .NET Framework". Although the articles were written for the .NET Framework 1.0, the core concepts have not changed in versions 1.1 or 2.0, even though the GC implementation has improved since then.

Checking for Leaks

There are a number of telltale signs that an application is leaking memory. Maybe it's throwing an OutOfMemoryException. Maybe it's becoming sluggish because it has started swapping virtual memory to disk. Maybe memory use is gradually (or not so gradually) increasing in Task Manager. When a memory leak is suspected, you must first determine what kind of memory is leaking, as that will let you focus your debugging efforts in the correct area. Use PerfMon to examine the following performance counters for the application: Process/Private Bytes, .NET CLR Memory/# Bytes in All Heaps, and .NET CLR LocksAndThreads/# of current logical Threads. The Process/Private Bytes counter reports all memory that is exclusively allocated for a process and can't be shared with other processes on the system. The .NET CLR Memory/# Bytes in All Heaps counter reports the combined size of the Gen0, Gen1, Gen2, and large object heaps. The .NET CLR LocksAndThreads/# of current logical Threads counter reports the number of logical threads in an AppDomain. If an application's logical thread count is increasing unexpectedly, thread stacks are leaking. If Private Bytes is increasing while # Bytes in All Heaps remains stable, unmanaged memory is leaking. If both counters are increasing, memory in the managed heaps is building up.
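
These counters can also be read from code using the System.Diagnostics.PerformanceCounter class, which is handy for automated leak tests. Here is a minimal sketch; "MyApp" is a placeholder for the instance name of the process being profiled, and the counter names should be verified against the ones registered on your machine:

using System;
using System.Diagnostics;
using System.Threading;

class CounterWatcher {
  static void Main() {
    string instance = "MyApp"; // placeholder: the process instance name in PerfMon

    PerformanceCounter privateBytes = new PerformanceCounter(
      "Process", "Private Bytes", instance);
    PerformanceCounter heapBytes = new PerformanceCounter(
      ".NET CLR Memory", "# Bytes in All Heaps", instance);
    PerformanceCounter logicalThreads = new PerformanceCounter(
      ".NET CLR LocksAndThreads", "# of current logical Threads", instance);

    while(true) {
      Console.WriteLine("Private Bytes: {0:N0}  All Heaps: {1:N0}  Threads: {2}",
        privateBytes.NextValue(), heapBytes.NextValue(),
        logicalThreads.NextValue());
      Thread.Sleep(1000); // sample once per second
    }
  }
}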

Leaking Stack Memory

Although it is possible to run out of stack space, which results in a StackOverflowException in the managed world, any stack space used during a method call is reclaimed once that method returns. Therefore, there are only two real ways to leak stack space. The first is to have a method call that consumes significant stack resources and that never returns, thereby never releasing the associated stack frame. The other is by leaking a thread, and thus that thread's entire stack. If an application creates worker threads for performing background work, but neglects to terminate them properly, thread stacks can be leaked. By default, the stack size on modern desktop and server versions of Windows® is 1MB. So if an application's Process/Private Bytes is periodically jumping in 1MB increments with a corresponding increase in .NET CLR LocksAndThreads/# of current logical Threads, a thread stack leak is very likely the culprit. Figure 1 shows one example of improper thread cleanup caused by (purposely bad) multithreaded logic.

Figure 1 Buggy Thread Cleanup

using System;
using System.Threading;

namespace MsdnMag.ThreadForker {
  class Program {
    static void Main() {
      while(true) {
        Console.WriteLine(
          "Press <ENTER> to fork another thread...");
        Console.ReadLine();
        Thread t = new Thread(new ThreadStart(ThreadProc));
        t.Start();
      }
    }

    static void ThreadProc() {
      Console.WriteLine("Thread #{0} started...", 
        Thread.CurrentThread.ManagedThreadId);
      // Block until current thread terminates - i.e. wait forever
      Thread.CurrentThread.Join();
    }
  }
}

A thread is launched, which displays its thread ID and then tries to Join on itself. Join causes the calling thread to block, waiting on the other thread to terminate. So the thread is caught in a chicken-or-egg scenario: it is waiting for itself to terminate. Watch this program under Task Manager to see its memory usage increase by 1MB, the size of a thread stack, every time <Enter> is pressed.

The reference to the Thread object is being dropped every time through the loop, but the GC does not reclaim the memory allocated for the thread stack. A managed thread's lifetime is independent of the Thread object that creates it, a very good thing given that you wouldn't want the GC to terminate a thread that was still doing work simply because you lost all references to the associated Thread object. So the GC is collecting the Thread object, but not the actual managed thread. The managed thread does not exit (and the memory for its thread stack is not released) until its ThreadProc returns or it is explicitly killed. So if a managed thread is not properly terminated, the memory allocated to its thread stack will leak.
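
The fix is to give the worker thread a well-defined way to exit and to terminate it before dropping the reference. Here is one common pattern, sketched under the assumption that the thread's work can be done in chunks; DoSomeWork is a stand-in for the real work:

static ManualResetEvent s_stopRequested = new ManualResetEvent(false);

static void ThreadProc() {
  Console.WriteLine("Thread #{0} started...",
    Thread.CurrentThread.ManagedThreadId);
  // Do a chunk of work, then check whether a stop has been requested
  while(!s_stopRequested.WaitOne(0, false)) {
    DoSomeWork(); // stand-in for the thread's real work
  }
  // ThreadProc returns, the thread exits, and its 1MB stack is reclaimed
}

// When the thread is no longer needed:
//   s_stopRequested.Set();
//   t.Join();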

Leaking Unmanaged Heap Memory

If total memory use is increasing, but the logical thread count and managed heap memory are not increasing, there is a leak in the unmanaged heap. We will examine some common causes for leaks in the unmanaged heap, including interoperating with unmanaged code, aborted finalizers, and assembly leaks.

Interoperating with Unmanaged Code One source of memory leaks involves interoperating with unmanaged code, such as when C-style DLLs are used through P/Invoke and COM objects through COM interop. The GC is unaware of unmanaged memory, and thus a leak here is due to a programming error in the managed code using the unmanaged memory. If an app is interoperating with unmanaged code, step through the code and examine memory usage before and after the unmanaged call to verify that memory is being reclaimed properly. If it isn't, look for the leak in the unmanaged component using traditional debugging techniques.
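
For example, unmanaged heap memory obtained from managed code through Marshal.AllocHGlobal is invisible to the GC and must be released explicitly. A minimal sketch of the discipline required; FillBuffer stands in for whatever unmanaged call the application actually makes:

// Requires: using System.Runtime.InteropServices;
IntPtr buffer = Marshal.AllocHGlobal(1024); // allocated on the unmanaged heap
try {
  FillBuffer(buffer, 1024); // stand-in for the real P/Invoke call
}
finally {
  // Without this call, the 1,024 bytes are leaked; the GC will never free them
  Marshal.FreeHGlobal(buffer);
}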

Aborted Finalizers A very insidious leak occurs when an object's finalizer contains code to clean up unmanaged memory allocated by the object, but that finalizer never gets called. Under normal conditions, finalizers will get called, but the CLR does not make any guarantees. While this may change in the future, current versions of the CLR use only one finalizer thread. Consider a misbehaving finalizer trying to log information to a database that is offline. If that misbehaving finalizer erroneously tries over and over again to access the database, never returning, any well-behaved finalizers queued behind it will never get a chance to run. This problem can manifest itself very sporadically because it depends on the order of finalizers on the finalization queue as well as the behavior of other finalizers.
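
To make the failure concrete, here is a sketch of such a misbehaving finalizer; LogToDatabase stands in for whatever the finalizer is really trying to do:

class NoisyFinalizer {
  ~NoisyFinalizer() {
    // If the database is offline, this loop never returns and the single
    // finalizer thread is stuck here; finalizers queued behind this one never run
    while(true) {
      try { LogToDatabase("NoisyFinalizer collected"); return; }
      catch { Thread.Sleep(1000); } // retry forever
    }
  }
}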

When an AppDomain is torn down, the CLR will attempt to clear the finalizer queue by running all finalizers. A stalled finalizer can prevent the CLR from completing the AppDomain teardown. To account for this, the CLR puts a timeout on this process, after which it stops running finalizers. Typically, this isn't the end of the world, as most applications only have one AppDomain, and its teardown is due to the process being shut down. When an OS process is shut down, its resources are recovered by the operating system. Unfortunately, in a hosting situation such as ASP.NET or SQL Server™, the teardown of an AppDomain doesn't mean the teardown of the hosting process. Another AppDomain can be spun up in the same process. Any unmanaged memory that was leaked by a component because its finalizer didn't run will still be sitting around unreferenced, unreachable, and taking up space. This can be disastrous as more and more memory is leaked over time.

In .NET 1.x, the only solution was to tear down the process and start again. The .NET Framework 2.0 introduces critical finalizers, which indicate that a finalizer will be cleaning up unmanaged resources and must be given a chance to run during AppDomain teardown. See Stephen Toub's article, "Keep Your Code Running with the Reliability Features of the .NET Framework" for more information.
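
Here is a sketch of what a critical finalizer might look like for a raw Win32 handle. In practice you would usually just derive from SafeHandle, which is itself a critical finalizer, rather than writing this by hand:

using System;
using System.Runtime.ConstrainedExecution;
using System.Runtime.InteropServices;

class NativeHandleHolder : CriticalFinalizerObject {
  private IntPtr m_handle; // acquired through P/Invoke; details omitted

  ~NativeHandleHolder() {
    // Runs with stronger guarantees than an ordinary finalizer,
    // including during AppDomain teardown
    if(m_handle != IntPtr.Zero)
      CloseHandle(m_handle);
  }

  [DllImport("kernel32.dll")]
  private static extern bool CloseHandle(IntPtr handle);
}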

Assembly Leaks Assembly leaks are relatively common and are caused by the fact that once an assembly is loaded, it can't be unloaded until the AppDomain is unloaded. In most cases, this is not a problem unless assemblies are being dynamically generated and loaded. Let's now look at dynamic code generation leaks, and specifically XmlSerializer leaks, in more detail.

Dynamic Code Generation Leaks Sometimes code needs to be generated dynamically. Maybe the application has a macro scripting interface for extensibility similar to Microsoft Office. Maybe a bond-pricing engine needs to load the pricing rules dynamically so end users can create their own bond types. Maybe the application is a dynamic language runtime/compiler for Python. In many cases, it is desirable to compile the macros, pricing rules, or code to MSIL for performance reasons. System.CodeDom can be used to generate MSIL on the fly.

The code in Figure 2 dynamically generates an assembly in memory. It can be called repeatedly without a problem. Unfortunately if the macro, pricing rule, or code changes, the dynamic assembly must be regenerated. The old assembly will no longer be used, but there is no way to evict it from memory, short of unloading the AppDomain in which the assembly was loaded. The unmanaged heap memory, which is used for its code, JITed methods, and other runtime data structures, has been leaked. (Managed memory has also been leaked in the form of any static fields on the dynamically generated classes.) There is no magic formula to detect this problem. If you're dynamically generating MSIL using System.CodeDom, check whether you regenerate code. If you do, you're leaking unmanaged heap memory.

Figure 2 Dynamically Generating an Assembly in Memory

// Requires: using System.CodeDom; using System.CodeDom.Compiler;
// and using Microsoft.CSharp;
CodeCompileUnit program = new CodeCompileUnit();
CodeNamespace ns = new 
  CodeNamespace("MsdnMag.MemoryLeaks.CodeGen.CodeDomGenerated");
ns.Imports.Add(new CodeNamespaceImport("System"));
program.Namespaces.Add(ns);

CodeTypeDeclaration class1 = new CodeTypeDeclaration("CodeDomHello");
ns.Types.Add(class1);
CodeEntryPointMethod start = new CodeEntryPointMethod();
start.ReturnType = new CodeTypeReference(typeof(void));
CodeMethodInvokeExpression cs1 = new CodeMethodInvokeExpression(
  new CodeTypeReferenceExpression("System.Console"), "WriteLine", 
    new CodePrimitiveExpression("Hello, World!"));
start.Statements.Add(cs1);
class1.Members.Add(start);

CSharpCodeProvider provider = new CSharpCodeProvider();
CompilerParameters options = new CompilerParameters();
options.GenerateInMemory = true; // compile to an in-memory assembly
CompilerResults results = provider.CompileAssemblyFromDom(options, program);

There are two main techniques for solving this problem. The first is to load the dynamically generated MSIL into a child AppDomain. The child AppDomain can be unloaded when the generated code changes and a new one spun up to host the updated MSIL. This technique works on all versions of the .NET Framework.
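
Here is a sketch of the child AppDomain approach. The assembly and type names are placeholders, and the proxy type must derive from MarshalByRefObject so that calls can cross the AppDomain boundary:

// Spin up a child AppDomain to host the generated code
AppDomain codeGenDomain = AppDomain.CreateDomain("CodeGenDomain");

// "GeneratedAssembly" and "GeneratedType" are placeholder names; the
// returned proxy is used through an interface defined in a shared assembly
object generated = codeGenDomain.CreateInstanceAndUnwrap(
  "GeneratedAssembly", "MsdnMag.MemoryLeaks.CodeGen.GeneratedType");

// ... use the generated code through the proxy ...

// When the generated code changes, unload the child AppDomain;
// the old assembly and its runtime data structures go with it
AppDomain.Unload(codeGenDomain);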

Another technique introduced in .NET Framework 2.0 is lightweight code generation, also known as dynamic methods. Using a DynamicMethod, MSIL op codes are explicitly emitted to define the method body, and then the DynamicMethod is invoked either directly via DynamicMethod.Invoke or via a suitable delegate.

// Requires: using System.Reflection; using System.Reflection.Emit;
DynamicMethod dm = new DynamicMethod("tempMethod" + 
  Guid.NewGuid().ToString(), null, null, this.GetType());
ILGenerator il = dm.GetILGenerator();

il.Emit(OpCodes.Ldstr, "Hello, World!");
MethodInfo cw = typeof(Console).GetMethod("WriteLine", 
  new Type[] { typeof(string) });
il.Emit(OpCodes.Call, cw);

dm.Invoke(null, null);

The main advantage of dynamic methods is that the MSIL and all related code generation data structures are allocated on the managed heap. This means that the GC can reclaim the memory once the last reference to the DynamicMethod goes out of scope.

XmlSerializer Leaks Portions of the .NET Framework, such as the XmlSerializer, use dynamic code generation internally. Consider the following typical XmlSerializer code:

XmlSerializer serializer = new XmlSerializer(typeof(Person));
serializer.Serialize(outputStream, person);

The XmlSerializer constructor will generate a pair of classes derived from XmlSerializationReader and XmlSerializationWriter by analyzing the Person class using reflection. It will create temporary C# files, compile the resulting files into a temporary assembly, and finally load that assembly into the process. Code generation like this is relatively expensive, so the XmlSerializer caches the temporary assemblies on a per-type basis. This means that the next time an XmlSerializer for the Person class is created, the cached assembly is used rather than a new one being generated.

By default, the XmlElement name used by the XmlSerializer is the name of the class. Thus, Person would be serialized as:

<?xml version="1.0" encoding="utf-8"?>
<Person xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
  xmlns:xsd="http://www.w3.org/2001/XMLSchema">
 <Id>5d49c002-089d-4445-ac4a-acb8519e62c9</Id>
 <FirstName>John</FirstName>
 <LastName>Doe</LastName>
</Person>

Sometimes it is necessary to change the root element name without changing the class name. (The root element name might be required for compatibility with an existing schema.) So Person may have to be serialized as <PersonInstance>. Conveniently, there is an overload of the XmlSerializer constructor that takes the root element name as its second parameter, like this:

XmlSerializer serializer = new XmlSerializer(typeof(Person), 
  new XmlRootAttribute("PersonInstance"));

When the app starts serializing/deserializing Person objects, everything works until an OutOfMemoryException is thrown. This overload of the XmlSerializer constructor does not cache the dynamically generated assembly, but generates a new temporary assembly every time you instantiate a new XmlSerializer! The app is leaking unmanaged memory in the form of temporary assemblies.

To fix the leak, use the XmlRootAttribute on the class to change the root element name of the serialized type:

[XmlRoot("PersonInstance")]
public class Person {
  // code
}

If the attribute is applied directly to the type, the XmlSerializer caches the generated assemblies for the type and there is no leak. If root element names need to be dynamically switched, the application can perform the caching of the XmlSerializer instances itself by using a factory to retrieve them:

XmlSerializer serializer = XmlSerializerFactory.Create(
  typeof(Person), "PersonInstance");

XmlSerializerFactory is a class I created that checks whether a Dictionary<TKey, TValue> contains an XmlSerializer for Person with the PersonInstance root element name. If it does, the cached instance is returned. If not, a new one is created, added to the dictionary, and returned to the caller.
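
The full factory is in the code download, but a minimal sketch (ignoring thread safety) might look like this:

// Requires: using System.Collections.Generic; using System.Xml.Serialization;
static class XmlSerializerFactory {
  // Cache keyed by type name plus root element name
  private static Dictionary<string, XmlSerializer> s_serializers =
    new Dictionary<string, XmlSerializer>();

  public static XmlSerializer Create(Type type, string rootElementName) {
    string key = type.FullName + ":" + rootElementName;
    XmlSerializer serializer;
    if(!s_serializers.TryGetValue(key, out serializer)) {
      serializer = new XmlSerializer(type,
        new XmlRootAttribute(rootElementName));
      s_serializers.Add(key, serializer);
    }
    return serializer;
  }
}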

"Leaking" Managed Heap Memory

Now let's turn our attention to "leaking" managed memory. When dealing with managed memory, the GC takes care of most of the work for us. We do need to provide the GC with the information it needs to do its job. However, there are a number of scenarios that prevent the GC from doing its job efficiently and result in higher managed memory use than would otherwise be required. These situations include large object heap fragmentation, unneeded rooted references, and a midlife crisis.

Large Object Heap Fragmentation If an object is 85,000 bytes or larger, it is allocated on the large object heap. Note that this is the size of the object itself and not any children. Take the following class as an example:

public class Foo {
  private byte[] m_buffer = new byte[90000]; // large object heap
}

Foo instances would be allocated on the normal generational managed heap because each instance contains only a 4-byte (32-bit Framework) or 8-byte (64-bit Framework) reference to the buffer, plus some other housekeeping data used by the .NET Framework. The buffer itself would be allocated on the large object heap.

Unlike the rest of the managed heap, the Large Object Heap is not compacted due to the cost of moving the large objects. So as large objects are allocated, freed, and cleaned up, gaps will appear. Depending on usage patterns, the gaps in the large object heap can result in significantly more memory usage than is required by the currently allocated large objects. The LOHFragmentation application that is included in this month's download demonstrates this by randomly allocating and freeing byte arrays in the Large Object Heap. Some runs of the application result in the newly created byte arrays fitting nicely into the gaps left by freed byte arrays. On other runs of the application, this is not the case and the memory required is much larger than the memory required for the currently allocated byte arrays. To visualize fragmentation of the large object heap, use a memory profiler, such as the CLRProfiler. The red regions in Figure 3 are allocated byte arrays whereas white regions are unallocated space.

Figure 3 The Large Object Heap in CLRProfiler


There is no single solution for avoiding Large Object Heap fragmentation. Examine how the application uses memory and specifically the types of objects that are on the large object heap using tools like the CLRProfiler. If the fragmentation is due to re-allocating buffers, maintain a fixed set of buffers that are reused. If the fragmentation is being caused by concatenation of large numbers of strings, examine whether the System.Text.StringBuilder class can reduce the number of temporary strings created. The basic strategy is to determine how to reduce the application's reliance on temporary large objects, which are causing the gaps in the large object heap.
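
For example, a simple pool of reusable buffers keeps the same large arrays alive instead of repeatedly allocating and abandoning them on the large object heap. This sketch uses an arbitrary buffer size and ignores thread safety:

// Requires: using System.Collections.Generic;
class BufferPool {
  private const int BufferSize = 90000; // large enough to land on the large object heap
  private Stack<byte[]> m_buffers = new Stack<byte[]>();

  public byte[] Acquire() {
    // Reuse an existing buffer if one is available
    return m_buffers.Count > 0 ? m_buffers.Pop() : new byte[BufferSize];
  }

  public void Release(byte[] buffer) {
    // Return the buffer to the pool instead of letting it become garbage
    m_buffers.Push(buffer);
  }
}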

Unneeded Rooted References Let's consider how the GC determines when it can reclaim memory. When the CLR attempts to allocate memory and has insufficient memory in reserve, it performs a garbage collection. The GC enumerates all rooted references, including static fields and in-scope local variables on any thread's call stack. It marks these references as reachable and follows any references these objects contain, marking them as reachable as well. It continues this process until it has visited all reachable references. Any unmarked objects are not reachable and hence are garbage. The GC compacts the managed heap, tidies up references to point to their new location in the heap, and returns control to the CLR. If sufficient memory has been freed, the allocation proceeds using this freed memory. If not, additional memory is requested from the operating system.

If we forget to null out rooted references, the GC is prevented from efficiently freeing memory as quickly as possible, resulting in a larger memory footprint for the application. The problem can be subtle, such as a method that creates a large graph of temporary objects before making a remote call like a database query or call to a Web service. If a garbage collection happens during the remote call, the entire graph is marked reachable and is not collected. This becomes even more costly because objects surviving a collection are promoted to the next generation, which can lead to a midlife crisis.
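
A sketch of the idea follows; ReportData, BuildReport, and CallRemoteService are stand-ins. How much this helps depends on whether the JIT still considers the local live across the call, but dropping the reference before blocking gives the GC the chance to reclaim the graph:

void GenerateAndSend() {
  ReportData report = BuildReport();   // large temporary object graph (stand-in)
  string payload = report.ToXml();

  // The graph is no longer needed; drop the reference before blocking
  report = null;

  // If a collection occurs during this long call, the report graph is
  // unreachable and can be collected instead of being promoted
  CallRemoteService(payload);
}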

Midlife Crisis A midlife crisis does not cause an application to go out and buy a Porsche. It can, however, cause an overuse of managed heap memory and excessive amounts of processor time spent in the GC. As mentioned previously, the GC uses a generational algorithm, which is predicated on the heuristic that if an object has lived a while, it will probably live a while longer. For example, in a Windows Forms application, the main form is created when the application starts and the application exits when the main form closes. It is wasteful for the GC to continually verify that the main form is being referenced. When the system requires memory to satisfy an allocation request, it first performs a Gen0 collection. If sufficient memory is not available, a Gen1 collection is performed. If the allocation request still can't be satisfied, a Gen2 collection is performed, which involves an expensive sweep of the entire managed heap. Gen0 collections are relatively inexpensive because only recently allocated objects are considered for collection.

A midlife crisis occurs when objects tend to live until Gen1 (or worse, Gen2), but die shortly thereafter. This has the effect of turning cheap Gen0 collections into much more expensive Gen1 (or Gen2) collections. How can this occur? Take a look at the following code:

class Foo {
  ~Foo() { }
}

This object's memory will never be reclaimed before a Gen1 collection! The finalizer, ~Foo(), allows us to implement cleanup code for our objects that, barring a rude AppDomain abort, will run before the object's memory is freed. The GC's job is to free up as much managed memory as possible as quickly as possible. Finalizers are user-written code and can do absolutely anything. Although not recommended, a finalizer could do something silly such as logging to a database or calling Thread.Sleep(int.MaxValue). So when the GC finds an unreferenced object with a finalizer, it places the object on the finalization queue and moves on. The object has survived a garbage collection and hence is promoted a generation. There is even a performance counter for this: .NET CLR Memory/Finalization Survivors, which reports the number of objects that survived the last garbage collection only because they were awaiting finalization. Eventually the finalizer thread will run the object's finalizer, and only then can its memory be collected. But you have turned a cheap Gen0 collection into a Gen1 collection, all by simply adding a finalizer!
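
You can watch the promotion happen with GC.GetGeneration. The exact numbers can vary with the GC flavor and build settings, so treat this sketch as illustrative:

Foo f = new Foo();
Console.WriteLine(GC.GetGeneration(f)); // typically 0

GC.Collect(); // f is still referenced; it survives and is promoted
Console.WriteLine(GC.GetGeneration(f)); // typically 1

f = null;
GC.Collect();                  // finalizable: queued for finalization, not freed
GC.WaitForPendingFinalizers(); // let the finalizer thread run ~Foo()
GC.Collect();                  // only now can the instance's memory be reclaimed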

In most cases, finalizers are not necessary when writing managed code. They are only needed when a managed object holds a reference to an unmanaged resource that needs cleanup, and even then you should use a SafeHandle-derived type to wrap the unmanaged resource rather than implementing a finalizer. Additionally, if you're using unmanaged resources or other managed types that implement IDisposable, implement the Dispose pattern to allow users of the object to aggressively clean up the resources and avoid any related finalization.
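
A sketch of the recommended shape, using SafeFileHandle from Microsoft.Win32.SafeHandles as an example of a SafeHandle-derived wrapper:

using System;
using Microsoft.Win32.SafeHandles;

class NativeFileWrapper : IDisposable {
  // The SafeFileHandle owns the unmanaged handle and has its own
  // critical finalizer, so this class needs no finalizer of its own
  private SafeFileHandle m_handle;

  public NativeFileWrapper(SafeFileHandle handle) {
    m_handle = handle;
  }

  public void Dispose() {
    // Deterministic cleanup; callers who forget still get the
    // SafeFileHandle's finalizer as a backstop
    if(m_handle != null && !m_handle.IsInvalid)
      m_handle.Dispose();
  }
}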

If an object only holds references to other managed objects, the GC will clean up the unreferenced objects; this is in stark contrast to C++, where delete must be called on child objects. If a finalizer is empty or simply nulls out references to child objects, remove it. It is hurting performance by needlessly promoting the object to an older generation, making it more expensive to clean up.

There are other ways to get into a midlife crisis, such as holding onto objects before making a blocking call like querying a database, blocking on another thread, or calling a Web service. During the call, one or more collections can occur and result in cheap Gen0 objects being promoted to a later generation, again resulting in much higher memory usage and collection costs.

There is an even more subtle case that occurs with event handlers and callbacks. I will use ASP.NET as an example, but the same type of problem can occur in any application. Consider performing an expensive query and wanting to cache the results for 5 minutes. The query is page-specific and based on query-string parameters. To monitor caching behavior, an event handler logs when an item is removed from the cache (see Figure 4).

Figure 4 Logging Items Removed from Cache

protected void Page_Load(object sender, EventArgs e) {
  string cacheKey = buildCacheKey(Request.Url, Request.QueryString);
  object cachedObject = Cache.Get(cacheKey);
  if(cachedObject == null) {
    cachedObject = someExpensiveQuery();
    Cache.Add(cacheKey, cachedObject, null, 
      Cache.NoAbsoluteExpiration,
      TimeSpan.FromMinutes(5), CacheItemPriority.Default, 
      new CacheItemRemovedCallback(OnCacheItemRemoved));
  }
  ... // Continue with normal page processing
}

private void OnCacheItemRemoved(string key, object value,
                CacheItemRemovedReason reason) {
  ... // Do some logging here
}

This innocuous-looking code contains a major problem. All of these ASP.NET Page instances just became long-lived objects. OnCacheItemRemoved is an instance method, and the CacheItemRemovedCallback delegate contains an implicit this pointer, where this is the Page instance. The delegate is added to the Cache object, so there now exists a dependency from the Cache to the delegate to the Page instance. When a garbage collection occurs, the Page instance remains reachable from a rooted reference, the Cache object. The Page instance (and all the temporary objects it created while rendering) will now have to wait for at least five minutes before being collected, during which time they will likely be promoted to Gen2. Fortunately this example has a simple solution: make the callback function static. The dependency on the Page instance is broken and it can now be collected cheaply as a Gen0 object.
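
The corrected callback is simply declared static so the delegate no longer captures a this pointer:

private static void OnCacheItemRemoved(string key, object value,
                CacheItemRemovedReason reason) {
  ... // Do some logging here; no Page instance state is touched
}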

Conclusion

I have discussed a variety of problems in .NET applications that can lead to memory leaks or overconsumption of memory. Although .NET reduces the need for you to be concerned with memory, you still must pay attention to your application's use of memory to ensure that it is well-behaved and efficient. Just because an application is managed doesn't mean you can throw good software engineering practices out the window and count on the GC to perform magic. You must continue to monitor your application's memory performance counters during the development and testing process. But it's worth it. Remember, a well-behaved application means happy customers.

James Kovacs is an independent architect, developer, trainer, and jack-of-all-trades living in Calgary, Alberta specializing in the .NET Framework, security, and enterprise application development. He is a Microsoft MVP for Solutions Architecture and received his Masters degree from Harvard University. James can be reached at jkovacs@post.harvard.edu or www.jameskovacs.com.