.NET Matters

Debugging finalizers

Stephen Toub

Q I have a bunch of custom types that, for one reason or another, need to implement IDisposable. I want to make sure that the other developers on my team who use these types always dispose of them correctly. What can I do to warn a teammate who forgets to call Dispose?

A For starters, Visual Studio® 2005 and FxCop can help with this if you perform static analysis on your code. Rule CA2000, described at msdn2.microsoft.com/ms182289, is "Dispose objects before losing scope," which checks to see if any local IDisposable objects are created and then not disposed of before all references to the object are out of scope. Though helpful, it's not a perfect solution, as there's only so much that can be detected through static analysis. What you really need is a way to warn developers using your types when those types are garbage collected without having been disposed, and to do that you can take advantage of finalizers.
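As a rough illustration of the kind of code the rule is aimed at, the sketch below contrasts a local IDisposable that is never disposed with one wrapped in a using statement. The class and method names are hypothetical, and exactly what the analyzer reports depends on its version and configuration.

```csharp
using System;
using System.IO;

// Hypothetical helper type, used only to illustrate the rule.
public static class Ca2000Sketch
{
    // A local IDisposable is created and never disposed before it goes
    // out of scope; this is the pattern the rule is meant to catch.
    public static long LengthWithoutDispose(byte[] bytes)
    {
        MemoryStream stream = new MemoryStream(bytes);
        return stream.Length;
    }

    // The using statement guarantees that Dispose runs before the local
    // goes out of scope, even if an exception is thrown.
    public static long LengthWithDispose(byte[] bytes)
    {
        using (MemoryStream stream = new MemoryStream(bytes))
        {
            return stream.Length;
        }
    }
}
```

Both methods return the same value; the difference is only in whether the stream is disposed deterministically.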

In general, you should avoid implementing finalizers on your types unless you absolutely have to. And with SafeHandles in the Microsoft® .NET Framework 2.0, there are very few reasons you absolutely have to. There are, however, many scenarios in which you should implement IDisposable, including any scenario where your type owns a managed resource that itself implements IDisposable; in such a scenario, your type should provide a Dispose method that in turn calls Dispose on the contained resource, as shown in Figure 1. (For a much more in-depth look at implementing IDisposable, see Shawn Farkas's CLR Inside Out column in the July 2007 issue of MSDN® Magazine at msdn.microsoft.com/msdnmag/issues/07/07/CLRInsideOut.)

Figure 1 Disposing Contained Managed Resources

public class DisposableClass : IDisposable
{
    private SomeOtherDisposableClass _data = ...;

    public DisposableClass()
    {
    }

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this); // if DisposableClass isn't sealed
    }

    protected virtual void Dispose(bool disposing)
    {
        if (disposing) 
        {
            _data.Dispose();
        }
    }

    ...
}

In Figure 1, DisposableClass doesn't directly use any native resources, and thus it doesn't need to provide a finalizer (its Dispose method does still call GC.SuppressFinalize in case a derived type implements a finalizer). However, by adding a finalizer for debugging purposes, you can introduce a way to find out when a class was not properly disposed, as shown in Figure 2. As long as all instances of the type are disposed of properly, the finalizer will never be called; if, however, any instance is not disposed of, the finalizer thread will call its Finalize method (~DisposableClass) after that instance is found to no longer be referenced, which will in turn cause an assertion in the debugger.

Figure 2 Adding a Finalizer for Debugging

public class DisposableClass : IDisposable
{
    private SomeOtherDisposableClass _data = ...;

    public DisposableClass()
    {
    }

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (disposing)
        {
            _data.Dispose();
        }
    }

#if DEBUG
    ~DisposableClass()
    {
        Debug.Fail("You forgot to Dispose this instance.");
    }
#endif

    ...
}

Unfortunately, while the developer using your type will now know that they missed disposing an instance, and they'll know which instance, they may not know where that instance came from. If your types are only being instantiated in one place, that may be sufficient. But if your types are being instantiated from all over the code base, the developers will have no way of knowing where this particular instance came from, making it more difficult for them to figure out how this instance slipped through.

To address that, you can collect additional information into your class at construction time. For example, you could add the following members to record the stack trace, thread, and time at which this instance was created:

#if DEBUG
    private StackTrace _stack = new StackTrace(true);
    private int _threadId = Thread.CurrentThread.ManagedThreadId;
    private DateTime _time = DateTime.UtcNow;
    ...
#endif

You could also modify the call to Debug.Fail to include information from these members when the finalizer is executed, giving the user a better chance of finding exactly where this instance came from and tracking down why it wasn't properly disposed. Note that the preprocessor directives in the previous code snippet (#if/#endif) are used to ensure that all of this code is only compiled into DEBUG builds, rather than RELEASE builds. Retrieving stack traces is a relatively expensive operation, and it's something you should avoid whenever possible if performance is a concern.

Additionally, you'll notice that in Figure 2 I've surrounded the finalizer in the same fashion. This is mostly a matter of style. Finalizers add some overhead to the system, even if they're suppressed, so there's no point in having one on our class if it's not necessary for correct functionality. (When allocated, finalizable objects are added to a finalization list. When these instances are no longer reachable and the GC runs, they're moved to the "FReachable" queue, which is processed by the finalizer thread. Suppressing finalization with GC.SuppressFinalize sets a "do not run my finalizer" flag in the object's header, such that the object will not get moved to the FReachable queue by the GC. As a result, while minimal, there is still overhead to giving an object a finalizer even if the finalizer does nothing or is suppressed.)
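Pulling these pieces together, a minimal sketch of a type that records its creation details and reports them from a DEBUG-only finalizer might look like the following. The type name TrackedDisposable and the message format are assumptions made for illustration; the two-argument Debug.Fail overload carries the details.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical type; only the DEBUG-only members matter here.
public class TrackedDisposable : IDisposable
{
#if DEBUG
    // Captured once, when the instance is constructed.
    private readonly StackTrace _stack = new StackTrace(true);
    private readonly int _threadId = Thread.CurrentThread.ManagedThreadId;
    private readonly DateTime _time = DateTime.UtcNow;
#endif

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this); // a disposed instance never reaches the finalizer
    }

    protected virtual void Dispose(bool disposing)
    {
        // Release contained resources here when disposing is true.
    }

#if DEBUG
    ~TrackedDisposable()
    {
        // Only reached when Dispose was never called on this instance.
        Debug.Fail(
            "You forgot to Dispose this instance.",
            "Created (UTC): " + _time.ToString("o") + Environment.NewLine +
            "Thread: " + _threadId + Environment.NewLine +
            "Stack: " + _stack);
    }
#endif
}
```

In a RELEASE build the fields and the finalizer disappear entirely, so the type carries no finalization overhead there.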

This works well for an individual type, but if you're creating lots of types that need similar behavior, most coding practices dictate that you factor out this code into a reusable class. This is actually a deceptively tricky problem, and I'll spend the rest of this answer walking through several different implementations, providing the pros and cons of each. You can pick and choose from each implementation based on your needs.

One thing that's common to all implementations is a need to store various amounts of information at construction time. For this purpose, I've chosen to create a custom exception class. This exception class, when constructed, retrieves data similar to that mentioned previously. It can then later be thrown when the finalizer is invoked, or its ToString method can be used as the argument to a call to Debug.Fail or something similar. (Note that with the .NET Framework 2.0 and later, throwing an exception from a finalizer thread will, by default, tear down the application. This is different than in previous versions, where exceptions from the finalizer thread would be silently eaten by the runtime. I've chosen not to throw the exception, but if you believe that not disposing an object is a serious enough error to warrant tearing down the process, feel free to uncomment the relevant line.) My InvokedFinalizerException is shown in Figure 3.

Figure 3 InvokedFinalizerException

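using System;
using System.Diagnostics;
using System.Runtime.Serialization;
using System.Threading;

[Serializable]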
public class InvokedFinalizerException : Exception
{
    public readonly DateTime InstantiationTime = DateTime.UtcNow;
    public readonly int InstantiationThreadId = 
        Thread.CurrentThread.ManagedThreadId;
    public readonly StackTrace InstantiationStackTrace = 
        new StackTrace(true);

    public InvokedFinalizerException() : 
        base("A finalizer was invoked.") { }

    public InvokedFinalizerException(string message) : base(message) { }

    public InvokedFinalizerException(string message, Exception inner) :
        base(message, inner) { }

    protected InvokedFinalizerException(
        SerializationInfo info, StreamingContext context) :
        base(info, context) { }

    public override string ToString()
    {
        return
            "Time: " + InstantiationTime + Environment.NewLine +
            "Thread: " + InstantiationThreadId + Environment.NewLine +
            "Stack: " + InstantiationStackTrace;
    }
}

With that in place, the first technique involves moving all of the tracking code into a separate constructible class, FinalizationDebugger, as shown in Figure 4. An instance of this class is stored as a member of your disposable class and is constructed when your class is constructed. When your class is disposed, this separate FinalizationDebugger class is also disposed. If your class isn't disposed, when it's no longer referenced FinalizationDebugger will eventually be collected, and its finalizer will provide all of the relevant information we need to figure out what instance wasn't properly disposed of, where it came from, and so forth. Note that FinalizationDebugger holds a reference to your object, but since your object contains the only reference to this FinalizationDebugger instance, its reference to your object will not prevent your object from being collected (if the .NET Framework garbage collector instead worked on a reference counting scheme rather than the mark-and-sweep scheme it does use, this cycle could be a problem). Of course, as you can see in Figure 4, using this in your disposable class still requires adding a bit of goo. In fact, this isn't much better than our original version we were trying to refactor into a cleaner solution.

Figure 4 Member Approach

Refactored into Separate Class


public sealed class FinalizationDebugger : IDisposable
{
    private InvokedFinalizerException _exc;
    private object _obj;

    public FinalizationDebugger(object obj)
    {
        _exc = new InvokedFinalizerException();
        _obj = obj;
    }

    public void Dispose()
    {
        GC.SuppressFinalize(this);
    }

    ~FinalizationDebugger()
    {
        Debug.Fail(_exc.ToString());
        // throw _exc;
    }
}

Using FinalizationDebugger

public class DisposableClass : IDisposable
{
    private SomeOtherDisposableClass _data = ...;
#if DEBUG
    private FinalizationDebugger _fd;
#endif

    public DisposableClass()
    {
#if DEBUG
       _fd = new FinalizationDebugger(this);
#endif
    }

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (disposing)
        {
            _data.Dispose();
#if DEBUG
            _fd.Dispose();
#endif
        }
    }
}

The next approach takes this one step further by moving all of this code into a base class that your disposable class can derive from. This approach is shown in Figure 5. Notice how clean this makes your disposable class, as the only changes required are adding FinalizationDebuggerBase as the base class for your type, deleting your parameterless Dispose method (since it's already implemented by the base class), and changing your Dispose(bool) method to override the base implementation.

Figure 5 Base Class Approach

Refactored into Base Class


public class FinalizationDebuggerBase : IDisposable
{
#if DEBUG
    private FinalizationDebugger _fd;
#endif

    public FinalizationDebuggerBase()
    {
#if DEBUG
        _fd = new FinalizationDebugger(this);
#endif
    }

    public void Dispose() { Dispose(true); }

    protected virtual void Dispose(bool disposing)
    {
#if DEBUG
        if (disposing) _fd.Dispose();
#endif
    }
}

Using FinalizationDebuggerBase


public class DisposableClass : FinalizationDebuggerBase
{
    private SomeOtherDisposableClass _data = ...;

    public DisposableClass() {}

    protected override void Dispose(bool disposing)
    {
        try
        {
            if (disposing) _data.Dispose();
        }
        finally { base.Dispose(disposing); }
    }
}

Unfortunately, there are several problems with this approach. First, the #if/#endif preprocessor directives are evaluated at compile time when FinalizationDebuggerBase is compiled. If FinalizationDebuggerBase is part of the same project as your DisposableClass, it will always pick up the same DEBUG/RELEASE compilation flags as your class and will be kept in sync, which is almost certainly what you want to happen. But if FinalizationDebuggerBase is in another assembly (for example, if it's compiled into an assembly that's shared by a bunch of projects at your company), the assembly you reference will have already been compiled, which means it won't respect your DEBUG/RELEASE choice for your project. As a result, you could compile your project in RELEASE mode and expect none of this finalization code to be included. But if the FinalizationDebuggerBase class were compiled in DEBUG mode, your expectations would not be met. The second problem, which should be more obvious, is that .NET does not support multiple class inheritance, meaning that a class can only have one base class (though it can implement any number of interfaces). This means if your class already has a base class, you can't use this approach (unless you can somehow finagle this base class into the hierarchy somewhere).

A third option involves factoring everything out into a static class that your instances call into. My implementation of this is shown in Figure 6. In this approach, FinalizationDebugger is a static class that exposes three static methods: Constructor, Dispose, and Finalizer. The idea is that you call these methods from the appropriate place in your class, where the appropriate place should be obvious from the names of these methods (see Figure 6 for an example). This is minimally invasive into your code, as it typically involves adding only three lines of code (though if your type doesn't already have a finalizer, you'd need to add those few additional lines). All of these methods are marked with a ConditionalAttribute such that they'll only be called by your class when you compile your class in DEBUG mode.
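The behavior of ConditionalAttribute can be seen in isolation with a small sketch. The names ConditionalSketch, Record, and Run are hypothetical; the point is that the compiler drops each call to a conditional method unless the symbol is defined where the calling code is compiled.

```csharp
using System;
using System.Diagnostics;

// Hypothetical type used to show how ConditionalAttribute behaves.
public static class ConditionalSketch
{
    public static int Calls;

    // Conditional methods must return void. Calls to this method are
    // emitted only in code compiled with the DEBUG symbol defined.
    [Conditional("DEBUG")]
    public static void Record(string message)
    {
        Calls++;
    }

    public static int Run()
    {
        Calls = 0;
        Record("constructed");
        Record("disposed");
        // 2 when this code is compiled with DEBUG defined, 0 otherwise.
        return Calls;
    }
}
```

Because the decision is made at each call site, the method can live in a separately compiled assembly and still follow the DEBUG/RELEASE choice of the project that calls it.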

Figure 6 Static Class Approach

Refactored into a Static Class


using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class FinalizationDebugger<T> where T : class
{
    private static Dictionary<object, InvokedFinalizerException> _db =
        new Dictionary<object, InvokedFinalizerException>();

    [Conditional("DEBUG")]
    public static void Constructor(T obj)
    {
        if (obj == null) throw new ArgumentNullException("obj");
        ObjectEqualityWeakReference weakRef = 
            new ObjectEqualityWeakReference(obj);
        InvokedFinalizerException exc = new InvokedFinalizerException();
        lock (_db) _db.Add(weakRef, exc);
    }

    [Conditional("DEBUG")]
    public static void Dispose(T obj)
    {
        if (obj == null) throw new ArgumentNullException("obj");
        lock (_db) _db.Remove(obj);
    }

    [Conditional("DEBUG")]
    public static void Finalizer(T obj)
    {
        if (obj == null) throw new ArgumentNullException("obj");
        InvokedFinalizerException exc;
        lock (_db)
        {
            if (_db.TryGetValue(obj, out exc)) _db.Remove(obj);
        }
        if (exc != null)
        {
            Debug.Fail(exc.ToString());
            // throw exc;
        }
    }
    ... // private class ObjectEqualityWeakReference
}

Using FinalizationDebugger<T>


public class DisposableClass : IDisposable
{
    private SomeOtherDisposableClass _data = ...;

    public DisposableClass()
    {
        ... 
        FinalizationDebugger<DisposableClass>.Constructor(this);
    }

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (disposing)
        {
            FinalizationDebugger<DisposableClass>.Dispose(this);
            _data.Dispose();
            ...
        }
    }

    ~DisposableClass()
    {
        FinalizationDebugger<DisposableClass>.Finalizer(this);
        ...
    }
}

The implementation of this class deserves a bit of explanation. FinalizationDebugger contains a static Dictionary that maps objects to our custom exception class that contains all of the information about when and how that associated object was constructed. When an object is constructed and calls the Constructor method, passing itself as an argument, the object is inserted into this table along with an instance of the exception. When the Dispose method is called, the object and associated exception are removed from the table. When the Finalizer method is called, the table is queried to see if the object exists; if it does, the associated exception is retrieved and is used to assert. (It can also be thrown if desired.) There are, of course, a few intricacies to this implementation.

The first is that, as these are static methods and could conceivably be called from multiple threads concurrently, we need to protect access to shared data. To do so, I use a monitor to shield all accesses to the underlying Dictionary.

The second issue is more interesting. The Dictionary is a static member of the FinalizationDebugger class, which makes it a GC root. As such, any objects stored in this Dictionary will never be free for collection by the garbage collector, because they'll always be reachable. If I were to simply store your disposable class instances into this Dictionary, they would never be available for garbage collection, and thus they'd never be finalized, defeating this whole system (not to mention causing a huge memory leak). To solve this, we need weak references.

The System.WeakReference class takes advantage of functionality provided by the garbage collector and System.GCHandle. When you instantiate a WeakReference for an object, internally WeakReference allocates a weak GCHandle (either GCHandleType.Weak or GCHandleType.WeakTrackResurrection) for the object. Rather than storing a reference to your object, it simply stores this GCHandle. A weak GCHandle is used to track an object but still allow it to be collected (when an object is collected, the contents of the GCHandle are zeroed). Thus, WeakReference allows you access to the underlying object, but it doesn't maintain a strong reference to it, such that if the garbage collector runs and the object would otherwise be collectable, it'll be collected and WeakReference will return null on any future attempts to access the object. As such, WeakReferences solve our static Dictionary problem. By storing WeakReferences instead of the actual object, the objects will still be collectable.
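A short sketch of this behavior follows; the type and method names are hypothetical. Whether and when an unreferenced target is actually cleared depends on the runtime and the build configuration, so the second check should be read as an observation rather than a guarantee.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical type used to illustrate WeakReference behavior.
public static class WeakReferenceSketch
{
    // While a strong reference is still in use, the weak reference keeps
    // returning the same object, even across a collection.
    public static bool TargetSurvivesWhileReferenced()
    {
        object strong = new object();
        WeakReference weak = new WeakReference(strong);
        GC.Collect();
        bool same = ReferenceEquals(weak.Target, strong);
        GC.KeepAlive(strong);
        return same;
    }

    // Creates a weak reference to an object that nothing else refers to.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference CreateUnrooted()
    {
        return new WeakReference(new object());
    }

    // After a collection the target is usually gone, but the timing is up
    // to the runtime, so this result is not guaranteed.
    public static bool TargetClearedAfterCollect()
    {
        WeakReference weak = CreateUnrooted();
        GC.Collect();
        GC.WaitForPendingFinalizers();
        return weak.Target == null;
    }
}
```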

That, however, leads to another problem. Dictionary uses an object's hash code as well as its Equals method to determine where in the Dictionary to store the object and whether the object already exists in the Dictionary (for lookups and the like). The WeakReference reference type, however, doesn't override either GetHashCode or Equals; its equality semantics default to those of System.Object, which check whether the two WeakReference instances being compared are the exact same instance. As a result, code such as the following will print out "False" twice:

object a = new object();
Dictionary<object, bool> dict = new Dictionary<object, bool>();
dict.Add(new WeakReference(a), true);
Console.WriteLine(dict.ContainsKey(a));
Console.WriteLine(dict.ContainsKey(new WeakReference(a)));

To work around this, there are two solutions we can attempt. The first would be to modify the FinalizationDebugger.Constructor method to return the created WeakReference. Your DisposableClass could then hold onto this WeakReference and provide it (rather than itself) in the future calls to FinalizationDebugger.Dispose and FinalizationDebugger.Finalizer:

private WeakReference _weakRef;
...
_weakRef = FinalizationDebugger.Constructor(this);
...
FinalizationDebugger.Finalizer(_weakRef);

This introduces another problem, however. The order in which objects are finalized is undefined, and, as a result, it's a bad idea to reference one finalizable instance (that could have already been finalized) in the finalizer of another. WeakReference itself implements a finalizer, and thus the previous code snippet results in attempting to use a WeakReference that may have already been finalized from the finalizer of your disposable type.

The second solution, and the one I've chosen to implement, solves the problem at its core: implementing GetHashCode and Equals on WeakReference. To do so, I've created the class shown in Figure 7, ObjectEqualityWeakReference, which overrides WeakReference's Equals and GetHashCode methods to provide the semantics I need. When ObjectEqualityWeakReference is constructed, it caches into a member variable the hash code of the supplied object, and it is this cached value that's returned from the overridden GetHashCode; that way, the hash code is based on the underlying object. And even if the object is collected, the ObjectEqualityWeakReference will continue to return the same hash code value. An object's hash code should never change, and Dictionary uses the hash code as the first step to finding an object in its table; if the object's hash code changes after the object has been added, there's a good chance Dictionary won't be able to find it.

Figure 7 ObjectEqualityWeakReference

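// Nested inside FinalizationDebugger<T>; RuntimeHelpers lives in
// System.Runtime.CompilerServices.
private class ObjectEqualityWeakReference : WeakReference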
{
    private int _hashCode;

    public ObjectEqualityWeakReference(object obj) :
        base(obj, true)
    {
        _hashCode = RuntimeHelpers.GetHashCode(obj);
    }

    public override bool Equals(object obj)
    {
        WeakReference other = obj as WeakReference;

        if (other == null) return ReferenceEquals(obj, this.Target);

        return ReferenceEquals(other, this) ||
               ReferenceEquals(other.Target, this.Target);
    }

    public override int GetHashCode() { return _hashCode; }
}

ObjectEqualityWeakReference also overrides the Equals method. I wanted Equals to work when comparing an instance of ObjectEqualityWeakReference to another WeakReference referencing the same underlying object or when comparing it directly against that object. As such, Equals first checks to see if the object being compared is another WeakReference. If it is, Equals returns whether the WeakReference instances are the same or whether the underlying objects are the same. If the object being compared is not a WeakReference, Equals returns whether it matches the underlying object. (Wanting to support adding ObjectEqualityWeakReference instances but querying by the underlying object type is the reason that the Dictionary's TKey type parameter is bound to Object rather than to ObjectEqualityWeakReference.)
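To see the effect on lookups, the following sketch repeats the class from Figure 7 inside a hypothetical LookupSketch type and queries a Dictionary both by the underlying object and by a second wrapper. It assumes the tracked object does not override GetHashCode, and it relies on Dictionary invoking Equals on the stored key, which is an implementation detail rather than a documented contract.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

// Hypothetical container type; the nested class mirrors Figure 7.
public static class LookupSketch
{
    private class ObjectEqualityWeakReference : WeakReference
    {
        private readonly int _hashCode;

        public ObjectEqualityWeakReference(object obj) : base(obj, true)
        {
            _hashCode = RuntimeHelpers.GetHashCode(obj);
        }

        public override bool Equals(object obj)
        {
            WeakReference other = obj as WeakReference;
            if (other == null) return ReferenceEquals(obj, this.Target);
            return ReferenceEquals(other, this) ||
                   ReferenceEquals(other.Target, this.Target);
        }

        public override int GetHashCode() { return _hashCode; }
    }

    // Adds a wrapper as the key, then looks it up by the raw object.
    public static bool FoundByObject()
    {
        object a = new object();
        Dictionary<object, bool> dict = new Dictionary<object, bool>();
        dict.Add(new ObjectEqualityWeakReference(a), true);
        bool found = dict.ContainsKey(a);
        GC.KeepAlive(a);
        return found;
    }

    // Adds a wrapper as the key, then looks it up by a second wrapper
    // around the same object.
    public static bool FoundByWrapper()
    {
        object a = new object();
        Dictionary<object, bool> dict = new Dictionary<object, bool>();
        dict.Add(new ObjectEqualityWeakReference(a), true);
        bool found = dict.ContainsKey(new ObjectEqualityWeakReference(a));
        GC.KeepAlive(a);
        return found;
    }
}
```

Under those assumptions both lookups succeed, in contrast to the plain WeakReference snippet shown earlier.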

There's one more important thing to note about ObjectEqualityWeakReference, and that's its constructor:

public ObjectEqualityWeakReference(object obj) : base(obj, true)

This constructor delegates to the constructor of WeakReference that accepts two parameters, not one. The second parameter is a Boolean that indicates to WeakReference when to stop tracking the object. If the value is false, the object is only tracked until finalization, and if the value is true, the object is tracked until after finalization. Because we need the weak reference to keep tracking the object through the call to FinalizationDebugger.Finalizer, we set this to true. Note that this Boolean parameter simply dictates what kind of GCHandle is created. If the value is false, a GCHandleType.Weak handle is created, and if it's true, a GCHandleType.WeakTrackResurrection handle is created. The naming comes from the concept of resurrection, whereby an object can be saved during finalization. If an object's finalizer creates a new rooted reference to the object (such as by storing the object into a static field somewhere), the object will now again be reachable and thus won't be collected. In order for that to work with a WeakReference, the GCHandle must keep tracking the object until after the finalization phase, hence the name "WeakTrackResurrection."
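The two kinds of weak reference can be told apart through the TrackResurrection property, and the corresponding handle type can be allocated directly with GCHandle.Alloc. The sketch below uses hypothetical type and method names and only inspects the references; it does not attempt to demonstrate resurrection itself, which depends on garbage collection timing.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical type used to compare short and long weak references.
public static class TrackResurrectionSketch
{
    // The one-argument constructor creates a short weak reference.
    public static bool ShortReferenceTracksResurrection()
    {
        object target = new object();
        return new WeakReference(target).TrackResurrection;
    }

    // Passing true creates a long weak reference, which keeps tracking
    // the object until after finalization.
    public static bool LongReferenceTracksResurrection()
    {
        object target = new object();
        return new WeakReference(target, true).TrackResurrection;
    }

    // The same kind of handle can be allocated directly; it must be
    // freed explicitly when no longer needed.
    public static bool RawHandleSeesTarget()
    {
        object target = new object();
        GCHandle handle =
            GCHandle.Alloc(target, GCHandleType.WeakTrackResurrection);
        try
        {
            return ReferenceEquals(handle.Target, target);
        }
        finally
        {
            handle.Free();
        }
    }
}
```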

That completes the static class solution, which works well. It does, however, have drawbacks just like the other approaches. For one, due to the static Dictionary that needs to be protected, this approach requires locks, which could cause contention if lots of objects are being constructed and disposed. This is mitigated to some extent in a few ways. First, the work performed in the locked regions is kept to a minimum—I'm simply wrapping a Dictionary.Add call in Constructor, a Dictionary.Remove call in Dispose, and Dictionary.TryGetValue and Dictionary.Remove calls in Finalizer. Second, I've made FinalizationDebugger a generic class based on the type of the objects being tracked. This, in effect, creates a whole new FinalizationDebugger type per type you're working with (with separate locks and dictionaries), and as such only those calls based on the same type will cause contention.
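The per-type separation follows from how static fields work on generic types: each closed constructed type gets its own copy. A minimal sketch, with hypothetical names PerTypeRegistry and PerTypeSketch, makes this visible:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical generic static class; each closed type, such as
// PerTypeRegistry<string>, has its own Table instance.
public static class PerTypeRegistry<T> where T : class
{
    public static readonly Dictionary<object, string> Table =
        new Dictionary<object, string>();
}

public static class PerTypeSketch
{
    // Different type arguments yield different dictionaries.
    public static bool TablesAreDistinct()
    {
        return !ReferenceEquals(
            PerTypeRegistry<string>.Table,
            PerTypeRegistry<Uri>.Table);
    }

    // An entry added under one type argument is invisible to another.
    public static int CountAfterAddingToOther()
    {
        PerTypeRegistry<string>.Table[new object()] = "tracked";
        return PerTypeRegistry<Uri>.Table.Count;
    }
}
```

Since a lock taken on one table never blocks callers working with another, threads that track instances of different types do not contend with each other.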

Another drawback to this approach is that locking in a finalizer is not the greatest thing to do. If for some reason the finalizer isn't able to obtain the lock within a reasonable amount of time, the CLR may simply abort the finalizer, but that's very unlikely given the current code.

The biggest drawback I see with this approach is that it requires your disposable class to implement a finalizer so that it can call the FinalizationDebugger.Finalizer method. If the Finalizer method is never called, the check will never be performed to see whether the object wasn't properly disposed. This isn't a big deal if your class already implements a finalizer, but if it doesn't, you'll need to add one for this purpose.

There are other variants on all of these approaches, and I'm sure there are other approaches I haven't considered. If you have a new approach that you think solves all of the various problems each of these approaches has, I'd be interested in hearing about it. In the meantime, any of these should help you to enable developers using your disposable types to track down cases where they are not properly disposing your objects.

Send your questions and comments for Stephen to netqa@microsoft.com.

Stephen Toub is a Senior Program Manager on the Parallel Computing Platform team at Microsoft. He is also a Contributing Editor for MSDN Magazine.