Sunday, May 26, 2013

.NET Interview Question: Garbage Collection

Q1. What is garbage collection?

Garbage collection is a heap-management strategy where a run-time component takes responsibility for managing the lifetime of the memory used by objects. This concept is not new to .NET - Java and many other languages/runtimes have used garbage collection for some time.

Q2. Is it true that objects don't always get destroyed immediately when the last reference goes away?

Yes. The garbage collector offers no guarantees about the time when an object will be destroyed and its memory reclaimed.

There was an interesting thread on the DOTNET list, started by Chris Sells, about the implications of non-deterministic destruction of objects in C#. In October 2000, Microsoft's Brian Harry posted a lengthy analysis of the problem. Chris Sells' response to Brian's posting is here.

Q3. Why doesn't the .NET runtime offer deterministic destruction?

Because of the garbage collection algorithm. The .NET garbage collector works by periodically running through a list of all the objects that are currently being referenced by an application. All the objects that it doesn't find during this search are ready to be destroyed and the memory reclaimed. The implication of this algorithm is that the runtime doesn't get notified immediately when the final reference on an object goes away - it only finds out during the next 'sweep' of the heap.

Futhermore, this type of algorithm works best by performing the garbage collection sweep as rarely as possible. Normally heap exhaustion is the trigger for a collection sweep.

Q4. Is the lack of deterministic destruction in .NET a problem?

It's certainly an issue that affects component design. If you have objects that maintain expensive or scarce resources (e.g. database locks), you need to provide some way to tell the object to release the resource when it is done. Microsoft recommend that you provide a method called Dispose() for this purpose. However, this causes problems for distributed objects - in a distributed system who calls the Dispose() method? Some form of reference counting or ownership-management mechanism is needed to handle distributed objects - unfortunately the runtime offers no help with this.

Q5. Should I implement Finalize on my class? Should I implement IDisposable?

This issue is a little more complex than it first appears. There are really two categories of class that require deterministic destruction - the first category manipulate unmanaged types directly, whereas the second category manipulate managed types that require deterministic destruction. An example of the first category is a class with an IntPtr member representing an OS file handle. An example of the second category is a class with a System.IO.FileStream member.

For the first category, it makes sense to implement IDisposable and override Finalize. This allows the object user to 'do the right thing' by calling Dispose, but also provides a fallback of freeing the unmanaged resource in the Finalizer, should the calling code fail in its duty. However this logic does not apply to the second category of class, with only managed resources. In this case implementing Finalize is pointless, as managed member objects cannot be accessed in the Finalizer. This is because there is no guarantee about the ordering of Finalizer execution. So only the Dispose method should be implemented. (If you think about it, it doesn't really make sense to call Dispose on member objects from a Finalizer anyway, as the member object's Finalizer will do the required cleanup.) For classes that need to implement IDisposable and override Finalize, see Microsoft's documented pattern.

Note that some developers argue that implementing a Finalizer is always a bad idea, as it hides a bug in your code (i.e. the lack of a Dispose call). A less radical approach is to implement Finalize but include a Debug.Assert at the start, thus signalling the problem in developer builds but allowing the cleanup to occur in release builds.

Q6. Do I have any control over the garbage collection algorithm?

A little. For example the System.GC class exposes a Collect method, which forces the garbage collector to collect all unreferenced objects immediately.

Also there is a gcConcurrent setting that can be specified via the application configuration file. This specifies whether or not the garbage collector performs some of its collection activities on a separate thread. The setting only applies on multi-processor machines, and defaults to true.

Q7. How can I find out what the garbage collector is doing?

Lots of interesting statistics are exported from the .NET runtime via the '.NET CLR xxx' performance counters. Use Performance Monitor to view them.

Q8. What is the lapsed listener problem?

The lapsed listener problem is one of the primary causes of leaks in .NET applications. It occurs when a subscriber (or 'listener') signs up for a publisher's event, but fails to unsubscribe. The failure to unsubscribe means that the publisher maintains a reference to the subscriber as long as the publisher is alive. For some publishers, this may be the duration of the application.

This situation causes two problems. The obvious problem is the leakage of the subscriber object. The other problem is the performance degredation due to the publisher sending redundant notifications to 'zombie' subscribers.

There are at least a couple of solutions to the problem. The simplest is to make sure the subscriber is unsubscribed from the publisher, typically by adding an Unsubscribe() method to the subscriber. Another solution, documented here by Shawn Van Ness, is to change the publisher to use weak references in its subscriber list.

Q9. When do I need to use GC.KeepAlive?

It's very unintuitive, but the runtime can decide that an object is garbage much sooner than you expect. More specifically, an object can become garbage while a method is executing on the object, which is contrary to most developers' expectations. Chris Brumme explains the issue on his blog. I've taken Chris's code and expanded it into a full app that you can play with if you want to prove to yourself that this is a real problem:

using System;
using System.Runtime.InteropServices;
class Win32
{
  [DllImport("kernel32.dll")]
  public static extern IntPtr CreateEvent( IntPtr   lpEventAttributes,
  bool bManualReset,bool bInitialState, string lpName);
  [DllImport("kernel32.dll", SetLastError=true)]
  public static extern bool CloseHandle(IntPtr hObject);
  [DllImport("kernel32.dll")]
  public static extern bool SetEvent(IntPtr hEvent);
}

class EventUser
{
  public EventUser()
  {
    hEvent = Win32.CreateEvent( IntPtr.Zero, false, false, null   );
}
~EventUser()
{
  Win32.CloseHandle( hEvent );
  Console.WriteLine("EventUser finalized");
}
  public void UseEvent()
{
  UseEventInStatic( this.hEvent );
}
  static void UseEventInStatic( IntPtr hEvent )
  Satish Marwat Dot Net Web Resources satishcm@gmail.com 12 Page
{
  //GC.Collect();
  bool bSuccess = Win32.SetEvent( hEvent );
  Console.WriteLine( "SetEvent " + (bSuccess ? "succeeded" :    "FAILED!") );
}
IntPtr hEvent;
}
class App
{
static void Main(string[] args)
{
EventUser eventUser = new EventUser();
 eventUser.UseEvent();
}
}

If you run this code, it'll probably work fine, and you'll get the following output:

SetEvent succeeded
EventDemo finalized

However, if you uncomment the GC.Collect() call in the UseEventInStatic() method, you'll get this output:

EventDemo finalized
SetEvent FAILED!

(Note that you need to use a release build to reproduce this problem.)

So what's happening here? Well, at the point where UseEvent() calls UseEventInStatic(), a copy is taken of the hEvent field, and there are no further references to the EventUser object anywhere in the code. So as far as the runtime is concerned, the EventUser object is garbage and can be collected. Normally of course the collection won't happen immediately, so you'll get away with it, but sooner or later a collection will occur at the wrong time, and your app will fail.

A solution to this problem is to add a call to GC.KeepAlive(this) to the end of the UseEvent method, as Chris explains.

No comments:

Post a Comment

Automatic Traffic Exchange

YallaTech Facebook page