On reference counting and garbage collection

¶On reference counting and garbage collection

I find it interesting that both Apple and Microsoft have opted to invest in compiler-assisted garbage collection through reference counting. It's called Automatic Garbage Collection (ARC) on Mac OS X / iOS and it's part of the Windows Runtime environment on the Microsoft side. This is despite reference counting being regarded as the weakest and poorest performing type of garbage collection.

Wikipedia has a much better writeup of reference counting than I could do, but in summary, you find a place to associate a reference count with each object, increment and decrement the reference count whenever references are created or destroyed, and destroy the object when the count reaches zero. This can either be done either intrusively (COM AddRef()/Release()) or non-intrusively (std::shared_ptr). Refcounting is thus a simple and effective mechanism, but the cost of maintaining the reference counts and the inability to handle cycles often make tracing garbage collection a preferable option for language runtimes.

The commonly cited problem with tracing GC is the delay imposed by the mark-and-sweep operation. There are lots of approaches used to mitigating this, such as generational and incremental collection, and for the most part I'd say they're effective on desktop apps, as we're long past the point where a Java program would visibly pause on screen while the GC cleaned house. What I don't see mentioned as often is the cost of getting this in place, namely:

The need to emit enough metadata to allow for precise garbage collection, where the GC can definitively distinguish references from non-references in memory. This is required for a compaction and still highly desirable for non-compacting GC. I've tried imprecise GC, and trust me, it sucks. You don't realize how bad it is until you've had 10MB of memory leaked by a float that looked like a heap pointer.
The need to modify code generation to suit the GC, usually to assist in tracking modified references in the heap. This can be done through paging hardware, but that requires OS support and only achieves page granularity (often 4K or 64K depending on the hardware). Doing it in software avoids these problems but requires
tailored code generation and imposes a performance penalty. For example, the .NET CLR has to call a small function every time you modify a reference in order to update a card marking table.
The interop selfishness imposed by the GC. At a minimum, this is the need for the GC to see all external references to objects in its heap ("GC roots"). For a compacting GC this becomes the need to pin all objects while they could be referenced externally, which is usually also accompanied by documentation stating how you are being unspeakably cruel to the GC by doing so. Then, there's also the "there can only be one" problem, where the GC wants to be the sole overseer of memory and you get cycle-related problems if you tie two GCs together.

These reasons why I don't like attempts to make tracing GC an optional language feature: the GC dependency tends to leak out into the rest of the program and gets its tendrils into everything. This is highly undesirable in C++, which is largely a pay-as-you-go language where you don't pay for a language feature if you don't use it. An imprecise, non-compacting conservative garbage collector is sometimes used to avoid imposing onerous requirements on the C++ compiler and runtime environment, but in my experience this is unsatisfactory: you still have to make sure the GC sees everything and you get memory leaks from the imprecise collection.

This brings us back to reference counting. The often touted advantage to reference counting is that it is deterministic, but I like even more that it has minimal requirements: adding a reference count and AddRef/Release operations to an object is all that is required. Simple and effective, can be used in just about any environment, and used only where needed. I also find that the performance cost of reference counting is overstated since data structures are often read more often than they are modified and there is usually no need to take a strong reference on temporary pointers during read access. Cycles are still a problem but can often be avoided by judicious use of untracked weak pointers, the most common case being parent pointers in a tree structure. Reference counting also doesn't require traversal of live but infrequently accessed objects, which is the main culprit in tracing GC performing badly when a virtual memory system is paging.

Despite this, using reference counting correctly and efficiently does require some manual effort, and I'm skeptical about trying to hide it by building it into the compiler as "automatic." It will definitely be easier and less error prone on average, but there will be cases where the compiler fails to optimize out refcount adjustments on critical paths, and the lower visibility will make it easier to fall into refcounting traps. Looking over the ARC and WinRT docs a bit, I had two thoughts:

The WinRT docs don't mention the cycle trap. (The ARC docs do.)
Neither specifies whether it is allowed to temporarily resurrect an object during destruction. This results in a crash with naive Release() implementations that leave the refcount at 0 and attempt a double-destroy when it goes to 1 and back to 0 again.

That's not counting the performance overhead, which I haven't been able to assess not having used either implementation. As such, while I like refcounting, I have to think that compiler-aided refcounting is only suitable as a stop-gap until tracing GC can be further improved enough to be used universally.

19 comments | Nov 06, 2011 at 08:26 | default

Current version

Navigation

Archives

¶On reference counting and garbage collection

Comments