Tuesday, September 21, 2010

Are multicore-capable garbage collectors hard to write ?

In this era of multicore computing, garbage collectors need to allow user threads (aka mutators) to run in parallel on separate cores simultaneously in order to facilitate efficient shared memory parallel programming.
There are two relevant phrases from garbage collection terminology here:
Parallel GC means the garbage collector itself has been parallelised in order to speed up garbage collections. For example, a stop-the-world collector might traverse the heap in parallel when marking in an attempt to reduce pause times.
Concurrent GC means the garbage collector runs at the same time as the user threads (aka mutators). For example, Dijkstra's algorithm and derivatives like Doligez-Leroy use fine-grained synchronization to keep a concurrently-running collector apprised of the constantly-changing heap topology.
However, we are talking about neither parallel GC nor concurrent GC but, rather, the simpler challenge of just allowing mutators to run in parallel. In the absence of any established terminology, we call any GC that allows mutator threads to run in parallel with each other a multicore-friendly garbage collector.
Frustratingly, many people are perpetuating the myth that it is difficult to write parallel or concurrent or even just multicore-friendly garbage collectors. In particular, this is happening around the OCaml language as a result of Xavier Leroy's (in)famous "standard lecture on threads" from 2002 where he explained that they were not creating a multicore-friendly garbage collector for OCaml because multicores would never become popular and he described Intel's hyperthreading as "the last convulsive movement of SMP's corpse".
Read more: The Flying Frog Blog