Tuesday, May 24, 2011

Read-only and threadsafe are different

Here's a common problem that we face in the compiler realm all the time: you want to make an efficient immutable lookup table for mapping names to "symbols". This is in a sense the primary problem that the compiler has to solve; someone says "x = y + z;" and we have to figure out what "x", "y" and "z" mean before we can do any more analysis. An obvious way to do that is to figure out all the name-to-symbol mappings for a particular declaration space once, ahead of time, stuff the results into a lookup table, and then use that lookup table during the analysis.

The lookup table can be immutable because once it is built, it's built; no more symbols are going to be added to it. It's going to be used solely as a lookup mechanism. A common and cheap way of doing that is to use what I whimsically call "popsicle" immutability: the data structure is a fully mutable structure until you freeze it, at which point it becomes an error to attempt to change it. This technique differs markedly from the sort of "persistently re-usable data structure" immutability we've talked about in the past, where you want to reuse existing bits of the data structure as you build new, different data structures out of it.

For example, we might write up a really simple little hash table that supports "freezing". Something like this:

abstract class Symbol 
    public string Name { get; protected set; }
}

sealed class SymbolTable
{
    private bool frozen = false;
    private class BucketListNode
    {
        public Symbol Symbol { get; set; }
        public BucketListNode Next { get; set; }
        public BucketListNode Prev { get; set; }
    }