Wednesday, March 09, 2011

References and Pointers, Part One

Writing code in C# is really all about the programmatic manipulation of values. A value is either of a value type, like an integer or a decimal, or it's a reference to an instance of a reference type, like a string or an exception. Values you manipulate always have a storage location that stores the value; those storage locations are called "variables". Often in a C# program you manipulate the values by describing which variable you're interested in.

In C# there are three basic operations you can do to variables:

* Read a value from a variable
* Write a value to a variable
* Make an alias to a variable

The first two are straightforward. The last one is accomplished by the "ref" and/or "out" keywords:

void M(ref int x)
{
    x = 123;
}
...
int y = 456;
M(ref y);

The "ref y" means "make x an alias to the variable y". (I wish that the original designers of C# had chosen "alias" or some other word that is less confusing than "ref", since many C# programmers confuse the "ref" in "ref int x" with reference types. But we're stuck with it now.) While inside M, the variable x is just another name for the variable y; they are two names for the same storage location.

There's a fourth operation you can do to a variable in C# that is not used very often because it requires unsafe code. You can take the address of a fixed variable and put that address in a pointer.

unsafe void M(int* x)
{
    *x = 123;
}
...
int y = 456;
M(&y);

The purpose of a pointer is to manipulate a variable itself as data, rather than manipulating the value of that variable as data. If x is a pointer then *x is the associated variable.

Clearly pointers are very similar to references, and in fact references are implemented behind the scenes with a special kind of pointer. However, you can do things with pointers that you cannot do with references. For example, this doesn't do anything useful:

int Difference(ref double x, ref double y)
{
    return y - x; 
}
...
double[] array = whatever;
difference = Difference(ref array[5], ref array[15]);

That's illegal; it just takes the difference of the two doubles and tries to convert it to an int. But with pointers you can actually figure out how far apart in memory the two variables are:

unsafe int Difference(double* x, double* y)
{
    return y - x;
}
...
double[] array = whatever;
fixed(double* p1 = &array[5])
  fixed(double* p2 = &array[15])
    difference = Difference(p1, p2); // 10 doubles apart