Sunday, October 10, 2010

Order in Chaos: .NET Collections

This is a review of the current available collection types in .NET Framework 4.0
Note that I don’t plan to explain all the history of the different collection. If a class is obsolete, that is enough information.
The intention is to use this post as a reference when you need to decide which type of collection you should use.
Also, I’m not going to give the complexity of each function. You can find such information in the relevant class documentation on MSDN.
Anyway, you can deduce it yourself, since this it quite common knowledge, for example, an array-based list has O(1) for adding to the end and O(n) for adding in the middle. Just think about it.
One final note before we begin, I'm skipping the special collections, mainly the synchronized and concurrent ones.
The review will have have the following convention: Class Name, Namespace, Description and Bottom Line.
For example:
Class name: Array
Namespace: System
Description: The base class of all the arrays in the .NET framework.
Bottom line: Very basic type, but sometimes this is all you need.
The Old Collections
Common Old Collections
Class name: ArrayList
Namespace: System.Collection
Description: List of objects, internally uses an array which resizes dynamically.
Bottom line: Use List<T> instead.
Class name: Stack
Namespace: System.Collection
Description: Represents a last-in-first-out (LIFO) collection of objects. Implemented as a circular array, sized dynamically.
Bottom line: Use Stack<T> instead.
Class name: Queue
Namespace: System.Collection
Description: Represents a first-in-first-out (FIFO) collection of objects.
Implemented as a circular array, sized dynamically.
Bottom line: Use Queue<T> instead.
Class name: Hashtable
Namespace: System.Collection
Description: Represents a mapping between a key and a value using a hash table.
Bottom line: Use Dictionary<T> instead.
Less Common Old Collections
Class name: CollectionBase
Namespace: System.Collection
Description: A base class intended to be used in implementations of strongly typed collections.
Bottom line: Use Collection<T> instead.
Class name: ReadOnlyCollectionBase
Namespace: System.Collection
Description: A base class intended to be used in implementation of strongly typed read-only collections.
Bottom line: Use ReadOnlyCollection<T> instead.
Class name: SortedList
Namespace: System.Collection
Description: A list of key-value pairs, sorted by keys, can be accessed using key or index.
Bottom line: Use SortedList<TKey,TValue> instead.
Class name: ListDictionary
Namespace: System.Collections.Specialized
Description: Like Hashtable, but internally uses a singly linked list. Recommended only when the number of items in the collection is 10 or less.
Bottom line: IMHO, this optimization doesn’t worth my time. Use Dictionary<T> instead.
Class name: HybridDictionary
Namespace: System.Collections.Specialized
Description: Uses ListDictionary when the collection is small and switches to Hashtable when the collections gets large.
Bottom line: If this is so great it should have been Hashtable default implementation. Use Dictionary<T> instead.
Class name: OrderedDictionary
Namespace: System.Collections.Specialized
Description: A list of key-value pairs, can be accessed using key or index.
Bottom line: Use SortedList<TKey, TValue> instead.
Class name: StringCollection
Namespace: System.Collections.Specialized
Description: A collection of strings.
Bottom line: Use List<string> instead.
Class name: StringDictionary
Namespace: System.Collections.Specialized
Description: A hash table with both keys and values as strings.
Bottom line: Use Dictionary<string,string> instead.
Still Useful
Class name: BitArray
Namespace: System.Collection
Description: An efficient array of bit values, uses only 1 bit for each flag.
Bottom line: Only if you have tons of flags.
Class name: BitVector32
Namespace: System.Collections.Specialized
Description: Like BitArray but more performant and limited to 32 bits
Bottom line: Only when a plain old Enum doesn’t answer your needs.
The New Collections
Common New Collections
Class name: List<T>
Namespace: System.Collections.Generic
Description: Generic list of T, internally uses an array which resizes dynamically.
Bottom line: Your first choice when in need of a simple dynamic collection.
Class name: LinkedList<T>
Namespace: System.Collections.Generic
Description: Generic list of T, internally uses doubly linked nodes to store the data.
Bottom line: Same as List<T> only with different performance in some of the operations.
Class name: Stack<T>
Namespace: System.Collections.Generic
Description: Generic last-in-first-out (LIFO) collection.
Bottom line: Your first choice when in need of a stack.
Read more: Arik Poznanski's Blog