Wednesday, July 25, 2007

Collections Best Practices

This is referenced form an article in August issue of CLR Inside Out. Link

The basic purpose of collections is to tie certain objects together as a group because they have some logical connection to each other.

General criteria to select a particular collection object goes like this

  1. Are you going to be adding all the elements at once?
  2. Are you going to be dynamically adding and removing elements?
  3. Do you only need to iterate over the entire collection, or do you also need to locate and use specific elements?
  4. Are you exposing your collection through a public API?
  5. What are the performance characteristics and requirements of your application?


If you’re using the .NET Framework 2.0 or later, using the old non-generic collections is strongly discouraged. Following table shows the mapping between

.NET 1.0 Collection classes and .NET 2.0 Generic collection classes

Non-GenericGeneric Replacement
ArrayListList<T>
BitArrayNo replacement exists
CollectionBaseCollection<T>
ComparerComparer<T>
CompatibleComparerComparer<T>
DictionaryBaseKeyedCollection<TKey,TItem>
HashtableDictionary<TKey,TValue>
ICollectionICollection<T>
IDictionaryIDictionary<TKey,TValue>
IListIList<T>
QueueQueue<T>
ReadOnlyCollectionBaseReadOnlyCollection<T>
SortedListSortedList<TKey, TValue>
StackStack<T>

Pros and Cons of each type:

Dictionary
When you want additions(fast), removals(fast), and lookups(average) to be very quick, and when you are not concerned about the order of the items in the collections, your first inclination should be to use a System.Collections.Generic.Dictionary<TKey, TValue>.

List
If your usage pattern requires few deletions and mostly additions, and if it is important for you to keep the collection in order, you may want to choose a List<T>. If you do not care about sorting or inserting items in the middle, nor do you care about looking up individual items in the list but want to iterate over complete list.

Queue
to implement a first-in-first-out (FIFO) order

Stack
to implement a last-in-first-out (LIFO) order

LinkedList
when you need to maintain order yet still achieve fast inserts. But increases increased activity by the garbage collector. While the actual act of inserting an item into a LinkedList<T> is much faster than doing so into a List<T>, finding the particular location at which you want to insert a new value still requires traversing the list to find the correct location.

SortedDictionary
uses a balanced tree implementation as the underlying data store; this provides relatively fast lookups and maintains items in a sorted order, but insertions will most likely be slower.

SortedList
which uses two separate arrays to maintain the keys and the values separately and to maintain them both in order

Extending Collections
The System.Collections.ObjectModel.Collection<T> class implements all of the standard collection interfaces, and your custom collection type can simply derive from Collection<Double> in order to provide the additional overload methods. You could also use extension methods in C# 3.0 to achieve the same.

Gist on Collection Interfaces

IEnumerable
methods:
GetEnumerator - returns IEnumerator

IEnumerator
methods:
Reset
Current
MoveNext

ICollection - extends IEnumerable
methods:
Add
Remove
Clear
Contains
CopyTo - copies items to an array
properties:
Count - Readonly count
IsReadOnly - items cannot be removed or added
(mutable items can be modified even if the collection is read-only).

IList - extends ICollection
methods:
IndexOf
Insert
RemoveAt
indexer: allows to get and set any item in the collection, just as you can when using arrays


IDictionary - extends ICollection
methods:
Add - overload that associates a key with the item that is added
ContainsKey - method will check for the existence of a particular key
Properties:
Keys - returns an ICollection of all the keys
Values - returns an ICollection of all the values
remove - overload based on the key instead of the value
indexer: using the key.