The Evolution of Generics in C# 4.0

Before I begin, let me point out that I’m primarily writing this post to solidify my own understanding of the new generic structures by putting them into my own words. There are already some great posts that explain this in much greater depth and detail: Generic Variance in C# 4.0 by Discord & Rhyme, and What’s the difference between covariance and assignment compatibility? by Eric Lippert.

So that being said, let’s start off with the problem that generics solve in the first place. (If you already have a good understanding of that, you can jump to the new features in C# 4.0.)

C# Prior to 2.0

Let’s say I have the following class hierarchy:

    public abstract class Fruit { }

    public class Apple : Fruit { }

    public class Orange : Fruit { }

    public class FruitBasket
    {
        public Fruit[] Fruit { get; set; }
    }

    public class Tree
    {
        public Fruit[] PickFruit()
        {
            Fruit[] daFruit = new Fruit[2];
            daFruit[0] = new Apple();
            daFruit[1] = new Orange();
            return daFruit;
        }
    }

We have a hierarchy of objects: a base Fruit class, an Apple class, which is a Fruit, and an Orange class, which is also a Fruit. In addition, we have a FruitBasket which holds an array of Fruit, and a Tree which I can “PickFruit()” from.

Now, let’s say I want to have an instance of a fruit basket, but I also want another one that deals only with Oranges (I don’t want it to even be possible to put apples in the basket) so that I can make orange juice; apples just ruin a good glass of orange juice. I have two possibilities: I can create another fruit basket and ASSUME that I only put oranges in, or I can create another class that only allows oranges to be inserted. Because I decide that I want the compiler to absolutely not allow apples in with the oranges, I now have to create a new class:

    public class OrangeBasket
    {
        public Orange[] Oranges { get; set; }
    }

Or if I now want to have a basket that holds potatoes, I have to build another class:

    public class PotatoBasket
    {
        public Potato[] Potatoes { get; set; }
    }

Right. Pattern. We are repeatedly creating virtually identical classes that simply contain other objects or apply some sort of processing to those specific elements, all because we want the consumer (the person using the class) to be able to put a specifically typed object in and get a specifically typed object out, without having to cast it.

C# 2.0 – 3.0

With the release of C# 2.0, Microsoft introduced the concept of “generics” to the language. It allows programmers to write generalized algorithms and containers that take specific objects, store or process them, and return them, without knowing in advance the specific type being worked on; the caller supplies that type. .NET 2.0 also included a number of generic collections and interfaces that use this feature, the most useful in my mind being the generic IEnumerable<T> interface. This now allows us to rewrite our basket class as follows:

    public class Basket<T>
    {
        public IEnumerable<T> Contents { get; set; }
    }

Now, instead of having a PotatoBasket, a FruitBasket, and an OrangeBasket, we can use the same class for all three:

    Basket<Potato>
    Basket<Fruit>
    Basket<Orange>

And our Tree class now becomes:

    public class Tree
    {
        public IEnumerable<Fruit> PickFruit()
        {
            yield return new Apple();
            yield return new Orange();
        }
    }

So now, we have a Basket of Potatoes, a Basket of Fruit, and a Basket of Oranges. Yet we didn’t have to duplicate the classes, and each of them still preserves type safety.
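To convince myself that the single generic class really does keep the oranges-only guarantee, here is a minimal sketch. BasketDemo and CountOranges are names I made up for illustration; the rest mirrors the classes above.

```csharp
using System.Collections.Generic;

public abstract class Fruit { }
public class Apple : Fruit { }
public class Orange : Fruit { }

// The generic basket from above: one class, any element type.
public class Basket<T>
{
    public IEnumerable<T> Contents { get; set; }
}

// Hypothetical helper class, only here to exercise Basket<Orange>.
public static class BasketDemo
{
    public static int CountOranges()
    {
        var orangeBasket = new Basket<Orange>();
        orangeBasket.Contents = new List<Orange> { new Orange(), new Orange() };

        // orangeBasket.Contents = new List<Apple>();  // compile-time error: no apples allowed

        // No cast needed: every item is already typed as Orange.
        int count = 0;
        foreach (Orange orange in orangeBasket.Contents)
        {
            if (orange != null) count++;
        }
        return count;
    }
}
```

Calling BasketDemo.CountOranges() returns 2, and uncommenting the List<Apple> line stops the build, which is exactly the guarantee the hand-written OrangeBasket gave us.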

This is where it gets… Interesting.

Let’s say I have an instance of a Tree class, and an instance of my Basket<Fruit> class. Now, it’s simple to do an assignment like this:

    var fruitBasket = new Basket<Fruit>();
    var tree = new Tree();
    fruitBasket.Contents = tree.PickFruit();

BUT if I have a tree that has a PickApples() method like this:

    public IEnumerable<Apple> PickApples()
    {
        yield return new Apple();
        yield return new Apple();
        yield return new Apple();
    }

And I attempt to do the same assignment as before but with the PickApples() method instead, the C# 3.0 compiler (.NET 3.5) will not compile it, because the generic type arguments are not the same:

    var fruitBasket = new Basket<Fruit>();
    var tree = new Tree();
    fruitBasket.Contents = tree.PickApples();

That sucks. Why?

The reason has to do with the relationships between types: covariance, contra-variance, and invariance (Eric Lippert’s post on the difference between covariance and assignment compatibility is a much better source for its relation to mathematics and type hierarchies). In versions prior to 4.0, the C# compiler does NOT allow a generic of one type to be assigned to the same generic of another type. Thus, an IEnumerable<Apple> cannot be assigned to a variable of type IEnumerable<Fruit>. But let’s say we were allowed to do this; why would it be a problem? Let’s say I have an IList<Fruit> and an IList<Apple>; it seems to make perfect sense that I could do the following assignment:

    IList<Fruit> fruitList = new List<Fruit>();
    IList<Apple> appleList = new List<Apple>();
    fruitList = appleList;

Right?

Wrong.

This assignment would actually be plausible for IEnumerable<T>, because IEnumerable<T> is read-only: there’s no way to “insert” a new element through an IEnumerable<T>. IList<T>, however, defines an Add(T item) method, so an IList<T> can be altered. In the previous example, if I were to take fruitList and look at the Add method in IntelliSense, it would show that I can pass any object that is a Fruit to it. However, in this scenario fruitList isn’t really a list of fruit anymore; it’s a reference to a List<Apple>. So in this flawed example, I could now call fruitList.Add(new Orange()); which would have to fail at runtime, because I can’t insert an Orange into a List<Apple>. Bad.

(C# does have an edge case where this sort of error can occur: arrays. For instance, if you have object[] objarray = new string[10]; you can then assign a fruit to one of the slots, which compiles but throws an exception at runtime. For more detail on this, see Eric Lippert’s post on Covariance and Contravariance in C# arrays.)

So we have two things that should be solved. We know that there are certain situations where IEnumerable<X> should be assignable to IEnumerable<T>, specifically when X is a subclass of T, but we also realize that something like IList<X> should not be assignable to a variable of IList<T>.
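The array edge case is easy to try out, and it shows exactly the kind of failure that unrestricted generic variance would cause. This is only a small sketch; ArrayVarianceDemo and StoreFailsAtRuntime are names I invented for the example.

```csharp
using System;

public abstract class Fruit { }
public class Apple : Fruit { }

// Hypothetical demo class showing the array covariance hole.
public static class ArrayVarianceDemo
{
    // Returns true if storing an Apple into a string[] (viewed as object[]) fails at runtime.
    public static bool StoreFailsAtRuntime()
    {
        // Legal: arrays of reference types are covariant in C#.
        object[] objArray = new string[10];
        try
        {
            // Compiles, because an Apple is an object, but the array really holds strings.
            objArray[0] = new Apple();
            return false;
        }
        catch (ArrayTypeMismatchException)
        {
            return true;
        }
    }
}
```

The compiler is perfectly happy with this code; the mistake only surfaces as an ArrayTypeMismatchException when the store executes. That is the trade-off generics should avoid: the error belongs at compile time.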

C# 4.0

Enter variance modifiers for generic types that have now been introduced in C# 4.0.

out and in.

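First, these can only be applied to generic type parameters of interfaces and delegates. in can only be applied to a type parameter that is used purely as an input (for example, as a method parameter type), which makes it contra-variant; out can only be applied to a type parameter that is used purely as an output (for example, as a return type), which makes it co-variant. A type parameter with neither modifier stays invariant.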

Gulp.

First, invariance, covariance, and contra-variance. All three describe what happens to the relationship between two types, T1 and T2, once they are wrapped in a generic type. Invariant means the wrapped types are only compatible if T1 and T2 are the same type. For instance, an IList<Fruit> and an IList<Fruit> are compatible, but an IList<Fruit> and an IList<Apple> are not, even though a Fruit variable can hold an Apple object.

Co-variant is where the direction of the inheritance chain is kept: if T2 is lower in the inheritance chain than T1, then the generic of T2 can be assigned to the generic of T1. For instance, IEnumerable<T> is co-variant, so because an Apple is a Fruit, an IEnumerable<Apple> can be used as an IEnumerable<Fruit>.

Contra-variant is where the direction of the inheritance chain is reversed: if T2 is lower in the inheritance chain than T1, then the generic of T1 can be assigned to the generic of T2. For instance, IComparer<T> is contra-variant, so even though an Apple is a Fruit, it is the IComparer<Fruit> that can be used as an IComparer<Apple>; the relationship is reversed, flipped.
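The two directions are easiest for me to see with the built-in delegates, since in .NET 4.0 Func<TResult> is declared with out and Action<T> with in. Here is a rough sketch; VarianceDirectionDemo and Describe are hypothetical names, and the lambdas are just placeholders.

```csharp
using System;

public abstract class Fruit { }
public class Apple : Fruit { }

// Hypothetical demo class contrasting the two variance directions.
public static class VarianceDirectionDemo
{
    public static string Describe()
    {
        // Covariance (out): something that produces Apples can stand in
        // for something that produces Fruit. Direction kept: Apple -> Fruit.
        Func<Apple> growApple = () => new Apple();
        Func<Fruit> growFruit = growApple;

        // Contra-variance (in): something that accepts any Fruit can stand in
        // for something that accepts Apples. Direction reversed: Fruit -> Apple.
        string eaten = null;
        Action<Fruit> eatFruit = f => eaten = f.GetType().Name;
        Action<Apple> eatApple = eatFruit;

        eatApple(new Apple());
        return growFruit().GetType().Name + "/" + eaten;
    }
}
```

Describe() returns "Apple/Apple": both delegates end up working with a real Apple, and neither assignment needed a cast.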

So, variance allows me to do the following in C# 4.0:

    var fruitBasket = new Basket<Fruit>();
    var tree = new Tree();
    fruitBasket.Contents = tree.PickApples();

Because IEnumerable<T> is now defined as follows:

    public interface IEnumerable<out T> : IEnumerable { /* .. */ }

Remember that out is co-variant: the interface only ever returns T, so it is safe to hand back the item or any subclass of that item. In this case, PickApples() returns an IEnumerable whose element type is lower in the inheritance chain than that of the variable it’s being assigned to (by lower, I mean that it’s a subclass, or sub-sub..n class, of the other type).
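Put together as something that compiles on its own, the covariant assignment looks roughly like this. CovarianceDemo and FillFruitBasketWithApples are names I added for the sketch; Basket<T> and Tree are the classes from earlier.

```csharp
using System.Collections.Generic;

public abstract class Fruit { }
public class Apple : Fruit { }

public class Basket<T>
{
    public IEnumerable<T> Contents { get; set; }
}

public class Tree
{
    public IEnumerable<Apple> PickApples()
    {
        yield return new Apple();
        yield return new Apple();
        yield return new Apple();
    }
}

// Hypothetical demo class for the covariant assignment.
public static class CovarianceDemo
{
    public static int FillFruitBasketWithApples()
    {
        var fruitBasket = new Basket<Fruit>();
        var tree = new Tree();

        // IEnumerable<Apple> -> IEnumerable<Fruit>: allowed because of "out T".
        fruitBasket.Contents = tree.PickApples();

        // IList<Fruit> list = new List<Apple>();  // still a compile-time error: IList<T> is invariant

        int count = 0;
        foreach (Fruit fruit in fruitBasket.Contents)
        {
            if (fruit is Apple) count++;
        }
        return count;
    }
}
```

FillFruitBasketWithApples() returns 3. The commented-out IList<T> line is the case we wanted to keep illegal, and it still is.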

To demonstrate contra-variance with an in parameter, let’s say we have an interface and classes like so:

    public interface IPieMaker<in T> where T : Fruit
    {
        Pie MakePie(IEnumerable<T> fruits);
    }

    public class ApplePieMaker : IPieMaker<Apple> { /* ... Some implementation ...*/ }
    public class FruitPieMaker : IPieMaker<Fruit> { /* ... Some implementation ...*/ }

Now, because the type parameter T is contra-variant, I can have a variable declaration like so:

    IPieMaker<Apple> applePieMaker = new FruitPieMaker();

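Wait. Seem odd? Because T does not specify what Pie we are making, only what is put IN to make the pie, anything that can make a pie from any Fruit can certainly make one from Apples. So a FruitPieMaker can stand in wherever an IPieMaker<Apple> is needed. It’s contra-variant.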
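Here is a filled-in version of the pie maker that I believe compiles as-is. The Pie class, its Slices property, and the one-slice-per-fruit implementation are all assumptions of mine, there purely so there is something to run.

```csharp
using System.Collections.Generic;
using System.Linq;

public abstract class Fruit { }
public class Apple : Fruit { }

// Hypothetical Pie type; the original leaves it undefined.
public class Pie
{
    public int Slices { get; set; }
}

public interface IPieMaker<in T> where T : Fruit
{
    Pie MakePie(IEnumerable<T> fruits);
}

public class FruitPieMaker : IPieMaker<Fruit>
{
    // Placeholder implementation: one slice per piece of fruit.
    public Pie MakePie(IEnumerable<Fruit> fruits)
    {
        return new Pie { Slices = fruits.Count() };
    }
}

// Hypothetical demo class for the contra-variant assignment.
public static class ContravarianceDemo
{
    public static int MakeApplePie()
    {
        // IPieMaker<Fruit> -> IPieMaker<Apple>: allowed because of "in T".
        IPieMaker<Apple> applePieMaker = new FruitPieMaker();

        var apples = new List<Apple> { new Apple(), new Apple() };
        return applePieMaker.MakePie(apples).Slices;
    }
}
```

MakeApplePie() returns 2. Note that both kinds of variance are at work here: the pie maker is converted contra-variantly, and the List<Apple> reaches FruitPieMaker.MakePie as an IEnumerable<Fruit> thanks to covariance.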

Try and wrap your head around that 🙂

Note: There’s a good possibility that I didn’t accurately describe the terms covariance and contra-variance in relation to mathematical projection and ordering, corrections and better descriptions are greatly appreciated.
