Hackle's blog
between the abstractions we want and the abstractions we get.

Generics: are you keeping it generic?

There is really just one hard rule to using generics, and it's also easy: just keep it generic!

However, this is only easy to say: holding on to this simple discipline is hard, and programmers do not always succeed in resisting the temptation of convenience.

Identity: count the implementations

This is to test your understanding of generics. How many implementations are there for the Id function below?

T Id<T>(T thing)
{
    // how to implement this?
}

// or
id :: a -> a
id thing = ?

If you so much as hesitate for a second, you might want to rethink if you really get it.

The answer is - there should be only ONE meaningful (without "cheating") implementation.

Believe me, I am in no position to gloat here: as a C# / Object-Oriented programmer back then, it took me VERY long to figure out exactly why.

These days we may take generics for granted in almost any language (Go was a long-time holdout, and only gained generics in version 1.18), but for a long time generics was mostly exclusive to the functional programming crowd. Extraordinary stories could be told about its introduction to the likes of Java or C#. At the same time, it's helpful to understand that, owing to this history, generics does not always fit into the slowly evolving (if not sometimes stagnant) OO thinking; in fact, it can be at odds with it.

a is forall

No suspense. The epiphany for me came when I learned from Haskell a more verbose way to write the id function above. (Think of a as T if that suits you better.)

id :: forall a. a -> a

forall a. really says everything there is: any meaningful implementation of id must apply to ALL possible types.

Now take a minute to think about it: this is a tall order! It's like asking a chef to cook a dish that everyone on earth likes. Well, that's actually quite impossible, but you get the idea: if it ever is possible, the dish must be very, very generic!

For the id function, it really cannot be too particular, and it had best not make ANY assumptions about its input: is it an Int? A String? A complex class or struct? No! That's not the way to go about it. It must obey the only rule: keep it generic!

Now you see, id cannot assume knowledge of, or pry into, the type of its parameter. That really leaves it very few options; actually, there is only ONE sensible implementation:

T Id<T>(T thing)
{
    return thing;
}

// Haskell

id :: a -> a
id thing = thing

That's it! That's the most important thing to know about generics.
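For readers more at home in TypeScript, the same idea can be sketched as below. The sample values are only illustrations; the point is that the body never looks at what T is.

```typescript
// The only honest implementation: hand the value back untouched.
function id<T>(thing: T): T {
  return thing;
}

// Works identically for every type, because it knows nothing about T.
const n = id(42);
const s = id("hello");
const xs = id([1, 2, 3]);
```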

No way! you say, this can't be! Generics is way cooler than this dumb stuff. Where has the magic gone? Well, as we shall see, there should be no magic; actually, most of the magic out there consists of tricks to cheat generics, and should really be frowned upon, not celebrated.

Cheating generics

Back then, as a C# programmer, I couldn't accept the "generic" nature of generics (hello!), and really tried to prove otherwise. For example, what if I do this?

T Id<T>(T thing)
{
    throw new Exception("Busted");
}

Later I learned that this is considered "cheating", because this function does not really return a T, as the type says. Throwing an exception satisfies any return type: a cheat code! (Besides, it's like GOTO, and it's mostly dynamic typing (of exceptions) in disguise.)
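TypeScript makes the cheat code visible in the types. In the sketch below, failWith and cheatingId are made-up names for illustration; what matters is that a function which only throws has the return type never, and never is assignable to every type.

```typescript
// A function that never returns normally has the return type `never`.
function failWith(message: string): never {
  throw new Error(message);
}

// `never` is assignable to every type, so this type-checks for all T
// without ever producing a T.
function cheatingId<T>(_thing: T): T {
  return failWith("Busted");
}
```

The signature promises a T for all T, yet no caller will ever receive one.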

When reasoning with generics (or anything for that matter), it's very helpful to leave side-effects out of the equation; otherwise, there is no end to surprises. For another example,

T Id<T>(T thing)
{
    SpendLifeSavingOnLottery();
    return thing;
}

This satisfies forall a. a -> a just right, but it's not very pleasant: the first time it's called, the poor user's life savings are spent; the second time, it possibly throws a scary exception, "account is overdrawn". Side-effects really throw a monkey wrench into the works, and stop us from reasoning about the behaviour of Id in a sensible way.

Peeping, a violation

It was previously mentioned that generics can be at odds with traditional OO concepts. For one: generics is also called Parametric Polymorphism, whereas OO champions another form of polymorphism in sub-typing; you know, inheritance and all.

These two types of polymorphism are not exactly mutually exclusive, but together they can make things quite awkward sometimes. For example, how come a value of type List<Teacher> cannot be assigned to a variable of type List<Person>, but an IEnumerable<Teacher> can be assigned to an IEnumerable<Person>?

var teachers = new List<Teacher>();
// Cannot implicitly convert type 'System.Collections.Generic.List<Teacher>' to 'System.Collections.Generic.List<Person>'
List<Person> people = teachers;

IEnumerable<Teacher> teachers = new List<Teacher>();
IEnumerable<Person> people = teachers;
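Why the difference? A List can be written to, while an IEnumerable can only be read. TypeScript happens to allow the unsafe assignment for mutable arrays, which makes it a handy place to watch what C# is protecting us from. This is a rough sketch; the Person and Teacher shapes and the sample data are made up for illustration.

```typescript
interface Person { name: string }
interface Teacher extends Person { subject: string }

const teachers: Teacher[] = [];

// TypeScript allows this assignment (mutable arrays are treated covariantly)...
const people: Person[] = teachers;

// ...so a plain Person can be pushed into what is really a Teacher[].
people.push({ name: "Sam" });

// The compiler believes this is a string; at runtime it is undefined.
const subject: string = (teachers[0] as Teacher).subject;

// A read-only view has no `push`, so the same assignment is harmless.
const readOnlyPeople: ReadonlyArray<Person> = teachers;
```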

Enough about variance; let's look at how we can cheat generics: reflection! The source of many evils.

static T Id<T>(T thing)
{
    if (thing.GetType() == typeof(int))
    {
        return (T)(object)((int)((object)thing) + 1);
    }

    return thing;
}

Is this magical? Yes. Should we use it at every chance? Definitely not.

This peeps inside T and assumes knowledge about thing, and throws in some magic. How clever! But... do you see the problem?

This magic version of Id will surprise any unsuspecting callers, who wouldn't be expecting the special treatment of int.

You see, language designers are eager to please and keep giving us powerful features to use; unfortunately, more often than not, such features are used at the cost of our fellow engineers' confusion.

For a less contrived example, you may have seen people showing off pattern matching on types, sometimes even on generic parameters. (Why restrict ourselves when the code can be written for ALL types?)
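The same trick is just as easy in TypeScript, where typeof plays the part of reflection. A minimal sketch, with sneakyId as a made-up name:

```typescript
// Same cheat as the C# version: inspect the runtime type and special-case numbers.
function sneakyId<T>(thing: T): T {
  if (typeof thing === "number") {
    return (thing + 1) as unknown as T;
  }
  return thing;
}

const unchanged = sneakyId("a");      // "a"
const surprise = sneakyId<number>(1); // 2, not 1
```

Note the double cast needed to get this past the compiler: usually a sign that the type is being lied to.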

static string PersonInfo<T>(T psn)
{
  switch (psn) 
  {
    case Student st: return $"{st.Name} is a student";
    case Teacher tch: return $"{tch.Name} is a teacher";
    default: throw new NotImplementedException($"Cannot handle {psn.GetType()} yet");
  }
}

This is by far the worst use of generics, because it completely undermines the promise of being GENERIC! A bag of special cases hidden under the beautiful promise of T. Please, don't write anything like this.

(Note PersonInfo would be better typed as string PersonInfo(Person psn) to minimise confusion; however, it would still be a bad design, as it opens up what's meant to be closed.)
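One possible way to keep things closed, sketched in TypeScript, is to say up front exactly which cases exist and drop the type parameter altogether. The kind tag and the two shapes below are assumptions for illustration, not part of the original example.

```typescript
// Hypothetical closed model: a Person is exactly one of these two shapes.
type Person =
  | { kind: "student"; name: string }
  | { kind: "teacher"; name: string };

// No type parameter and no default branch: the compiler checks every case is handled.
function personInfo(psn: Person): string {
  switch (psn.kind) {
    case "student":
      return `${psn.name} is a student`;
    case "teacher":
      return `${psn.name} is a teacher`;
  }
}
```

The signature no longer promises to work for ALL types, so there is nothing left to break.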

Nullable

The evil can catch us off-guard. For example, this magic Map.

public static U Map<T, U>(T input) where U : new()
{
    if (input == null) return new U();

    // maps T to U
}

You may have heard of the "billion-dollar mistake" by Sir Tony Hoare; Map is a noble attempt at nipping it in (not exactly) the bud.

This is completely valid syntax-wise and may even seem quite reasonable and helpful to many; the author is considerate enough to use type constraints to inform the caller that U must be constructible without any parameters. However, without knowing the implementation, a programmer would use it as follows,

var person = Map<Teacher, Person>(teacher);
if (person == null) return;

SendFlowers(person);

This is great, defensive code. However, thanks to the magic in Map, the defensiveness here is rendered useless, and a lot of flowers will be sent to non-existent, nameless Persons.
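A more honest signature, sketched here in TypeScript, would admit that the input may be missing and let a missing input stay missing. The names mapNullable and toPerson are hypothetical, and the mapping is passed in by the caller rather than conjured up inside.

```typescript
// Hypothetical alternative: the caller supplies the mapping, and a missing
// input stays missing instead of being replaced by a made-up default.
function mapNullable<T, U>(input: T | null, f: (value: T) => U): U | null {
  return input === null ? null : f(input);
}

interface Teacher { name: string }
interface Person { name: string }

const toPerson = (t: Teacher): Person => ({ name: t.name });

const nobody = mapNullable<Teacher, Person>(null, toPerson); // null
const somebody = mapNullable<Teacher, Person>({ name: "Ann" }, toPerson);
```

With this version, the null check before SendFlowers does what the caller expects.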

Dynamic typing is no exemption

One of the biggest misunderstandings is that parametric polymorphism only applies to statically-typed languages; this is underestimation of the worst kind. It's true that types put it "in your face", but sticking to "generic for all" when applicable is valuable advice in general, no matter the language.

The most famous counter-example would have to be Promise in JavaScript. Simply put in TypeScript,

This is no mind-bender, it's just one level of nesting. Let's see how it plays out.

> Promise.resolve("a").then(p => console.log(p));
a   // so far so good

> Promise.resolve(Promise.resolve("a")).then(p => console.log(p))
a   // hold on, what's going on?!

Nesting promises is a futile enterprise - they collapse into one single level. The designers may not like the movie "Inception" very much, and are very eager to swat out any attempt at nesting promises. Not just that, what about the code below?

> Promise.resolve({ foo: "bar" }).then(p => console.log(p.foo));
bar // ok

> Promise.resolve({ foo: () => "bar" }).then(p => console.log(p.foo()));
bar // ok

> Promise.resolve({ then: () => "bar" }).then(p => console.log(p.then()));
Promise {<pending>}     // what's going on? not "bar"?

> Promise.resolve({ then: () => "bar", foo: "bar" }).then(p => console.log(p.foo));
Promise {<pending>}     // now even p.foo doesn't work, it should be "bar"!

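Somewhere within the implementation, there is a special case that inspects the value to be resolved: if it has a then field whose value is a function, the value is treated as a Promise (a "thenable"), and its then is called with resolve and reject callbacks. The then functions above never call either of them, so the promise never settles, and innocent code like the above is tripped up.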
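The special case can be approximated in a few lines. This is a rough sketch of the check only, not the engine's actual code, and looksThenable is a made-up name.

```typescript
// Hypothetical approximation of the "thenable" check; not the engine's actual code.
function looksThenable(value: unknown): boolean {
  return (
    value !== null &&
    (typeof value === "object" || typeof value === "function") &&
    typeof (value as { then?: unknown }).then === "function"
  );
}

const plain = looksThenable({ foo: "bar" });           // false: resolved as-is
const trapped = looksThenable({ then: () => "bar" });  // true: treated like a Promise
```

Any value that passes this kind of check is no longer handled "for all T"; it gets its own private code path.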

You may think this is a trivial edge case, but I am not ready to accept it as a bug. Using "dynamic" as an excuse is not good enough; sticking to forall T. is not that hard, especially when it was made very clear with ample discussion. Ever wonder what a design flaw looks like? Look no further.
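For contrast, here is a minimal sketch of a container that does keep it generic. Box is a hypothetical class made up for illustration: it is synchronous and nothing like a full Promise, but it shows what treating every T the same way looks like.

```typescript
// Hypothetical container that never inspects its content.
class Box<T> {
  readonly value: T;

  constructor(value: T) {
    this.value = value;
  }

  // Applies f to the content and wraps the result; no special cases.
  map<U>(f: (value: T) => U): Box<U> {
    return new Box(f(this.value));
  }
}

// Nesting is preserved: this is a Box<Box<string>>, not a Box<string>.
const nested = new Box(new Box("a"));

// A `then` field is just another field.
const odd = new Box({ then: () => "bar", foo: "bar" }).map(p => p.foo);
```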

In closing

The power of generics, or parametric polymorphism, is also its downfall in the eyes of the clever programmer. For foo<T> to work for ANY type is both powerful and restricting, depending on the perspective.

With mainstream languages, thanks to the inevitable transition and mix-up of paradigms, we are given the tools to break out of the rigidity of such strong constraints, and more often than we should, such tools are used for convenience; promises are broken, magic is played, and the confusion begins.