Generics Considered Harmful

Posted by arnold on June 27, 2005 at 10:53 AM PDT

I don't know how to ease into this gently. So I'll just spit it out.

Generics are a mistake.

This is not a problem based on technical disagreements. It's a fundamental language design problem.

Any feature added to any system has to pass a basic test: if it adds complexity, is the benefit worth the cost? The more obscure or minor the benefit, the less complexity it's worth. This is sometimes called a “complexity budget”. A design should have a complexity budget to keep its overall complexity under control.

Generics are way out of whack. I have just finished (well, nearly) my work on the fourth edition of The Java Programming Language. I am glad to say that David Holmes, not I, was the one who covered generics. Just reviewing it and reading the specifications was enough to put my brain through a Cuisinart stuck on “pulse”.

We went through that chapter multiple times, consulting with several people who wrote the specs and are otherwise experts. We were only able to cover the highest level in the book, and it's still pretty hard to understand (although David exceeded himself in making it as comprehensible as possible).

Learning to use generified types can get very complicated. It's hard to understand why you cannot do some things without casts, for example. But writing generified classes is rocket science. Here's one that showed up at the very last minute: it's a bad idea to declare a method that returns an array of a type parameter. That is, you shouldn't do this:

    interface Holder<T> {
        T[] toArray();
    }

Why, you ask? Well, the problem is that T might itself be a generic type. That is, someone might declare a Holder<Set<String>>. And, ... uh, hold on, I'm trying to remember the issue here...

Actually, I'm only mildly embarrassed to say that I've forgotten. But I remember that it took a few back-and-forths between David and the advising expert so that David — remember, David is the guy who has been writing a chapter on generics after several months of experimentation and research and over a year of thinking about how to approach it — could understand the problem. So our book recommends against it because it isn't good.

Although there is an exception. It's okay to do this if the method takes as a parameter an array of T or a Class object:

        T[] toArray(Class<T> type);

That's OK to do.
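
For readers who have also lost track of the details: one well-known difficulty (possibly not the exact issue the advising expert raised) is that erasure leaves a generic class unable to create a real T[] on its own. The sketch below illustrates that under stated assumptions; `HolderSketch`, `UncheckedHolder`, and `TokenHolder` are hypothetical names made up for this example, not code from the book.

```java
import java.lang.reflect.Array;
import java.util.ArrayList;
import java.util.List;

public class HolderSketch {

    // Hypothetical holder with no type token: the best it can do is an unchecked cast.
    static class UncheckedHolder<T> {
        private final List<T> items = new ArrayList<T>();

        void add(T item) { items.add(item); }

        @SuppressWarnings("unchecked")
        T[] toArray() {
            // "new T[items.size()]" does not compile, so the array is really an Object[].
            return (T[]) items.toArray();
        }
    }

    // Hypothetical holder whose caller supplies the element type as a Class token.
    static class TokenHolder<T> {
        private final List<T> items = new ArrayList<T>();

        void add(T item) { items.add(item); }

        @SuppressWarnings("unchecked")
        T[] toArray(Class<T> type) {
            // Reflection creates an array whose runtime component type is "type".
            T[] result = (T[]) Array.newInstance(type, items.size());
            return items.toArray(result);
        }
    }

    // Returns true if assigning the unchecked array to a String[] fails at run time.
    static boolean uncheckedFails() {
        UncheckedHolder<String> holder = new UncheckedHolder<String>();
        holder.add("a");
        try {
            String[] strings = holder.toArray(); // the compiler inserts a cast to String[] here
            return strings == null;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // Returns a genuine String[] built with the help of the Class token.
    static String[] tokenWorks() {
        TokenHolder<String> holder = new TokenHolder<String>();
        holder.add("a");
        return holder.toArray(String.class);
    }

    public static void main(String[] args) {
        System.out.println("unchecked version fails: " + uncheckedFails());
        System.out.println("token version length: " + tokenWorks().length);
    }
}
```

Even the token version has a limit: there is no class literal for a parameterized type such as Set<String>, which may be part of why the case where T is itself generic is so awkward.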

Which brings up the problem that I always cite for C++: I call it the “Nth order exception to the exception rule.” It sounds like this: “You can do x, except in case y, unless y does z, in which case you can if ...”

Humans can't track this stuff. They are always losing track of which exception to what other exception applies (or doesn't) in any given case.

Or, to show the same point in brief, consider this: Enum is actually a generic class defined as Enum<T extends Enum<T>>. You figure it out. We gave up trying to explain it. Our actual footnote on the subject says:

Enum is actually a generic class defined as Enum<T extends Enum<T>>. This circular definition is probably the most confounding generic type definition you are likely to encounter. We're assured by the type theorists that this is quite valid and significant, and that we should simply not think about it too much, for which we are grateful.

And we are grateful. But if we (meaning David) can't explain it so programmers can understand it, something is seriously wrong.
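
For what it's worth, here is one way to see what a self-referential bound of that shape buys. This is only a sketch: `Ranked`, `Size`, and `Priority` are hypothetical classes invented to mimic the pattern, not anything from the JDK or the book.

```java
public class SelfBoundSketch {

    // Hypothetical base class that copies the shape of Enum<T extends Enum<T>>.
    static abstract class Ranked<T extends Ranked<T>> implements Comparable<T> {
        private final int rank;

        Ranked(int rank) { this.rank = rank; }

        int rank() { return rank; }

        // T stands for "the subclass itself", so only same-type comparisons compile.
        public final int compareTo(T other) {
            return Integer.compare(rank, other.rank());
        }
    }

    static final class Size extends Ranked<Size> {
        Size(int rank) { super(rank); }
    }

    static final class Priority extends Ranked<Priority> {
        Priority(int rank) { super(rank); }
    }

    public static void main(String[] args) {
        Size small = new Size(1);
        Size large = new Size(2);
        System.out.println(small.compareTo(large) < 0); // prints true
        // small.compareTo(new Priority(1)); // rejected by the compiler: a Priority is not a Size
    }
}
```

The circular bound lets the base class declare compareTo once while each subclass gets a version that accepts only its own type. Whether that is worth the head-scratching is exactly the question.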

So now we know it's complex. But if it really saved your programming butt a lot it could be worth it. So what does it save you? It saves you making mistakes, like putting a String in a list that should only contain Longs, or attempting to pull out a String from such a list.
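
To make that concrete, here is a minimal sketch of the mistake in question, written both ways. The class and method names (`RawListSketch`, `rawListFailsLate`, `genericSum`) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class RawListSketch {

    // Pre-generics style: the bad element is accepted and only fails when read back.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean rawListFailsLate() {
        List longs = new ArrayList();
        longs.add(Long.valueOf(1L));
        longs.add("not a long"); // compiles without complaint
        try {
            Long second = (Long) longs.get(1); // fails here, far from the real mistake
            return second == null;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // Generic style: the same mistake is rejected before the program ever runs.
    static long genericSum() {
        List<Long> longs = new ArrayList<Long>();
        longs.add(1L);
        longs.add(2L);
        // longs.add("not a long"); // does not compile
        long sum = 0;
        for (Long value : longs) {
            sum += value; // no cast needed
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println("raw list failed at read time: " + rawListFailsLate());
        System.out.println("generic sum: " + genericSum());
    }
}
```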

But we have a demonstration proof that we can live without it, namely that we have for nearly a decade. Of course there are such bugs in code, and if you generify a bunch of code you might even find one or two that were waiting to bite you (unless that code is actually orphaned). But I have yet to find someone who believes this to be a major source of error in their code, compared to other problems.

So we have a feature whose complexities are high, whose learning curve is steep, and whose benefit is limited. Add to that the fact that the feature is ubiquitous -- with Java 5 it is nearly impossible to write code that doesn't interact with generics.

The complexity of Java has been turbocharged to what seems to me relatively small benefit. I don't see that the value is there to justify the cost. Not that we can change things, but I think we should at least view it as a demonstration proof of the value of an explicit complexity budget against which features must be justified. Without such a budget, it feels like the JSR process ran far ahead, without a step back to ask “Is this feature really necessary?” It seemed to just be understood that it was necessary.

It was understood wrong.

Comments

Syntax renders incorrectly

There are several spots in this article which crucially refer to specific Java syntax, where the syntax has rendered incorrectly due to the conflict between Java angle-bracket syntax and HTML:

"Well, the problem is that T might itself be a generic type. That is, someone might declare a Holder<Set<String>>. And, ... uh, hold on"

"Or, to show the same point in brief, consider this: Enum is actually a generic class defined as Enum<T extends Enum<T>>. You figure it out."

"Enum is actually a generic class defined as Enum<T extends Enum<T>>. This circular definition is probably the most confounding generic type definition..."

This is an old article, but discusses some crucial topics. Alas, some of the key Java syntax does not render properly, due to conflict between template angle bracket syntax and html, and probably blog processing. In a subsequent comment I'm going to point out the problem instances, but first a test to see how to make angle brackets appear in comments.
Test in raw text, no html entities:
Test in raw text, using html entity: <enclosed>
Now in a BBCode code block:

<br />
In the code block<br />
Here's brackets: <enclosed><br />

Now with html code tags
Here's brackets: <enclosed> 

Hope one or other worked!
(After editing) Conclusions:
1. In comments, in regular text, the less-than must be represented by html entity, otherwise it causes the entire "tag" to disappear.
2. The BBCode square-brackets code block is broken -- it renders gratuitous visible break tags.
3. html code tags work.
4. The comment preview function is broken (renders only part of the comment).
OK... on to the business at hand.

Indeed...

As I sit and stare in stunned horror at this monster

    public interface Identifiable<T extends Identifier<? extends What>, What> {
        public T getObjectID();
        public Class<? super What> type();
    }
    public interface Identifier<T> {
        public long getID();
        public Class<? super T> type();
    }
    interface X <SubjectType extends Identifiable, RelationshipType extends
        Enum<RelationshipType> & Related, ObjectType extends Identifiable>{}

    static class A<
        SI extends Identifier<? extends SubjectType>,
        OI extends Identifier<? extends ObjectType>,
        SubjectType extends Identifiable<SI, SubjectType>,
        RelationshipType extends Enum<RelationshipType> & Related,
        ObjectType extends Identifiable<? extends OI, ? extends ObjectType>>
        implements
        X<SubjectType, RelationshipType, ObjectType> {
    }

(and I am, alas, the author of it), I am inclined to agree.

Generics have the unfortunate property of being rather invasive - once you start, they can lead you down the primrose path to something like this (try having ObjectType have its own parameter types and keep this mess compilable).