David Walend

David Walend's Blog

Naming Generic Types

Posted by dwalend on December 05, 2004 at 04:55 PM | Comments (16)

We've had blogs covering DRY and magic Strings in the last week. I'm going to blog about generic type names, specifically using names longer than one letter.

Some of us are old enough to have used systems with a limit on the length of variable names. But I haven't seen anyone use "x" as a generic double name in years (coordinates excepted). We left that habit behind because of the confusion it caused. We now teach people to write self-documenting code with descriptive variable names.

The generics tutorial that comes with JDK 5 suggests using one capital letter to represent generic types: "A note on naming conventions. We recommend that you use pithy (single character if possible) yet evocative names for formal type parameters. It’s best to avoid lower case characters in those names, making it easy to distinguish formal type parameters from ordinary classes and interfaces. Many container types use E, for element, as in the examples above." I tried to follow this advice; I used N for the node type and E for the edge type.


public interface IndexedDigraph<N,E> 
    extends Digraph<N,E>
wasn't so bad. But the typespec for Semirings has five generic types.

public interface Semiring<N,E,L,B extends IndexedDigraph<N,E>,LD extends IndexedMutableDigraph<N,L>>
It got painful quickly. What's L? What was E again? Element?

Using fully spelled out names removes the guessing from Digraph:


public interface IndexedDigraph<Node,Edge> 
    extends Digraph<Node,Edge>
and changes Semiring from cryptically intimidating to merely complex:

public interface Semiring<Node,
                          Edge,
                          Label,
                          BaseDigraph extends IndexedDigraph<Node,Edge>,
                          LabelDigraph extends IndexedMutableDigraph<Node,Label>>
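
To make the comparison concrete, here is a minimal sketch that declares the same small interface in both styles and implements both with one class. The names SimpleDigraph, TerseDigraph, MapDigraph and NamingDemo are hypothetical stand-ins invented for illustration, not JDigraph's actual API, and the sketch assumes Java 8 or later. The compiler treats the two declarations identically; the difference is only in what a reader sees.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class NamingDemo {
    public static void main(String[] args) {
        MapDigraph<String, Integer> roads = new MapDigraph<>();
        roads.addEdge("A", "B", 5);
        roads.addEdge("A", "C", 7);
        System.out.println(roads.edgesFrom("A")); // prints [5, 7]
    }
}

// Role-based type parameter names (hypothetical interface, for illustration only).
interface SimpleDigraph<Node, Edge> {
    void addEdge(Node from, Node to, Edge edge);
    List<Edge> edgesFrom(Node from);
}

// The same contract in the single-letter style.
interface TerseDigraph<N, E> {
    void addEdge(N from, N to, E edge);
    List<E> edgesFrom(N from);
}

// One class satisfies both interfaces, because the parameter names are
// purely cosmetic: after substitution the method signatures are identical.
class MapDigraph<Node, Edge>
        implements SimpleDigraph<Node, Edge>, TerseDigraph<Node, Edge> {

    private final Map<Node, List<Edge>> outgoing = new HashMap<>();

    @Override
    public void addEdge(Node from, Node to, Edge edge) {
        // Record the edge under its source node and make sure the target is known.
        outgoing.computeIfAbsent(from, key -> new ArrayList<>()).add(edge);
        outgoing.computeIfAbsent(to, key -> new ArrayList<>());
    }

    @Override
    public List<Edge> edgesFrom(Node from) {
        return outgoing.getOrDefault(from, Collections.emptyList());
    }
}
```

Reading addEdge(Node from, Node to, Edge edge) needs no trip back to the class header, which is the whole argument for the longer names.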

Changing "E"s to "Edge"s, etc. was mindless drudgery. "E"s and "N"s are much more common than "x"s. I made mistakes during the cleanup. Changing full names would have been much easier for me. Reading full names should be easier for everyone else.

I like to name generics after the role they play in code. I didn't have any problem distinguishing the generic names from interfaces or classes. JDigraph doesn't have a Node, Edge or Label class (which is much easier to figure out now that I've used generics to cut the number of classes in half).
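
For anyone whose code base does have a class with the same name, here is a small sketch of what to expect. It is an illustration only (JDigraph has no such class, and Edge, Holder and ShadowDemo below are hypothetical): inside a generic declaration, a type parameter named Edge shadows a class named Edge, so the name refers to the parameter.

```java
class ShadowDemo {
    public static void main(String[] args) {
        // Holder's "Edge" parameter is bound to String here, not to the Edge class.
        Holder<String> holder = new Holder<>("not an Edge object");
        System.out.println(holder.get().getClass().getSimpleName()); // prints String
    }
}

// A hypothetical concrete class that happens to share the parameter's name.
class Edge {
    final int weight;

    Edge(int weight) {
        this.weight = weight;
    }
}

// Inside Holder, "Edge" means the type parameter, which shadows the class above.
class Holder<Edge> {
    private final Edge value;

    Holder(Edge value) {
        this.value = value;
    }

    Edge get() {
        return value;
    }
    // Writing value.weight here would not compile: the parameter Edge is an
    // unbounded type variable, so only Object's members are visible on it.
}
```

So full names are safest when, as in JDigraph, no class or interface of the same name is in scope.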

To avoid abbreviations in code, I keep a thesaurus on my desk. My thesaurus is probably the right tool to clear up any confusion between generic typespecs, classes and interfaces.


Comments
Comments are listed in date ascending order (oldest first)

  • I could not agree more. I've never understood why they always used single characters to represent types in C++ templates, and I *really* couldn't understand why they were doing it in java. I expected more legibility in java.

    I still don't understand why they do it, but at least now I know that I'm not the only one who thought full type names would be better. :-)

    Posted by: paulrivers on December 05, 2004 at 10:28 PM

  • Well in JDK it's not so bad because they do not use really complex generic types. Most complex one is Map with K as key and V as value. So what's the deal? It's readable.

    However, I agree that if you design your own complex generic types such as the mentioned example, it really makes sense to avoid acronyms and one-letter type aliases.

    Posted by: patrikbeno on December 06, 2004 at 03:51 AM

  • I don't agree completely with your solution because now I look at Edge on a member variable or method and think, Class or Interface. David remarks that this isn't a problem because the classes don't exist, but at the same time you are losing the intrinsic value of Java's naming scheme.

    Posted by: jhook on December 06, 2004 at 07:45 AM

  • The problem with the JDK naming is that most people will copy that style because it is in the JDK.

    Posted by: jtr on December 06, 2004 at 08:09 AM

  • If physicists always used long variable names, Einstein's equation would be energy = mass * Math.pow(speed_of_light, 2). It is "self-documenting", but have you ever heard of physicists who want to ditch e, m, and c?

    Provided they are unambiguous in a given context, short variable names are better than long variable names. For example, the Java collections library uses E, K, V, T, for type variables--unambiguous and easy on the eyes.

    In a graph program, I see nothing wrong with using N and E for node and edge types. I do not know what a Label type is, so maybe a longer type parameter would be better in that case.

    I find an overly verbose style oppressive, hard to read and hard to type. Use short names wisely, but by all means, use them when you can.

    Cheers,

    Cay


    Posted by: cayhorstmann on December 06, 2004 at 08:15 AM

  • It's ironic that you had so much trouble renaming E to Edge. The point of naming things with one capital letter is so that you can tell instantly what you're looking at. Few classes fit on one page and when someone else comes along and sees "Edge" they're naturally going to think - ok this is a class or an interface - because that's the rule! By violating the recommendation of the JLS, you're also violating the Principle of Least Astonishment. Yes, it's a tradeoff. B is not as nice as BaseDigraph, but it's actually *more clear*.
    The irony is that it would have been easy to rename E to Edge if you were using a modern IDE, while an IDE would have obviated the need to rename by making it immediately clear what E is (through a quick inline popup of its javadoc or jump to its definition and accompanying comments) or even what Edge would be (via syntax highlighting).

    Posted by: cypherpunks on December 06, 2004 at 09:36 AM

  • Cay,

    Wow -- years ago I abandoned single letter abbreviations in physics code -- particle collider simulations -- starting by replacing "c" with "LIGHT_SPEED"! ("c" was too many other things. It was "speed of wave propagation in a material" before it was the speed of light, and a materials researcher wanted to use it as a coefficient of heat or somesuch.)

    Maybe the thing to focus on is "pithy" instead of single letter. I found "'E' is the same sort of thing as 'T'" jarring. "E" and "E1" would have been better. How about "Elem" and "Elem1"?

    N to me is permanently part of O(N). People unfamiliar with graphs will have a leap to get from N to Node and from E to Edge. People unfamiliar with graph algorithms (or at least Cormen's treatment of them) will have trouble getting from L to Label. I suppose our code is for some audience.

    The reason to use longer names is to get self-documenting code. It's much more robust than comments. The thing to avoid is names like


    public void temporaryMethodToInitializeTheSystemForTodaysCrisis(String xmlStringToGetThingsGoing)


    while avoiding


    public void init(String arg)


    Hope that helps,

    Dave

    Posted by: dwalend on December 07, 2004 at 05:10 AM

  • I find E, K, V, T hugely confusing. However, I do note that using Edge does look like a Class and while it

    Posted by: markswanson on December 07, 2004 at 06:20 AM

  • Sorry, Konqueror seems to have snipped away part of my last post....
    I find E, K, V, T hugely confusing. However, I do note that using Edge does look like a Class and while it's better I still don't like the confusion. The solution: perhaps take a lesson from interfaces and use EdgeType the way some folks use EdgeIfc. Does anyone see anything wrong with this?

    Posted by: markswanson on December 07, 2004 at 06:20 AM

  • "Principle of Least Astonishment" my arse.

    Astonishment, seriously. Perhaps there should be some sort of naming convention, or specific IDE coloring for template types to differentiate them.

    BUT, writing "N value;" tells you nothing about N without having previous knowledge of the context. Oh, you mean we have Nodes? I thought it was Number. Or Name. Or Network.

    Java is supposed to be readable. A single letter for a type is not readable.

    Posted by: paulrivers on December 07, 2004 at 08:50 AM

  • "N value; tells you nothing about N without having previous knowledge of the context."
    Not true at all - and that's the point. N tells you that what you're looking at is a TypeParameter! Ok, maybe you don't know what N stands for but as I say, it's a trade-off. Unfortunately the things you're trading off against are the common practices that the JLS practically *mandates*. All things being equal, you might choose long Names for TypeParameters, but in this case, the trade-off has already been made by the language designers and you're going against it.

    Posted by: cypherpunks on December 07, 2004 at 09:57 PM

  • How about NodeType or GNodeType, GEdgeType or GenNodeType to differentiate class from generics.

    Posted by: parthav on December 07, 2004 at 10:24 PM

  • It looks like in order to violate the least astonishment for the most people, we would have to choose something other than a capital letter, since people see the capital letter and straight away think 'ah, it's a class'.

    But if we start things out with a lower case letter, then people will parse that as some kind of variable... tricky.

    Maybe we should use 'anti camel casing'? eg:
    eDGE or
    nODEtYPE
    Oh sure, it's a little ugly, but you won't get it confused with anything else.
    First letter Caps, rest lower (camelled) is taken (for classnames)
    All Caps is taken (for consts)
    All lower case is taken (keywords)
    First letter lower, rest lower (camelled), is taken (for variable and method names)

    Looks like the only thing left is the reverse of variable/method naming. :D

    Personally, I'm a little surprised they didn't shove a symbol on the front of them, like @ or #. Given that @ would get confused with metadata (which... in a way... generics are...) I'd consider it a likely candidate (we've got to avoid paying too much 'keyboard symbol tax' after all).

    Posted by: rickcarson on December 07, 2004 at 11:27 PM

  • And we're only one step away from Hungarian notation. I hoped and thought I never had to see it again and that it had outlived its purpose; guess I was wrong. Hmmmmz, generics, maybe I haven't spent enough time on it, but I've got the creeping feeling this monster is going to turn around and bite us in the ###.

    Posted by: salp on December 08, 2004 at 12:59 AM

  • The problem seems to be that we've introduced a new type into the system. We had classes, methods, and variables before and conventions on naming them. Now we have generic typing and we need a convention for how to name it such.

    Personally, single character names are atrocious. Frankly, I'm stunned the Java team followed such a 1960's coding style on that. It's already common practice to distinguish types of variables via a naming convention (constants, static, and instance). It's not a big stretch to think we simply need a similar decision for generics:

    g_GenericName
    _GenericName_
    __GenericName

    Seems like there's a plethora of possible conventions that would clearly denote a Generic vs. a Class name and still allow useful names for generic types.

    Posted by: ckessel on December 09, 2004 at 08:26 AM

  • Descriptive type names are the right way to go. I agree it's a shame that Sun set a bad example. If you want to see by inspection in stone-age tools or colorless printing that something is a parameter, then what's wrong with KeyT, ValueT, etc.?

    I've filed an IDEA enhancement request with Jetbrains for settable coloration for generic types. I was surprised they don't yet have that feature.

    Posted by: daveyost on August 26, 2005 at 08:29 PM




