Java Generics II - Understanding erasure

Posted by hellofadude on June 1, 2014 at 4:13 PM PDT

Java's generics are implemented using erasure, a mechanism that removes specific type information within the body of a generic class or method. When you instantiate a generic class with an actual type, the syntax suggests that every occurrence of the type parameter within the body of the class is substituted by the actual type. This is not entirely correct. In actuality, the compiler erases all type information by replacing unbounded type parameters with Object and bounded type parameters with their first bound. As an example, consider the following class:-

package generics.java.erasure;

class Shrub {
     void grow(){
         System.out.println("growing");
     }
}

public class ErasedType<T> {
     private T type;
     void set(T aType) {
         type = aType;
     }
     void fertilise() {
         //! type.grow(); Compiler error
     }
     public static void main(String[] args) {
         ErasedType<Shrub> plant = new ErasedType<Shrub>();
         plant.set(new Shrub());
         plant.fertilise();

     }
}

ErasedType<T> is a class that uses an unbounded type parameter, which the compiler erases to Object within the body of the class. Effectively, the only methods available through the generic type reference are those of Object, and this is in spite of the type parameter being replaced with the actual type Shrub in main(). Because of this, the compiler cannot resolve the call to grow().
If you want the compiler to be able to call methods other than those available to Object, or in other words methods from within other classes, then you must include a bound that restricts the type parameter to conform to subtypes of a particular type. You do this by reusing the extends keyword in relation to the type parameter:-
package generics.java.erasure;

public class ErasedType2<T extends Shrub> {
     private T type;
     void set(T aType) {
         type = aType;
     }
     void fertilise() {
         type.grow(); //ok
     }
     public static void main(String[] args) {
         ErasedType2<Shrub> plant = new ErasedType2<Shrub>();
         plant.set(new Shrub());
         plant.fertilise();

     }
}
/* Output
growing
*/

By specifying a bound you in effect give the compiler a hint of context, because the type parameter is then erased to the specified bound. As you can see, with this technique it is now possible to call the grow() method. You might notice that this is exactly the same thing as replacing the type parameter with the actual class in a non-generic class:-
package generics.java.erasure;

public class ErasedType3 {
     private Shrub type;
     void set(Shrub aType) {
         type = aType;
     }
     void fertilise() {
         type.grow();
     }
     public static void main(String[] args) {
         ErasedType3 plant = new ErasedType3();
         plant.set(new Shrub());
         plant.fertilise();
     }
}
/* Output
growing
*/
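
The effect of a bound on erasure can be observed with reflection. The following is a minimal sketch and not one of the original examples: Unbounded, Bounded and ErasureToBound are hypothetical names, and Shrub is redeclared only to keep the snippet self-contained. Field.getType() reports the erased type of a field, so it should report Object for the unbounded parameter and Shrub for the bounded one.

```java
import java.lang.reflect.Field;

class Shrub {
    void grow() { System.out.println("growing"); }
}

// Unbounded type parameter: T erases to Object.
class Unbounded<T> {
    T value;
}

// Bounded type parameter: T erases to its first bound, Shrub.
class Bounded<T extends Shrub> {
    T value;
}

class ErasureToBound {
    // Returns the erased (runtime) type of the field named "value".
    static Class<?> erasedFieldType(Class<?> owner) {
        try {
            Field field = owner.getDeclaredField("value");
            return field.getType();
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(erasedFieldType(Unbounded.class).getSimpleName()); // Object
        System.out.println(erasedFieldType(Bounded.class).getSimpleName());   // Shrub
    }
}
```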

The non-generic ErasedType3 class inevitably leads to the question: "so what's the point of generics?" The power of generics becomes apparent when a class or method has to cut across many types, as opposed to being limited to a specific type and its subtypes.
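
As a rough illustration of code that cuts across types, here is a small sketch of a generic method. CrossTypeDemo and largest() are hypothetical names introduced for this illustration, and the method assumes a non-empty list. The same method body serves Integer and String callers without any casts at the call site.

```java
import java.util.Arrays;
import java.util.List;

class CrossTypeDemo {
    // Works for any element type that can be compared with itself.
    // Assumes the list is not empty.
    static <T extends Comparable<T>> T largest(List<T> items) {
        T best = items.get(0);
        for (T item : items)
            if (item.compareTo(best) > 0)
                best = item;
        return best;
    }

    public static void main(String[] args) {
        Integer number = largest(Arrays.asList(3, 9, 4));           // no cast needed
        String word = largest(Arrays.asList("fig", "yew", "oak"));  // no cast needed
        System.out.println(number + " " + word);                    // 9 yew
    }
}
```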
Regardless of how one may feel about it, erasure has substantial implications for the way generics operate. Understanding the circumstances surrounding its implementation, and its limitations, is useful if you want to cope effectively with some of its peculiarities.

Far from being just another convenient language feature, erasure is best seen as the result of a compromise. Generics were not part of the Java language from the beginning (an oversight perhaps?), so their implementation had to preserve backward compatibility with existing libraries (i.e. classes continued to mean what they meant before) and provide a safe migration path, known as migration compatibility, for developers who wanted to move to generics. The solution was to make each library and application independent of all others with regard to whether generics were being used. Hence, by erasing at runtime all evidence that a particular library or application uses generics, it became possible for generified clients to coexist with non-generified libraries. You can certainly get a sense of this compromise from the fact that generics are not so tightly enforced in Java:-

package generics.java.erasure;

class GenericType<T> {
     private T type;
     public void set(T anyType) { type = anyType; }
     public T get() {
         return type;
     }
}
class DerivedOne<T> extends GenericType<T> {} // ok - parameterised

class DerivedTwo extends GenericType {}       // Warning, but will compile!

public class UnenforcedGenerics {
     public static void main(String[] args) {
         DerivedTwo dt = new DerivedTwo();
         dt.set("erased");                    // Warning only!
         String anObject = (String)dt.get();  // cast needed
     }
}

From the example, it is clear that a non-parameterised type may inherit from a parameterised type, albeit with a warning! The only difference will be the need to perform a cast when retrieving the stored object.
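
The price of this leniency is that a raw reference can place the wrong kind of object into a parameterised container. The sketch below is an illustration only; Box and RawTypeDemo are hypothetical names. The compiler issues nothing stronger than a warning when the Integer goes in, and the mistake only surfaces later, at the point where the value is read back as a String.

```java
class Box<T> {
    private T item;
    void set(T anItem) { item = anItem; }
    T get() { return item; }
}

class RawTypeDemo {
    // Stores an Integer through a raw reference, then reads it back as a String.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean failsOnRead() {
        Box<String> typed = new Box<String>();
        Box raw = typed;              // raw view of the same object
        raw.set(42);                  // unchecked warning only, no runtime check
        try {
            String s = typed.get();   // the compiler-inserted cast to String fails here
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(failsOnRead()); // true
    }
}
```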

Furthermore, according to the official JDK documentation, the Class.getTypeParameters() method returns an array of "type variable" objects; if we try this, the result is certainly not what you might expect given its 'extraordinary' description:-

package generics.java.erasure;

import java.util.*;

class Flat {}
class Building<X> extends ArrayList<X> {}

public class LostTypeInformation {
     public static void main(String[] args) {
         List<Double> decimals = new ArrayList<Double>();
         Map<Integer, String> var = new HashMap<Integer, String>();
         Building<Flat> estate = new Building<Flat>();
         for(int i = 0; i < 10; i++)
             estate.add(new Flat());
         System.out.println(Arrays.toString(decimals.getClass().getTypeParameters()));
         System.out.println(Arrays.toString(var.getClass().getTypeParameters()));
         System.out.println(Arrays.toString(estate.getClass().getTypeParameters()));
     }
}
/* Output
[E]
[K, V]
[X]
*/

The output shows only the names of the type parameters as they were declared (E, K, V and X); it has nothing to do with the actual type arguments (Double, Integer, String and Flat) used in the code.
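
Another way to see the same loss of information is to compare the runtime classes of two differently parameterised lists. This is a minimal sketch; SameRuntimeClass and its method names are hypothetical and not part of the original examples.

```java
import java.util.ArrayList;
import java.util.List;

class SameRuntimeClass {
    // True if two differently parameterised lists share one runtime class.
    static boolean sharesRuntimeClass() {
        List<String> names = new ArrayList<String>();
        List<Integer> numbers = new ArrayList<Integer>();
        return names.getClass() == numbers.getClass();
    }

    // Name of the first declared type parameter of ArrayList, i.e. the placeholder.
    static String firstTypeParameterName() {
        return ArrayList.class.getTypeParameters()[0].getName();
    }

    public static void main(String[] args) {
        System.out.println(sharesRuntimeClass());      // true
        System.out.println(firstTypeParameterName());  // E
    }
}
```

Both lists are plain ArrayList instances at runtime, which is why reflection on the object can only report the placeholder E.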

To get your mind around what is possibly a lot of confusion, it helps to consider the motivations behind generics and the choices that were available to the early Java designers. The initial motivation behind generics was to loosen, in a significant way, the constraints on the types your classes and methods work with, i.e. to allow you the programmer to write more generalised code. The designers realised they could achieve this by having the compiler undertake type checking and casting only at the 'boundaries' (i.e. the points in your program where objects enter and leave a method) to ensure internal consistency in the way types are used. This made it possible for the compiler to use erasure to remove all type information in the body of a method or class, thereby solving the backward compatibility problem.

For instance, observe the methods in the following class. The first method returns a generic array of the specified size while the second method returns a Collection of the specified type and size. Note that there is no specific type information inside either method:-

package generics.java.erasure;

import java.lang.reflect.Array;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;

public class BoundaryTypeChecking<T> {
     private Class<T> type;
     public BoundaryTypeChecking(Class<T> aType) { this.type = aType; }
     @SuppressWarnings("unchecked")
     T[] createArray(int size)  {
         return  (T[])Array.newInstance(type, size);    // Warning Unchecked cast
     }

     Collection<T> create(T type, int size) {           // Boundary
         Collection<T>  list = new ArrayList<T>();
         for(int i = 0; i < size; i++)
             list.add(type);
         return list;
     }

     public static void main(String[] args) {
        BoundaryTypeChecking<String> s = new BoundaryTypeChecking<String>(String.class);
         System.out.println(Arrays.toString(s.createArray(10)));
         System.out.println(s.create("Boo", 10));
     }
}
/* Output
[null, null, null, null, null, null, null, null, null, null]
[Boo, Boo, Boo, Boo, Boo, Boo, Boo, Boo, Boo, Boo]
*/

Even though the type tag in the above example is declared as Class<T>, erasure reduces this to just Class. Consequently, Array.newInstance(), which is declared to return Object, gives the compiler no information about the specific array type in the body of the createArray() method, hence the need for an explicit (unchecked) cast at the point of exit.
The same logic applies to the create() method. If we accept that there is no type information in the body of the method, then the only logical explanation for being able to recover the correct type from the Collection is that the compiler has already undertaken type checking at the boundary of the method. To be precise, Collection<T> has no type information and is simply stored as Collection. The same goes for the ArrayList; however, the compiler is still able to ensure the correct type at the point of entry at compile time. Because of erasure, it is useful to remind yourself that your type references are only Objects within the body of your class.
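
The boundary behaviour can be made visible with a deliberately unsafe generic method. This is only a sketch; BoundaryCast, unsafeCast() and the castInsideCompleted flag are hypothetical names introduced for the illustration. The cast to T inside the method checks nothing because T is erased, so the method completes normally, and the failure appears at the call site where the compiler inserted a cast to String.

```java
class BoundaryCast {
    // Set once the cast inside the generic method has completed without error.
    static boolean castInsideCompleted = false;

    @SuppressWarnings("unchecked")
    static <T> T unsafeCast(Object value) {
        T result = (T) value;          // T is erased, so nothing is checked here
        castInsideCompleted = true;
        return result;
    }

    // True if the method body ran to completion and the exception
    // was only raised where the result left the method.
    static boolean failsAtCallSite() {
        try {
            String s = BoundaryCast.<String>unsafeCast(42); // compiler-inserted cast to String
            return false;
        } catch (ClassCastException e) {
            return castInsideCompleted;
        }
    }

    public static void main(String[] args) {
        System.out.println(failsAtCallSite()); // true
    }
}
```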

Capturing erased type

The implication of this loss of type information is that you cannot use type parameters in operations that explicitly refer to runtime types, such as casts, instanceof operations and expressions beginning with new.

There are a number of ways to compensate for the loss of type information. For instance, to check the runtime type of an object, you can insert a type tag and then use the Class.isInstance() method instead. Here's how:-

package generics.java.erasure;

class Meal {
     public Meal() {}
}
class Breakfast extends Meal {}

public class CaptureType<T> {
     Class<T> type;
     public CaptureType(Class<T> anyType) {
         this.type = anyType;
     }
     public static void main(String[] args) {
        CaptureType<Breakfast> breakfast = new CaptureType<Breakfast>(Breakfast.class);
         CaptureType<Meal> meal = new CaptureType<Meal>(Meal.class);
         System.out.println(breakfast.type.isInstance(new Meal()));
         System.out.println(breakfast.type.isInstance(new Breakfast()));
         System.out.println(meal.type.isInstance(new Meal()));
         System.out.println(meal.type.isInstance(new Breakfast()));
     }
}
/* Output
false
true
true
true
*/

By passing in a type tag you can recover type information and still be able to perform the kind of type checking you would have got using the instanceof keyword.
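
The same type tag can also stand in for a cast. The sketch below is an illustration under assumptions: TypeTag, matches() and convert() are hypothetical names, and Meal and Breakfast are redeclared to keep the snippet self-contained. Class.cast() performs a genuine runtime check, unlike a cast to T, which is erased.

```java
class Meal {}
class Breakfast extends Meal {}

class TypeTag<T> {
    private final Class<T> type;

    TypeTag(Class<T> type) { this.type = type; }

    // Runtime-checked replacement for "obj instanceof T".
    boolean matches(Object obj) { return type.isInstance(obj); }

    // Runtime-checked replacement for "(T) obj";
    // throws ClassCastException if obj is not an instance of the tagged type.
    T convert(Object obj) { return type.cast(obj); }

    public static void main(String[] args) {
        TypeTag<Breakfast> tag = new TypeTag<Breakfast>(Breakfast.class);
        System.out.println(tag.matches(new Meal()));       // false
        Breakfast b = tag.convert(new Breakfast());        // succeeds
        System.out.println(b != null);                     // true
    }
}
```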

The following example demonstrates how to recover type information that would be useful in instantiating new types:-

package generics.java.erasure;

import java.util.HashMap;
import java.util.Map;
 
public class CaptureTypeFromContainer {
     @SuppressWarnings("serial")
     public static class AMap<T> extends HashMap<String, Class<?>> {
         T type;
         public void addType(String typeName, Class<?> type) {
                put(typeName, type);
         }
         @SuppressWarnings("unchecked")
         public T createNew(String typeName)  {
             for(Map.Entry<String, Class<?>> anEntry : entrySet())
                if(anEntry.getKey().equalsIgnoreCase(typeName)) {
                     try {
                           type = (T) anEntry.getValue().newInstance();
                          System.out.println("Correct class - required: " + typeName + "\n" + "found: " + anEntry.getKey());
                     } catch(InstantiationException e) {
                          System.out.println("Unable to create");
                     } catch(IllegalAccessException e) {
                          System.out.println("Unable to Access!");
                     }
                 } else {
                     System.out.println("Wrong class - required: " + typeName + "\n" + "found: " + anEntry.getKey());
                 }
             return type;
          }
          T get() { return type; }
     }
     public static void main(String[] args) {
        CaptureTypeFromContainer.AMap<Meal> meal = new CaptureTypeFromContainer.AMap<Meal>();
         meal.addType("Meal", Meal.class);
         meal.addType("Breakfast", Breakfast.class);
         Meal meal2 = meal.createNew("meal");
         System.out.println(meal2.getClass().getSimpleName());
     }
}
/* Output
Wrong class - required: meal
found: Breakfast
Correct class - required: meal
found: Meal
Meal
*/

The above code uses a nested class that extends HashMap to store a class name as its key and the corresponding Class object as the value. You can obtain a new instance by calling the createNew() method with a type name. If the name exists in the container, a new instance is created and cast to the correct type. Again, the cast is necessary because erasure erases type information. Note that this version of newInstance() assumes it is working with a class that has a no-args constructor.
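
Class.newInstance() has been deprecated since Java 9 in favour of getDeclaredConstructor().newInstance(). The following sketch shows one possible variation on the idea rather than a rewrite of the class above: TypeRegistry is a hypothetical name, the registry is keyed by lower-cased name as an assumption, and Meal and Breakfast are redeclared so the snippet stands alone. Bounding the stored values as Class<? extends T> lets the compiler verify the result type, so no unchecked cast is needed.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class Meal {}
class Breakfast extends Meal {}

class TypeRegistry<T> {
    private final Map<String, Class<? extends T>> types =
            new HashMap<String, Class<? extends T>>();

    void addType(String name, Class<? extends T> type) {
        types.put(name.toLowerCase(Locale.ROOT), type);
    }

    // Returns a new instance of the registered class, or null if the name is unknown.
    // Assumes the registered class has an accessible no-args constructor.
    T createNew(String name) {
        Class<? extends T> type = types.get(name.toLowerCase(Locale.ROOT));
        if (type == null)
            return null;
        try {
            return type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + type.getName(), e);
        }
    }

    public static void main(String[] args) {
        TypeRegistry<Meal> registry = new TypeRegistry<Meal>();
        registry.addType("Meal", Meal.class);
        registry.addType("Breakfast", Breakfast.class);
        System.out.println(registry.createNew("breakfast").getClass().getSimpleName()); // Breakfast
    }
}
```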

A more flexible approach to instantiating new types is to use explicit factories. Each factory implements a common generic interface, so the type it creates is constrained at compile time, and it can be used to instantiate objects dynamically at runtime like so:-

package generics.java.erasure;

interface Factory<T> {
     T create(String arg);
}
class Batman {
     String name;
     public Batman(String hero) {
         this.name = hero;
         System.out.println(hero);
     }
     static class HeroFactory implements Factory<Batman> {
         public Batman create(String name) {
             return new Batman(name);
         }
     }
}
class Spiderman  {
     String name;
     public Spiderman(String hero) {
         this.name = hero;
         System.out.println(hero);
     }
     static class HeroFactory implements Factory<Spiderman> {
         public Spiderman create(String name) {
             return new Spiderman(name);
         }
     }
}
class StringFactory implements Factory<String> {
     public String create(String aString) {
         return new String(aString);
     }
}
class TypeMaker<T> {
     private T aType;
     public <I extends Factory<T>> TypeMaker(I afactory, String arg) {
         aType = afactory.create(arg);
     }
     public T getType() {
         return aType;
     }
}

public class Factories {
     public static void main(String[] args) {
         TypeMaker<Batman> hero1 = new TypeMaker<Batman>(new Batman.HeroFactory(), "Batman");
         TypeMaker<Spiderman> hero2 = new TypeMaker<Spiderman>(new Spiderman.HeroFactory(), "Spiderman");
         TypeMaker<String> fact = new TypeMaker<String>(new StringFactory(), "\"Good sense is equally distributed among men\" ");
         System.out.println(fact.getType());
     }
}
/* Output
Batman
Spiderman
"Good sense is equally distributed among men"
*/

The TypeMaker constructor accepts only classes that implement the Factory interface, and because you have explicit factories, unlike the previous example, you are not limited to instantiating only those classes that have a default constructor.
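
On Java 8 or later, a constructor reference can play the role of the factory without a dedicated factory class. This is a sketch under that assumption; Hero and Maker are hypothetical names, and java.util.function.Function stands in for the Factory interface used above.

```java
import java.util.function.Function;

class Hero {
    final String name;
    Hero(String name) { this.name = name; }
}

class Maker<T> {
    private final T made;

    // Any Function<String, T> can act as the factory, including a constructor reference.
    Maker(Function<String, T> factory, String arg) {
        made = factory.apply(arg);
    }

    T get() { return made; }

    public static void main(String[] args) {
        Maker<Hero> hero = new Maker<Hero>(Hero::new, "Batman");
        Maker<String> text = new Maker<String>(String::new, "Good sense");
        System.out.println(hero.get().name); // Batman
        System.out.println(text.get());      // Good sense
    }
}
```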

You can achieve a similar thing with reflection using the Constructor.newInstance() method, which accepts initialisation arguments:-

package generics.java.erasure;

import java.lang.reflect.Constructor;
import java.lang.reflect.InvocationTargetException;

class MyType {
     public MyType(String s) {
         System.out.println(s);
     }
}
public class InstantiateTypeWithReflection<T> {
     Class<T> type;
     public InstantiateTypeWithReflection(Class<T> aType) {
         this.type = aType;
     }
     public T createNew() {
          T t = null;
          try {
             Constructor<T> constructor = type.getConstructor(String.class);
             try {
                 t = constructor.newInstance("MyType created Successfully!");
             } catch (InstantiationException iException) {
                 System.out.println(iException);
             } catch (IllegalAccessException iAException) {
                 System.out.println(iAException);
             } catch (IllegalArgumentException iAException) {
                 System.out.println(iAException);
             } catch (InvocationTargetException iTException) {
                 System.out.println(iTException);
             }
         } catch(NoSuchMethodException nSMException) {
             System.out.println(nSMException);
         } catch(SecurityException sExeption) {
             System.out.println(sExeption);
         }
         return t;
     }
     public static void main(String[] args)  {
       InstantiateTypeWithReflection<MyType> reflector = new InstantiateTypeWithReflection<MyType>(MyType.class);
         reflector.createNew();
     }
}
/* Output
MyType created Successfully!
*/

This version of the newInstance() method accepts a varargs list, which means you can work with both no-args constructors and constructors that require arguments.

Generics and Arrays

Because an array carries its component type at runtime while erasure removes type arguments, creating an array of generics can become slightly confusing. Take a look at this code:-

package generics.java.erasure;

class Generic<T> {}

public class ArrayOfGeneric {
     public static void main(String[] args) {
         // Generic<String>[] genArray = new Generic<String>[3];             // Illegal: generic array creation
         // Generic<String>[] genArray = (Generic<String>[])new Object[5];   // Compiles, but ClassCastException at runtime
         Generic<String>[] genArray = (Generic<String>[])new Generic[5];     // ok (unchecked warning)! Runtime type is raw
         //! genArray[0] = new Object();                                     // Compiler error
         for(int i = 0; i < genArray.length; i++)
             genArray[i] = new Generic<String>();
         for(int i = 0; i < genArray.length; i++)
             System.out.println(genArray[i]);
         System.out.println(genArray.getClass().getSimpleName());
     }
}
/* Output
generics.java.erasure.Generic@15db9742
generics.java.erasure.Generic@6d06d69c
generics.java.erasure.Generic@7852e922
generics.java.erasure.Generic@4e25154f
generics.java.erasure.Generic@70dea4e
Generic[]
*/

The first line in the main() of this example shows it is not possible to create an array of generics in the way you would normally create arrays of other types. This is because an array must know its exact component type at runtime, and erasure leaves only the raw type Generic available. Casting an array of Object to an array of generics compiles (with an unchecked warning) but fails at runtime with a ClassCastException, because the array really is an Object[]. The only way to create an array of generics is to create an array of the erased type and then cast that to the array of generics. Once you successfully create an array of generics, you get the normal compile time type checking.
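
One way to see why the compiler refuses to create an array of a parameterised type directly is to look at what becomes possible once such an array exists. The sketch below is illustrative only and GenericArrayPollution is a hypothetical name. Because arrays are covariant and the runtime component type is just the raw List, storing a List<Integer> passes the array store check, and the error surfaces only when an element is read as a String.

```java
import java.util.Arrays;
import java.util.List;

class GenericArrayPollution {
    // True if a List<Integer> could be smuggled into a List<String>[]
    // and was only detected when an element was read back.
    @SuppressWarnings("unchecked")
    static boolean pollutes() {
        List<String>[] lists = (List<String>[]) new List[1]; // runtime type is List[]
        Object[] objects = lists;                 // legal: arrays are covariant
        objects[0] = Arrays.asList(42);           // no ArrayStoreException: it is a List
        try {
            String s = lists[0].get(0);           // compiler-inserted cast to String fails
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(pollutes()); // true
    }
}
```

The unchecked cast at least produces a warning; if direct creation were permitted, the same hole would open with no warning at all.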

Within the body of a class, you can create a generic type array either by casting an Object[] array to a type array T[] within the constructor, or by using an Object[] array and casting its elements to T only when required, in a convenience get() method for example. Either way you still lose valuable type information about the underlying array implementation, which is just Object[] at runtime.
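
The two approaches just described might look like the following sketch. ObjectBacked, CastBacked and ArrayBackingDemo are hypothetical names introduced for illustration. In both classes the backing array is an Object[] at runtime, which is why exposing it as an Integer[] fails.

```java
// Approach 1: keep an Object[] and cast each element on the way out.
class ObjectBacked<T> {
    private final Object[] items;
    ObjectBacked(int size) { items = new Object[size]; }
    void set(int index, T item) { items[index] = item; }
    @SuppressWarnings("unchecked")
    T get(int index) { return (T) items[index]; }
}

// Approach 2: cast the Object[] to T[] once, in the constructor.
class CastBacked<T> {
    private final T[] items;
    @SuppressWarnings("unchecked")
    CastBacked(int size) { items = (T[]) new Object[size]; } // still an Object[] at runtime
    void set(int index, T item) { items[index] = item; }
    T get(int index) { return items[index]; }
    T[] impl() { return items; }
}

class ArrayBackingDemo {
    // True if exposing the backing array as Integer[] fails at the call site.
    static boolean implLeaksObjectArray() {
        CastBacked<Integer> holder = new CastBacked<Integer>(3);
        try {
            Integer[] backing = holder.impl();   // compiler-inserted cast to Integer[] fails
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        ObjectBacked<Integer> values = new ObjectBacked<Integer>(2);
        values.set(0, 7);
        System.out.println(values.get(0));            // 7
        System.out.println(implLeaksObjectArray());   // true
    }
}
```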

The best way to get around this is to use reflection: pass a type tag to the Array.newInstance() method to create an array of the specified size, and then cast the result to the generic array type T[]. Here's how:-

package generics.java.erasure;

import java.lang.reflect.Array;
import java.util.Random;

public class GenericArray<T> {
     private T[] anArray;
     @SuppressWarnings("unchecked")
     public GenericArray(Class<T> type, int size) {
         anArray = (T[])Array.newInstance(type, size);
     }
     public void add(int index, T anObject) {
         anArray[index] = anObject;
     }
     public T get(int index) {
         return anArray[index];
     }
     public T[] impl() { return anArray; }

     public static void main(String[] args) {
        GenericArray<Integer> holder = new GenericArray<Integer>(Integer.class, 10);
         Integer[] implementation = holder.impl();  // underlying type information
         for(int i = 0; i < 10; i++) {
             holder.add(i, i*new Random().nextInt(20));
         }
         for(int i = 0; i < 10; i++)
             System.out.print(holder.get(i).toString() + " ");
         System.out.println();
         System.out.println("generic array implementation: " + implementation);
     }
}
/* Output
0 18 30 51 68 85 84 105 144 117
generic array implementation: [Ljava.lang.Integer;@7852e922
*/

Using this technique, we retain the type of the underlying array implementation as well as having access to the elements within the array.

The benefits of understanding the effects and limitations imposed by erasure, and how you might overcome them, far outweigh the effort and make you a more effective programmer.

Comments

This is because the runtime type of an array of generics is erased to Object. - 02 media