Java Generics<Part-2>: Getting your hands dirty (Really?)

In my previous post, Java Generics: <Part-1>, we learnt the basics of Java Generics and how to use them. Now it’s time to do some actual work. In this post, we’ll learn how to write and instantiate our own generic types. Further down, we’ll also see how to write a generic method in a non-generic type, which is useful when you don’t want your entire class to be generic.

Writing your own Generic Type:

Generic classes are normal classes with one or more type parameters added to their definition. The type parameters are declared in angle brackets, <>, following the class name. Let’s see how you declare a very simple generic class:

public class Container<T> {
    private List<T> list;
    public Container() {
        this.list = new ArrayList<T>();
    }
    public List<T> getList() { return this.list; }
}

The Container class declares a single type parameter T. As you can see, the type of the field list is List<T>. A type parameter declared as part of the class declaration can be used in the members declared inside that class, though there are some restrictions, which we’ll see later on. We then initialize the list inside the constructor, passing the same type parameter T as the type argument when instantiating the generic ArrayList<E> type. T is not really an actual type argument here; it simply takes the place of E in ArrayList<E>. The type parameter T will itself be replaced by the actual type argument we use to instantiate our generic class, as shown below:

public static void main(String[] args) {
    Container<String> strContainer = new Container<String>(); 
    List<String> strList = strContainer.getList();
    Container<Integer> intContainer = new Container<Integer>();
    List<Integer> intList = intContainer.getList();
}

We have two concrete instantiations [1] of our generic container (also called concrete parameterized types): one with String as the actual type argument, and one with Integer. Notice how the getList() method returns a List of the corresponding type. You can compile and execute that code, and it will work fine. It isn’t doing much yet, so you won’t get any output.

[1] As we’ll learn later, there is another way to instantiate a generic type, called wildcard instantiation (also called a wildcard parameterized type).

So, the basic idea is that we declare a generic class once, and then instantiate it with different types based on our requirements. This keeps our task simple: we don’t have to write a different container for, say, Integer and String. Apart from that, the code is type-safe: we know that a Container<String> holds Strings, so its getList() method will only ever return a List<String>.
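To make the type-safety point concrete, here is a minimal sketch of the same Container<T> in use. The add() method and the ContainerDemo class are my own additions for illustration; they are not part of the class shown above.

```java
import java.util.ArrayList;
import java.util.List;

class ContainerDemo {
    // Stores a String, reads it back without a cast, and returns its length.
    static int firstLength() {
        Container<String> names = new Container<String>();
        names.add("generics");
        // names.add(42);   // would not compile: 42 is not a String
        String first = names.getList().get(0);   // no cast needed
        return first.length();
    }

    public static void main(String[] args) {
        System.out.println(firstLength());
    }
}

// Same shape as the Container<T> above, plus a hypothetical add() helper.
class Container<T> {
    private final List<T> list = new ArrayList<T>();

    public void add(T item) {   // hypothetical, added only for this sketch
        list.add(item);
    }

    public List<T> getList() {
        return list;
    }
}
```

The commented-out line is the interesting one: the mistake is rejected by the compiler instead of surfacing later as a runtime failure.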

All instantiations of a generic type share the same runtime type:

As we know, there is no difference between generic and non-generic code at runtime. All the generic type information is removed by the compiler during the type erasure process. That means all the instantiations of a generic type are the same at runtime. We say that generics are not reified. For example, List<String>, List<Integer>, List<Date>, etc. are all just List at runtime; they share the same runtime type. To see how, try the following code:

List<String> strList = new ArrayList<String>();
List<Integer> intList = new ArrayList<Integer>(); 
System.out.println(strList.getClass());   // java.util.ArrayList
System.out.println(intList.getClass());   // java.util.ArrayList
System.out.println(strList.getClass() == intList.getClass());   // true
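One visible consequence of the shared runtime type is that instanceof cannot ask about a type argument. The sketch below is only an illustration; the class name ReifiableCheck and its helper method are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class ReifiableCheck {
    // The most specific question we can ask at runtime: "is this some kind of List?"
    static boolean isSomeList(Object candidate) {
        // return candidate instanceof List<String>;   // would not compile: the type argument is erased
        return candidate instanceof List<?>;
    }

    public static void main(String[] args) {
        Object strings = new ArrayList<String>();
        Object integers = new ArrayList<Integer>();
        // Both pass the same check; nothing at runtime tells them apart.
        System.out.println(isSomeList(strings));
        System.out.println(isSomeList(integers));
    }
}
```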

Does this behaviour affect the way you can write code? It certainly does. The way Java Generics is implemented prevents you from doing certain things that might seem obvious at first glance. Here’s the most common of them.

Generics are invariant:

This simply means that instantiating a generic type with two type arguments that are related as supertype and subtype doesn’t make the resulting parameterized types related in the same way. So, even though Number is a superclass of Integer, List<Number> is not a supertype of List<Integer>. You cannot write code like this:

List<Number> list = new ArrayList<Integer>();  // Illegal: Won't compile

List<Integer> intList = new ArrayList<Integer>();
List<Number> numList = intList;                // Illegal too

So, the usual subtype polymorphism doesn’t carry over to the type arguments of a generic type. Now the question arises, why? Because of erasure, the JVM cannot differentiate between a List<Number> and a List<Integer> at runtime; both are just List, as already explained, so the list cannot reject an element of the wrong type when it is added. If the compiler allowed an assignment like the one above, it could cause havoc at runtime. Let me show you how.

Suppose you have a superclass Animal and its two subclasses, Dog and Cat. Now consider the code below:

List<Dog> dogs = new ArrayList<Dog>();
List<Animal> animals = dogs;    // Suppose it was allowed

// Below code is valid, since Cat is a subclass of Animal, we can keep it in a List<Animal>
animals.add(new Cat("mewww"));

// And we do this - get dog at index 0 from dogs
Dog dog = dogs.get(0);    // Oh Dear! You assigned a Cat to a Dog reference.

You see what happened? The last assignment would throw a ClassCastException at runtime. That is why the second assignment is not allowed: the compiler stops you right there with a compile-time error.
So, isn’t there any way to achieve polymorphic behaviour with generics? Of course there is. We’ll see how we can achieve this using wildcards.
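If you want to watch the Dog/Cat failure actually happen, you can force the forbidden assignment through an unchecked cast. This is a sketch for demonstration only: the double cast and the tiny Animal, Dog and Cat classes are my own stand-ins, and this is not something to do in real code.

```java
import java.util.ArrayList;
import java.util.List;

class InvarianceDemo {
    static class Animal { }
    static class Dog extends Animal { }
    static class Cat extends Animal { }

    // Returns true if reading the Cat back through the List<Dog> reference fails.
    @SuppressWarnings("unchecked")
    static boolean readingDogFails() {
        List<Dog> dogs = new ArrayList<Dog>();
        // Force the assignment that the compiler normally rejects.
        List<Animal> animals = (List<Animal>) (List<?>) dogs;
        animals.add(new Cat());   // succeeds: at runtime it is just a List
        try {
            Dog dog = dogs.get(0);   // the compiler-inserted cast to Dog fails here
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(readingDogFails());
    }
}
```

Notice that the add() call itself goes through silently; the error only shows up later, at the point where the element is read as a Dog.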

Generic Methods:

Generic methods are methods that declare their own type parameters. You declare the type parameters of a generic method in the same way as for a generic type, inside angle brackets. The difference is that they go not after the method name, but just before the return type of the method.

Suppose you want to write a method that converts an array to a List. Without generics, you would have to write a different method for each array type: one for String[] to List<String>, another for Integer[] to List<Integer>, and so on. For example, here is the version that produces a List<Integer>:

public static List<Integer> toList(int... array) {
    List<Integer> list = new ArrayList<Integer>();
    for (int elem: array) {
        list.add(elem);
    }
    return list;
}

This soon becomes cumbersome, as it’s not practical to keep adding overloaded methods for different types. If only there were a way to have a single method handle all the types.

Generic method to the rescue:

So there it is. You can write a single generic method, which will handle all the types for you. This is how you write it:

@SafeVarargs   // Don't worry about this annotation. We'll discuss it later, why it's needed
public static <T> List<T> toList(T... array) {
    List<T> list = new ArrayList<T>();	
    for (T elem: array) {
        list.add(elem);
    }		
    return list;
}

The <T> before the return type of the method declares a type parameter T whose scope is confined to the method itself. It is similar to declaring a formal parameter for a method. And like formal parameters, type parameters are part of the method signature. You can also declare multiple type parameters, just as you can declare multiple formal parameters: put them inside the angle brackets, separated by commas, as in <T, S, U>.
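As a small illustration of multiple type parameters, here is a sketch of a method with two of them. The class PairUtil and the method singleEntryMap are hypothetical names I made up for this example.

```java
import java.util.HashMap;
import java.util.Map;

class PairUtil {
    // <K, V> declares two type parameters, both scoped to this method only.
    static <K, V> Map<K, V> singleEntryMap(K key, V value) {
        Map<K, V> map = new HashMap<K, V>();
        map.put(key, value);
        return map;
    }

    public static void main(String[] args) {
        // K is inferred as String and V as Integer from the arguments.
        Map<String, Integer> ages = singleEntryMap("alice", 30);
        System.out.println(ages.get("alice"));
    }
}
```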

How do you invoke a generic method?

Just like any normal method. The type arguments will be inferred from the types of the arguments you pass, just as they can be inferred while instantiating a generic type. So, if you pass an Integer[], T will be inferred as Integer; if you pass a String[], it will be inferred as String. As you may have noticed, I’ve used varargs as the formal parameter type. This gives us the flexibility to invoke the method without passing an explicitly created array. There can be issues with using varargs of a type parameter or of a parameterized type, which we’ll discuss later; that is why I’ve used the @SafeVarargs annotation there.

So, this is how you would invoke the method:

public static void main(String... args) {
    List<String> strList = toList("a", "b", "c"); // T inferred as String. Return List<String>
    List<Integer> intList = toList(1, 2, 3);      // T inferred as Integer. Return List<Integer>
    List<Box> boxList = toList(new Box(5), new Box(10));  // T inferred as Box. Return List<Box>
}
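Inference is usually all you need, but the type argument can also be written out explicitly, just before the method name. The sketch below repeats toList() inside a hypothetical ToListDemo class so that it compiles on its own.

```java
import java.util.ArrayList;
import java.util.List;

class ToListDemo {
    @SafeVarargs
    static <T> List<T> toList(T... array) {
        List<T> list = new ArrayList<T>();
        for (T elem : array) {
            list.add(elem);
        }
        return list;
    }

    // Pins T to Number explicitly instead of leaving it to inference.
    static List<Number> mixedNumbers() {
        return ToListDemo.<Number>toList(1, 2.5, 3L);
    }

    public static void main(String[] args) {
        System.out.println(mixedNumbers().size());
    }
}
```

The explicit form comes in handy when the arguments have different types and you want the element type to be a common supertype of your choosing.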

So you can see the advantage: a single method is sufficient to fulfil our need. With this, I’ll wrap up this post.

Java Generics<Part-1>: A Basic Introduction

Welcome to the world of Java Generics, a feature introduced back in Java 5 along with others such as boxing/unboxing, the enhanced for-loop, enums and varargs. Although it is by now a fairly old feature, many still find some of its trickier parts difficult to digest. This is the first in a series of posts that I’ll probably write on this feature, aimed at both amateur and intermediate Java developers.

What is Java Generics?

We all know that writing bug-free code on the very first attempt is next to impossible. Even the most experienced programmers have bugs in their code, and these bugs increase the cost of the software we write. That cost grows with the delay in discovering a bug: the earlier a bug is detected, the lower the cost of fixing it.
Generics help us detect certain bugs early, at compile time, rather than letting them persist till runtime, by performing stronger type checks during compilation. Yes, generics are all about compile-time activity. They have no business at runtime.

How does generic code differ from non-generic code?

Let’s consider a very simple example, where you create a List, add an element, and then fetch it back. We will see code with and without generics (as you would write in Java versions < 5)

1. Before Java 5:

List list = new ArrayList();
list.add(new Integer(5));
// list.get(0) returns an Object, so you need a cast to Integer,
// and an explicit intValue() call, since there was no auto-unboxing before Java 5
int val = ((Integer) list.get(0)).intValue();

As you see, before generics were introduced, you had to cast the value fetched from the list to the appropriate type, since the list was nothing but a list of Object. Well and good so far.
But consider the case where you add a String to your list and cast it to an Integer while fetching:

List list = new ArrayList();
list.add("abc");                  // Will compile fine
int val = ((Integer) list.get(0)).intValue();   // ClassCastException at runtime

Here, there is no way for the compiler to know what type you will actually get from the list, so this blows up at runtime with a ClassCastException.

Generics avoid this issue by providing the generic type List, which carries a type parameter as part of its declaration: List<E>. The actual type argument is passed while instantiating the List, and replaces the type parameter E. It specifies the type of element that will be stored in that list. We’ll discuss the significance of E later on. So, let’s see how the same code looks with generics:

2. Java 5 onwards:

List<Integer> list = new ArrayList<Integer>(); // Instantiate generic type List
list.add(5);  // Autoboxing from 5 to Integer reference
int val = list.get(0);  // No type cast needed now

Now this code is much clearer, both to you and to the compiler. Passing the type argument Integer while instantiating the List specifies that the list will contain elements of type Integer. Since the compiler knows this, any attempt to add a value of a type incompatible with Integer results in a compile-time error. For the same reason, you don’t need a cast when fetching an element from the list. Note that the autoboxing feature introduced in Java 5 allows us to write add(5); instead of add(new Integer(5));. Now modify the code: try adding a String to the list, or assigning the result of list.get(0) to a String, and see what happens.

List<Integer> list = new ArrayList<Integer>(); 
list.add("abc");  // Compile-time error: a String cannot be added to a List<Integer>
int val = list.get(0);

So you can see how generics enforce type checks at compile time to avoid a potential ClassCastException at runtime.

If you remember, I said that generics have no business at runtime. Even though there is a considerable difference between the code you write with and without generics, the generated bytecode is essentially the same for both. Surprised? Well, this is the tricky part to understand. The compiler removes all the generic type information from the code in a process called type erasure. We’ll discuss this in detail later on. So, there is no difference between the two versions shown above as far as runtime performance is concerned. Of course, the compiler inserts the appropriate casts to keep the erased code type-correct.
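To get a feel for what erasure means, here is a rough sketch that puts the generic version next to a hand-written approximation of what remains after the type information is removed. The class ErasedEquivalent is hypothetical, and the second method is my approximation rather than actual compiler output.

```java
import java.util.ArrayList;
import java.util.List;

class ErasedEquivalent {
    // What you write with generics.
    static int withGenerics() {
        List<Integer> list = new ArrayList<Integer>();
        list.add(5);
        return list.get(0);
    }

    // Roughly what remains after erasure: a raw List plus an explicit cast.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static int handErased() {
        List list = new ArrayList();
        list.add(Integer.valueOf(5));
        return ((Integer) list.get(0)).intValue();
    }

    public static void main(String[] args) {
        System.out.println(withGenerics() == handErased());
    }
}
```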

A better example:

Suppose we want to create a list of 3 integers, then iterate over the list and print its contents. I’ll show you how the code would look both with and without generics. As an exercise, I will not repeat the concepts already explained, and will let you figure out what is happening.

1. Before Java 5:

List intList = new ArrayList(Arrays.asList(
        new Integer[] {
            new Integer(1),
            new Integer(2),
            new Integer(3)
        }));

for (Iterator iter = intList.iterator(); iter.hasNext();) {
    // No auto-unboxing before Java 5, so call intValue() explicitly
    int element = ((Integer) iter.next()).intValue();
    System.out.println(element);
}

Since there were no varargs or autoboxing before Java 5, you had to explicitly create a new Integer[] array to pass to the Arrays#asList(Object[]) method, and you had to create the Integer objects yourself, as in new Integer(1);. This is no longer the case from Java 5. Let’s see the code from Java 5 onwards:

2. Java 5 onwards:

List<Integer> genericList = new ArrayList<Integer>(Arrays.asList(1, 2, 3));
for (int element: genericList) {
    System.out.println(element);
}

The features used in the above code are:

  • Generics
  • Varargs
  • Autoboxing
  • Enhanced for-loop

I’ll wrap up this post here. In the next post we’ll see how to write a generic type yourself.

Regex to split a String on comma outside double quotes.

Suppose you have a string like so:

String str = "abc,foo,c;bar=\"this,demo\",hello\"world,syzygy\"";

and you want to write a regex to split this string on commas. No big deal so far. But notice that there are some commas inside the double quotes too. What if you want to ignore those commas while splitting?

Here’s working code that will do it:

String str = "abc,foo,c;bar=\"this,demo\",hello\"world,syzygy\"";
String[] tokens = str.split(",(?=([^\"]*\"[^\"]*\")*[^\"]*$)");
		
for (String token: tokens) {
    System.out.println(token);
}

This would give you the following output:

abc
foo
c;bar="this,demo"
hello"world,syzygy"

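So, what is that regex doing? The trick is to split only on commas that are followed by an even number of double quotes up to the end of the string. The lookahead (?=...) checks this without consuming any characters: inside it, ([^"]*"[^"]*")* matches zero or more pairs of quotes, and [^"]*$ matches the remaining quote-free text. A comma inside a quoted section is followed by an odd number of quotes, so it is skipped. Of course, this works only if the quotes are balanced. Enjoy.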
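If the one-line pattern is hard to read, it can be assembled from named pieces. This is just a sketch: the class QuoteAwareSplit and its constants are hypothetical names, and I’ve used a non-capturing group (?:...) where the original has a capturing one, which makes no difference for split().

```java
import java.util.Arrays;
import java.util.List;

class QuoteAwareSplit {
    // One pair of quotes: quote-free text, an opening quote, quote-free text, a closing quote.
    private static final String QUOTE_PAIR = "[^\"]*\"[^\"]*\"";

    // A comma, then a lookahead: any number of quote pairs, then quote-free text to the end.
    private static final String SPLIT_REGEX =
            "," + "(?=" + "(?:" + QUOTE_PAIR + ")*" + "[^\"]*$" + ")";

    static List<String> split(String input) {
        return Arrays.asList(input.split(SPLIT_REGEX));
    }

    public static void main(String[] args) {
        String str = "abc,foo,c;bar=\"this,demo\",hello\"world,syzygy\"";
        for (String token : split(str)) {
            System.out.println(token);
        }
    }
}
```

Since the lookahead scans to the end of the string for every comma, this may get slow on very long inputs; for short strings like the one above it should be fine.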