Java Generics<Part-2>: Getting your hands dirty (Really?)

In my previous post – Java Generics: <Part-1>, we learnt the basics of Java Generics and how to use them. Now it’s time to do some actual work. In this post, we’ll learn how to write and instantiate our own generic types. Further down, we’ll also see how to write a generic method in a non-generic type, which is useful when you don’t want the entire class to be generic.

Writing your own Generic Type:

Generic classes are normal classes with one or more type parameters added to their definition. The type parameters are declared in angle brackets – <> – following the class name. Let’s see how you declare a very simple generic class:

public class Container<T> {
    private List<T> list;
    public Container() {
        this.list = new ArrayList<T>();
    }
    public List<T> getList() { return this.list; }
}

The Container class declares a single type parameter T. As you can see, the type of the field list is List<T>. We can use the type parameter declared in the class declaration in the members declared inside that class, though there are some restrictions, which we’ll see later on. We then initialize the list inside the constructor. Again we pass the same type parameter T as the type argument while instantiating the generic ArrayList<E> type. It is not really an actual type argument yet; it simply takes the place of E in ArrayList<E>. The type parameter T will be replaced by the actual type argument we use to instantiate our generic class, as shown below:

public static void main(String[] args) {
    Container<String> strContainer = new Container<String>(); 
    List<String> strList = strContainer.getList();
    Container<Integer> intContainer = new Container<Integer>();
    List<Integer> intList = intContainer.getList();
}

We have two concrete instantiations[1] of our generic container (also called concrete parameterized types) – one with String as the actual type argument, and one with Integer. Notice how the getList() method returns a List of the corresponding type. You can compile and execute that code and see that it works fine. It isn’t doing much at the moment, so you won’t get any output.

[1] As we’ll learn later, there is another way to instantiate a generic type, called wildcard instantiation (also called a wildcard parameterized type).

So, the basic idea is that we declare a generic class once and then instantiate it with different types based on our requirement. This makes our task simple: we don’t have to write a different container for, say, Integer and String. Apart from that, the code is type-safe: we know that a Container<String> holds Strings, so its getList() method will only ever return a List<String>.

All instantiations of a generic type share the same runtime type.

As we know, there is no difference between generic and non-generic code at runtime. All the generic type information is removed by the compiler during the type erasure process. That means all the instantiations of a generic type are the same at runtime. We say that generics are not reified. For example, List<String>, List<Integer>, List<Date>, etc. are all just List. They share the same runtime type. To see how, try the following code:

List<String> strList = new ArrayList<String>();
List<Integer> intList = new ArrayList<Integer>(); 
System.out.println(strList.getClass());   // class java.util.ArrayList
System.out.println(intList.getClass());   // class java.util.ArrayList
System.out.println(strList.getClass() == intList.getClass());   // true

Does this behaviour affect the way you can write code? Certainly, yes. The way Java Generics is implemented restricts you from doing certain things that might seem obvious at first look. Here’s the most common of them.

Generics are invariant:

This simply means that instantiating a generic type with two type arguments that are related by subtyping doesn’t make the resulting parameterized types related in the same way. So, even though Number is a superclass of Integer, List<Number> is not a supertype of List<Integer>. So, you cannot write code like this:

List<Number> list = new ArrayList<Integer>();  // Illegal: Won't compile

List<Integer> intList = new ArrayList<Integer>();
List<Number> numList = intList;                // Illegal too

So, polymorphism on the type argument doesn’t apply to generics. Now the question arises: why? Because of type erasure, the JVM cannot differentiate between a List<Number> and a List<Integer> at runtime; as already explained, both are just List. So, if the compiler allowed an assignment like the ones above, nothing at runtime could stop an element of the wrong type from getting into the list, and that may cause havoc later. Let me show you how.

Suppose you have a superclass Animal and its two subclasses – Dog and Cat. Now consider the code below:

List<Dog> dogs = new ArrayList<Dog>();
List<Animal> animals = dogs;    // Suppose it was allowed

// Below code is valid, since Cat is a subclass of Animal, we can keep it in a List<Animal>
animals.add(new Cat("mewww"));

// And we do this - get dog at index 0 from dogs
Dog dog = dogs.get(0);    // Oh Dear! You assigned a Cat to a Dog reference.

You see what happened? The last assignment would throw a ClassCastException at runtime. That is why the 2nd assignment (List<Animal> animals = dogs) is not allowed; the compiler stops you right there with a compile-time error.
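To watch this failure actually happen, the compile-time check can be bypassed with a raw-type cast. The following is only a sketch: the class InvarianceDemo, its nested Animal, Dog and Cat classes, and the method name are made up for this illustration, and the raw cast is there purely to simulate "suppose it was allowed".

```java
import java.util.ArrayList;
import java.util.List;

class InvarianceDemo {
    static class Animal {}
    static class Dog extends Animal {}
    static class Cat extends Animal {}

    // Returns true if reading the Cat back through the List<Dog> reference fails.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static boolean addingCatBreaksDogList() {
        List<Dog> dogs = new ArrayList<Dog>();
        // The raw cast simulates the assignment the compiler normally rejects.
        List<Animal> animals = (List) dogs;
        animals.add(new Cat());   // accepted: at runtime this is just a List
        try {
            Dog dog = dogs.get(0);   // the compiler-inserted cast to Dog fails here
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(addingCatBreaksDogList());
    }
}
```

Note that the add call itself succeeds; the error only surfaces later, at the point where the element is read back as a Dog.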
So, isn’t there any way to achieve polymorphic behaviour with generics? Of course there is. We’ll see how to achieve this using wildcards.

Generic Methods:

Generic methods are methods that declare type parameters. You declare the type parameters for a generic method in the same way you would for a generic type: inside angle brackets. The difference is that they go not after the method name, but just before the return type of the method.

Suppose you want to write a method to convert an array to a List. Without generics, you would have to write different methods for different types of array: one for String[] to List<String>, another for Integer[] to List<Integer>, and so on. For example, here is one that produces a List<Integer>:

public static List<Integer> toList(int... array) {
    List<Integer> list = new ArrayList<Integer>();
    for (int elem: array) {
        list.add(elem);
    }
    return list;
}

This might soon become cumbersome, as it’s not practical to have overloaded methods for every type. If only there were a way to have a single method handle all the types.

Generic method to the rescue:

So there it is. You can write a single generic method, which will handle all the types for you. This is how you write it:

@SafeVarargs   // Don't worry about this annotation. We'll discuss it later, why it's needed
public static <T> List<T> toList(T... array) {
    List<T> list = new ArrayList<T>();	
    for (T elem: array) {
        list.add(elem);
    }		
    return list;
}

The <T> before the return type of the method declares a type parameter T whose scope is confined to the method itself. It is similar to defining the formal parameters of a method. And similar to formal parameters, type parameters are part of the method signature. You can also declare multiple type parameters, just as you can declare multiple formal parameters. Just put them in the angle brackets, separated by commas: <T, S, U>.
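As a small illustration of multiple type parameters, here is a sketch of a generic method that declares two of them. The class PairingSketch and the method toMap are hypothetical names invented for this example; they are not part of any library.

```java
import java.util.HashMap;
import java.util.Map;

class PairingSketch {
    // <K, V> declares two type parameters, both scoped to this method only.
    static <K, V> Map<K, V> toMap(K[] keys, V[] values) {
        if (keys.length != values.length) {
            throw new IllegalArgumentException("keys and values must have the same length");
        }
        Map<K, V> map = new HashMap<K, V>();
        for (int i = 0; i < keys.length; i++) {
            map.put(keys[i], values[i]);
        }
        return map;
    }

    public static void main(String[] args) {
        // K is inferred as String and V as Integer from the argument types.
        Map<String, Integer> ages =
                toMap(new String[] {"a", "b"}, new Integer[] {1, 2});
        System.out.println(ages.get("b"));
    }
}
```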

How do you invoke a generic method?

Just like any normal method. The type parameters will be inferred from the types of the arguments you pass, much as they can be inferred while instantiating a generic type. So, if you pass an Integer[], T will be inferred as Integer; if you pass a String[], it will be inferred as String. As you may have noticed, I’ve used varargs as the formal parameter type. This gives us the flexibility to invoke the method without passing an explicitly created array. There can be some issues with using varargs of a type parameter or of a parameterized type, which we’ll discuss later. That is why I’ve used the @SafeVarargs annotation there.

So, this is how you would invoke the method:

public static void main(String... args) {
    List<String> strList = toList("a", "b", "c"); // T inferred as String. Return List<String>
    List<Integer> intList = toList(1, 2, 3);      // T inferred as Integer. Return List<Integer>
    List<Box> boxList = toList(new Box(5), new Box(10));  // T inferred as Box. Return List<Box>
}

So you can see the advantage: a single method is sufficient to fulfil our need. With this, I’ll wrap up this post.

Java Generics<Part-1>: A Basic Introduction

Welcome to the world of Java Generics, a feature introduced long back in Java 5, along with others like boxing/unboxing, the enhanced for-loop, enums, varargs, etc. Though it is already a fairly old feature, many still find some of its tricky parts difficult to digest. This is the first in a series of posts that I’ll probably write on this feature, aimed at both amateur and intermediate Java developers.

What is Java Generics?

We all know that writing bug-free code on the very first attempt is close to impossible. Even the most experienced programmers often have bugs in their code, and these bugs can increase the cost of the software we write. The cost grows with the delay in discovering a bug: the earlier a bug is detected, the lower the cost of fixing it.
Generics help us detect bugs early, at compile time, rather than letting them persist till runtime, by doing stronger type checks at compile time. Yes, generics are all about compile-time activity. They have no business at runtime.

How does generic code differ from non-generic code?

Let’s consider a very simple example, where you create a List, add an element, and then fetch it back. We will see code with and without generics (as you would write in Java versions < 5)

1. Before Java 5:

List list = new ArrayList();
list.add(new Integer(5));
// list.get(0) returns an Object, so you need a cast to Integer,
// and an explicit intValue() call, since there is no auto-unboxing before Java 5
int val = ((Integer) list.get(0)).intValue();

As you see, before generics were introduced, you had to cast the value fetched from the list to the appropriate type, since the list is nothing but a list of Object. Well and good so far.
But consider the case where you add a String to your list and cast it to an Integer while fetching:

List list = new ArrayList();
list.add("abc");                  // Will compile fine
int val = ((Integer) list.get(0)).intValue();   // ClassCastException at runtime

Now, there is no way for the compiler to know what type you actually get from the list, so this blows up at runtime with a ClassCastException.

Generics avoid this issue by providing the generic type List, which contains a type parameter as a part of its declaration – List<E>. The actual type argument is passed while instantiating the List and replaces the type parameter E. It specifies the type of element that will be stored in that list. We’ll discuss the significance of E later on. So, let’s see how the same code looks with generics:

2. Java 5 onwards:

List<Integer> list = new ArrayList<Integer>(); // Instantiate generic type List
list.add(5);  // Autoboxing from 5 to Integer reference
int val = list.get(0);  // No type cast needed now

Now, this code is much clearer, both to you and to the compiler. Passing the type argument Integer while instantiating the List specifies that the list will contain elements of type Integer. Since the compiler knows this, any attempt to add a type incompatible with Integer will result in a compile-time error. For the same reason, you don’t need a cast while fetching an element from the list. Note that the autoboxing feature introduced in Java 5 allows us to write add(5); instead of add(new Integer(5));. Now, modify the code and try adding a String to the list, or assigning the result of list.get(0) to a String, and see what happens.

List<Integer> list = new ArrayList<Integer>(); 
list.add("abc");  // This line won't compile.
int val = list.get(0);

So, it is interesting to see how generics enforce type checks at compile time to avoid a potential ClassCastException at runtime.

If you remember, I said that generics have no business at runtime. Even though there is a considerable difference in the code you write with and without generics, the generated bytecode is essentially the same for both. Surprised? Well, this is the tricky part to understand. The compiler removes all the generic type information from the code in a process called type erasure. We’ll discuss this in detail later on. So, there is no difference between the two versions shown above as far as runtime performance is concerned. Of course, the compiler adds the appropriate casts to keep the code type-correct.
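To make the "compiler adds the casts" remark concrete, here is a rough sketch that puts the generic source next to a hand-written approximation of its erased form. The class ErasureSketch and its method names are invented for this illustration, and the second method is only an approximation of what the compiler generates, not its literal output.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureSketch {
    // Generic version, as you would write it from Java 5 onwards.
    static int firstGeneric() {
        List<Integer> list = new ArrayList<Integer>();
        list.add(5);
        return list.get(0);
    }

    // Approximately what remains after erasure: raw types, explicit boxing,
    // and the cast that the compiler inserts on your behalf.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static int firstErased() {
        List list = new ArrayList();
        list.add(Integer.valueOf(5));
        return ((Integer) list.get(0)).intValue();
    }

    public static void main(String[] args) {
        System.out.println(firstGeneric() == firstErased());
    }
}
```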

A better example:

Suppose we want to create a list of 3 integers, then iterate over the list and print its contents. I’ll show you how the code would look both with and without generics. As an exercise, I won’t repeat the concepts already explained, and will let you figure out what is happening.

1. Before Java 5:

List intList = new ArrayList(Arrays.asList(
                   new Integer[] {
                       new Integer(1),
                       new Integer(2),
                       new Integer(3)
                   }
               ));

for (Iterator iter = intList.iterator(); iter.hasNext();) {
    // No auto-unboxing before Java 5, so unwrap the Integer explicitly
    int element = ((Integer) iter.next()).intValue();
    System.out.println(element);
}

Since there was no varargs or autoboxing before Java 5, you have to explicitly create a new Integer[] array to pass to the Arrays#asList(Object[]) method. You also had to create the Integer objects explicitly, like new Integer(1);. This is no longer the case. Let’s see the code from Java 5 onwards:

2. Java 5 onwards:

List<Integer> genericList = new ArrayList<Integer>(Arrays.asList(1, 2, 3));
for (int element: genericList) {
    System.out.println(element);
}

The features used in the above code are:

  • Generics
  • Varargs
  • Autoboxing
  • Enhanced for-loop

I’ll wrap up this post here. In the next post we’ll see how to write a generic type yourself.

Regex to split a String on comma outside double quotes.

Suppose you have a string like so:

String str = "abc,foo,c;bar=\"this,demo\",hello\"world,syzygy\"";

and you want to write a regex to split this string on commas. Well, no big deal. But notice that there are some commas inside the double quotes too. What if you want to ignore those commas while splitting?

Here’s working code that will do it:

String str = "abc,foo,c;bar=\"this,demo\",hello\"world,syzygy\"";
String[] tokens = str.split(",(?=([^\"]*\"[^\"]*\")*[^\"]*$)");
		
for (String token: tokens) {
    System.out.println(token);
}

This would give you the following output:

abc
foo
c;bar="this,demo"
hello"world,syzygy"

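So, what is that regex doing? The trick is to split only on commas that are followed by an even number of double quotes up to the end of the string; a comma inside a quoted section is followed by an odd number, so it is skipped. (Of course, this works only if the quotes are balanced.) Enjoy.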
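For reuse, the same pattern can be wrapped in a small helper with the pieces of the lookahead spelled out. This is a sketch: the class QuoteAwareSplitter and the method split are hypothetical names, and the only change to the regex is a non-capturing group (?:...) in place of the capturing one, which does not affect what it matches.

```java
import java.util.Arrays;
import java.util.List;

class QuoteAwareSplitter {
    // ,                      the comma to split on
    // (?=                    lookahead: the rest of the input must consist of
    //   (?:[^"]*"[^"]*")*    zero or more PAIRS of double quotes,
    //   [^"]*$               followed by quote-free text up to the end
    // )
    private static final String COMMA_OUTSIDE_QUOTES =
            ",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)";

    // Splits on commas that are not inside a balanced pair of double quotes.
    static List<String> split(String input) {
        return Arrays.asList(input.split(COMMA_OUTSIDE_QUOTES));
    }

    public static void main(String[] args) {
        for (String token : split("a,\"b,c\",d")) {
            System.out.println(token);
        }
    }
}
```

Because the lookahead scans to the end of the input for every comma, this approach is best kept for short strings; for large inputs a hand-written scanner or a CSV library may be a better fit.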

Check whether a String value is convertible to an instance of a Wrapper class given as Class object?

This is a follow-up to my answer to this Stack Overflow question. The question goes like this:

Suppose we have a String and a Class<?> object like this:

String string = "true";
Class<?> clazz = Integer.class;

How do we find out whether the value in string is convertible to an instance of the class given by Class<?> clazz?

The assumption here is that we are dealing with wrapper classes. It wouldn’t be that easy (if at all possible) to do this for an arbitrary class. Now, since the wrapper classes (with the exception of Character, which only offers valueOf(char)) have a static method valueOf(String) that converts a string value to an instance of that wrapper class, we can use a bit of reflection to get around this.

Here’s the working code:

import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

public class ReflectionTest {

    public static void main(String... args) throws IllegalAccessException {

        String string = "24";
        Class<?> clazz = Integer.class;

        Method method = null;

        try {
            // Look up the public static valueOf(String) method of the wrapper class
            method = clazz.getMethod("valueOf", String.class);
        } catch (NoSuchMethodException | SecurityException e) {
            System.out.println(e.getMessage());
        }

        if (method != null) {
            try {
                // null as the first argument, because valueOf is static
                Object obj = method.invoke(null, string);
                System.out.println("String value converted to " +
                                   clazz.getSimpleName() +
                                   " instance: " + obj);

            } catch (InvocationTargetException ex) {
                // An exception thrown by valueOf itself (e.g. NumberFormatException)
                // arrives wrapped in an InvocationTargetException
                System.out.println(ex.getCause().getMessage());
                System.out.println("Failure : " + string +
                                   " is not of type " +
                                   clazz.getSimpleName());
            }
        }
    }
}

Assuming that there is no SecurityManager to prevent you from running this code, it will print:

Output:
String value converted to Integer instance: 24

So, what is that code actually doing? We’ll go through it step by step in the following paragraphs.

We already have the Class object for the Integer class. If you look at Integer, or any of the other numeric wrapper classes, you’ll find a static method valueOf(String), which creates a wrapper instance from the passed String and throws a NumberFormatException if the string cannot be converted.

We then use Class#getMethod(String, Class<?>...) method to get a Method instance for valueOf method in the following part:

method = clazz.getMethod("valueOf", String.class);

After we have got the Method instance, we try to invoke the method using Method#invoke(Object, Object...) method, passing null as first argument (To invoke a static method, we pass null as first argument), and the given String as second argument. This is done in following part of the code:

Object obj = method.invoke(null, string);

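If the method invocation is successful, and the string value can be converted to an Integer, the next statement in the try block is executed, and the success message is printed. If the conversion fails, valueOf throws a NumberFormatException. Note that reflection does not let it propagate directly: Method#invoke wraps it in an InvocationTargetException, so that is the exception a caller has to catch in order to report the failure.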

The above code succeeds, as the string "24" is parseable as an Integer. You can try the code with an invalid integer value in the String and see whether an exception is thrown.

Final Note: An important point to note here is that for the Boolean wrapper, any value other than "true" (compared ignoring case) will give you Boolean.FALSE. Even a null argument gives false. The Boolean#valueOf(String) method doesn’t throw any exception.
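The steps above can also be packaged as a yes/no check. The following is a sketch under the same assumption as the rest of this post (wrapper classes only); the class WrapperConversion and the method isConvertible are hypothetical names, not part of the JDK.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

class WrapperConversion {
    // Returns true if wrapper.valueOf(String) exists and accepts the given value.
    static boolean isConvertible(String value, Class<?> wrapper) {
        try {
            Method valueOf = wrapper.getMethod("valueOf", String.class);
            valueOf.invoke(null, value);   // null target: valueOf is static
            return true;
        } catch (InvocationTargetException e) {
            // valueOf itself threw, e.g. a NumberFormatException
            return false;
        } catch (NoSuchMethodException | IllegalAccessException e) {
            // no usable valueOf(String), e.g. for Character
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isConvertible("24", Integer.class));
        System.out.println(isConvertible("true", Integer.class));
        // Boolean never rejects its input, so this check cannot tell
        // "abc" apart from a genuine boolean value.
        System.out.println(isConvertible("abc", Boolean.class));
    }
}
```

Given the Boolean behaviour described in the final note, a stricter check for Boolean would have to compare the input against "true" and "false" explicitly.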

What’s wrong with calling Overridable method in constructor?

This is a follow-up to my answer on this Stack Overflow question. Basically, the question goes like this:

The application is doing this when creating an instance of the class B using a load class by name method.

  • Calls overridden load() in class B, from A class constructor ( B extends A)
  • Initializes variables (calls “private string testString = null” according to debugger), nulling them out

The major issue with the code in that post is that the constructor of the non-final base class (A) invokes a non-final method, which is overridden in the derived class (B). Due to this, the code behaves unexpectedly. I’ll explain the issue in detail in this post. But first of all, you should understand (if you don’t already) what happens behind the scenes when you create an object of a class. For that, I suggest you go through my last post – Object creation process: Inheritance. If you already know about the process, you can proceed further.

When a method of a class is called, it expects that the instance on which it is called is completely initialized, so that it can work freely on the data (fields) of that class. Now, as explained in my last post (linked above), when you create an instance of a class, the superclass constructors run first to initialize the superclass members, and only then does the class’s own constructor go on to initialize its own fields. So, when an overridable method is called from the base class constructor, the overriding version in the derived class is invoked rather than the base class version. But at that point, the fields of the derived class have not been initialized yet for the current instance (as the superclass constructor has not finished yet). That can cause trouble if the method uses those instance fields. Let’s understand it with the help of an example:

Code Talks Better: -

abstract class Operation {

    public Operation() {
        divide();
    }
    abstract int divide();
}

class Division extends Operation {

    private int numerator = 0;
    private int denominator = 0;

    public Division(int numerator, int denominator) {
        super();
        this.numerator = numerator;
        this.denominator = denominator;
    }

    int divide() {
        return numerator / denominator;
    }
}

Now from the main method, we create an instance of Division class:

public static void main(String... args) {
    Operation division = new Division(4, 2);
}

When you run the code, you shouldn’t be surprised by the output:

Exception in thread "main" java.lang.ArithmeticException: / by zero
	at Division.divide(Main.java:75)
	at Operation.<init>(Main.java:58)
	at Division.<init>(Main.java:69)
	at Main.main(Main.java:84)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:491)
	at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120)

So, what do you think happened? Basically, when the “Operation” class constructor invoked the “divide()” method, the overriding method in the “Division” class was called. At that point, “numerator” and “denominator” had not yet been assigned by the “Division” class constructor, so they still held their default value 0. And thus you got the “/ by zero” exception, as “denominator” was 0.
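The same effect can be observed without an exception, and it also affects fields that have an initializer. In the sketch below (Template and NamedTemplate are made-up classes for this illustration), the base constructor records what the overriding method returns while the subclass field is still at its default value null.

```java
class Template {
    final String seenDuringConstruction;

    Template() {
        // Calls the overriding version in the subclass, before the
        // subclass's field initializers have run.
        seenDuringConstruction = describe();
    }

    String describe() {
        return "template";
    }
}

class NamedTemplate extends Template {
    private String name = "derived";   // assigned only after Template() returns

    @Override
    String describe() {
        return "name=" + name;
    }
}

class ConstructorLeakDemo {
    public static void main(String[] args) {
        NamedTemplate t = new NamedTemplate();
        System.out.println(t.seenDuringConstruction);   // value captured too early
        System.out.println(t.describe());               // value after construction
    }
}
```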

So, the moral of the post is: -

  • Never invoke a non-final method of a non-final class inside its constructor. That method might be overridden in a derived class, where it might use fields that haven’t been initialized yet.
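One possible way to follow this rule for the division example is sketched below. It is only one option among several, and the names SafeOperation and SafeDivision are invented here: the base constructor no longer calls anything overridable, and divide() is invoked by the caller once the object is fully constructed.

```java
abstract class SafeOperation {
    // No constructor logic here: nothing overridable runs during construction.
    abstract int divide();
}

final class SafeDivision extends SafeOperation {
    private final int numerator;
    private final int denominator;

    SafeDivision(int numerator, int denominator) {
        this.numerator = numerator;
        this.denominator = denominator;
    }

    @Override
    int divide() {
        return numerator / denominator;
    }
}

class SafeDivisionDemo {
    public static void main(String[] args) {
        SafeOperation division = new SafeDivision(4, 2);
        // Called only after both fields have been assigned.
        System.out.println(division.divide());
    }
}
```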

Object creation process: Inheritance

We know that every class in Java is part of an inheritance hierarchy: either it extends a superclass explicitly, or it extends the Object class implicitly. So, inheritance is always there. But most beginners start to get confused about the object creation process only when they actually learn about inheritance by extending an explicit superclass. A few of the doubts that might come to their minds are: -

  • How can we access this keyword in the constructor, even though the object has not been constructed yet?
  • When we instantiate a subclass (which they always do, but don’t know initially), does the super class also get instantiated?
  • If the 2nd point is true, then how on earth does the super class get instantiated, if it is an abstract class?

Quite interesting questions, aren’t they? Well, I’ll talk about them towards the end of this post, after I explain the bits and pieces about the Object Creation Process.

So, when is an Object created?

Object creation can be explicit or implicit: -

  • Explicit Object Creation takes place when we write a Class Instance Creation expression (one where you make use of the new keyword). In its simplest form, a Class Instance Creation expression looks like: –
    new Apple();

    This assumes we have a class named Apple, either in the default package or in a package imported at the top of the class. The rest of the post elaborates on this process only. Also note that I have not assigned the newly created object to any reference here, because that is not needed.

  • Implicit Object Creation, on the other hand, keeps on happening throughout our code. Some of the examples where an implicit object is created are: –
    • When you perform a String Concatenation operation (excluding the one that takes place between compile-time constants, viz, string literals), a new String object is created for each concatenation (When I say, new object is created, I mean object that is created on Heap and not on literal pool).
    • When you invoke a method that produces a modified value on an immutable object, e.g. a String, the result is returned as a new object, because the original cannot be changed. (Such a method may still return the same object when there is nothing to change.)

    Likewise, there are many more situations where a new object is created, and most of the time it goes unnoticed by beginners. Anyway, this is not the subject of this post, so let’s not extend it any further and move ahead with the actual topic.
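Before moving on, the two cases of implicit creation listed above can be checked with reference comparisons. This is a small sketch; the class ImplicitCreationSketch and its method names are made up for this illustration.

```java
class ImplicitCreationSketch {
    // "a" + "b" is a compile-time constant, so no new object is created:
    // it is the very same pooled String as the literal "ab".
    static boolean constantConcatIsSameObject() {
        String literal = "ab";
        String constant = "a" + "b";
        return literal == constant;
    }

    // part is an ordinary (non-final) variable, so the concatenation
    // happens at runtime and yields a new String with equal contents.
    static boolean runtimeConcatIsNewObject() {
        String literal = "ab";
        String part = "a";
        String joined = part + "b";
        return joined != literal && joined.equals(literal);
    }

    // toUpperCase() cannot modify the immutable original, so it returns
    // a different object and leaves the original untouched.
    static boolean modifyingMethodReturnsNewObject() {
        String lower = "abc";
        String upper = lower.toUpperCase();
        return upper != lower && lower.equals("abc");
    }

    public static void main(String[] args) {
        System.out.println(constantConcatIsSameObject());
        System.out.println(runtimeConcatIsNewObject());
        System.out.println(modifyingMethodReturnsNewObject());
    }
}
```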

Explicit Object Creation (Behind-the-scenes stuff): -

The object creation process involves several behind-the-scenes steps, which I’ll try to explain in the most basic form.

The first and foremost thing to understand is that the object is created by the new operator and not by the constructor. The constructor is only used to initialize the state of the newly created object.
Whenever a class instance is created (by the new operator), memory is allocated for it, including memory for all the instance fields declared in that class and in all of its superclasses. Then all the instance fields of that particular instance are initialized with their default values.

At this point, the object is already created. Now, for this newly created object, the constructor following the new keyword in the object creation expression is invoked to initialize the state of the object. That constructor in turn can invoke another constructor of the same class explicitly using this() call, passing appropriate arguments if needed. If the constructor does not invoke any constructor using this(), then an implicit or explicit call to the immediate superclass constructor is made using super() call.

Note that every constructor (other than that of Object) begins with either a this(parameters) call or a super(parameters) call as its first statement. If we don’t write either of them, the compiler inserts a super() call as the first statement, which invokes the 0-arg constructor of the immediate superclass.

Eventually, at the end of this chain of constructor invocations, the last constructor to be invoked is that of the Object class. So, the first constructor body to actually execute is the Object class constructor. Then the rest of the body of each subclass constructor is executed, in order from the top of the hierarchy down.

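If a class defines an instance initializer block, the compiler effectively adds that block to every constructor of that class that begins with a super call, immediately after that call. A constructor that starts with a this call does not run it again, since the constructor it delegates to already does. Thus, the instance initializer blocks of every class in the hierarchy are executed exactly once, superclass first.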
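The order described above can be made visible by logging each step. In this sketch, InitOrderParent, InitOrderChild and the trace() helper are hypothetical names used only for the demonstration.

```java
import java.util.ArrayList;
import java.util.List;

class InitOrderParent {
    static final List<String> LOG = new ArrayList<>();

    { LOG.add("Parent initializer"); }

    InitOrderParent() {
        LOG.add("Parent constructor");
    }
}

class InitOrderChild extends InitOrderParent {
    { LOG.add("Child initializer"); }

    InitOrderChild() {
        this("default");   // delegates; the initializer is not run again here
        LOG.add("Child no-arg constructor");
    }

    InitOrderChild(String label) {
        super();           // the initializer block runs right after this call
        LOG.add("Child constructor(" + label + ")");
    }

    // Creates one instance and returns the steps in the order they happened.
    static List<String> trace() {
        LOG.clear();
        new InitOrderChild();
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        for (String step : trace()) {
            System.out.println(step);
        }
    }
}
```

Even though two constructors of InitOrderChild run for a single object, its initializer block appears only once in the log.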

If an exception is thrown at any point during this recursive invocation and execution of the constructors in the inheritance hierarchy, that particular constructor invocation ends abruptly, and with it the whole procedure.

Finally, when all the constructors complete the execution process successfully, the new operator returns a reference to the object.

Code Talks Better: -

package com.rjcodeblog;

import java.math.BigDecimal;

class Fruit {
    /** Private instance variable **/
    private BigDecimal pricePerKg;

    public Fruit() {
        this.pricePerKg = BigDecimal.ZERO;
        System.out.println("Superclass 0-arg constructor");
    }

    public Fruit(BigDecimal pricePerKg) {
        this.pricePerKg = pricePerKg;
        System.out.println("Superclass parameterized constructor");
    }
}

class Apple extends Fruit {
    /** Private instance variable **/
    private String color;

    /** A no-arg constructor **/
    public Apple() {
        this.color = "";
        System.out.println("Subclass 0-arg constructor");
    }

    public Apple(String color, BigDecimal pricePerKg) {
        super(pricePerKg);  // If we don't add this, compiler will add super()
        this.color = color;
        System.out.println("Subclass parameterized constructor");
    }

    /** Getters and Setters **/
}

public class Juicer {
    public static void main(String[] args) {
        Apple fake = new Apple();
        Apple real = new Apple("red", new BigDecimal("50"));
    }
}

So, we have a superclass, Fruit, with 1 instance field and 2 constructors: 0-arg and parameterized. In addition, we have one subclass, Apple, with 1 instance field and 2 constructors. Notice that we have added an explicit super call in the parameterized constructor. This is because, if we don’t add it, the compiler will add a super call to the 0-arg constructor of the super class, which we don’t want.

So, by now you might have worked out the output of the above program, right? Let’s see if you got it right. This is what I got: -

Superclass 0-arg constructor
Subclass 0-arg constructor
Superclass parameterized constructor
Subclass parameterized constructor

So, this is all on this topic for now. Note that I’ve not covered all the aspects of the complete process, but only given a short overview. If you want to learn more about this topic, you can always refer to the Java Language Specification.

Further Reading: -

Now, to end this post, let’s get back to the questions that I pointed out at the beginning of this post: -

  • How can we access this keyword in the constructor, even though the object has not been constructed yet?

Now you understand that the constructor is not responsible for the creation of the object; it’s the new operator that creates it. So, it is probably clear that by the time the constructor executes, the object has already been created, and this refers to that object.

  • When we instantiate a subclass (which they always do, but don’t know initially), does the super class also get instantiated?

No. As I said earlier, an object is created only of the class whose constructor is written as part of the instance creation expression, that is, the class named after the new keyword. Then, on the newly created object, the constructor of that class, and eventually those of all its superclasses, are invoked to initialize the state of that object.

  • If the 2nd point is true, then how on earth does the super class get instantiated, if it is an abstract class?

I think you’ve got the answer to this one now. Still, I will restate it for my own sake: the abstract class is not being instantiated; rather, only its constructor is invoked to initialize the part of the object’s state that is declared in that class.
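A quick way to convince yourself of this is to ask the object for its class from inside the abstract class’s constructor. In the sketch below, Shape and Circle are hypothetical classes; getClass() is final in Object, so calling it from a constructor is safe.

```java
abstract class Shape {
    final String classSeenInConstructor;

    Shape() {
        // "this" is already the one and only object, and it is a Circle,
        // even though we are executing the constructor of abstract Shape.
        classSeenInConstructor = getClass().getSimpleName();
    }
}

class Circle extends Shape {
}

class AbstractConstructorDemo {
    public static void main(String[] args) {
        Shape shape = new Circle();
        System.out.println(shape.classSeenInConstructor);
    }
}
```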
