Introduction to Collections

In Java Collections means collections of Objects. Collections Frameworks in Java provides set of APIs using which we can group objects together and then iterate through it and other operations such as add more objects to the collection, search or remove the objects from it etc.

The Collections Framework in Java took shape in JDK 1.2 and then expanded in JDK 1.4, JDK 1.5, JDK 1.6, JDK 1.7 and then in JDK 1.8 too.

The Collections Framework provides lists, sets, maps and queues for grouping objects. They all does the same job but you will have to choose the one according to your programming demands.

Why do we need Collections ?

Just imagin what if flipkart or amazon uses arrays to group products ? They might be having millions of products in their website, isn’t it ?. Suppose if lets say they created a class called Product and created array of size TOTAL_NO_OF_PRODCUTS to represent each product and then user request only for Android Smartphone with screen size 5″ or greater. It takes the gut out of you to search for matching products in that array isn’t ?

Are you really that courageous ?

That would chill your spine is if some one ask you to add new product to 10,000 position of that array :-). I hope you have an idea why do we need Collections.

What we can do using Collections Frameworks ?

Below is the list of some basic operation we do with Collections

  • Add objects to the collection.
  • Remove objects from the collection.
  • Find out if an object ( or group of objects ) is in the collection.
  • Retrieve an object from the collection without removing it.
  • Iterate through the collection, looking at each element (object) one after another.

What does Collections Framework contains ?

Like I told you earlier it contains group of APIs. When we say APIs we know that it’s Interface and Classes. In Collections Framework set of Interface and Classes tailored together to group the object and organize them in some fashion.

Basically it starts with group of interfaces and then loads of loads of Concrete Classes which implements it. From interview and exam perspective you need to be aware of below 9 core interfaces.

Collection Set SortedSet
List Map SortedMap
Queue NavigableSet NavigableMap

And below is list of 13 core Concrete Classes and most implements the above interfaces.

Maps Sets Lists Queues Utilities
HashMap HashSet ArrayList PriorityQueue Collections
Hashtable LinkedHashSet Vector Arrays
TreeMap TreeSet LinkedList
LinkedHashMap

The interface and class hierarchy for collections

Important Note: You might easily get confused between “Collections” and “Collection”. Here “Collections” is a utility class with many static methods where as “Collection” is a interface.

All the classes and interfaces of Collections Frameworks resides in java.util package.

Collections comes in four basic flavours:

  1. Lists : List of Objects ( classes that implements List interface )
  2. Sets : Uniques Objects ( classes that implements Set interface )
  3. Maps : Where Objects are stored in key and value pair ( Classes that implements Map interface )
  4. Queues : Objects arranged in a particular fashion eg : FIFO, or based on priority.

4 Sub flawours <<@vinay to do if required >>


Working with List

List interface :

A List cared about the index. One thing that List got but not Queue or Set is the extra methods related to index. Those key method includes like get(int index), indexOf(Object o), add(int index, Object obj) and so on.

We had enough theory 🙂 let’s spice things up with some Action 🙂

ArrayList example :

video

Generics :

video

video

You can think of ArrayList as a dynamically growing array. Has we seen in the last example we been just calling add() method and we never defined any size as such. If we have to store 10k+ objects in collection then we can simple keep calling add() method 10k+ times. Isn’t it wonderful 🙂 ?

Let’s list the difference between Arrays and ArrayList in Java so that we can know the advantages and disadvantage of choosing ArrayList over Arrays

  • First and Major difference between Array and ArrayList in Java is that Array is a fixed length data structure while ArrayList is a variable length Collection class. You can not change length of Array once created in Java but ArrayList re-size itself when gets full depending upon capacity and load factor. Since ArrayList is internally backed by Array in Java, any resize operation in ArrayList will slow down performance as it involves creating new Array and copying content from old array to new array.
  • Another difference between Array and ArrayList in Java is that you can not use Generics along with Array, as Array instance knows about what kind of type it can hold and throws ArrayStoreException, if you try to store type which is not convertible into type of Array. ArrayList allows you to use Generics to ensure type-safety.
  • You can also compare Array vs ArrayList on How to calculate length of Array or size of ArrayList. All kinds of Array provides length variable which denotes length of Array while ArrayList provides size() method to calculate size of ArrayList in Java.
  • You can not store primitives in ArrayList, it can only contain Objects. While Array can contain both primitives and Objects in Java. Though Autoboxing of Java 5 may give you an impression of storing primitives in ArrayList, it actually automatically converts primitives to Object.

Example:

Example was given in the first video of this lesson

  • We use add() method to insert element into ArrayList where as in Arrays we simply use assignment operator to store the element.
  • You can create instance of ArrayList without specifying size, Java will create ArrayList with default size but its mandatory to provide size of Array. By the way you can also initialize ArrayList while creating it.
In terms of performance Array and ArrayList provides similar performance in terms of constant time for adding or getting element if you know index. Though automatic resize of ArrayList may slow down insertion a bit Both Array and ArrayList is core concept of Java and any serious Java programmer must be familiar with these differences between Array and ArrayList or in more general Array vs List.  To know more about the efficiency of Arrays vs ArrayList go to this stackoverflow discussion.
video

Vector :

Vector API is one of the first API introduced in Collections Framework. Vector & HashTable were the two original collections; the remaining APIs were added in Java 2 (JDK 1.2).

A Vector is basically same as ArrayList, but vector methods are synchronized for thread safety.

We normally do not use Vector because the synchronized methods add a performance overhead which might not need. And if we need thread safety, there are utility methods in class Collections that can help.

Vector is the only class other than ArrayList which implements RandomAccess Interface. RandomAccess is a Marker interface used by List implementations(concrete classes that implement List) to indicate that they support fast (generally constant time) random access.

ArrayList vs Vector :

There are only 2 differences

1. Synchronization

If multiple threads access an ArrayList concurrently then we must externally synchronize the block of code which modifies the list either structurally or simply modifies an element. Structural modification means addition or deletion of element(s) from the list. Setting the value of an existing element is not a structural modification.

Collections.synchronizedList is normally used at the time of creation of the list to avoid any accidental unsynchronized access to the list.

2. Data growth

Internally, both the ArrayList and Vector hold onto their contents using an Array. When an element is inserted into an ArrayList or a Vector, the object will need to expand its internal array if it runs out of room. A Vector defaults to doubling the size of its array, while the ArrayList increases its array size by 50 percent.

(source)

LinkedList :

Just like ArrayList, LinkedList is also ordered by index position but in LinkedList elements are double linked to one another.

This linkage gives you new methods to insert or remove objects to the beginning or to the end of the List which makes it easy for us to create Stack or Queue data structure. Basically LinkedList class has got some extra methods other than the one found in List interface. For example addFirst(), addLast(), getFirst(), getLast(), removeFirst() & removeLast().

In Java 5, LinkedList has been enhanced to implement Queue interface so it’s now has got behaviours related to Queue such as peek(), poll() & offer().

ArrayList vs LinkedList :

video

1) Search: ArrayList search operation is pretty fast compared to the LinkedList search operation. get(int index) in ArrayList gives the performance of O(1) while LinkedList performance is O(n).

Reason: ArrayList maintains index based system for its elements as it uses array data structure implicitly which makes it faster for searching an element in the list. On the other side LinkedList implements doubly linked list which requires the traversal through all the elements for searching an element.

  • Random access works faster in ArrayList.
  • Random access works slow in LinkedList.
  • Sequential access works slow in ArrayList.
  • Sequential access works fast in LinkedList.

2) Deletion: LinkedList remove operation gives O(1) performance while ArrayList gives variable performance: O(n) in worst case (while removing first element) and O(1) in best case (While removing last element).

Conclusion: LinkedList element deletion is faster compared to ArrayList.

Reason: LinkedList’s each element maintains two pointers (addresses) which points to the both neighbor elements in the list. Hence removal only requires change in the pointer location in the two neighbor nodes (elements) of the node which is going to be removed. While In ArrayList all the elements need to be shifted to fill out the space created by removed element.

  • Deleting the object from LinkedList is lot faster compared to ArrayList,

3) Inserts Performance: LinkedList add method gives O(1)performance while ArrayList gives O(n) in worst case. Reason is same as explained for remove.

  • Inserting the object from LinkedList is lot faster compared to ArrayList.

4) Memory Overhead: ArrayList maintains indexes and element data while LinkedList maintains element data and two pointers for neighbor nodes hence the memory consumption is high in LinkedList comparatively.

  • LinkedList consumes more memory compared to ArrayList.

5) Behaviours: LinkedList provides few extra methods other than the one present in List interface which we listed earlier in our notes.

There are few similarities between these classes which are as follows:

  1. Both ArrayList and LinkedList are implementation of List interface.
  2. They both maintain the elements insertion order which means while displaying ArrayList and LinkedList elements the result set would be having the same order in which the elements got inserted into the List.
  3. Both these classes are non-synchronised and can be made synchronised explicitly by using Collections.synchronizedList method.
  4. The iterator and listIterator returned by these classes are fail-fast (if list is structurally modified at any time after the iterator is created, in any way except through the iterator’s own remove or add methods, the iterator will throw a ConcurrentModificationException).

When to use LinkedList and when to use ArrayList?

1) As explained above the insert and remove operations give good performance (O(1)) in LinkedList compared to ArrayList(O(n)). Hence if there is a requirement of frequent addition and deletion in application then LinkedList is a best choice.

2) Search (get method) operations are fast in Arraylist (O(1)) but not in LinkedList (O(n)) so If there are less add and remove operations and more search operations requirement, ArrayList would be your best bet.

For more information visit

When to use a linked list over an array arraylist

When to use linkedlist over arraylist


Iterating through Collection

 

Using for loop is not the best way to iterate through the collection. There is a method called iterate() in List, Set, Queue and even in Map which returns the object of type Iterator.

Iterator is an interface which has got only three methods out of which 2  are used for iterating through collection and one is to remove object from collection.

Note: There is remove() method in all Collections APIs itself. There remove() method of Iterator do not return the object which it removes from collection but remove() method of Collections APIs do return the object which has been removed.

Using for loops is not the best way to iterate through Collection because it’s always random access. Iterator provides the best possible implementation to iterate through respective collection. For example ArrayList will provide the best possible implementation which works best with arrays where as LinkedList does something more appropriate to its data structure and so on.

Note: foreach loop internally uses iterator itself. We choose foreach loop only for the readability purpose.

More info : Iterator vs For loop


Generics

We already discussed about Generics in “Working with List” unit which is more than enough for interviews and exams.

 

@vinay : to add <?> in future ( least important )


Working with Set

Set Collection will never have duplicates. It uses equals() method to determine if two objects are identical or not.

Below are the 3 classes that implement Set interface.

HashSet :

A HashSet is an unsorted and unordered Set. It uses hashCode() of the object that you are going to insert. So more efficient your hashCode() method implementation is, the better the access performance you’ll get. Use this class when you want a Collection with no duplicates and don’t care about the order when you iterate through it.

Output :

As you can see in the above output there is not order when we iterate through.

LinkedHashSet :

LinkedHashSet is an ordered version of HashSet which maintains a doubly linked List across all elements. Use this class instead of HashSet when you want an order while you iterate through the collection.

LinkedHashSet maintains the order in which you insert the objects.

Note: When using HashSet or LinkedHashSet, the objects added to them must override both hashcode() and equals(). If they don’t override hashCode(), the default Object.hashCode() method will allow multiple objects to be inserter which are “meaningfully equal”.

For example, below code  allows duplicate objects even though their contents are same.

public class Student {

    private int id;
    private String name;

    public Student(int id, String name) {
        this.id = id;
        this.name = name;
    }

    public Student() {
    }

    public int getId() {
        return id;
    }

    public void setId(int id) {
        this.id = id;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    @Override
    public String toString() {
        return id + " " + name;
    }
}

 

public class SetDemo {

    public static void main(String[] args) {

        Set<Student> myStudents = new HashSet<Student>();
                                  //It doesn't matter if it HashSet or
                                  //LinkedHashSet
        myStudents.add(new Student(1,"Vinay"));
        myStudents.add(new Student(2,"Vivaan"));
        myStudents.add(new Student(1,"Vinay")); //its get inserted
        //as we didn't override hashCode() method in Student


        for(Student student: myStudents)
            System.out.println(student);
    }
}

Output :

As we can see in the above output the object with same data is repeated as hashCode() method of Object class returns different values for those objects.

How do solve this ?

Simple we have to override hashCode() method in Student class which compares the values and returns same value if the are meaningfully equal.

below is the modified version of Student class

//Student class Version 2
public class Student {

    private int id;
    private String name;

    @Override
    public boolean equals(Object obj) {
        if(obj instanceof Student){
            Student student = (Student) obj;
            if(student.id == this.id)
                return true;
        }
        return false;
    }


    @Override
    public int hashCode() {
        return id * 111;
    }


    public Student(int id, String name) {
        this.id = id;
        this.name = name;
    }

    public Student() {
    }

    public int getId() {
        return id;
    }

    public void setId(int id) {
        this.id = id;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    @Override
    public String toString() {
        return id + " " + name;
    }
}

 

TreeSet :

TreeSet is one among the two Sorted Collections API, the other one is TreeMap.

It used Red-Black tree structure and guarantees that the elements will be in ascending order. You can control what way you want to have it sorting by using one of it’s constructor by implementing an interface called Comparator which is something I am going to teach you in Sorting in Collections chapter.

TreeSet expect you to send the objects which is also an instance of type Comparable to sort them.

Research Assignment: What is Red-Black tree structure ?

Simple example for TreeSet

As you can see in the above output, the objects are sorted in alphabetical order. But why in alohabetical order why in based on length. It’s because TreeSet called compareTo() method of String object and based on it output it sorts the collection.

In order to illustrate it I have written one more example where I have implemented Comparable interface but I do not expect you understand this now. For now just note it down in your notebook but come back to this after Sorting Collections chapter is over.

public class Product implements Comparable<Product>{

    private String name;
    private double price;

    public Product(String name, double price) {
        this.name = name;
        this.price = price;
    }

    public Product() {
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public double getPrice() {
        return price;
    }

    public void setPrice(double price) {
        this.price = price;
    }

    @Override
    public int compareTo(Product product) {
        if(this.price > product.getPrice())
            return 1;
        else  if(this.price < product.getPrice())
            return -1;
        else
            return 0;
    }
}

 

 

As you can see in the output(from now on I will not add output but I expect you to write the program and have to output with you), the products are sorted based on their price because I told TreeSet to do so by specifying it in compareTo() method of Product class :-). If you don’t understand this then nothing to worry just come back here after completing “Sorting in Collections” chapter.

 

 

HashSet vs LinkedHashSet vs TreeSet

HashSet LinkedHashSet TreeSet
How they work internally? HashSet uses HashMap internally to store it’s elements. LinkedHashSet uses  LinkedHashMap internally to store it’s elements. TreeSet uses TreeMap internally to store it’s elements.
Order Of Elements HashSet doesn’t maintain any order of elements. LinkedHashSet maintains insertion order of elements. i.e elements are placed as they are inserted. TreeSet orders the elements according to supplied Comparator. If no comparator is supplied, elements will be placed in their natural ascending order.
Performance HashSet gives better performance than the LinkedHashSet and TreeSet. The performance of LinkedHashSet is between HashSet and TreeSet. It’s performance is almost similar to HashSet. But slightly in the slower side as it also maintains LinkedList internally to maintain the insertion order of elements. TreeSet gives less performance than the HashSet and LinkedHashSet as it has to sort the elements after each insertion and removal operations.
Insertion, Removal And Retrieval Operations HashSet gives performance of order O(1) for insertion, removal and retrieval operations. LinkedHashSet also gives performance of order O(1) for insertion, removal and retrieval operations. TreeSet gives performance of order O(log(n)) for insertion, removal and retrieval operations.
How they compare the elements? HashSet uses equals() and hashCode() methods to compare the elements and thus removing the possible duplicate elements. LinkedHashSet also uses equals() and hashCode() methods to compare the elements. TreeSet uses compare() or compareTo() methods to compare the elements and thus removing the possible duplicate elements. It doesn’t use equals() and hashCode() methods for comparision of elements.
Null elements HashSet allows maximum one null element. LinkedHashSet also allows maximum one null element. TreeSet doesn’t allow even a single null element. If you try to insert null element into TreeSet, it throws NullPointerException.
Memory Occupation HashSet requires less memory than LinkedHashSet and TreeSet as it uses only HashMap internally to store its elements. LinkedHashSet requires more memory than HashSet as it also maintains LinkedList along with HashMap to store its elements. TreeSet also requires more memory than HashSet as it also maintains Comparator to sort the elements along with the TreeMap.
When To Use? Use HashSet if you don’t want to maintain any order of elements. Use LinkedHashSet if you want to maintain insertion order of elements. Use TreeSet if you want to sort the elements according to some Comparator.

Working with Queue

A Queue is designed to hold a list of objects presented in some order which is typically FIFO ( first-in, first-out).

What ever the method which we seen so far in collection are also available in Queue and Queue itself have some methods to add and subtract elements and to review queue elements.

LinkedList

In Java 5 LinkedList has been enhanced to implement Queue so LinkedList acts as both List as well as Queue. LinkedList handles the basic feature of Queue i.e FIFO.

Example:

public class Patient {

        private int id;
        private String name;

        public Patient() {
        }

        public Patient(int id, String name) {
            this.id = id;
            this.name = name;
        }

        public int getId() {
            return id;
        }

        public void setId(int id) {
            this.id = id;
        }

        public String getName() {
            return name;
        }

        public void setName(String name) {
            this.name = name;
        }
    }

Since its FIFO Roy is polled out of Queue first then Joy and at last Jack.

 

PriorityQueue

This API was introduced in JDK 1.5 whose purpose is to provide “priority-in, priority out” implementation of the Queue.

PriorityQueue gives priority based on the natural order (Comparable) or using its constructor we can customize the priority using Comparator interface.

You may not understand the below example now but surely you will do once you complete “Sorting in Collections” chapter. Once you complete that chapter, please come back here again 🙂

Priority Queue Example 1

We see the output as

Bruce Lee
Gowtham Budha
Lobsang Rampa
Swami Vivekananda

because priority is given based on natura order (Comparable Interface). Basically we have object of type String in this collections. String class implements Comparable Interface and override compareTo(String obj) method where it compares the strings based on alphabets.

 

Another Example :

 

package com.felight.collections.queue;

/**
 * Created by Vinay Noah
 * For The Best Online Training with quick professional
 * developer support visit www.felight.io
 * <p>
 * For 100% Job Assured Java Training visit www.felight.com
 * youtube.com/felight
 * +91 96116 96116, +91 9611406060
 */



public class Patient implements Comparable<Patient> {

    private int id;
    private String name;

    public Patient() {
    }

    public Patient(int id, String name) {
        this.id = id;
        this.name = name;
    }

    public int getId() {
        return id;
    }

    public void setId(int id) {
        this.id = id;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    @Override
    public int compareTo(Patient patient) {
        if(this.id < patient.id)
            return -1;
        else if(this.id>patient.id)
            return 1;
        else
            return 0;
    }
}

 

Notice that class Patient now implement Comparable interface and has got definition for compareTo() method. This is what the meaning of “natural order” i.e implementing Comparable interface and telling the order by comparing.

package com.felight.collections.queue;

import java.util.PriorityQueue;
import java.util.Queue;

/**
 * Created by Vinay Noah
 * For The Best Online Training with quick professional
 * developer support visit www.felight.io
 * <p>
 * For 100% Job Assured Java Training visit www.felight.com
 * youtube.com/felight
 * +91 96116 96116, +91 9611406060
 */


public class PatientPriorityDemo {

    public static void main(String[] args) {

        Queue<Patient> patientsPriorityQueue = new PriorityQueue<Patient>();
        patientsPriorityQueue.offer(new Patient(12, "Roy"));
        patientsPriorityQueue.offer(new Patient(10, "Joy"));
        patientsPriorityQueue.offer(new Patient(200, "Jack"));

        while(true){
            Patient currentPatient = patientsPriorityQueue.poll();
            if(currentPatient==null)
                break;
            System.out.println(currentPatient.getId() + " : " + currentPatient.getName());
        }
    }
}

In the above example priority is given based on id of the patient so Joy is served first then Roy and then Jack.

 

One example for priority Queue:

 

In the above example priority is given based on the length of the name of the Patient. The one with lesser name will get served first. Not really a real world use case but surely it will increase the length of the interview you attend 🙂


Sorting in Collections

<v1>

<v2>

<v3>