Friday, June 25, 2010

HashMap Internal


I always wanted to implement the logic behind the famous data structure of Java, HashMap and here it comes. Though it’s not the complete implementation considering all optimization scenarios, but it will highlight the basic algorithm behind the HashMap.

So let’s start, this implementation will use my LinkedList implementation (reason: for this implementation I thought to write everything on scratch apart from primitive data types. Sounds weird? May be ;-) ). You may refer my earlier post on LinkedList, as I’m going to use it.

HashMap is a key-value pair data structure, where you retrieve value by passing key. To optimize the search of key, a concept of Bucket (an Array of Java Objects) has been introduced. This is used, so that if search hits this Bucket, corresponding value element is instantly retrieved, otherwise it iterates sequentially through the Linked List associated with each Bucket element. So you can say if all HashMap elements get fit into the Bucket, retrieving elements back, will be super fast and this is the Best Case. But that also means you are keeping everything in an array which is not good in-terms of heap memory storage limitation. So we need to consider both the things Bucket and LinkedList. This is really interesting, so let me know, if you guys want to discuss more on this.

To make sure that all Key_Index (Hashcode of Key Object, calculated by calling Object’s hashCode method) becomes less than Bucket_Size, we will do the mod operation i.e.
Key_Index = key.hashCode() % Bucket_Size,
This returns integer value which is always less than Bucket_Size. If multiple key Objects return same Key_Index, then all matching keys will be stored in the associated LinkedList of the Bucket element. So if the bucket size is 10, following figure depicts the behavior of element entry into HashMap-

Bucket: [K0][K1][K2][K3][K4]......[K9]
        [V0][V0][V0][V0][V0]
        [V1]    [V1]
        [V2]
 K0, K1 etc. are elements stored in the bucket, where as V0, V1, etc. are elements stored in LinkedList of corresponding bucket elements. So, 3 elements stored in K0’s LinkedList, 1 element stored in K1’s LinkedList.

Structure of HashMap Node Object

import com.ds.link.LinkedList;

public class HashNode {

      /**
       * HashMap Key
       */
      Object key;
      /**
       * HashMap Value
       */
      Object value;
      /**
       * List of HashMap nodes which doesn't get fit into bucket
       */
      LinkedList list;
     
      /**
       * by default the list will be null
       */
      public void initList(){
            if(list == null)
                  list = new LinkedList();
      }
}

MyHashMap.java

First thing first, define the Bucket (an Object Array)-
final int bucket_size = 10;
Object[] bucket;
/**
  * @param bucket_size- custom bucket size
  */
public MyHashMap(int bucket_size){
      this.bucket_size = bucket_size;
      bucket = new Object[bucket_size];
}
     
/**
  * default bucket_size
  */
public MyHashMap(){
      bucket = new Object[bucket_size];
}

Add Element
/**
  * Add key, value pair in the HashMap
  * @param key
  * @param value
  */
public void add(Object key, Object value){
      int kCode = key.hashCode();
      int index = kCode % bucket_size;
      System.out.println("->"+ index);
           
      if(bucket[index] == null){
            System.out.println("First entry in Bucket: Key="+ key);
            HashNode node = new HashNode();
            node.key = key;
            node.value = value;
            bucket[index] = node;
      }
      else{
            // value already exists, so add this element into LinkedList
            HashNode node = (HashNode)bucket[index];
            System.out.println("Add into List of: "+ node.key +" Key="+ key);
            // initiate the list
            node.initList();
            HashNode newNode = new HashNode();
            newNode.key = key;
            newNode.value = value;
            // add the new element into list
            node.list.addToEnd(newNode);
      }
}

Get Element
/**
  * If the key is not present, it will return null
  * @param key
  * @return
  */
public Object get(Object key){
           
      int index = key.hashCode() % bucket_size;
      System.out.println("Searching for Bucket Index: "+ index);
      if(index >= bucket.length)
            new ArrayIndexOutOfBoundsException("Invalid Attempt: Index="+ index);
           
      HashNode node = (HashNode)bucket[index];
           
      if(node == null){
            // At this position, there is no node in the bucket
            return null;
      }
           
      if(node.key.hashCode() == key.hashCode()){
            // element is found in the bucket
            System.out.println("1st Level Match....");
            return node.value;
      }
           
      // unable to find in the element in the bucket
      // find in the associated list
      LinkedList list = node.list;
           
      if(list == null){
            System.out.println("No associated list");
            return null;
      }
           
      // pointer to the root node of the linkedlist
      Node first = list.getFirst();
           
      while(true){
            if(first == null)
                  break;
            HashNode hn = (HashNode)first.data;
            if(key.hashCode() == hn.key.hashCode()){
                  System.out.println("2nd Level Match Found...");
                  return hn.value;
            }
            first = first.pointer;
      }
           
      // still unable to find the element
      System.out.println("Failed to hit bulls eye!! No match found");
      return null;
}

Thursday, June 17, 2010

Binary Search Tree in Java


-->
This post is a part of my continuous effort to just implement few of the important data structures in Java. Please find my previous post on LinkedList and QuickSort. This post is regarding BST (Binary Search Tree).

Let’s first declare the structure of a BST Node
/**
 * BST Node Structure
 * @author prasanta
 */
public class BST_Node {

      /**
       * Left subtree pointer
       */
      BST_Node left;
      /**
       * Right subtree pointer
       */
      BST_Node right;
      /**
       * key which is used for comparison
       */
      int value;
}
Add Node into BST
BST_Node root;
     
public void insert(int value){
      root = add(root, value);
}
     
private BST_Node add(BST_Node node, int value){
      if(node == null){
            node = new BST_Node();
            node.value = value;
            return node;
      }    
      if(value < node.value)
            // Left Node (<)
            node.left = add(node.left, value);
      else
            // Right Node (>=)
            node.right = add(node.right, value);
      return node;
}

So your Binary Search Tree is complete and you can refer it using root. There are multiple ways you can traverse a BST- Pre-Order, In-Order, Post-Order and Level-Order (get more details on WiKi- http://en.wikipedia.org/wiki/Tree_traversal). Here I’ll show implementation of Pre-Order and In-Order, as I’m facing some problem with my Post-Order implementation.

Pre-Order Traversal- (it’s simple; the recursive call is doing the trick). The below function will display BST node values in Pre-Order
public void preOrder(BST_Node node){
      // root-left-right
      if(node == null)
            return;
      System.out.print(node.value+" ");
      preOrder(node.left);
      preOrder(node.right);
}

In-Order Traversal- well this is really a smart traversal method. It will return you a sorted list. Again, recursion is doing the trick.
/**
 * This will give you the sorted list. So to sort a list of random
 * order element- Create a BST and do In-Order traversal,
 * Bingo! It’s sorted
 * @param node
 */
public void inOrder(BST_Node node){
      // left-root-right
      if(node == null)
            return;
      inOrder(node.left);
      System.out.print(node.value+ " ");
      inOrder(node.right);
}


Search node in BST
public boolean search(int x){
      round = 0;
      return find(root, x);
}
     
private boolean find(BST_Node node, int x){
      if(node == null){
            return false;
      }
      // how many iteration required     
round++;
      if(x == node.value)
            return true;
           
      else if(x < node.value)
            return find(node.left, x); // you'll get it on the left branch
      else
            return find(node.right, x); // you'll get it on right branch
}