Snapshots of my experiments...: HashMap Internal

I always wanted to implement the logic behind the famous data structure of Java, HashMap and here it comes. Though it’s not the complete implementation considering all optimization scenarios, but it will highlight the basic algorithm behind the HashMap.

So let’s start, this implementation will use my LinkedList implementation (reason: for this implementation I thought to write everything on scratch apart from primitive data types. Sounds weird? May be ;-) ). You may refer my earlier post on LinkedList, as I’m going to use it.

HashMap is a key-value pair data structure, where you retrieve value by passing key. To optimize the search of key, a concept of Bucket (an Array of Java Objects) has been introduced. This is used, so that if search hits this Bucket, corresponding value element is instantly retrieved, otherwise it iterates sequentially through the Linked List associated with each Bucket element. So you can say if all HashMap elements get fit into the Bucket, retrieving elements back, will be super fast and this is the Best Case. But that also means you are keeping everything in an array which is not good in-terms of heap memory storage limitation. So we need to consider both the things Bucket and LinkedList. This is really interesting, so let me know, if you guys want to discuss more on this.

To make sure that all Key_Index (Hashcode of Key Object, calculated by calling Object’s hashCode method) becomes less than Bucket_Size, we will do the mod operation i.e.

Key_Index = key.hashCode() % Bucket_Size,

This returns integer value which is always less than Bucket_Size. If multiple key Objects return same Key_Index, then all matching keys will be stored in the associated LinkedList of the Bucket element. So if the bucket size is 10, following figure depicts the behavior of element entry into HashMap-

Bucket: [K0][K1][K2][K3][K4]......[K9]

[V0][V0][V0][V0][V0]

[V1] [V1]

[V2]

K0, K1 etc. are elements stored in the bucket, where as V0, V1, etc. are elements stored in LinkedList of corresponding bucket elements. So, 3 elements stored in K0’s LinkedList, 1 element stored in K1’s LinkedList.

Structure of HashMap Node Object

import com.ds.link.LinkedList;

public class HashNode {

/**

* HashMap Key

Object key;

/**

* HashMap Value

Object value;

/**

* List of HashMap nodes which doesn't get fit into bucket

LinkedList list;

/**

* by default the list will be null

public void initList(){

if(list == null)

list = new LinkedList();

}

MyHashMap.java

First thing first, define the Bucket (an Object Array)-

final int bucket_size = 10;

Object[] bucket;

/**

* @param bucket_size- custom bucket size

public MyHashMap(int bucket_size){

this.bucket_size = bucket_size;

bucket = new Object[bucket_size];

}

/**

* default bucket_size

public MyHashMap(){

bucket = new Object[bucket_size];

}

Add Element

/**

* Add key, value pair in the HashMap

* @param key

* @param value

public void add(Object key, Object value){

int kCode = key.hashCode();

int index = kCode % bucket_size;

System.out.println("->"+ index);

if(bucket[index] == null){

System.out.println("First entry in Bucket: Key="+ key);

HashNode node = new HashNode();

node.key = key;

node.value = value;

bucket[index] = node;

}

else{

// value already exists, so add this element into LinkedList

HashNode node = (HashNode)bucket[index];

System.out.println("Add into List of: "+ node.key +" Key="+ key);

// initiate the list

node.initList();

HashNode newNode = new HashNode();

newNode.key = key;

newNode.value = value;

// add the new element into list

node.list.addToEnd(newNode);

}

Get Element

/**

* If the key is not present, it will return null

* @param key

* @return

public Object get(Object key){

int index = key.hashCode() % bucket_size;

System.out.println("Searching for Bucket Index: "+ index);

if(index >= bucket.length)

new ArrayIndexOutOfBoundsException("Invalid Attempt: Index="+ index);

HashNode node = (HashNode)bucket[index];

if(node == null){

// At this position, there is no node in the bucket

return null;

}

if(node.key.hashCode() == key.hashCode()){

// element is found in the bucket

System.out.println("1st Level Match....");

return node.value;

}

// unable to find in the element in the bucket

// find in the associated list

LinkedList list = node.list;

if(list == null){

System.out.println("No associated list");

return null;

}

// pointer to the root node of the linkedlist

Node first = list.getFirst();

while(true){

if(first == null)

break;

HashNode hn = (HashNode)first.data;

if(key.hashCode() == hn.key.hashCode()){

System.out.println("2nd Level Match Found...");

return hn.value;

}

first = first.pointer;

}

// still unable to find the element

System.out.println("Failed to hit bulls eye!! No match found");

return null;

}

12 comments:

SachinMarch 18, 2011 at 12:38 PM
Hi Prasanta,

Nice effort to understand the working of a HashMap.

Just to add on to the above algorithm :

Java HashMap supports null keys, and null doesn't have a hash code

So to deal with it and prevent NPE, it just defaults to 0th index if key is null.
That means whenever key is null..bucket 0 is selected and the node is added to the corresponding list.
Similarly while fetching the value of null key , bucket 0 is searched for node whose key is null.

~Sachin
lariApril 23, 2011 at 11:13 PM
Very Nice Prasanta.
I have 2 observations.
bucket_size can not be final & in the add method
node.initList() should be done following a null check .
if(node.list == null){ node.initList() }
AnonymousDecember 29, 2011 at 7:16 AM
Wonderful, too good !
AnonymousJanuary 18, 2012 at 11:07 PM
Thanks, it was helpful, Could you add details on Resizing and Race conditions too ?
Divya KumarApril 8, 2012 at 12:28 AM
Excellent explanation,I am still confuse how it handles null key and why we require to add this feature
BalamuruganJune 5, 2012 at 11:19 PM
Nice explanation. Keep sharing about collections and Threads. Thanks in advance
AnonymousJuly 17, 2012 at 9:11 PM
Awesome...!

By
Vasan Rajendran
HimanshuAugust 9, 2012 at 11:40 AM
Good Explanation and thanks!
but i am confused in the following code of lines:
if(node.key.hashCode() == key.hashCode()){
// element is found in the bucket
System.out.println("1st Level Match....");
return node.value;
}

Here we are just matching the key hashcode and according to that we are returning value but If more that one keys returned the same hashcode then then what will be returned/

For example if our key was AB and hascode return its length that is bucket is 2 now suppose if we search with key CD then again hashcode is 2 and according to above line of code it will return the value of associated with key AB.

Please let me know
ankitAugust 25, 2012 at 12:35 AM
Nice One...
But 1 Observation in 2nd Level Match in place of
if(key.hashCode() == hn.key.hashCode()){ It should be
if(key.equals(hn.key)){}

Please correct me if I am wrong

Thanks

AnonymousMay 23, 2013 at 12:15 PM
nice explanation
UnknownJune 28, 2013 at 10:44 PM
your above mentioned linkedList link is not working.. kindly check and let us know the exact URL for your linkedList post
Prasanta PaulJune 28, 2013 at 10:51 PM
I have changed the URL. Pls check-
http://prasanta-paul.blogspot.in/2010/06/linkedlist-in-java.html

BR,
Prasanta

Friday, June 25, 2010

HashMap Internal

12 comments: