HashMap is one of the most widely used data structures in Java Collection. Insertion and Retrieval are the most frequently used operations, and HashMap execute these processes in constant time, no matter how big the data is. It stores key-value pairs and to access a value, one must know it's key.
An instance of HashMap has two parameters that affect its performance:
HashMap is built on the principle of HashTable. The capacity in Hash Table points to the bins it has. Using some Hashing Algorithm, all the keys are stored in these bins. If there are multiple keys at the same bin, chaining in the form of linked list is used.
Here's an example of a Hash Table.
In our example, the capacity is 7.
When we instanciate a HashMap, it has initial capacity of 16; which means there are 16 bins initially in it.
Consider a scenario when you have just 16 bins and you have to add 160 items to it. Even if all the items are distributed equally, every bin will have a chain of length 10. In this case, lookup cost will be very high which violates the very purpose of a HashMap. So we need to increase the number of bins OR the capacity of the HashMap.
The decision regarding when to increase the capacity of the HashMap is taken on the basis of Load Factor. It sets a threshold beyond which, capacity is doubled. Load Factor's value lies between 0 and 1 and the default value is 0.75.
Lets assume out Load factor is 0.50 and capacity is 16,
Threshold = Load Factor * Capacity,
Threshold = 0.50 * 16 = 8
Once 8th item (key-value pair) is added to the HashMap, its capacity is doubled. So now after 8th element, capacity is 16.
If we set Load Factor to be a high value, Space used by the HashMap will be very low. On the other hand, it is almost filled at every time, which makes lookup cost (time) very high.
If we set Load Factor to be a low value, Space used by the HashMap will be very high. On the other hand, it is mostly empty at every time, which makes lookup cost (time) very low.