Table of Contents
Description
Think of a hash table as a dictionary. You want to find all the information you can on a word - its spelling, pronunciation, definition, synonyms, etc. But you don’t want to read through every page of the dictionary until you find the word. Instead, you can take the first letter of the word and go to that part of the book. All you need is the first letter of the word, and you can find all the info you want about the word really quickly.
- A hash table is a data structure that specializes in very fast searching.
- Instead of iterating through an entire array to find specific data, you can store it in a hash table and look it up quickly.
- Data is stored in a hash table as key-value pairs.
![Untitled](<https://s3-us-west-2.amazonaws.com/secure.notion-static.com/b351554b-ea91-402d-a544-61e0a94a1766/Untitled.png>)
- A value contains all of the data you want to store and lookup.
- It’s usually an object with fields for all the data you want to keep track of. However, it can be anything.
- A key can be anything you’ll use to lookup the data you want.
- Usually a field of the object, such as a String (name) or int/long (id).
- Keys must be unique. One key points to one value, and no keys can match.
Usage
- Hash Tables are extremely useful. Whenever you want to search for specific data quickly, you should store it in a hash table.
- Hash Tables vs Arrays
- If you want to get the 5th element in your storage quickly (whatever it may be), use an array.
- If you want to get the element with the value “Ivan” from your storage quickly, use a hash table.
- Interviews: One of the first things you should ask yourself is “can the solution be implemented in a hash table?”.
Implementation
-
When searching a hash table for the value associated with a key, you fist convert the key to a hash code, then convert the hash code into an index in the array. The index is then used to find the value.
- Key → Hash Code → Index → Value
- A hash function is used to convert a key into a hash code.
![Untitled](<https://s3-us-west-2.amazonaws.com/secure.notion-static.com/96c17793-e0a8-41bb-8861-03aef86d4861/Untitled.png>)
-
Collisions
- In terms of possibilities, there are more possible keys than hash codes, and more possible hash codes than indices in the array.
- Keys > hash codes > indices.
- This means that:
- Different keys can have the same hash code, and thus point to the same index in the array.
- Different hash codes can point to the same index in the array.
- When either of the above two cases happen, a collision occurs: instead of having one key-value lookup pointing to one index, you have multiple different key-value pairs pointing to the same array index.
-
To resolve collisions, chaining is done.
- All the resulting key-value pairs from a key lookup are placed in a linked list.
- The linked list is placed at the target array index.
- When you search for a value, all the key-value pairs are searched. The correct key is taken.
Standard Implementation: Array of Linked Lists
![Untitled](<https://s3-us-west-2.amazonaws.com/secure.notion-static.com/a5fc34cf-e224-4923-9cd2-a384a178ca1f/Untitled.png>)
![Untitled](<https://s3-us-west-2.amazonaws.com/secure.notion-static.com/33c64f3d-1254-48db-b2ae-c168cf5ec0c3/Untitled.png>)
- One linked list for each hashed version of a Key
- The value paired with the key is placed into the linked list as a node.
- Insertion process
- Key-value pair inserted
- Key is hashed, converted to hashcode
- Hashcode is converted to index in array
- Value paired with the key is inserted into the linked list at that index.
- Collisions
- Sometimes two different keys have the same hashcode. This is called a collision.
- Their values are placed into the same linked list.
- Linked lists are small compared to the array, so resolving collisions by searching the linked list is fast compared to searching the whole array.
- O(1) vs O(N)