What Is a Hash Table?
A hash table (also called a hash map) is a data structure that maps keys to values for highly efficient lookup, insertion, and deletion. In most programming languages, hash tables are used under the hood for dictionaries, sets, and associative arrays.
The secret to their speed is the hash function — a function that converts any key into an integer index pointing to a slot in an underlying array.
How a Hash Function Works
Suppose you want to store the key "apple". The hash function processes the string and returns an index, say 3. The value is then stored at array[3]. To retrieve it later, you hash the key again, get 3, and look it up instantly — O(1) average time.
A good hash function should:
- Be deterministic (same input → same output every time)
- Distribute keys uniformly to minimize collisions
- Be fast to compute
Handling Collisions
Two different keys can produce the same hash index — this is called a collision. There are two classic strategies to handle them:
1. Chaining
Each slot in the array holds a linked list. Colliding entries are appended to the list. Lookup then traverses the list for the exact key. This is simple and commonly used.
2. Open Addressing
When a collision occurs, the algorithm probes for the next available slot using a defined strategy (linear probing, quadratic probing, or double hashing). This keeps all data in the main array and avoids pointers, which can be cache-friendly.
Time and Space Complexity
| Operation | Average Case | Worst Case |
|---|---|---|
| Insert | O(1) | O(n) |
| Lookup | O(1) | O(n) |
| Delete | O(1) | O(n) |
| Space | O(n) | O(n) |
Worst case occurs when all keys hash to the same slot — a sign of a poor hash function or a very unlucky input. In practice, well-implemented hash tables are nearly always O(1).
Load Factor and Resizing
The load factor is the ratio of stored entries to the total number of slots. As it rises, collisions increase and performance degrades. Most implementations resize (rehash) the table — typically doubling in size — when the load factor exceeds a threshold (commonly 0.7 or 0.75).
Hash Tables in Coding Challenges
Hash tables are indispensable in competitive programming and interviews. Here are common patterns:
- Frequency counting: Count occurrences of characters or numbers in O(n).
- Two Sum pattern: Store complements as you iterate to find pairs in O(n).
- Caching/memoization: Store previously computed results keyed by input.
- Grouping anagrams: Use sorted strings or character counts as keys.
- Sliding window with counts: Track window contents with a frequency map.
Language Reference
- Python:
dictandcollections.Counter/defaultdict - JavaScript:
Mapor plainObject - Java:
HashMap,LinkedHashMap - C++:
unordered_map
When you need ordered key traversal, use a TreeMap (Java) or collections.OrderedDict (Python) — but note the O(log n) tradeoff per operation.
Key Takeaway
Hash tables are the single most useful data structure in interview settings. If you're ever unsure whether you need one, ask yourself: "Am I looking up, counting, or grouping things?" If yes, reach for a hash table first.