HashMap Vs. TreeMap

Friday, October 15, 2010

HashMap Vs. TreeMap

This is one of my favorite programming interview questions: "What's the difference between Hash Tables and Binary Search Trees?" (or: "HashMap vs. TreeMap"). Let's see...

	Hash Tables (HashMap)	Binary Search Trees (TreeMap)
Algorithm	Keys are mapped to values by using hash functions. Hash functions transform the key to a numeric index (usually by calculating an integer value from the key first, and then applying a "modulo arraysize" operation). This index identifies an array element ("bucket"), where all corresponding values reside (values which all stem from keys with equal hash value). The key must still be looked up within the bucket, but as hashing algorithms are supposed to ensure a good hash value distribution, hence small bucket size, this lookup is very fast. The bucket array may have to be resized dynamically when a certain load factor is reached.	Node-based tree data structure, where every node contains one record (key-only, or key and value), and has a left subtree (with smaller keys) and a right subtree (with larger keys); hence keys must be comparable. Self-balancing trees, like red-black-trees, keep their height as small as possible for fast searching and inserting.
Java Implementations	Hashtable HashMap HashSet	TreeMap
.NET Implementations	Hashtable Dictionary HashSet	SortedDictionary SortedList
C++ STL Implementations	std::unordered_map std::unordered_set	std::map
Sorted	No	Yes
Range Search	No	Yes
Runtime Complexity	Inserting: O(1) (normal case), O(N) (worst case, only with bad hash algorithm) Searching: O(1) (normal case), O(N) (worst case, only with bad hash algorithm) More details	Inserting: O(log(n)) (when balanced) Searching: O(log(n)) (when balanced) More details
Implementation Notes	Equal objects must produce the same hash code, however unequal objects need not produce distinct hash codes. Equals() implementations must be reflexive, symmetric, transitive, consistent. More details. .NET's ValueType.Equals() and GetHashCode() methods are based on reflection, hence slow; you should provide your own implementation in structs.	CompareTo() implementations must be reflexive, symmetric, transitive, consistent. More details.