This is Part 5 of the IKVS series, “Implementing a Key-Value Store”. You can also check the Table of Contents for other parts.
In this article, I will study the actual implementations of hash tables in C++ to understand where are the bottlenecks. Hash functions are CPU-intensive and should be optimized for that. However, most of the inner mechanisms of hash tables are just about efficient memory and I/O access, which will be the main focus of this article. I will study three different hash table implementations in C++, both in-memory and on-disk, and take a look at how the data are organized and accessed. This article will cover:
1. Hash tables
1.1 Quick introduction to hash tables
1.2 Hash functions
2. Implementations
2.1 unordered_map from TR1
2.2 dense_hash_map from SparseHash
2.3 HashDB from Kyoto Cabinet
3. Conclusion
4. References