Skip to content

Tag: Robin Hood

Implementing a Key-Value Store – Part 6: Open-Addressing Hash Tables

This is Part 6 of the IKVS series, “Implementing a Key-Value Store”. You can also check the Table of Contents for other parts.

In this article, I will compare several open-addressing hash tables: Linear Probing, Hopscotch hashing, and Robin Hood hashing. I have already done some work on this topic, and in this article I want to gather data for more metrics in order to decide which hash table I will use for my key-value store.

The result section also contains an interesting observation about the maximum DIB for Robin Hood hashing, which originated from Kristofer Karlsson, a software engineer at Spotify and the author of the key-value store Sparkey.

This article will cover:

1. Open-addressing hash tables
2. Metrics
3. Experimental Protocol
4. Results and Discussion
5. Conclusion
6. References

Robin Hood hashing: backward shift deletion

In my article about Robin Hood hashing [1], I had reached the conclusion that the Robin Hood hashing algorithm was performing poorly after deletions had occurred in the hash table, and I was quite disappointed with my results. In the research I had done, I had stumbled upon an article by Paul Khuong who had done some work on Robin Hood hashing [2], and I decided to email him to ask what he thought of my results. Paul got back to me, and after looking at my code on Github, he suggested that the performance could be greatly improved with only a little change to the removal method. He even submitted a patch to point out what the change would be, and he ran this modified version himself on the test cases I had included in the source code. The results he got were looking really good, and therefore I had no other choice but to write a follow-up article to present the change to the code and show those results.

In this article, I am presenting the backward shift deletion variant for the Robin Hood hashing algorithm. I start by explaining what backward shift deletion is and how to implement it, then I use it on the same test cases that I had run in my previous article. Finally, I discuss the new results.


Robin Hood hashing

As I am still doing research on open addressing hash table algorithms, I tested an approach called Robin Hood hashing. Just like for other reordering schemes, Robin Hood hashing will move entries that are already stored in the hash table as new items are inserted. In the original Robin Hood paper, it is claimed that Robin Hood hashing is very efficient due to the fact that the probe sequence length has a small and almost constant variance.

In this article, I am presenting how Robin Hood hashing handles insertion, deletion, and search, including a graphical representation for the insertion operation. I then use my own C++ implementation of Robin Hood hashing to see how the algorithm behaves under various load conditions and try to determine how efficient it really is. Finally, I discuss some of the drawbacks of Robin Hood hashing, and its applicability for the implementation of an on-disk key-value store.