I am currently experimenting with various hash table algorithms, and I stumbled upon an approach called hopscotch hashing. Hopscotch hashing is a reordering scheme that can be used with the open addressing method for collision resolution in hash tables. When using open addressing with only a probing sequence and no reordering, entries are inserted in the first empty buckets found in the sequence. With a reordering scheme, entries already in the table can be moved as new entries are inserted. Hopscotch hashing is interesting because it guarantees a small number of look-ups to find entries. In addition, those look-ups are in contiguous memory areas, which is very cache friendly.
This article makes minor contributions to the original publication. First, a clear explanation of the insertion process is being given, which is independent from the representation of neighborhoods. The original paper was using the bitmap representation to present the algorithm, and I believe that things are simpler without it. Second, a new neighborhood representation is introduced, it is called the “shadow” representation. It derives the relationship between neighborhoods and initial buckets from the hashed keys, and does not require any extra memory. Finally, a C++ implementation is provided, and experimentation with this implementation allows to shed light on some of the limitations of hopscotch hashing that were not expressed in the original paper.