Emmanuel Goossaert

Cuckoo Hashing

Published by Emmanuel Goossaert on July 20, 2013

As part of my work on my key-value store project, I am currently researching hashing methods with the goal of finding one that fits the performance constraints of on-disk storage. In this article, I give a quick review of cuckoo hashing, a method for resolving collisions in hash tables. This article is not part of the IKVS series as it is not specific to key-value stores.
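To give a rough idea of what cuckoo hashing looks like, here is a minimal in-memory sketch in Python. It is not the code from the article: the class name `CuckooHashTable`, the `max_kicks` limit, and the two hash functions derived from Python's built-in `hash()` are all assumptions made for illustration. Each key has one candidate slot in each of two tables; an insertion that finds its slot occupied evicts the resident entry, which then moves to its slot in the other table, and so on.

```python
class CuckooHashTable:
    """Illustrative cuckoo hash table: two tables, one hash function each."""

    def __init__(self, capacity=8, max_kicks=32):
        self.capacity = capacity      # number of slots per table
        self.max_kicks = max_kicks    # hypothetical bound on displacements
        self.tables = [[None] * capacity, [None] * capacity]

    def _slot(self, which, key):
        # Assumption: salting hash() with the table index stands in for
        # two independent hash functions.
        return hash((which, key)) % self.capacity

    def get(self, key, default=None):
        # A key can only live in one of its two candidate slots.
        for which in (0, 1):
            entry = self.tables[which][self._slot(which, key)]
            if entry is not None and entry[0] == key:
                return entry[1]
        return default

    def put(self, key, value):
        # Update in place if the key is already stored.
        for which in (0, 1):
            slot = self._slot(which, key)
            entry = self.tables[which][slot]
            if entry is not None and entry[0] == key:
                self.tables[which][slot] = (key, value)
                return
        # Otherwise insert, evicting residents back and forth if needed.
        entry, which = (key, value), 0
        for _ in range(self.max_kicks):
            slot = self._slot(which, entry[0])
            evicted = self.tables[which][slot]
            self.tables[which][slot] = entry
            if evicted is None:
                return
            entry, which = evicted, 1 - which
        # Too many displacements: grow, rehash, then place the leftover entry.
        self._grow()
        self.put(entry[0], entry[1])

    def _grow(self):
        old = [e for table in self.tables for e in table if e is not None]
        self.capacity *= 2
        self.tables = [[None] * self.capacity, [None] * self.capacity]
        for k, v in old:
            self.put(k, v)


table = CuckooHashTable(capacity=4)
for i in range(100):
    table.put(i, i * i)
print(table.get(7), table.get(1000))
```

The point of the scheme is that a lookup inspects at most two slots, whatever the load; the cost is moved to insertions, which may trigger a chain of evictions or a full rehash.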

Implementing a Key-Value Store – Part 5: Hash table implementations

Published by Emmanuel Goossaert on May 13, 2013

This is Part 5 of the IKVS series, “Implementing a Key-Value Store”. You can also check the Table of Contents for other parts.

In this article, I will study the actual implementations of hash tables in C++ to understand where the bottlenecks are. Hash functions are CPU-intensive and should be optimized for that. However, most of the inner mechanisms of hash tables are just about efficient memory and I/O access, which will be the main focus of this article. I will study three different hash table implementations in C++, both in-memory and on-disk, and take a look at how the data are organized and accessed. This article will cover:

1. Hash tables
    1.1 Quick introduction to hash tables
    1.2 Hash functions
2. Implementations
    2.1 unordered_map from TR1
    2.2 dense_hash_map from SparseHash
    2.3 HashDB from Kyoto Cabinet
3. Conclusion
4. References

Estimated reading time

Published by Emmanuel Goossaert on April 27, 2013

Of all the currently available media, the written format is the only one for which we do not know the exact duration ahead of time. Indeed, we know exactly how long it will take to watch a film or listen to a podcast, but we have no idea how long…

Implementing a Key-Value Store – Part 4: API Design

Published by Emmanuel Goossaert on April 3, 2013

This is Part 4 of the IKVS series, “Implementing a Key-Value Store”. You can also check the Table of Contents for other parts.

I finally settled on a name for this whole key-value store project, which from now on will be referred to as ~~FelixDB~~ KingDB.

In this article, I will take a look at the APIs of four key-value stores and database systems: LevelDB, Kyoto Cabinet, BerkeleyDB and SQLite3. For each major functionality in their APIs, I will compare the naming conventions and method prototypes, to balance the pros and cons and design the API for the key-value store I am currently developing, KingDB. This article will cover:

1. General principles for API design
2. Defining the functionalities for the public API of KingDB
3. Comparing the APIs of existing databases
    3.1 Opening and closing a database
    3.2 Reads and Writes
    3.3 Iteration
    3.4 Parametrization
    3.5 Error management
4. Conclusion
5. References

Implementing a Key-Value Store – Part 3: Comparative Analysis of the Architectures of Kyoto Cabinet and LevelDB

Published by Emmanuel Goossaert on December 30, 2012

This is Part 3 of the IKVS series, “Implementing a Key-Value Store”. You can also check the Table of Contents for other parts.

In this article, I will walk through the architectures of Kyoto Cabinet and LevelDB, component by component. The goal, as stated in Part 2 of the IKVS series, is to gain insight into how I should design the architecture of my own key-value store by analyzing the architectures of existing key-value stores. This article will cover:

1. Intent and methodology of this architecture analysis
2. Overview of the Components of a Key-Value Store
3. Structural and conceptual analysis of Kyoto Cabinet and LevelDB
    3.1 Create a map of the code with Doxygen
    3.2 Overall architecture
    3.3 Interface
    3.4 Parametrization
    3.5 String
    3.6 Error Management
    3.7 Memory Management
    3.8 Data Storage
4. Code review
    4.1 Organization of declarations and definitions
    4.2 Naming
    4.3 Code duplication

Implementing a Key-Value Store – Part 2: Using existing key-value stores as models

Published by Emmanuel Goossaert on December 3, 2012

This is Part 2 of the IKVS series, “Implementing a Key-Value Store”. You can also check the Table of Contents for other parts.

In this article, I will start by explaining why I think it is important to use models for this project and not start completely from scratch. I will describe a set of criteria for selecting key-value store models. Finally, I will go over some well-known key-value store projects, and select a few of them as models using the presented criteria. This article will cover:

1. Not reinventing the wheel
2. Model candidates and selection criteria
3. Overview of the selected key-value stores

Kir: find commands by describing them from the shell

Published by Emmanuel Goossaert on November 17, 2012

When doing system administration to fix a crash on some Unix-based server, I have several times run into the issue of trying to remember how to perform a certain task, but not remembering the exact sequence of commands. I then always end up doing the same thing: resorting to a search on Google to find the commands I need. Those tasks are generally not frequent enough to be worth memorizing the commands or creating a script, but frequent enough for the process of searching to become really annoying. It’s also a productivity issue, since it requires me to stop the current workflow, open a web browser and perform a search. For me, those things include tasks such as “how to find the number of processors on a machine” or “how to dump a PostgreSQL table in CSV format.”

I thought that it would be great to have some piece of code to query Google directly from the command line. But that would be a mess: for each query I need a simple sequence of commands to type, not a blog article with fluffy text all around, which is what Google is likely to return. I also thought about using the API of commandlinefu.com to get results directly from there, so I wrote a small Python script that performs text search that way. However, the results were never exactly what I was looking for, since the commands presented there were formatted by people who do not have the exact same needs as I do. This is what brought me to implement Kir, a tiny utility that allows text search directly from the command line and returns the exact list of commands needed.

Implementing a Key-Value Store – Part 1: What are key-value stores, and why implement one?

Published by Emmanuel Goossaert on November 7, 2012

This is Part 1 of the IKVS series, “Implementing a Key-Value Store”. You can also check the Table of Contents for other parts.

In this article, I will start with a short description of what key-value stores are. Then, I will explain the reasons behind this project, and finally I will lay out the main goals for the key-value store that I am planning to implement. Here is the list of the things I will cover in this article:

1. A quick overview of key-value stores
2. Key-value stores versus relational databases
3. Why implement a key-value store
4. The plan

Implementing a Key-Value Store

Published by Emmanuel Goossaert on November 7, 2012

UPDATE July 21, 2016: This article series is still ongoing, and the key-value store, KingDB, has already been released: http://kingdb.org. Over the coming weeks I will publish the last articles for the IKVS series, which will cover the architecture and data format of KingDB. To get an update when it’s…

Europe is not ready for drop-shipping

Published by Emmanuel Goossaert on October 26, 2012

I heard about drop-shipping for the first time a few months ago, when I stumbled upon an AMAA on Reddit with some guy claiming he was making $100k per month running drop-shipping websites. This guy also apparently verified the information with some mods of the AMAA subreddit, and provided a short introductory guide to drop-shipping that he later removed. Luckily, I bookmarked the link to the guide when I bookmarked the AMAA: here is the guide he made. The guide includes, at the very end, a list of the companies that he is using for his marketing. Some comments on Hacker News about this AMAA said that this looks like a scam aimed at promoting those companies.

After reading the post on Reddit, I started to look into drop-shipping as a possibility for creating a small business that would generate small but steady revenue. I had already explored other options, as documented in a previous blog post about micro-ISVs. Here is what I found and what I think about drop-shipping.

Author: Emmanuel Goossaert