Skip to content

Year: 2014

An afternoon in 1983 with Hector

ad-small

Back in 1998, I was 13 years old and had no money to buy a computer. But that wouldn’t stop me. I didn’t care if I had a brand new computer or a used one, all I wanted was a computer. From the conversations of adults around me, I heard it was common for businesses to renew their hardware and throw their old computers into the trash. So I figured, all I had to do was to be in the right trash at the right time, or even better, make the trash come to me. In these years, there were tons of computer paper magazines, so I decided to send a letter to one of them, of which I have now forgotten the name, to publish an ad in their classified section: “13 year-old student in the Paris area, will come to your home or office to get any computers you are about to throw in the bin”. From this classified, I got a 486DX-80 PC, a printer, and a Hector HR2+ computer.

The 486DX-80 and the printer have served me well, and are long gone. Last summer, while visiting my family in France I decided to take a look into the attic. As I was making my way through rusty nails and spider webs, I noticed a bag covered with dust in a dark corner of the room. I had the feeling that I was about to make a very good discovery, and I was not disappointed. I opened the bag with excitement, and there it was, the almighty Hector 2HR+ computer!

In the bag came all the booklets, cables, and cassettes, so I decided that I would spend the afternoon writing code on that machine, and that I would run this code before nightfall. But before I go any further on that, a bit of history about the Hector 2HR+…

Implementing a Key-Value Store – Part 6: Open-Addressing Hash Tables

This is Part 6 of the IKVS series, “Implementing a Key-Value Store”. You can also check the Table of Contents for other parts.

In this article, I will compare several open-addressing hash tables: Linear Probing, Hopscotch hashing, and Robin Hood hashing. I have already done some work on this topic, and in this article I want to gather data for more metrics in order to decide which hash table I will use for my key-value store.

The result section also contains an interesting observation about the maximum DIB for Robin Hood hashing, which originated from Kristofer Karlsson, a software engineer at Spotify and the author of the key-value store Sparkey.

This article will cover:

1. Open-addressing hash tables
2. Metrics
3. Experimental Protocol
4. Results and Discussion
5. Conclusion
6. References

Coding for SSDs – Part 6: A Summary – What every programmer should know about solid-state drives

This is Part 6 over 6 of “Coding for SSDs”. For other parts and sections, you can refer to the Table to Contents. This is a series of articles that I wrote to share what I learned while documenting myself on SSDs, and on how to make code perform well…

Coding for SSDs – Part 5: Access Patterns and System Optimizations

This is Part 5 over 6 of “Coding for SSDs”, covering Sections 7 and 8. For other parts and sections, you can refer to the Table to Contents. This is a series of articles that I wrote to share what I learned while documenting myself on SSDs, and on how to make code perform well on SSDs. If you’re in a rush, you can also go directly to Part 6, which is summarizing the content from all the other parts.

Now that I have covered most of the inner workings of solid-state drives in the previous sections, I can provide data that will help build an understanding of which access patterns should be used and why they are indeed better than others. In this part, I explain how writes should be done, how reads should be done, and why concurrent read and write operations are interfering. I also cover a few optimizations at the level of the filesystem which can improve performance.

Translations: This article was translated to Simplified Chinese by Xiong Duo and to Korean by Matt Lee (이 성욱).

ssd-presentation-05

Coding for SSDs – Part 4: Advanced Functionalities and Internal Parallelism

This is Part 4 over 6 of “Coding for SSDs”, covering Sections 5 and 6. For other parts and sections, you can refer to the Table to Contents. This is a series of articles that I wrote to share what I learned while documenting myself on SSDs, and on how to make code perform well on SSDs. If you’re in a rush, you can also go directly to Part 6, which is summarizing the content from all the other parts.

In this part, I cover briefly some of the main SSD functionalities such as TRIM and over-provisioning. I am also presenting the different levels of internal parallelism in an SSD, and the concept of clustered block.

Translations: This article was translated to Simplified Chinese by Xiong Duo and to Korean by Matt Lee (이 성욱).

ssd-presentation-04

Coding for SSDs – Part 3: Pages, Blocks, and the Flash Translation Layer

This is Part 3 over 6 of “Coding for SSDs”, covering Sections 3 and 4. For other parts and sections, you can refer to the Table to Contents. This is a series of articles that I wrote to share what I learned while documenting myself on SSDs, and on how to make code perform well on SSDs. If you’re in a rush, you can also go directly to Part 6, which is summarizing the content from all the other parts.

In this part, I am explaining how writes are handled at the page and block level, and I talk about the fundamental concepts of write amplification and wear leveling. Moreover, I describe what is a Flash Translation Layer (FTL), and I cover its two main purposes, logical block mapping and garbage collection. More particularly, I explain how write operations work in the context of a hybrid log-block mapping.

Translations: This article was translated to Simplified Chinese by Xiong Duo and to Korean by Matt Lee (이 성욱).

ssd-presentation-03

Coding for SSDs – Part 2: Architecture of an SSD and Benchmarking

This is Part 2 over 6 of “Coding for SSDs”, covering Sections 1 and 2. For other parts and sections, you can refer to the Table to Contents. This is a series of articles that I wrote to share what I learned while documenting myself on SSDs, and on how to make code perform well on SSDs. If you’re in a rush, you can also go directly to Part 6, which is summarizing the content from all the other parts.

In this part, I am explaining the basics of NAND-flash memory, cell types, and basic SSD internal architecture. I am also covering SSD benchmarking and how to interpret those benchmarks.

Translations: This article was translated to Simplified Chinese by Xiong Duo and to Korean by Matt Lee (이 성욱).

ssd-presentation-02

Coding for SSDs – Part 1: Introduction and Table of Contents

Translations: This article was translated to Simplified Chinese by Xiong Duo and to Korean by Matt Lee (이 성욱). Introduction I want to make solid-state drives (SSDs) the optimal storage solution for my key-value store project. For that reason, I had to make sure I fully understood how SSDs work,…