Service Ownership Checklist

2017 November 12

Building and maintaining infrastructure services requires striving for quality and ownership. But it’s not always easy to know what we are missing, and which assumptions we are making without realizing it. To help myself and my colleagues reason about whether we are addressing the important topics, I came up with something I call the Service Ownership Checklist. It’s still in a draft format, but I’ve already refined it thanks to the help and feedback of many of my peers, and I’m now releasing it on my blog hoping that it can help other infrastructure engineers as well.

The way to use this document is to share it with your colleagues and teams, and have them ask each other some of the questions to see how they’re doing on all these topics and challenge their assumptions. You will hopefully uncover unknown issues, and create enough urgency to go and fix them.

This blog post is organized in two parts. The first part is the Service Ownership Checklist, a set of loose questions that can be used in brainstorming and sharing sessions, and the second part is a condensed version in the form of a questionnaire, the Service Ownership Questionnaire, which shows the different levels of quality for each reliability topic.

The SRE organization at Google runs Launch Reviews when it releases new services, and for this it uses a Launch Review Checklist. I recommend that you read Chapter 27 of the Google SRE book, which covers the subject.

Finally, if you think of any other topics, or of a better way to group the questions into categories, please post a comment below! And if you enjoyed this article, subscribe to the mailing list at the top of this page, and you will receive an update every time a new article is posted.

read more…

Planned Reading: The Trick for Reading Nonfiction

2017 May 6
by Emmanuel Goossaert

I am always on the lookout for unusual and interesting books, from which I hope to learn new ideas. But time becomes an issue, as I have to prioritize which books and articles to read first. About a year ago, I wrote an article about the industry standards for nonfiction books, and why such books all end up looking the same. The conclusion of that article was that instead of chasing speed reading and memorization techniques, it was more important to first select the right books, and read only those.

Continuing in my quest to read more and better, about six months ago I came across an idea that is simply mind blowing to me: if you’re reading nonfiction books cover-to-cover, you’re doing it wrong. In this article I will talk about a better way to read nonfiction content, called planned reading, which will save you time, will keep you more focused, and will help you retain more from the books you read.

Don’t read nonfiction books from cover to cover

Let’s take a shortcut for the sake of simplification, and say we can classify all books into three main categories: technical, fiction, and nonfiction.

Technical books and textbooks concentrate all the knowledge about one domain into a single place, and organize that knowledge to make it simple to navigate down to details. Technical books are easy to use: you open a technical book because you have a question, thus you’re going to start at the Table of Contents or at the Index to identify which areas of the book have the knowledge you’re looking for, and you’re going to jump to those pages directly.

Fiction books such as novels tell stories, and most likely you’re reading those books for entertainment, so you shouldn’t mind spending time on them, because you enjoy it. Fiction books are easy too: just open the book on the first page, and keep reading until you reach the last page. The authors designed their content linearly in such a way that you get deeper into the story as you keep reading. Stories may be nonlinear, but the delivery structure is linear, i.e. one page after another.

Nonfiction books convey ideas and opinions using arguments, examples, anecdotes, data, surveys, etc. The most likely reason why you’re reading those books is that you want to learn about some new concepts or ideas, not that you are enjoying the authors’ style. The authors of nonfiction books present their content with a linear structure and invite us to follow that structure to get the gist of their ideas. And this is where most of us fail: the book becomes boring and repetitive, the sentences stop making sense, and rightfully so, because most authors of nonfiction are bad writers.

If you research a bit into the process that book publishers use to get nonfiction books out the door, you’ll learn that most nonfiction books are just 10-page articles that have been stretched into 250-page books using bad prose and anecdotes. Therefore the structure offered by authors in nonfiction books is a trap, and following it will inevitably lead you to getting bored and dropping out. And if you do manage to finish a 250-page nonfiction book, chances are you spent 20-25 hours on it, and you feel like you wasted most of that time zoning out and being half-distracted.

There is a better way to read nonfiction books that takes less time, leaves you with a more in-depth understanding of the author’s ideas, and keeps you engaged as you read. Enter planned reading.

Planned reading

Planned reading relies on two things: your intent for reading a book, and the Pareto principle. Your intent is what makes you want to read that book, what goals you have, what you expect to learn, what insights you want to get, etc. The Pareto principle states that 80% of the effects come from 20% of the causes.

When you apply the combination of intent and the Pareto principle to reading a nonfiction book, it means that a few areas of that book will contain most of the ideas that are relevant to the personal goals you have for reading it. I’ll call these the “relevant areas”. Thus, two people reading the same nonfiction book but with different intents will find different areas to be relevant.

Spending as little time as possible on a nonfiction book boils down to building a reading plan so you identify the areas relevant to you, and read only those. Building a plan means that you not only decide what to read first, but also decide what you will ignore and not read.

Let’s take the example of an imaginary 20-page chapter in a nonfiction book, which can be broken down as such:

  • p1: introduction
  • p2-5: anecdotes that tell stories related to the theme of the chapter
  • p6: statement of the core idea of the chapter
  • p7-11: examples that illustrate the core idea
  • p12: a diagram
  • p13: section with more on the core idea
  • p14: table with data
  • p15-19: business cases and more anecdotes
  • p20: conclusion

If you read this chapter linearly, from beginning to end, using only the structure provided by the author, your information intake will be linear as well. It is very likely that the anecdotes, examples, and business cases won’t bring much value, but by reading linearly you have to go through them anyway.

Now if you apply planned reading to this imaginary chapter, you would start by skimming over all the pages of the chapter quickly to get an idea of its content, and then you would build your plan by selecting to read only the areas relevant to your intent. For the sake of this article, let’s say that one possible plan is:

  1. First, the introduction and conclusion, because the author is likely to introduce his idea and summarize his argument there (p1 and p20).
  2. Then the pages with statements about the core idea of the chapter, because this is where the author articulates some arguments to support his idea (p6 and p13).
  3. Finally the pages with the diagram and table, because those are information dense (p12 and p14).

Then keeping the Pareto principle in mind, the rest of the pages could simply be ignored, because after reading the most relevant pages, you will get diminishing returns for every extra page you’ll read. If we were to represent visually the difference in information intake between the linear reading and the planned reading approaches, it would look like this:

This was only to plan the reading of a chapter. When reading a nonfiction book, you would start by looking at the Table of Contents, Table of Figures, Index, Appendixes, to map out what’s in the book. Then you would decide which chapters to read and in what order, and which chapters you would ignore completely. And finally for each chapter you would decide what pages to read, and in which order.

Using this technique myself, I have noticed that the simple act of planning my reading, and as I read keeping my intent in mind, helps me stay focused and engaged. I have also noticed that I am able to remember the main ideas and arguments of books more clearly and for longer periods of time.

If you want to learn more, I strongly encourage you to read this 10-page booklet by Paul N. Edwards, which expands on the concept of planned reading and offers more tips and ideas.

Do you have any reading techniques or tricks that you’ve used for nonfiction books and that are worth sharing? Write a comment below!

Autonomous Peer Learning at and How You Can Do it in Your Organization

2016 December 11

This article was originally published for’s Technology blog on November 23rd, 2016. Click here for the original article.

Continuous learning on the job is hard. We all see things we want to improve, but maybe we’re missing a few skills to really make an impact. With most days filled with emails and meetings, there’s often not much time left for learning, no matter how much we want to develop our skills.

Although many organizations try to remedy this issue by employing external companies to handle training, they rarely follow up to ensure such trainings are actually value for money. Not only that, employees are often left to figure out how their new skills can be applied to daily work, and sometimes they are even left wondering if the training taught them anything useful at all.

I work at as an engineering manager, and in my job I wanted to learn about a topic for which there was no formal training. I ended up creating a study group that became the blueprint for autonomous peer learning in our Technology department. It’s an initiative that has been scaled to 50 Peer-to-Peer (or P2P) learning groups over the last 18 months.

The premise of P2P groups is that participants take the time to think about what they want to learn and why. This means their learning is tailored from the very beginning, ensuring that it is both relevant to their work and beneficial to their organization.
read more…

Optimize your monitoring for decision-making

2016 August 11

Working in infrastructure is about building and maintaining complex systems made of many moving parts. To operate such systems you need to make sure they are running and healthy, that is to say that they are performing to the level of quality you have defined, and for this you need monitoring.

For most engineers, “monitoring” means having web dashboards with graphs and numbers. I had to build monitoring for half a dozen large systems, and I’ll be honest, the first dashboards I made were really bad. They were packed with too much data, were using too many colors, and required too much internal knowledge to be useful outside of my team. Through my experience, I have come to the point where I can build decent monitoring dashboards — or at least I’d like to think so — and I want to share one simple yet powerful tool I’ve learned: the health box.

Building effective dashboards is hard

Everybody’s first dashboard looks close to this:


This dashboard is shiny and polished, it has many colors, and a black background to make it look cool. Now I want you to take a couple of minutes before you read further, and take a careful look at the graphs and labels in the dashboard above. Can you tell if this system is healthy or not?

Have you really looked or did you cheat? If you haven’t looked, please do it. There is something that counts the number of errors at the top, and the count is two. But does it mean the system has an outage? Honestly, I don’t know if this system is healthy or not, and that’s the answer I was expecting from you.

We have no idea what’s going on with this system and it’s not our fault. That dashboard was poorly designed, and whoever made it was probably building a dashboard for the first time, so it’s not their fault either.

Assume the viewer knows nothing

The makers of the dashboard above assumed that the viewer was like them, and had as much knowledge of the system as they did. Instead, they should have assumed that the viewer:

  • Does not know what every graph means.
  • Does not know what the graphs are supposed to look like when the system becomes unhealthy.
  • Does not know the internal components and how they fit together.
  • Has never read the source code.

Before you read further, take a couple of minutes to look carefully at this other dashboard below. Can you tell if this system is healthy?


What I expected you to think was:

  • The log_statistics_minutely service is definitely broken, it looks like it stopped running for the last 90 minutes at least.
  • There is something wrong going on with the puppet service on the host storage-31. Not as serious as log_statistics_minutely, but worth looking into.
  • Except for the two issues mentioned above, everything else seems healthy.

This dashboard is better than the previous one because it does not make you think, it tells you right away what is wrong and how critical each problem is. Now that you are aware of the limitations of graph dashboards, it’s time to formalize the solution. And here I give you: the health box.

Health box: a decision-making shortcut

The second dashboard did not have any graphs, it only had boxes of different colors with text in them. Those boxes are health boxes, and each of them has only one job: to tell you if the service it represents is healthy or not.


A health box has four bits of information:

  • Service name: what service is represented.
  • Status: OK, WARNING, or CRITICAL.
  • Message: a hint as to what is causing the status.
  • Opdocs code: this is a unique identifier that represents the service, and also a link to the Opdocs for that service. I will talk more about Opdocs later in this article.

A health box can only have three states, based on the three statuses it can represent: green, orange or red. Below is an example of a health box for a service called query_monitor, and what the health box would look like in different states.


There will be times when you have to look at your monitoring at 3 o’clock in the morning, and you’ll have to come up with an answer to the question: is my system on fire and should I wake up my colleague?

By showing the viewer exactly what he needs so he can answer that question within seconds, you remove cognitive load. The viewer no longer needs to interpret and combine data from multiple sources: the decision has already been made for him, and he can move on to troubleshooting and remediation.

Control your health box with thresholds

To control the state of your health boxes, you need to read the metrics that you have collected with your monitoring infrastructure, and compare those metrics against thresholds. For example if your system uses a queue to distribute work among workers, you want to monitor the number of items on that queue, which we’ll call N. Let’s assume that under normal load, N is about 50 per second, i.e. an average of 50 items per second on your queue.

You want to check that the size of your queue is not growing too much, which could be a sign that the workers are dead or are not processing items fast enough. You also want to ensure that there are at least some items on the queue, as an empty queue could mean that there is a problem somewhere upstream in your pipeline. Here is what those conditions could look like:

if N > 200 for the last 5 minutes:
    set to CRITICAL
else if N == 0 for the last 30 minutes:
    set to CRITICAL
else if N > 100 for the last 5 minutes:
    set to WARNING
else if N < 10 for the last 10 minutes:
    set to WARNING
else:
    set to OK
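
The conditions above could be turned into runnable code along the following lines. This is only a minimal sketch in Python, not the implementation behind the dashboards shown in this article: it assumes the monitoring system provides one queue-size sample per minute, oldest first, and the names `all_in_window` and `queue_status` are made up for the illustration.

```python
from typing import Callable, Sequence

OK, WARNING, CRITICAL = "OK", "WARNING", "CRITICAL"


def all_in_window(samples: Sequence[int], minutes: int,
                  predicate: Callable[[int], bool]) -> bool:
    """True if the last `minutes` samples all satisfy `predicate`.

    Assumption: one sample per minute, oldest first. With less history
    than the window requires, the condition is treated as not met.
    """
    if len(samples) < minutes:
        return False
    return all(predicate(n) for n in samples[-minutes:])


def queue_status(samples: Sequence[int]) -> str:
    """Map per-minute queue sizes N to a health box status."""
    if all_in_window(samples, 5, lambda n: n > 200):
        return CRITICAL   # queue growing: workers dead or too slow
    if all_in_window(samples, 30, lambda n: n == 0):
        return CRITICAL   # queue empty for a long time: upstream problem
    if all_in_window(samples, 5, lambda n: n > 100):
        return WARNING
    if all_in_window(samples, 10, lambda n: n < 10):
        return WARNING
    return OK
```

Checking the CRITICAL conditions before the WARNING ones matters: a queue above 200 is also above 100, so the most severe matching condition has to win.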

All your monitoring and alerting systems must tell the same story

You’ll notice that the status of a health box is the same thing as the alerting or paging, that is to say, the set of conditions you have configured for a service in order to send yourself a message when that service becomes unhealthy.

When one of your systems goes down and your on-call engineer gets paged, the first thing this engineer will do is open the health web dashboard that has all the relevant health boxes, so he can get an instant view as to which services are unhealthy. Therefore it is very important for your health boxes to be in sync with your pager alerts at all times, so that there is only one version of the truth that can be trusted as the true state of your system.

Opdocs: Operational Documentation

The Opdocs are the Operational Documentation for a service, which describes the remediation steps that one can use to fix outages. Some engineers also call that a Playbook. Another useful concept is the “Opdocs code”, which is a unique identifier for every service in a system. For example if your system is named Bulldog, or BLG for short, then all the services that form that system and which you are monitoring should have a different Opdocs code: BLG-001, BLG-002, BLG-003, etc. This is just one possible convention, and you could also have a different Opdocs code for each failure mode of the same service, that's totally up to you.

When a service is unhealthy and sends an alert by text message or email, that alert should include the Opdocs code. The on-call engineer can then use the Opdocs code to search in the internal documentation of your company, and find guidance to fix the outage.
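
One way to keep the dashboard and the alerts telling the same story is to derive both from a single object. The sketch below is an illustration in Python under that assumption: the `HealthBox` class, the `alert_text` format, and the rule that only CRITICAL pages someone are all hypothetical choices, not a description of any particular monitoring product.

```python
from dataclasses import dataclass

# The three states a health box can be in, and their colors.
COLORS = {"OK": "green", "WARNING": "orange", "CRITICAL": "red"}


@dataclass(frozen=True)
class HealthBox:
    service: str       # what service is represented
    status: str        # "OK", "WARNING", or "CRITICAL"
    message: str       # hint as to what is causing the status
    opdocs_code: str   # unique identifier, also the key into the Opdocs

    @property
    def color(self) -> str:
        """Color used by the dashboard for this box."""
        return COLORS[self.status]

    def alert_text(self) -> str:
        """Text for the pager or email, built from the same fields."""
        return f"[{self.opdocs_code}] {self.service} is {self.status}: {self.message}"


def should_page(box: HealthBox) -> bool:
    """Assumed policy for this sketch: only CRITICAL wakes someone up."""
    return box.status == "CRITICAL"
```

For example, `HealthBox("query_monitor", "CRITICAL", "no queries processed for 15 minutes", "BLG-001")` renders as a red box on the dashboard, and its `alert_text()` starts with `[BLG-001]`, which the on-call engineer can search for in the internal documentation.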

Putting it all together

Don’t get me wrong, I'm not saying that graph dashboards are bad. You definitely need them to monitor the internals of your systems. They should not, however, be your first level of monitoring.

Your top-level monitoring should be optimized for decision-making, so you can quickly figure out if you have an outage that needs a human to act immediately. One way to reach that goal is to build a health status dashboard using health boxes, and keep your graph dashboards as a second level, for troubleshooting.


This setup has proven very effective for me, but it's only a tool and you need to think for yourself whether it would work in the context of your own infrastructure.

Do you have monitoring or dashboard best practices that you want to share? Post a comment below! And if you enjoyed this article, subscribe to the mailing list at the top of this page, and you will receive an update every time a new article is posted.

Looking for a job?

Do you have experience in infrastructure, and are you interested in building and scaling large distributed systems? My employer,, is recruiting Software Engineers and Site Reliability Engineers (SREs) in Amsterdam, Netherlands. If you think you have what it takes, send me your CV at emmanuel [at] codecapsule [dot] com.

Implementing a Key-Value Store – Part 10: High-Performance Networking: KingServer vs. Nginx

2016 July 21

This is Part 10 of the IKVS series, “Implementing a Key-Value Store”. You can also check the Table of Contents for other parts. In this series of articles, I describe the research and process through which I am implementing a key-value database, which I have named “KingDB”. The source code is available at Please note that you do not need to read the previous parts to be able to follow what is going on here. The previous parts were mostly exploratory, and starting with Part 8 is perfectly fine.

In this article, I explain the model and inner workings of KingServer, the network server for KingDB. In order to put things into perspective I also cover Nginx, the high-performance HTTP server well-known for its network stack, and how it differs from KingDB.

read more…

How to get started with infrastructure and distributed systems

2016 January 3
by Emmanuel Goossaert

Most of us developers have had experience with web or native applications that run on a single computer, but things are a lot different when you need to build a distributed system to synchronize dozens, sometimes hundreds of computers to work together.

I recently received an email from someone asking me how to get started with infrastructure design, and I thought I would share what I wrote to him in a blog post, in case it can help other people who want to get started as well.

To receive a notification email every time a new article is posted on Code Capsule, you can subscribe to the newsletter by filling out the form at the top right corner of the blog.

A basic example: a distributed web crawler

For multiple computers to work together, you need some sort of synchronization mechanism. The most basic ones are databases and queues. Some of your computers are producers or masters, and the others are consumers or workers. The producers write data in a database, or enqueue jobs in a queue, and the consumers read the database or queue. The database or queue system runs on a single computer, with some locking, which guarantees that the workers don’t pick the same work or data to process.
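
The producer/consumer pattern can be tried out on a single machine before moving to real servers. The sketch below is a small Python illustration where threads stand in for computers and the standard library `queue.Queue` stands in for the queue system; the function name `run_workers` is made up for the example.

```python
import queue
import threading


def run_workers(jobs, num_workers=4):
    """Process each job exactly once using a shared thread-safe queue."""
    work = queue.Queue()
    for job in jobs:
        work.put(job)              # producer side: enqueue all the jobs

    processed = []                 # (worker_id, job) pairs
    processed_lock = threading.Lock()

    def worker(worker_id):
        while True:
            try:
                # The queue's internal locking hands each job to one worker only.
                job = work.get_nowait()
            except queue.Empty:
                return             # nothing left to do
            with processed_lock:
                processed.append((worker_id, job))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return processed
```

Calling `run_workers(range(100))` returns 100 pairs in which every job appears exactly once, no matter which worker picked it up. In a real distributed setup the threads become separate machines and the in-process queue becomes a networked queue or a database.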

Let’s take an example. Imagine you want to implement a web crawler that downloads web pages along with their images. One possible design for such a system will require the following components:

  • Queue: the queue contains the URLs to be crawled. Processes can add URLs to the queue, and workers can pick up URLs to download from the queue.
  • Crawlers: the crawlers pick URLs from the queue, either web pages or images, and download them. If a URL is a webpage, the crawlers also look for links in the page, and push all those links to the queue for other crawlers to pick them up. The crawlers are at the same time the producers and the consumers.
  • File storage: The file storage stores the web pages and images in an efficient manner.
  • Metadata: a database, either MySQL-like, Redis-like, or any other key-value store, will keep track of which URL has been downloaded already, and if so where it is stored locally.

The queue and the crawlers are their own sub-systems, they communicate with external web servers on the internet, with the metadata database, and with the file storage system. The file storage and metadata database are also their own sub-systems.

Figure 1 below shows how we can put all the sub-systems together to have a basic distributed web crawler. Here is how it works:

1. A crawler gets a URL from the queue.
2. The crawler checks in the database if the URL was already downloaded. If so, just drop it.
3. The crawler enqueues the URLs of all links and images in the page.
4. If the URL was not downloaded recently, get the latest version from the web server.
5. The crawler saves the file to the File Storage system: it talks to a reverse proxy that’s taking incoming requests and dispatching them to storage nodes.
6. The File Storage distributes load and replicates data across multiple servers.
7. The File Storage updates the metadata database so we know which local file is storing which URL.


Figure 1: Architecture of a basic distributed web crawler
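
To make the flow more concrete, here is a toy, single-process version of the crawler loop in Python. It is a sketch under several assumptions: the queue, metadata database, and file storage are replaced by in-memory structures, `fetch` is a hypothetical callable returning the content of a URL and the links found in it, and links are enqueued after the download, since they are only known once the page has been fetched.

```python
from collections import deque


def crawl(seed_urls, fetch, max_pages=100):
    """Toy crawler: `fetch(url)` must return a (content, links) tuple."""
    url_queue = deque(seed_urls)   # stands in for the queue sub-system
    metadata = {}                  # URL -> storage key (metadata database)
    file_storage = {}              # storage key -> content (file storage)

    while url_queue and len(metadata) < max_pages:
        url = url_queue.popleft()              # get a URL from the queue
        if url in metadata:
            continue                           # already downloaded: drop it
        content, links = fetch(url)            # download from the web server
        key = f"file-{len(file_storage)}"
        file_storage[key] = content            # save to the file storage
        metadata[url] = key                    # record which file stores which URL
        url_queue.extend(links)                # enqueue links for other crawlers
    return metadata, file_storage
```

With a fake web such as `{"a": ("A", ["b", "c"]), "b": ("B", ["a", "c"]), "c": ("C", [])}` and `fetch = web.__getitem__`, calling `crawl(["a"], fetch)` downloads each of the three pages once, even though some of them are linked several times.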

The advantage of a design like the one above is that you can scale each sub-system independently. For example, if you need to crawl stuff faster, just add more crawlers. Maybe at some point you’ll have too many crawlers and you’ll need to split the queue into multiple queues. Or maybe you realize that you have to store more images than anticipated, so just add a few more storage nodes to your file storage system. If the metadata is becoming too much of a centralized point of contention, turn it into a distributed storage system, using something like Cassandra or Riak. You get the idea.

And what I have presented above is just one way to build a simple crawler. There is no right or wrong way, only what works and what doesn’t work, considering the business requirements.

Talk to people who are doing it

The only way to truly learn how to build a distributed system is to maintain or build one, or work with someone who has built something big before. But obviously, if the company you’re currently working at does not have the scale or need for such a thing, then my advice is pretty useless…

Go to and find groups in your geographic area that talk about using NoSQL data storage systems, Big Data systems, etc. In those groups, identify the people who are working on large-scale systems and ask them questions about the problems they have and how they solve them. This is by far the most valuable thing you can do.

Basic concepts

There are a few basic concepts and tools that you need to know about, some sort of alphabet of distributed systems that you can later on pick from and combine to build systems:

  • Concepts of distributed systems: read a bit about the basic concepts in the field of Distributed Systems, such as consensus algorithms, consistent hashing, consistency, availability and partition tolerance.
  • RDBMSs: relational database management systems, such as MySQL or PostgreSQL. RDBMSs are one of the most significant inventions of humankind from the last few decades. They’re like Excel spreadsheets on steroids. If you’re reading this article I’m assuming you’re a programmer and you’ve already worked with relational databases. If not, go read about MySQL or PostgreSQL right away! A good resource for that is the web site
  • Queues: queues are the simplest way to distribute work among a cluster of computers. There are some specific projects tackling the problem, such as RabbitMQ or ActiveMQ, and sometimes people just use a table in a good old database to implement a queue. Whatever works!
  • Load balancers: if queues are the basic mechanism for a cluster of computers to pull work from a central location, load balancers are the basic tool to push work to a cluster of computers. Take a look at Nginx and HAProxy.
  • Caches: sometimes accessing data from disk or a database is too slow, and you want to cache things in the RAM. Look at projects such as Memcached and Redis.
  • Hadoop/HDFS: Hadoop is a very widespread distributed computing and distributed storage system. Knowing the basics of it is important. It is based on the MapReduce system developed at Google, which is documented in the MapReduce paper.
  • Distributed key-value stores: storing data on a single computer is easy. But what happens when a single computer is no longer enough to store all the data? You have to split your storage into two computers or more, and therefore you need mechanisms to distribute the load, replicate data, etc. Some interesting projects doing that you can look at are Cassandra and Riak.
Read papers and watch videos

    There is a ton of content online about large architectures and distributed systems. Read as much as you can. Sometimes the content can be very academic and full of math: if you don’t understand something, no big deal, put it aside, read about something else, and come back to it 2-3 weeks later and read again. Repeat until you understand, and as long as you keep coming at it without forcing it, you will understand eventually. Some references:

    Introductory resources

    Real-world systems and practical resources

    Theoretical content

    Build something on your own

    There are plenty of academic courses available online, but nothing replaces actually building something. It is always more interesting to apply the theory to solving real problems, because even though it’s good to know the theory on how to make perfect systems, except for life-critical applications it’s almost never necessary to build perfect systems.

    Also, you’ll learn more if you stay away from generic systems and instead focus on domain-specific systems. The more you know about the domain of the problem to solve, the more you are able to bend requirements to produce systems that are maybe not perfect, but that are simpler, and which deliver correct results within an acceptable confidence interval. For example for storage systems, most business requirements don’t need to have perfect synchronization of data across replica servers, and in most cases, business requirements are loose enough that you can get away with 1-2% of erroneous data, and sometimes even more. Academic classes online will only teach you about how to build systems that are perfect, but that are impractical to work with.

    It’s easy to bring up a dozen servers on DigitalOcean or Amazon Web Services. At the time I’m writing this article, the smallest instance on DigitalOcean is $0.17 per day. Yes, 17 cents per day for a server. So you can bring up a cluster of 15 servers for a weekend to play with, and that will cost you only about $5.

    Build whatever random thing you want to learn from, use queuing systems, NoSQL systems, caching systems, etc. Make it process lots of data, and learn from your mistakes. For example, things that come to my mind:

    • Build a system that crawls photos from a bunch of websites like the one I described above, and then have another system to create thumbnails for those images. Think about the implications of adding new thumbnail sizes and having to reprocess all images for that, having to re-crawl or having to keep the data up-to-date, having to serve the thumbnails to customers, etc.
    • Build a system that gathers metrics from various servers on the network. Metrics such as CPU activity, RAM usage, disk utilization, or any other random business-related metrics. Try using TCP and UDP, try using load balancers, etc.
    • Build a system that shards and replicates data across multiple computers. For example, your complete dataset is A, B, and C and it’s split across three servers: A1, B1, and C1. Then, to deal with server failure you want to replicate the data, and have exact copies of those servers in A2, B2, C2 and A3, B3, C3. Think about the failure scenarios, how you would replicate data, how you would keep the copies synced, etc.
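
As a starting point for the last idea, here is a minimal Python sketch of how a key could be mapped to a shard and its replicas. It reuses the server names from the example (A1, A2, A3, and so on); the function names and the choice of SHA-256 with a modulo are assumptions made for the illustration, not a recommendation.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Deterministically map a key to a shard index."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def servers_for(key: str, shard_names=("A", "B", "C"), replicas=3):
    """All servers holding a copy of `key`, e.g. ['B1', 'B2', 'B3']."""
    shard = shard_names[shard_for(key, len(shard_names))]
    return [f"{shard}{i}" for i in range(1, replicas + 1)]


def pick_server(key: str, down=frozenset()):
    """First replica for `key` that is not marked as down."""
    for server in servers_for(key):
        if server not in down:
            return server
    raise RuntimeError(f"all replicas for {key!r} are down")
```

Note that with a plain modulo, changing the number of shards moves most keys to a different shard. Consistent hashing, mentioned in the basic concepts above, is the usual way to limit that reshuffling.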

    Look at systems and web applications around you, and try to come up with simplified versions of them:

    • How would you store the map tiles for Google Maps?
    • How would you store the emails for Gmail?
    • How would you process images for Instagram?
    • How would you store the shopping cart for Amazon?
    • How would you connect drivers and users for Uber?

    Once you’ve built such systems, you have to think about what solutions you need to deploy new versions of your systems to production, how to gather metrics about the inner workings and health of your systems, what type of monitoring and alerting you need, how you can run capacity tests so you can plan enough servers to survive request peaks and DDoS, etc. But those are totally different stories!

    I hope that this article helped explain how you can get started with infrastructure design and distributed systems. If you have any other resources you want to share, or if you have questions, just drop a comment below!

    Looking for a job?

    Do you have experience in infrastructure, and are you interested in building and scaling large distributed systems? My employer,, is recruiting Software Engineers and Site Reliability Engineers (SREs) in Amsterdam, Netherlands. If you think you have what it takes, send me your CV at emmanuel [at] codecapsule [dot] com.

You don’t need to read faster, just pick the right books

2015 December 13
by Emmanuel Goossaert


  • The non-fiction book publishers are imposing a standard that is toxic for the readers: books that are 200-300 pages long, sell for $10-20, and get 4-5 star ratings on Amazon.
  • Trying to read everything is overwhelming and unnecessary, as most non-fiction books do not deserve your time anyway. They just recycle old ideas that are available in shorter and better formats elsewhere.
  • Reading speed is a distraction from the real problem: you don’t need to read faster, just pick the right books.
  • Apply the Pareto principle, and just pick the three great books that will bring you 99% of the value you need, and focus on reading and re-reading only those.
  • Skip any paragraph or chapter that is not relevant to you. And if a book is just plain bad, drop it right away.
  • Trying to remember everything is a waste of time. Summarize what you read down to the core concepts and models. Make your brain a search engine, not a storage system.


Learning is awesome

Like most people, I love learning about new ideas, and a great way to do that is to read non-fiction books. Books about psychology, business, history, technical topics, etc. But there are so many books out there, and so little time, that it’s almost overwhelming!

So like many before me, I have researched techniques on how to read faster, but none of them turned out to be workable for me. Reading speed doesn’t really matter for non-fiction books, as it is worthless to read a book as fast as possible if nothing is remembered. Thus another criterion to consider is retention, and how to assimilate the concepts you read about, so you can relate to them in your future thinking.

So what should we prioritize? Reading many books as fast as possible, or trying to remember as much as possible from each book? I’ve read a whole lot of non-fiction books over the past few years, and I’ve come to a conclusion: most of them are not even worth reading, and here is why.

The current state of the book industry

The publishing industry has had a lot of time to experiment and refine its offers to find the sweet spot for its market. Let’s take an example to make this more real. There have been a bunch of pop psychology books on the topic of expertise, and here are the five most notable ones:

  • The Talent Code by Daniel Coyle, 256 pages, 4.5 stars, $16
  • Talent is Overrated by Geoff Colvin, 240 pages, 4.2 stars, $18
  • Mastery by Robert Greene, 352 pages, 4.5 stars, $12
  • Bounce by Matthew Syed, 336 pages, 4.4 stars, $11
  • Outliers by Malcolm Gladwell, 336 pages, 4.4 stars, $20

Do you notice something in that list? All those books fall in the same bucket, which is the industry standard for the average non-fiction book:

  • priced 10-20 dollars/euros
  • around 250-300 pages long
  • rated 4-5 stars on Amazon

Cal Newport wrote about how the non-fiction book publishing industry works. Non-fiction writers rarely write books and then try to sell them. In the majority of cases they have agents who talk to publishers, so writers pitch ideas for books to their agents, who green-light them based on what they know about the current trends and what publishers want. The publishers then impose the industry standard, the infamous 250-300 pages that will sell for $10-20.

The standard of the book industry is toxic

So among those five books on expertise, which one should you read? That’s easy, the answer is none of them! They’re all closely or remotely based on the same 2007 HBR article The Making of an Expert by Anders Ericsson, itself based on earlier research by Ericsson and his peers. The HBR article is about 10 pages long, and will cost you nothing as you can read it online on the HBR website.

I’ve taken this one example of non-fiction books on the topic of expertise, and I’m sure you can easily think of other topics you know about where similar books are competing and bringing nothing new to the table. Most books are repeating what other books before them have covered, and will be forgotten 10 years from now.

And that’s why most non-fiction books feel the same, and are very repetitive. They’re basically trying to expand a 10-page article into a 250-page book by bending anecdotes with hindsight bias, all of that just to fit into an industry standard. And you, as a reader and consumer, end up wasting your time reading a 250-page book that really should just be 10 pages of concise, surgically edited, high-quality content.

Pick a handful of books, read them, then re-read them

You don’t need to read everything, just read the good books. Following the Pareto principle, there should be 1% of the books out there that should provide you with 99% of all the value you need at this very moment of your life. Let’s pick a number, let’s say three books. There should be no more than three non-fiction books really worth reading for you right now, which are well-written, concise, and relevant to who you are and the areas where you need to grow at this very moment of your life.

So here is my advice to all of you trying to find ways to read faster or remember everything from what you read: just don’t. There is no need for that. Just try to find, among the gigantic pile of redundant books jamming the search results of Amazon, the one book that matters in a field and which will have 99% of impact on your mind, and just focus on reading that one. Assuming that this book was the best in its field, from there any additional book you read in the same field will have an exponentially decreasing impact on your mind.

How I read books at the moment

I’ve looked into various memory techniques such as the method of loci and spaced repetition, and concluded that for books they were not the right solution, because the underlying assumption and goal are wrong. Here is my reasoning: if you need the knowledge every day, then the actual practice of it will make you remember it. And if you do not need the knowledge every day, then why do you even bother trying to remember it?

Here is what works for me for non-fiction books:

  • I do active reading by taking notes and writing down action points as I read.
  • In my notes, I summarize the concepts and models from each chapter.
  • If a paragraph or chapter does not seem relevant, I skim over it. If it’s not relevant at all, I just skip it.
  • Every 6-8 weeks after the first reading, if I feel like I need a refresher, I re-read the book and my notes. Those re-readings always take a lot less time than the initial reading.
  • What really matters is not the actual content and anecdotes in the book, but the core concepts and models.
  • Next time I face a problem that relates to those concepts, I will know which book they came from. So essentially, I want to make my brain a search engine, not a storage system.

Of course it’s partly boring, because I’m not reading new books about some shiny new topic everybody is talking about, but who cares? I’m re-reading the few books I know are the top 1% in terms of impact on my brain, and spending the time re-reading them is the only way to ensure this great content will have a lasting impact on the way I think.

What’s next?

What’s your opinion on non-fiction books? Any reading trick or method you think I should have mentioned here? Drop a comment!


Implementing a Key-Value Store – Part 9: Data Format and Memory Management in KingDB

2015 August 3

This is Part 9 of the IKVS series, “Implementing a Key-Value Store”. You can also check the Table of Contents for other parts. In this series of articles, I describe the research and process through which I am implementing a key-value database, which I have named “KingDB”. The source code is available at Please note that you do not need to read the previous parts to be able to follow. The previous parts were mostly exploratory, and starting with Part 8 is perfectly fine.

In this article, I explain how the storage engine of KingDB works, including details about the data format. I also cover how memory management is done through the use of a compaction process.

read more…

Implementing a Key-Value Store – Part 8: Architecture of KingDB

2015 May 25

This is Part 8 of the IKVS series, “Implementing a Key-Value Store”. You can also check the Table of Contents for other parts. In this series of articles, I describe the research and process through which I am implementing a key-value database, which I have named “KingDB”. The source code is available at Please note that you do not need to read the previous parts to be able to follow. The previous parts were mostly exploratory, and starting with Part 8 is perfectly fine.

In the previous articles, I have laid out the research and discussion around what needs to be considered when implementing a new key-value store. In this article, I will present the architecture of KingDB, the key-value store of this article series that I have finally finished implementing.

read more…

Introducing KingDB v0.9.0

2015 May 14
by Emmanuel Goossaert

I am pleased to announce that I am releasing the very first version of KingDB, the fast persisted key-value store. KingDB is a side-project that I have been hacking on intermittently over the last couple of years. It has taken a lot of my personal time, therefore I am very happy to finally have reached that moment.

Go to to find the source code, documentation and benchmarks.

KingDB is interesting for many reasons:

  • Fast for heavy write workloads and random reads.
  • The architecture, code, and data format are simple.
  • Multipart API to read and write large entries in smaller parts.
  • Multiple threads can access the same database safely.
  • Crash-proof: nothing ever gets overwritten.
  • Iterators and read-only consistent snapshots.
  • Compaction happens in a background thread, and does not block reads or writes.
  • The data format allows hot backups to be made.
  • Covered by unit tests.

Version 0.9.0 is still alpha code, therefore even if KingDB has many unit tests that ensure the stability of its core components, make sure you run tests in your own environment before using KingDB in production. New features and optimizations will come along the way.

Over the coming weeks I will publish the last articles for the IKVS series, which will cover the architecture and data format of KingDB.

What now? Go to to check out KingDB! If you have any questions, drop a comment below or join the KingDB mailing list, I would be happy to answer them 🙂