Part 1: Introduction & Architecture
What is CockroachDB, why is it named after an insect, and how does it survive anything?
Welcome to “CockroachDB: From Zero to Hero”. In this first part, we’re going to explore why this database is named after the most resilient creature on Earth and how it brings Google Spanner-like capabilities to the masses.
The Origin Story
Imagine a database that is very hard to kill. You can unplug servers, cut fiber optic cables, or even lose an entire datacenter, and as long as a majority of the replicas for your data are still reachable, it keeps working. That’s the promise of CockroachDB.
Inspired by Google’s Spanner paper, CockroachDB was built to tackle one of the hardest problems in databases: combining horizontal scalability with strong consistency.
Architecture in a Nutshell
CockroachDB ships as a single binary, but a cluster of nodes running that binary forms a distributed system. Here are the key concepts you need to know:
The KV Store
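At its heart, CockroachDB is a giant, sorted Key-Value store. The SQL layer sits on top of it: every table row and every index entry is encoded as one or more key-value pairs, and because the keys are sorted, rows that are adjacent in an index are also adjacent in the store.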
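To make the mapping concrete, here is a minimal sketch in Python. It is not CockroachDB’s actual encoding (real keys are order-preserving byte strings); the table ID, index IDs, and the encode_row helper are all made up for illustration. The point is only that one row turns into several sorted KV entries, one per index.

```python
# Illustrative only: real CockroachDB keys are order-preserving byte strings,
# not Python tuples, and every ID below is hypothetical.
USERS_TABLE_ID = 52
PRIMARY_INDEX_ID = 1
NAME_INDEX_ID = 2


def encode_row(user_id, name, email):
    """Return the KV pairs for one row: a primary index entry plus a secondary index entry."""
    primary_key = (USERS_TABLE_ID, PRIMARY_INDEX_ID, user_id)
    # The secondary key embeds the indexed column, then the primary key to keep it unique.
    secondary_key = (USERS_TABLE_ID, NAME_INDEX_ID, name, user_id)
    return [
        (primary_key, {"name": name, "email": email}),
        (secondary_key, None),  # the key itself carries the indexed value
    ]


store = {}
for row in [(2, "bob", "bob@example.com"), (1, "alice", "alice@example.com")]:
    store.update(encode_row(*row))

# Sorting the keys groups entries by table, then by index, then by key columns.
sorted_keys = sorted(store)
for key in sorted_keys:
    print(key, "->", store[key])
```

Even though bob was inserted first, alice’s entries come first within each index once the keys are sorted, which is what makes range scans over an index cheap.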
Ranges
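The sorted keyspace is divided into contiguous chunks called “Ranges”. A range grows until it reaches a size limit (512 MiB by default) and is then split in two. Each range is replicated (3 times by default) across different nodes using the Raft consensus algorithm.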
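The split behaviour can be sketched with a toy model. This is an assumption-laden simplification, not how CockroachDB implements it: the Keyspace class, its method names, and the “split at the median key” rule are invented for this example, and the demo uses a tiny 100-byte limit so a split is easy to trigger.

```python
from bisect import bisect_right

MAX_RANGE_BYTES = 512 * 1024 * 1024  # the default limit discussed above


class Keyspace:
    """Toy model: ranges are identified by their start key in a sorted list."""

    def __init__(self, max_range_bytes=MAX_RANGE_BYTES):
        self.max_range_bytes = max_range_bytes
        self.range_starts = [""]  # "" sorts before every other string key
        self.data = {}            # key -> size of its value in bytes

    def range_index(self, key):
        """Find the range whose span contains `key` via binary search."""
        return bisect_right(self.range_starts, key) - 1

    def keys_in_range(self, idx):
        start = self.range_starts[idx]
        has_next = idx + 1 < len(self.range_starts)
        end = self.range_starts[idx + 1] if has_next else None
        return sorted(k for k in self.data if k >= start and (end is None or k < end))

    def put(self, key, size):
        self.data[key] = size
        idx = self.range_index(key)
        keys = self.keys_in_range(idx)
        total = sum(self.data[k] for k in keys)
        # Hypothetical split rule: once over the limit, split at the median key.
        if total > self.max_range_bytes and len(keys) > 1:
            self.range_starts.insert(idx + 1, keys[len(keys) // 2])


ks = Keyspace(max_range_bytes=100)  # tiny limit so the demo splits quickly
for key in ["a", "b", "c"]:
    ks.put(key, 40)

print(ks.range_starts)                          # start keys of all ranges
print(ks.range_index("a"), ks.range_index("c"))  # which range holds each key
```

After the third write the single range exceeds the limit and splits, so "a" and "c" end up in different ranges that could then live on different nodes.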
Raft Consensus
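How does it ensure data is consistent? It uses Raft. For every Range, there is a “Raft Group” made up of its replicas (3 by default). One replica is the Raft leader, which proposes writes to the others, the followers. One replica also holds the range lease: this Leaseholder (the boss) serves reads and coordinates writes, and it is usually the same replica as the Raft leader. A write is committed only once a majority of replicas (2 out of 3) have acknowledged it.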
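The majority rule is simple arithmetic, sketched below. The function names are invented for this example, and the code only models the counting, not Raft’s log replication or leader election.

```python
def quorum_size(replicas: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    return replicas // 2 + 1


def is_committed(acks: int, replicas: int = 3) -> bool:
    """A write commits once a majority has acknowledged it.

    `acks` counts the leader's own append as well as follower acknowledgements.
    """
    return acks >= quorum_size(replicas)


def tolerated_failures(replicas: int) -> int:
    """How many replicas can be lost while a majority is still reachable."""
    return replicas - quorum_size(replicas)


for n in (3, 5):
    print(n, "replicas: quorum", quorum_size(n), "tolerates", tolerated_failures(n))
```

With 3 replicas the quorum is 2, so a range survives the loss of one replica; with 5 replicas the quorum is 3 and two can fail. This is why replication factors are normally odd: going from 3 to 4 replicas raises the quorum without raising the number of failures you can tolerate.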
Conclusion
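CockroachDB is complex under the hood so that it can be simple for you to use. To your application it looks like Postgres, since it speaks the PostgreSQL wire protocol and much of its SQL dialect, but underneath it acts like a distributed system. In the next part, we’ll spin up a cluster and see it in action.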