Database Internals: Where Your Data Actually Lives

A CloudStreet Educational Book

Written by Opus 4.5

Read Online

Read the book online - Hosted on GitHub Pages

About This Book

Ever wondered what happens when you hit COMMIT? Why does that one query take 30 seconds while another returns instantly? What's actually going on when your database "recovers" after a crash?

This book takes you into the heart of database systems: the storage engines, B-trees, write-ahead logs, and MVCC implementations that power everything from your local SQLite database to planet-scale distributed systems. We'll explore how databases turn your SQL queries into disk operations, manage concurrent access from thousands of users, and ensure your data survives power failures and hardware crashes.

Whether you're a developer trying to understand why your queries are slow, an engineer designing data-intensive systems, or simply curious about one of the most sophisticated pieces of software ever created, this book will give you the mental models to understand what's really happening beneath the abstraction layers.

Who This Book Is For

Backend developers who want to write better queries and design better schemas
Software engineers building systems that interact heavily with databases
System architects making decisions about data storage and retrieval
The curious who want to understand the engineering marvels hiding behind SELECT * FROM users

What You'll Learn

How data is physically organized on disk and in memory
The data structures that make queries fast (and when they don't)
How databases handle multiple users reading and writing simultaneously
What guarantees ACID actually provides and how they're implemented
Why write-ahead logging is essential for crash recovery
How query optimizers decide the best way to execute your SQL
The trade-offs between different storage engine architectures
How distributed databases maintain consistency across machines

How to Read This Book

This book is designed to be read sequentially, as later chapters build on concepts introduced earlier. However, if you're already familiar with certain topics, feel free to skip ahead:

New to databases? Start from Chapter 1 and work through sequentially.
Know the basics? Skip to Part II for the data structure deep-dives.
Here for concurrency? Part III covers transactions, locking, and MVCC.
Query performance issues? Part IV on query processing will be most relevant.
Scaling up? Part V covers distributed systems and different storage architectures.

Building Locally

This book is built using mdBook. To build locally:

# Install mdBook
cargo install mdbook

# Build the book
mdbook build

# Serve locally with hot-reload
mdbook serve --open

Conventions Used

Throughout this book, we use several conventions:

Code blocks indicate SQL, pseudocode, or data structure representations
Bold terms indicate important concepts being introduced
Italics are used for emphasis and technical terms
ASCII diagrams illustrate data structures and system architectures
PostgreSQL is used as the primary reference implementation, with notes on how other databases differ

About the Author

This book was written by Opus 4.5, Anthropic's AI assistant, as part of the CloudStreet educational series. The content synthesizes knowledge from database research papers, system documentation, and practical engineering experience into an accessible guide for working developers.

License

This work is part of the CloudStreet Educational Series.

"The database is the most important software component in most applications, yet it remains a black box to most developers. Let's open that box."

— Opus 4.5

Table of Contents

Part I: Foundations
Part II: Data Structures
Part III: Transactions and Concurrency
Part IV: Query Processing
Part V: Reliability and Scale
Appendices