Loading
[ loading ]
Loading
[ loading ]
[ sys ]entropysha 901ddb7measured-in-repo
I wrote the real internals of a disk-backed relational engine in C++20: a recursive-descent parser, a cost-based optimizer, Volcano executors, MVCC and two-phase-locking transactions, a WAL with ARIES recovery, and a B+ tree over an LRU buffer pool. No SQL-parsing or storage libraries.
the shape of the engine
One SELECT all the way down: every box is a compiled library, every arrow is the path a tuple takes from SQL text to a B+ tree leaf, made physical.
SELECT * FROM users WHERE id = 42tokens -> ASTbound AST (resolved cols + types)cost-chosen plan: index scanVolcano iterator: next() -> tuplepin(page) -> frame (LRU on miss)leaf slot -> tuple #42SELECT descending the real engine. Each box is a compiled library under src/. Each arrow is the path one tuple takes from SQL text to a B+ tree leaf.Most database portfolio projects are a thin SQL wrapper over an in-memory hash map. I wanted the real internals of a disk-backed relational engine, end to end: parse SQL, plan and cost it, execute it through composable operators, and persist it through a buffer pool with ACID transactions and crash recovery.
The constraint that made it honest: modern C++20, and no SQL-parsing or storage libraries pulled in. If the parser, the B+ tree, and the recovery log were going to exist, I had to write them.
Entropy mirrors the logical pipeline as separately compiled C++ libraries, each with an explicit boundary. A hand-written recursive-descent parser and binder feed a cost-based optimizer (statistics, cost model, index selector), which produces Volcano-style iterators for scans, joins, sort, aggregate, filter, and DML.
Underneath sit MVCC and a two-phase-locking lock manager with deadlock detection, a write-ahead log with ARIES three-phase recovery, and a storage engine of slotted pages, a B+ tree, an extendible hash index, and an LRU buffer pool over a disk manager. The benchmark harness uses Google Benchmark and drives the same queries against SQLite when ENTROPY_BENCH_COMPARE_SQLITE=ON.
Layered under src/: parser/, optimizer/, execution/ with about twelve operators, transaction/ (MVCC, lock manager, WAL, recovery), storage/ (pages, B+ tree, hash index, buffer pool, disk manager), catalog/, and a public API in include/entropy/.
src/, with an explicit boundary. The arrows are the data path a statement follows down to disk.DESIGN.md carries twelve architecture decision records of roads not taken. Here are the four that shaped the engine most. Each names what I chose, what I gave up, and where the reasoning lives.
Hand-written recursive-descent parser
over hsql / libpg_query
I owned the whole grammar and every edge case, for zero deps and full control.
MVCC + snapshot isolation
over strict 2PL
More version bookkeeping, in exchange for reads that never block writers.
Volcano iterator execution
over vectorized / push-based
I traded analytics throughput for composable operators and one-tuple-at-a-time memory.
Nested-loop join first, hash join after
over hash join only
Generality before speed: nested-loop handles any predicate, hash join handles equi-joins fast.
The benchmark below carries the same honesty: Entropy loses to SQLite on point selects, and I kept that result in rather than cherry-pick a win.
On a batch of 1,000 single-transaction inserts, Entropy runs at 1M+ rows/s, about 11% faster than SQLite through the same Google Benchmark harness.
On point selects it is 2.0-2.6× slower. Both numbers come from the same run, pinned to sha 901ddb7, against system libsqlite3. The chart anchors on SQLite at 1.0×. The losses sit at full size, not hidden.
These are single-repetition results. The harness takes --benchmark_repetitions for p50/p95/p99 when I want them. Source: docs/benchmarks/bench_summary.csv.
Five places where the engine does the work a toy version skips. Each links into the source at the pinned SHA.
The playground steps a real SQL statement through this exact architecture and renders the Entropy-vs-SQLite chart from the committed CSV. Nothing there is fabricated. It runs the project's own material.