Columnar vs row
Picture a table with four rows and three columns. Same data, two ways the bytes can sit in memory. Pick the right one for your workload and you read one column instead of fifty. Pick the wrong one and you eat I/O for data you never look at.
The same twelve values in memory, two ways. The query
SUM(age) WHERE age > 30
touches four cells either way; in column layout, those four cells live next to each other.
The query above is the whole story
A query like SELECT SUM(age) FROM people WHERE age > 30 touches exactly four cells out of twelve. The optimizer can prove this. The execution layer cares about whether those four cells are easy to read or scattered across the disk.
In row layout (top strip), records sit next to each other. To read all four ages, the engine has to skip past id and name cells over and over. On disk, this means reading every row group, every page, every record. The four "interesting" cells are scattered through twelve.
In column layout (bottom strip), fields sit next to each other. To read all four ages, the engine reads one contiguous block. The id and name columns are not even loaded. The four interesting cells are right there together.
Why column layout wins for analytics
Three reasons, in order of how much they matter in practice.
Less I/O. A 50-column table with a query touching 3 columns reads 6% of the bytes instead of 100%. This shows up everywhere: faster scans, smaller working set, better cache behavior, lower S3 costs.
Better compression. Same-typed values compress better than mixed-typed values. A column of integers compresses 5-10x. A column of repeated strings (like country codes) compresses 50-100x. Row layout cannot compress like this because each cell is a different type from its neighbors.
Vectorization and SIMD friendly. A SIMD instruction operates on a vector of values. Column layout puts those values exactly where SIMD wants them: contiguous, same type, in a register. Row layout would have to gather them from scattered positions first.
Why row layout wins for transactional workloads
Three reasons, mirrored.
Whole-record reads are fast. A query like SELECT * FROM users WHERE id = 42 wants every column of one row. Row layout has them all in one place. Column layout has to seek to each column.
Inserts are atomic. Appending a new record to a row store writes one block. Inserting into a column store requires touching every column's storage.
The working set is small. OLTP queries touch a few rows. Row layout fits those rows on a few pages. Column layout would touch every column file.
The picture you should carry around
Row layout optimizes for "give me the whole record." Column layout optimizes for "give me one field across every record."
Analytical engines pick column layout because analytical queries do the second thing. Transactional engines pick row layout because transactional queries do the first.
Arrow is the in-memory columnar format that became the lingua franca. Parquet is the on-disk equivalent. Together they let an engine read one column from blob storage without reading any of the others.
Further reading
- Arrow Columnar Format, the spec.
- Parquet documentation, the on-disk format that makes column-store-on-cheap-storage work.
- Abadi, Madden, Hachem “Column-Stores vs. Row-Stores” (2008), the canonical comparison paper.