Anatomy of a query
Type SELECT count(*) FROM events WHERE region = 'us-east-1' AND date = '2026-05-26' and press enter. Somewhere between you hitting return and the number coming back, your SQL goes through about five transformations. Reading those transformations is the fastest way to understand why some queries are fast and others are not.
Five transformations between six artifacts. The optimizer is the stage most queries get slow at. The bottom row shows what each stage knows that the previous one did not.
The parser
SQL is text. The parser turns it into an abstract syntax tree, a tree of nodes representing the structure of the query. The AST captures clause structure (SELECT, FROM, WHERE, GROUP BY, ORDER BY) but does not yet know what events refers to or what types the columns have. Most engines write their own parser; some use generated parsers (yacc, bison, antlr); Sail uses Chumsky, a parser combinator library. The parser stage is almost never the bottleneck.
The logical plan
With the AST in hand, the engine resolves names against the catalog. events becomes a specific table with a specific schema. region becomes column three of type string. The result is a logical plan: a tree of relational operators expressed in terms of what, not how. Filter, project, join, group-by, aggregate. This is the level at which the Volcano paper (Graefe, 1993) defined query execution and where most database textbooks live.
The optimizer
The optimizer rewrites the logical plan to be cheaper. Push filters down to the scan so the engine never reads rows that fail. Drop projections that no one uses. Reorder joins by estimated cardinality. Eliminate redundant subqueries. The bulk of database research over the last forty years lives here.
Two kinds of rules. Rule-based rules always apply (predicate pushdown, projection pruning). Cost-based rules depend on table statistics (which side of a join to broadcast, which join algorithm to pick, how to order joins). Cost-based decisions are where the optimizer can be wrong, and where you sometimes have to hint or rewrite the query.
The physical plan
The optimizer also lowers the logical plan to a physical plan: the same tree, but with each operator replaced by a specific execution strategy. A logical "join" becomes a HashJoinExec, SortMergeJoinExec, or NestedLoopJoinExec depending on size and key distribution. A logical "scan" becomes a ParquetExec with predicate pushdown to skip row groups. A logical "aggregate" becomes a hash aggregation with a specific batch size.
Execution
The execution layer streams record batches through the operator tree. Each operator takes a stream in and produces a stream out. Backpressure flows up naturally; partial results flow down. Most engines parallelize at this layer: each physical operator can produce N streams (one per worker or core), and the scheduler decides where each runs.
The result is whatever the top operator collects: a single row for an aggregate, a bounded set for a LIMIT, an unbounded stream for a streaming query.
Where most queries get slow
When a query is slow, the diagnosis usually lands on one stage.
| Stage | When it goes wrong |
|---|---|
| Parse | Almost never. SQL would have to be pathologically large. |
| Logical plan | Sometimes. A correlated subquery or a CTE the engine can't see through. |
| Optimizer | Often. Cost model guessed cardinality wrong; join order or algorithm is wrong. |
| Physical plan | Often. Parquet files lack statistics so predicate pushdown can't prune. Or partitioning is wrong, so every query scans everything. |
| Execution | Often. I/O or shuffle dominates the wall clock. |
Reading EXPLAIN ANALYZE is reading the optimizer's notebook. Every modern engine emits one. Learning to read it for your engine of choice is one of the highest-ROI database skills.
Further reading
- Graefe, “Volcano: An Extensible and Parallel Query Evaluation System” (1993), the origin of the stream-of-iterators execution model.
- DuckDB CIDR 2020 paper, short and worth reading.
- DataFusion documentation, the canonical reference for the Rust side.